Re: PyLucene 4.8.0 - 'make install' with 'root=' missing a couple of files

2014-05-30 Thread Eduard Rozenberg
Hello Andi,

Thanks for your note. It took quite a few days of effort, but
I finally found a “golden path” for making a package out of
PyLucene.

I had to do ‘make’ and ‘make test’ without the setup.cfg
first. Otherwise 'make test' would dump a bunch of stuff
into my specified install root in a temporary directory and
then fail the tests entirely. 

After ‘make’ and ‘make test’ succeed, I then put the
setup.cfg in the source folder and run ‘make install’ to
get the files to be copied to the alternate root specified
in the setup.cfg. Then I need to move and rename the
folders copied to the alternate root to match what the
install would normally do if it were allowed to install to
the usual root destination, because install apparently
behaves differently when given an alternate root. Then
I copy the missing _lucene.py and native_libs.txt files
from the source directory to the alternate root path.

Finally I created a doinst.sh script which uses sed to
add the proper path to easy-install.pth.

Then I take everything in the alternate root and make
a package out of it.
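The file-fixup part of the workflow above can be sketched in Python (hypothetical paths and helper names, not the actual SlackBuild scripts):

```python
import os
import shutil


def copy_missing_files(src_dir, lucene_pkg_dir):
    """Copy the two files that 'make install' with root= leaves behind.

    src_dir and lucene_pkg_dir are hypothetical example paths: the
    PyLucene source tree and the lucene/ directory under the alternate
    root's site-packages.
    """
    for name in ("_lucene.py", "native_libs.txt"):
        shutil.copy2(os.path.join(src_dir, name),
                     os.path.join(lucene_pkg_dir, name))


def add_pth_entry(pth_text, egg_dir):
    # Equivalent of the doinst.sh sed step: make sure easy-install.pth
    # lists the installed egg directory so 'import lucene' works.
    lines = pth_text.splitlines()
    if egg_dir not in lines:
        lines.append(egg_dir)
    return "\n".join(lines) + "\n"
```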

In case it helps anyone I put the various build and
script files in Dropbox, I’ll hopefully be submitting 
them to slackbuilds.org sometime soon.

https://www.dropbox.com/sh/zsbw4uuhva2vy5h/AADKCmKyF-oOhKNTPiwppYELa

I think the reason it is so hard to package PyLucene
is that it is exceedingly free software, and doesn’t
like to be locked up inside of a package :).

Regards,
—Ed


On May 27, 2014, at 18:53, Andi Vajda va...@apache.org wrote:

 
 On Tue, 27 May 2014, Eduard Rozenberg wrote:
 
 Hello folks,
 
 I'm working on packaging PyLucene into a Slackware
 package by using a setup.cfg in the source directory
 and redirecting the installation root to
 /tmp/pylucene_installdir.
 
 I noticed that a couple of files are missing when doing
 this alternate root install compared to the regular install.
 
 ## setup.cfg ###
 
 [easy_install]
 
 [build]
 
 [install]
 root = /tmp/pylucene_installdir
 compile = False
 force = True
 single-version-externally-managed = True
 
 ##
 
 I noticed that two files are missing when I do this root=
 install compared to the regular install to /usr/lib…/
 Are these two files below not necessary when
 packaging PyLucene for distribution?
 
 Missing: native_libs.txt
 
 I don't know what this file, native_libs.txt, is for. Maybe a setuptools 
 artifact?
 
 --
 Contains:
 lucene/_lucene.so
 
 
 Missing: _lucene.py
 
 Yes, that one you need.
 Did you try running pylucene tests without it ?
 
 Andi..
 
 --
 Contains:
 def __bootstrap__():
   global __bootstrap__, __loader__, __file__
   import sys, pkg_resources, imp
   __file__ = pkg_resources.resource_filename(__name__, '_lucene.so')
   __loader__ = None; del __bootstrap__, __loader__
   imp.load_dynamic(__name__,__file__)
 __bootstrap__()
 
 
 Thanks in advance!
 
 Regards,
 —Ed



PyLucene 4.8.0 - samples/FacetExample.py appears broken

2014-05-30 Thread Eduard Rozenberg
Hello,

I’m getting an error running this sample. I believe it is related to an old 
issue of the Facet API having changed:
http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201302.mbox/%3calpine.OSX.2.01.1302101349450.6538@yuzu.local%3e

$ python FacetExample.py 

Traceback (most recent call last):
  File "FacetExample.py", line 51, in <module>
    from org.apache.lucene.facet.index import FacetFields
ImportError: No module named index

Regards,
—Ed

Re: PyLucene 4.8.0 - samples/FacetExample.py appears broken

2014-05-30 Thread Thomas Koch
Hello Ed,
yes this is a 'known issue' - I contributed the FacetExample based on PyLucene 
3.6.x. Meanwhile the Lucene Facet API has changed, and I shall adapt the example. 

regards,
Thomas 
--
Am 30.05.2014 um 18:09 schrieb Eduard Rozenberg edua...@pobox.com:

 Hello,
 
 I’m getting an error running this sample. I believe it is related to an old 
 issue of the Facet API having changed:
 http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201302.mbox/%3calpine.OSX.2.01.1302101349450.6538@yuzu.local%3e
 
 $ python FacetExample.py 
 
 Traceback (most recent call last):
  File "FacetExample.py", line 51, in <module>
    from org.apache.lucene.facet.index import FacetFields
 ImportError: No module named index
 
 Regards,
 —Ed



Re: PyLucene 4.8.0 - samples/FacetExample.py appears broken

2014-05-30 Thread Andi Vajda

 On May 30, 2014, at 10:58, Thomas Koch k...@orbiteam.de wrote:
 
 Hello Ed,
 yes this is a 'known issue' - I contributed the FacetExample based on 
 PyLucene 3.6.x. Meanwhile the Lucene Facet API has changed, and I shall adapt the 
 example. 
 
 regards,

Thanks Thomas !
Once you do that maybe it's time to make a 4.8.1 release too.
Let me know when you're ready with a fix.

Andi..

 Thomas 
 --
 Am 30.05.2014 um 18:09 schrieb Eduard Rozenberg edua...@pobox.com:
 
 Hello,
 
 I’m getting an error running this sample. I believe it is related to an old 
 issue of the Facet API having changed:
 http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201302.mbox/%3calpine.OSX.2.01.1302101349450.6538@yuzu.local%3e
 
 $ python FacetExample.py 
 
 Traceback (most recent call last):
  File "FacetExample.py", line 51, in <module>
    from org.apache.lucene.facet.index import FacetFields
 ImportError: No module named index
 
 Regards,
 —Ed
 


[jira] [Updated] (SOLR-6119) TestReplicationHandler attempts to remove open folders

2014-05-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-6119:
--

Attachment: SOLR-6119.patch

Varun, your patch is fine but it doesn't address the original problem. I've 
added a try/finally block around rm because the backup folder cannot be removed 
(it's still open on Windows). I attach an updated patch that removes this block 
-- if you run it now, you should be seeing the same issue (you may need a 
Windows machine for this, though).

{code}
[08:00:08.376] ERROR   3.78s | TestReplicationHandler.doTestBackup 
Throwable #1: java.io.IOException: Could not remove the following files 
(in the order of attempts):
   
C:\Work\lucene-solr-svn\trunk\solr\build\solr-core\test\J0\.\temp\solr.handler.TestReplicationHandler-B751491BC59B33CA-001\solr-instance-001\collection1\data\snapshot.eyxtuk

   at 
__randomizedtesting.SeedInfo.seed([B751491BC59B33CA:F6DA697EE225C085]:0)
   at org.apache.lucene.util.TestUtil.rm(TestUtil.java:118)
   at 
org.apache.solr.handler.TestReplicationHandler.doTestBackup(TestReplicationHandler.java:1559)
...
{code}

 TestReplicationHandler attempts to remove open folders
 --

 Key: SOLR-6119
 URL: https://issues.apache.org/jira/browse/SOLR-6119
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor
 Attachments: SOLR-6119.patch, SOLR-6119.patch, SOLR-6119.patch


 TestReplicationHandler has a weird logic around the 'snapDir' variable. It 
 attempts to remove snapshot folders, even though they're not closed yet. My 
 recent patch uncovered the bug but I don't know how to fix it cleanly -- the 
 test itself seems to be very fragile (for example I don't understand the 
 'namedBackup' variable which is always set to true, yet there are 
 conditionals around it).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6119) TestReplicationHandler attempts to remove open folders

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013348#comment-14013348
 ] 

Dawid Weiss commented on SOLR-6119:
---

The problem is in the test, here:
{code}
  BackupThread deleteBackupThread = new BackupThread(backupNames[i],
      ReplicationHandler.CMD_DELETE_BACKUP);
  deleteBackupThread.start();
  int waitCnt = 0;
  CheckDeleteBackupStatus checkDeleteBackupStatus = new CheckDeleteBackupStatus();
  while (true) {
    checkDeleteBackupStatus.fetchStatus();
{code}

You start the backup threads but never wait for the backup to finish before 
checking the delete status. There is a race condition in there -- either the 
check for backup status should only return true after the backup files are 
removed, or the wait for the backup itself should be done in an alternative way.

If you add a log to backup (before/after) and to the finally block in the 
test, the wrong interleaving is:
{code}
4752 T60 oash.SnapShooter.deleteNamedSnapshot Deleting snapshot: eyxtuk
4752 T12 oash.TestReplicationHandler.doTestBackup  -- DELETING (finally 
block in the test)
4754 T60 oash.SnapPuller.delTree WARN Unable to delete file : 
C:\Users\dweiss\AppData\Local\Temp\solr.handler.TestReplicationHandler-B751491BC59B33CA-005\solr-instance-001\collection1\data\snapshot.eyxtuk\_0.cfs
4754 T60 oash.SnapShooter.deleteNamedSnapshot WARN Unable to delete snapshot: 
eyxtuk
4754 T60 oash.SnapShooter.deleteNamedSnapshot Deleting snapshot: eyxtuk (DONE)
{code}

So the test never waits for the snapshooter.deleteNamedSnapshot to finish.
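The interleaving above can be reproduced with a toy version of the pattern. This is a Python sketch, not the actual Solr test (which is Java); the function names are illustrative stand-ins. The fix is simply to wait for the delete worker before the finally-block cleanup runs:

```python
import threading


def delete_snapshot(done, deleted):
    # Stand-in for SnapShooter.deleteNamedSnapshot running on its own thread:
    # do the deletion, then signal completion.
    deleted.append("snapshot.eyxtuk")
    done.set()


def run_test_like_do_test_backup():
    deleted = []
    done = threading.Event()
    worker = threading.Thread(target=delete_snapshot, args=(done, deleted))
    worker.start()
    # The buggy test reaches its finally-block cleanup at this point without
    # waiting, racing the worker. Waiting on the completion event (or simply
    # worker.join()) removes the race.
    assert done.wait(timeout=5)
    worker.join()
    return deleted
```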

 TestReplicationHandler attempts to remove open folders
 --

 Key: SOLR-6119
 URL: https://issues.apache.org/jira/browse/SOLR-6119
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor
 Attachments: SOLR-6119.patch, SOLR-6119.patch, SOLR-6119.patch


 TestReplicationHandler has a weird logic around the 'snapDir' variable. It 
 attempts to remove snapshot folders, even though they're not closed yet. My 
 recent patch uncovered the bug but I don't know how to fix it cleanly -- the 
 test itself seems to be very fragile (for example I don't understand the 
 'namedBackup' variable which is always set to true, yet there are 
 conditionals around it).






[jira] [Commented] (SOLR-6117) Replication command=fetchindex always return success.

2014-05-30 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013351#comment-14013351
 ] 

Shalin Shekhar Mangar commented on SOLR-6117:
-

bq. The actual error stack trace gets printed in the logs. Should we change how 
the remaining also get handled?

I think an exception should be returned in the response if possible. The old 
days of just logging exceptions are gone. We can't expect users to sift through 
GBs of logs in SolrCloud to find the reason behind the failure. But that's a 
big change so I think we should do it in another issue.
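The suggested behaviour, reporting the failure in the response body rather than only in the logs, looks roughly like this sketch (plain Python, with a hypothetical fetch callable standing in for the actual index-fetch call):

```python
def handle_fetchindex(fetch_fn):
    # Run the (hypothetical) index fetch and report its outcome in the
    # response itself, instead of returning success unconditionally and
    # burying the stack trace in the logs.
    response = {}
    try:
        fetch_fn()
        response["status"] = "OK"
    except Exception as exc:
        response["status"] = "ERROR"
        response["exception"] = str(exc)
    return response
```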

 Replication command=fetchindex always return success.
 -

 Key: SOLR-6117
 URL: https://issues.apache.org/jira/browse/SOLR-6117
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.6
Reporter: Raintung Li
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6117.patch, SOLR-6117.patch, SOLR-6117.txt


 The Replication API command=fetchindex does fetch the index, but when an error 
 occurs it still gives a success response. 
 The API should return the right status, especially when the WAIT parameter is 
 true (synchronous).






[jira] [Commented] (SOLR-6120) zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty

2014-05-30 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013357#comment-14013357
 ] 

Shalin Shekhar Mangar commented on SOLR-6120:
-

Thanks Shawn. The Windows patch doesn't unzip the solr.war file, but it prints 
out that you need to do that yourself before zkcli.bat can work.

 zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty
 ---

 Key: SOLR-6120
 URL: https://issues.apache.org/jira/browse/SOLR-6120
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1
Reporter: sebastian badea
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6120-windows.patch, SOLR-6120.patch


 When calling /solr-4.8.1/example/scripts/cloud-scripts/zkcli.sh, the 
 org.apache.solr.cloud.ZkCLI class is not found. 
 The cause is that /opt/solr-4.8.1/example/solr-webapp is empty.






[jira] [Commented] (LUCENE-5700) Add 'accountable' interface for various ramBytesUsed

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013368#comment-14013368
 ] 

ASF subversion and git services commented on LUCENE-5700:
-

Commit 1598470 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1598470 ]

LUCENE-5700: Add oal.util.Accountable and make all classes that can compute 
their memory usage implement it.
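In Python terms, the unified accounting interface amounts to something like the sketch below. The real Accountable is a Java interface in oal.util; the CachedFilter implementer here is a hypothetical example, not a Lucene class:

```python
import abc


class Accountable(abc.ABC):
    # Sketch of oal.util.Accountable: one consistent entry point for
    # memory accounting instead of ad-hoc ramBytesUsed()/sizeInBytes()
    # methods scattered across classes.
    @abc.abstractmethod
    def ram_bytes_used(self):
        """Return the memory usage of this object in bytes."""


class CachedFilter(Accountable):
    # Hypothetical implementer, for illustration only: reports the size
    # of the bitset it holds.
    def __init__(self, bits):
        self.bits = bits

    def ram_bytes_used(self):
        return len(self.bits)
```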

 Add 'accountable' interface for various ramBytesUsed
 

 Key: LUCENE-5700
 URL: https://issues.apache.org/jira/browse/LUCENE-5700
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5700.patch


 Currently this is a disaster. There is ramBytesUsed(), sizeInBytes(), etc., 
 everywhere, with zero consistency, little javadocs, and no structure. For 
 example, look at LUCENE-5695, where we go back and forth on how to handle 
 "don't know". 
 I don't think we should add any more of these methods to any classes in 
 lucene until this has been cleaned up.






[jira] [Updated] (SOLR-6119) TestReplicationHandler attempts to remove open folders

2014-05-30 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-6119:


Attachment: SOLR-6119.patch

Hi Dawid,

Thanks for tracking that down. Yes that does seem to be the problem.

Looks like we should not be deleting the directories in the finally block if it 
is a named snapshot; we delete it as part of the test. Attached a patch which 
changes that.

All of this is looking very fragile, and looking at SOLR-6117 it does look like 
ReplicationHandler needs an overhaul.

 TestReplicationHandler attempts to remove open folders
 --

 Key: SOLR-6119
 URL: https://issues.apache.org/jira/browse/SOLR-6119
 Project: Solr
  Issue Type: Bug
Reporter: Dawid Weiss
Priority: Minor
 Attachments: SOLR-6119.patch, SOLR-6119.patch, SOLR-6119.patch, 
 SOLR-6119.patch


 TestReplicationHandler has a weird logic around the 'snapDir' variable. It 
 attempts to remove snapshot folders, even though they're not closed yet. My 
 recent patch uncovered the bug but I don't know how to fix it cleanly -- the 
 test itself seems to be very fragile (for example I don't understand the 
 'namedBackup' variable which is always set to true, yet there are 
 conditionals around it).






[jira] [Resolved] (LUCENE-5700) Add 'accountable' interface for various ramBytesUsed

2014-05-30 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5700.
--

   Resolution: Fixed
Fix Version/s: 5.0
   4.9
 Assignee: Adrien Grand

Committed. Thanks Robert and Dawid for the feedback!

 Add 'accountable' interface for various ramBytesUsed
 

 Key: LUCENE-5700
 URL: https://issues.apache.org/jira/browse/LUCENE-5700
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Adrien Grand
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5700.patch


 Currently this is a disaster. There is ramBytesUsed(), sizeInBytes(), etc., 
 everywhere, with zero consistency, little javadocs, and no structure. For 
 example, look at LUCENE-5695, where we go back and forth on how to handle 
 "don't know". 
 I don't think we should add any more of these methods to any classes in 
 lucene until this has been cleaned up.






[jira] [Commented] (LUCENE-5700) Add 'accountable' interface for various ramBytesUsed

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013387#comment-14013387
 ] 

ASF subversion and git services commented on LUCENE-5700:
-

Commit 1598479 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598479 ]

LUCENE-5700: Add oal.util.Accountable and make all classes that can compute 
their memory usage implement it.

 Add 'accountable' interface for various ramBytesUsed
 

 Key: LUCENE-5700
 URL: https://issues.apache.org/jira/browse/LUCENE-5700
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Adrien Grand
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5700.patch


 Currently this is a disaster. There is ramBytesUsed(), sizeInBytes(), etc., 
 everywhere, with zero consistency, little javadocs, and no structure. For 
 example, look at LUCENE-5695, where we go back and forth on how to handle 
 "don't know". 
 I don't think we should add any more of these methods to any classes in 
 lucene until this has been cleaned up.






[jira] [Updated] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5708:
---

Attachment: LUCENE-5708.patch

New patch, I think it's ready; I'll commit soon...

bq. It looks to me that we should be able to make some fields final now that we 
don't have a clone method anymore

+1, I fixed a few of these, and found a couple more implements Cloneable to 
remove.

bq. (eg. MergePolicy.writer)

Looks like we'll remove IW as a field in MP/MS with LUCENE-5711.
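The "no sharing" contract that replaces clone() can be sketched as follows (a Python illustration of the idea, not Lucene's actual Java implementation): a writer claims its config once, and a second writer constructed with the same config fails fast.

```python
class IndexWriterConfig:
    # Sketch only: the real IWC is a Java class. The single flag below
    # models "this config already belongs to a writer".
    def __init__(self):
        self._in_use = False


class IndexWriter:
    def __init__(self, config):
        # Instead of cloning the config, refuse to share it: each writer
        # must get a freshly created IndexWriterConfig.
        if config._in_use:
            raise ValueError(
                "IndexWriterConfig is already in use by another IndexWriter; "
                "create a new config instead of sharing one")
        config._in_use = True
        self.config = config
```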


 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5701) Move core closed listeners to AtomicReader

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013404#comment-14013404
 ] 

ASF subversion and git services commented on LUCENE-5701:
-

Commit 1598487 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598487 ]

LUCENE-5701: Move core closed listeners to AtomicReader.

 Move core closed listeners to AtomicReader
 --

 Key: LUCENE-5701
 URL: https://issues.apache.org/jira/browse/LUCENE-5701
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5701.patch, LUCENE-5701.patch


 Core listeners are very helpful when managing per-segment caches (filters, 
 uninverted doc values, etc.) yet this API is only exposed on 
 {{SegmentReader}}. If you want to use it today, you need to do instanceof 
 checks, try to unwrap in case of a FilterAtomicReader and finally fall back 
 to a reader closed listener if every other attempt to get the underlying 
 SegmentReader failed.
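The instanceof-and-unwrap dance the issue describes looks like this sketch (stand-in Python classes; in Java the wrapped reader lives in FilterAtomicReader's 'in' field):

```python
class SegmentReader:
    """Stand-in for the only class that exposed core closed listeners."""


class FilterAtomicReader:
    def __init__(self, inner):
        self.inner = inner  # mirrors FilterAtomicReader's 'in' field


def find_segment_reader(reader):
    # Peel FilterAtomicReader wrappers until a SegmentReader (or nothing)
    # is found; callers that get None fall back to a reader closed listener.
    while isinstance(reader, FilterAtomicReader):
        reader = reader.inner
    return reader if isinstance(reader, SegmentReader) else None
```

Moving the listener API up to AtomicReader makes this unwrapping unnecessary.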






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013408#comment-14013408
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598489 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1598489 ]

LUCENE-5708: remove IWC.clone

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013412#comment-14013412
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598492 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598492 ]

LUCENE-5708: remove IWC.clone

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Resolved] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5708.


Resolution: Fixed

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5688) NumericDocValues fields with sparse data can be compressed better

2014-05-30 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013414#comment-14013414
 ] 

Varun Thacker commented on LUCENE-5688:
---

Hi Shai,

Thanks for reviewing.

bq. Perhaps you can also experiment with a tiny hash-map, using plain 
int[]+long[] or a pair of packed arrays, instead of the binary search tree. I 
am writing one now because I am experimenting with improvements to updatable 
DocValues. It's based on Solr's HashDocSet which I modify to act as an 
int-to-long map. I can share the code here if you want

Sure, this approach also looks promising: faster access vs. more memory. Perhaps 
we could provide both options in the same codec.

bq. Another thing, maybe this codec should wrap another and delegate to in case 
the number of docs-with-values exceeds some threshold? For instance, ignoring 
packing, the default DV encodes 8 bytes per document, while this codec encodes 
12 bytes (doc+value) per document which has a value. So I'm thinking that 
unless the field is really sparse, we might prefer the default encoding. We 
should fold that as well into the benchmark.

I thought about it, but since we are writing a codec dedicated to sparse values 
and not adding it as an optimization for the default codec I did not include it 
in my patch. If you feel that we should then I will add it.

A couple of other general doubts that I had:
- Currently only addNumericField is implemented. Looking at the 
Lucene45DocValuesConsumer - addBinaryField does not write missing value so the 
same code can be reused?
- For addSortedField, addSortedSetField the only method which needs to be 
changed would be addTermsDict?
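The core idea of the sparse codec, storing only (docID, value) pairs and binary-searching on lookup, trading roughly 12 bytes per present value against 8 bytes for every document in the dense encoding, can be sketched as follows (plain Python, ignoring packing; not the actual patch):

```python
import bisect


class SparseNumericValues:
    # Store only documents that actually have a value, as parallel sorted
    # arrays; get() binary-searches the doc ids. This is the structure the
    # binary-search-tree vs. tiny-hash-map discussion above is about.
    def __init__(self, doc_value_pairs, missing=0):
        pairs = sorted(doc_value_pairs)
        self.docs = [d for d, _ in pairs]
        self.values = [v for _, v in pairs]
        self.missing = missing

    def get(self, doc_id):
        i = bisect.bisect_left(self.docs, doc_id)
        if i < len(self.docs) and self.docs[i] == doc_id:
            return self.values[i]
        return self.missing
```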

 NumericDocValues fields with sparse data can be compressed better 
 --

 Key: LUCENE-5688
 URL: https://issues.apache.org/jira/browse/LUCENE-5688
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Varun Thacker
Priority: Minor
 Attachments: LUCENE-5688.patch, LUCENE-5688.patch


 I ran into this problem where I had a dynamic field in Solr and indexed data 
 into lots of fields. For each field only a few documents had actual values, 
 and for the remaining documents the default value (0) got indexed. Now when I 
 merge segments, the index size jumps up.
 For example, I have 10 segments, each with 1 DV field. When I merge segments 
 into 1, that segment will contain all 10 DV fields with lots of 0s. 
 This was the motivation behind trying to come up with a compression for a use 
 case like this.






[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013418#comment-14013418
 ] 

Michael McCandless commented on LUCENE-5645:


Maybe we could simply use prop.hashCode() entirely, instead of trying to 
parseInt base 16? It's not important that the value we use here matches the 
value that randomized testing extracted from the seed, just that it's the same 
for the same seed.
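The suggestion amounts to: keep the default when the property is empty, and otherwise hash the string instead of parsing it as base-16. A Python sketch, with crc32 standing in for Java's String.hashCode (that substitution is an assumption of this illustration, chosen because Python's built-in str hash is randomized per process):

```python
import zlib


def seed_from_property(prop):
    # Empty or unset tests.seed: keep the default seed (the missing
    # empty-string check this issue is about). Any non-empty value is
    # hashed, so odd seed strings can never raise a parse error; it only
    # matters that the same string always yields the same number.
    if not prop:
        return None  # caller keeps its default GOOD_FAST_HASH_SEED
    return zlib.crc32(prop.encode("utf-8"))
```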

 StringHelper should check for empty string of tests.seed system property
 --

 Key: LUCENE-5645
 URL: https://issues.apache.org/jira/browse/LUCENE-5645
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.8, 5.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8.1, 5.0

 Attachments: LUCENE-5645_StringHelper_empty_tests_seed_condition.patch


 As of LUCENE-5604 (committed to v4.8), StringHelper will initialize 
 GOOD_FAST_HASH_SEED based on the system property tests.seed if it is set.  
 Unfortunately it doesn't do an empty-string check, and it's common, at least 
 in my setup that copies Lucene's maven pom.xml, for the string to be empty 
 unless I set it on the command line.  FWIW Randomized Testing does do an 
 empty-string check.






[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 22825 - Failure!

2014-05-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/22825/

2 tests failed.
REGRESSION:  
org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat2.testDFBlockSizeMultiple

Error Message:
background merge hit exception: _5(4.9):c613 _3(4.9):c427 _2(4.9):C275 
_1(4.9):C216 _7(4.9):C213 _8(4.9):c119 _0(4.9):C109 _6(4.9):c47 _4(4.9):C29 
into _9 [maxNumSegments=1]

Stack Trace:
java.io.IOException: background merge hit exception: _5(4.9):c613 _3(4.9):c427 
_2(4.9):C275 _1(4.9):C216 _7(4.9):C213 _8(4.9):c119 _0(4.9):C109 _6(4.9):c47 
_4(4.9):C29 into _9 [maxNumSegments=1]
at 
__randomizedtesting.SeedInfo.seed([8A08CE34F9EF0C2A:4DBD8BCB0CDC9E3F]:0)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1825)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1761)
at 
org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat2.tearDown(TestBlockPostingsFormat2.java:60)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:885)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException: this codec cannot index 
offsets
at 
org.apache.lucene.codecs.lucene3x.PreFlexRWFieldsWriter.addField(PreFlexRWFieldsWriter.java:89)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:71)
at 
org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
 

Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 22825 - Failure!

2014-05-30 Thread Michael McCandless
I'll fix.

Mike McCandless

http://blog.mikemccandless.com


On Fri, May 30, 2014 at 4:35 AM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/22825/

 2 tests failed.
 REGRESSION:  
 org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat2.testDFBlockSizeMultiple

 Error Message:
 background merge hit exception: _5(4.9):c613 _3(4.9):c427 _2(4.9):C275 
 _1(4.9):C216 _7(4.9):C213 _8(4.9):c119 _0(4.9):C109 _6(4.9):c47 _4(4.9):C29 
 into _9 [maxNumSegments=1]

 Stack Trace:
 java.io.IOException: background merge hit exception: _5(4.9):c613 
 _3(4.9):c427 _2(4.9):C275 _1(4.9):C216 _7(4.9):C213 _8(4.9):c119 _0(4.9):C109 
 _6(4.9):c47 _4(4.9):C29 into _9 [maxNumSegments=1]
 at 
 __randomizedtesting.SeedInfo.seed([8A08CE34F9EF0C2A:4DBD8BCB0CDC9E3F]:0)
 at 
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1825)
 at 
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1761)
 at 
 org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat2.tearDown(TestBlockPostingsFormat2.java:60)
 at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:885)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.UnsupportedOperationException: this codec cannot index 
 offsets
 at 
 

[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013424#comment-14013424
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598496 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598496 ]

LUCENE-5708: fix test bug

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.
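The "new config per writer" rule described above can be sketched in plain Java. The Config and Writer classes below are hypothetical stand-ins, not Lucene's actual API: the config simply records when a writer claims it and fails fast if a second writer tries to reuse it, which is the behavior the issue proposes instead of cloning.

```java
// Hypothetical Config/Writer classes (NOT Lucene's real API): a minimal
// illustration of the "one config per writer" rule proposed in the issue.
final class Config {
    private boolean inUse = false;

    // Called by a writer's constructor; fails fast on any second writer.
    synchronized void claim() {
        if (inUse) {
            throw new IllegalStateException(
                "do not share a Config across writers; create a new one");
        }
        inUse = true;
    }
}

final class Writer {
    Writer(Config config) {
        config.claim();
    }
}

public class OneConfigPerWriter {
    public static void main(String[] args) {
        Config cfg = new Config();
        new Writer(cfg);           // first writer: accepted
        try {
            new Writer(cfg);       // second writer: rejected
        } catch (IllegalStateException expected) {
            System.out.println("reuse rejected: " + expected.getMessage());
        }
        new Writer(new Config());  // a fresh config per writer always works
    }
}
```

Failing fast at construction time is what makes the shared-state bugs in merge policy/scheduler and the DW thread pool impossible by design, rather than patched over by cloning.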



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013426#comment-14013426
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598497 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1598497 ]

LUCENE-5708: fix test bug

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013428#comment-14013428
 ] 

Robert Muir commented on LUCENE-5645:
-

I'm still confused: since when was the empty string a supported value for 
'tests.seed'? I don't think it should be: it's unnecessary and just makes the 
test framework complicated. I'm not sure it will even really work.

 It's sad that StringHelper gets the blame here; LuceneTestCase should have 
thrown an exception earlier.



 StringHelper should check for empty string of tests.seed system property
 --

 Key: LUCENE-5645
 URL: https://issues.apache.org/jira/browse/LUCENE-5645
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.8, 5.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8.1, 5.0

 Attachments: LUCENE-5645_StringHelper_empty_tests_seed_condition.patch


 As of LUCENE-5604 (committed to v4.8), StringHelper will initialize 
 GOOD_FAST_HASH_SEED based on the system property tests.seed if it is set. 
 Unfortunately it doesn't do an empty-string check, and it's common, at least 
 in my setup (which copies Lucene's maven pom.xml), for the string to be empty 
 unless I set it on the command line. FWIW, Randomized Testing does do an 
 empty-string check.






[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013429#comment-14013429
 ] 

Dawid Weiss commented on LUCENE-5645:
-

 Why is randomized testing lenient?

It parses an empty value of the property because it's more convenient sometimes 
to pass it from ant as empty (rather than not defining it at all).

The start = (prop == null ? 0 : prop.hashCode()) expression seems like a neat 
trick for seeding the hash random, though.
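The lenient handling Dawid describes can be sketched in plain Java. The helper names below are hypothetical, not the actual randomizedtesting code: null and the empty string both fall back to a random seed, while the quoted hashCode() expression derives a deterministic start value from whatever string was supplied.

```java
// Plain-Java sketch (hypothetical helpers, not the real test framework):
// a missing or empty tests.seed means "pick a random seed"; anything else
// is parsed as hex, like the seeds printed in the stack traces above.
import java.util.Random;

public class SeedParsing {
    static long resolveSeed(String prop) {
        if (prop == null || prop.isEmpty()) {
            // treat an empty property exactly like an absent one
            return new Random().nextLong();
        }
        // seeds are hex strings, e.g. 8A08CE34F9EF0C2A; parseUnsignedLong
        // accepts values with the high bit set
        return Long.parseUnsignedLong(prop, 16);
    }

    // the trick quoted in the comment above: a deterministic hash-seed start
    static int hashSeedStart(String prop) {
        return prop == null ? 0 : prop.hashCode();
    }

    public static void main(String[] args) {
        resolveSeed("");  // no crash: falls back to a random seed
        System.out.println(Long.toHexString(resolveSeed("8A08CE34F9EF0C2A")));
        System.out.println(hashSeedStart(null));
    }
}
```

This is the behavior the framework wants so that ant can always pass the property, even when unset; a strict parser would have to distinguish "defined but empty" from "undefined".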

 StringHelper should check for empty string of tests.seed system property
 --

 Key: LUCENE-5645
 URL: https://issues.apache.org/jira/browse/LUCENE-5645
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.8, 5.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8.1, 5.0

 Attachments: LUCENE-5645_StringHelper_empty_tests_seed_condition.patch


 As of LUCENE-5604 (committed to v4.8), StringHelper will initialize 
 GOOD_FAST_HASH_SEED based on the system property tests.seed if it is set. 
 Unfortunately it doesn't do an empty-string check, and it's common, at least 
 in my setup (which copies Lucene's maven pom.xml), for the string to be empty 
 unless I set it on the command line. FWIW, Randomized Testing does do an 
 empty-string check.






[jira] [Updated] (LUCENE-5695) Add DocIdSet.ramBytesUsed

2014-05-30 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5695:
-

Attachment: LUCENE-5695.patch

I tried to expose this method only on doc id sets that can be cached, by 
introducing a new CacheableDocIdSet that would implement Accountable while 
DocIdSet would not, but that doesn't play nicely with filtering 
(FilteredDocIdSet)...

The attached patch uses the same approach as the previous one, except that it 
makes DocIdSet implement Accountable instead of giving it its own ramBytesUsed 
method.
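The approach can be sketched with stand-in types. None of the classes below are Lucene's real DocIdSet, Accountable, or filter cache; they only illustrate why a ramBytesUsed() on the set's base type enables a bounded cache that evicts by measured size.

```java
// Stand-in types (NOT Lucene's real classes) illustrating the patch's idea:
// the doc-id-set base type implements an Accountable interface, so a cache
// can sum ramBytesUsed() across entries and evict once over budget.
import java.util.ArrayDeque;
import java.util.Deque;

interface Accountable {
    long ramBytesUsed();
}

abstract class DocIdSet implements Accountable {}

final class FixedBitDocIdSet extends DocIdSet {
    private final long[] bits;
    FixedBitDocIdSet(int maxDoc) { bits = new long[(maxDoc + 63) / 64]; }
    @Override public long ramBytesUsed() {
        return 16 /* rough object header estimate */ + 8L * bits.length;
    }
}

// Toy FIFO cache that evicts oldest sets once the summed sizes exceed a budget.
final class BoundedFilterCache {
    private final Deque<DocIdSet> entries = new ArrayDeque<>();
    private final long maxBytes;
    private long usedBytes;
    BoundedFilterCache(long maxBytes) { this.maxBytes = maxBytes; }
    void put(DocIdSet set) {
        entries.addLast(set);
        usedBytes += set.ramBytesUsed();
        while (usedBytes > maxBytes) {
            usedBytes -= entries.removeFirst().ramBytesUsed();
        }
    }
    int size() { return entries.size(); }
}

public class RamBytesUsedDemo {
    public static void main(String[] args) {
        BoundedFilterCache cache = new BoundedFilterCache(1024);
        for (int i = 0; i < 10; i++) {
            cache.put(new FixedBitDocIdSet(1000)); // 144 bytes per set here
        }
        System.out.println("cached sets: " + cache.size());
    }
}
```

The design point of the patch is exactly this: with the accounting method on DocIdSet itself, wrappers like FilteredDocIdSet inherit it for free instead of needing a parallel CacheableDocIdSet hierarchy.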

 Add DocIdSet.ramBytesUsed
 -

 Key: LUCENE-5695
 URL: https://issues.apache.org/jira/browse/LUCENE-5695
 Project: Lucene - Core
  Issue Type: New Feature
Affects Versions: 4.9, 5.0
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5695.patch, LUCENE-5695.patch, LUCENE-5695.patch


 LUCENE-5463 tried to remove calls to {{RamUsageEstimator.sizeOf(Object)}} yet 
 it was not always possible to remove the call when there was no other API to 
 compute the memory usage of a particular class. In particular, this is the 
 case for {{CachingWrapperFilter.sizeInBytes()}} that needs to be able to get 
 the memory usage of any cacheable {{DocIdSet}} instance.
 We could add {{DocIdSet.ramBytesUsed}} in order to remove the need for 
 {{RamUsageEstimator}}. This will also help support bounded filter caches that 
 take the size of the cached doc id sets into account when doing evictions.






[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 22826 - Still Failing!

2014-05-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/22826/

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3

Error Message:
Captured an uncaught exception in thread: Thread[id=327, name=Lucene Merge 
Thread #0, state=RUNNABLE, group=TGRP-TestBlockPostingsFormat3]

Stack Trace:
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught 
exception in thread: Thread[id=327, name=Lucene Merge Thread #0, 
state=RUNNABLE, group=TGRP-TestBlockPostingsFormat3]
Caused by: org.apache.lucene.index.MergePolicy$MergeException: 
java.lang.UnsupportedOperationException: this codec cannot index offsets
at __randomizedtesting.SeedInfo.seed([9D5BC98E888A5AB4]:0)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.UnsupportedOperationException: this codec cannot index 
offsets
at 
org.apache.lucene.codecs.sep.SepPostingsWriter.setField(SepPostingsWriter.java:196)
at 
org.apache.lucene.codecs.blockterms.BlockTermsWriter$TermsWriter.init(BlockTermsWriter.java:205)
at 
org.apache.lucene.codecs.blockterms.BlockTermsWriter.addField(BlockTermsWriter.java:136)
at 
org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:148)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:71)
at 
org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:112)
at 
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4149)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3745)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
at 
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)


REGRESSION:  org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3.test

Error Message:
background merge hit exception: _e(4.9):C1337 _l(4.9):C265 _o(4.9):C250 
_p(4.9):c171 _m(4.9):c129 _j(4.9):c100 _i(4.9):c87 _n(4.9):c92 _k(4.9):c54 
_f(4.9):c25 _g(4.9):c25 _h(4.9):c25 into _q [maxNumSegments=1]

Stack Trace:
java.io.IOException: background merge hit exception: _e(4.9):C1337 _l(4.9):C265 
_o(4.9):C250 _p(4.9):c171 _m(4.9):c129 _j(4.9):c100 _i(4.9):c87 _n(4.9):c92 
_k(4.9):c54 _f(4.9):c25 _g(4.9):c25 _h(4.9):c25 into _q [maxNumSegments=1]
at 
__randomizedtesting.SeedInfo.seed([9D5BC98E888A5AB4:150FF6542676374C]:0)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1825)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1761)
at 
org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3.test(TestBlockPostingsFormat3.java:144)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 

Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 22826 - Still Failing!

2014-05-30 Thread Michael McCandless
Flonkings the taskmaster ... I'll fix.  I sense a pattern...

Mike McCandless

http://blog.mikemccandless.com


On Fri, May 30, 2014 at 4:51 AM,  buil...@flonkings.com wrote:
 Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/22826/

 2 tests failed.
 FAILED:  
 junit.framework.TestSuite.org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3

 Error Message:
 Captured an uncaught exception in thread: Thread[id=327, name=Lucene Merge 
 Thread #0, state=RUNNABLE, group=TGRP-TestBlockPostingsFormat3]

 Stack Trace:
 com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
 uncaught exception in thread: Thread[id=327, name=Lucene Merge Thread #0, 
 state=RUNNABLE, group=TGRP-TestBlockPostingsFormat3]
 Caused by: org.apache.lucene.index.MergePolicy$MergeException: 
 java.lang.UnsupportedOperationException: this codec cannot index offsets
 at __randomizedtesting.SeedInfo.seed([9D5BC98E888A5AB4]:0)
 at 
 org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
 at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
 Caused by: java.lang.UnsupportedOperationException: this codec cannot index 
 offsets
 at 
 org.apache.lucene.codecs.sep.SepPostingsWriter.setField(SepPostingsWriter.java:196)
 at 
 org.apache.lucene.codecs.blockterms.BlockTermsWriter$TermsWriter.init(BlockTermsWriter.java:205)
 at 
 org.apache.lucene.codecs.blockterms.BlockTermsWriter.addField(BlockTermsWriter.java:136)
 at 
 org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.addField(PerFieldPostingsFormat.java:148)
 at 
 org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:71)
 at 
 org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
 at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:112)
 at 
 org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4149)
 at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3745)
 at 
 org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
 at 
 org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)


 REGRESSION:  org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3.test

 Error Message:
 background merge hit exception: _e(4.9):C1337 _l(4.9):C265 _o(4.9):C250 
 _p(4.9):c171 _m(4.9):c129 _j(4.9):c100 _i(4.9):c87 _n(4.9):c92 _k(4.9):c54 
 _f(4.9):c25 _g(4.9):c25 _h(4.9):c25 into _q [maxNumSegments=1]

 Stack Trace:
 java.io.IOException: background merge hit exception: _e(4.9):C1337 
 _l(4.9):C265 _o(4.9):C250 _p(4.9):c171 _m(4.9):c129 _j(4.9):c100 _i(4.9):c87 
 _n(4.9):c92 _k(4.9):c54 _f(4.9):c25 _g(4.9):c25 _h(4.9):c25 into _q 
 [maxNumSegments=1]
 at 
 __randomizedtesting.SeedInfo.seed([9D5BC98E888A5AB4:150FF6542676374C]:0)
 at 
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1825)
 at 
 org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1761)
 at 
 org.apache.lucene.codecs.lucene41.TestBlockPostingsFormat3.test(TestBlockPostingsFormat3.java:144)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 

[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013433#comment-14013433
 ] 

Dawid Weiss commented on LUCENE-5645:
-

 I'm still confused: since when was the empty string a supported value for 
 'tests.seed'?

It's been supported since the very beginning. A non-existent tests.seed and an 
empty string are equivalent to the testing framework (both default to a 
randomly picked seed).

You can override this in Lucene of course, but then you'll have to change 
common-build to pass a propertyref instead of a property and it'll be more 
complicated. Try it.


 StringHelper should check for empty string of tests.seed system property
 --

 Key: LUCENE-5645
 URL: https://issues.apache.org/jira/browse/LUCENE-5645
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.8, 5.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8.1, 5.0

 Attachments: LUCENE-5645_StringHelper_empty_tests_seed_condition.patch


 As of LUCENE-5604 (committed to v4.8), StringHelper will initialize 
 GOOD_FAST_HASH_SEED based on the system property tests.seed if it is set. 
 Unfortunately it doesn't do an empty-string check, and it's common, at least 
 in my setup (which copies Lucene's maven pom.xml), for the string to be empty 
 unless I set it on the command line. FWIW, Randomized Testing does do an 
 empty-string check.






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013434#comment-14013434
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598502 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598502 ]

LUCENE-5708: fix test bug

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013437#comment-14013437
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598503 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1598503 ]

LUCENE-5708: fix test bug

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






Re: svn commit: r1598502 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java

2014-05-30 Thread Robert Muir
Why are lucene41 tests executing with other codecs?
Why is this suddenly happening?

On Fri, May 30, 2014 at 4:57 AM,  mikemcc...@apache.org wrote:
 Author: mikemccand
 Date: Fri May 30 08:57:42 2014
 New Revision: 1598502

 URL: http://svn.apache.org/r1598502
 Log:
 LUCENE-5708: fix test bug

 Modified:
 
 lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java

 Modified: 
 lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java?rev=1598502&r1=1598501&r2=1598502&view=diff
 ==
 --- 
 lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java
  (original)
 +++ 
 lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/codecs/lucene41/TestBlockPostingsFormat3.java
  Fri May 30 08:57:42 2014
 @@ -139,6 +139,7 @@ public class TestBlockPostingsFormat3 ex
  verify(dir);
  TestUtil.checkIndex(dir); // for some extra coverage, checkIndex before 
 we forceMerge
  iwc = newIndexWriterConfig(TEST_VERSION_CURRENT, analyzer);
 +iwc.setCodec(TestUtil.alwaysPostingsFormat(new 
 Lucene41PostingsFormat()));
  iwc.setOpenMode(OpenMode.APPEND);
  IndexWriter iw2 = new IndexWriter(dir, iwc);
  iw2.forceMerge(1);






[jira] [Reopened] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-5710:
-


I don't see a 4.x commit for this.

 DefaultIndexingChain swallows useful information from 
 MaxBytesLengthExceededException
 -

 Key: LUCENE-5710
 URL: https://issues.apache.org/jira/browse/LUCENE-5710
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 4.8.1
Reporter: Lee Hinman
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5710.patch, LUCENE-5710.patch


 In DefaultIndexingChain, when a MaxBytesLengthExceededException is caught, 
 the original message is discarded; however, the message contains useful 
 information, such as the size that exceeded the limit.
 Lucene should include this information in the newly thrown 
 IllegalArgumentException.
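The requested fix can be sketched in plain Java. The exception and method names below are hypothetical stand-ins for the real indexing chain, and 32766 is used only as an illustrative byte limit: the point is that the catch block carries the original message, which contains the offending size, into the IllegalArgumentException and chains the cause instead of discarding it.

```java
// Hypothetical types (NOT the real DefaultIndexingChain): demonstrate
// preserving the low-level exception's message and cause when re-throwing.
class MaxBytesLengthExceededException extends RuntimeException {
    MaxBytesLengthExceededException(String msg) { super(msg); }
}

public class PreserveCause {
    static final int MAX_TERM_BYTES = 32766; // illustrative limit

    static void index(byte[] term) {
        try {
            if (term.length > MAX_TERM_BYTES) {
                throw new MaxBytesLengthExceededException(
                    "bytes can be at most " + MAX_TERM_BYTES
                        + " in length; got " + term.length);
            }
        } catch (MaxBytesLengthExceededException e) {
            // Include e's message (size and limit) and chain it as the cause,
            // rather than throwing a generic message that swallows both.
            throw new IllegalArgumentException(
                "Document contains at least one immense term: "
                    + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            index(new byte[40000]);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
            System.out.println("cause: " + e.getCause().getMessage());
        }
    }
}
```

Chaining the cause also keeps the original stack trace available to callers, which generic re-throwing loses.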






[jira] [Commented] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013440#comment-14013440
 ] 

Michael McCandless commented on LUCENE-5710:


Woops ... I'll commit to 4.x.  Dublin also had beer, perhaps too much...




[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013441#comment-14013441
 ] 

Robert Muir commented on LUCENE-5645:
-

{quote}
It's been supported since the very beginning. A non-existent tests.seed or an 
empty string are equivalent to the testing framework (and default to a randomly 
picked seed).
{quote}

But that's not a general system-property thing; it's just something special that 
the test-framework is doing, and only for this specific case of tests.seed?

Sorry, I have to try to stop the empty string :) Once it becomes supported, 
nobody will ever remove support for it. Time to make my stand.

 StringHelper should check for empty string of tests.seed system property
 --

 Key: LUCENE-5645
 URL: https://issues.apache.org/jira/browse/LUCENE-5645
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.8, 5.0
Reporter: David Smiley
Assignee: David Smiley
Priority: Minor
 Fix For: 4.8.1, 5.0

 Attachments: LUCENE-5645_StringHelper_empty_tests_seed_condition.patch


 As of LUCENE-5604 (committed to v4.8), StringHelper will initialize 
 GOOD_FAST_HASH_SEED based on the system property tests.seed if it is set. 
 Unfortunately it doesn't do an empty-string check, and it's common (at least 
 in my setup, which copies Lucene's maven pom.xml) for the string to be empty 
 unless I set it on the command line. FWIW, Randomized Testing does do an 
 empty-string check.
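The fix amounts to treating an empty tests.seed exactly like an absent one. A minimal sketch of that guard (the class and method names below are illustrative, not StringHelper's actual code):

```java
class SeedGuard {
    // Illustrative guard: an empty tests.seed falls back to the random default,
    // mirroring the empty-string check Randomized Testing already performs.
    static long resolveSeed(String prop, long randomFallback) {
        if (prop == null || prop.isEmpty()) {
            return randomFallback;
        }
        return prop.hashCode(); // stand-in for real seed parsing
    }
}
```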






[jira] [Commented] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013448#comment-14013448
 ] 

ASF subversion and git services commented on LUCENE-5710:
-

Commit 1598514 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598514 ]

LUCENE-5710: don't swallow inner immense term exception




[jira] [Resolved] (LUCENE-5710) DefaultIndexingChain swallows useful information from MaxBytesLengthExceededException

2014-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5710.


Resolution: Fixed

OK, I backported ... thanks Rob.




[jira] [Commented] (LUCENE-5645) StringHelper should check for empty string of tests.seed system property

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013452#comment-14013452
 ] 

Dawid Weiss commented on LUCENE-5645:
-

I'll keep the previous behaviour in randomizedtesting but you can easily forbid 
it at LTC level. I think it's not worth complicating the ant file though... 
you'll see if you start modifying it.




[jira] [Reopened] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reopened LUCENE-5708:
-


Can we remove the "fix test bugs" commits and fix the true underlying issue?

Tests aren't testing what they should; we need to fix the disease or back this 
change out.

 Remove IndexWriterConfig.clone
 --

 Key: LUCENE-5708
 URL: https://issues.apache.org/jira/browse/LUCENE-5708
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5708.patch, LUCENE-5708.patch, LUCENE-5708.patch


 We originally added this clone to allow a single IWC to be re-used against 
 more than one IndexWriter, but I think this is a mis-feature: it adds 
 complexity to hairy classes (merge policy/scheduler, DW thread pool, etc.), 
 and I think it's buggy today.
 I think we should just disallow sharing: you must make a new IWC for a new 
 IndexWriter.






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013459#comment-14013459
 ] 

Michael McCandless commented on LUCENE-5708:


bq. Tests aren't testing what they should; we need to fix the disease or back 
this change out.

I'm happy to back this change out if you want.

But, I'm confused: are you talking about how these tests hardwire the PF to 
Lucene41?  I agree we should fix that, but this is a pre-existing issue.




[jira] [Created] (SOLR-6123) The 'clusterstatus' API filtered by collection times out if a long running operation is in progress

2014-05-30 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-6123:
---

 Summary: The 'clusterstatus' API filtered by collection times out 
if a long running operation is in progress
 Key: SOLR-6123
 URL: https://issues.apache.org/jira/browse/SOLR-6123
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.9
Reporter: Shalin Shekhar Mangar
 Fix For: 4.9


If a long-running shard split is in progress, say for collection=X, then the 
clusterstatus API with collection=X will time out.

The OverseerCollectionProcessor should never block an operation such as 
clusterstatus even if there are tasks for the same collection in progress.

This bug was introduced by SOLR-5681.






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013477#comment-14013477
 ] 

Robert Muir commented on LUCENE-5708:
-

All these PostingsFormat tests wire their codec to themselves. This allows us 
to implement generic base classes with tests that all codecs should pass, as 
well as codec-specific tests (e.g. in lucene41) that test particular corner 
cases of importance.

This worked well in knocking out bugs for postings, so the whole scheme was 
duplicated to docvalues, storedfields, vectors, everything.

Now here comes this commit, and these tests (which are important to ensure the 
index format is working) are no longer testing what they are supposed to. For 
example the tests in Lucene41 package explicitly test special cases of that 
codec that would otherwise be extraordinarily rare in the existing random 
tests. If they are executing against random codecs or even random 
configurations then they just became useless.

So that's why I'm concerned: I see this commit causing these failures, and I 
know we just experienced a significant loss of test coverage to the index 
format. We are relying upon tests to *fail* to detect this, but unfortunately 
'loss of test coverage' doesn't always trigger a Jenkins build.

So maybe instead of playing whack-a-mole with Jenkins test failures, we should 
pay more attention to reviewing *all* changes to unit tests where clone() was 
previously used. The patch is buggy here, and I just want to ensure it's taken 
seriously so that we do not lose coverage.





[jira] [Commented] (LUCENE-5442) Build system should sanity check transative 3rd party dependencies

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013494#comment-14013494
 ] 

ASF subversion and git services commented on LUCENE-5442:
-

Commit 1598538 from [~steve_rowe] in branch 'dev/trunk'
[ https://svn.apache.org/r1598538 ]

LUCENE-5442: ant check-lib-versions will fail the build if there are unexpected 
version conflicts between direct and transitive dependencies.

 Build system should sanity check transative 3rd party dependencies
 --

 Key: LUCENE-5442
 URL: https://issues.apache.org/jira/browse/LUCENE-5442
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/build
Reporter: Hoss Man
Assignee: Steve Rowe
 Attachments: LUCENE-5442.patch


 SOLR-5365 is an example of a bug that cropped up because we upgraded a 3rd 
 party dep (tika) w/o realizing that the version we upgraded to depended on a 
 newer version of another 3rd party dep (commons-compress).
 In a comment in SOLR-5365, Jan suggested that it would be nice if there was 
 an easy way to spot problems like this ... I asked Steve about it, thinking 
 maybe this is something the maven build could help with, and he mentioned 
 that there is already an ant task to inspect the ivy transitive deps in order 
 to generate the maven deps, and it could be used to help detect this sort of 
 problem.
 Opening this issue per Steve's request as a reminder to look into this 
 possibility.
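The check the new ant task performs boils down to comparing each direct dependency's pinned version with the versions required transitively. A rough sketch of that comparison, assuming simple dotted-integer versions (the real task handles Ivy's richer version formats):

```java
class VersionConflict {
    // Returns true when the transitive requirement is newer than the direct pin,
    // i.e. the situation that should fail the build (unless explicitly ignored).
    static boolean conflicts(String direct, String transitive) {
        String[] d = direct.split("\\.");
        String[] t = transitive.split("\\.");
        int n = Math.max(d.length, t.length);
        for (int i = 0; i < n; i++) {
            int dv = i < d.length ? Integer.parseInt(d[i]) : 0;
            int tv = i < t.length ? Integer.parseInt(t[i]) : 0;
            if (tv != dv) {
                return tv > dv;
            }
        }
        return false; // identical versions: no conflict
    }
}
```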






[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013498#comment-14013498
 ] 

Michael McCandless commented on LUCENE-5708:


I agree that the distinction between "randomly test this PF via the
generic base test class" and "specifically test tricky corner cases
for this particular PF" (e.g. TestBlockPostingsFormat/2/3.java) is
important, and the specific IWC settings for those tests are
necessary.

I reviewed all the test changes more closely, and found a couple other
places that needed to carry over explicit IWC changes after pulling a
random IWC (I'll commit shortly).

I think this is net/net good vs the clone we had before: it means we
are still randomly changing the things the test didn't care about, and
fixing the settings that it does.
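The resulting test pattern (pull a freshly randomized config, then re-apply the settings the test actually depends on) can be sketched generically. The names below are illustrative stand-ins, not Lucene's test-framework API:

```java
import java.util.Random;

// Illustrative stand-in for an IndexWriterConfig-style object: some settings are
// randomized because the test doesn't care about them; others must be pinned
// explicitly by the test every time a new config is pulled.
class TestConfig {
    int bufferSizeMb;          // randomized: the test doesn't depend on it
    String codec = "default";  // codec-specific tests must pin this themselves

    static TestConfig newRandom(Random r) {
        TestConfig c = new TestConfig();
        c.bufferSizeMb = 1 + r.nextInt(64); // vary what the test doesn't care about
        return c;
    }
}
```

A codec-specific test would then call TestConfig.newRandom(...) and immediately set codec = "Lucene41" again; carrying over that explicit setting each time a new config is pulled is exactly what the follow-up commits add to the affected tests.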





[jira] [Commented] (LUCENE-5442) Build system should sanity check transative 3rd party dependencies

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013499#comment-14013499
 ] 

ASF subversion and git services commented on LUCENE-5442:
-

Commit 1598539 from [~steve_rowe] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598539 ]

LUCENE-5442: ant check-lib-versions will fail the build if there are unexpected 
version conflicts between direct and transitive dependencies. (merged trunk 
r1598538)




[jira] [Resolved] (LUCENE-5442) Build system should sanity check transative 3rd party dependencies

2014-05-30 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-5442.


   Resolution: Fixed
Fix Version/s: 5.0
   4.9

Committed to trunk and branch_4x.

I'll open a follow-on issue to reduce the number of expected version conflicts 
listed in {{ivy-ignore-conflicts.properties}}, by upgrading the corresponding 
direct dependencies in {{ivy-versions.properties}}.




[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013506#comment-14013506
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598543 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1598543 ]

LUCENE-5708: fix these tests to also 'mimic' previous IWC.clone




[jira] [Commented] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013508#comment-14013508
 ] 

ASF subversion and git services commented on LUCENE-5708:
-

Commit 1598545 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598545 ]

LUCENE-5708: fix these tests to also 'mimic' previous IWC.clone




[jira] [Created] (LUCENE-5715) Upgrade direct dependencies known to be older than transitive dependencies

2014-05-30 Thread Steve Rowe (JIRA)
Steve Rowe created LUCENE-5715:
--

 Summary: Upgrade direct dependencies known to be older than 
transitive dependencies
 Key: LUCENE-5715
 URL: https://issues.apache.org/jira/browse/LUCENE-5715
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor


LUCENE-5442 added functionality to the {{check-lib-versions}} ant task to fail 
the build if a direct dependency's version conflicts with that of a transitive 
dependency.

{{ivy-ignore-conflicts.properties}} contains a list of 19 transitive 
dependencies with versions that are newer than direct dependencies' versions: 
https://issues.apache.org/jira/browse/LUCENE-5442?focusedCommentId=14012220&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14012220

We should try to keep that list small.  It's likely that upgrading most of 
those dependencies will require little effort.






[jira] [Commented] (LUCENE-5442) Build system should sanity check transative 3rd party dependencies

2014-05-30 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013510#comment-14013510
 ] 

Steve Rowe commented on LUCENE-5442:


bq. I'll open a follow-on issue to reduce the number of expected version 
conflicts listed in {{ivy-ignore-conflicts.properties}}, by upgrading the 
corresponding direct dependencies in {{ivy-versions.properties}}.

Done: LUCENE-5715




[jira] [Created] (SOLR-6124) NPE in BoostedQuery.hashCode() submitting ({!boost b=1} *:*) query

2014-05-30 Thread Massimo Schiavon (JIRA)
Massimo Schiavon created SOLR-6124:
--

 Summary: NPE in BoostedQuery.hashCode() submitting ({!boost b=1} 
*:*) query
 Key: SOLR-6124
 URL: https://issues.apache.org/jira/browse/SOLR-6124
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.8.1
Reporter: Massimo Schiavon
Priority: Minor


Using the example configuration included in the Solr distribution, submitting 
{noformat}({!boost b=1} *:*){noformat} or {noformat}{!boost b=1}{noformat} as the 
q parameter results in an NPE.

Submitting the same query without the space doesn't: {noformat}({!boost 
b=1}*:*){noformat}

Maybe a little bug in the query parser?
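The subject points at BoostedQuery.hashCode(), which suggests the parser left a field null when a space separates the local params from the query body. Whatever the parser-side fix turns out to be, a null-tolerant hashCode() would at least avoid the crash. A sketch with assumed field names (not BoostedQuery's actual ones):

```java
// Illustrative: hash a possibly-null wrapped query instead of dereferencing it.
class BoostedQuerySketch {
    Object mainQuery;      // may be null when the parser leaves the query unset
    Object boostFunction;  // the b=... boost function

    @Override
    public int hashCode() {
        int h = (mainQuery == null) ? 0 : mainQuery.hashCode();
        return 31 * h + ((boostFunction == null) ? 0 : boostFunction.hashCode());
    }
}
```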






[jira] [Resolved] (LUCENE-5708) Remove IndexWriterConfig.clone

2014-05-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5708.


Resolution: Fixed




[jira] [Created] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Dawid Weiss (JIRA)
Dawid Weiss created LUCENE-5716:
---

 Summary: Track file handle leaks (FileDescriptor, NIO Path SPI and 
Socket mostly).
 Key: LUCENE-5716
 URL: https://issues.apache.org/jira/browse/LUCENE-5716
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor









[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013590#comment-14013590
 ] 

Dawid Weiss commented on LUCENE-5716:
-

I spoke with Robert at Buzzwords about it a bit. It'd be interesting to capture 
file handle leaks from tests (unclosed sockets, unclosed files). There are 
essentially no Java built-in mechanisms to do it. One could use the operating 
system's tools such as lsof, but this would measure ALL file handles, including 
handles of open jar files, etc. It's also an interesting puzzle to see if one 
can do it from Java in a portable (?) way.

There are a few options.

1) Override bootclasspath for a particular release of the JVM and modify system 
classes to allow resource tracking. A similar solution kind of exists already 
in FileInputStream:
{code}
public int read() throws IOException {
    Object traceContext = IoTrace.fileReadBegin(path);
    int b = 0;
    try {
        b = read0();
    } finally {
        IoTrace.fileReadEnd(traceContext, b == -1 ? 0 : 1);
    }
    return b;
}
{code}
The obscure IoTrace class unfortunately only tracks file reads and writes, not 
open/close status. A similar hack could be applied to capture constructors and 
close calls, though.

2) Use bytecode transformation and capture all places where interesting objects 
are created/closed. AspectJ is closest to my heart since it allows fast 
prototyping and is essentially just a bytecode manipulation framework. I've 
written a simple aspect tracing FileInputStream usage; here's the code:

https://github.com/dweiss/fhandle-tracing

The aspect itself:
https://github.com/dweiss/fhandle-tracing/blob/master/src/main/aspect/com/carrotsearch/aspects/FileInputStreamTracker.aj

There is a JUnit test in there, and if you run mvn test you can see that it's 
actually working quite nicely. Not everything can be easily addressed (for 
example, it's difficult to advise close() in classes that inherit from FIS but 
don't override this method), but alternative solutions to the problem also 
exist (capture all close() calls, capture all constructors of 
FileInputStream+.new). Doable, and it should work for 99% of the use cases, I think.

3) Use JVMTI instrumentation agents to provide essentially the same 
instrumentation as above. I don't think it is functionally any different from 
(2), and I like AspectJ's syntax for fast hacking. The only exception might be 
(I didn't check) whether we could redefine and reload the definitions of core 
Java classes (FileInputStream, for example) so that we can instrument 
constructors and methods of these base classes directly. That would be quite 
nice, because then we'd be able to make it portable across all JVMs and it 
should work for all code fragments, including the standard library itself.

I'll keep experimenting with (2) for now as it's the low-hanging fruit (already 
works), but if somebody wants to inspect (3) it'd be quite interesting.
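For flavor, a skeleton of how such an agent could be wired with the standard java.lang.instrument API. The transformer below is a deliberate no-op placeholder: real handle tracking would rewrite FileInputStream's bytecode (e.g. with ASM) where the comment sits, rather than returning null.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Skeleton of an instrumentation agent for option (3). Run with
// -javaagent:agent.jar; the JVM calls premain before main.
class HandleAgent {
    public static void premain(String args, Instrumentation inst) {
        // true = allow retransforming already-loaded classes like FileInputStream
        inst.addTransformer(tracker(), true);
    }

    static ClassFileTransformer tracker() {
        return new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                    Class<?> classBeingRedefined, ProtectionDomain pd,
                    byte[] classfileBuffer) {
                if ("java/io/FileInputStream".equals(className)) {
                    // Real tracking would rewrite constructors and close() here
                    // (e.g. with ASM) to call into a handle registry.
                }
                return null; // null means "leave the class unchanged"
            }
        };
    }
}
```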


 Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
 -

 Key: LUCENE-5716
 URL: https://issues.apache.org/jira/browse/LUCENE-5716
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request: SOLR-6067: add methods to reduce code du...

2014-05-30 Thread cpoerschke
GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/54

SOLR-6067: add methods to reduce code duplication in SolrIndexSearcher, 
Grouping

For ticket https://issues.apache.org/jira/browse/SOLR-6067 - baselined 
against trunk (the previous pull request was against branch_4x).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr trunk-solr-6067

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/54.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #54


commit 2b4d92f1192c6a4126cc69b38f5026fa4bbb7304
Author: Christine Poerschke cpoersc...@bloomberg.net
Date:   2014-05-08T13:23:32Z

SOLR-6067: add buildAndRunCollectorChain methods to reduce code duplication 
in SolrIndexSearcher, Grouping




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6067) add buildAndRunCollectorChain methods to reduce code duplication in SolrIndexSearcher, Grouping

2014-05-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013592#comment-14013592
 ] 

ASF GitHub Bot commented on SOLR-6067:
--

GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/54

SOLR-6067: add methods to reduce code duplication in SolrIndexSearcher, 
Grouping

For ticket https://issues.apache.org/jira/browse/SOLR-6067 - baselined 
against trunk (the previous pull request was against branch_4x).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr trunk-solr-6067

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/54.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #54


commit 2b4d92f1192c6a4126cc69b38f5026fa4bbb7304
Author: Christine Poerschke cpoersc...@bloomberg.net
Date:   2014-05-08T13:23:32Z

SOLR-6067: add buildAndRunCollectorChain methods to reduce code duplication 
in SolrIndexSearcher, Grouping




 add buildAndRunCollectorChain methods to reduce code duplication in 
 SolrIndexSearcher, Grouping
 ---

 Key: SOLR-6067
 URL: https://issues.apache.org/jira/browse/SOLR-6067
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Priority: Minor
 Attachments: SOLR-6067.patch


 https://github.com/apache/lucene-solr/pull/48 has the proposed change. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013590#comment-14013590
 ] 

Dawid Weiss edited comment on LUCENE-5716 at 5/30/14 12:31 PM:
---

I spoke with Robert at Buzzwords about it a bit. It'd be interesting to capture 
file handle leaks from tests (unclosed sockets, unclosed files). There are 
essentially no Java built-in mechanisms to do it. One could use the operating 
system's tools such as lsof, but that would measure ALL file handles, including 
handles of open jar files, etc. It's also an interesting puzzle to see if one 
can do it from Java in a portable (?) way.

There are a few options.

1) Override bootclasspath for a particular release of the JVM and modify system 
classes to allow resource tracking. A similar solution kind of exists already 
in FileInputStream:
{code}
public int read() throws IOException {
    Object traceContext = IoTrace.fileReadBegin(path);
    int b = 0;
    try {
        b = read0();
    } finally {
        IoTrace.fileReadEnd(traceContext, b == -1 ? 0 : 1);
    }
    return b;
}
{code}
Unfortunately, the obscure IoTrace class only tracks file reads and writes, not 
open/close status. A similar hack could be applied to capture constructors and 
close calls, though.

2) Use bytecode transformation and capture all places where interesting objects 
are created or closed. AspectJ is closest to my heart since it allows fast 
prototyping and is essentially just a bytecode manipulation framework. I've 
written a simple aspect tracing FileInputStream usage; here's the code:

https://github.com/dweiss/fhandle-tracing

The aspect itself:
https://github.com/dweiss/fhandle-tracing/blob/master/src/main/aspect/com/carrotsearch/aspects/FileInputStreamTracker.aj

There is a JUnit test in there, and if you run mvn test you can see that it's 
actually working quite nicely. Not everything can be easily addressed (for 
example, it's difficult to advise close() in classes that inherit from 
FileInputStream but don't override the method), but alternative solutions 
exist (capture all close calls, capture all constructors via 
FileInputStream+.new). Doable, and it should work for 99% of the use cases, I think.

3) Use JVMTI instrumentation agents to provide essentially the same 
instrumentation as above. I don't think it is functionally any different from 
(2), and I like AspectJ's syntax for fast hacking. The only add-on value might 
be (I didn't check) whether we could redefine and reload the definitions of 
core Java classes (FileInputStream, for example) so that we can instrument 
constructors and methods of these base classes directly. That would be quite 
nice, because then we'd be able to make it portable across all JVMs and it 
would work for all the code, including the standard library itself.

I'll keep experimenting with (2) for now as it's the low-hanging fruit (already 
works), but if somebody wants to inspect (3) it'd be quite interesting (I may 
do it too, time permitting).




[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013604#comment-14013604
 ] 

Shalin Shekhar Mangar commented on LUCENE-5716:
---

In a past job, I used a custom protocol factory for http and https connections 
and tracked opens/closes and bytes read/written using a thread-local, to make 
sure that webapps weren't leaking connections.
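A minimal sketch of the thread-local accounting idea described above. All names here are illustrative, not from an actual protocol factory:

```java
// Per-thread connection accounting; a wrapping socket/stream factory would
// call the on* hooks around real I/O. Names are illustrative only.
class ConnectionStats {
    long opened, closed, bytesRead;

    private static final ThreadLocal<ConnectionStats> CURRENT =
        ThreadLocal.withInitial(ConnectionStats::new);

    static ConnectionStats get() { return CURRENT.get(); }

    static void onOpen()      { get().opened++; }
    static void onClose()     { get().closed++; }
    static void onRead(int n) { if (n > 0) get().bytesRead += n; }

    // A webapp filter could check this at the end of each request.
    static boolean leaking()  { return get().opened != get().closed; }
}
```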

 Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
 -

 Key: LUCENE-5716
 URL: https://issues.apache.org/jira/browse/LUCENE-5716
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr pull request: add SolrConfig/updateHandler/indexWriter...

2014-05-30 Thread cpoerschke
GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/55

add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support

For the https://issues.apache.org/jira/browse/SOLR-6125 ticket.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr 
trunk-closeWaitsForMerges

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/55.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #55


commit d8e6b676b9b3f5b8e9767a2e8b266d17a01d92f5
Author: Christine Poerschke cpoersc...@bloomberg.net
Date:   2014-05-30T08:32:50Z

solr: add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE 
support




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6125) add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support

2014-05-30 Thread Christine Poerschke (JIRA)
Christine Poerschke created SOLR-6125:
-

 Summary: add 
SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support
 Key: SOLR-6125
 URL: https://issues.apache.org/jira/browse/SOLR-6125
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Priority: Minor


The problem we saw was the overseer Solr instance stopping slowly because it 
was in the middle of a big merge.
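Going by the path encoded in the ticket summary (SolrConfig/updateHandler/indexWriter/closeWaitsForMerges), the configuration might look roughly like this; the exact element layout is an assumption, not the committed syntax:

```xml
<!-- Hypothetical solrconfig.xml fragment sketching the proposed option -->
<updateHandler class="solr.DirectUpdateHandler2">
  <indexWriter>
    <!-- false: don't block shutdown waiting for in-flight merges to finish -->
    <closeWaitsForMerges>false</closeWaitsForMerges>
  </indexWriter>
</updateHandler>
```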



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6125) add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support

2014-05-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013605#comment-14013605
 ] 

ASF GitHub Bot commented on SOLR-6125:
--

GitHub user cpoerschke opened a pull request:

https://github.com/apache/lucene-solr/pull/55

add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support

For the https://issues.apache.org/jira/browse/SOLR-6125 ticket.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bloomberg/lucene-solr 
trunk-closeWaitsForMerges

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/55.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #55


commit d8e6b676b9b3f5b8e9767a2e8b266d17a01d92f5
Author: Christine Poerschke cpoersc...@bloomberg.net
Date:   2014-05-30T08:32:50Z

solr: add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE 
support




 add SolrConfig/updateHandler/indexWriter/closeWaitsForMerges=FALSE support
 --

 Key: SOLR-6125
 URL: https://issues.apache.org/jira/browse/SOLR-6125
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Priority: Minor

 The problem we saw was slow stopping of the overseer solr instance because it 
 was in the middle of a big merge.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013606#comment-14013606
 ] 

Dawid Weiss commented on LUCENE-5716:
-

Thanks Shalin. What I'm aiming at here is much more low-level (and should be 
more accurate). We'll see how it goes.

 Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
 -

 Key: LUCENE-5716
 URL: https://issues.apache.org/jira/browse/LUCENE-5716
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6067) add buildAndRunCollectorChain methods to reduce code duplication in SolrIndexSearcher, Grouping

2014-05-30 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013610#comment-14013610
 ] 

Christine Poerschke commented on SOLR-6067:
---

Hi [~hossman] - I have updated with 
https://github.com/apache/lucene-solr/pull/54; could you try applying that 
patch (instead of the earlier #48 one)? Thanks.

 add buildAndRunCollectorChain methods to reduce code duplication in 
 SolrIndexSearcher, Grouping
 ---

 Key: SOLR-6067
 URL: https://issues.apache.org/jira/browse/SOLR-6067
 Project: Solr
  Issue Type: Improvement
Reporter: Christine Poerschke
Priority: Minor
 Attachments: SOLR-6067.patch


 https://github.com/apache/lucene-solr/pull/48 has the proposed change. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6120) zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty

2014-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-6120:
---

Attachment: SOLR-6120-windows.patch

Updated Windows patch. The instructions for what to unzip and where to put it 
were getting lost in all the other output from the script. This change makes 
those instructions much more readable.

I will also try to cook up something that actually extracts the .war for you on 
Windows.

 zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty
 ---

 Key: SOLR-6120
 URL: https://issues.apache.org/jira/browse/SOLR-6120
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1
Reporter: sebastian badea
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6120-windows.patch, SOLR-6120-windows.patch, 
 SOLR-6120.patch


 When calling /solr-4.8.1/example/scripts/cloud-scripts/zkcli.sh the 
 org.apache.solr.cloud.ZkCLI class is not found
 The cause is that /opt/solr-4.8.1/example/solr-webapp is empty



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6120) zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty

2014-05-30 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-6120:
---

Attachment: SOLR-6120-windows.patch

Extremely minor change to new patch -- uppercasing one letter.

 zkcli.sh class not fount error /opt/solr-4.8.1/example/solr-webapp is empty
 ---

 Key: SOLR-6120
 URL: https://issues.apache.org/jira/browse/SOLR-6120
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1
Reporter: sebastian badea
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6120-windows.patch, SOLR-6120-windows.patch, 
 SOLR-6120-windows.patch, SOLR-6120.patch


 When calling /solr-4.8.1/example/scripts/cloud-scripts/zkcli.sh the 
 org.apache.solr.cloud.ZkCLI class is not found
 The cause is that /opt/solr-4.8.1/example/solr-webapp is empty



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread Luca Cavanna (JIRA)
Luca Cavanna created LUCENE-5717:


 Summary: Postings highlighter support for multi term queries 
within filtered and constant score queries
 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna


The automata extraction that is done to make multi term queries work with the 
postings highlighter does support boolean queries but it should also support 
other compound queries like filtered and constant score.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread Luca Cavanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-5717:
-

Attachment: LUCENE-5717.patch

First patch attached. At this time there's no generic way to retrieve 
sub-queries from compound queries, so I could only add two more ifs to the 
existing extractAutomata method. Maybe it's worth discussing in a separate 
issue whether there's a way to make this more generic. Also, I'm not sure 
whether there are other compound queries that I missed.
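The kind of unwrapping such ifs perform can be sketched as follows. The query classes here are minimal stand-ins defined for the example, not Lucene's real ones:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of unwrapping compound queries to reach multi-term leaves.
class Unwrap {
    static class Query {}
    static class TermishQuery extends Query {} // stand-in for a multi-term query
    static class ConstantScoreQuery extends Query {
        final Query wrapped;
        ConstantScoreQuery(Query q) { wrapped = q; }
    }
    static class FilteredQuery extends Query {
        final Query inner;
        FilteredQuery(Query q) { inner = q; }
    }

    // Recursively unwrap known compound queries, collecting the leaves.
    static void collectLeaves(Query q, List<Query> out) {
        if (q instanceof ConstantScoreQuery) {
            collectLeaves(((ConstantScoreQuery) q).wrapped, out);
        } else if (q instanceof FilteredQuery) {
            collectLeaves(((FilteredQuery) q).inner, out);
        } else {
            out.add(q);
        }
    }

    public static void main(String[] args) {
        List<Query> leaves = new ArrayList<>();
        collectLeaves(new FilteredQuery(new ConstantScoreQuery(new TermishQuery())), leaves);
        System.out.println(leaves.size()); // prints 1
    }
}
```

This is exactly the fragility discussed in LUCENE-5718: each new compound query type needs another branch.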

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013808#comment-14013808
 ] 

Robert Muir commented on LUCENE-5717:
-

+1

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5718) More flexible compound queries (containing mtq) support in postings highlighter

2014-05-30 Thread Luca Cavanna (JIRA)
Luca Cavanna created LUCENE-5718:


 Summary: More flexible compound queries (containing mtq) support 
in postings highlighter
 Key: LUCENE-5718
 URL: https://issues.apache.org/jira/browse/LUCENE-5718
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna


The postings highlighter currently pulls the automata from multi term queries 
and doesn't require calling rewrite to make highlighting work. In order to do 
so it also needs to check whether the query is a compound one and eventually 
extract its subqueries. This is currently done in the MultiTermHighlighting 
class and works well but has two potential problems:

1) not all possible compound queries are necessarily supported, as we need to 
go over each of them one by one (see LUCENE-5717), and this requires keeping 
the switch up-to-date as new queries get added to Lucene;
2) it doesn't support custom compound queries, only the set of queries 
available out-of-the-box.

I've been thinking about how this can be improved and one of the ideas I came 
up with is to introduce a generic way to retrieve the subqueries from compound 
queries, like for instance have a new abstract base class with a getLeaves or 
getSubQueries method and have all the compound queries extend it. What this 
method would do is return a flat array of all the leaf queries that the 
compound query is made of. 

Not sure whether this would be needed in other places in lucene, but it doesn't 
seem like a small change and it would definitely affect (or benefit?) more than 
just the postings highlighter support for multi term queries.

In particular the second problem (custom queries) seems hard to solve without a 
way to expose this info directly from the query though, unless we want to make 
the MultiTermHighlighting#extractAutomata method extensible in some way.

I would like to hear what people think, and to work on this as soon as we've 
identified which direction we want to take.
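The proposed abstraction could be sketched like this. CompoundQuery and getSubQueries are the names suggested in the comment above, not an existing Lucene API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: compound queries expose their direct sub-queries, and one generic
// walk flattens any nesting to leaves, replacing per-type instanceof switches.
class Leaves {
    interface Query {}

    interface CompoundQuery extends Query {
        List<Query> getSubQueries();
    }

    static List<Query> flatten(Query root) {
        List<Query> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Query q, List<Query> out) {
        if (q instanceof CompoundQuery) {
            for (Query sub : ((CompoundQuery) q).getSubQueries()) {
                walk(sub, out); // recurse into nested compounds
            }
        } else {
            out.add(q); // a leaf query
        }
    }
}
```

Custom compound queries would then work automatically, since highlighting would only depend on the interface, not on a fixed list of known classes.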



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013986#comment-14013986
 ] 

Michael McCandless commented on LUCENE-5717:


+1

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5719) Add additional checks and enforcement around Deprecation to the build system.

2014-05-30 Thread Mark Miller (JIRA)
Mark Miller created LUCENE-5719:
---

 Summary: Add additional checks and enforcement around Deprecation 
to the build system.
 Key: LUCENE-5719
 URL: https://issues.apache.org/jira/browse/LUCENE-5719
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Mark Miller


It would be great if we enforced a reasonable minimum for deprecation use.

Something like:

* Both the annotation and the javadoc comment should be required.
* The @deprecated javadoc tag should require an actual explanatory comment.
* The annotation should require a JIRA URL. (It doesn't seem we could do this 
nicely :( )

I run into, and perhaps even write, too many places with @Deprecated and no 
helpful path or clue for the user. Obviously, deprecation should be as 
non-confusing as possible, and polite deprecation should be enforced to the 
degree that we can do it.
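One crude way to approximate the first two checks at build time would be a source-level count comparison. A real check would be AST-based (e.g. via an annotation processor); this is only a sketch:

```java
// Sketch of a crude build-time check: flag sources where the number of
// @Deprecated annotations and @deprecated javadoc tags disagree.
class DeprecationCheck {
    // Count non-overlapping, case-sensitive occurrences of needle in source.
    static int count(String source, String needle) {
        int n = 0, i = 0;
        while ((i = source.indexOf(needle, i)) >= 0) {
            n++;
            i += needle.length();
        }
        return n;
    }

    // True when every annotation appears to be paired with a javadoc tag.
    static boolean consistent(String source) {
        return count(source, "@Deprecated") == count(source, "@deprecated");
    }
}
```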



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6095) SolrCloud cluster can end up without an overseer

2014-05-30 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014137#comment-14014137
 ] 

Ramkumar Aiyengar commented on SOLR-6095:
-

Not sure I understand. You bring down first wave, overseers move to second 
wave. When you bring back first wave, they use the overseer in the second wave 
to recover and become active. Then you start with the second wave. Why would 
this be a problem?

 SolrCloud cluster can end up without an overseer
 

 Key: SOLR-6095
 URL: https://issues.apache.org/jira/browse/SOLR-6095
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8
Reporter: Shalin Shekhar Mangar
Assignee: Noble Paul
 Fix For: 4.9, 5.0


 We have a large cluster running on ec2 which occasionally ends up without an 
 overseer after a rolling restart. We always restart our overseer nodes at the 
 very last otherwise we end up with a large number of shards that can't 
 recover properly.
 This cluster is running a custom branch forked from 4.8 and has SOLR-5473, 
 SOLR-5495 and SOLR-5468 applied. We have a large number of small collections 
 (120 collections each with approx 5M docs) on 16 Solr nodes. We are also 
 using the overseer roles feature to designate two specified nodes as 
 overseers. However, I think the problem that we're seeing is not specific to 
 the overseer roles feature.
 As soon as the overseer was shutdown, we saw the following on the node which 
 was next in line to become the overseer:
 {code}
 2014-05-20 09:55:39,261 [main-EventThread] INFO  solr.cloud.ElectionContext  
 - I am going to be the leader ec2-xx.compute-1.amazonaws.com:8987_solr
 2014-05-20 09:55:39,265 [main-EventThread] WARN  solr.cloud.LeaderElector  - 
 org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
 NodeExists for /overseer_elect/leader
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
   at 
 org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:432)
   at 
 org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:73)
   at 
 org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:429)
   at 
 org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:386)
   at 
 org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:373)
   at 
 org.apache.solr.cloud.OverseerElectionContext.runLeaderProcess(ElectionContext.java:551)
   at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:142)
   at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:110)
   at org.apache.solr.cloud.LeaderElector.access$200(LeaderElector.java:55)
   at 
 org.apache.solr.cloud.LeaderElector$ElectionWatcher.process(LeaderElector.java:303)
   at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
 {code}
 When the overseer leader node is gracefully shutdown, we get the following in 
 the logs:
 {code}
 2014-05-20 09:55:39,254 [Thread-63] ERROR solr.cloud.Overseer  - Exception in 
 Overseer main queue loop
 org.apache.solr.common.SolrException: Could not load collection from ZK:sm12
   at 
 org.apache.solr.common.cloud.ZkStateReader.getExternCollectionFresh(ZkStateReader.java:778)
   at 
 org.apache.solr.common.cloud.ZkStateReader.updateClusterState(ZkStateReader.java:553)
   at 
 org.apache.solr.common.cloud.ZkStateReader.updateClusterState(ZkStateReader.java:246)
   at 
 org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:237)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.InterruptedException
   at java.lang.Object.wait(Native Method)
   at java.lang.Object.wait(Object.java:503)
   at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
   at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1040)
   at 
 org.apache.solr.common.cloud.SolrZkClient$4.execute(SolrZkClient.java:226)
   at 
 org.apache.solr.common.cloud.SolrZkClient$4.execute(SolrZkClient.java:223)
   at 
 org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:73)
   at 
 org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:223)
   at 
 org.apache.solr.common.cloud.ZkStateReader.getExternCollectionFresh(ZkStateReader.java:767)
   ... 4 more
 2014-05-20 09:55:39,254 [Thread-63] INFO  solr.cloud.Overseer  - Overseer 
 Loop exiting : ec2-xx.compute-1.amazonaws.com:8986_solr

[jira] [Commented] (LUCENE-5716) Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).

2014-05-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014246#comment-14014246
 ] 

Dawid Weiss commented on LUCENE-5716:
-

Option (3) is feasible, I just checked (same project as above, jvmti branch). 
Even though FileInputStream is already loaded, it can be redefined with 
(relative) ease. If we used ASM, we could just inject some simple before/after 
hooks that would call into some normal (non-asmified) code.

 Track file handle leaks (FileDescriptor, NIO Path SPI and Socket mostly).
 -

 Key: LUCENE-5716
 URL: https://issues.apache.org/jira/browse/LUCENE-5716
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5720) Optimize on disk packed integers part 2

2014-05-30 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5720:
---

 Summary: Optimize on disk packed integers part 2
 Key: LUCENE-5720
 URL: https://issues.apache.org/jira/browse/LUCENE-5720
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0


These are heavily optimized for the in-RAM case (for example FieldCache uses 
PackedInts.FAST to make it even faster so), but for the docvalues case they are 
not: we always essentially use COMPACT, we have only one decoder that must 
solve all the cases, even the complicated ones, we use BlockPackedWriter for 
all integers (even if they are ordinals), etc.







[jira] [Updated] (LUCENE-5720) Optimize on disk packed integers part 2

2014-05-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5720:


Attachment: LUCENE-5720.patch

Here's my first stab. This adds a fastestDirectBits(float overhead) rather than 
trying to integrate with the existing one, because the logic is different when 
dealing with the directory API.

We can probably improve this further for 5.0, e.g. the directory API was 
always geared toward sequential access and we might be able to introduce some 
API changes later to speed it up more, but this seems like a safe win.
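As a rough illustration of the trade-off such a method makes (the DirectBits class and its selection logic are assumptions for this sketch, not Lucene's actual implementation): round the minimal bit width up to a byte-aligned width when the caller's overhead budget allows, since aligned widths decode faster from disk:

```java
// Hypothetical sketch: pick a bits-per-value width for on-disk packed ints.
// Given the minimal (compact) width, prefer a byte-aligned width (8/16/32/64)
// if the extra bits stay within the acceptable overhead ratio; otherwise
// fall back to the compact width.
public class DirectBits {
    public static int fastestDirectBits(int bitsRequired, float acceptableOverheadRatio) {
        int maxBits = bitsRequired + (int) (bitsRequired * acceptableOverheadRatio);
        if (maxBits >= 64) return 64;
        for (int width : new int[] {8, 16, 32, 64}) {
            if (width >= bitsRequired) {
                // Take the aligned width only if it fits the overhead budget.
                return width <= maxBits ? width : bitsRequired;
            }
        }
        return 64;
    }
}
```

For example, 12 required bits with a 0.5 overhead budget would be stored as 16-bit values, while a tight budget keeps the compact 12-bit encoding.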

 Optimize on disk packed integers part 2
 ---

 Key: LUCENE-5720
 URL: https://issues.apache.org/jira/browse/LUCENE-5720
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5720.patch


 These are heavily optimized for the in-RAM case (for example FieldCache uses 
 PackedInts.FAST to make it even faster so), but for the docvalues case they 
 are not: we always essentially use COMPACT, we have only one decoder that 
 must solve all the cases, even the complicated ones, we use BlockPackedWriter 
 for all integers (even if they are ordinals), etc.






[jira] [Commented] (LUCENE-5720) Optimize on disk packed integers part 2

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014261#comment-14014261
 ] 

Robert Muir commented on LUCENE-5720:
-

I tried to hack luceneutil up for a performance test. Not sure wikipedia 
'title' is the best field for this, but I tried it on 1M documents:

Size: 500KB increase in docvalues data (5.7MB -> 6.2MB).
Note that in context, the entire index is 385MB (no stored fields or vectors), 
so the 500KB docvalues increase is negligible.

20% improvement in sort performance.

 Optimize on disk packed integers part 2
 ---

 Key: LUCENE-5720
 URL: https://issues.apache.org/jira/browse/LUCENE-5720
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5720.patch


 These are heavily optimized for the in-RAM case (for example FieldCache uses 
 PackedInts.FAST to make it even faster so), but for the docvalues case they 
 are not: we always essentially use COMPACT, we have only one decoder that 
 must solve all the cases, even the complicated ones, we use BlockPackedWriter 
 for all integers (even if they are ordinals), etc.






[jira] [Commented] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014274#comment-14014274
 ] 

ASF subversion and git services commented on LUCENE-5717:
-

Commit 1598755 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1598755 ]

LUCENE-5717: Postings highlighter support for multi term queries within 
filtered and constant score queries

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.






[jira] [Commented] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014275#comment-14014275
 ] 

ASF subversion and git services commented on LUCENE-5717:
-

Commit 1598756 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1598756 ]

LUCENE-5717: Postings highlighter support for multi term queries within 
filtered and constant score queries

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.






[jira] [Resolved] (LUCENE-5717) Postings highlighter support for multi term queries within filtered and constant score queries

2014-05-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5717.
-

   Resolution: Fixed
Fix Version/s: 5.0
   4.9

Thanks Luca!

 Postings highlighter support for multi term queries within filtered and 
 constant score queries
 --

 Key: LUCENE-5717
 URL: https://issues.apache.org/jira/browse/LUCENE-5717
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5717.patch


 The automata extraction that is done to make multi term queries work with the 
 postings highlighter does support boolean queries but it should also support 
 other compound queries like filtered and constant score.






[jira] [Commented] (LUCENE-5718) More flexible compound queries (containing mtq) support in postings highlighter

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014278#comment-14014278
 ] 

Robert Muir commented on LUCENE-5718:
-

For the actual multi-term queries themselves, we could consider a method to get 
an automaton representation of what they do. On one hand it's specific to 
highlighting; on the other, I don't know a better way that avoids a very 
expensive rewrite against the entire index.

As far as punching through the query structure goes, we have a similar thing 
(extractTerms) geared at highlighting-type tasks; it avoids MUST_NOT clauses, 
for example. We could consider an extractQueries... maybe there is a cleaner 
solution.
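A hedged sketch of what such a generic "extract the subqueries" walk could look like. The Query/Leaf/Compound types below are stand-ins invented for illustration, not Lucene's real classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: recursively unwrap compound queries into their leaves,
// the way a hypothetical getSubQueries/extractQueries facility might.
public class QueryFlattener {
    public interface Query {}
    public static class Leaf implements Query {
        public final String term;
        public Leaf(String term) { this.term = term; }
    }
    public static class Compound implements Query {
        public final List<Query> clauses;
        public Compound(Query... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    // Returns the flat list of leaf queries the compound is made of.
    public static List<Leaf> extractLeaves(Query q) {
        List<Leaf> out = new ArrayList<>();
        collect(q, out);
        return out;
    }

    private static void collect(Query q, List<Leaf> out) {
        if (q instanceof Leaf) {
            out.add((Leaf) q);
        } else if (q instanceof Compound) {
            for (Query sub : ((Compound) q).clauses) collect(sub, out);
        }
    }
}
```

With an abstract base class exposing such a method, callers like the highlighter would not need per-type switches over each concrete compound query.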

 More flexible compound queries (containing mtq) support in postings 
 highlighter
 ---

 Key: LUCENE-5718
 URL: https://issues.apache.org/jira/browse/LUCENE-5718
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.8.1
Reporter: Luca Cavanna

 The postings highlighter currently pulls the automata from multi term queries 
 and doesn't require calling rewrite to make highlighting work. In order to do 
 so it also needs to check whether the query is a compound one and, if so, 
 extract its subqueries. This is currently done in the MultiTermHighlighting 
 class and works well, but has two potential problems:
 1) not all possible compound queries are necessarily supported, as we need 
 to go over each of them one by one (see LUCENE-5717), and this requires 
 keeping the switch up-to-date if new queries get added to Lucene
 2) it doesn't support custom compound queries, only the set of queries 
 available out-of-the-box
 I've been thinking about how this can be improved and one of the ideas I came 
 up with is to introduce a generic way to retrieve the subqueries from 
 compound queries, like for instance have a new abstract base class with a 
 getLeaves or getSubQueries method and have all the compound queries extend 
 it. What this method would do is return a flat array of all the leaf queries 
 that the compound query is made of. 
 Not sure whether this would be needed in other places in lucene, but it 
 doesn't seem like a small change and it would definitely affect (or benefit?) 
 more than just the postings highlighter support for multi term queries.
 In particular the second problem (custom queries) seems hard to solve without 
 a way to expose this info directly from the query, though, unless we want to 
 make the MultiTermHighlighting#extractAutomata method extensible in some way.
 I'd like to hear what people think and work on this as soon as we've 
 identified which direction we want to take.






[jira] [Commented] (LUCENE-5715) Upgrade direct dependencies known to be older than transitive dependencies

2014-05-30 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014434#comment-14014434
 ] 

Steve Rowe commented on LUCENE-5715:


I'm adding the capability to report the latest version of dependencies with 
version conflicts, so people don't have to go look that up in order to fix it 
via a version upgrade.  For example:

{noformat}
[libversions] VERSION CONFLICT: transitive dependency in module(s) uima:
[libversions] /commons-digester/commons-digester=2.0
[libversions] +-- /commons-beanutils/commons-beanutils=1.8.0 <-- Conflict 
(direct=1.7.0, latest=1.9.2)
[libversions] ... and 1 more
{noformat} 
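Detecting that 1.9.2 is newer than a direct 1.7.0 dependency boils down to a numeric comparison of dotted version strings. Here is an illustrative helper under that assumption (purely numeric segments; it is not part of the actual libversions task, which handles richer Ivy revision formats):

```java
// Hypothetical helper: compare dotted version strings numerically,
// so "1.9.2" sorts after "1.8.0" and missing segments count as zero.
public class VersionCompare {
    public static int compare(String a, String b) {
        String[] as = a.split("\\."), bs = b.split("\\.");
        int n = Math.max(as.length, bs.length);
        for (int i = 0; i < n; i++) {
            int ai = i < as.length ? Integer.parseInt(as[i]) : 0;
            int bi = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (ai != bi) return Integer.compare(ai, bi);
        }
        return 0;
    }
}
```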

 Upgrade direct dependencies known to be older than transitive dependencies
 --

 Key: LUCENE-5715
 URL: https://issues.apache.org/jira/browse/LUCENE-5715
 Project: Lucene - Core
  Issue Type: Task
  Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor

 LUCENE-5442 added functionality to the {{check-lib-versions}} ant task to 
 fail the build if a direct dependency's version conflicts with that of a 
 transitive dependency.
 {{ivy-ignore-conflicts.properties}} contains a list of 19 transitive 
 dependencies with versions that are newer than direct dependencies' versions: 
 https://issues.apache.org/jira/browse/LUCENE-5442?focusedCommentId=14012220&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14012220
 We should try to keep that list small.  It's likely that upgrading most of 
 those dependencies will require little effort.






[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1148: POMs out of sync

2014-05-30 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1148/

2 tests failed.
FAILED:  org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch

Error Message:
There are still nodes recoverying - waited for 330 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 330 
seconds
at 
__randomizedtesting.SeedInfo.seed([9B6D809ADDE3BF0C:1A8B0E82AABCDF30]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:178)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:137)
at 
org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:132)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCollectionsAPI(CollectionsAPIDistributedZkTest.java:770)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:203)


FAILED:  org.apache.solr.cloud.HttpPartitionTest.testDistribSearch

Error Message:
No registered leader was found after waiting for 6ms , collection: 
c8n_1x3_lf slice: shard1

Stack Trace:
org.apache.solr.common.SolrException: No registered leader was found after 
waiting for 6ms , collection: c8n_1x3_lf slice: shard1
at 
__randomizedtesting.SeedInfo.seed([75D3F100E960383D:F4357F189E3F5801]:0)
at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:567)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf3WithLeaderFailover(HttpPartitionTest.java:349)
at 
org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:148)




Build Log:
[...truncated 54639 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:490: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:182: 
The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77:
 Java returned: 1

Total time: 244 minutes 51 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure




[jira] [Updated] (LUCENE-5720) Optimize on disk packed integers part 2

2014-05-30 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5720:


Attachment: LUCENE-5720.patch

With file formats and javadocs.

 Optimize on disk packed integers part 2
 ---

 Key: LUCENE-5720
 URL: https://issues.apache.org/jira/browse/LUCENE-5720
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5720.patch, LUCENE-5720.patch


 These are heavily optimized for the in-RAM case (for example FieldCache uses 
 PackedInts.FAST to make it even faster so), but for the docvalues case they 
 are not: we always essentially use COMPACT, we have only one decoder that 
 must solve all the cases, even the complicated ones, we use BlockPackedWriter 
 for all integers (even if they are ordinals), etc.






[jira] [Created] (LUCENE-5721) Monotonic packed could maybe be faster

2014-05-30 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5721:
---

 Summary: Monotonic packed could maybe be faster
 Key: LUCENE-5721
 URL: https://issues.apache.org/jira/browse/LUCENE-5721
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir


This compression is used in lucene for monotonically increasing offsets, e.g. 
stored fields index, dv BINARY/SORTED_SET offsets, OrdinalMap (used for merging 
and faceting dv) and so on.

Today this stores a +/- deviation from an expected line of y=mx + b, where b is 
the minValue for the block and m is the average delta from the previous value. 
Because it can be negative, we have to do some additional work to zigzag-decode.

Can we just instead waste a bit for every value explicitly (lower the minValue 
by the min delta) so that deltas are always positive and we can have a simpler 
decode? Maybe if we do this, the new guy should assert that values are actually 
monotonic at write-time. The current one supports mostly-monotonic input, but do 
we really need that flexibility anywhere? If so it could always be kept...
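To make the trade-off concrete, here is an illustrative sketch (class and method names are hypothetical) contrasting today's zigzag-decoded signed deviations from y = mx + b with the proposed always-positive deltas:

```java
// Sketch of the two encodings discussed: the current scheme stores signed
// deviations from the expected line y = b + m*i and zigzag-encodes them;
// the alternative lowers the base by the most negative deviation once per
// block, so every stored delta is non-negative and decode is a single add.
public class MonotonicSketch {
    // Current-style: zigzag keeps small-magnitude signed values small,
    // but costs extra shift/xor work on every decode.
    public static long zigzagEncode(long dev) { return (dev << 1) ^ (dev >> 63); }
    public static long zigzagDecode(long z)  { return (z >>> 1) ^ -(z & 1); }

    // Proposed-style: shift the base down by minDev at write time, so stored
    // values are (dev - minDev) >= 0. Returns [minDev, delta0, delta1, ...].
    public static long[] encodePositive(long[] devs) {
        long minDev = Long.MAX_VALUE;
        for (long d : devs) minDev = Math.min(minDev, d);
        long[] out = new long[devs.length + 1];
        out[0] = minDev;                      // stored once per block
        for (int i = 0; i < devs.length; i++) out[i + 1] = devs[i] - minDev;
        return out;
    }
}
```

The positive-delta form spends up to one extra bit per value (the widened range) in exchange for dropping the zigzag step from the hot decode path.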






[jira] [Commented] (LUCENE-5721) Monotonic packed could maybe be faster

2014-05-30 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014514#comment-14014514
 ] 

Robert Muir commented on LUCENE-5721:
-

Also it would be nice to think about how much the floating point stuff saves 
compression-wise in practice. Maybe an integer average is enough?

 Monotonic packed could maybe be faster
 --

 Key: LUCENE-5721
 URL: https://issues.apache.org/jira/browse/LUCENE-5721
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir

 This compression is used in lucene for monotonically increasing offsets, e.g. 
 stored fields index, dv BINARY/SORTED_SET offsets, OrdinalMap (used for 
 merging and faceting dv) and so on.
 Today this stores a +/- deviation from an expected line of y=mx + b, where b 
 is the minValue for the block and m is the average delta from the previous 
 value. Because it can be negative, we have to do some additional work to 
 zigzag-decode.
 Can we just instead waste a bit for every value explicitly (lower the 
 minValue by the min delta) so that deltas are always positive and we can have 
 a simpler decode? Maybe if we do this, the new guy should assert that values 
 are actually monotonic at write-time. The current one supports mostly-monotonic 
 input, but do we really need that flexibility anywhere? If so it could always 
 be kept...






[jira] [Commented] (SOLR-5808) collections?action=SPLITSHARD running out of heap space due to large segments

2014-05-30 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014530#comment-14014530
 ] 

Shalin Shekhar Mangar commented on SOLR-5808:
-

I just ran into this as well. A large segment on a 500M doc index (12GB heap) 
took down the node. I'll investigate and try to reduce the memory requirements.

 collections?action=SPLITSHARD running out of heap space due to large segments
 -

 Key: SOLR-5808
 URL: https://issues.apache.org/jira/browse/SOLR-5808
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 4.7
Reporter: Will Butler
Assignee: Shalin Shekhar Mangar
  Labels: outofmemory, shard, split

 This issue is related to [https://issues.apache.org/jira/browse/SOLR-5214]. 
 Although memory issues due to merging have been resolved, we still run out of 
 memory when splitting a shard containing a large segment (created by 
 optimizing). The Lucene MultiPassIndexSplitter is able to split the index 
 without error.


