[jira] [Comment Edited] (SOLR-2649) MM ignored in edismax queries with operators

2014-04-30 Thread Greg Pendlebury (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986408#comment-13986408
 ] 

Greg Pendlebury edited comment on SOLR-2649 at 5/1/14 6:54 AM:
---

I applied this patch to 4.7.2 yesterday and tried it out on our dev servers. At 
first I thought it was pretty bad and failed completely... but then I had a 
good think, re-read everything on this ticket and this article [1], and 
realised my understanding of the problem was flawed. Using just this patch in 
isolation, it converted all of the OR operators to AND operators with mm=100%. 
Very confusing behaviour for our business area, but I realise now that it is 
correct.

Perhaps the confusion stems from the way the q.op and mm parameters interact. 
If the behaviour instead separated them more clearly, we could change the 
config entirely. At the moment our mm is 100% because we effectively want 
q.op=AND, but if q.op were instead applied 1) always, 2) first, and 3) 
independently from mm (i.e. insert AND wherever an operator is missing), we 
could set mm=1 and achieve what we want while respecting the OR operators 
provided by the user.

I've added this on top of the patch already here and deployed again to our dev 
servers using 'q.op=AND & mm=1', and now everything appears to function as it 
should. I'll upload the patch in a minute; it includes several unit tests 
with different mm and q.op values. From my perspective the two parameters are 
interacting appropriately, but perhaps someone with more convoluted mm settings 
could give it a try?
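
For reference, the combination described above corresponds to request 
parameters along these lines (host, core, and query terms are hypothetical, 
not taken from our actual config):

```
# Hypothetical edismax request; q.op and mm are the relevant parameters.
http://localhost:8983/solr/collection1/select
    ?defType=edismax
    &q=stocks OR oil OR gold
    &q.op=AND   # supplies the implicit operator between bare terms
    &mm=1       # minimum-should-match no longer overrides explicit ORs
```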

The change is simply in the constructor of the ExtendedSolrQueryParser class, 
where the default operator was hardcoded to OR (presumably so that mm would 
take care of things). I've made it look at the parameter provided with the 
query (copied the code from the Simple QParser and adjusted to fit).
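
The idea of that constructor change can be pictured with a small standalone 
sketch; the names below (resolveDefaultOperator, the Operator enum) are 
illustrative, not the actual Solr code in the patch:

```java
public class DefaultOperatorSketch {

    public enum Operator { OR, AND }

    // Instead of hardcoding OR, derive the default operator from the q.op
    // request parameter, falling back to OR (the old behaviour) when absent.
    public static Operator resolveDefaultOperator(String qop) {
        if (qop == null) return Operator.OR;
        return "AND".equalsIgnoreCase(qop.trim()) ? Operator.AND : Operator.OR;
    }

    public static void main(String[] args) {
        System.out.println(resolveDefaultOperator("AND")); // AND
        System.out.println(resolveDefaultOperator(null));  // OR
    }
}
```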

I have slightly tweaked the unit test from the first patch that was marked 
TODO. I think not finding a result in that case is entirely appropriate if the 
user can now tweak q.op. Opinions may vary, of course.

[1] http://searchhub.org/2011/12/28/why-not-and-or-and-not/



> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Reporter: Magnus Bergmark
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-2649-with-Qop.patch, SOLR-2649.diff, SOLR-2649.patch
>
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the 
> primary features of dismax.

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1541 - Failure!

2014-04-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1541/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

No tests ran.

Build Log:
[...truncated 8349 lines...]
   [junit4] JVM J0: stdout was not empty, see: 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/test/temp/junit4-J0-20140501_065329_348.sysout
   [junit4] >>> JVM J0: stdout (verbatim) 
   [junit4] #
   [junit4] # A fatal error has been detected by the Java Runtime Environment:
   [junit4] #
   [junit4] #  SIGFPE (0x8) at pc=0x7fff8a0d6b63, pid=203, tid=21251
   [junit4] #
   [junit4] # JRE version: Java(TM) SE Runtime Environment (8.0_05-b13) (build 
1.8.0_05-b13)
   [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.5-b02 mixed mode 
bsd-amd64 compressed oops)
   [junit4] # Problematic frame:
   [junit4] # C  [libsystem_c.dylib+0x2b63]  __commpage_gettimeofday+0x43
   [junit4] #
   [junit4] # Failed to write core dump. Core dumps have been disabled. To 
enable core dumping, try "ulimit -c unlimited" before starting Java again
   [junit4] #
   [junit4] # An error report file with more information is saved as:
   [junit4] # 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/test/J0/hs_err_pid203.log
   [junit4] #
   [junit4] # Compiler replay data is saved as:
   [junit4] # 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/test/J0/replay_pid203.log
   [junit4] #
   [junit4] # If you would like to submit a bug report, please visit:
   [junit4] #   http://bugreport.sun.com/bugreport/crash.jsp
   [junit4] #
   [junit4] <<< JVM J0: EOF 

[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/jre/bin/java 
-XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps 
-Dtests.prefix=tests -Dtests.seed=D104BAEF7A7EA4D6 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perMethod 
-Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false 
-Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 
-DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/test/temp
 
-Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true 
-Dfile.encoding=UTF-8 -classpath 
/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/classes/test:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/test-framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/codecs/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/grouping/lucene-grouping-5.0-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/core/classes/java:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/test-framework/lib/junit-4.10.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/test-framework/lib/randomizedtesting-runner-2.1.3.jar:/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/join/classes/java:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-launcher.jar:/Users/jenkins/.ant/lib/ivy-2.3.0.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-antlr.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-bcel.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-bsf.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-log4j.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-oro.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-regexp.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-resolver.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-apache-xalan2.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2/lib/ant-commons-logging.jar:/Users/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT

[jira] [Updated] (SOLR-2649) MM ignored in edismax queries with operators

2014-04-30 Thread Greg Pendlebury (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Pendlebury updated SOLR-2649:
--

Attachment: SOLR-2649-with-Qop.patch

> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Reporter: Magnus Bergmark
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-2649-with-Qop.patch, SOLR-2649.diff, SOLR-2649.patch
>
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the 
> primary features of dismax.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators

2014-04-30 Thread Greg Pendlebury (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986408#comment-13986408
 ] 

Greg Pendlebury commented on SOLR-2649:
---

I applied this patch to 4.7.2 yesterday and tried it out on our dev servers. At 
first I thought it was pretty bad and failed completely... but then I had a 
good think, re-read everything on this ticket and this article [1], and 
realised my understanding of the problem was flawed. Using just this patch in 
isolation, it converted all of the OR operators to AND operators with mm=100%. 
Very confusing behaviour for our business area, but I realise now that it is 
correct.

Perhaps the confusion stems from the way the q.op and mm parameters interact. 
If the behaviour instead separated them more clearly, we could change the 
config entirely. At the moment our mm is 100% because we effectively want 
q.op=AND, but if q.op were instead applied 1) always, 2) first, and 3) 
independently from mm (i.e. insert AND wherever an operator is missing), we 
could set mm=1 and achieve what we want while respecting the OR operators 
provided by the user.

I've added this on top of the patch already here and deployed again to our dev 
servers using 'q.op=AND & mm=1', and now everything appears to function as it 
should. I'll upload the patch in a minute; it includes several unit tests 
with different mm and q.op values. From my perspective the two parameters are 
interacting appropriately, but perhaps someone with more convoluted mm settings 
could give it a try?

The change is simply in the constructor of the ExtendedSolrQueryParser class, 
where the default operator was hardcoded to OR (presumably so that mm would 
take care of things). I've made it look at the parameter provided with the 
query (copied the code from the Simple QParser and adjusted to fit).

I have slightly tweaked the unit test from the first patch that was marked 
TODO. I think not finding a result in that case is entirely appropriate if the 
user can now tweak q.op. Opinions may vary, of course.

[1] http://searchhub.org/2011/12/28/why-not-and-or-and-not/

> MM ignored in edismax queries with operators
> 
>
> Key: SOLR-2649
> URL: https://issues.apache.org/jira/browse/SOLR-2649
> Project: Solr
>  Issue Type: Bug
>  Components: query parsers
>Reporter: Magnus Bergmark
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-2649.diff, SOLR-2649.patch
>
>
> Hypothetical scenario:
>   1. User searches for "stocks oil gold" with MM set to "50%"
>   2. User adds "-stockings" to the query: "stocks oil gold -stockings"
>   3. User gets no hits since MM was ignored and all terms were AND-ed 
> together
> The behavior seems to be intentional, although the reason why is never 
> explained:
>   // For correct lucene queries, turn off mm processing if there
>   // were explicit operators (except for AND).
>   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
> (lines 232-234 taken from 
> tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
> This makes edismax unsuitable as a replacement for dismax; mm is one of the 
> primary features of dismax.






[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20274 - Failure!

2014-04-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20274/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestIndexWriter.testManyFields

Error Message:
3.x codec does not support payloads on vectors!

Stack Trace:
java.lang.UnsupportedOperationException: 3.x codec does not support payloads on 
vectors!
at 
__randomizedtesting.SeedInfo.seed([A7DEAB3AF9A18B1F:8FE0FDA4EBDC8889]:0)
at 
org.apache.lucene.codecs.lucene3x.PreFlexRWTermVectorsWriter.startField(PreFlexRWTermVectorsWriter.java:82)
at 
org.apache.lucene.index.TermVectorsConsumerPerField.finishDocument(TermVectorsConsumerPerField.java:83)
at 
org.apache.lucene.index.TermVectorsConsumer.finishDocument(TermVectorsConsumer.java:112)
at org.apache.lucene.index.TermsHash.finishDocument(TermsHash.java:93)
at 
org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:345)
at 
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:222)
at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:459)
at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1541)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1192)
at 
org.apache.lucene.index.TestIndexWriter.testManyFields(TestIndexWriter.java:275)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.

[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20267 - Failure!

2014-04-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20267/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads

Error Message:
fdx size mismatch: docCount is 10 but fdx file size is 117 
file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@725790ac); 
now aborting this merge to prevent index corruption

Stack Trace:
java.lang.RuntimeException: fdx size mismatch: docCount is 10 but fdx file size 
is 117 
file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@725790ac); 
now aborting this merge to prevent index corruption
at 
__randomizedtesting.SeedInfo.seed([E7718A0E31497745:6809733D1A83CD36]:0)
at 
org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsWriter.finish(Lucene40StoredFieldsWriter.java:233)
at 
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:96)
at 
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:415)
at 
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:512)
at 
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:622)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3203)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3179)
at 
org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:995)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:939)
at 
org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads(TestIndexWriterWithThreads.java:168)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequire

[jira] [Updated] (LUCENE-5634) Reuse TokenStream instances in Field

2014-04-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5634:
---

Attachment: LUCENE-5634.patch

Here's an alternate approach, pulled from early iterations on LUCENE-5611, to 
specialize indexing just a single string ... there are still nocommits; it 
needs to be made more generic to any non-tokenized field, etc.  It's sort of 
silly to build up an entire TokenStream when really you just need to index the 
one token ...

This patch indexes geonames in ~38.5 sec, ~31% faster than trunk.

> Reuse TokenStream instances in Field
> 
>
> Key: LUCENE-5634
> URL: https://issues.apache.org/jira/browse/LUCENE-5634
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5634.patch, LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...






[jira] [Commented] (LUCENE-5634) Reuse TokenStream instances in Field

2014-04-30 Thread Shay Banon (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986177#comment-13986177
 ] 

Shay Banon commented on LUCENE-5634:


This optimization has proven to help a lot in the context of ES, but there we 
can use a static thread local since we are fully in control of the threading 
model. With Lucene itself, which can be used in many different environments, 
this can cause some unexpected behavior. For example, it might cause Tomcat to 
warn about leaked resources when unloading a war.

> Reuse TokenStream instances in Field
> 
>
> Key: LUCENE-5634
> URL: https://issues.apache.org/jira/browse/LUCENE-5634
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...






[jira] [Commented] (LUCENE-5634) Reuse TokenStream instances in Field

2014-04-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986149#comment-13986149
 ] 

Michael McCandless commented on LUCENE-5634:


Initial patch, but I'd like to find a way to reuse NumericTokenStream
too ... it's trickier since the precStep is final (maybe we can
un-final it and add a setter?).

I ran a quick test, indexing all of Geonames (8.6 M docs), which is a
good test for "tiny documents" ... with trunk it takes 56 seconds and with
the patch it's 45 seconds, ~20% faster.


> Reuse TokenStream instances in Field
> 
>
> Key: LUCENE-5634
> URL: https://issues.apache.org/jira/browse/LUCENE-5634
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...






[jira] [Updated] (LUCENE-5634) Reuse TokenStream instances in Field

2014-04-30 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5634:
---

Attachment: LUCENE-5634.patch

> Reuse TokenStream instances in Field
> 
>
> Key: LUCENE-5634
> URL: https://issues.apache.org/jira/browse/LUCENE-5634
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5634.patch
>
>
> If you don't reuse your Doc/Field instances (which is very expert: I
> suspect few apps do) then there's a lot of garbage created to index each
> StringField because we make a new StringTokenStream or
> NumericTokenStream (and their Attributes).
> We should be able to re-use these instances via a static
> ThreadLocal...






[jira] [Created] (LUCENE-5634) Reuse TokenStream instances in Field

2014-04-30 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5634:
--

 Summary: Reuse TokenStream instances in Field
 Key: LUCENE-5634
 URL: https://issues.apache.org/jira/browse/LUCENE-5634
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
 Fix For: 4.9, 5.0


If you don't reuse your Doc/Field instances (which is very expert: I
suspect few apps do) then there's a lot of garbage created to index each
StringField because we make a new StringTokenStream or
NumericTokenStream (and their Attributes).

We should be able to re-use these instances via a static
ThreadLocal...
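
The reuse pattern proposed above can be sketched in isolation; this is a 
hedged illustration of the ThreadLocal idea, not the actual Lucene patch, and 
all class and method names here are invented stand-ins:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReusableStreamSketch {

    // Counts constructions so the saving is visible.
    public static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    // Stand-in for StringTokenStream: cheap to reset, costly to construct.
    public static final class Stream {
        public String value;
        public Stream() { ALLOCATIONS.incrementAndGet(); }
        public Stream reset(String v) { this.value = v; return this; }
    }

    // One instance per thread, created lazily on first use, then reused.
    private static final ThreadLocal<Stream> REUSED =
            ThreadLocal.withInitial(Stream::new);

    public static Stream tokenStream(String value) {
        return REUSED.get().reset(value); // reuse: only reset mutable state
    }

    public static void main(String[] args) {
        Stream a = tokenStream("foo");
        Stream b = tokenStream("bar");
        System.out.println(a == b); // true: the same per-thread instance
    }
}
```

The trade-off raised in the thread applies here too: a static ThreadLocal 
keeps one instance alive per thread, which servlet containers with pooled 
threads may report as a leak when unloading an application.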







[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-30 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986116#comment-13986116
 ] 

Anshum Gupta commented on SOLR-6022:


Sure, let that be. Will look into it more and open a new JIRA if required.
Thanks.

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch, SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-30 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986089#comment-13986089
 ] 

Ryan Ernst commented on SOLR-6022:
--

bq. I am not sure if this should be the IndexAnalyzer or the QueryAnalyzer as 
AFAIR, this tries to construct a query out of the terms from a document (given 
an id).

I only kept what was there before.  I don't know enough about MLT to say 
whether it is correct or not.  If it is broken, I think another JIRA 
should be opened.

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch, SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20243 - Failure!

2014-04-30 Thread Robert Muir
The exception handling in processDocument() is too complex. It must do
try+finally with two separate cases, aborting exception and
non-aborting exception, depending on where it happens.

I think since the whole thing loops through fields, if we factor out
processField it will be easier...
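The factoring Robert describes might look roughly like this. The names and the aborting/non-aborting split below are illustrative, not the actual DefaultIndexingChain code:

```java
// Illustrative sketch of pulling per-field work out of processDocument()
// so each field gets its own try/finally; not the real indexing-chain code.
public class IndexingChainSketch {

    /** Thrown when the whole IndexWriter must abort (e.g. corruption risk). */
    static final class AbortingException extends RuntimeException {
        AbortingException(Throwable cause) { super(cause); }
    }

    static int fieldsProcessed = 0;

    static void processDocument(String[] fields) {
        for (String field : fields) {
            processField(field);   // one small unit, one failure scope
        }
    }

    static void processField(String field) {
        boolean aborting = false;
        try {
            if (field == null) {
                aborting = true;   // would corrupt the index: abort the writer
                throw new AbortingException(new NullPointerException("field"));
            }
            fieldsProcessed++;     // non-aborting indexing work happens here
        } finally {
            if (aborting) {
                // cleanup for the fatal case (e.g. mark the writer failed)
            } else {
                // cleanup for the recoverable case (just skip this document)
            }
        }
    }

    public static void main(String[] args) {
        processDocument(new String[] {"title", "body"});
        System.out.println(fieldsProcessed);  // 2
    }
}
```

With the per-field helper, each try/finally covers exactly one field, instead of one block having to distinguish where inside the loop the failure occurred.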

On Wed, Apr 30, 2014 at 4:21 PM, Robert Muir  wrote:
> reproduces if you tack on -Dtests.dups=100 ... looking
>
> On Wed, Apr 30, 2014 at 4:04 PM,   wrote:
>> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20243/
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads
>>
>> Error Message:
>> fdx size mismatch: docCount is 12 but fdx file size is 133 
>> file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@3afabf87);
>>  now aborting this merge to prevent index corruption
>>
>> [quoted stack trace trimmed; see the original build-failure mail below]

[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Marvin Justice (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986042#comment-13986042
 ] 

Marvin Justice commented on SOLR-6035:
--

Just spun up a 4 shard x 2 replica test collection. Replication seems to be 
working just fine for me when going directly to the coreUrl. 

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node). 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Resolved] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5591.


   Resolution: Fixed
Fix Version/s: 5.0
   4.9
 Assignee: Shai Erera

Thanks Mike. Committed to trunk and 4x.

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
>Assignee: Shai Erera
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see off hand a quick way to guesstimate 
> that - I thought to use the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?
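The proposed guesstimate, {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}, amounts to the arithmetic below. This sketches only the byte estimate, not the actual ReaderAndUpdates/FlushInfo change:

```java
// Sketch of the proposed flush-size guesstimate for the FlushInfo;
// only the arithmetic, not the actual ReaderAndUpdates code.
public class FlushSizeEstimate {

    // Scale the in-memory update size up by the inverse fraction of documents
    // updated: if 10% of docs carry updates, the full on-disk field is
    // roughly 10x the in-memory size.
    static long estimateBytes(long sizeOfInMemUpdates, int maxDoc, int numUpdatedDocs) {
        if (numUpdatedDocs == 0) return 0;
        return (long) (sizeOfInMemUpdates * ((double) maxDoc / numUpdatedDocs));
    }

    public static void main(String[] args) {
        // 1 MB of in-memory updates covering 10k of 100k docs -> ~10 MB estimate
        System.out.println(estimateBytes(1_048_576, 100_000, 10_000)); // 10485760
    }
}
```

A rough overestimate is enough here: its only job is to keep NRTCachingDirectory from caching a huge DV write in RAM.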






[jira] [Commented] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986028#comment-13986028
 ] 

ASF subversion and git services commented on LUCENE-5591:
-

Commit 1591474 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591474 ]

LUCENE-5591: pass proper IOContext when writing DocValues updates

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see off hand a quick way to guesstimate 
> that - I thought to use the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20243 - Failure!

2014-04-30 Thread Robert Muir
reproduces if you tack on -Dtests.dups=100 ... looking

On Wed, Apr 30, 2014 at 4:04 PM,   wrote:
> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20243/
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads
>
> Error Message:
> fdx size mismatch: docCount is 12 but fdx file size is 133 
> file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@3afabf87);
>  now aborting this merge to prevent index corruption
>
> [quoted stack trace trimmed; see the original build-failure mail below]

[jira] [Commented] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13986012#comment-13986012
 ] 

ASF subversion and git services commented on LUCENE-5591:
-

Commit 1591469 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1591469 ]

LUCENE-5591: pass proper IOContext when writing DocValues updates

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see off hand a quick way to guesstimate 
> that - I thought to use the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20243 - Failure!

2014-04-30 Thread Robert Muir
I'm on it

On Wed, Apr 30, 2014 at 4:04 PM,   wrote:
> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20243/
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads
>
> Error Message:
> fdx size mismatch: docCount is 12 but fdx file size is 133 
> file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@3afabf87);
>  now aborting this merge to prevent index corruption
>
> [quoted stack trace trimmed; see the original build-failure mail below]

[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20243 - Failure!

2014-04-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20243/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads

Error Message:
fdx size mismatch: docCount is 12 but fdx file size is 133 
file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@3afabf87); 
now aborting this merge to prevent index corruption

Stack Trace:
java.lang.RuntimeException: fdx size mismatch: docCount is 12 but fdx file size 
is 133 
file=MockIndexOutputWrapper(org.apache.lucene.store.RAMOutputStream@3afabf87); 
now aborting this merge to prevent index corruption
at 
__randomizedtesting.SeedInfo.seed([70486912DEA23EAA:FF309021F56884D9]:0)
at 
org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsWriter.finish(Lucene40StoredFieldsWriter.java:233)
at 
org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:96)
at 
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:415)
at 
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:512)
at 
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:622)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3203)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3179)
at 
org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:995)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:939)
at 
org.apache.lucene.index.TestIndexWriterWithThreads.testImmediateDiskFullWithThreads(TestIndexWriterWithThreads.java:168)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequire

Re: DocumentsWriterPerThread architecture

2014-04-30 Thread Michael McCandless
It's still current.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Apr 30, 2014 at 2:06 PM, david.w.smi...@gmail.com
 wrote:
> Is this still up to date?:
> https://blog.trifork.com/2011/04/01/gimme-all-resources-you-have-i-can-use-them/
> I thought at some point subsequently, some significant work was done, and
> perhaps it was blogged. But I can’t find it.
> ~ David




[jira] [Commented] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985965#comment-13985965
 ] 

Michael McCandless commented on LUCENE-5591:


+1, thanks Shai!

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see off hand a quick way to guesstimate 
> that - I thought to use the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?






[jira] [Created] (SOLR-6036) Can't create collection with replicationFactor=0

2014-04-30 Thread John Wong (JIRA)
John Wong created SOLR-6036:
---

 Summary: Can't create collection with replicationFactor=0
 Key: SOLR-6036
 URL: https://issues.apache.org/jira/browse/SOLR-6036
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1, 4.8
Reporter: John Wong
Priority: Trivial


solrcloud$ curl 
'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection&numShards=2&replicationFactor=0'


The request fails with an HTTP 400 response: 
org.apache.solr.common.SolrException: replicationFactor must be greater than 
or equal to 0


I am using Solr 4.3.1, but I peeked into the source up to 4.8 and the problem 
still persists; in 4.8 the exception message has been changed to "must be 
greater than 0".

The code snippet in OverseerCollectionProcessor.java:

  if (repFactor <= 0) {
throw new SolrException(ErrorCode.BAD_REQUEST, REPLICATION_FACTOR + " 
must be greater than 0");
  }

I believe the <= should just be <, since the current check won't allow 0.  It 
may have been legacy from when a replicationFactor of 1 included the 
leader/master copy, whereas in Solr 4.x replicationFactor is defined by 
additional replicas on top of the leader.

http://wiki.apache.org/solr/SolrCloud

replicationFactor: The number of copies of each document (or, the number of 
physical replicas to be created for each logical shard of the collection.) A 
replicationFactor of 3 means that there will be 3 replicas (one of which is 
normally designated to be the leader) for each logical shard. NOTE: in Solr 
4.0, replicationFactor was the number of *additional* copies as opposed to the 
total number of copies. 






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985916#comment-13985916
 ] 

Ramkumar Aiyengar commented on SOLR-6035:
-

As far as I understand `DistributedUpdateProcessor`, indexing to the core 
itself would take care of replication just fine. But of course, a test 
confirming this would be good.

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.
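
A toy illustration of the routing idea in this issue (hypothetical names and 
layout, not SolrJ's actual buildUrlMap code): routing through the 
collection-level URL lets the receiving core forward the update to the leader 
core, an extra hop even when both live on the same node, while keying the map 
on the leader's concrete core URL sends the update straight to the right core.

```java
import java.util.HashMap;
import java.util.Map;

public class UrlMapSketch {
    // Builds a shard -> URL map. With useCoreUrl=false this mimics the
    // pre-patch (collection-level) routing; with useCoreUrl=true it
    // targets the leader core directly, as the patch proposes.
    static Map<String, String> buildUrlMap(String baseUrl, String collection,
                                           String leaderCore, boolean useCoreUrl) {
        Map<String, String> urlMap = new HashMap<>();
        urlMap.put("shard1", useCoreUrl ? baseUrl + "/" + leaderCore
                                        : baseUrl + "/" + collection);
        return urlMap;
    }

    public static void main(String[] args) {
        Map<String, String> m = buildUrlMap("http://host1:8983/solr",
                "collection1", "collection1_shard1_replica1", true);
        // Direct core URL: no forward-to-leader hop on the receiving node.
        System.out.println(m.get("shard1"));
    }
}
```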






[jira] [Commented] (LUCENE-5633) NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985910#comment-13985910
 ] 

ASF subversion and git services commented on LUCENE-5633:
-

Commit 1591446 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591446 ]

LUCENE-5633: leftover from bad merge

> NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE
> 
>
> Key: LUCENE-5633
> URL: https://issues.apache.org/jira/browse/LUCENE-5633
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5633.patch
>
>
> Currently there are two singletons available - MergePolicy.NO_COMPOUND_FILES 
> and MergePolicy.COMPOUND_FILES and it's confusing to distinguish on compound 
> files when the merge policy never merges segments. 
> We should have one singleton - NoMergePolicy.INSTANCE
> Post to the relevant discussion - 
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAOdYfZXXyVSf9%2BxYaRhr5v2O4Mc6S2v-qWuT112_CJFYhWTPqw%40mail.gmail.com%3E
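
A simplified sketch of the singleton pattern being proposed (the real 
NoMergePolicy extends Lucene's MergePolicy and overrides its merge-selection 
methods; the class and method below are illustrative stand-ins):

```java
// One shared instance of a policy that never selects merges, replacing
// the confusing COMPOUND_FILES / NO_COMPOUND_FILES pair.
public final class NoOpMergePolicy {
    public static final NoOpMergePolicy INSTANCE = new NoOpMergePolicy();

    private NoOpMergePolicy() {}  // private ctor: no other instances

    // In Lucene this would be findMerges(...); returning null here
    // stands in for "no merges, ever".
    public Object findMerges() {
        return null;
    }

    public static void main(String[] args) {
        System.out.println(INSTANCE == NoOpMergePolicy.INSTANCE);
    }
}
```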






[jira] [Resolved] (LUCENE-5633) NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE

2014-04-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5633.


   Resolution: Fixed
Fix Version/s: 5.0
   4.9
 Assignee: Shai Erera
Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4x. Thanks Varun!

> NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE
> 
>
> Key: LUCENE-5633
> URL: https://issues.apache.org/jira/browse/LUCENE-5633
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5633.patch
>
>
> Currently there are two singletons available - MergePolicy.NO_COMPOUND_FILES 
> and MergePolicy.COMPOUND_FILES and it's confusing to distinguish on compound 
> files when the merge policy never merges segments. 
> We should have one singleton - NoMergePolicy.INSTANCE
> Post to the relevant discussion - 
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAOdYfZXXyVSf9%2BxYaRhr5v2O4Mc6S2v-qWuT112_CJFYhWTPqw%40mail.gmail.com%3E






[jira] [Commented] (LUCENE-5633) NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985901#comment-13985901
 ] 

ASF subversion and git services commented on LUCENE-5633:
-

Commit 1591444 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591444 ]

LUCENE-5633: replace NoMergePolicy.COMPOUND/NO_COMPOUND by 
NoMergePolicy.INSTANCE

> NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE
> 
>
> Key: LUCENE-5633
> URL: https://issues.apache.org/jira/browse/LUCENE-5633
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Priority: Minor
> Attachments: LUCENE-5633.patch
>
>
> Currently there are two singletons available - MergePolicy.NO_COMPOUND_FILES 
> and MergePolicy.COMPOUND_FILES and it's confusing to distinguish on compound 
> files when the merge policy never merges segments. 
> We should have one singleton - NoMergePolicy.INSTANCE
> Post to the relevant discussion - 
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAOdYfZXXyVSf9%2BxYaRhr5v2O4Mc6S2v-qWuT112_CJFYhWTPqw%40mail.gmail.com%3E






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Marvin Justice (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985896#comment-13985896
 ] 

Marvin Justice commented on SOLR-6035:
--

We see average indexing times drop from 9.2 ms to 6.7 ms. The overall timing 
distribution has a smaller tail for the patched version with stddev dropping 
from 15.4 ms to 9.0 ms.

Good point about replication, I didn't actually test that. This work was done 
on our "alpha" cluster which is 64 shards on 16 nodes spread across 4 machines 
but with no replication.

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985866#comment-13985866
 ] 

Joel Bernstein commented on SOLR-6035:
--

I should have some time in the next couple of days to test out the patch. One 
of the things we'll want to be sure of is that there aren't any issues caused 
by going directly to the coreUrl rather than through the collection URL. For 
example, does replication within the shard still happen if you go directly to 
the coreUrl?

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Commented] (LUCENE-5633) NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985861#comment-13985861
 ] 

ASF subversion and git services commented on LUCENE-5633:
-

Commit 1591432 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1591432 ]

LUCENE-5633: replace NoMergePolicy.COMPOUND/NO_COMPOUND by 
NoMergePolicy.INSTANCE

> NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE
> 
>
> Key: LUCENE-5633
> URL: https://issues.apache.org/jira/browse/LUCENE-5633
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Priority: Minor
> Attachments: LUCENE-5633.patch
>
>
> Currently there are two singletons available - MergePolicy.NO_COMPOUND_FILES 
> and MergePolicy.COMPOUND_FILES and it's confusing to distinguish on compound 
> files when the merge policy never merges segments. 
> We should have one singleton - NoMergePolicy.INSTANCE
> Post to the relevant discussion - 
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAOdYfZXXyVSf9%2BxYaRhr5v2O4Mc6S2v-qWuT112_CJFYhWTPqw%40mail.gmail.com%3E






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985859#comment-13985859
 ] 

Erick Erickson commented on SOLR-6035:
--

Marvin:

Can you quantify "dramatic"? Just off the top of your head, I'm curious how 
big an improvement you're seeing.

Thanks...

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Assigned] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-6035:


Assignee: Joel Bernstein

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Assignee: Joel Bernstein
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985856#comment-13985856
 ] 

Joel Bernstein commented on SOLR-6035:
--

This is good, thanks Marvin. 


> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






DocumentsWriterPerThread architecture

2014-04-30 Thread david.w.smi...@gmail.com
Is this still up to date?:
https://blog.trifork.com/2011/04/01/gimme-all-resources-you-have-i-can-use-them/
I thought at some point subsequently, some significant work was done, and
perhaps it was blogged. But I can’t find it.
~ David


[jira] [Updated] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5591:
---

Attachment: LUCENE-5591.patch

OK, I moved to Math.ceil(). I thought ceil2() was quite cool :). But this 
isn't hot code; it's called once per flush, and it's better to have readable 
code.

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> Would be good if we can build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see off hand a quick way to guesstimate 
> that - I thought to use the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?
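
A sketch of the guesstimate floated above (standalone arithmetic, not the 
actual ReaderAndUpdates code): scale the in-memory size of the buffered 
updates by maxDoc/numUpdatedDocs to approximate the bytes about to be written 
for the whole field. Double math avoids the integer division that would floor 
maxDoc/numUpdatedDocs when only a few docs in a large segment are updated.

```java
public class FlushSizeEstimate {
    // sizeOfInMemUpdates * (maxDoc / numUpdatedDocs), computed in double
    // so a small numUpdatedDocs relative to maxDoc scales up correctly.
    static long estimateBytes(long sizeOfInMemUpdates, int maxDoc, int numUpdatedDocs) {
        return (long) (sizeOfInMemUpdates * ((double) maxDoc / numUpdatedDocs));
    }

    public static void main(String[] args) {
        // e.g. 1 MB of buffered updates touching 1,000 of 100,000 docs
        System.out.println(estimateBytes(1_048_576L, 100_000, 1_000));
    }
}
```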






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20227 - Failure!

2014-04-30 Thread Michael McCandless
I committed a fix.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Apr 30, 2014 at 12:12 PM, Michael McCandless
 wrote:
> I'll dig.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Apr 30, 2014 at 11:52 AM,   wrote:
>> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20227/
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestTermVectorsWriter.testInconsistentTermVectorOptions
>>
>> Error Message:
>>
>>
>> Stack Trace:
>> java.lang.AssertionError
>> at 
>> __randomizedtesting.SeedInfo.seed([F423BF9ECA066875:62AF43CBE382AFB5]:0)
>> at 
>> org.apache.lucene.codecs.pulsing.PulsingPostingsWriter.finishTerm(PulsingPostingsWriter.java:346)
>> at 
>> org.apache.lucene.codecs.memory.FSTOrdTermsWriter$TermsWriter.finishTerm(FSTOrdTermsWriter.java:328)
>> at 
>> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:501)
>> at 
>> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:80)
>> at 
>> org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:114)
>> at 
>> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:415)
>> at 
>> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:512)
>> at 
>> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:622)
>> at 
>> org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:377)
>> at 
>> org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:315)
>> at 
>> org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:257)
>> at 
>> org.apache.lucene.index.TestTermVectorsWriter.doTestMixup(TestTermVectorsWriter.java:666)
>> at 
>> org.apache.lucene.index.TestTermVectorsWriter.testInconsistentTermVectorOptions(TestTermVectorsWriter.java:599)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
>> at 
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at 
>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomized

[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985763#comment-13985763
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1591399 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1591399 ]

LUCENE-5611: always call PerField.finish even on non-aborting exc

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (LUCENE-5611) Simplify the default indexing chain

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985712#comment-13985712
 ] 

ASF subversion and git services commented on LUCENE-5611:
-

Commit 1591391 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591391 ]

LUCENE-5611: always call PerField.finish even on non-aborting exc

> Simplify the default indexing chain
> ---
>
> Key: LUCENE-5611
> URL: https://issues.apache.org/jira/browse/LUCENE-5611
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5611.patch, LUCENE-5611.patch
>
>
> I think Lucene's current indexing chain has too many classes /
> hierarchy / abstractions, making it look much more complex than it
> really should be, and discouraging users from experimenting/innovating
> with their own indexing chains.
> Also, if it were easier to understand/approach, then new developers
> would more likely try to improve it ... it really should be simpler.
> So I'm exploring a pared back indexing chain, and have a starting patch
> that I think is looking ok: it seems more approachable than the
> current indexing chain, or at least has fewer strange classes.
> I also thought this could give some speedup for tiny documents (a more
> common use of Lucene lately), and it looks like, with the evil
> optimizations, this is a ~25% speedup for Geonames docs.  Even without
> those evil optos it's a bit faster.
> This is very much a work in progress / nocommits, and there are some
> behavior changes e.g. the new chain requires all fields to have the
> same TV options (rather than auto-upgrading all fields by the same
> name that the current chain does)...






[jira] [Commented] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985704#comment-13985704
 ] 

Mark Miller commented on SOLR-6035:
---

Interesting - that may explain some earlier reports.

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20227 - Failure!

2014-04-30 Thread Michael McCandless
I'll dig.

Mike McCandless

http://blog.mikemccandless.com



[jira] [Updated] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Marvin Justice (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marvin Justice updated SOLR-6035:
-

Attachment: SOLR-6035.patch

Uploading patch against branch_4x

> CloudSolrServer directUpdate routing should use getCoreUrl 
> ---
>
> Key: SOLR-6035
> URL: https://issues.apache.org/jira/browse/SOLR-6035
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Marvin Justice
>Priority: Minor
>  Labels: performance
> Attachments: SOLR-6035.patch
>
>
> In a multisharded node environment we were seeing forward-to-leader hops when 
> using CloudSolrServer directUpdate (with the hop being on the same node) . 
> Consequently, there was no improvement in indexing performance over 
> non-directUpdate. 
> Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we 
> see a dramatic improvement in performance.






[jira] [Resolved] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-5632.
---

   Resolution: Fixed
Fix Version/s: 5.0

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20227 - Failure!

2014-04-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20227/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestTermVectorsWriter.testInconsistentTermVectorOptions

Error Message:


Stack Trace:
java.lang.AssertionError
at __randomizedtesting.SeedInfo.seed([F423BF9ECA066875:62AF43CBE382AFB5]:0)
at org.apache.lucene.codecs.pulsing.PulsingPostingsWriter.finishTerm(PulsingPostingsWriter.java:346)
at org.apache.lucene.codecs.memory.FSTOrdTermsWriter$TermsWriter.finishTerm(FSTOrdTermsWriter.java:328)
at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:501)
at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:80)
at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:114)
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:415)
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:512)
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:622)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:377)
at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:315)
at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:257)
at org.apache.lucene.index.TestTermVectorsWriter.doTestMixup(TestTermVectorsWriter.java:666)
at org.apache.lucene.index.TestTermVectorsWriter.testInconsistentTermVectorOptions(TestTermVectorsWriter.java:599)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.ja

[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985648#comment-13985648
 ] 

ASF subversion and git services commented on LUCENE-5632:
-

Commit 1591365 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1591365 ]

Merged revision(s) 1591333 from lucene/dev/branches/branch_4x:
LUCENE-5632: Transition Version constants from LUCENE_MN to LUCENE_M_N

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Created] (SOLR-6035) CloudSolrServer directUpdate routing should use getCoreUrl

2014-04-30 Thread Marvin Justice (JIRA)
Marvin Justice created SOLR-6035:


 Summary: CloudSolrServer directUpdate routing should use 
getCoreUrl 
 Key: SOLR-6035
 URL: https://issues.apache.org/jira/browse/SOLR-6035
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Marvin Justice
Priority: Minor


In a multi-sharded node environment we were seeing forward-to-leader hops when 
using CloudSolrServer directUpdate (with the hop being on the same node). 
Consequently, there was no improvement in indexing performance over 
non-directUpdate. 

Changing buildUrlMap to use getCoreUrl eliminated the extra hop and now we see 
a dramatic improvement in performance.
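The routing change described above can be sketched roughly as follows. This is a minimal, self-contained illustration of the idea only, not SolrJ's actual code: the method shape, the map layout, and all names (UrlMapSketch, shardLeaders) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: route direct updates to the leader replica's *core* URL
// (baseUrl + "/" + coreName) rather than just the node's base URL, so the
// receiving node hands the update straight to the leader core instead of
// forwarding it internally (the extra same-node hop described above).
public class UrlMapSketch {
    // shardLeaders maps shard name -> {leader base URL, leader core name}
    static Map<String, String> buildUrlMap(Map<String, String[]> shardLeaders) {
        Map<String, String> urls = new HashMap<>();
        for (Map.Entry<String, String[]> e : shardLeaders.entrySet()) {
            String baseUrl = e.getValue()[0];
            String coreName = e.getValue()[1];
            // core-level URL, not baseUrl alone
            urls.put(e.getKey(), baseUrl + "/" + coreName);
        }
        return urls;
    }

    public static void main(String[] args) {
        Map<String, String[]> leaders = new HashMap<>();
        leaders.put("shard1",
            new String[]{"http://host1:8983/solr", "collection1_shard1_replica1"});
        System.out.println(buildUrlMap(leaders).get("shard1"));
        // prints http://host1:8983/solr/collection1_shard1_replica1
    }
}
```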






[jira] [Updated] (LUCENE-5633) NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE

2014-04-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5633:
---

Attachment: LUCENE-5633.patch

Patch removes the two singletons and adds a single INSTANCE. Also 
.useCompoundFile returns newSegment.info.isCompound. The majority of the patch 
is in test classes which made use of these singletons (for no good reason!).

All tests pass.

> NoMergePolicy should have one singleton - NoMergePolicy.INSTANCE
> 
>
> Key: LUCENE-5633
> URL: https://issues.apache.org/jira/browse/LUCENE-5633
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Varun Thacker
>Priority: Minor
> Attachments: LUCENE-5633.patch
>
>
> Currently there are two singletons available - MergePolicy.NO_COMPOUND_FILES 
> and MergePolicy.COMPOUND_FILES - and it's confusing to distinguish between them 
> based on compound files when the merge policy never merges segments. 
> We should have one singleton - NoMergePolicy.INSTANCE
> Post to the relevant discussion - 
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAOdYfZXXyVSf9%2BxYaRhr5v2O4Mc6S2v-qWuT112_CJFYhWTPqw%40mail.gmail.com%3E






[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5632:
--

Attachment: LUCENE-5632.patch

Merged patch for 5.0 (trunk). Will commit after tests are happy.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

2014-04-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985615#comment-13985615
 ] 

Michael McCandless commented on LUCENE-4396:


bq. I haven't merged my branch to the newest trunk version, because my network 
account at school for April has run out and I couldn't pull the code from 
github until 1 May. Sorry for that.

Good grief, that's awful you have to "budget" your allowed network
access month by month ... don't worry about it, it's easy to apply
the patch.  Alternatively, just add a pointer here to your github fork
and we can clone from there to review?

bq. I'm very sorry for the code style. That's my fault. Very sorry for that.

No need to apologize; it's just code style ... everyone has their own,
but we have a standard one in Lucene so we don't spend all our time
fighting over whose style is best :)

{quote}
If you mean the .advance method of BooleanNovelScorer itself, I think it would 
be confusing, 
because BooleanNovelScorer now is used when there's at least one MUST clause, 
no matter whether it acts as a top scorer or not. Therefore, .advance() of 
BooleanNovelScorer 
must be called when BooleanNovelScorer acts as a non-top scorer.
{quote}

Yeah I meant BNS.advance, when it's a top scorer.  Ie, can BNS beat
BS2 in this case.  It seems like you could test this case now since as
a topScorer nobody would call the unfinished BNS.advance method.  This
way, if BNS can beat BS2 in this case we know it's worth pursuing.

bq. I think this issue should be dealt with together.

+1, but only if time this summer permits (i.e. top priority is this
issue, allowing BS to accept MUST clauses).


> BooleanScorer should sometimes be used for MUST clauses
> ---
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Attachments: LUCENE-4396.patch, LUCENE-4396.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there are one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 100 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!






[jira] [Commented] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985609#comment-13985609
 ] 

Michael McCandless commented on LUCENE-5591:


Looks good, thanks Shai.

Maybe we can just do floating point math and take ceil in the end?  The ceil2 
is sort of confusing...


> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction w/ 
> NRTCachingDirectory, it means the latter will attempt to write the entire DV 
> field in its RAMDirectory, which could lead to OOM.
> It would be good if we could build our own FlushInfo, estimating the number of 
> bytes we're about to write. I didn't see offhand a quick way to guesstimate 
> that - I thought of using the current DV's sizeInBytes as an approximation, but 
> I don't see a way to get it, not a direct way at least.
> Maybe we can use the size of the in-memory updates to guesstimate that 
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is 
> it a too wild guess?
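The guesstimate formula from the description, combined with the "floating-point math with a single ceil at the end" suggestion from the comment above, might look like this. The class and parameter names are purely illustrative, not Lucene's API.

```java
// Sketch of the flush-size guesstimate: scale the in-memory updates size by
// maxDoc/numUpdatedDocs, doing the whole computation in floating point and
// taking one Math.ceil at the end (instead of a confusing intermediate ceil2).
public class FlushSizeEstimate {
    static long estimateBytes(long sizeOfInMemUpdates, int maxDoc, int numUpdatedDocs) {
        // single rounding step at the end, as suggested in the comment above
        return (long) Math.ceil((double) sizeOfInMemUpdates * maxDoc / numUpdatedDocs);
    }

    public static void main(String[] args) {
        // e.g. 1 MB of in-memory updates touching 10k of 1M docs -> ~100 MB estimate,
        // a far better FlushInfo hint than IOContext.DEFAULT for NRTCachingDirectory
        System.out.println(estimateBytes(1_048_576, 1_000_000, 10_000));
        // prints 104857600
    }
}
```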






[jira] [Closed] (SOLR-4410) Highlight query parameter (hl.q) does not honor QParser defType parameter

2014-04-30 Thread Scott Smerchek (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Smerchek closed SOLR-4410.


Resolution: Duplicate

No longer a need for this issue/patch.

> Highlight query parameter (hl.q) does not honor QParser defType parameter
> -
>
> Key: SOLR-4410
> URL: https://issues.apache.org/jira/browse/SOLR-4410
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 4.1
>Reporter: Scott Smerchek
>Priority: Minor
>  Labels: highlighter, qparserplugin
> Attachments: SOLR-4410.patch
>
>
> If one uses a custom QParser for parsing standard queries, that person cannot 
> do the same with the highlight query parameter using the 'defType' parameter.
> The hl.q QParser will always default to the default defType unless the local 
> params syntax is used to specify a different QParser.
> The typical expectation would be that q and hl.q would behave the same.
> The following examples should highlight the document in the same way:
> {code}
> q=field:text&hl=true&defType=custom
> {code}
> {code}
> q=id:123&hl=true&hl.q=field:text&defType=custom
> {code}
> This is how you have to do it now:
> {code}
> q=field:text&hl=true&defType=custom
> {code}
> {code}
> q=id:123&hl=true&hl.q={!custom}field:text&defType=custom
> {code}






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985599#comment-13985599
 ] 

ASF subversion and git services commented on LUCENE-5632:
-

Commit 1591333 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591333 ]

LUCENE-5632: Transition Version constants from LUCENE_MN to LUCENE_M_N

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985585#comment-13985585
 ] 

Uwe Schindler commented on LUCENE-5632:
---

Tests are happy, committing and forward-porting!

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-04-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985583#comment-13985583
 ] 

Michael McCandless commented on LUCENE-5376:


Thanks James!

bq. If the user doesn't specify a handler, or specifies an invalid one, return a list 
of valid handlers, possibly with a little param documentation instead of the 
current IllegalArgumentException

+1: it's really important that all errors that come back from the server are 
transparent/clear as possible.  Maybe point to the live docs handler?

bq. Allow commands like "createIndex" to be executed as GET rather than requiring POST. 
Maybe let users pass parameters on the URL and not always expect an incoming 
JSON document? And/or have a parameter "json" for such a document?

I think this makes sense as long as the GET API is just a mirror of what you 
could do via JSON?  E.g., maybe take all CGI args, turn into the corresponding 
JSON struct, and pretend that JSON had arrived via POST?  Something like that 
...
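The "GET mirrors JSON" idea could be sketched like this. It is illustrative only: a real version would URL-decode values and handle nested/typed parameters, and none of these names come from the demo server.

```java
// Sketch: turn CGI-style query args into a flat JSON object, so a GET request
// can be fed to the same handler code that expects a POSTed JSON document.
// Deliberately simplified: no URL-decoding, no nesting, all values as strings.
public class ArgsToJson {
    static String toJson(String queryString) {
        StringBuilder sb = new StringBuilder("{");
        String[] pairs = queryString.split("&");
        for (int i = 0; i < pairs.length; i++) {
            String[] kv = pairs[i].split("=", 2); // limit 2: keep '=' inside values
            if (i > 0) sb.append(",");
            sb.append("\"").append(kv[0]).append("\":\"")
              .append(kv.length > 1 ? kv[1] : "").append("\"");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        // hypothetical createIndex args passed on the URL
        System.out.println(toJson("indexName=jira&rootDir=/tmp/idx"));
        // prints {"indexName":"jira","rootDir":"/tmp/idx"}
    }
}
```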

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots of nocommits).






[jira] [Updated] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5632:
--

Attachment: LUCENE-5632-4x.patch

Patch for 4.x, I will commit this once tests are happy.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632-4x.patch, LUCENE-5632.patch, 
> LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-04-30 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985564#comment-13985564
 ] 

James Dyer commented on LUCENE-5376:


Just playing around with this a little bit.  I'm running into some barriers to 
a quick start, and wonder if we could...

- If the user doesn't specify a handler, or specifies an invalid one, return a list 
of valid handlers, possibly with a little param documentation instead of the 
current IllegalArgumentException
- Allow commands like "createIndex" to be executed as GET rather than requiring POST.  
Maybe let users pass parameters on the URL and not always expect an incoming 
JSON document?  And/or have a parameter "json" for such a document?

I'd be happy to try and add this but wasn't sure if the intent was to make this 
strictly a server-to-server app, or also something users could play around with 
interactively with a browser.

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots of nocommits).






[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985496#comment-13985496
 ] 

ASF subversion and git services commented on LUCENE-5376:
-

Commit 1591292 from jd...@apache.org in branch 'dev/branches/lucene5376_2'
[ https://svn.apache.org/r1591292 ]

LUCENE-5376: Use a default 'stateDir'  instead of throwing NPE

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots of nocommits).






[jira] [Commented] (LUCENE-5376) Add a demo search server

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985464#comment-13985464
 ] 

ASF subversion and git services commented on LUCENE-5376:
-

Commit 1591283 from jd...@apache.org in branch 'dev/branches/lucene5376_2'
[ https://svn.apache.org/r1591283 ]

LUCENE-5376: add "lib" directory to svn:ignore

> Add a demo search server
> 
>
> Key: LUCENE-5376
> URL: https://issues.apache.org/jira/browse/LUCENE-5376
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Attachments: lucene-demo-server.tgz
>
>
> I think it'd be useful to have a "demo" search server for Lucene.
> Rather than being fully featured, like Solr, it would be minimal, just 
> wrapping the existing Lucene modules to show how you can make use of these 
> features in a server setting.
> The purpose is to demonstrate how one can build a minimal search server on 
> top of APIs like SearchManager, SearcherLifetimeManager, etc.
> This is also useful for finding rough edges / issues in Lucene's APIs that 
> make building a server unnecessarily hard.
> I don't think it should have back compatibility promises (except Lucene's 
> index back compatibility), so it's free to improve as Lucene's APIs change.
> As a starting point, I'll post what I built for the "eating your own dog 
> food" search app for Lucene's & Solr's jira issues 
> http://jirasearch.mikemccandless.com (blog: 
> http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It 
> uses Netty to expose basic indexing & searching APIs via JSON, but it's very 
> rough (lots of nocommits).






[jira] [Commented] (LUCENE-5632) transition Version constants from LUCENE_MN to LUCENE_M_N

2014-04-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985445#comment-13985445
 ] 

Uwe Schindler commented on LUCENE-5632:
---

Thanks Hoss!
I will keep the lenient parser for 5.0, too. I will only remove the constants 
in source code, so Java code can no longer use them.
In fact, I will start on the 4.x branch and clean it up including all the 
deprecations. I will then merge to trunk. This is the better approach for this 
task.

> transition Version constants from LUCENE_MN to LUCENE_M_N
> -
>
> Key: LUCENE-5632
> URL: https://issues.apache.org/jira/browse/LUCENE-5632
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: Robert Muir
>Assignee: Uwe Schindler
> Fix For: 4.9
>
> Attachments: LUCENE-5632.patch, LUCENE-5632.patch
>
>
> We should fix this, otherwise the constants will be hard to read (e.g. 
> Version.LUCENE_410, is it 4.1.0 or 4.10 or whatever).
> I do not want this to be an excuse for an arbitrary 5.0 release that does not 
> have the features expected of a major release :)






[jira] [Commented] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985432#comment-13985432
 ] 

ASF subversion and git services commented on LUCENE-5622:
-

Commit 1591279 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1591279 ]

Follow up cleanups to LUCENE-5622.

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff that they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to the
> standard output and error streams will be accounted for, and if it exceeds a
> given limit an assertion error will be thrown. The limit is adjustable per suite
> using the Limit annotation, with a default of 8 KB per suite. The check can be
> suppressed entirely by specifying SuppressSysoutChecks.
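
The accounting mechanism can be sketched as a byte-counting stream wrapper. This is only an illustration of the idea under stated assumptions, not the actual test-framework code:

```java
import java.io.IOException;
import java.io.OutputStream;

// Illustrative sketch (not the actual test-framework code): every byte
// written is counted, and exceeding the limit trips an assertion error,
// much as the per-suite sysout check described above does.
public class LimitedOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final long limit;   // e.g. 8 * 1024 for a default of 8 KB
    private long written;

    public LimitedOutputStream(OutputStream delegate, long limit) {
        this.delegate = delegate;
        this.limit = limit;
    }

    @Override
    public void write(int b) throws IOException {
        if (++written > limit) {
            throw new AssertionError("wrote more than " + limit + " bytes to sysout");
        }
        delegate.write(b);
    }
}
```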






[jira] [Resolved] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-5622.
-

Resolution: Fixed

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff that they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to the
> standard output and error streams will be accounted for, and if it exceeds a
> given limit an assertion error will be thrown. The limit is adjustable per suite
> using the Limit annotation, with a default of 8 KB per suite. The check can be
> suppressed entirely by specifying SuppressSysoutChecks.






[jira] [Commented] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.

2014-04-30 Thread Isabel Mendonca (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985426#comment-13985426
 ] 

Isabel Mendonca commented on LUCENE-5629:
-

What we meant is an XML file containing indexing metadata such as the
similarity function and the analyzer. This XML file could be stored in a
location separate from where the actual index lives.

> Comparing the Version of Lucene, the Analyzer and the similarity function
> that are being used for indexing and searching.
> --
>
> Key: LUCENE-5629
> URL: https://issues.apache.org/jira/browse/LUCENE-5629
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/index, core/queryparser, core/search
> Environment: Operating system : Windows 8.1
> Software platform : Eclipse Kepler 4.3.2
>Reporter: Isabel Mendonca
>Priority: Minor
>  Labels: features, patch
> Fix For: 4.8, 4.9, 5.0
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We have observed that Lucene does not check whether the same Similarity function
> is used during indexing and searching. The same problem exists for the
> Analyzer that is used. This may lead to poor or misleading results.
> So we decided to create an XML file during indexing that will store
> information such as the Analyzer and the Similarity function that were used,
> as well as the version of Lucene. This XML file will always be
> available to the users.
> At search time, we will retrieve this information using SAX parsing and
> check whether the utilities used for searching match those used for indexing. If
> not, a warning message will be displayed to the user.
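
As a rough sketch of the proposal, the metadata could be written at index time and read back with SAX at search time. The `IndexMetadata` class and element names below are hypothetical, not an agreed-upon format:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of the proposal: persist the analyzer, similarity and
// Lucene version used at index time, read them back with SAX at search time,
// and compare against the search-time configuration.
public class IndexMetadata {

    static String toXml(Map<String, String> meta) {
        StringBuilder sb = new StringBuilder("<indexMetadata>");
        for (Map.Entry<String, String> e : meta.entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(e.getValue())
              .append("</").append(e.getKey()).append('>');
        }
        return sb.append("</indexMetadata>").toString();
    }

    static Map<String, String> fromXml(String xml) throws Exception {
        Map<String, String> meta = new HashMap<>();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                private String current;

                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    current = qName;
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    if (current != null && !"indexMetadata".equals(current)) {
                        meta.put(current, new String(ch, start, length));
                    }
                }
            });
        return meta;
    }
}
```

A mismatch between the map read back at search time and the current configuration would then trigger the warning described above.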






[jira] [Commented] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985421#comment-13985421
 ] 

ASF subversion and git services commented on LUCENE-5622:
-

Commit 1591273 from [~dawidweiss] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1591273 ]

LUCENE-5622: Fail tests if they print over the given limit of bytes to 
System.out or System.err

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff that they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to the
> standard output and error streams will be accounted for, and if it exceeds a
> given limit an assertion error will be thrown. The limit is adjustable per suite
> using the Limit annotation, with a default of 8 KB per suite. The check can be
> suppressed entirely by specifying SuppressSysoutChecks.






Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20212 - Failure!

2014-04-30 Thread Robert Muir
This is a test bug, but actually the whole test is buggy.

It explicitly wants to test behavior around different inconsistent term-vector
(tv) options (such as exception message text), but it randomizes those
options too.

I will improve the test to actually exercise the different possibilities instead.

On Wed, Apr 30, 2014 at 8:08 AM, Robert Muir  wrote:
> I'll look
>
> On Wed, Apr 30, 2014 at 7:52 AM,   wrote:
>> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20212/
>>
>> 2 tests failed.
>> FAILED:  junit.framework.TestSuite.org.apache.lucene.search.TestTermVectors
>>
>> Error Message:
>> MockDirectoryWrapper: cannot close: there are still open files: {_b.fdt=1, 
>> _b.fdx=1}
>>
>> Stack Trace:
>> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
>> still open files: {_b.fdt=1, _b.fdx=1}
>> at __randomizedtesting.SeedInfo.seed([F8788671361FF2D7]:0)
>> at 
>> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:664)
>> at 
>> org.apache.lucene.search.TestTermVectors.afterClass(TestTermVectors.java:80)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:790)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
>> at java.lang.Thread.run(Thread.java:724)
>> Caused by: java.lang.RuntimeException: unclosed IndexOutput: _b.fdx
>> at 
>> org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:551)
>> at 
>> org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:523)
>> at 
>> org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
>> at 
>> org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsWriter.<init>(Lucene40StoredFieldsWriter.java:98)
>> at 
>> org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsFormat.fieldsWriter(Lucene40StoredFieldsFormat.java:97)
>> at 
>> org.apache.lucene.index.DefaultIndexingChain.<init>(DefaultIndexingChain.java:83)
>> at 
>> org.apache.lucene.index.DocumentsWriterPerThread$1.getChain(DocumentsWriterPerThread.java:62)
>> at 
>> org.apache.lucene.index.DocumentsWriterPerThread.<init>(DocumentsWriterPerThread.java:186)
>> at 
>> org.apache.lucene.index.DocumentsWriter.ensureInitialized(DocumentsWriter.java:399)
>> at 
>> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:454)
>> at 
>> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1541)
>> at 
>> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211)
>> at 
>> org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:146)
>> at 
>> org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:108)
>> at 
>> org.apache.lucene.search.TestTermVectors.testMixedVectrosVectors(TestTermVectors.java:115)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at 
>> sun.reflect

Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20212 - Failure!

2014-04-30 Thread Robert Muir
I'll look

On Wed, Apr 30, 2014 at 7:52 AM,   wrote:
> Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20212/
>
> 2 tests failed.
> FAILED:  junit.framework.TestSuite.org.apache.lucene.search.TestTermVectors
>
> Error Message:
> MockDirectoryWrapper: cannot close: there are still open files: {_b.fdt=1, 
> _b.fdx=1}
>
> Stack Trace:
> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
> still open files: {_b.fdt=1, _b.fdx=1}
> at __randomizedtesting.SeedInfo.seed([F8788671361FF2D7]:0)
> at 
> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:664)
> at 
> org.apache.lucene.search.TestTermVectors.afterClass(TestTermVectors.java:80)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:790)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
> at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
> at java.lang.Thread.run(Thread.java:724)
> Caused by: java.lang.RuntimeException: unclosed IndexOutput: _b.fdx
> at 
> org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:551)
> at 
> org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:523)
> at 
> org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
> at 
> org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsWriter.<init>(Lucene40StoredFieldsWriter.java:98)
> at 
> org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsFormat.fieldsWriter(Lucene40StoredFieldsFormat.java:97)
> at 
> org.apache.lucene.index.DefaultIndexingChain.<init>(DefaultIndexingChain.java:83)
> at 
> org.apache.lucene.index.DocumentsWriterPerThread$1.getChain(DocumentsWriterPerThread.java:62)
> at 
> org.apache.lucene.index.DocumentsWriterPerThread.<init>(DocumentsWriterPerThread.java:186)
> at 
> org.apache.lucene.index.DocumentsWriter.ensureInitialized(DocumentsWriter.java:399)
> at 
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:454)
> at 
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1541)
> at 
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211)
> at 
> org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:146)
> at 
> org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:108)
> at 
> org.apache.lucene.search.TestTermVectors.testMixedVectrosVectors(TestTermVectors.java:115)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.j

[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 20212 - Failure!

2014-04-30 Thread builder
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/20212/

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.lucene.search.TestTermVectors

Error Message:
MockDirectoryWrapper: cannot close: there are still open files: {_b.fdt=1, 
_b.fdx=1}

Stack Trace:
java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 
open files: {_b.fdt=1, _b.fdx=1}
at __randomizedtesting.SeedInfo.seed([F8788671361FF2D7]:0)
at 
org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:664)
at 
org.apache.lucene.search.TestTermVectors.afterClass(TestTermVectors.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:790)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.RuntimeException: unclosed IndexOutput: _b.fdx
at 
org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:551)
at 
org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:523)
at 
org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:44)
at 
org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsWriter.<init>(Lucene40StoredFieldsWriter.java:98)
at 
org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsFormat.fieldsWriter(Lucene40StoredFieldsFormat.java:97)
at 
org.apache.lucene.index.DefaultIndexingChain.<init>(DefaultIndexingChain.java:83)
at 
org.apache.lucene.index.DocumentsWriterPerThread$1.getChain(DocumentsWriterPerThread.java:62)
at 
org.apache.lucene.index.DocumentsWriterPerThread.<init>(DocumentsWriterPerThread.java:186)
at 
org.apache.lucene.index.DocumentsWriter.ensureInitialized(DocumentsWriter.java:399)
at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:454)
at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1541)
at 
org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211)
at 
org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:146)
at 
org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:108)
at 
org.apache.lucene.search.TestTermVectors.testMixedVectrosVectors(TestTermVectors.java:115)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
org.apache.luce

[jira] [Updated] (LUCENE-5591) ReaderAndUpdates should create a proper IOContext when writing DV updates

2014-04-30 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5591:
---

Attachment: LUCENE-5591.patch

Thanks Mike. I modified it to ramBytesPerDoc and fixed Numeric to return a proper
approximation (I had neglected to factor in the values themselves!). I also
estimate the amount of RAM per document used by the PagedGrowableWriters.

I don't call BytesRef.append(), but grow() and arraycopy(). I could have used
bytesRef.grow() followed by bytesRef.append(), but that double-checks the
capacity...

Tests pass.

> ReaderAndUpdates should create a proper IOContext when writing DV updates
> -
>
> Key: LUCENE-5591
> URL: https://issues.apache.org/jira/browse/LUCENE-5591
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Shai Erera
> Attachments: LUCENE-5591.patch, LUCENE-5591.patch
>
>
> Today we pass IOContext.DEFAULT. If DV updates are used in conjunction with
> NRTCachingDirectory, it means the latter will attempt to write the entire DV
> field in its RAMDirectory, which could lead to an OOM.
> It would be good if we could build our own FlushInfo, estimating the number of
> bytes we're about to write. I didn't see a quick way to guesstimate that
> offhand - I thought to use the current DV's sizeInBytes as an approximation,
> but I don't see a way to get it, at least not a direct one.
> Maybe we can use the size of the in-memory updates to guesstimate that
> amount? Something like {{sizeOfInMemUpdates * (maxDoc/numUpdatedDocs)}}? Is
> that too wild a guess?
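
The guesstimate in the description works out as a simple extrapolation: average bytes per updated doc, scaled to the whole segment. A sketch with made-up names:

```java
// Illustration of the guesstimate quoted above (names are invented for the
// sketch): extrapolate the size of the in-memory DV updates over the whole
// segment, i.e. bytes-per-updated-doc times the total doc count.
public class FlushSizeEstimate {
    static long estimateBytes(long sizeOfInMemUpdates, int maxDoc, int numUpdatedDocs) {
        if (numUpdatedDocs == 0) {
            return 0;
        }
        // use double division so small update counts don't lose precision
        return (long) (sizeOfInMemUpdates * ((double) maxDoc / numUpdatedDocs));
    }
}
```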






[jira] [Commented] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.

2014-04-30 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985384#comment-13985384
 ] 

Ahmet Arslan commented on LUCENE-5629:
--

bq. store the information concerning the indexing into a separate file?
You mean a separate file other than schema.xml?

> Comparing the Version of Lucene, the Analyzer and the similarity function
> that are being used for indexing and searching.
> --
>
> Key: LUCENE-5629
> URL: https://issues.apache.org/jira/browse/LUCENE-5629
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/index, core/queryparser, core/search
> Environment: Operating system : Windows 8.1
> Software platform : Eclipse Kepler 4.3.2
>Reporter: Isabel Mendonca
>Priority: Minor
>  Labels: features, patch
> Fix For: 4.8, 4.9, 5.0
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We have observed that Lucene does not check whether the same Similarity function
> is used during indexing and searching. The same problem exists for the
> Analyzer that is used. This may lead to poor or misleading results.
> So we decided to create an XML file during indexing that will store
> information such as the Analyzer and the Similarity function that were used,
> as well as the version of Lucene. This XML file will always be
> available to the users.
> At search time, we will retrieve this information using SAX parsing and
> check whether the utilities used for searching match those used for indexing. If
> not, a warning message will be displayed to the user.






[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

2014-04-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5681:
---

Attachment: SOLR-5681.patch

Patch with all but one test passing.

> Make the OverseerCollectionProcessor multi-threaded
> ---
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
> Attachments: SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting
> anything long-running would have it block processing of other mutually
> exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and
> thereby prevent processing of a create-collection task (which would stay
> queued in ZK) even though both tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An
> easy way to handle that is to only let one task per collection run at a time.
> * ZK distributed queue to feed tasks: The OCP consumes tasks from a queue.
> A task is only removed from the workQueue on completion so that, in case of
> a failure, the new Overseer can re-consume the same task and retry. A queue
> is not the right data structure in the first place to look ahead, i.e. get the
> 2nd task from the queue while the 1st one is in process. Also, deleting tasks
> which are not at the head of a queue is not really an 'intuitive' operation.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting
> the request and passing it to a new thread (or one from the pool). The parent
> method uses peekAfter(last element) instead of peek(); peekAfter returns the
> task after the 'last element'. Maintain this request information and use it
> for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to the workQueue,
> also offer them to a new queue (call it volatileWorkQueue?). The difference
> is that as soon as a task from it is picked up for processing by a thread, it
> is removed from that queue. At the end, the cleanup is done from the workQueue.
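
The "one task per collection at a time" rule from the challenges above can be sketched as a small gate. `CollectionTaskGate` is illustrative only, not the actual OverseerCollectionProcessor code:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the simplest exclusivity rule mentioned above: at most one task
// per collection runs at a time. A real OCP would also handle work-queue
// cleanup in ZK; this only models the runner bookkeeping.
public class CollectionTaskGate {
    private final Set<String> running = ConcurrentHashMap.newKeySet();

    /** True if a task for this collection may start now (no other task running on it). */
    public boolean tryStart(String collection) {
        return running.add(collection);
    }

    /** Releases the collection so the next queued task for it can be picked up. */
    public void finish(String collection) {
        running.remove(collection);
    }
}
```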






[jira] [Commented] (SOLR-5474) Have a new mode for SolrJ to support stateFormat=2

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985377#comment-13985377
 ] 

ASF subversion and git services commented on SOLR-5474:
---

Commit 1591253 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1591253 ]

revert SOLR-5473 , SOLR-5474

> Have a new mode for SolrJ to support stateFormat=2
> --
>
> Key: SOLR-5474
> URL: https://issues.apache.org/jira/browse/SOLR-5474
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-5474.patch, SOLR-5474.patch, SOLR-5474.patch, 
> fail.logs
>
>
> In this mode SolrJ would not watch any ZK node. It fetches the state on
> demand and caches the most recently used n collections in memory.
> When a request comes for a collection 'xcoll', it first checks if such a
> collection exists.
> If yes, it first looks up the details in the local cache for that collection.
> If not found in the cache, it fetches the node /collections/xcoll/state.json
> and caches the information.
> Any query/update will be sent with an extra query param specifying the
> collection name and version (example \_stateVer=xcoll:34). A node would throw
> an error (INVALID_NODE) if it does not have the right version.
> If SolrJ gets an INVALID_NODE error, it invalidates the cache and fetches
> fresh state information for that collection (and caches it again).
> If there is a connection timeout, SolrJ assumes the node is down, re-fetches
> the state for the collection, and tries again.
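
The client-side caching scheme above can be sketched with an access-ordered LRU map. Class and method names here are illustrative, not SolrJ's API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the caching scheme described above: keep the n most recently
// used collections' state versions in memory, and drop an entry when a node
// reports INVALID_NODE for the _stateVer we sent, so fresh state is fetched
// on the next request. Illustrative names, not SolrJ's actual API.
public class CollectionStateCache {
    private final Map<String, Integer> stateVersions;

    public CollectionStateCache(int maxCollections) {
        // access-order LinkedHashMap gives simple LRU eviction
        this.stateVersions = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > maxCollections;  // evict least recently used
            }
        };
    }

    /** Cached _stateVer for the collection, or null on a cache miss. */
    public Integer cachedVersion(String collection) {
        return stateVersions.get(collection);
    }

    /** Store state fetched from /collections/xcoll/state.json. */
    public void put(String collection, int stateVer) {
        stateVersions.put(collection, stateVer);
    }

    /** Drop the entry after a node responds with INVALID_NODE. */
    public void invalidate(String collection) {
        stateVersions.remove(collection);
    }
}
```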






[jira] [Commented] (SOLR-5473) Make one state.json per collection

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985376#comment-13985376
 ] 

ASF subversion and git services commented on SOLR-5473:
---

Commit 1591253 from [~noble.paul] in branch 'dev/trunk'
[ https://svn.apache.org/r1591253 ]

revert SOLR-5473 , SOLR-5474

> Make one state.json per collection
> --
>
> Key: SOLR-5473
> URL: https://issues.apache.org/jira/browse/SOLR-5473
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 5.0
>
> Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
> SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, 
> ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under 
> /collections/collectionname/state.json node






[jira] [Commented] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.

2014-04-30 Thread Isabel Mendonca (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985357#comment-13985357
 ] 

Isabel Mendonca commented on LUCENE-5629:
-

Leaving aside the comparison part, wouldn't it be useful to store the
information concerning indexing in a separate file? That way, index and query
analysis will remain independent and this information will be accessible to
whoever wants to see it.

> Comparing the Version of Lucene, the Analyzer and the similarity function
> that are being used for indexing and searching.
> --
>
> Key: LUCENE-5629
> URL: https://issues.apache.org/jira/browse/LUCENE-5629
> Project: Lucene - Core
>  Issue Type: New Feature
>  Components: core/index, core/queryparser, core/search
> Environment: Operating system : Windows 8.1
> Software platform : Eclipse Kepler 4.3.2
>Reporter: Isabel Mendonca
>Priority: Minor
>  Labels: features, patch
> Fix For: 4.8, 4.9, 5.0
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We have observed that Lucene does not check whether the same Similarity 
> function is used during indexing and searching. The same problem exists for 
> the Analyzer that is used. This can lead to poor or misleading results.
> So we decided to create an XML file during indexing that stores information 
> such as the Analyzer and the Similarity function that were used, as well as 
> the version of Lucene. This XML file will always be available to the users.
> At search time, we will retrieve this information using SAX parsing and check 
> whether the tools used for searching match those used for indexing. If not, a 
> warning message will be displayed to the user.
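The proposal above (persist the indexing tools, compare at search time) can be sketched as follows. This is a minimal Python illustration, not the proposed Lucene patch; the file layout, element names, and function names are my own assumptions:

```python
import xml.etree.ElementTree as ET

def write_index_metadata(dest, version, analyzer, similarity):
    # Persist the tools used at index time so a search application can
    # compare them later. Element names here are illustrative only.
    root = ET.Element("index-metadata")
    ET.SubElement(root, "luceneVersion").text = version
    ET.SubElement(root, "analyzer").text = analyzer
    ET.SubElement(root, "similarity").text = similarity
    ET.ElementTree(root).write(dest)

def check_index_metadata(source, version, analyzer, similarity):
    # At search time, parse the stored metadata and report which settings
    # differ from the ones about to be used for searching.
    root = ET.parse(source).getroot()
    mismatches = []
    for tag, expected in (("luceneVersion", version),
                          ("analyzer", analyzer),
                          ("similarity", similarity)):
        if root.findtext(tag) != expected:
            mismatches.append(tag)
    return mismatches  # an empty list means index and search settings agree
```

A caller would warn the user whenever the returned list is non-empty, which mirrors the warning-message behavior the description proposes.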



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

2014-04-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5681:
---

Attachment: SOLR-5681.patch

A little more cleanup.

> Make the OverseerCollectionProcessor multi-threaded
> ---
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
> Attachments: SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting 
> anything long-running blocks the processing of other mutually exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have 
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and 
> thereby stall a create-collection task (which would stay queued in zk), even 
> though the two tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An 
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
> The task from the workQueue is only removed on completion so that in case of 
> a failure, the new Overseer can re-consume the same task and retry. A queue 
> is not the right data structure in the first place to look ahead i.e. get the 
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks 
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting 
> and passing the request to a new thread (or one from the pool). The parent 
> method uses a peekAfter(last element) instead of a peek(). The peekAfter 
> returns the task after the 'last element'. Maintain this request information 
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also 
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as 
> soon as a task from this is picked up for processing by the thread, it's 
> removed from the queue. At the end, the cleanup is done from the workQueue.
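The peekAfter() proposal above can be sketched as follows. This is a minimal Python illustration, not the actual ZooKeeper DistributedQueue API; the function name and task strings are assumptions:

```python
from collections import deque

def peek_after(queue, last):
    # Hypothetical peekAfter(): return the task that follows `last` without
    # removing anything from the queue. With last=None, return the head.
    # Returns None when there is no such task.
    seen = (last is None)
    for task in queue:
        if seen:
            return task
        if task == last:
            seen = True
    return None

# The parent thread hands the head task to a worker, then looks ahead past it
# while it is still in-process (tasks stay queued until completion for retry).
work_queue = deque(["splitshard:collection1", "create:collection2"])
running = peek_after(work_queue, None)       # head task, handed to a worker
next_up = peek_after(work_queue, running)    # look ahead without removal
```

This shows why a plain peek() is insufficient: only peekAfter(last) lets the parent dispatch the second task while the first is still being processed and not yet deleted from the workQueue.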



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6012) Dutch language stemming issues

2014-04-30 Thread Ashokkumar Balasubramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashokkumar Balasubramanian updated SOLR-6012:
-

Priority: Major  (was: Minor)

> Dutch language stemming issues
> --
>
> Key: SOLR-6012
> URL: https://issues.apache.org/jira/browse/SOLR-6012
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 3.5
> Environment: Linux
>Reporter: Ashokkumar Balasubramanian
>  Labels: easyfix, newbie
>
> I am searching for the Dutch word "Brievenbussen". This is a proper Dutch 
> word and should return some matches, but it returns 0 matches. The Dutch word 
> "Brievenbusen" (with the letter 's' removed) does return matches.
> The problem is that the stemmer does not take the vowel in the final part 
> 'bus' into account. If a vowel is found in the second-to-last position (in 
> this case 'u'), then the proper Dutch word should be "Brievenbussen".
> Can you please confirm whether this is a problem with version 3.5?
> Please let me know if you need more information.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985325#comment-13985325
 ] 

Dawid Weiss commented on LUCENE-5622:
-

Committed to trunk. Will let it bake a bit before backporting to 4x.

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to 
> standard output and error streams will be accounted for and if it exceeds a 
> given limit an assertion will be thrown. The limit is adjustable per-suite 
> using Limit annotation, with the default of 8kb per suite. The check can be 
> suppressed entirely by specifying SuppressSysoutChecks.
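The byte accounting described above can be illustrated with a small sketch. This is Python rather than the actual Java test-framework code, and the wrapper class and names are hypothetical:

```python
import io

class LimitedStream:
    # Wraps an output stream, counts bytes written, and fails the "test"
    # with an AssertionError once the configured limit is exceeded.
    def __init__(self, delegate, limit):
        self.delegate = delegate
        self.limit = limit
        self.written = 0

    def write(self, text):
        self.written += len(text.encode("utf-8"))
        if self.written > self.limit:
            raise AssertionError(
                "test wrote more than %d bytes to stdout" % self.limit)
        return self.delegate.write(text)

# A test suite would install one of these over System.out, here with the
# 8 KB default limit mentioned above; modest output passes through.
sink = io.StringIO()
out = LimitedStream(sink, 8 * 1024)
out.write("a modest amount of output is fine\n")
```

Per-suite adjustability (the Limit annotation) and opting out (SuppressSysoutChecks) would correspond to constructing the wrapper with a different limit, or not installing it at all.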



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-5622:


Fix Version/s: 5.0
   4.9

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to 
> standard output and error streams will be accounted for and if it exceeds a 
> given limit an assertion will be thrown. The limit is adjustable per-suite 
> using Limit annotation, with the default of 8kb per suite. The check can be 
> suppressed entirely by specifying SuppressSysoutChecks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985324#comment-13985324
 ] 

ASF subversion and git services commented on LUCENE-5622:
-

Commit 1591222 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1591222 ]

LUCENE-5622: Fail tests if they print over the given limit of bytes to 
System.out or System.err

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to 
> standard output and error streams will be accounted for and if it exceeds a 
> given limit an assertion will be thrown. The limit is adjustable per-suite 
> using Limit annotation, with the default of 8kb per suite. The check can be 
> suppressed entirely by specifying SuppressSysoutChecks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded

2014-04-30 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5681:
---

Attachment: SOLR-5681.patch

Another one. Still has some issues with:
* Removal of tasks from work queue.
* Failed tasks.

Working on the above two.

> Make the OverseerCollectionProcessor multi-threaded
> ---
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
> Attachments: SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, 
> SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
>
>
> Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting 
> anything long-running blocks the processing of other mutually exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have 
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and 
> thereby stall a create-collection task (which would stay queued in zk), even 
> though the two tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An 
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
> The task from the workQueue is only removed on completion so that in case of 
> a failure, the new Overseer can re-consume the same task and retry. A queue 
> is not the right data structure in the first place to look ahead i.e. get the 
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks 
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting 
> and passing the request to a new thread (or one from the pool). The parent 
> method uses a peekAfter(last element) instead of a peek(). The peekAfter 
> returns the task after the 'last element'. Maintain this request information 
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also 
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as 
> soon as a task from this is picked up for processing by the thread, it's 
> removed from the queue. At the end, the cleanup is done from the workQueue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-5622:


Attachment: LUCENE-5622.patch

New patch with an adjustable limit on the bytes written to sysout/syserr before 
an assertion is thrown. Disabled the check entirely for Solr tests, and for a 
few Lucene classes as well.

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch, LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to 
> standard output and error streams will be accounted for and if it exceeds a 
> given limit an assertion will be thrown. The limit is adjustable per-suite 
> using Limit annotation, with the default of 8kb per suite. The check can be 
> suppressed entirely by specifying SuppressSysoutChecks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6034) Use a wildcard in order to delete fields with Atomic Update

2014-04-30 Thread Constantin Muraru (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Constantin Muraru updated SOLR-6034:


Component/s: (was: contrib - DataImportHandler)

> Use a wildcard in order to delete fields with Atomic Update
> ---
>
> Key: SOLR-6034
> URL: https://issues.apache.org/jira/browse/SOLR-6034
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.7
>Reporter: Constantin Muraru
>
> As discussed on the SOLR user group, it would be a great feature to be able 
> to remove all fields matching a pattern, using Atomic Updates.
> Example:
> 
>   100
>   
> 
> The *_day_i should be expanded server-side and all fields matching this 
> pattern should be removed from the specified document.
> Workaround: When removing fields from a document, we can first query SOLR 
> from the client to see which fields are actually present on the specific 
> document, and then build the XML update document to send to SOLR. However, 
> this increases the number of queries to SOLR, and for a large number of 
> documents the cost adds up. It would be much better, performance-wise and 
> simplicity-wise, to be able to provide wildcards.
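The workaround described above amounts to expanding the wildcard on the client. A minimal Python sketch, with stated assumptions: the document dict stands in for a real query response, and the payload shape follows Solr's JSON atomic-update convention of setting a field to null to remove it:

```python
import fnmatch

def build_wildcard_delete(doc, pattern):
    # Expand the wildcard client-side: for every stored field matching the
    # pattern, emit an atomic-update instruction that nulls the field out.
    update = {"id": doc["id"]}
    for field in doc:
        if field != "id" and fnmatch.fnmatch(field, pattern):
            update[field] = {"set": None}  # atomic update: remove this field
    return update

# The document's current fields would come from a per-document query to Solr;
# this extra round trip is exactly the cost the workaround paragraph notes.
doc = {"id": "100", "mon_day_i": 3, "tue_day_i": 5, "title": "example"}
payload = build_wildcard_delete(doc, "*_day_i")
# payload would then be POSTed to the /update handler as JSON
```

Server-side wildcard expansion, as requested in the issue, would eliminate the preliminary query and the client-side matching entirely.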



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-5622:


Description: 
Some tests print so much stuff they are now undebuggable (see LUCENE-5612).

From now on, when tests.verbose is false, the number of bytes printed to 
standard output and error streams will be accounted for and if it exceeds a 
given limit an assertion will be thrown. The limit is adjustable per-suite 
using Limit annotation, with the default of 8kb per suite. The check can be 
suppressed entirely by specifying SuppressSysoutChecks.

  was:
Some tests print so much stuff they are now undebuggable (see LUCENE-5612).

I think it's bad that the test runner hides this stuff; we used to stay on top 
of it. Instead, when tests.verbose is false, we should install print streams 
(system.out/err) that fail the test instantly when they are noisy. 

This will ensure that our tests don't go out of control.


> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> From now on, when tests.verbose is false, the number of bytes printed to 
> standard output and error streams will be accounted for and if it exceeds a 
> given limit an assertion will be thrown. The limit is adjustable per-suite 
> using Limit annotation, with the default of 8kb per suite. The check can be 
> suppressed entirely by specifying SuppressSysoutChecks.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5622) Fail tests if they print over the given limit of bytes to System.out or System.err

2014-04-30 Thread Dawid Weiss (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-5622:


Summary: Fail tests if they print over the given limit of bytes to 
System.out or System.err  (was: Fail tests if they print, and tests.verbose is 
not set)

> Fail tests if they print over the given limit of bytes to System.out or 
> System.err
> --
>
> Key: LUCENE-5622
> URL: https://issues.apache.org/jira/browse/LUCENE-5622
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Assignee: Dawid Weiss
> Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
> LUCENE-5622.patch
>
>
> Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
> I think it's bad that the test runner hides this stuff; we used to stay on 
> top of it. Instead, when tests.verbose is false, we should install print 
> streams (system.out/err) that fail the test instantly when they are noisy. 
> This will ensure that our tests don't go out of control.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-30 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985257#comment-13985257
 ] 

Anshum Gupta commented on SOLR-6022:


Good stuff. Makes things easier to comprehend.

I am not sure whether this one should be the IndexAnalyzer or the QueryAnalyzer, 
since, AFAIR, this code constructs a query out of the terms of a document 
(given an id).
{code:title= MoreLikeThisHandler.java|borderStyle=solid}
-  mlt.setAnalyzer( searcher.getSchema().getAnalyzer() );
+  mlt.setAnalyzer( searcher.getSchema().getIndexAnalyzer() );
{code}

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch, SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

2014-04-30 Thread Da Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985254#comment-13985254
 ] 

Da Huang commented on LUCENE-4396:
--

Thanks for your suggestions, Mike. And sorry for my late reply.
{quote}
Hmm, the patch didn't cleanly apply, but I was able to work through
it. I think your dev area is not up to date with trunk?
{quote}
I haven't merged my branch with the latest trunk because my network quota at 
school for April has run out, so I can't pull the code from GitHub until 1 May. 
Sorry about that.
{quote}
Small code style things
{quote}
Sorry about the code style; that's my fault.
{quote}
So it looks like BooleanNovelScorer is able to be a Scorer because the
linked-list of visited buckets in one window are guaranteed to be in
docID order, because we first visit the requiredConjunctionScorer's
docs in that window.
{quote}
Yes, you're right.
{quote}
Have you tested performance when the .advance method here isn't called?
Ie, just boolean queries w/ one MUST and one or more SHOULD? 
{quote}
No, I haven't. Do you mean the .advance() method of the subScorers in 
BooleanNovelScorer? If so, I will test that.
If you mean the .advance() method of BooleanNovelScorer itself, I don't think 
that test would tell us much, because BooleanNovelScorer is now used whenever 
there is at least one MUST clause, whether or not it acts as a top scorer. 
Therefore, .advance() of BooleanNovelScorer must be called whenever it acts as 
a non-top scorer. 
{quote}
 I think the important question here is whether/in what cases the
BooleanNovelScorer approach beats BooleanScorer2 performance?
{quote}
Yes, you're right. But BooleanNovelScorer is not completely finished, and its 
performance still needs improvement, especially in its .advance() method.
{quote}
I realized LUCENE-4872 is related here, i.e. we should also sometimes
use BooleanScorer for the minShouldMatch>1 case.
{quote}
Yes, I noticed that too. :) I think the two issues should be dealt with together.

> BooleanScorer should sometimes be used for MUST clauses
> ---
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
> Attachments: LUCENE-4396.patch, LUCENE-4396.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared 
> to the other clauses, that BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, eg if suddenly the MUST clause skips 100 docs then you want 
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you 
> are inspired!



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1505 - Failure!

2014-04-30 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1505/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 10900 lines...]
   [junit4] JVM J0: stderr was not empty, see: 
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140430_072856_676.syserr
   [junit4] >>> JVM J0: stderr (verbatim) 
   [junit4] java(285,0x147d12000) malloc: *** error for object 
0x400147d00d30: pointer being freed was not allocated
   [junit4] *** set a breakpoint in malloc_error_break to debug
   [junit4] <<< JVM J0: EOF 

[...truncated 1 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre/bin/java 
-XX:-UseCompressedOops -XX:+UseSerialGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/heapdumps 
-Dtests.prefix=tests -Dtests.seed=948673D40C57FAA8 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=4.9 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false 
-Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 
-DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=4.9-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true 
-classpath 
/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/classes/test:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-test-framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/test-framework/lib/junit4-ant-2.1.3.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/src/test-files:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/test-framework/classes/java:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/codecs/classes/java:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-solrj/classes/java:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/classes/java:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/analysis/common/lucene-analyzers-common-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/analysis/kuromoji/lucene-analyzers-kuromoji-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/analysis/phonetic/lucene-analyzers-phonetic-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/codecs/lucene-codecs-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/highlighter/lucene-highlighter-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/memory/lucene-memory-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/misc/lucene-misc-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/spatial/lucene-spatial-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/expressions/lucene-expressions-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/suggest/lucene-suggest-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/grouping/lucene-grouping-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/queries/lucene-queries-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOS
X/lucene/build/queryparser/lucene-queryparser-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/join/lucene-join-4.9-SNAPSHOT.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/antlr-runtime-3.5.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/asm-4.1.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/asm-commons-4.1.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/commons-cli-1.2.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/commons-codec-1.9.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/commons-configuration-1.6.jar:/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/core/lib/commons-fileuplo

[jira] [Updated] (SOLR-6022) Rename getAnalyzer to getIndexAnalyzer

2014-04-30 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated SOLR-6022:
-

Attachment: SOLR-6022.patch

Another trunk patch.  I had to rework how subclasses enable analyzers for 
their type.  Previously, subclasses had to override setAnalyzer and implement 
it to set the protected member variable.  With this patch they instead 
override {{supportsAnalyzers()}} to return true.
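The refactoring described above, replacing an overridden setter with a single capability method, can be sketched as follows. This is illustrative Python, not the actual Solr FieldType code; class and method names are assumptions:

```python
class FieldType:
    # Base type: holds the analyzer slot, but only accepts an analyzer when
    # the subclass has opted in via the capability method.
    def __init__(self):
        self._index_analyzer = None

    def supports_analyzers(self):
        return False  # default: this field type has no analyzers

    def set_index_analyzer(self, analyzer):
        if not self.supports_analyzers():
            raise TypeError(type(self).__name__ + " does not support analyzers")
        self._index_analyzer = analyzer

class TextField(FieldType):
    def supports_analyzers(self):
        return True  # opt in, instead of re-implementing the setter
```

The design benefit mirrors the comment above: subclasses declare a capability with one boolean override, and the storage logic lives in exactly one place instead of being duplicated in every overriding setter.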

> Rename getAnalyzer to getIndexAnalyzer
> --
>
> Key: SOLR-6022
> URL: https://issues.apache.org/jira/browse/SOLR-6022
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ryan Ernst
> Attachments: SOLR-6022.branch_4x-deprecation.patch, SOLR-6022.patch, 
> SOLR-6022.patch, SOLR-6022.patch
>
>
> We have separate index/query analyzer chains, but the access methods for the 
> analyzers do not match up with the names.  This can lead to unknowingly using 
> the wrong analyzer chain (as it did in SOLR-6017).  We should do this 
> renaming in trunk, and deprecate the old getAnalyzer function in 4x.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org