[jira] [Updated] (SOLR-3925) Expose SpanFirst in eDismax
[ https://issues.apache.org/jira/browse/SOLR-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-3925:
--------------------------------
    Attachment: SOLR-3925-trunk-2.patch

Updated patch for today's trunk.

> Expose SpanFirst in eDismax
> ---------------------------
>                 Key: SOLR-3925
>                 URL: https://issues.apache.org/jira/browse/SOLR-3925
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>    Affects Versions: 4.0-BETA
>         Environment: solr-spec 5.0.0.2012.10.09.19.29.59, solr-impl 5.0-SNAPSHOT 1366361:1396116M - markus - 2012-10-09 19:29:59
>            Reporter: Markus Jelsma
>             Fix For: 4.1, 5.0
>         Attachments: SOLR-3925-trunk-1.patch, SOLR-3925-trunk-2.patch
>
> Expose Lucene's SpanFirst capability in Solr's extended Dismax query parser. This issue adds the sf (SpanFirst) parameter, which takes a FIELD~DISTANCE^BOOST formatted value. For example, sf=title~5^2 gives a boost of 2 if one of the normal clauses, originally generated for automatic phrase queries, is located within five positions of the field's start. A unit test is included and all tests pass.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
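The FIELD~DISTANCE^BOOST value format described above can be parsed with a short regular expression. A minimal sketch, assuming the format from the issue ("title~5^2" means field=title, distance=5, boost=2); the class and method names here are invented for illustration and are not part of the actual patch:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the sf parameter's FIELD~DISTANCE^BOOST format,
// e.g. "title~5^2" => field=title, distance=5, boost=2.0.
public class SpanFirstParam {
    // FIELD, '~', DISTANCE, then an optional '^BOOST' suffix.
    private static final Pattern FORMAT =
        Pattern.compile("([^~^]+)~(\\d+)(?:\\^(\\d+(?:\\.\\d+)?))?");

    public final String field;
    public final int distance;
    public final float boost;

    public SpanFirstParam(String field, int distance, float boost) {
        this.field = field;
        this.distance = distance;
        this.boost = boost;
    }

    public static SpanFirstParam parse(String value) {
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "expected FIELD~DISTANCE^BOOST, got: " + value);
        }
        // Boost defaults to 1.0 when the ^BOOST suffix is absent.
        float boost = m.group(3) != null ? Float.parseFloat(m.group(3)) : 1.0f;
        return new SpanFirstParam(m.group(1), Integer.parseInt(m.group(2)), boost);
    }
}
```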
[jira] [Commented] (SOLR-3970) Admin dashboard shows incomplete java version
[ https://issues.apache.org/jira/browse/SOLR-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492244#comment-13492244 ]

Stefan Matheis (steffkes) commented on SOLR-3970:
-------------------------------------------------

Shawn, nothing easier than this: {{solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java}} on line 214:

{code}jvm.add( "version", System.getProperty("java.vm.version") );{code}

but i don't know what makes more sense to show as information at the dashboard?

> Admin dashboard shows incomplete java version
> ---------------------------------------------
>                 Key: SOLR-3970
>                 URL: https://issues.apache.org/jira/browse/SOLR-3970
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 4.0
>         Environment: Linux bigindy5 2.6.32-279.9.1.el6.centos.plus.x86_64 #1 SMP Wed Sep 26 03:52:55 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
>                      java version "1.7.0_07", Java(TM) SE Runtime Environment (build 1.7.0_07-b10), Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>            Reporter: Shawn Heisey
>            Priority: Minor
>             Fix For: 4.1
>
> The admin dashboard shows the following for Runtime under JVM, but it is incomplete. Unless you are intimately familiar with the correlation between HotSpot version numbers and Java version numbers, you can't look at this and know what version of Oracle Java is being used.
>   Java HotSpot(TM) 64-Bit Server VM (23.3-b01)
> The complete version output (from java -version) on this system is:
>   java version "1.7.0_07"
>   Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
>   Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
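The mismatch Stefan points at can be seen by printing the relevant standard system properties side by side. A minimal sketch (the class name is illustrative; on the reporter's JVM, {{java.vm.version}} would show "23.3-b01" while {{java.version}} would show the "1.7.0_07" users actually recognize):

```java
// Compare the system properties behind the two candidate version strings
// for the dashboard. All four keys are standard Java system properties.
public class JvmVersionDemo {
    public static void main(String[] args) {
        System.out.println("java.version         = " + System.getProperty("java.version"));
        System.out.println("java.vm.version      = " + System.getProperty("java.vm.version"));
        System.out.println("java.vm.name         = " + System.getProperty("java.vm.name"));
        System.out.println("java.runtime.version = " + System.getProperty("java.runtime.version"));
    }
}
```

Showing {{java.runtime.version}} (e.g. "1.7.0_07-b10") alongside the VM string would resolve the ambiguity either way.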
[jira] [Updated] (SOLR-4031) Rare mixup of request content
[ https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-4031:
-------------------------------
    Fix Version/s: 4.1

> Rare mixup of request content
> -----------------------------
>                 Key: SOLR-4031
>                 URL: https://issues.apache.org/jira/browse/SOLR-4031
>             Project: Solr
>          Issue Type: Bug
>          Components: multicore, search, SolrCloud
>    Affects Versions: 4.0
>            Reporter: Per Steffensen
>              Labels: bug, data-integrity, mixup, request, security
>             Fix For: 4.1
>
> We are using Solr 4.0 and run intensive performance/data-integrity/endurance tests on it. On very rare occasions the content of two concurrent requests to Solr gets mixed up. We have spent a lot of time narrowing down this issue and found that it is a bug in Jetty 8.1.2, so of course we have filed it as a bug with Jetty:
> Official bugzilla: https://bugs.eclipse.org/bugs/show_bug.cgi?id=392936
> Mailing list thread: http://dev.eclipse.org/mhonarc/lists/jetty-dev/threads.html#01530
> The reports to Jetty are very detailed, so you can read about it there. We have found that the problem seems to be solved in Jetty 8.1.7, so we are now running Solr 4.0 (plus our additional changes) on top of Jetty 8.1.7 instead of 8.1.2. You probably want to do the same upgrade on the Apache side sometime soon. At least now you know what to tell people if they start complaining about mixed-up requests in Solr 4.0 - upgrade the underlying Jetty to 8.1.7 (or run Tomcat or something).
[jira] [Commented] (SOLR-4031) Rare mixup of request content
[ https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492247#comment-13492247 ]

Yonik Seeley commented on SOLR-4031:
------------------------------------

Thanks for tracking that down, Per! Sounds like we should definitely upgrade to the latest Jetty 8. I've marked this for 4.1.

> Rare mixup of request content
> -----------------------------
>                 Key: SOLR-4031
>                 URL: https://issues.apache.org/jira/browse/SOLR-4031
[jira] [Resolved] (LUCENE-4532) TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy failure
[ https://issues.apache.org/jira/browse/LUCENE-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera resolved LUCENE-4532.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 5.0
                   4.1
    Lucene Fields: New,Patch Available  (was: New)

Committed to trunk and 4.x. I didn't commit to 4.0.x because it seems we're not going to have a 4.0.1, but rather focus on 4.1.

> TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy failure
> --------------------------------------------------------------------
>                 Key: LUCENE-4532
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4532
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 4.1, 5.0
>         Attachments: LUCENE-4532.patch, LUCENE-4532.patch
>
> The following failure on Jenkins:
> {noformat}
> Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1404/
> Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC
> 1 tests failed.
> REGRESSION: org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy
> Error Message:
> Stack Trace:
> java.lang.ArrayIndexOutOfBoundsException
>     at __randomizedtesting.SeedInfo.seed([6AB10D3E4E956CFA:BFB2863DB7E077E0]:0)
>     at java.lang.System.arraycopy(Native Method)
>     at org.apache.lucene.facet.taxonomy.directory.ParentArray.refresh(ParentArray.java:99)
>     at org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader.refresh(DirectoryTaxonomyReader.java:407)
>     at org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.doTestReadRecreatedTaxono(TestDirectoryTaxonomyReader.java:167)
>     at org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader.testRefreshReadRecreatedTaxonomy(TestDirectoryTaxonomyReader.java:130)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>     at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>     at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>     at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>     at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>     at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>     at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>     at
[jira] [Created] (SOLR-4044) CloudSolrServer early connect problems
Grant Ingersoll created SOLR-4044:
-------------------------------------

             Summary: CloudSolrServer early connect problems
                 Key: SOLR-4044
                 URL: https://issues.apache.org/jira/browse/SOLR-4044
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
    Affects Versions: 4.0
            Reporter: Grant Ingersoll

If you call CloudSolrServer.connect() after Zookeeper is up, but before the cluster state, etc. is populated, you will get "No live SolrServer" exceptions (line 322 in LBHttpSolrServer):

{code}throw new SolrServerException("No live SolrServers available to handle this request");{code}

for all requests made, even though all the Solr nodes are coming up just fine.
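One client-side mitigation for the race described above is to wait for the cluster state to become available before issuing requests, rather than failing on the first attempt. A hedged sketch: {{ClusterStateSource}} is an invented interface standing in for whatever check the real client would make against ZooKeeper (e.g. reading live nodes); the retry/deadline pattern is the point, not the API:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical workaround sketch: poll until live nodes appear in the
// cluster state, or give up after a deadline so callers still get a
// clear failure instead of a misleading "No live SolrServers" error.
public class WaitForClusterState {
    interface ClusterStateSource {
        boolean hasLiveNodes();   // stand-in for the real ZK cluster-state check
    }

    static boolean awaitLiveNodes(ClusterStateSource state, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (state.hasLiveNodes()) {
                return true;      // safe to start sending requests
            }
            TimeUnit.MILLISECONDS.sleep(pollMs);
        }
        return false;             // caller decides how to fail
    }
}
```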
[jira] [Assigned] (SOLR-4031) Rare mixup of request content
[ https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley reassigned SOLR-4031:
----------------------------------
    Assignee: Yonik Seeley

> Rare mixup of request content
> -----------------------------
>                 Key: SOLR-4031
>                 URL: https://issues.apache.org/jira/browse/SOLR-4031
>            Assignee: Yonik Seeley
[jira] [Commented] (SOLR-3931) Turn off coord() factor for scoring
[ https://issues.apache.org/jira/browse/SOLR-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492300#comment-13492300 ]

Joel Nothman commented on SOLR-3931:
------------------------------------

Version 4.0.0 allows the specification of a custom similarity factory for each field in schema.xml (see SOLR-2338; it seems documentation is a bit lacking). So these options are not per-query, but per-core. It would be possible to copy or patch Lucene's {{DefaultSimilarity}} and Solr's {{DefaultSimilarityFactory}} to take `useCoord` and `useQueryNorm` parameters.

> Turn off coord() factor for scoring
> -----------------------------------
>                 Key: SOLR-3931
>                 URL: https://issues.apache.org/jira/browse/SOLR-3931
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>
> We would like to remove the coordination factor from scoring. For small fields (like the name of a doctor), we do not want to score higher if the same term is in the field more than once. Makes sense for books, not so much for formal names.
>   /solr/select?q=*:*&coordFactor=false
> Default is true. (Note: we might want to make each of these optional - tf, idf, coord, queryNorm.)
> coord(q,d) is a score factor based on how many of the query terms are found in the specified document. Typically, a document that contains more of the query's terms will receive a higher score than another document with fewer query terms. This is a search-time factor computed in coord(q,d) by the Similarity in effect at search time.
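The coord() factor described above boils down to a simple ratio, and "turning it off" amounts to returning a constant 1. A minimal sketch of that idea, assuming coord = overlap / maxOverlap as in Lucene's default scoring; the class here is illustrative, not Lucene's actual Similarity API, and in a real setup the switch would live in a custom Similarity wired in via schema.xml:

```java
// Sketch of the coord() computation and the proposed useCoord switch:
// overlap   = number of query terms matched in the document,
// maxOverlap = total number of query terms in the query.
public class CoordSketch {
    private final boolean useCoord;

    public CoordSketch(boolean useCoord) {
        this.useCoord = useCoord;
    }

    public float coord(int overlap, int maxOverlap) {
        // With useCoord=false every document gets the same factor,
        // so matching fewer query terms is no longer penalized.
        return useCoord ? (float) overlap / maxOverlap : 1.0f;
    }
}
```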
[jira] [Comment Edited] (SOLR-3931) Turn off coord() factor for scoring
[ https://issues.apache.org/jira/browse/SOLR-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492300#comment-13492300 ]

Joel Nothman edited comment on SOLR-3931 at 11/7/12 12:41 PM:
--------------------------------------------------------------

Version 4.0.0 allows the specification of a custom similarity factory for each field in schema.xml (see SOLR-2338; it seems documentation is a bit lacking). So these options are not per-query, but per-core. It would be possible to copy or patch Lucene's {{DefaultSimilarity}} and Solr's {{DefaultSimilarityFactory}} to take {{useCoord}} and {{useQueryNorm}} parameters.

was (Author: jnothman): (the same comment, with `useCoord` and `useQueryNorm` in backticks instead of {{...}} markup)

> Turn off coord() factor for scoring
> -----------------------------------
>                 Key: SOLR-3931
>                 URL: https://issues.apache.org/jira/browse/SOLR-3931
[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492307#comment-13492307 ]

Piotr commented on LUCENE-4542:
-------------------------------

I'd prefer not to create a patch; I don't feel so comfortable with the Lucene code.

> Make RECURSION_CAP in HunspellStemmer configurable
> --------------------------------------------------
>                 Key: LUCENE-4542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4542
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.0
>            Reporter: Piotr
>            Assignee: Chris Male
>
> Currently there is {{private static final int RECURSION_CAP = 2;}} in the code of the HunspellStemmer class. It makes using Hunspell with several dictionaries almost unusable due to bad performance (e.g. it costs 36 ms to stem a long sentence in Latvian with recursion_cap=2, versus 5 ms with recursion_cap=1). It would be nice to be able to tune this number as needed. AFAIK this number (2) was chosen arbitrarily. (It's the first issue I've ever filed, so please forgive any mistakes.)
[jira] [Comment Edited] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492307#comment-13492307 ]

Piotr edited comment on LUCENE-4542 at 11/7/12 12:57 PM:
---------------------------------------------------------

I'd prefer not to create a patch myself, I don't feel so comfortable with lucene code.

was (Author: zasnuty): I'd prefer not to create a patch, I don't feel so comfortable with lucene code.

> Make RECURSION_CAP in HunspellStemmer configurable
> --------------------------------------------------
>                 Key: LUCENE-4542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4542
[jira] [Resolved] (SOLR-4031) Rare mixup of request content
[ https://issues.apache.org/jira/browse/SOLR-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-4031.
--------------------------------
    Resolution: Fixed

Upgraded to Jetty 8.1.7; committed to trunk and 4x.

> Rare mixup of request content
> -----------------------------
>                 Key: SOLR-4031
>                 URL: https://issues.apache.org/jira/browse/SOLR-4031
[jira] [Updated] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rafał Kuć updated LUCENE-4542:
------------------------------
    Attachment: LUCENE-4542.patch

As Piotr doesn't want to provide the patch, I'll do it for him :) A simple patch adding a new constructor that allows passing an additional parameter - the recursion cap. The old constructor is still there, and the default value for the recursion cap remains 2.

> Make RECURSION_CAP in HunspellStemmer configurable
> --------------------------------------------------
>                 Key: LUCENE-4542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4542
>         Attachments: LUCENE-4542.patch
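The shape of the change Rafał describes, keeping the old behaviour as a default while letting callers override the cap, is the standard constructor-overloading pattern. A sketch of that pattern under stated assumptions (this is not the actual patch, and {{HunspellStemmerSketch}} is an illustrative stand-in, not Lucene's class):

```java
// Constructor overloading: the no-arg constructor keeps the historical
// cap of 2, while the new constructor makes it tunable per instance.
public class HunspellStemmerSketch {
    private static final int DEFAULT_RECURSION_CAP = 2;
    private final int recursionCap;

    public HunspellStemmerSketch() {
        this(DEFAULT_RECURSION_CAP);   // old constructor, unchanged behaviour
    }

    public HunspellStemmerSketch(int recursionCap) {
        if (recursionCap < 0) {
            throw new IllegalArgumentException("recursionCap must be >= 0");
        }
        this.recursionCap = recursionCap;
    }

    public int getRecursionCap() {
        return recursionCap;
    }
}
```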
[jira] [Created] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory
Markus Jelsma created LUCENE-4545:
-------------------------------------

             Summary: Better error reporting StemmerOverrideFilterFactory
                 Key: LUCENE-4545
                 URL: https://issues.apache.org/jira/browse/LUCENE-4545
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/analysis
    Affects Versions: 4.0
            Reporter: Markus Jelsma
             Fix For: 4.1, 5.0

If the dictionary contains an error, such as a space instead of a tab, it is hard to find in a long dictionary. This patch includes the file and line number in the exception, helping to debug it quickly.
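The improvement described above, reporting the offending file and line number instead of failing opaquely, can be sketched in a few lines. This is a hedged illustration, not the actual patch: the "word TAB stem" line format is an assumption for the example, and the class name is invented:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.text.ParseException;

// Sketch: track the line number while reading a dictionary so a malformed
// entry (e.g. a space where a tab was expected) can be reported precisely.
public class DictionaryLineCheck {
    public static void check(Reader input, String fileName)
            throws IOException, ParseException {
        BufferedReader reader = new BufferedReader(input);
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            if (line.isEmpty() || !line.contains("\t")) {
                // Include file name and line number in the message, which is
                // the debugging aid the issue asks for.
                throw new ParseException(
                    "Invalid entry in " + fileName + " at line " + lineNumber
                        + " (expected TAB-separated mapping): " + line,
                    lineNumber);
            }
        }
    }
}
```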
[jira] [Updated] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated LUCENE-4545:
----------------------------------
    Attachment: LUCENE-4545-trunk-1.patch

Patch for trunk.

> Better error reporting StemmerOverrideFilterFactory
> ---------------------------------------------------
>                 Key: LUCENE-4545
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4545
>         Attachments: LUCENE-4545-trunk-1.patch
[jira] [Updated] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated LUCENE-4545:
----------------------------------
    Priority: Trivial  (was: Major)

> Better error reporting StemmerOverrideFilterFactory
> ---------------------------------------------------
>                 Key: LUCENE-4545
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4545
>            Priority: Trivial
[jira] [Updated] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rafał Kuć updated LUCENE-4542:
------------------------------
    Attachment: LUCENE-4542-with-solr.patch

Chris, I've attached a second patch which includes changes to Solr's HunspellFilter and its factory. Please review it and say if you want any changes made to it; I'll be glad to do it.

> Make RECURSION_CAP in HunspellStemmer configurable
> --------------------------------------------------
>                 Key: LUCENE-4542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4542
>         Attachments: LUCENE-4542.patch, LUCENE-4542-with-solr.patch
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b58) - Build # 2257 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2257/
Java: 32bit/jdk1.8.0-ea-b58 -server -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 19613 lines...]
check-licenses:
     [echo] License check under: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar
 [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar
[...truncated 2 lines...]

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:67: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:223: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/custom-tasks.xml:44: License check failed. Check the logs.

Total time: 28 minutes 3 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.8.0-ea-b58 -server -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.6.0_37) - Build # 1476 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1476/
Java: 32bit/jdk1.6.0_37 -client -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 18989 lines...]
check-licenses:
     [echo] License check under: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr
 [licenses] MISSING sha1 checksum file for each of the ten jetty 8.1.7.v20120910 jars (continuation, deploy, http, io, jmx, security, server, servlet, util, webapp) under C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\
[...truncated 2 lines...]

BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:67: The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build.xml:223: The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\tools\custom-tasks.xml:44: License check failed. Check the logs.

Total time: 47 minutes 56 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseParallelGC
Email was triggered for: Failure
Sending email for trigger: Failure
[jira] [Commented] (LUCENE-4545) Better error reporting StemmerOverrideFilterFactory
[ https://issues.apache.org/jira/browse/LUCENE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492366#comment-13492366 ]

Robert Muir commented on LUCENE-4545:

I'm for the idea, but not for the logic being confined to this specific factory. Instead of tracking our own line numbers, we should use LineNumberReader and so on. WordlistLoader.getStemDict should be changed to take a generic Map (not a CharArrayMap), so that it can be used by this method. In fact, since nothing at all is using this method, we can do whatever we want with it. Also, the logic should not use split(s, 2): I think it should just use split(s). That way we also detect the situation where a line unexpectedly contains multiple tabs.

Better error reporting StemmerOverrideFilterFactory
Key: LUCENE-4545
URL: https://issues.apache.org/jira/browse/LUCENE-4545
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 4.0
Reporter: Markus Jelsma
Priority: Trivial
Fix For: 4.1, 5.0
Attachments: LUCENE-4545-trunk-1.patch

If the dictionary contains an error such as a space instead of a tab, the mistake is hard to find in a long dictionary. This patch includes the file and line number in the exception, helping to debug it quickly.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
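Sketched in Python for brevity (the actual factory is Java and would use LineNumberReader, as the comment suggests), this is the shape of the proposed parsing: track the line number, split on tabs without a limit so extra tabs are also caught, and report the file and line on any malformed entry. The function name and message format are illustrative, not Lucene's:

```python
def load_stem_dict(lines, filename="<dict>"):
    """Parse "word<TAB>stem" lines into a dict, reporting the file and
    line number on malformed entries (the behavior the patch proposes)."""
    stems = {}
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        parts = line.split("\t")  # no limit: a second tab is an error too
        if len(parts) != 2:
            raise ValueError(
                f"{filename}, line {lineno}: expected exactly one tab, "
                f"got {len(parts) - 1} in: {line!r}")
        word, stem = parts
        stems[word] = stem
    return stems
```

With the file name and line number in the message, a stray space-for-tab in a 50,000-entry dictionary is located immediately instead of by bisection.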
[jira] [Created] (LUCENE-4546) SorterTemplate.quicksort incorrect
Stefan Pohl created LUCENE-4546:

Summary: SorterTemplate.quicksort incorrect
Key: LUCENE-4546
URL: https://issues.apache.org/jira/browse/LUCENE-4546
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Affects Versions: 4.0, 3.6.1, 4.1
Reporter: Stefan Pohl
Fix For: 4.1, 4.0, 3.6.1

On trying to use the very useful o.a.l.util.SorterTemplate, I stumbled upon inconsistent sorting behaviour - of course, only a randomized test caught this ;) Because SorterTemplate.quicksort is used in several places in the code (directly in BytesRefList, ArrayUtil, BytesRefHash and CollectionUtil, and transitively in index and search), I'm a bit puzzled that either this hasn't been caught by another higher-level test, or that neither my test nor my understanding of the insufficiency in the code is valid ;) If the former holds, and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch?

Basically, the current quicksort implementation seems to assume that the mid element it grabs as the pivot luckily always turns out to be the median; it does not handle the case where the initially chosen pivot does not end up in the middle after partitioning. Hope this and the test help to understand the issue. A reproducible, currently failing test and a patch are attached.
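For readers unfamiliar with the pivot issue being described: a quicksort that picks the middle element as the pivot must recurse around the pivot's final position after partitioning, not around the original mid index, because the pivot generally moves during the partition. A minimal correct sketch (illustrative Python, not SorterTemplate's actual code):

```python
def quicksort(a, lo=0, hi=None):
    """In-place quicksort. The pivot is taken from the middle, but after
    partitioning we recurse on the ranges around the pivot's *final*
    position 'store', never around the original mid index."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    # Move the mid element to the end, then partition (Lomuto scheme).
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]
    pivot, store = a[hi], lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]  # pivot lands at index 'store'
    quicksort(a, lo, store - 1)
    quicksort(a, store + 1, hi)
    return a
```

Assuming the pivot stays at mid, as the report describes, produces output that is only sorted when the mid element happens to be the median, which is exactly the kind of bug a randomized test catches.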
[jira] [Updated] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Pohl updated LUCENE-4546:
Attachment: SorterTemplate.java.patch
            TestSorterTemplate.java

Test and patch file now attached.

SorterTemplate.quicksort incorrect
Key: LUCENE-4546
URL: https://issues.apache.org/jira/browse/LUCENE-4546
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Affects Versions: 3.6.1, 4.0, 4.1
Reporter: Stefan Pohl
Labels: patch
Fix For: 3.6.1, 4.0, 4.1
Attachments: SorterTemplate.java.patch, TestSorterTemplate.java
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2248 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2248/
Java: 32bit/jdk1.7.0_09 -client -XX:+UseConcMarkSweepGC

All tests passed

Build Log:
[...truncated 19574 lines...]
check-licenses:
     [echo] License check under: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr
 [licenses] MISSING sha1 checksum file for each of the ten jetty 8.1.7.v20120910 jars (continuation, deploy, http, io, jmx, security, server, servlet, util, webapp) under /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/
[...truncated 2 lines...]

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:67: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:223: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/custom-tasks.xml:44: License check failed. Check the logs.

Total time: 34 minutes 18 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseConcMarkSweepGC
Email was triggered for: Failure
Sending email for trigger: Failure
[jira] [Resolved] (LUCENE-4536) Make PackedInts byte-aligned?
[ https://issues.apache.org/jira/browse/LUCENE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-4536.
Resolution: Fixed

Committed:
- trunk: r1406651
- branch 4.x: r1406660

Make PackedInts byte-aligned?
Key: LUCENE-4536
URL: https://issues.apache.org/jira/browse/LUCENE-4536
Project: Lucene - Core
Issue Type: Task
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.1
Attachments: LUCENE-4536.patch, LUCENE-4536.patch

PackedInts are more and more used to save/restore small arrays, but given that they are long-aligned, up to 63 bits are wasted per array. We should try to make PackedInts storage byte-aligned so that only 7 bits are wasted in the worst case.
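The waste figures in the description follow directly from rounding the packed payload up to the alignment unit: long alignment can pad by up to 63 bits, byte alignment by at most 7. A small worked example (the function is illustrative, not Lucene's PackedInts API):

```python
import math

def wasted_bits(num_values, bits_per_value, alignment_bits):
    """Bits of padding when packing num_values values of bits_per_value
    bits each into storage rounded up to a multiple of alignment_bits."""
    payload = num_values * bits_per_value
    stored = math.ceil(payload / alignment_bits) * alignment_bits
    return stored - payload

# e.g. 3 values of 7 bits = 21 payload bits:
# long-aligned storage rounds up to 64 bits (43 bits wasted),
# byte-aligned storage rounds up to 24 bits (only 3 bits wasted).
```

For the small arrays the issue mentions, that difference dominates: a 21-bit payload more than triples in size under long alignment but grows by under 15% under byte alignment.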
[jira] [Created] (LUCENE-4547) PackedIntsDocValue field broken on large indexes
Robert Muir created LUCENE-4547:

Summary: PackedIntsDocValue field broken on large indexes
Key: LUCENE-4547
URL: https://issues.apache.org/jira/browse/LUCENE-4547
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.1

I tried to write a test to sanity-check LUCENE-4536 (first running against svn revision 1406416, before the change). But I found docvalues is already broken here for large indexes that have a PackedLongDocValues field:

{code}
final int numDocs = 5;
for (int i = 0; i < numDocs; ++i) {
  if (i == 0) {
    field.setLongValue(0L); // force 32bit deltas
  } else {
    field.setLongValue(133L);
  }
  w.addDocument(doc);
}
w.forceMerge(1);
w.close();
dir.close();
// checkindex
{code}

{noformat}
[junit4:junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,6,TGRP-Test2GBDocValues]
[junit4:junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: -65536
[junit4:junit4]   2>    at __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
[junit4:junit4]   2>    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
[junit4:junit4]   2>    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
[junit4:junit4]   2> Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
[junit4:junit4]   2>    at org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
[junit4:junit4]   2>    at org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
[junit4:junit4]   2>    at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
[junit4:junit4]   2>    at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
[junit4:junit4]   2>    at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
[junit4:junit4]   2>    at org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
{noformat}
[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.6.0_37) - Build # 1471 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1471/
Java: 64bit/jdk1.6.0_37 -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 18872 lines...]
check-licenses:
     [echo] License check under: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr
 [licenses] MISSING sha1 checksum file for each of the ten jetty 8.1.7.v20120910 jars (continuation, deploy, http, io, jmx, security, server, servlet, util, webapp) under C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\lib\
[...truncated 2 lines...]

BUILD FAILED
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:67: The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\build.xml:223: The following error occurred while executing this line:
C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\lucene\tools\custom-tasks.xml:44: License check failed. Check the logs.

Total time: 49 minutes 19 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.6.0_37 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure
[jira] [Updated] (LUCENE-4547) PackedIntsDocValue field broken on large indexes
[ https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4547:
Attachment: test.patch

Here was my initial test, just screwing around. I ran it with 'ant test -Dtestcase=Test2GBDocValues -Dtests.nightly=true -Dtests.heapsize=5G'.

PackedIntsDocValue field broken on large indexes
Key: LUCENE-4547
URL: https://issues.apache.org/jira/browse/LUCENE-4547
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.1
Attachments: test.patch
[jira] [Commented] (LUCENE-4547) PackedIntsDocValue field broken on large indexes
[ https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492405#comment-13492405 ]

Robert Muir commented on LUCENE-4547:

There is even an out-of-coffee bug in the test: it's only using something like 2 bits per value :) So this is really even worse. I'm not sure we should be using ByteBlockPool etc. here; I think it shouldn't be used outside of the indexer.

PackedIntsDocValue field broken on large indexes
Key: LUCENE-4547
URL: https://issues.apache.org/jira/browse/LUCENE-4547
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.1
Attachments: test.patch
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b58) - Build # 2258 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2258/
Java: 64bit/jdk1.8.0-ea-b58 -XX:+UseSerialGC

All tests passed

Build Log:
[...truncated 19639 lines...]
check-licenses:
     [echo] License check under: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr
 [licenses] MISSING sha1 checksum file for each of the ten jetty 8.1.7.v20120910 jars (continuation, deploy, http, io, jmx, security, server, servlet, util, webapp) under /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/example/lib/
[...truncated 2 lines...]

BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:67: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/solr/build.xml:223: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/tools/custom-tasks.xml:44: License check failed. Check the logs.

Total time: 27 minutes 8 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Description set: Java: 64bit/jdk1.8.0-ea-b58 -XX:+UseSerialGC
Email was triggered for: Failure
Sending email for trigger: Failure
[jira] [Updated] (LUCENE-4547) DocValues field broken on large indexes
[ https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4547:
Summary: DocValues field broken on large indexes (was: PackedIntsDocValue field broken on large indexes)

Editing description: I think it affects more than PackedIntValues, actually. I think the bug is in how FixedStraightBytesImpl uses ByteBlockPool, so the problem should be far more widespread: if you have lots of documents in general, you are in trouble (norms should trip it too).

DocValues field broken on large indexes
Key: LUCENE-4547
URL: https://issues.apache.org/jira/browse/LUCENE-4547
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.1
Attachments: test.patch
[jira] [Created] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
Alessandro Tommasi created SOLR-4045:

Summary: SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
Key: SOLR-4045
URL: https://issues.apache.org/jira/browse/SOLR-4045
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.0
Environment: Linux, Ubuntu 12.04
Reporter: Alessandro Tommasi
Priority: Minor

When SOLR is configured in multicore mode, cores with a '.' (dot) in their names are inaccessible via the admin web GUI (localhost:8983/solr). The page shows an alert with the following message (test.test was my core name):

404 Not Found
get #/test.test

To replicate: start Solr in multicore mode, go to localhost:8983/solr, create a new core named test.test via the core admin, then refresh the page. test.test will show under the menu at the bottom left. Clicking on it triggers the message, and no core menu appears.
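One plausible failure mode for dotted core names (hypothetical, not confirmed from the Solr admin UI source) is a fragment-routing pattern that only accepts "word-like" names, so '#/test.test' never matches any route and falls through to the 404 handler. The patterns below are invented to demonstrate the effect:

```python
import re

# Hypothetical fragment router: a core route like '#/<name>' matched with
# a parameter pattern. A pattern that excludes '.' (common for word-like
# names) rejects 'test.test' outright, producing exactly a 404-style miss.
NARROW = re.compile(r"^#/(\w+)$")    # \w excludes '.', so 'test.test' fails
WIDE = re.compile(r"^#/([^/]+)$")    # anything but '/' accepts dotted names

def resolve(pattern, fragment):
    """Return the captured core name, or None (the 404 case)."""
    m = pattern.match(fragment)
    return m.group(1) if m else None
```

Under the narrow pattern, 'collection1' routes fine while 'test.test' does not, matching the observed behavior where only the dotted core 404s.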
[jira] [Commented] (LUCENE-4547) DocValues field broken on large indexes
[ https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492421#comment-13492421 ]

Robert Muir commented on LUCENE-4547:

Another bug is that I had to pass tests.heapsize at all. I think it's bad that docvalues gobbles up so much RAM when merging. Can't we merge this stuff from disk?

DocValues field broken on large indexes
Key: LUCENE-4547
URL: https://issues.apache.org/jira/browse/LUCENE-4547
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Priority: Blocker
Fix For: 4.1
Attachments: test.patch
[jira] [Created] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
Per Steffensen created SOLR-4046: Summary: An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical CloudSolrServer saves urlList, leaderUrlList and replicasList on instance level, and only recalculates those lists in case of clusterState changes. The values calculated for the lists will be different for different target-collections. Therefore they also ought to be recalculated for a request R, if the target-collection for R is different from the target-collection for the request handled just before R by the same CloudSolrServer instance. Another problem with the implementation in CloudSolrServer is with the lastClusterStateHashCode. lastClusterStateHashCode is updated when the first request after a clusterState-change is handled. Before the lastClusterStateHashCode is updated, one of the following two sets of lists is updated: * In case sendToLeader==true for the request: leaderUrlList and replicasList are updated, but not urlList * In case sendToLeader==false for the request: urlList is updated, but not leaderUrlList and replicasList But the lastClusterStateHashCode is always updated. So even if there were just one collection in the world, there would be a problem: If the first request after a clusterState-change is a sendToLeader==true-request, urlList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==false-request to the same CloudSolrServer instance. 
If the first request after a clusterState-change is a sendToLeader==false-request, leaderUrlList and replicasList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==true-request to the same CloudSolrServer instance. Besides that, it is a very bad idea to have instance and local method variables with the same name. CloudSolrServer has an instance variable called urlList, and the method CloudSolrServer.request has a local variable called urlList while also operating on the instance variable urlList. This makes the code hard to read. I haven't made a test within Apache Solr to reproduce the main error (the one mentioned at the top above), but I guess you can easily do it yourself: Make a setup with two collections, collection1 and collection2 - no default collection. Add some documents to collection2 (without any autocommit). Then do cloudSolrServer.commit(collection1) and afterwards cloudSolrServer.commit(collection2) (use the same instance of CloudSolrServer). Then try to search collection2 for the documents you inserted into it. They ought to be found, but are not, because cloudSolrServer.commit(collection2) will not do a commit of collection2 - it will actually do a commit of collection1. Well, actually you can't do cloudSolrServer.commit(collection-name) (the method doesn't exist), but that ought to be corrected too. You can do the following instead:
{code}
UpdateRequest req = new UpdateRequest();
req.setAction(UpdateRequest.ACTION.COMMIT, true, true);
req.setParam(CoreAdminParams.COLLECTION, "collection-name");
req.process(cloudSolrServer);
{code}
In general I think you should add misc tests to your test-suite - tests that run Solr clusters with more than one collection and make clever tests on that. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
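The invalidation hazard described above is a general one: guarding two independently refreshed caches with a single version stamp lets one of them go stale. A minimal sketch of the broken pattern (hypothetical names, not CloudSolrServer's actual code):

```java
import java.util.Arrays;
import java.util.List;

// Illustration of the bug pattern: one version stamp guards two cached
// lists, but each request refreshes only one of them before bumping
// the stamp, so the other list can silently go stale.
class StaleCacheDemo {
    int lastSeenVersion = -1;
    List<String> urlList;        // refreshed only by non-leader requests
    List<String> leaderUrlList;  // refreshed only by leader requests

    void request(boolean sendToLeader, int clusterVersion, List<String> current) {
        if (clusterVersion != lastSeenVersion) {
            if (sendToLeader) {
                leaderUrlList = current;      // urlList NOT refreshed...
            } else {
                urlList = current;            // leaderUrlList NOT refreshed...
            }
            lastSeenVersion = clusterVersion; // ...yet the stamp says "up to date"
        }
    }
}
```

After a cluster change, a leader request followed by a non-leader request leaves urlList at its pre-change value, which is exactly the failure mode the report describes.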
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_09) - Build # 2249 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2249/ Java: 32bit/jdk1.7.0_09 -client -XX:+UseG1GC All tests passed Build Log: [...truncated 19521 lines...] check-licenses: [echo] License check under: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-continuation-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-deploy-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-http-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-io-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-jmx-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-security-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-server-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-servlet-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-util-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/example/lib/jetty-webapp-8.1.7.v20120910.jar [...truncated 2 lines...] 
BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:67: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:223: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/custom-tasks.xml:44: License check failed. Check the logs. Total time: 28 minutes 24 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseG1GC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
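The check-licenses failures above are about missing companion .sha1 files for the Jetty jars; each such file carries the hex SHA-1 digest of its jar. A sketch of computing that digest (illustrative only, not the Ant task's code):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Compute a lowercase hex SHA-1 digest - the content a jar's
// companion .sha1 checksum file holds.
class Sha1 {
    static String hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b & 0xff)); // unsigned hex, zero-padded
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // SHA-1 is a mandatory JCA algorithm
        }
    }
}
```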
[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
[ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492430#comment-13492430 ] Mark Miller commented on SOLR-4046: --- I think this is a dupe of SOLR-3920? An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. -- Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-4047) dataimporter.functions.encodeUrl throws Unable to encode expression: field.name with value: null
Igor Dobritskiy created SOLR-4047: - Summary: dataimporter.functions.encodeUrl throws Unable to encode expression: field.name with value: null Key: SOLR-4047 URL: https://issues.apache.org/jira/browse/SOLR-4047 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.0 Environment: Windows 7 Reporter: Igor Dobritskiy For some reason dataimporter.functions.encodeUrl stopped working after the update to Solr 4.0 from 3.5. Here is the error:
{code}
Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to encode expression: attach.name with value: null Processing Document # 1
{code}
Here is the data import config snippet:
{code}
...
<entity name="account" query="select name from accounts where account_id = '${attach.account_id}'">
  <entity name="img_index" processor="TikaEntityProcessor" dataSource="bin" format="text"
          url="http://example.com/data/${account.name}/attaches/${attach.item_id}/${dataimporter.functions.encodeUrl(attach.name)}">
    <field column="text" name="body"/>
  </entity>
</entity>
...
{code}
When I change it to *not* use dataimporter.functions.encodeUrl it works, but I need to URL-encode the file names as they have special chars in their names. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
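Until encodeUrl works again, one workaround is to pre-encode the file names before they reach the import config. Plain java.net.URLEncoder produces application/x-www-form-urlencoded output (spaces become '+'), so a URL path segment needs a small adjustment; a sketch (the '+' to '%20' step is an assumption about what these URLs need, not documented DIH behavior):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Encode a file name for use as a URL path segment.
// URLEncoder targets form encoding, so convert '+' back to '%20'.
class PathEncoder {
    static String encodeSegment(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }
}
```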
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.6.0_37) - Build # 1477 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1477/ Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 18960 lines...] check-licenses: [echo] License check under: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-continuation-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-deploy-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-http-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-io-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-jmx-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-security-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-server-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-servlet-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-util-8.1.7.v20120910.jar [licenses] MISSING sha1 checksum file for: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\example\lib\jetty-webapp-8.1.7.v20120910.jar [...truncated 2 lines...] 
BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:67: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build.xml:223: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\lucene\tools\custom-tasks.xml:44: License check failed. Check the logs. Total time: 45 minutes 53 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
[ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Per Steffensen updated SOLR-4046: - Attachment: SOLR-4046.patch I have made the following patch in our local version of Solr. The patch could be done in various ways, but I decided to get rid of unnecessary code-complexity at the expense of negligible performance optimizations. So the idea of calculating and caching the different lists and only recalculating them on clusterState-change is gone. The lists are calculated from the in-memory clusterState, and it cannot take many ms to calculate the lists for every request - the additional GC that comes out of it should also be negligible. The good thing is that the code becomes easier to read and understand. Well, of course you can choose a different approach. An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. -- Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical Attachments: SOLR-4046.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: concurrentmergescheduller
On Tue, Nov 6, 2012 at 10:43 PM, Robert Muir rcm...@gmail.com wrote: On Tue, Nov 6, 2012 at 6:32 AM, Michael McCandless luc...@mikemccandless.com wrote: While confusing, I think the code is actually nearly correct... My question is, who is going to create the MikeSays account? LOL :) Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
[ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492437#comment-13492437 ] Per Steffensen commented on SOLR-4046: -- Yes, Mark, that seems like a dupe of SOLR-3920, but the patch is very different. First I thought about making the same patch, where the idea is to keep maps of the lists, but I just think that if all this is only for performance reasons (not having to recalculate the lists every time) it is not worth the complexity in the code. Such an operation on in-memory stuff is negligible compared to what really uses time and resources in Solr, like storing to disk, sending stuff over the network etc. But anyway, you can use the patch if you want. I will consider whether we will use your solution on our side or stay with our own. Thanks a lot for responding. Regards, Per Steffensen An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. -- Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical Attachments: SOLR-4046.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (SOLR-4048) Add a getRecursive method to NamedList
Shawn Heisey created SOLR-4048: -- Summary: Add a getRecursive method to NamedList Key: SOLR-4048 URL: https://issues.apache.org/jira/browse/SOLR-4048 Project: Solr Issue Type: New Feature Affects Versions: 4.0 Reporter: Shawn Heisey Priority: Minor Fix For: 4.1 Most of the time when accessing data from a NamedList, what you'll be doing is using get() to retrieve another NamedList, and doing so over and over until you reach the final level, where you'll actually retrieve the value you want. I propose adding a method to NamedList which would do all that heavy lifting for you. I created the following method for my own code. It could be adapted fairly easily for inclusion into NamedList itself. The only reason I did not include it as a patch is because I figure you'll want to ensure it meets all your particular coding guidelines, and that the JavaDoc is much better than I have done here:
{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if response is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 *     String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(NamedList<Object> namedList,
    String[] args) throws CommonSolrException {
  NamedList<Object> list = null;
  Object value = null;
  try {
    for (String key : args) {
      if (list == null) {
        list = namedList;
      } else {
        list = (NamedList<Object>) value;
      }
      value = list.get(key);
    }
    return value;
  } catch (Exception e) {
    throw new SolrServerException("Failed to recursively parse NamedList", e);
  }
}
{code}
-- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
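The recursive lookup the proposed method performs is easy to check against plain nested maps; a stripped-down sketch of the same walk (hypothetical names, not the proposed NamedList API):

```java
import java.util.HashMap;
import java.util.Map;

// Walk a chain of keys through nested Map levels, returning the value
// found at the last key - the same idea as the proposed getRecursive,
// with Map standing in for NamedList.
class RecursiveGet {
    @SuppressWarnings("unchecked")
    static Object getRecursive(Map<String, Object> root, String... keys) {
        Object value = root;
        for (String key : keys) {
            // Each intermediate value is assumed to be another map level.
            value = ((Map<String, Object>) value).get(key);
        }
        return value;
    }
}
```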
[jira] [Updated] (SOLR-4048) Add a getRecursive method to NamedList
[ https://issues.apache.org/jira/browse/SOLR-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-4048: --- Description: Most of the time when accessing data from a NamedList, what you'll be doing is using get() to retrieve another NamedList, and doing so over and over until you reach the final level, where you'll actually retrieve the value you want. I propose adding a method to NamedList which would do all that heavy lifting for you. I created the following method for my own code. It could be adapted fairly easily for inclusion into NamedList itself. The only reason I did not include it as a patch is because I figure you'll want to ensure it meets all your particular coding guidelines, and that the JavaDoc is much better than I have done here:
{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if response is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 *     String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(NamedList<Object> namedList,
    String[] args) throws SolrServerException {
  NamedList<Object> list = null;
  Object value = null;
  try {
    for (String key : args) {
      if (list == null) {
        list = namedList;
      } else {
        list = (NamedList<Object>) value;
      }
      value = list.get(key);
    }
    return value;
  } catch (Exception e) {
    throw new SolrServerException("Failed to recursively parse NamedList", e);
  }
}
{code}
was: Most of the time when accessing data from a NamedList, what you'll be doing is using get() to retrieve another NamedList, and doing so over and over until you reach the final level, where you'll actually retrieve the value you want. I propose adding a method to NamedList which would do all that heavy lifting for you. I created the following method for my own code. It could be adapted fairly easily for inclusion into NamedList itself. The only reason I did not include it as a patch is because I figure you'll want to ensure it meets all your particular coding guidelines, and that the JavaDoc is much better than I have done here:
{code}
/**
 * Recursively parse a NamedList and return the value at the last level,
 * assuming that the object found at each level is also a NamedList. For
 * example, if response is the NamedList response from the Solr4 mbean
 * handler, the following code makes sense:
 *
 * String coreName = (String) getRecursiveFromResponse(response, new
 *     String[] { "solr-mbeans", "CORE", "core", "stats", "coreName" });
 *
 * @param namedList the NamedList to parse
 * @param args A list of values to recursively request
 * @return the object at the last level.
 * @throws SolrServerException
 */
@SuppressWarnings("unchecked")
private final Object getRecursiveFromResponse(NamedList<Object> namedList,
    String[] args) throws CommonSolrException {
  NamedList<Object> list = null;
  Object value = null;
  try {
    for (String key : args) {
      if (list == null) {
        list = namedList;
      } else {
        list = (NamedList<Object>) value;
      }
      value = list.get(key);
    }
    return value;
  } catch (Exception e) {
    throw new SolrServerException("Failed to recursively parse NamedList", e);
  }
}
{code}
[jira] [Commented] (SOLR-4048) Add a getRecursive method to NamedList
[ https://issues.apache.org/jira/browse/SOLR-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492441#comment-13492441 ] Shawn Heisey commented on SOLR-4048: Had to edit that. I have my own Exception type, I forgot to change one of the lines to SolrServerException. Add a getRecursive method to NamedList Key: SOLR-4048 URL: https://issues.apache.org/jira/browse/SOLR-4048 Project: Solr Issue Type: New Feature Affects Versions: 4.0 Reporter: Shawn Heisey Priority: Minor Fix For: 4.1 Most of the time when accessing data from a NamedList, what you'll be doing is using get() to retrieve another NamedList, and doing so over and over until you reach the final level, where you'll actually retrieve the value you want. I propose adding a method to NamedList which would do all that heavy lifting for you. I created the following method for my own code. It could be adapted fairly easily for inclusion into NamedList itself. The only reason I did not include it as a patch is because I figure you'll want to ensure it meets all your particular coding guidelines, and that the JavaDoc is much better than I have done here: {code} /** * Recursively parse a NamedList and return the value at the last level, * assuming that the object found at each level is also a NamedList. For * example, if response is the NamedList response from the Solr4 mbean * handler, the following code makes sense: * * String coreName = (String) getRecursiveFromResponse(response, new * String[] { solr-mbeans, CORE, core, stats, coreName }); * * * @param namedList the NamedList to parse * @param args A list of values to recursively request * @return the object at the last level. 
* @throws SolrServerException */ @SuppressWarnings("unchecked") private final Object getRecursiveFromResponse( NamedList<Object> namedList, String[] args) throws SolrServerException { NamedList<Object> list = null; Object value = null; try { for (String key : args) { if (list == null) { list = namedList; } else { list = (NamedList<Object>) value; } value = list.get(key); } return value; } catch (Exception e) { throw new SolrServerException( "Failed to recursively parse NamedList", e); } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
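Shawn's helper generalizes to any ordered name/value container. Below is a minimal, self-contained sketch of the same recursive walk, using a hypothetical MiniNamedList stand-in (not SolrJ's actual NamedList class) so it compiles without Solr on the classpath:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SolrJ's NamedList: an ordered list of
// name/value pairs where values may themselves be MiniNamedLists.
class MiniNamedList {
    private final List<SimpleEntry<String, Object>> entries = new ArrayList<>();

    MiniNamedList add(String name, Object value) {
        entries.add(new SimpleEntry<>(name, value));
        return this;
    }

    Object get(String name) {
        for (SimpleEntry<String, Object> e : entries) {
            if (e.getKey().equals(name)) return e.getValue();
        }
        return null;
    }

    // The proposed recursive lookup: descend one key per level, assuming
    // each intermediate value is itself a MiniNamedList.
    static Object getRecursive(MiniNamedList list, String... keys) {
        Object value = list;
        for (String key : keys) {
            value = ((MiniNamedList) value).get(key);
        }
        return value;
    }

    public static void main(String[] args) {
        MiniNamedList response = new MiniNamedList().add("CORE",
            new MiniNamedList().add("core",
                new MiniNamedList().add("stats",
                    new MiniNamedList().add("coreName", "collection1"))));
        String coreName = (String) getRecursive(response, "CORE", "core", "stats", "coreName");
        System.out.println(coreName); // prints "collection1"
    }
}
```

As in Shawn's version, a wrong intermediate key (or a leaf reached too early) surfaces as a ClassCastException or NullPointerException, which his method wraps into a single SolrServerException.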
[jira] [Updated] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
[ https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Matheis (steffkes) updated SOLR-4045: Fix Version/s: 5.0 Assignee: Stefan Matheis (steffkes) SOLR admin page returns HTTP 404 on core names containing a '.' (dot) - Key: SOLR-4045 URL: https://issues.apache.org/jira/browse/SOLR-4045 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Environment: Linux, Ubuntu 12.04 Reporter: Alessandro Tommasi Assignee: Stefan Matheis (steffkes) Priority: Minor Labels: admin, solr, webgui Fix For: 5.0 When SOLR is configured in multicore mode, cores with '.' (dot) in their names are inaccessible via the admin web GUI (localhost:8983/solr). The page shows an alert with the message (test.test was my core name): 404 Not Found get #/test.test To replicate: start solr in multicore mode, go to localhost:8983/solr, via core admin create a new core test.test, then refresh the page. test.test will show under the menu at the bottom left. Clicking on it causes the message, while no core menu appears. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
[ https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Matheis (steffkes) updated SOLR-4045: Attachment: SOLR-4045.patch [~alley] would you mind verifying this patch? Just to be sure that I didn't miss one controller. While changing all those files, I already thought that it would be good to have one central place holding a kind of core-pattern .. will try to change that as well, if that patch is okay SOLR admin page returns HTTP 404 on core names containing a '.' (dot) - Key: SOLR-4045 URL: https://issues.apache.org/jira/browse/SOLR-4045 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Environment: Linux, Ubuntu 12.04 Reporter: Alessandro Tommasi Assignee: Stefan Matheis (steffkes) Priority: Minor Labels: admin, solr, webgui Fix For: 5.0 Attachments: SOLR-4045.patch When SOLR is configured in multicore mode, cores with '.' (dot) in their names are inaccessible via the admin web GUI (localhost:8983/solr). The page shows an alert with the message (test.test was my core name): 404 Not Found get #/test.test To replicate: start solr in multicore mode, go to localhost:8983/solr, via core admin create a new core test.test, then refresh the page. test.test will show under the menu at the bottom left. Clicking on it causes the message, while no core menu appears. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4544) possible bug in ConcurrentMergeScheduler.merge(IndexWriter)
[ https://issues.apache.org/jira/browse/LUCENE-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492446#comment-13492446 ] Michael McCandless commented on LUCENE-4544: I think it needs more than cutting over to a thread pool to clean it up :) We've actually looked at using a thread pool (see LUCENE-2063) but it apparently wasn't straightforward ... if you can see a way that'd be nice :) But I think we should do that under a separate issue ... leave this one focused on the off-by-one on maxMergeCount. possible bug in ConcurrentMergeScheduler.merge(IndexWriter) Key: LUCENE-4544 URL: https://issues.apache.org/jira/browse/LUCENE-4544 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 5.0 Reporter: Radim Kolar Assignee: Michael McCandless Attachments: LUCENE-4544.patch from dev list: "I suspect that this code is broken. Lines 331 - 343 in org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter): mergeThreadCount() are currently active merges, they can be at most maxThreadCount; maxMergeCount is the number of queued merges, defaulted to maxThreadCount+2, and it can never be lower than maxThreadCount, which means that the condition in the while can never become true. synchronized(this) { long startStallTime = 0; while (mergeThreadCount() >= 1+maxMergeCount) { startStallTime = System.currentTimeMillis(); if (verbose()) { message("too many merges; stalling..."); } try { wait(); } catch (InterruptedException ie) { throw new ThreadInterruptedException(ie); } } While confusing, I think the code is actually nearly correct... but I would love to find some simplifications of CMS's logic (it's really hairy). It turns out mergeThreadCount() is allowed to go higher than maxThreadCount; when this happens, Lucene pauses mergeThreadCount()-maxThreadCount of those merge threads, and resumes them once threads finish (see updateMergeThreads). 
I.e., CMS will accept up to maxMergeCount merges (and launch threads for them), but will only allow maxThreadCount of those threads to be running at once. So what that while loop is doing is preventing more than maxMergeCount+1 threads from starting, and then pausing the incoming thread to slow down the rate of segment creation (since merging cannot keep up). But ... I think the 1+ is wrong ... it seems like it should just be mergeThreadCount() >= maxMergeCount(). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
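The off-by-one under discussion can be isolated in a couple of lines. This is an illustrative model of the stall decision only (the names are mine, not the actual ConcurrentMergeScheduler fields), contrasting the current condition with the suggested one:

```java
// Illustrative model of the CMS stall decision, not the real scheduler.
// CMS accepts up to maxMergeCount merges (running + paused) and stalls the
// thread that delivers new merges once the backlog is full.
class StallPolicy {
    final int maxMergeCount; // queued merges allowed, default maxThreadCount + 2

    StallPolicy(int maxMergeCount) { this.maxMergeCount = maxMergeCount; }

    // Condition currently in the code: stalls only once the merge count
    // exceeds maxMergeCount by one.
    boolean stallsCurrent(int mergeThreadCount) {
        return mergeThreadCount >= 1 + maxMergeCount;
    }

    // Condition suggested in the comment: stall as soon as maxMergeCount
    // merges are already in flight.
    boolean stallsProposed(int mergeThreadCount) {
        return mergeThreadCount >= maxMergeCount;
    }

    public static void main(String[] args) {
        StallPolicy p = new StallPolicy(5);
        System.out.println(p.stallsCurrent(5));  // false: a 6th merge may still start
        System.out.println(p.stallsProposed(5)); // true: backlog already full
    }
}
```

With maxMergeCount = 5, the current check lets a sixth merge begin before stalling, while the proposed check stalls as soon as five are in flight.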
[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492445#comment-13492445 ] Uwe Schindler commented on LUCENE-4546: --- Hi, I think the problem is your test case: The SorterTemplate in your test does not handle the pivot value correctly - The setPivot() and comparePivot() methods get the index to compare with, but setPivot must store the actual value of the pivot, not the index of the pivot. Your code just stores the pivot index. You can fix this by correctly implementing setPivot: this.pivot = x[i] and implement comparePivot accordingly. See ArrayUtil for an example. SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. 
Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
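The pivot contract Uwe describes is easy to demonstrate on a plain int[]: setPivot(i) must capture the element's *value*, because partitioning swaps elements and a stored index would silently start referring to a different element. A self-contained sketch that mimics SorterTemplate's hooks (setPivot/comparePivot/swap) without depending on Lucene:

```java
// Sketch of the SorterTemplate pivot contract on an int[] (not Lucene code).
// setPivot stores the pivot VALUE; comparePivot compares that saved value
// against the element currently at the given index.
class IntSorter {
    private final int[] x;
    private int pivot; // the pivot value, never its index

    IntSorter(int[] x) { this.x = x; }

    void setPivot(int i) { pivot = x[i]; }                     // capture value
    int comparePivot(int j) { return Integer.compare(pivot, x[j]); }
    void swap(int i, int j) { int t = x[i]; x[i] = x[j]; x[j] = t; }

    // Standard quicksort over [lo, hi] using only the three hooks above.
    void quicksort(int lo, int hi) {
        if (hi <= lo) return;
        setPivot((lo + hi) >>> 1);
        int i = lo, j = hi;
        while (i <= j) {
            while (comparePivot(i) > 0) i++;  // x[i] < pivot
            while (comparePivot(j) < 0) j--;  // x[j] > pivot
            if (i <= j) { swap(i, j); i++; j--; }
        }
        quicksort(lo, j);
        quicksort(i, hi);
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 3};
        new IntSorter(a).quicksort(0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // [1, 2, 3, 4, 5]
    }
}
```

Storing the index instead (pivot = i, comparing x[pivot] later) breaks as soon as the first swap moves the pivot element, which matches the failure Uwe diagnosed in the reporter's test harness.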
[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
[ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492447#comment-13492447 ] Mark Miller commented on SOLR-4046: --- Yeah, I honestly had the same thought when I was fixing - I almost just dropped the caching completely - it didn't seem like the perf would be much different and the code is complicated. It's mostly a random dice roll that I ended up keeping the caching. Mostly, I was too lazy to test if it mattered (even though intuitively, I doubt it would). I'll keep this open until I'm home from Germany and can look at it again. An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. -- Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical Attachments: SOLR-4046.patch CloudSolrServer saves urlList, leaderUrlList and replicasList on instance level, and only recalculates those lists in case of clusterState changes. The values calculated for the lists will be different for different target-collections. Therefore they also ought to be recalculated for a request R, if the target-collection for R is different from the target-collection for the request handled just before R by the same CloudSolrServer instance. Another problem with the implementation in CloudSolrServer is with the lastClusterStateHashCode. lastClusterStateHashCode is updated when the first request after a clusterState-change is handled. 
Before the lastClusterStateHashCode is updated, one of the following two sets of lists is updated: * In case sendToLeader==true for the request: leaderUrlList and replicasList are updated, but not urlList * In case sendToLeader==false for the request: urlList is updated, but not leaderUrlList and replicasList But the lastClusterStateHashCode is always updated. So even if there were just one collection in the world, there is a problem: If the first request after a clusterState-change is a sendToLeader==true-request, urlList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==false-request to the same CloudSolrServer instance. If the first request after a clusterState-change is a sendToLeader==false-request, leaderUrlList and replicasList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==true-request to the same CloudSolrServer instance. Besides that, it is a very bad idea to have instance- and local-method-variables with the same name. CloudSolrServer has an instance variable called urlList, and method CloudSolrServer.request has a local-method-variable called urlList, and the method also operates on the instance variable urlList. This makes the code hard to read. I haven't made a test in the Apache Solr setup to reproduce the main error (the one mentioned at the top above), but I guess you can easily do it yourself: Make a setup with two collections collection1 and collection2 - no default collection. Add some documents to collection2 (without any autocommit). Then do cloudSolrServer.commit(collection1) and afterwards cloudSolrServer.commit(collection2) (use same instance of CloudSolrServer). Then try to search collection2 for the documents you inserted into it. They ought to be found, but are not, because the cloudSolrServer.commit(collection2) will not do a commit of collection2 - it will actually do a commit of collection1. 
Well, actually you can't do cloudSolrServer.commit(collection-name) (the method doesn't exist), but that ought to be corrected too. But you can do the following instead: {code} UpdateRequest req = new UpdateRequest(); req.setAction(UpdateRequest.ACTION.COMMIT, true, true); req.setParam(CoreAdminParams.COLLECTION, "collection-name"); req.process(cloudSolrServer); {code} In general I think you should add misc tests to your test-suite - tests that run Solr-clusters with more than one collection and make clever tests on that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492450#comment-13492450 ] Uwe Schindler commented on LUCENE-4546: --- By the way, we have a random test (TestArrayUtil) that does exactly the same, but the tested ArrayUtil handles the pivot value correctly, so it works correctly. If you use your failing example array and sort it with ArrayUtil, it passes. SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.6.0_37) - Build # 2259 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2259/ Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 28385 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 34 minutes 10 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.6.0_37 -client -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-4546: - Assignee: Uwe Schindler SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492457#comment-13492457 ] Nagendra Nagarajayya commented on SOLR-3816: @Otis: Apart from the performance improvement, realtime-search makes available a realtime (NRT) view of the index, as opposed to the current Solr implementation's point-in-time snapshots of the index. So each search may return new results ... Need a more granular nrt system that is close to a realtime system. --- Key: SOLR-3816 URL: https://issues.apache.org/jira/browse/SOLR-3816 Project: Solr Issue Type: Improvement Components: clients - java, replication (java), search, SearchComponents - other, SolrCloud, update Affects Versions: 4.0 Reporter: Nagendra Nagarajayya Labels: nrt, realtime, replication, search, solrcloud, update Attachments: alltests_passed_with_realtime_turnedoff.log, SOLR-3816_4.0_branch.patch, SOLR-3816-4.x.trunk.patch, solr-3816-realtime_nrt.patch Need a more granular NRT system that is close to a realtime system. A realtime system should be able to reflect changes to the index as and when docs are added/updated to the index. soft-commit offers NRT and is more realtime friendly than hard commit but is limited by the dependency on the SolrIndexSearcher being closed and reopened and offers a coarse granular NRT. Closing and reopening of the SolrIndexSearcher may impact performance also. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4046) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a.
[ https://issues.apache.org/jira/browse/SOLR-4046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492459#comment-13492459 ] Per Steffensen commented on SOLR-4046: -- Well, the entire Apache Solr test-suite is still green with my fix - not that it makes any guarantee that the simplification does not matter :-) An instance of CloudSolrServer is not able to handle consecutive request on different collections o.a. -- Key: SOLR-4046 URL: https://issues.apache.org/jira/browse/SOLR-4046 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.0 Environment: Solr 4.0.0. Actually revision 1394844 on branch lucene_solr_4_0 but I believe that is the same Reporter: Per Steffensen Priority: Critical Attachments: SOLR-4046.patch CloudSolrServer saves urlList, leaderUrlList and replicasList on instance level, and only recalculates those lists in case of clusterState changes. The values calculated for the lists will be different for different target-collections. Therefore they also ought to be recalculated for a request R, if the target-collection for R is different from the target-collection for the request handled just before R by the same CloudSolrServer instance. Another problem with the implementation in CloudSolrServer is with the lastClusterStateHashCode. lastClusterStateHashCode is updated when the first request after a clusterState-change is handled. Before the lastClusterStateHashCode is updated, one of the following two sets of lists is updated: * In case sendToLeader==true for the request: leaderUrlList and replicasList are updated, but not urlList * In case sendToLeader==false for the request: urlList is updated, but not leaderUrlList and replicasList But the lastClusterStateHashCode is always updated. 
So even if there were just one collection in the world, there is a problem: If the first request after a clusterState-change is a sendToLeader==true-request, urlList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==false-request to the same CloudSolrServer instance. If the first request after a clusterState-change is a sendToLeader==false-request, leaderUrlList and replicasList will (potentially) be wrong (and will not be recalculated) for the next sendToLeader==true-request to the same CloudSolrServer instance. Besides that, it is a very bad idea to have instance- and local-method-variables with the same name. CloudSolrServer has an instance variable called urlList, and method CloudSolrServer.request has a local-method-variable called urlList, and the method also operates on the instance variable urlList. This makes the code hard to read. I haven't made a test in the Apache Solr setup to reproduce the main error (the one mentioned at the top above), but I guess you can easily do it yourself: Make a setup with two collections collection1 and collection2 - no default collection. Add some documents to collection2 (without any autocommit). Then do cloudSolrServer.commit(collection1) and afterwards cloudSolrServer.commit(collection2) (use same instance of CloudSolrServer). Then try to search collection2 for the documents you inserted into it. They ought to be found, but are not, because the cloudSolrServer.commit(collection2) will not do a commit of collection2 - it will actually do a commit of collection1. Well, actually you can't do cloudSolrServer.commit(collection-name) (the method doesn't exist), but that ought to be corrected too. 
But you can do the following instead: {code} UpdateRequest req = new UpdateRequest(); req.setAction(UpdateRequest.ACTION.COMMIT, true, true); req.setParam(CoreAdminParams.COLLECTION, "collection-name"); req.process(cloudSolrServer); {code} In general I think you should add misc tests to your test-suite - tests that run Solr-clusters with more than one collection and make clever tests on that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-4546: -- Attachment: TestSorterTemplate.java Attached the corrected testcase, which passes. BTW: Your SorterTemplate implementation fails with mergeSort completely :-) SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
[ https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492465#comment-13492465 ] Alessandro Tommasi commented on SOLR-4045: -- Thank you for your prompt action on this. I have tried the patch, but patching my existing 4.0 installation (as downloaded from the website) was a little troublesome, as those files that the patch indicated as being in: solr/webapp/web/js/scripts are actually under: solr-webapp/webapp/js/scripts in my installation. Replacing the paths in the patch and applying it, however, worked, and the web GUI seems to work w/o issues. (I did, however, have to open the web GUI in another browser, as mine seemed to have cached all those js files and refused to reload them unless I refreshed them one by one). SOLR admin page returns HTTP 404 on core names containing a '.' (dot) - Key: SOLR-4045 URL: https://issues.apache.org/jira/browse/SOLR-4045 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Environment: Linux, Ubuntu 12.04 Reporter: Alessandro Tommasi Assignee: Stefan Matheis (steffkes) Priority: Minor Labels: admin, solr, webgui Fix For: 5.0 Attachments: SOLR-4045.patch When SOLR is configured in multicore mode, cores with '.' (dot) in their names are inaccessible via the admin web GUI (localhost:8983/solr). The page shows an alert with the message (test.test was my core name): 404 Not Found get #/test.test To replicate: start solr in multicore mode, go to localhost:8983/solr, via core admin create a new core test.test, then refresh the page. test.test will show under the menu at the bottom left. Clicking on it causes the message, while no core menu appears. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-4546. --- Resolution: Not A Problem SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 4.1, 4.0, 3.6.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4045) SOLR admin page returns HTTP 404 on core names containing a '.' (dot)
[ https://issues.apache.org/jira/browse/SOLR-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492475#comment-13492475 ] Stefan Matheis (steffkes) commented on SOLR-4045: - {quote}as those files that the patch indicated as being in: solr/webapp/web/js/scripts are actually under: solr-webapp/webapp/js/scripts{quote} the source files are located in {{solr/webapp/web}}, whereas your running instance holds its copies of the source files in {{example/solr-webapp/webapp}}. But anyway, fine that it works -- I will work on the second version with a global corename-pattern to make changes like these easier in the future. Thanks Alessandro! SOLR admin page returns HTTP 404 on core names containing a '.' (dot) - Key: SOLR-4045 URL: https://issues.apache.org/jira/browse/SOLR-4045 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.0 Environment: Linux, Ubuntu 12.04 Reporter: Alessandro Tommasi Assignee: Stefan Matheis (steffkes) Priority: Minor Labels: admin, solr, webgui Fix For: 5.0 Attachments: SOLR-4045.patch When SOLR is configured in multicore mode, cores with '.' (dot) in their names are inaccessible via the admin web gui (localhost:8983/solr). The page shows an alert with the message (test.test was my core name): 404 Not Found get #/test.test To replicate: start solr in multicore mode, go to localhost:8983/solr, via core admin create a new core test.test, then refresh the page. test.test will show under the menu at the bottom left. Clicking on it causes the message, while no core menu appears.
[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492480#comment-13492480 ] Stefan Pohl commented on LUCENE-4546: - Thanks for the clarification, Uwe! Out of curiosity and for reference, are there any reasons for the abstraction requiring setPivot/comparePivot to be overridden? Using the implementation in my patch would make it possible to avoid overriding these methods at all, possibly being faster due to the removal of some calls (depending on JVM optimization) and possibly being slower due to a few more swaps and branches in the code. Pure speculation. SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, TestSorterTemplate.java
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.7.0_09) - Build # 1472 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1472/ Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC All tests passed Build Log: [...truncated 28989 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 54 minutes 23 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token
[ https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492487#comment-13492487 ] Tom Burton-West commented on SOLR-3589: --- Forgot to work from your latest patch with the synonyms test. I'll post a new backport of the patch with the synonyms test, against the latest 3.6x in svn, shortly Edismax parser does not honor mm parameter if analyzer splits a token - Key: SOLR-3589 URL: https://issues.apache.org/jira/browse/SOLR-3589 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.6, 4.0-BETA Reporter: Tom Burton-West Assignee: Robert Muir Attachments: SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, testSolr3589.xml.gz, testSolr3589.xml.gz With edismax mm set to 100% if one of the tokens is split into two tokens by the analyzer chain (i.e. fire-fly = fire fly), the mm parameter is ignored and the equivalent of OR query for fire OR fly is produced. This is particularly a problem for languages that do not use white space to separate words such as Chinese or Japanese. See these messages for more discussion: http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html
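A minimal model of the SOLR-3589 problem (illustrative Java with invented names, not the edismax code): mm is translated into a required count over top-level optional clauses, so when the analyzer splits one query term into a nested (fire OR fly) group, that group still counts as a single clause, and mm=100% no longer forces both words to match.

```java
// Hypothetical minimum-should-match accounting, for illustration only.
final class MmSketch {
    // Translate a percentage mm into a required clause count.
    static int requiredMatches(int topLevelOptionalClauses, int mmPercent) {
        return topLevelOptionalClauses * mmPercent / 100;
    }
}
```

With the query "fire-fly" analyzed into two tokens, the parser still sees one top-level clause, so requiredMatches(1, 100) == 1 is satisfied by either word alone, which is the OR behaviour the report describes.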
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.7.0_09) - Build # 2250 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2250/ Java: 64bit/jdk1.7.0_09 -XX:+UseG1GC All tests passed Build Log: [...truncated 28980 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:294: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 37 minutes 8 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 64bit/jdk1.7.0_09 -XX:+UseG1GC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_09) - Build # 2260 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2260/ Java: 32bit/jdk1.7.0_09 -client -XX:+UseSerialGC All tests passed Build Log: [...truncated 29062 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 32 minutes 47 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.7.0_09 -client -XX:+UseSerialGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_09) - Build # 1478 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1478/ Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 29089 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:294: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 53 minutes 25 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token
[ https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated SOLR-3589: -- Attachment: SOLR-3589-3.6.PATCH Backport to 3.6 r1406713. Includes synonyms test. Will test it against production later today Edismax parser does not honor mm parameter if analyzer splits a token - Key: SOLR-3589 URL: https://issues.apache.org/jira/browse/SOLR-3589 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.6, 4.0-BETA Reporter: Tom Burton-West Assignee: Robert Muir Attachments: SOLR-3589-3.6.PATCH, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, testSolr3589.xml.gz, testSolr3589.xml.gz With edismax mm set to 100% if one of the tokens is split into two tokens by the analyzer chain (i.e. fire-fly = fire fly), the mm parameter is ignored and the equivalent of OR query for fire OR fly is produced. This is particularly a problem for languages that do not use white space to separate words such as Chinese or Japanese. See these messages for more discussion: http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html
[jira] [Commented] (SOLR-3816) Need a more granular nrt system that is close to a realtime system.
[ https://issues.apache.org/jira/browse/SOLR-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492551#comment-13492551 ] Otis Gospodnetic commented on SOLR-3816: [~nnagarajayya] Hmmm maybe I'm missing something but if you set the soft commit in Solr to something very, very low, then yes, while it is still technically point in time view, that point in time is shifted so frequently that it looks like RT search to a human - new results can show up with every new search. So the effect can be as (N)RT as you choose with the soft commit frequency. I think the only Q is whether that approach vs. the approach in your patch yields better performance, and it looks like [~hsn] will test that soon and we're all anxiously waiting to see the results! :) Need a more granular nrt system that is close to a realtime system. --- Key: SOLR-3816 URL: https://issues.apache.org/jira/browse/SOLR-3816 Project: Solr Issue Type: Improvement Components: clients - java, replication (java), search, SearchComponents - other, SolrCloud, update Affects Versions: 4.0 Reporter: Nagendra Nagarajayya Labels: nrt, realtime, replication, search, solrcloud, update Attachments: alltests_passed_with_realtime_turnedoff.log, SOLR-3816_4.0_branch.patch, SOLR-3816-4.x.trunk.patch, solr-3816-realtime_nrt.patch Need a more granular NRT system that is close to a realtime system. A realtime system should be able to reflect changes to the index as and when docs are added/updated to the index. soft-commit offers NRT and is more realtime friendly than hard commit but is limited by the dependency on the SolrIndexSearcher being closed and reopened and offers a coarse granular NRT. Closing and reopening of the SolrIndexSearcher may impact performance also. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
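For reference, the near-real-time behaviour Otis describes is driven by the soft commit interval in solrconfig.xml; setting it very low makes the point-in-time view shift with nearly every search. The 1000 ms value below is only illustrative, not a recommendation from this thread:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- open a new searcher view (soft commit) at most every second -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```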
[JENKINS] Lucene-Solr-4.x-Linux (64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0) - Build # 2251 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/2251/ Java: 64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt All tests passed Build Log: [...truncated 28172 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:294: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 43 minutes 44 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 64bit/jrockit-jdk1.6.0_33-R28.2.4-4.1.0 -XnoOpt Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4544) possible bug in ConcurrentMergeScheduler.merge(IndexWriter)
[ https://issues.apache.org/jira/browse/LUCENE-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4544: --- Attachment: LUCENE-4544.patch Added test case ... I think it's ready. possible bug in ConcurrentMergeScheduler.merge(IndexWriter) Key: LUCENE-4544 URL: https://issues.apache.org/jira/browse/LUCENE-4544 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 5.0 Reporter: Radim Kolar Assignee: Michael McCandless Attachments: LUCENE-4544.patch, LUCENE-4544.patch from dev list: "I suspect that this code is broken. Lines 331 - 343 in org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter): mergeThreadCount() are currently active merges; they can be at most maxThreadCount. maxMergeCount is the number of queued merges, defaulted to maxThreadCount+2, and it can never be lower than maxThreadCount, which means that the condition in the while can never become true. synchronized(this) { long startStallTime = 0; while (mergeThreadCount() >= 1+maxMergeCount) { startStallTime = System.currentTimeMillis(); if (verbose()) { message("too many merges; stalling..."); } try { wait(); } catch (InterruptedException ie) { throw new ThreadInterruptedException(ie); } } }" While confusing, I think the code is actually nearly correct... but I would love to find some simplifications of CMS's logic (it's really hairy). It turns out mergeThreadCount() is allowed to go higher than maxThreadCount; when this happens, Lucene pauses mergeThreadCount()-maxThreadCount of those merge threads, and resumes them once threads finish (see updateMergeThreads). I.e., CMS will accept up to maxMergeCount merges (and launch threads for them), but will only allow maxThreadCount of those threads to be running at once. So what that while loop is doing is preventing more than maxMergeCount+1 threads from starting, and then pausing the incoming thread to slow down the rate of segment creation (since merging cannot keep up). But I think the 1+ is wrong ... it seems like it should just be mergeThreadCount() >= maxMergeCount().
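Michael's reading of the loop can be restated as a tiny predicate (a sketch with invented names, not the ConcurrentMergeScheduler source): the incoming indexing thread stalls while the number of merge threads is at or above a cap, and the open question in the mail is whether that cap should include the extra 1+.

```java
// Hypothetical restatement of the stall condition, for discussion only.
final class StallSketch {
    // Current code: stall while mergeThreadCount >= 1 + maxMergeCount,
    // i.e. it tolerates one extra in-flight merge before stalling.
    static boolean stallsCurrent(int mergeThreadCount, int maxMergeCount) {
        return mergeThreadCount >= 1 + maxMergeCount;
    }
    // Suggested in the mail: drop the 1+, stalling as soon as the queue is full.
    static boolean stallsProposed(int mergeThreadCount, int maxMergeCount) {
        return mergeThreadCount >= maxMergeCount;
    }
}
```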
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1473/ Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC All tests passed Build Log: [...truncated 28176 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 55 minutes 56 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token
[ https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492591#comment-13492591 ] Tom Burton-West commented on SOLR-3589: --- Hi Robert, I just put the backport to 3.6 up on our test server and pointed it to one of our production shards. The improvement for Chinese queries are dramatic. (Especially for longer queries like the TREC 5 queries, see examples below) When you have time, please look over the backport of the patch. I think it is fine but I would appreciate you looking it over. My understanding of your patch is that it just affects a small portion of the edismax logic, but I don't understand the edismax parser well enough to be sure there isn't some difference between 3.6 and 4.0 that I didn't account for in the patch. Thanks for working on this. Naomi and I are both very excited about this bug finally being fixed and want to put the fix into production soon. --- Example TREC 5 Chinese queries: num Number: CH4 E-title The newly discovered oil fields in China. C-title 中国大陆新发现的油田 40,135 items found for 中国大陆新发现的油田 with current implementation (due to dismax bug) 78 items found for 中国大陆新发现的油田 with patch num Number: CH10 E-title Border Trade in Xinjiang C-title 新疆的边境贸易 20,249 items found for 新疆的边境贸易 current implementation (with bug) 243 items found for 新疆的边境贸易 with patch. Edismax parser does not honor mm parameter if analyzer splits a token - Key: SOLR-3589 URL: https://issues.apache.org/jira/browse/SOLR-3589 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.6, 4.0-BETA Reporter: Tom Burton-West Assignee: Robert Muir Attachments: SOLR-3589-3.6.PATCH, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589.patch, SOLR-3589_test.patch, testSolr3589.xml.gz, testSolr3589.xml.gz With edismax mm set to 100% if one of the tokens is split into two tokens by the analyzer chain (i.e. 
fire-fly = fire fly), the mm parameter is ignored and the equivalent of OR query for fire OR fly is produced. This is particularly a problem for languages that do not use white space to separate words such as Chinese or Japanese. See these messages for more discussion: http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html
Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!
Looks like Yonik's Jetty upgrade caused these build failures ... Yonik can you fix? Thanks. And we all should try to remember to run ant precommit before committing ... Mike McCandless http://blog.mikemccandless.com On Wed, Nov 7, 2012 at 1:28 PM, Policeman Jenkins Server jenk...@sd-datasolutions.de wrote: Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1473/ Java: 32bit/jdk1.6.0_37 -server -XX:+UseParallelGC
BooleanFilter MUST clauses and getDocIdSet(acceptDocs)
I am about to write a Filter that only operates on a set of documents that have already passed other filter(s). It's rather expensive, since it has to use DocValues to examine a value and then determine if it's a match. So it scales O(n) where n is the number of documents it must see. The 2nd arg of getDocIdSet is Bits acceptDocs. Unfortunately Bits doesn't have an int iterator, but I can deal with that by seeing if it extends DocIdSet. I'm looking at BooleanFilter which I want to use, and I notice that it passes null to filter.getDocIdSet for acceptDocs, and it justifies this with the following comment: // we dont pass acceptDocs, we will filter at the end using an additional filter Uwe wrote this comment in relation to LUCENE-1536 (r1188624). For the MUST clause loop, couldn't it give it the accumulated bits of the MUST clauses? ~ David
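The pattern David is asking for -- hand each subsequent MUST clause the bits accumulated so far, so that an expensive per-document check only inspects survivors -- can be sketched with java.util.BitSet. This illustrates the intersection-chaining idea only; it is not Lucene's Bits/DocIdSet API.

```java
import java.util.BitSet;
import java.util.List;
import java.util.function.BiFunction;

final class MustChainSketch {
    // Each "filter" receives the docs accepted so far and returns its matches
    // restricted to that set, so an O(n) per-document check (e.g. a DocValues
    // lookup) only ever sees documents that passed the earlier clauses.
    static BitSet applyMustClauses(List<BiFunction<BitSet, Integer, BitSet>> filters,
                                   int maxDoc) {
        BitSet accepted = new BitSet(maxDoc);
        accepted.set(0, maxDoc);                  // start from all docs accepted
        for (BiFunction<BitSet, Integer, BitSet> f : filters) {
            accepted = f.apply(accepted, maxDoc); // pass the accumulated bits along
        }
        return accepted;
    }
}
```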
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.7.0_09) - Build # 1479 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows/1479/ Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC All tests passed Build Log: [...truncated 29069 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:294: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 52 minutes 25 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.7.0_09 -server -XX:+UseParallelGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!
I committed a fix On Wed, Nov 7, 2012 at 2:06 PM, Michael McCandless luc...@mikemccandless.com wrote: Looks like Yonik's Jetty upgrade caused these build failures ... Yonik can you fix? Thanks. And we all should try to remember to run ant precommit before committing ... Mike McCandless http://blog.mikemccandless.com
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492619#comment-13492619 ] Chris Russell commented on SOLR-2894: - Regarding my comment above: I have determined that this happens because, if you specify a limit for a field that you are not requesting facet counts for, Solr will not automatically over-request on that field, i.e. facet.pivot=somefield&f.somefield.facet.limit=10. This will make your pivots look wrong, because the limit of 10 will not be over-requested unless you add this line: facet.field=somefield. Since Solr does not do distributed pivoting yet, this has not surfaced before. I am working on an update to the patch that will correct this issue. Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Fix For: 4.1 Attachments: distributed_pivot.patch, distributed_pivot.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
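As a sketch of the workaround described in the comment (the host, core, and field names are hypothetical placeholders, not from the patch), the difference is a single extra parameter:

```java
// Hypothetical request URLs illustrating the over-request problem described
// above: a per-field facet limit on a pivot-only field is not over-requested
// until the field is also listed in facet.field.
public class PivotLimitWorkaround {
    public static void main(String[] args) {
        String base = "http://localhost:8983/solr/select?q=*:*&facet=true";
        // Pivot alone: f.somefield.facet.limit=10 will NOT be over-requested.
        String pivotOnly = base + "&facet.pivot=somefield&f.somefield.facet.limit=10";
        // Workaround from the comment: also request plain facet counts.
        String withWorkaround = pivotOnly + "&facet.field=somefield";
        System.out.println(withWorkaround);
    }
}
```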
Re: [JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1473 - Still Failing!
Thanks Robert! Mike McCandless http://blog.mikemccandless.com On Wed, Nov 7, 2012 at 2:27 PM, Robert Muir rcm...@gmail.com wrote: I committed a fix On Wed, Nov 7, 2012 at 2:06 PM, Michael McCandless luc...@mikemccandless.com wrote: Looks like Yonik's Jetty upgrade caused these build failures ... Yonik can you fix? Thanks. And we all should try to remember to run ant precommit before committing ... Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.6.0_37) - Build # 2262 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux/2262/ Java: 64bit/jdk1.6.0_37 -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 28365 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:294: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 31 minutes 26 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 64bit/jdk1.6.0_37 -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.6.0_37) - Build # 1474 - Still Failing!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows/1474/ Java: 32bit/jdk1.6.0_37 -server -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 28179 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\build.xml:294: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\extra-targets.xml:117: The following files are missing svn:eol-style (or binary svn:mime-type): * solr/licenses/jetty-continuation-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-deploy-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-http-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-io-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-jmx-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-security-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-server-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-servlet-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-util-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-webapp-8.1.7.v20120910.jar.sha1 * solr/licenses/jetty-xml-8.1.7.v20120910.jar.sha1 Total time: 59 minutes 7 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Description set: Java: 32bit/jdk1.6.0_37 -server -XX:+UseConcMarkSweepGC Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4482) Likely Zing JVM bug causes failures in TestPayloadNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4482. Resolution: Fixed The new Zing 5.5 release looks to have fixed this issue! I can now pass all Lucene/Solr tests with Zing ... at least two times :) Likely Zing JVM bug causes failures in TestPayloadNearQuery --- Key: LUCENE-4482 URL: https://issues.apache.org/jira/browse/LUCENE-4482 Project: Lucene - Core Issue Type: Bug Environment: Lucene trunk, rev 1397735 Zing: {noformat} java version 1.6.0_31 Java(TM) SE Runtime Environment (build 1.6.0_31-6) Java HotSpot(TM) 64-Bit Tiered VM (build 1.6.0_31-ZVM_5.2.3.0-b6-product-azlinuxM-X86_64, mixed mode) {noformat} Ubuntu 12.04 LTS 3.2.0-23-generic kernel Reporter: Michael McCandless Attachments: LUCENE-4482.patch I dug into one of the Lucene test failures when running with the Zing JVM (available free for open source devs...). At least one other test sometimes fails but I haven't dug into that yet.
I managed to get the failure easily reproduced: with the attached patch, on a rev 1397735 checkout, if you cd to lucene/core and run:

{noformat}
ant test -Dtests.jvms=1 -Dtests.seed=C3802435F5FB39D0 -Dtests.showSuccess=true
{noformat}

Then you'll hit several failures in TestPayloadNearQuery, eg:

{noformat}
Suite: org.apache.lucene.search.payloads.TestPayloadNearQuery
1 FAILED
2 NOTE: reproduce with: ant test -Dtestcase=TestPayloadNearQuery -Dtests.method=test -Dtests.seed=C3802435F5FB39D0 -Dtests.slow=true -Dtests.locale=ga -Dtests.timezone=America/Adak -Dtests.file.encoding=US-ASCII
ERROR 0.01s | TestPayloadNearQuery.test
Throwable #1: java.lang.RuntimeException: overridden idfExplain method in TestPayloadNearQuery.BoostingSimilarity was not called
at __randomizedtesting.SeedInfo.seed([C3802435F5FB39D0:4BD41BEF5B075428]:0)
at org.apache.lucene.search.similarities.TFIDFSimilarity.computeWeight(TFIDFSimilarity.java:740)
at org.apache.lucene.search.spans.SpanWeight.<init>(SpanWeight.java:62)
at org.apache.lucene.search.payloads.PayloadNearQuery$PayloadNearSpanWeight.<init>(PayloadNearQuery.java:147)
at org.apache.lucene.search.payloads.PayloadNearQuery.createWeight(PayloadNearQuery.java:75)
at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:648)
at org.apache.lucene.search.AssertingIndexSearcher.createNormalizedWeight(AssertingIndexSearcher.java:60)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:265)
at org.apache.lucene.search.payloads.TestPayloadNearQuery.test(TestPayloadNearQuery.java:146)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at
[jira] [Comment Edited] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492464#comment-13492464 ] Uwe Schindler edited comment on LUCENE-4546 at 11/7/12 9:25 PM: Attached the corrected testcase, which passes. -BTW: Your SorterTemplate implementation fails with with mergeSort completely- :-) _(this was incorrect, mergeSort does not use the pivot methods)_ was (Author: thetaphi): Attached the corrected testcase, which passes. BTW: Your SorterTemplate implementation fails with with mergeSort completely :-) SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. 
Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4546) SorterTemplate.quicksort incorrect
[ https://issues.apache.org/jira/browse/LUCENE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492720#comment-13492720 ] Uwe Schindler commented on LUCENE-4546: --- Hi Stefan, it has been some time since I worked on SorterTemplate, so I don't have all the facts in mind. There are different implementations of quicksort available, including some that work without explicit pivot methods (like yours). But as far as I remember, the performance tests showed that the additional swaps and compares added some slowdown (depending on the order of the input data), so the explicit pivot methods helped. The SorterTemplate quicksort implementation is also the one that was used in Lucene from the beginning, so I did not want to change the algorithm in a minor release. We could add some new performance tests with your implementation and compare the speed, but I think that e.g. CollectionUtil, which uses Collections.swap(), would get much slower with such a change. I agree, the class is very nice for sorting non-array data, but it is currently marked as @lucene.internal, so usability for non-Lucene code was never a design goal; performance was the only driving force :-) But I checked the javadocs: it is clearly documented that setPivot(i) has to store the value of slot i for later comparison with comparePivot(j). 
SorterTemplate.quicksort incorrect -- Key: LUCENE-4546 URL: https://issues.apache.org/jira/browse/LUCENE-4546 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.6.1, 4.0, 4.1 Reporter: Stefan Pohl Assignee: Uwe Schindler Labels: patch Fix For: 3.6.1, 4.0, 4.1 Attachments: SorterTemplate.java.patch, TestSorterTemplate.java, TestSorterTemplate.java On trying to use the very useful o.a.l.utils.SorterTemplate, I stumbled upon inconsistent sorting behaviour, of course, only a randomized test caught this;) Because SorterTemplate.quicksort is used in several places in the code (directly BytesRefList, ArrayUtil, BytesRefHash, CollectionUtil and transitively index and search), I'm a bit puzzled that this either hasn't been caught by another higher-level test or that neither my test nor my understanding of an insufficiency in the code is valid;) If the former holds and given that the same code is released in 3.6 and 4.0, this might even be a more critical issue requiring a higher priority than 'major'. So, can a second pair of eyes please have a timely look at the attached test and patch? Basically the current quicksort implementation seems to assume that luckily always the median is chosen as pivot element by grabbing the mid element, not handling the case where the initially chosen pivot ends up not in the middle. Hope this and the test helps to understand the issue. Reproducible, currently failing test and a patch attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
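To make the documented contract concrete, here is a minimal self-contained sketch (not Lucene's actual SorterTemplate, which has more methods) of a pivot-saving quicksort: setPivot(i) must copy the value at slot i, because later swaps may move the original element away from index i, and comparePivot(j) compares that saved value against slot j.

```java
// Minimal sketch of the SorterTemplate-style pivot contract described above.
abstract class PivotSorter {
    protected abstract void swap(int i, int j);
    protected abstract void setPivot(int i);    // remember the VALUE at slot i
    protected abstract int comparePivot(int j); // saved pivot vs. slot j

    void quicksort(int lo, int hi) {
        if (hi <= lo) return;
        setPivot((lo + hi) >>> 1);              // save the pivot value, not its index
        int left = lo, right = hi;
        while (left <= right) {
            while (comparePivot(left) > 0) left++;   // pivot > a[left]
            while (comparePivot(right) < 0) right--; // pivot < a[right]
            if (left <= right) { swap(left, right); left++; right--; }
        }
        quicksort(lo, right);
        quicksort(left, hi);
    }
}

class IntSorter extends PivotSorter {
    final int[] a;
    int pivot; // storage demanded by the setPivot/comparePivot contract
    IntSorter(int[] a) { this.a = a; }
    protected void swap(int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
    protected void setPivot(int i) { pivot = a[i]; }
    protected int comparePivot(int j) { return Integer.compare(pivot, a[j]); }

    public static void main(String[] args) {
        int[] a = {5, 3, 9, 1, 4, 9, 0, 7};
        new IntSorter(a).quicksort(0, a.length - 1);
        System.out.println(java.util.Arrays.toString(a)); // [0, 1, 3, 4, 5, 7, 9, 9]
    }
}
```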
[jira] [Commented] (SOLR-1916) investigate DIH use of default locale
[ https://issues.apache.org/jira/browse/SOLR-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492769#comment-13492769 ] James Dyer commented on SOLR-1916: -- Robert, I'm having a hard time finding a seed, locale, or timezone for which TestEvaluatorBag#testGetDateFormatEvaluator will fail. Can you provide more info? (Maybe my jvm doesn't support enough locales for me to get a failure-prone one?) investigate DIH use of default locale - Key: SOLR-1916 URL: https://issues.apache.org/jira/browse/SOLR-1916 Project: Solr Issue Type: Task Components: contrib - DataImportHandler Affects Versions: 3.1, 4.0-ALPHA Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.1 Attachments: SOLR-1916.patch This is a spinoff from LUCENE-2466. In that issue I changed my locale to various locales and found some problems in Lucene/Solr triggered by use of the default Locale. I noticed some use of the default locale for Date operations in DIH (TimeZone.getDefault/Locale.getDefault) and, while no tests fail, I think it might be better to support a locale parameter for this. The wiki documents that numeric parsing can support localized numeric formats: http://wiki.apache.org/solr/DataImportHandler#NumberFormatTransformer In both cases, I don't think we should ever use the default Locale. If no Locale is provided, I find that new Locale("") -- the Unicode root locale -- is a better default for a server situation in a lot of cases, as it won't change depending on the computer; or perhaps we just make Locale params mandatory for this. Finally, in both cases, if localized numbers/dates are explicitly supported, I think we should come up with a test strategy to ensure everything is working. One idea is to do something similar to or make use of Lucene's LocalizedTestCase. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
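A small illustration of the hazard under discussion (the date and pattern here are arbitrary examples, not DIH code): parsing a month name depends on the JVM's default locale unless an explicit locale is passed.

```java
// Date parsing with an explicit locale is deterministic across machines;
// relying on the default locale is what the issue warns against.
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class LocaleDemo {
    public static void main(String[] args) throws Exception {
        String s = "2012-Nov-07";
        // Locale-sensitive: "Nov" only parses under English-like default
        // locales, so this may throw ParseException on e.g. a Turkish JVM:
        // new SimpleDateFormat("yyyy-MMM-dd").parse(s);

        // Deterministic: explicit locale and time zone.
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MMM-dd", Locale.ROOT);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date d = f.parse(s);
        System.out.println(f.format(d)); // 2012-Nov-07 regardless of default locale
    }
}
```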
[jira] [Commented] (SOLR-1916) investigate DIH use of default locale
[ https://issues.apache.org/jira/browse/SOLR-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492775#comment-13492775 ] Robert Muir commented on SOLR-1916: --- James, thanks for looking at this!!! It may not be a locale issue but rather a time zone issue (or both). But this test definitely failed intermittently in the past. For example, it failed during a daylight saving time window (but only for developers in Europe!) and Chris Male addressed some of the issues in SOLR-1821. Fortunately, Uwe Schindler has made it dead easy to identify most of these issues: we no longer have to rely upon unit tests alone. http://blog.thetaphi.de/2012/07/default-locales-default-charsets-and.html DIH currently has 40 violations! Try this: {noformat} Index: build.xml === --- build.xml (revision 1406757) +++ build.xml (working copy) @@ -250,8 +250,6 @@ </apiFileSet> <fileset dir="${basedir}/build"> <include name="**/*.class" /> -<!-- exclude DIH for now as it is broken with Locales and Encodings: SOLR-1916 --> -<exclude name="contrib/solr-dataimporthandler*/**" /> </fileset> </forbidden-apis> </target> {noformat} Then run: {noformat} rmuir@beast:~/workspace/lucene-trunk/solr$ ant check-forbidden-apis ... -check-forbidden-java-apis: [forbidden-apis] Reading API signatures: /home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/commons-io.txt [forbidden-apis] Reading API signatures: /home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/executors.txt [forbidden-apis] Reading API signatures: /home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/jdk-deprecated.txt [forbidden-apis] Reading API signatures: /home/rmuir/workspace/lucene-trunk/lucene/tools/forbiddenApis/jdk.txt [forbidden-apis] Loading classes to check... [forbidden-apis] Scanning for API signatures and dependencies... 
[forbidden-apis] Forbidden method invocation: java.text.DecimalFormatSymbols#<init>()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestNumberFormatTransformer (TestNumberFormatTransformer.java:36)
[forbidden-apis] Forbidden method invocation: java.text.DecimalFormatSymbols#<init>()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestNumberFormatTransformer (TestNumberFormatTransformer.java:37)
[forbidden-apis] Forbidden method invocation: java.text.MessageFormat#<init>(java.lang.String)
[forbidden-apis] in org.apache.solr.handler.dataimport.DebugLogger (DebugLogger.java:52)
[forbidden-apis] Forbidden method invocation: java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDateFormatTransformer (TestDateFormatTransformer.java:43)
[forbidden-apis] Forbidden method invocation: java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDateFormatTransformer (TestDateFormatTransformer.java:66)
[forbidden-apis] Forbidden method invocation: java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis] in org.apache.solr.handler.dataimport.MailEntityProcessor (MailEntityProcessor.java:88)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDocBuilder2 (TestDocBuilder2.java:250)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDocBuilder2 (TestDocBuilder2.java:251)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDocBuilder2 (TestDocBuilder2.java:252)
[forbidden-apis] Forbidden method invocation: java.lang.String#getBytes()
[forbidden-apis] in org.apache.solr.handler.dataimport.TestDocBuilder2 (TestDocBuilder2.java:257)
[forbidden-apis] Forbidden method invocation: java.text.SimpleDateFormat#<init>(java.lang.String)
[forbidden-apis] in org.apache.solr.handler.dataimport.DataImporter$3 (DataImporter.java:490)
[forbidden-apis] Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis] in org.apache.solr.handler.dataimport.DocBuilder (DocBuilder.java:711)
[forbidden-apis] Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis] in org.apache.solr.handler.dataimport.DocBuilder (DocBuilder.java:717)
[forbidden-apis] Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis] in org.apache.solr.handler.dataimport.DocBuilder (DocBuilder.java:725)
[forbidden-apis] Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[])
[forbidden-apis] in org.apache.solr.handler.dataimport.DocBuilder (DocBuilder.java:727)
[forbidden-apis] Forbidden method invocation:
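For context, a hedged sketch of the kinds of replacements that silence these forbidden-apis violations (the exact fixes applied in DIH may differ): an explicit charset for String#getBytes and an explicit Locale for SimpleDateFormat and String.format.

```java
// Locale/charset-explicit variants of the flagged default-locale calls.
import java.nio.charset.StandardCharsets;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class ForbiddenApiFixes {
    public static void main(String[] args) {
        // Instead of "häuser".getBytes() -- default charset:
        byte[] utf8 = "häuser".getBytes(StandardCharsets.UTF_8);
        // Instead of String.format(fmt, args) -- default locale:
        String msg = String.format(Locale.ROOT, "%d docs", 42);
        // Instead of new SimpleDateFormat(pattern) -- default locale:
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
        System.out.println(utf8.length + " " + msg + " " + df.toPattern());
    }
}
```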
[jira] [Resolved] (LUCENE-4527) CompressingStoredFieldsFormat: encode numStoredFields more efficiently
[ https://issues.apache.org/jira/browse/LUCENE-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-4527. -- Resolution: Fixed Committed - trunk: r1406704 - branch 4.x: r1406712 CompressingStoredFieldsFormat: encode numStoredFields more efficiently -- Key: LUCENE-4527 URL: https://issues.apache.org/jira/browse/LUCENE-4527 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1 Attachments: LUCENE-4527.patch, LUCENE-4527.patch Another interesting idea from Robert: many applications have a schema and all documents are likely to have the same number of stored fields. We could save space by using packed ints and the same kind of optimization as {{ForUtil}} (requiring only one VInt if all values are equal). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
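A hedged sketch of the encoding idea described above (names and layout are mine, not Lucene's actual CompressingStoredFieldsFormat code): when every document in a chunk has the same stored-field count, a single value suffices; otherwise a packed-ints-style bit width plus per-document counts is written.

```java
// Toy encoder: {0, v} when all counts equal v, else {bitsPerValue, c0, c1, ...}.
import java.util.Arrays;

public class NumStoredFieldsDemo {
    static int[] encode(int[] counts) {
        int first = counts[0];
        boolean uniform = Arrays.stream(counts).allMatch(c -> c == first);
        if (uniform) {
            // Single "VInt" stands in for the whole chunk.
            return new int[] {0, first};
        }
        // Otherwise record a packed-ints width and the raw counts.
        int max = Arrays.stream(counts).max().getAsInt();
        int bits = 32 - Integer.numberOfLeadingZeros(max);
        int[] out = new int[counts.length + 1];
        out[0] = bits;
        System.arraycopy(counts, 0, out, 1, counts.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(new int[] {7, 7, 7, 7}))); // [0, 7]
        System.out.println(Arrays.toString(encode(new int[] {3, 9, 1})));    // [4, 3, 9, 1]
    }
}
```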
[jira] [Updated] (SOLR-3855) DocValues support
[ https://issues.apache.org/jira/browse/SOLR-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated SOLR-3855: --- Attachment: SOLR-3855.patch New patch: - ability to have direct doc values, - doc values are not fetched by default, you need to explicitly add their name to the fl parameter to load them, - all tests pass except BasicDistributedZkTest.testDistribSearch, but it doesn't pass either without the patch applied on my (very slow...) laptop. This patch is not perfect... for example I am not happy that I had to add a new createDocValuesFields method in FieldType. The reason is that only poly fields are allowed to return several fields in createFields, but I think this would require a more global refactoring and should not block this issue? If you want to play with doc values and Solr, I modified the example schema.xml so that popularity and inStock have doc values enabled. You can try to display their values, sort on them and/or use function queries on them. When a field is indexed and has doc values, the patch always tries to use doc values instead of the field cache. DocValues support - Key: SOLR-3855 URL: https://issues.apache.org/jira/browse/SOLR-3855 Project: Solr Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.1, 5.0 Attachments: SOLR-3855.patch, SOLR-3855.patch It would be nice if Solr supported DocValues: - for ID fields (fewer disk seeks when running distributed search), - for sorting/faceting/function queries (faster warmup time than fieldcache), - better on-disk and in-memory efficiency (you can use packed impls). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
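For reference, a hypothetical schema.xml fragment of the kind the comment describes for the popularity and inStock example fields. The docValues attribute name is an assumption based on how this feature later shipped in Solr, not something confirmed by this patch:

```xml
<!-- Hypothetical fragment: enable doc values on the example fields the
     comment mentions; attribute and type names are assumptions. -->
<field name="popularity" type="int" indexed="true" stored="true" docValues="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true" docValues="true"/>
```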
[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13492892#comment-13492892 ] Chris Male commented on LUCENE-4542: Rafał, Thanks for creating the patches, they are looking great. Couple of very small improvements: - Can we mark recursionCap as final? - Can we improve the javadoc for the recursionCap parameter so it's clear what purpose it serves? - Maybe also drop in a comment at the field about how the recursion cap of 2 is the default value based on documentation about Hunspell (as opposed to something we arbitrarily chose). Make RECURSION_CAP in HunspellStemmer configurable -- Key: LUCENE-4542 URL: https://issues.apache.org/jira/browse/LUCENE-4542 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.0 Reporter: Piotr Assignee: Chris Male Attachments: LUCENE-4542.patch, LUCENE-4542-with-solr.patch Currently there is private static final int RECURSION_CAP = 2; in the code of the class HunspellStemmer. It makes using hunspell with several dictionaries almost unusable, due to bad performance (f.ex. it costs 36ms to stem long sentence in latvian for recursion_cap=2 and 5 ms for recursion_cap=1). It would be nice to be able to tune this number as needed. AFAIK this number (2) was chosen arbitrary. (it's a first issue in my life, so please forgive me any mistakes done). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
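A hedged sketch of the review suggestions above applied together (class, field, and constructor names are illustrative, not the patch's exact code): a final, documented recursionCap with a commented default of 2.

```java
// Sketch of a configurable recursion cap as suggested in the review.
public class StemmerSketch {
    // Default of 2 mirrors Hunspell's documented behaviour rather than an
    // arbitrary choice (per the review comment above).
    private static final int DEFAULT_RECURSION_CAP = 2;

    /** Maximum recursion depth used during affix stripping; lower values
     *  trade some stemming completeness for speed. Marked final per review. */
    private final int recursionCap;

    public StemmerSketch() {
        this(DEFAULT_RECURSION_CAP);
    }

    public StemmerSketch(int recursionCap) {
        this.recursionCap = recursionCap;
    }

    public int getRecursionCap() {
        return recursionCap;
    }

    public static void main(String[] args) {
        System.out.println(new StemmerSketch().getRecursionCap());  // 2
        System.out.println(new StemmerSketch(1).getRecursionCap()); // 1
    }
}
```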
RE: BooleanFilter MUST clauses and getDocIdSet(acceptDocs)
Hi David, the idea of passing the already built bits for the MUST clauses is a good one and can be implemented easily. The reason why the acceptDocs were not passed down is the new way filters work in Lucene 4.0, and to optimize caching. Because acceptDocs are the only thing that changes when deletions are applied, filters are required to handle them separately: whenever something is able to cache (e.g. CachingWrapperFilter), the acceptDocs are not cached, so the underlying filters get a null acceptDocs to produce the full bitset, and the filtering is done when CachingWrapperFilter gets the up-to-date acceptDocs. But in this case that does not matter: even if the first filter clause does not get acceptDocs, later MUST clauses can of course get them (they are not deletion-specific)! Can you open an issue to optimize the MUST case (possibly MUST_NOT, too)? Another thing that could help here: you can stop using BooleanFilter if you can apply the filters sequentially (only MUST clauses) by wrapping with multiple FilteredQuery: new FilteredQuery(new FilteredQuery(originalQuery, clause1), clause2). If the DocIdSets enable bits() and the FilteredQuery autodetection decides to use random-access filters, the acceptDocs are also passed down from the outside to the inner one, removing the documents filtered out. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: david.w.smi...@gmail.com [mailto:david.w.smi...@gmail.com] Sent: Wednesday, November 07, 2012 8:23 PM To: dev@lucene.apache.org Subject: BooleanFilter MUST clauses and getDocIdSet(acceptDocs) I am about to write a Filter that only operates on a set of documents that have already passed other filter(s). It's rather expensive, since it has to use DocValues to examine a value and then determine if it's a match. So it scales O(n) where n is the number of documents it must see. The 2nd arg of getDocIdSet is Bits acceptDocs. 
Unfortunately Bits doesn't have an int iterator, but I can deal with that by seeing if it extends DocIdSet. I'm looking at BooleanFilter, which I want to use, and I notice that it passes null to filter.getDocIdSet for acceptDocs, justifying this with the following comment: // we dont pass acceptDocs, we will filter at the end using an additional filter Uwe wrote this comment in relation to LUCENE-1536 (r1188624). For the MUST clause loop, couldn't it be given the accumulated bits of the preceding MUST clauses? ~ David
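A pure-Java sketch of the optimization being proposed, with java.util.BitSet standing in for Lucene's Bits/DocIdSet (an analogy, not the Lucene API): each MUST clause receives the bits accumulated by the previous clauses, so an expensive O(n) filter only examines documents that survived the cheaper ones.

```java
// Sequential MUST filtering: each clause only sees already-accepted docs.
import java.util.BitSet;
import java.util.function.IntPredicate;

public class SequentialFilterDemo {
    static BitSet applyMust(BitSet acceptDocs, IntPredicate clause) {
        BitSet result = new BitSet();
        // Only documents already accepted reach the (possibly expensive) clause.
        for (int doc = acceptDocs.nextSetBit(0); doc >= 0;
                 doc = acceptDocs.nextSetBit(doc + 1)) {
            if (clause.test(doc)) result.set(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 8;
        BitSet all = new BitSet();
        all.set(0, maxDoc);
        BitSet cheap = applyMust(all, doc -> doc % 2 == 0);  // cheap clause first
        BitSet expensive = applyMust(cheap, doc -> doc < 5); // sees only 4 docs, not 8
        System.out.println(expensive);                       // {0, 2, 4}
    }
}
```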
Fwd: [concurrency-interest] _interrupted field visibility bug in OpenJDK 7+
Thought you'd be interested. I don't think it affects us but it's good to know about it. Reproduces for me all the time on newer hotspots. New (invisible) bug entry is at: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8003135 Dawid -- Forwarded message -- From: Dr Heinz M. Kabutz he...@javaspecialists.eu Date: Wed, Nov 7, 2012 at 11:00 PM Subject: [concurrency-interest] _interrupted field visibility bug in OpenJDK 7+ To: concurrency-inter...@cs.oswego.edu During a hands-on session today of my new Concurrency Specialist Course, one of my students discovered what we think might be an interesting and potentially serious bug in the JVM. It seems that the Server HotSpot in OpenJDK 7 may sometimes hoist the value of the _interrupted field. This is interesting, since the value is not stored in Java, but rather in the OSThread.hpp file in the jint _interrupted field. It is also pretty serious, because it means we cannot rely on the interrupted status in order to shut down threads. This will affect Future.cancel(), ExecutorService.shutdownNow() and a whole bunch of other mechanisms that use interruptions to cooperatively cancel tasks. (Obviously the exercise was more involved than the code presented in this email, after all the course is aimed at intermediate to advanced Java developers. So please don't expect that this won't happen in your code - I've just taken away unnecessary code until we can see the bug without any of the paraphernalia that might distract.) First off, some code that works as expected. 
As soon as you interrupt the thread, it breaks out of the while() loop and exits:

  public void think() {
    while (true) {
      if (Thread.currentThread().isInterrupted()) break;
    }
    System.out.println("We're done thinking");
  }

However, if you extract the Thread.currentThread().isInterrupted() call into a separate method, then that might be optimized by HotSpot to always return false, and the code never ends:

  public void think() {
    while (true) {
      if (checkInterruptedStatus()) break;
    }
    System.out.println("We're done thinking");
  }

  private boolean checkInterruptedStatus() {
    return Thread.currentThread().isInterrupted();
  }

My assumption is that the checkInterruptedStatus() method is aggressively optimized and the actual status is not read again. This does not happen with the client HotSpot, nor with Java 1.6.0_37. It does happen with the 1.8 EA that I've got on my MacBook Pro. The student was using a Windows machine, so this is not just a Mac problem.

Here is the complete code:

  public class InterruptedVisibilityTest {
    public void think() {
      while (true) {
        if (checkInterruptedStatus()) break;
      }
      System.out.println("We're done thinking");
    }

    private boolean checkInterruptedStatus() {
      return Thread.currentThread().isInterrupted();
    }

    public static void main(String[] args) throws InterruptedException {
      final InterruptedVisibilityTest test = new InterruptedVisibilityTest();
      Thread thinkerThread = new Thread("Thinker") {
        public void run() {
          test.think();
        }
      };
      thinkerThread.start();
      Thread.sleep(500);
      thinkerThread.interrupt();
      long timeOfInterruption = System.currentTimeMillis();
      thinkerThread.join(500);
      if (thinkerThread.isAlive()) {
        System.err.println("Thinker did not shut down within 500ms");
        System.err.println("Error in Java Virtual Machine!");
        System.err.println("Interrupted: " + thinkerThread.isInterrupted());
        System.err.println();
        System.err.println("(Let's see if the thread ever dies and how long it takes)");
        while (thinkerThread.isAlive()) {
          thinkerThread.join(1000);
          if (thinkerThread.isAlive()) {
            System.err.println(" ... still waiting");
          }
        }
      }
      System.err.println("Finally, the thread has died - that took " +
          (System.currentTimeMillis() - timeOfInterruption) + "ms");
    }
  }

As I said, the original code was more involved, but this demonstrates the essentials. I hope some of you might be able to take a look at what's going on.

Regards

Heinz

--
Dr Heinz M. Kabutz (PhD CompSci)
Author of "The Java(tm) Specialists' Newsletter"
Sun Java Champion
IEEE Certified Software Development Professional
http://www.javaspecialists.eu
Tel: +30 69 75 595 262
Skype: kabutz

___
Concurrency-interest mailing list
concurrency-inter...@cs.oswego.edu
http://cs.oswego.edu/mailman/listinfo/concurrency-interest

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
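[Editorial addition, not from Heinz's mail.] For code that cannot afford to depend on the JVM's interrupt-flag visibility, a common defensive pattern is to pair interruption with an explicit AtomicBoolean (or volatile) cancellation flag, which the memory model guarantees is re-read on every iteration. A minimal sketch (class and method names are illustrative):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Defensive cancellation: an explicit atomic flag alongside the
// interrupt status, so shutdown does not hinge on _interrupted
// visibility alone.
public class CancellableWorker implements Runnable {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    public void cancel() {
        cancelled.set(true);  // atomic write: guaranteed visible to the worker
    }

    @Override
    public void run() {
        // The atomic read cannot be hoisted out of the loop.
        while (!cancelled.get() && !Thread.currentThread().isInterrupted()) {
            // ... do a unit of work ...
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CancellableWorker worker = new CancellableWorker();
        Thread t = new Thread(worker, "worker");
        t.start();
        Thread.sleep(100);
        worker.cancel();   // explicit flag
        t.interrupt();     // belt and braces
        t.join(1000);
        System.out.println("worker alive: " + t.isAlive());
    }
}
```

This is the same cooperative-cancellation idea as Thread.interrupt(), just carried over a channel whose visibility semantics are defined in Java rather than in the VM's native code.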
Compressed stored fields and multiGet(sorted luceneId[])?
Just a theoretical question: would it make sense to add some sort of StoredDocument[] bulkGet(int[] docIds) to fetch multiple stored documents in one go? The reasoning is that with compressed blocks, random access is now more expensive, and in some cases a user needs to fetch many documents at once. If several of those documents happen to come from the same block, that is a win. I would also assume that even without compression, bulk access on sorted docIds could be a win (sequential access)? Does that make sense, and is it doable? Or, even worse, does it already exist :)

By the way, I am impressed by how well compression does, even on really short stored documents: at approx. 150 bytes per document we observe a 35% reduction. Fetching 1000 short documents on a fully cached index is observably slower (2-3 times), but as soon as your memory gets low, compression wins quickly. I did not test it thoroughly, but it looks good so far. Great job!
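[Editorial addition.] The core of the proposal - sort the requested ids so that ids falling in the same compressed block are served together, decompressing each block only once - can be sketched without any Lucene dependency. The class, the BLOCK_SIZE constant, and the grouping helper below are all hypothetical, purely to illustrate the access pattern:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of the proposed bulkGet(int[] docIds) idea: group sorted doc ids
// by their (hypothetical) compression block, so each block would be
// decompressed at most once and blocks are visited in sequential order.
public class BulkGetSketch {
    static final int BLOCK_SIZE = 16;  // hypothetical docs per compressed block

    // Returns, per block index, the ascending doc ids that fall in it.
    public static SortedMap<Integer, List<Integer>> groupByBlock(int[] docIds) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);           // sequential access order
        SortedMap<Integer, List<Integer>> byBlock = new TreeMap<>();
        for (int id : sorted) {
            byBlock.computeIfAbsent(id / BLOCK_SIZE, k -> new ArrayList<>()).add(id);
        }
        return byBlock;
    }

    public static void main(String[] args) {
        int[] request = {40, 3, 18, 5, 41};
        // 3 and 5 share block 0; 40 and 41 share block 2 - two decompressions
        // saved versus five independent random-access fetches.
        System.out.println(groupByBlock(request));
    }
}
```

A real implementation would then iterate the map, decompress each block once, and pull the requested documents out of the decompressed buffer; the unsorted case degenerates to today's one-decompression-per-document behavior.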