[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search
[ https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283161#comment-13283161 ] Sami Siren commented on SOLR-3433: It seems this has been fixed in trunk: SOLR-3035. Alex, can you please give me some more details on how you tested this and with what versions? In particular, did you still see this happen on trunk?

binary field returns differently when do the distribute search

Key: SOLR-3433
URL: https://issues.apache.org/jira/browse/SOLR-3433
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5, 3.6, 4.0
Environment: linux, amazon ec2
Reporter: Alex Liu

When multiple nodes are installed (more than one node), repeated searches through Solr return the binary data differently each time:

<lst name="responseHeader"><int name="status">0</int><int name="QTime">26</int><lst name="params"><str name="q">text_col:woodman</str></lst></lst><result name="response" numFound="1" start="0" maxScore="0.13258252"><doc><str name="binary_col">[B:[B@714fef9f</str>

<lst name="responseHeader"><int name="status">0</int><int name="QTime">11</int><lst name="params"><str name="q">text_col:woodman</str></lst></lst><result name="response" numFound="1" start="0" maxScore="0.13258252"><doc><str name="binary_col">[B:[B@4be22114</str>

Check this link; someone reported the same issue: http://grokbase.com/t/lucene/solr-user/11beyhmxjw/distributed-search-and-binary-fields-w-solr-3-4

It works for a single node but fails for multiple nodes; it's something related to distributed search.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
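The "[B@714fef9f" / "[B@4be22114" values in the two responses are a strong hint at the mechanism: that string is Java's default Object.toString() for a byte[], which encodes object identity rather than content, so it differs on every run. A minimal illustration (plain Java, not Solr code; java.util.Base64 assumes Java 8+ and is shown only as one stable way to render binary content):

```java
import java.util.Arrays;
import java.util.Base64;

public class BinaryFieldToString {
    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};

        // Arrays do not override toString(): the result is the type tag "[B"
        // plus an identity hash, which differs per object and per JVM run
        // even when the contents are identical -- the symptom seen above.
        System.out.println(a.toString().startsWith("[B@"));

        // Contents must be compared element-wise instead.
        System.out.println(Arrays.equals(a, b));

        // A stable textual form requires an explicit encoding, e.g. Base64.
        System.out.println(Base64.getEncoder().encodeToString(a));
    }
}
```

So whenever a byte[] is stringified somewhere along the distributed response path instead of being written through a binary-aware writer, the rendered value will look different on every request.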
[jira] [Created] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields
Sami Siren created SOLR-3487:

Summary: XMLResponseParser does not handle named lists in doc fields
Key: SOLR-3487
URL: https://issues.apache.org/jira/browse/SOLR-3487
Project: Solr
Issue Type: Bug
Reporter: Sami Siren
Priority: Minor
Fix For: 4.0

For example, when one uses XML and specifies fl to contain [explain style=nl], the parser currently cannot handle the response. I also noticed that the example tests are not run with XML (that would have caught this earlier).
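To make the failure mode concrete: with a field list like [explain style=nl], a <doc> element can contain a nested <lst> named list, which a response parser must recurse into instead of treating every doc child as a flat typed value. The fragment below is illustrative (the "[explain]" field name and values are invented for this sketch) and uses the JDK's DOM API rather than Solr's own XMLResponseParser:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NamedListInDoc {
    public static void main(String[] args) throws Exception {
        // Hypothetical doc fragment: a simple <str> field next to a nested
        // <lst> named list, as produced when fl asks for a structured explain.
        String xml = "<doc>"
                   + "<str name=\"id\">1</str>"
                   + "<lst name=\"[explain]\"><float name=\"value\">0.5</float></lst>"
                   + "</doc>";
        Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = d.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n instanceof Element) {
                // A doc-field parser has to branch here: "lst" means recurse,
                // anything else is a simple typed value.
                System.out.println(n.getNodeName() + " name=" + ((Element) n).getAttribute("name"));
            }
        }
    }
}
```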
[jira] [Updated] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields
[ https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated SOLR-3487:

Attachment: SOLR-3487.patch

Here's a proposed fix. I also added a test class that runs the example tests using the XML format. Will commit shortly unless someone stops me...
[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search
[ https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283178#comment-13283178 ] Alex Liu commented on SOLR-3433: Sami, I think SOLR-3035 fixed the issue for a single node. This ticket is only for multiple nodes. To reproduce it, you can set up a three-node cluster, upload solrconfig.xml and schema.xml with binary fields plus some test data, and then search on any node.
Re: Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #175
Let me know if you need any help (or if you have questions) wrt the new test infrastructure. I am busy with other things at the moment and there are rough edges... I plan to jump into it again once we ship a release (don't know when this is going to happen at the moment). Dawid

On Thu, May 24, 2012 at 11:59 PM, Mark Miller markrmil...@gmail.com wrote: Just noticed this seems to happen fairly frequently in the Java 7 Windows build, but I don't seem to see it in the Java 6 Windows build. I'll try and use Java 7 on my Windows machine when I get a chance - should make it easier to experiment with fixes if I can get the same results locally.
[jira] [Updated] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields
[ https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren updated SOLR-3487:

Component/s: clients - java
[jira] [Commented] (SOLR-3433) binary field returns differently when do the distribute search
[ https://issues.apache.org/jira/browse/SOLR-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283191#comment-13283191 ] Sami Siren commented on SOLR-3433: From what I understand of SOLR-3035, it was not about a single node. I also did some tests with multiple shards and did not see this problem on trunk. Perhaps I am missing something important. Could you provide a test case that demonstrates the problem on trunk?
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #119
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/119/

[...truncated 14684 lines...]
[junit4] 2> 16087 T3282 oass.SolrIndexSearcher.<init> WARNING WARNING: Directory impl does not support setting indexDir: org.apache.lucene.store.MockDirectoryWrapper
[junit4] 2> 16088 T3282 oasu.CommitTracker.<init> Hard AutoCommit: disabled
[junit4] 2> 16088 T3282 oasu.CommitTracker.<init> Soft AutoCommit: disabled
[junit4] 2> 16088 T3282 oashc.SpellCheckComponent.inform Initializing spell checkers
[junit4] 2> 16096 T3282 oass.DirectSolrSpellChecker.init init: {name=direct,classname=DirectSolrSpellChecker,field=lowerfilt,minQueryLength=3}
[junit4] 2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting socketTimeout to: 0
[junit4] 2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting urlScheme to: http://
[junit4] 2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting connTimeout to: 0
[junit4] 2> 16142 T3282 oashc.HttpShardHandlerFactory.getParameter Setting maxConnectionsPerHost to: 20
[junit4] 2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting corePoolSize to: 0
[junit4] 2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting maximumPoolSize to: 2147483647
[junit4] 2> 16143 T3282 oashc.HttpShardHandlerFactory.getParameter Setting maxThreadIdleTime to: 5
[junit4] 2> 16144 T3282 oashc.HttpShardHandlerFactory.getParameter Setting sizeOfQueue to: -1
[junit4] 2> 16144 T3282 oashc.HttpShardHandlerFactory.getParameter Setting fairnessPolicy to: false
[junit4] 2> 16154 T3285 oasc.SolrCore.registerSearcher [collection1] Registered new searcher Searcher@3d102489 main{StandardDirectoryReader(segments_1:1)}
[junit4] 2> 16154 T3282 oasc.CoreContainer.register registering core: collection1
[junit4] 2> 16155 T3282 oasu.AbstractSolrTestCase.setUp SETUP_END testSoftAndHardCommitMaxTimeDelete
[junit4] 2> 16156 T3282 oasu.AbstractSolrTestCase.tearDown TEARDOWN_START testSoftAndHardCommitMaxTimeDelete
[junit4] 2> 16156 T3282 oasc.CoreContainer.shutdown Shutting down CoreContainer instance=883130242
[junit4] 2> 16156 T3282 oasc.SolrCore.close [collection1] CLOSING SolrCore org.apache.solr.core.SolrCore@5a084acd
[junit4] 2> 16160 T3282 oasc.SolrCore.closeSearcher [collection1] Closing main searcher on request.
[junit4] 2> 16160 T3282 oasu.DirectUpdateHandler2.close closing DirectUpdateHandler2{commits=0,autocommits=0,soft autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
[junit4] 2>
[junit4] Completed in 3.27s, 3 tests, 3 skipped
[junit4]
[junit4] Suite: org.apache.solr.handler.component.DistributedTermsComponentTest
[junit4] Completed in 14.90s, 1 test
[junit4]
[junit4] Suite: org.apache.solr.TestGroupingSearch
[junit4] Completed in 6.62s, 12 tests
[junit4]
[junit4] Suite: org.apache.solr.handler.component.SpellCheckComponentTest
[junit4] Completed in 12.90s, 9 tests
[junit4]
[junit4] Suite: org.apache.solr.cloud.TestMultiCoreConfBootstrap
[junit4] Completed in 5.82s, 1 test
[junit4]
[junit4] Suite: org.apache.solr.request.SimpleFacetsTest
[junit4] Completed in 9.93s, 29 tests
[junit4]
[junit4] Suite: org.apache.solr.update.DirectUpdateHandlerTest
[junit4] Completed in 5.39s, 6 tests
[junit4]
[junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest
[junit4] Completed in 1.72s, 1 test
[junit4]
[junit4] Suite: org.apache.solr.core.TestCoreContainer
[junit4] Completed in 5.34s, 1 test
[junit4]
[junit4] Suite: org.apache.solr.update.TestIndexingPerformance
[junit4] Completed in 1.57s, 1 test
[junit4]
[junit4] Suite: org.apache.solr.search.similarities.TestLMDirichletSimilarityFactory
[junit4] Completed in 0.32s, 2 tests
[junit4]
[junit4] Suite: org.apache.solr.handler.component.TermsComponentTest
[junit4] Completed in 1.65s, 13 tests
[junit4]
[junit4] Suite: org.apache.solr.search.function.SortByFunctionTest
[junit4] Completed in 3.15s, 2 tests
[junit4]
[junit4] Suite: org.apache.solr.spelling.suggest.SuggesterTSTTest
[junit4] Completed in 2.36s, 4 tests
[junit4]
[junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest
[junit4] Completed in 2.30s, 4 tests
[junit4]
[junit4] Suite: org.apache.solr.search.TestFoldingMultitermQuery
[junit4] Completed in 2.06s, 18 tests
[junit4]
[junit4] Suite: org.apache.solr.schema.CurrencyFieldTest
[junit4] IGNORED 0.00s | CurrencyFieldTest.testPerformance
[junit4] Cause: Annotated @Ignore()
[junit4] Completed in 1.99s, 8 tests, 1 skipped
[junit4]
[junit4] Suite:
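As an aside, the HttpShardHandlerFactory parameters being logged above are the kind that can be tuned in solrconfig.xml. A hedged sketch, for orientation only: the element placement (inside a search handler) and the exact parameter names are assumed from the log output, not verified against a specific Solr release.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="socketTimeout">0</int>        <!-- 0 = no read timeout -->
    <int name="connTimeout">0</int>          <!-- 0 = no connect timeout -->
    <int name="maxConnectionsPerHost">20</int>
    <int name="corePoolSize">0</int>
    <int name="maximumPoolSize">2147483647</int>
    <int name="maxThreadIdleTime">5</int>
    <int name="sizeOfQueue">-1</int>         <!-- -1 = unbounded work queue -->
    <bool name="fairnessPolicy">false</bool>
  </shardHandlerFactory>
</requestHandler>
```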
Re: Welcome Simon Svensson as a new committer
Welcome, Simon!

2012/5/24 Prescott Nasser geobmx...@hotmail.com: Hey all, our roster is growing a bit; I'd like to welcome Simon as a new committer. Simon has been quite active on the user mailing list helping answer community questions. He also maintains a C# port of the lucene-hunspell project (Java: http://code.google.com/p/lucene-hunspell/, Simon's C# port: https://github.com/sisve/Lucene.Net.Analysis.Hunspell), which is commonly used for spell checking (but has a wide array of purposes). Please join me in welcoming Simon to the team. ~Prescott
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14323 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14323/

1 tests failed.

REGRESSION: org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed

Error Message: Server at http://localhost:18961/example/core0 returned non ok status:500, message:Server Error

Stack Trace:
org.apache.solr.common.SolrException: Server at http://localhost:18961/example/core0 returned non ok status:500, message:Server Error
	at __randomizedtesting.SeedInfo.seed([7A452B0B4F6909F6:15779CE33ED15799]:0)
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:403)
	at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209)
	at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
	at org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed(MultiCoreExampleJettyTest.java:117)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
	at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)

Build Log (for compile errors):
[...truncated 11071 lines...]
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2878:

Attachment: LUCENE-2878.patch

Hey Alan, great job.. you are getting up to speed. I fixed that test case (the boolean one), since in the conjunction case you have to consume the conjunction positions/offsets, i.e. the intervals given by the term matches. I also fixed the license header in that file and brought the highlighter prototype test back. I will commit this to the branch now. Wow man, this makes me happy! Good job.

Allow Scorer to expose positions and payloads aka. nuke spans

Key: LUCENE-2878
URL: https://issues.apache.org/jira/browse/LUCENE-2878
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: Positions Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor
Fix For: Positions Branch
Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch

Currently we have two somewhat separate types of queries: the ones which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you cannot use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature: they cannot score based on term proximity, since scorers don't expose any positional information.

All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer is working on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API, and others simply return null instead. To show that the API really works and our BulkPostings work fine with positions too, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, which now all works with positions :) while Payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today), including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet, since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't look into the MemoryIndex BulkPostings API yet).
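The on-demand design described in the issue (a consumer declares, via something like ScorerContext#needsPositions, whether it needs a positions enum, and the scorer only materializes one in that case) can be sketched with toy stand-in types. Everything below is a hypothetical illustration of the idea, not the actual Lucene branch API:

```java
// Toy stand-in for a positions enum; -1 signals exhaustion.
interface PositionsEnum {
    int nextPosition();
}

// Toy scorer that only builds a positions enum when asked to,
// mirroring the "needsPositions" flag described in the comment.
class ToyTermScorer {
    private final int[] positions;

    ToyTermScorer(int... positions) {
        this.positions = positions;
    }

    PositionsEnum positions(boolean needsPositions) {
        if (!needsPositions) {
            return null; // avoid creating an unnecessary enum
        }
        return new PositionsEnum() {
            private int i = 0;
            public int nextPosition() {
                return i < positions.length ? positions[i++] : -1;
            }
        };
    }
}

public class PositionsOnDemandSketch {
    public static void main(String[] args) {
        ToyTermScorer scorer = new ToyTermScorer(3, 7, 12);
        PositionsEnum pe = scorer.positions(true);
        for (int p = pe.nextPosition(); p != -1; p = pe.nextPosition()) {
            System.out.println(p);
        }
        // A consumer that doesn't need positions pays nothing.
        System.out.println(scorer.positions(false) == null);
    }
}
```

The point of the pattern is the flag: position decoding has a real cost per hit, so a scoring-only consumer should never trigger it.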
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283217#comment-13283217 ] Simon Willnauer commented on LUCENE-2878: Oh, btw: all tests on the branch pass now :)
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2878:

Attachment: LUCENE-2878.patch

I messed up the last patch - here is the actual patch.
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #120
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/120/
[JENKINS] Solr-trunk - Build # 1865 - Failure
Build: https://builds.apache.org/job/Solr-trunk/1865/ 1 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log (for compile errors): [...truncated 41930 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-2566) + - operators allow any amount of whitespace
[ https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl reopened LUCENE-2566: - Assignee: Jan Høydahl Re-opening for backport + - operators allow any amount of whitespace Key: LUCENE-2566 URL: https://issues.apache.org/jira/browse/LUCENE-2566 Project: Lucene - Java Issue Type: Bug Components: core/queryparser Reporter: Yonik Seeley Assignee: Jan Høydahl Priority: Minor Fix For: 4.0 Attachments: LUCENE-2566.patch As an example, (foo - bar) is treated like (foo -bar). It seems like for +- to be treated as unary operators, they should be immediately followed by the operand. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2566) + - operators allow any amount of whitespace
[ https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated LUCENE-2566: Attachment: LUCENE-2566-3x.patch Backport to 3.6 branch. All tests pass. Committing soon. + - operators allow any amount of whitespace Key: LUCENE-2566 URL: https://issues.apache.org/jira/browse/LUCENE-2566 Project: Lucene - Java Issue Type: Bug Components: core/queryparser Affects Versions: 3.6 Reporter: Yonik Seeley Assignee: Jan Høydahl Priority: Minor Fix For: 4.0, 3.6.1 Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch As an example, (foo - bar) is treated like (foo -bar). It seems like for +- to be treated as unary operators, they should be immediately followed by the operand. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
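The rule this issue asks for — that a unary operator should bind only when glued to its operand — can be illustrated with a small standalone check. This is not Lucene's QueryParser code, just a sketch of the intended behavior:

```java
// Illustration only, not Lucene's QueryParser: per LUCENE-2566, + and -
// should act as unary operators only when immediately followed by the
// operand, so "(foo - bar)" is no longer treated like "(foo -bar)".
public class UnaryOperatorCheck {
    static boolean bindsAsUnary(String query, int opPos) {
        char op = query.charAt(opPos);
        if (op != '+' && op != '-') {
            return false;
        }
        // the operand must start right after the operator, with no whitespace
        return opPos + 1 < query.length()
                && !Character.isWhitespace(query.charAt(opPos + 1));
    }
}
```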
[jira] [Created] (LUCENE-4076) When doing nested (index-time) joins, ToParentBlockJoinCollector delivers incomplete information on the grand-children
Christoph Kaser created LUCENE-4076: --- Summary: When doing nested (index-time) joins, ToParentBlockJoinCollector delivers incomplete information on the grand-children Key: LUCENE-4076 URL: https://issues.apache.org/jira/browse/LUCENE-4076 Project: Lucene - Java Issue Type: Bug Components: modules/join Affects Versions: 3.6, 3.5, 3.4 Reporter: Christoph Kaser ToParentBlockJoinCollector.getTopGroups does not provide the correct answer when a query with nested ToParentBlockJoinCollectors is performed. Given the following example query:
{code}
Query grandChildQuery = new TermQuery(new Term("color", "red"));
Filter childFilter = new CachingWrapperFilter(new RawTermFilter(new Term("type", "child")), DeletesMode.IGNORE);
ToParentBlockJoinQuery grandchildJoinQuery = new ToParentBlockJoinQuery(grandChildQuery, childFilter, ScoreMode.Max);
BooleanQuery childQuery = new BooleanQuery();
childQuery.add(grandchildJoinQuery, Occur.MUST);
childQuery.add(new TermQuery(new Term("shape", "round")), Occur.MUST);
Filter parentFilter = new CachingWrapperFilter(new RawTermFilter(new Term("type", "parent")), DeletesMode.IGNORE);
ToParentBlockJoinQuery childJoinQuery = new ToParentBlockJoinQuery(childQuery, parentFilter, ScoreMode.Max);
parentQuery = new BooleanQuery();
parentQuery.add(childJoinQuery, Occur.MUST);
parentQuery.add(new TermQuery(new Term("name", "test")), Occur.MUST);
ToParentBlockJoinCollector parentCollector = new ToParentBlockJoinCollector(Sort.RELEVANCE, 30, true, true);
searcher.search(parentQuery, null, parentCollector);
{code}
This produces the correct results:
{code}
TopGroups<Integer> childGroups = parentCollector.getTopGroups(childJoinQuery, null, 0, 20, 0, false);
{code}
However, this does not:
{code}
TopGroups<Integer> grandChildGroups = parentCollector.getTopGroups(grandchildJoinQuery, null, 0, 20, 0, false);
{code}
The content of grandChildGroups is broken in the following ways: * The groupValue is not the document id of the child document (which is the parent of a grandchild document), but the document id of the _previous_ matching parent document * There are only as many GroupDocs as there are parent documents (not child documents), and they only contain the children of the last child document (but, as mentioned before, with the wrong groupValue).
[jira] [Updated] (LUCENE-2566) + - operators allow any amount of whitespace
[ https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated LUCENE-2566: Affects Version/s: 3.6 Fix Version/s: 3.6.1 + - operators allow any amount of whitespace Key: LUCENE-2566 URL: https://issues.apache.org/jira/browse/LUCENE-2566 Project: Lucene - Java Issue Type: Bug Components: core/queryparser Affects Versions: 3.6 Reporter: Yonik Seeley Assignee: Jan Høydahl Priority: Minor Fix For: 4.0, 3.6.1 Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch As an example, (foo - bar) is treated like (foo -bar). It seems like for +- to be treated as unary operators, they should be immediately followed by the operand. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support
[ https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283255#comment-13283255 ] Lajos Kesik commented on LUCENE-3131: - Is there really no plan to support XA transactions? Without it, it is quite hard to keep consistency between the database and the Lucene index. XA Resource/Transaction support Key: LUCENE-3131 URL: https://issues.apache.org/jira/browse/LUCENE-3131 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.1 Reporter: Magnus Assignee: Robert Muir Priority: Minor Fix For: 3.1.1 Please add XAResource/XATransaction support into Lucene core.
[jira] [Created] (LUCENE-4077) ToParentBlockJoinCollector provides no way to access computed scores and the maxScore
Christoph Kaser created LUCENE-4077: --- Summary: ToParentBlockJoinCollector provides no way to access computed scores and the maxScore Key: LUCENE-4077 URL: https://issues.apache.org/jira/browse/LUCENE-4077 Project: Lucene - Java Issue Type: Bug Components: modules/join Affects Versions: 3.6, 3.5, 3.4 Reporter: Christoph Kaser The constructor of ToParentBlockJoinCollector allows turning on the tracking of parent scores and of the maximum parent score, but there is no way to access those scores because: * maxScore is a private field, and there is no getter * TopGroups / GroupDocs do not provide access to the scores for the parent documents, only the children
[jira] [Updated] (SOLR-3486) The memory size of Solr caches should be configurable
[ https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated SOLR-3486: --- Attachment: SOLR-3486.patch Hi Shawn, I modified the patch in order to make it easier to add this functionality to other cache implementations. All you need to do for SOLR-3393 to support a maximum memory size is to split your implementation into an LFU map (a regular map, with no evictions) which iterates (entrySet().iterator()) in frequency order and an LFU cache (that will probably extend or wrap this LFU map). Then, to have an LFU cache with a fixed max memory size, just wrap your LFU map into a new SizableCache instance. The memory size of Solr caches should be configurable - Key: SOLR-3486 URL: https://issues.apache.org/jira/browse/SOLR-3486 Project: Solr Issue Type: Improvement Components: search Reporter: Adrien Grand Priority: Minor Attachments: SOLR-3486.patch, SOLR-3486.patch It is currently possible to configure the sizes of Solr caches based on the number of entries of the cache. The problem is that the memory size of cached values may vary a lot over time (depending on IndexReader.maxDoc and the queries that are run) although the JVM heap size does not. Having a configurable max size in bytes would also help optimize cache utilization, making it possible to store more values provided that they have a small memory footprint.
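The general shape of a byte-bounded cache (as opposed to an entry-count-bounded one) can be sketched like this. This is an illustration only, not the SOLR-3486 patch: the eviction order here is plain LRU via an access-ordered LinkedHashMap for brevity, whereas the patch discussion iterates an LFU map in frequency order, and the Sizer callback stands in for whatever per-entry memory estimate the real implementation uses.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not the SOLR-3486 patch): a cache bounded by an
// estimated memory footprint in bytes rather than by a number of entries.
class MemoryBoundedCache<K, V> {
    /** Hypothetical callback estimating an entry's footprint in bytes. */
    interface Sizer<K, V> {
        long bytesOf(K key, V value);
    }

    // access-order LinkedHashMap gives LRU iteration order; the patch
    // discussion uses an LFU map iterating in frequency order instead
    private final LinkedHashMap<K, V> map = new LinkedHashMap<>(16, 0.75f, true);
    private final long maxBytes;
    private final Sizer<K, V> sizer;
    private long currentBytes;

    MemoryBoundedCache(long maxBytes, Sizer<K, V> sizer) {
        this.maxBytes = maxBytes;
        this.sizer = sizer;
    }

    void put(K key, V value) {
        V old = map.put(key, value);
        if (old != null) {
            currentBytes -= sizer.bytesOf(key, old);
        }
        currentBytes += sizer.bytesOf(key, value);
        evict();
    }

    V get(K key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }

    // drop the least recently used entries until the byte estimate fits
    private void evict() {
        Iterator<Map.Entry<K, V>> it = map.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<K, V> e = it.next();
            currentBytes -= sizer.bytesOf(e.getKey(), e.getValue());
            it.remove();
        }
    }
}
```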
[jira] [Resolved] (SOLR-3487) XMLResponseParser does not handle named lists in doc fields
[ https://issues.apache.org/jira/browse/SOLR-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Siren resolved SOLR-3487. -- Resolution: Fixed Assignee: Sami Siren XMLResponseParser does not handle named lists in doc fields --- Key: SOLR-3487 URL: https://issues.apache.org/jira/browse/SOLR-3487 Project: Solr Issue Type: Bug Components: clients - java Reporter: Sami Siren Assignee: Sami Siren Priority: Minor Fix For: 4.0 Attachments: SOLR-3487.patch For example when one uses xml and specifies fl to contain [explain style=nl] parser currently cannot handle the response. I also noticed that the example tests are not run with xml (that would have caught this earlier). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java7-64 #121
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/121/changes Changes: [siren] SOLR-3487: handle named lists in xml response -- [...truncated 4693 lines...] resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [javac] Compiling 1 source file to http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\core\classes\java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] 1 warning compile-core: init: compile-test: [echo] Building queries... ivy-availability-check: ivy-fail: ivy-configure: resolve: common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: compile-core: compile-test-framework: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: compile-lucene-core: compile-core: common.compile-test: [mkdir] Created dir: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queries\classes\test [javac] Compiling 11 source files to http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queries\classes\test [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 1 warning [echo] Building queryparser... 
ivy-availability-check: ivy-fail: ivy-configure: [ivy:configure] :: loading settings :: file = http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\ivy-settings.xml resolve: common.init: compile-lucene-core: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [javac] Compiling 1 source file to http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\core\classes\java [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] 1 warning compile-core: init: compile-test: [echo] Building queryparser... check-queries-uptodate: jar-queries: check-sandbox-uptodate: jar-sandbox: ivy-availability-check: ivy-fail: ivy-configure: resolve: common.init: compile-lucene-core: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. 
[echo] clover: common.compile-core: compile-core: compile-test-framework: ivy-availability-check: ivy-fail: ivy-configure: resolve: [ivy:retrieve] :: loading settings :: url = jar:file:/C:/Users/JenkinsSlave/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml init: compile-lucene-core: compile-core: common.compile-test: [mkdir] Created dir: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queryparser\classes\test [javac] Compiling 40 source files to http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\build\queryparser\classes\test [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javac] http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\queryparser\src\test\org\apache\lucene\queryparser\util\QueryParserTestBase.java:1137: warning: [rawtypes] found raw type: Class [javac] QueryParser.class.getConstructor(new Class[] {CharStream.class}); [javac]^ [javac] missing type arguments for generic class ClassT [javac] where T is a type-variable: [javac] T extends Object declared in class Class [javac] http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/ws/lucene\queryparser\src\test\org\apache\lucene\queryparser\util\QueryParserTestBase.java:1143: warning: [rawtypes] found raw type: Class [javac] QueryParser.class.getConstructor(new Class[] {QueryParserTokenManager.class}); [javac]
[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
[ https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283303#comment-13283303 ] Jan Høydahl commented on LUCENE-4074: - Checked in a fix in 3.6 for the non-compiling TestSort.testRamBuffer. It referred to random().nextInt() instead of random.nextInt() - a clear copy/paste error from trunk code. FST Sorter BufferSize causes int overflow if BufferSize > 2048MB Key: LUCENE-4074 URL: https://issues.apache.org/jira/browse/LUCENE-4074 Project: Lucene - Java Issue Type: Bug Components: modules/spellchecker Affects Versions: 3.6, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0, 3.6.1 Attachments: LUCENE-4074.patch The BufferSize constructor accepts a size in MB as an integer and uses multiplication to convert to bytes. While it's checking that the size in bytes is less than 2048 MB, it does that after the byte conversion. If you pass a value > 2047 to the ctor, the value overflows since all constants and methods based on MB expect 32-bit signed ints. This does not even result in an exception until the BufferSize is actually passed to the sorter.
[jira] [Commented] (LUCENE-4074) FST Sorter BufferSize causes int overflow if BufferSize > 2048MB
[ https://issues.apache.org/jira/browse/LUCENE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283305#comment-13283305 ] Simon Willnauer commented on LUCENE-4074: - thanks jan! totally my fault! seems we don't have jenkins testing this though :( FST Sorter BufferSize causes int overflow if BufferSize > 2048MB Key: LUCENE-4074 URL: https://issues.apache.org/jira/browse/LUCENE-4074 Project: Lucene - Java Issue Type: Bug Components: modules/spellchecker Affects Versions: 3.6, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0, 3.6.1 Attachments: LUCENE-4074.patch The BufferSize constructor accepts a size in MB as an integer and uses multiplication to convert to bytes. While it's checking that the size in bytes is less than 2048 MB, it does that after the byte conversion. If you pass a value > 2047 to the ctor, the value overflows since all constants and methods based on MB expect 32-bit signed ints. This does not even result in an exception until the BufferSize is actually passed to the sorter.
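The overflow itself is easy to reproduce in isolation. The method names below are hypothetical, not the actual BufferSize code, but the arithmetic is the same pattern the issue describes:

```java
// Demonstrates the 32-bit overflow described in LUCENE-4074: converting a
// megabyte count to bytes with int arithmetic wraps past Integer.MAX_VALUE
// (2^31 - 1). Method names are hypothetical, for illustration only.
public class BufferSizeOverflow {
    static int mbToBytesInt(int mb) {
        return mb * 1024 * 1024; // wraps around for mb >= 2048
    }

    static long mbToBytesLong(int mb) {
        return (long) mb * 1024 * 1024; // widen to long before multiplying
    }

    public static void main(String[] args) {
        System.out.println(mbToBytesInt(2047));  // 2146435072 (still fits)
        System.out.println(mbToBytesInt(2048));  // -2147483648 (overflowed)
        System.out.println(mbToBytesLong(2048)); // 2147483648 (correct)
    }
}
```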
[jira] [Created] (SOLR-3488) Create a Collections API for SolrCloud
Mark Miller created SOLR-3488: - Summary: Create a Collections API for SolrCloud Key: SOLR-3488 URL: https://issues.apache.org/jira/browse/SOLR-3488 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #199
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/199/
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
I actually know what this one is now. Jetty is shutting down, and the graceful timeout is too low, and so jetty interrupts the webapp, and while we are waiting for merges to finish on IW#close, an interrupt is thrown and we stop waiting. So the directory is then closed out from under the merge thread. So really, mostly a test issue it seems? So I changed out jetty instances in tests to a 30 second graceful shutdown. Tests went from 6 minutes for me, to 33 minutes. I won't make this fix for now :) One idea is to perhaps do it just for this test - but even then it makes the test *much* longer, and there is no reason it can't happen on other tests that use jetty instances. It just happens to only show up in the test currently AFAICT. On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote: Build: https://builds.apache.org/job/Solr-trunk/1865/ 1 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345) at 
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log (for compile errors): [...truncated 41930 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
Just thinking out loud... shouldn't solr(cloud) manage such situation gracefully? I mean in real life solr instances can be killed or even whole servers can go away. Would it be ok to ignore that exception instead? -- Sami Siren On Fri, May 25, 2012 at 3:01 PM, Mark Miller markrmil...@gmail.com wrote: I actually know what this one is now. Jetty is shutting down, and the graceful timeout is too low, and so jetty interrupts the webapp, and while we are waiting for merges to finish on IW#close, an interrupt is thrown and we stop waiting. So the directory is then closed out from under the merge thread. So really, mostly a test issue it seems? So I changed out jetty instances in tests to a 30 second graceful shutdown. Tests went from 6 minutes for me, to 33 minutes. I won't make this fix for now :) One idea is to perhaps do it just for this test - but even then it makes the test *much* longer, and there is no reason it can't happen on other tests that use jetty instances. It just happens to only show up in the test currently AFAICT. On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote: Build: https://builds.apache.org/job/Solr-trunk/1865/ 1 tests failed. 
REGRESSION: org.apache.solr.cloud.RecoveryZkTest.testDistribSearch Error Message: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] Stack Trace: java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551) Caused by: org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at __randomizedtesting.SeedInfo.seed([8B4A827F28B6F16]:0) at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:480) Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is closed at org.apache.lucene.store.Directory.ensureOpen(Directory.java:244) at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:241) at org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:345) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3031) at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451) Build Log (for compile errors): [...truncated 41930 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
[jira] [Resolved] (LUCENE-2566) + - operators allow any amount of whitespace
[ https://issues.apache.org/jira/browse/LUCENE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl resolved LUCENE-2566.
---------------------------------
    Resolution: Fixed

Checked in for 3.6.1

> + - operators allow any amount of whitespace
> --------------------------------------------
>
>                 Key: LUCENE-2566
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2566
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 3.6
>            Reporter: Yonik Seeley
>            Assignee: Jan Høydahl
>            Priority: Minor
>             Fix For: 4.0, 3.6.1
>
>         Attachments: LUCENE-2566-3x.patch, LUCENE-2566.patch
>
> As an example, (foo - bar) is treated like (foo -bar). It seems like for +- to be treated as unary operators, they should be immediately followed by the operand.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
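The rule described in LUCENE-2566 - a unary + or - should only bind when it is immediately followed by its operand - can be sketched outside the query parser. This is a hypothetical helper illustrating the adjacency check, not the actual QueryParser change:

```java
class UnaryOperatorCheck {
    // A '+' or '-' at position i acts as a unary operator only when it is
    // immediately followed by a non-whitespace operand character, so
    // "foo -bar" negates bar, while in "foo - bar" the '-' should be ignored.
    static boolean isUnaryOperator(String query, int i) {
        char c = query.charAt(i);
        if (c != '+' && c != '-') {
            return false;
        }
        return i + 1 < query.length() && !Character.isWhitespace(query.charAt(i + 1));
    }
}
```

Under this check, `(foo - bar)` parses like `(foo bar)` rather than `(foo -bar)`, which matches the behavior the fix restores.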
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #200
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/200/

--
[...truncated 13549 lines...]
   [junit4] Suite: org.apache.solr.analysis.TestPatternReplaceCharFilterFactory
   [junit4] Completed in 0.02s, 3 tests
   [junit4]
   [junit4] Suite: org.apache.solr.schema.CopyFieldTest
   [junit4] Completed in 0.74s, 6 tests
   [junit4]
   [junit4] Suite: org.apache.solr.analysis.TestRussianLightStemFilterFactory
   [junit4] Completed in 0.01s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields
   [junit4] Completed in 1.70s, 13 tests
   [junit4]
   [junit4] Suite: org.apache.solr.search.ReturnFieldsTest
   [junit4] Completed in 1.02s, 10 tests
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestRealTimeGet
   [junit4] IGNOR/A 0.01s | TestRealTimeGet.testStressRecovery
   [junit4]     Assumption #1: FIXME: This test is horribly slow sometimes on Windows!
   [junit4]   2> 20987 T2115 oas.SolrTestCaseJ4.setUp ###Starting testStressRecovery
   [junit4]   2> 20988 T2115 oas.SolrTestCaseJ4.tearDown ###Ending testStressRecovery
   [junit4]   2>
   [junit4] Completed in 29.76s, 8 tests, 1 skipped
   [junit4]
   [junit4] Suite: org.apache.solr.cloud.OverseerTest
   [junit4] Completed in 56.01s, 7 tests
   [junit4]
   [junit4] Suite: org.apache.solr.cloud.RecoveryZkTest
   [junit4] Completed in 29.48s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.cloud.NodeStateWatcherTest
   [junit4] Completed in 24.76s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.cloud.ZkSolrClientTest
   [junit4] Completed in 16.14s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.solr.TestDistributedGrouping
   [junit4] Completed in 22.43s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.handler.component.DistributedSpellCheckComponentTest
   [junit4] Completed in 16.26s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestRangeQuery
   [junit4] Completed in 7.02s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestSort
   [junit4] Completed in 2.99s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.core.TestJmxIntegration
   [junit4] IGNORED 0.00s | TestJmxIntegration.testJmxOnCoreReload
   [junit4]     Cause: Annotated @Ignore(timing problem? https://issues.apache.org/jira/browse/SOLR-2715)
   [junit4] Completed in 1.82s, 3 tests, 1 skipped
   [junit4]
   [junit4] Suite: org.apache.solr.spelling.IndexBasedSpellCheckerTest
   [junit4] Completed in 1.24s, 5 tests
   [junit4]
   [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2
   [junit4] Completed in 0.85s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.update.TestIndexingPerformance
   [junit4] Completed in 0.95s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.spelling.DirectSolrSpellCheckerTest
   [junit4] Completed in 1.17s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest
   [junit4] Completed in 1.00s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.update.DocumentBuilderTest
   [junit4] Completed in 1.03s, 11 tests
   [junit4]
   [junit4] Suite: org.apache.solr.search.SpatialFilterTest
   [junit4] Completed in 1.82s, 3 tests
   [junit4]
   [junit4] Suite: org.apache.solr.schema.PolyFieldTest
   [junit4] Completed in 1.59s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterWFSTTest
   [junit4] Completed in 1.55s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.solr.core.TestPropInject
   [junit4] Completed in 1.95s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.core.RequestHandlersTest
   [junit4] Completed in 1.13s, 3 tests
   [junit4]
   [junit4] Suite: org.apache.solr.highlight.FastVectorHighlighterTest
   [junit4] Completed in 1.13s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestDocSet
   [junit4] Completed in 0.73s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.solr.analysis.TestReversedWildcardFilterFactory
   [junit4] Completed in 0.84s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.solr.schema.RequiredFieldsTest
   [junit4] Completed in 0.93s, 3 tests
   [junit4]
   [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest
   [junit4] Completed in 0.97s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.highlight.HighlighterConfigTest
   [junit4] Completed in 1.08s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.search.TestSolrQueryParser
   [junit4] Completed in 0.97s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.core.AlternateDirectoryTest
   [junit4] Completed in 0.95s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest
   [junit4] Completed in 0.84s, 1 test
   [junit4]
   [junit4] Suite: org.apache.solr.schema.MultiTermTest
   [junit4] Completed in 0.43s, 3 tests
   [junit4]
   [junit4] Suite:
[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support
[ https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283337#comment-13283337 ]

Michael McCandless commented on LUCENE-3131:
--------------------------------------------

Lucene itself is already transactional (see http://blog.mikemccandless.com/2012/03/transactional-lucene.html ); it's just that we don't have an XA wrapper...

> XA Resource/Transaction support
> -------------------------------
>
>                 Key: LUCENE-3131
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3131
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 3.1
>            Reporter: Magnus
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.1.1
>
> Please add XAResource/XATransaction support into Lucene core.
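For context on "Lucene itself is already transactional": IndexWriter exposes a two-phase commit (prepareCommit(), then commit() or rollback()), which is the shape an XA wrapper would map XAResource.prepare/commit/rollback onto. The toy state machine below illustrates that contract; it is illustrative only, not Lucene code:

```java
// Models the prepare/commit/rollback lifecycle that Lucene's IndexWriter
// follows: prepare makes pending changes durable but not yet visible,
// commit makes the prepared state visible, rollback discards it.
class TwoPhaseResource {
    enum State { IDLE, PREPARED, COMMITTED }

    State state = State.IDLE;

    void prepareCommit() {
        if (state != State.IDLE) {
            throw new IllegalStateException("already " + state);
        }
        state = State.PREPARED; // flushed and fsynced, but not visible yet
    }

    void commit() {
        if (state != State.PREPARED) {
            throw new IllegalStateException("prepareCommit must run first");
        }
        state = State.COMMITTED; // prepared changes become visible
    }

    void rollback() {
        state = State.IDLE; // discard anything prepared
    }
}
```

An XA transaction manager would call every participant's prepare phase first, and only commit once all participants have prepared successfully; that coordination layer is the missing wrapper the comment refers to.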
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On May 25, 2012, at 8:11 AM, Sami Siren wrote:

> Just thinking out loud... shouldn't solr(cloud) manage such a situation gracefully?

Currently, you can handle it gracefully if you up the graceful timeout in jetty. It's easy enough to do that with the jetty we ship, but it's painful (extremely, it seems) to do it in tests.

In any case, I don't think it hurts anything practically? The merge thread fails, and so you simply don't get those merges, I think?

The problem with the tests is that the exception is thrown from the merge thread. We have no effect on that from Solr - the test framework picks up an uncaught exception in the thread, and our goose is cooked.

> I mean in real life solr instances can be killed or even whole servers can go away. Would it be ok to ignore that exception instead?

It's at the Lucene level really, so unless we try really hard to work around it, we would have to figure out if something different made sense there, I think. Right now, if it's waiting for merges to finish and gets interrupted, it throws an interrupted exception. Unless we explicitly try and kill the current merge threads, I'd think that could be a problem in any general code. You close the IW with "wait for merges to finish" = true, then you start closing other resources, because you assume you are done with the IW - but in fact merges can still be occurring if the thread was interrupted. And you might close resources merging depends on (i.e. the directory). Lucene does not like interruptions in other cases as well, but unfortunately, running in a webapp, we can't easily always avoid them, it seems.

> --
> Sami Siren
>
> On Fri, May 25, 2012 at 3:01 PM, Mark Miller <markrmil...@gmail.com> wrote:
>> I actually know what this one is now. Jetty is shutting down, and the graceful timeout is too low, and so jetty interrupts the webapp, and while we are waiting for merges to finish on IW#close, an interrupt is thrown and we stop waiting. So the directory is then closed out from under the merge thread.
>>
>> So really, mostly a test issue it seems? So I changed our jetty instances in tests to a 30 second graceful shutdown. Tests went from 6 minutes for me to 33 minutes. I won't make this fix for now :) One idea is to perhaps do it just for this test - but even then it makes the test *much* longer, and there is no reason it can't happen on other tests that use jetty instances. It just happens to only show up in this test currently, AFAICT.
>>
>> On May 25, 2012, at 5:30 AM, Apache Jenkins Server wrote:
>>> Build: https://builds.apache.org/job/Solr-trunk/1865/
>>>
>>> 1 tests failed.
>>>
>>> REGRESSION:  org.apache.solr.cloud.RecoveryZkTest.testDistribSearch
>>>
>>> Error Message:
>>> Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,]
>>>
>>> Stack Trace:
>>> java.lang.RuntimeException: Thread threw an uncaught exception, thread: Thread[Lucene Merge Thread #2,6,]
>>> 	at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
>>> 	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
>>> 	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>>> 	at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
>>> 	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>>> 	at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
>>> 	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>> 	at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
>>> 	at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
>>> 	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
>>> 	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>> 	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
>>> 	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
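The hazard Mark describes - close waits for merges, an interrupt cuts the wait short, and the directory then gets closed out from under a still-running merge thread - comes down to how the wait handles InterruptedException. A minimal sketch of a wait that at least reports the problem instead of silently proceeding (hypothetical helper, not Solr or Lucene code):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class GracefulClose {
    // Wait for background merges (modeled here by a latch) to finish before
    // the caller releases shared resources such as the directory. On
    // interrupt, stop waiting but restore the interrupt flag and return
    // false, so the caller knows merges may still be running and must not
    // close resources the merge thread depends on yet.
    static boolean awaitMerges(CountDownLatch mergesDone, long timeoutMs) {
        try {
            return mergesDone.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }
}
```

A caller that closes the directory only when this returns true avoids the AlreadyClosedException path in the stack trace above; ignoring the false return reproduces exactly the "directory closed under the merge thread" failure.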
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On Fri, May 25, 2012 at 3:44 PM, Mark Miller <markrmil...@gmail.com> wrote:
> On May 25, 2012, at 8:11 AM, Sami Siren wrote:
>> Just thinking out loud... shouldn't solr(cloud) manage such a situation gracefully?
>
> Currently, you can handle it gracefully if you up the graceful timeout in jetty. It's easy enough to do that with the jetty we ship, but it's painful (extremely, it seems) to do it in tests.
>
> In any case, I don't think it hurts anything practically?

That was my point.

> the test framework picks up an uncaught exception in the thread, and our goose is cooked.

By ignoring the exception I was trying to say that it should be ignored from the POV of the test framework, i.e. not fail the build. I now understand that it might not actually solve the issue...

--
Sami Siren
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java7-64 #122
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java7-64/122/
[jira] [Commented] (SOLR-2822) don't run update processors twice
[ https://issues.apache.org/jira/browse/SOLR-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283361#comment-13283361 ]

Mark Miller commented on SOLR-2822:
-----------------------------------

+1 - the approach seems as elegant as we could shoot for right now. I much prefer it to juggling multiple chains.

I still worry about the 'clone doc' issue and update procs between distrib and run - if we do decide to not let procs live there, we should probably hard fail on it.

Latest patch looks good to me - let's commit and iterate on trunk.

> don't run update processors twice
> ---------------------------------
>
>                 Key: SOLR-2822
>                 URL: https://issues.apache.org/jira/browse/SOLR-2822
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud, update
>            Reporter: Yonik Seeley
>             Fix For: 4.0
>
>         Attachments: SOLR-2822.patch, SOLR-2822.patch, SOLR-2822.patch
>
> An update will first go through processors until it gets to the point where it is forwarded to the leader (or forwarded to replicas if already on the leader). We need a way to skip over the processors that were already run (perhaps by using a processor chain dedicated to sub-updates?)
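The "skip processors already run" idea can be sketched as a chain that records how far it got before forwarding, so the receiving node resumes after that point instead of juggling a second chain. All names here are hypothetical; Solr's actual mechanism differs:

```java
import java.util.List;

class ResumableChain {
    interface Processor { void process(StringBuilder doc); }

    private final List<Processor> processors;

    ResumableChain(List<Processor> processors) {
        this.processors = processors;
    }

    // startAt == 0 on the node that first receives the update; a forwarded
    // sub-update carries the index just past the distrib point, so the
    // processors before it are not run a second time on the leader/replicas.
    void run(StringBuilder doc, int startAt) {
        for (int i = startAt; i < processors.size(); i++) {
            processors.get(i).process(doc);
        }
    }
}
```

The design trade-off Mark alludes to: a single chain with a resume index keeps one definition of the pipeline, whereas a dedicated sub-update chain duplicates configuration and can drift out of sync.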
[jira] [Resolved] (SOLR-3429) new GatherTransformer
[ https://issues.apache.org/jira/browse/SOLR-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Bricconi resolved SOLR-3429.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 4.0

I propose this implementation

> new GatherTransformer
> ---------------------
>
>                 Key: SOLR-3429
>                 URL: https://issues.apache.org/jira/browse/SOLR-3429
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>            Reporter: Giovanni Bricconi
>            Priority: Minor
>              Labels: json
>             Fix For: 4.0
>
>         Attachments: SOLR-3429.patch
>
> This is a new transformer for DIH. I'm often asked to import a lot of fields; many of these fields are read-only and should not be searched. I found it useful to gather them into a single JSON field and return them untouched to the client. This patch provides a transformer that collects a list of db columns and writes out a JSON map that contains all of them. A regression test is included. A new dependency on jsonic has been added to DIH (already used by langid); I can use a different library if needed.
[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support
[ https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283376#comment-13283376 ]

Magnus commented on LUCENE-3131:
--------------------------------

Is an XA wrapper on the roadmap?

> XA Resource/Transaction support
> -------------------------------
>
>                 Key: LUCENE-3131
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3131
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 3.1
>            Reporter: Magnus
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.1.1
>
> Please add XAResource/XATransaction support into Lucene core.
[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283382#comment-13283382 ]

Mark Miller commented on SOLR-3488:
-----------------------------------

I'll post an initial patch just for create soon. It's just a start though. I've added a bunch of comments for TODOs or things to consider for the future. I'd like to start simple just to get 'something' in though.

So initially, you can create a new collection and pass an existing collection name to determine which shards it's created on. It would also be nice to be able to explicitly pass the shard urls to use, as well as to simply ask for X shards, Y replicas - in that case, perhaps the leader could handle ensuring that. You might also want to be able to simply say: create it on all known shards.

Further things to look at:
* other commands, like remove/delete.
* what to do when some create calls fail? Should we instead add a create node to a queue in zookeeper, and make the overseer responsible for checking for any jobs there, completing them (if needed), and then removing the job from the queue? Other ideas?

> Create a Collections API for SolrCloud
> --------------------------------------
>
>                 Key: SOLR-3488
>                 URL: https://issues.apache.org/jira/browse/SOLR-3488
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Assignee: Mark Miller
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283392#comment-13283392 ]

Alan Woodward commented on LUCENE-2878:
---------------------------------------

I think my next step is to have a go at implementing ReqOptSumScorer and RelExclScorer, so that all the BooleanQuery cases work. Testing it via the PosHighlighter seems to be the way to go as well. This might take a little longer, in that it will require me to actually think about what I'm doing...

> Allow Scorer to expose positions and payloads aka. nuke spans
> -------------------------------------------------------------
>
>                 Key: LUCENE-2878
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2878
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: Positions Branch
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor
>             Fix For: Positions Branch
>
>         Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch
>
> Currently we have two somewhat separate types of queries: the ones which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature: they can not score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. I would have done that first cut on trunk, but TermScorer works on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API and others simply return null instead. To show that the API really works, and that our BulkPostings work fine with positions too, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Positions BulkReading implementation got some exercise, and it now :) works fully with positions, while payloads for bulkreading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today), including the ones that need payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't look into the MemoryIndex BulkPostings API yet).
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #201
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/201/
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On May 25, 2012, at 9:00 AM, Sami Siren wrote:

> by ignoring the exception I was trying to say that it should be ignored from the POV of the test framework, i.e. not fail the build. I now understand that it might not actually solve the issue...

Yeah, I suppose if we could tell the test framework "for this test, ignore this expected uncaught exception", that might help. Usually you can work around this type of thing more cleanly though - so I don't know if it's worth the effort or extra code. If this ends up being its only use case, it's hard to argue we should add the capability. And I suspect it would mean conning Dawid into sucking it up and updating our test jars? I think it also has a lot of potential for abuse. But the fail sucks too, so I don't know...

- Mark Miller
lucidimagination.com
[jira] [Created] (SOLR-3489) Config file replication less error prone
Jochen Just created SOLR-3489:
---------------------------------

             Summary: Config file replication less error prone
                 Key: SOLR-3489
                 URL: https://issues.apache.org/jira/browse/SOLR-3489
             Project: Solr
          Issue Type: Improvement
          Components: replication (java)
    Affects Versions: 3.6
            Reporter: Jochen Just
            Priority: Minor

If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example:

{{<!-- the error in the configuration is the space before stopwords.txt. Because of that, that file is not replicated --> <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>}}

It would be nice if that space would simply be ignored.
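The fix Jochen asks for amounts to trimming each name when splitting the comma-separated confFiles value. A sketch of the intended behavior (not ReplicationHandler's actual parsing code):

```java
import java.util.ArrayList;
import java.util.List;

class ConfFilesParser {
    // Split "schema.xml,test.txt, stopwords.txt" on commas and trim stray
    // whitespace, so the accidental space before "stopwords.txt" no longer
    // prevents that file from being replicated. Empty entries (e.g. from a
    // trailing comma) are dropped as well.
    static List<String> parse(String confFiles) {
        List<String> names = new ArrayList<String>();
        for (String name : confFiles.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }
}
```

With this parsing, the configuration from the report yields the three file names schema.xml, test.txt, and stopwords.txt, matching what the reporter expects.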
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jochen Just updated SOLR-3489:
------------------------------
    Description:
If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example:
{code:xml}
<!-- The error in the configuration is the space before stopwords.txt. Because of that, that file is not replicated -->
<str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
{code}
It would be nice if that space would simply be ignored.

  was:
If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example:
{{<!-- the error in the configuration is the space before stopwords.txt. Because of that, that file is not replicated --> <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>}}
It would be nice if that space would simply be ignored.

> Config file replication less error prone
> ----------------------------------------
>
>                 Key: SOLR-3489
>                 URL: https://issues.apache.org/jira/browse/SOLR-3489
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>    Affects Versions: 3.6
>            Reporter: Jochen Just
>            Priority: Minor
>
> If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example:
> {code:xml}
> <!-- The error in the configuration is the space before stopwords.txt. Because of that, that file is not replicated -->
> <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
> {code}
> It would be nice if that space would simply be ignored.
Re: Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #175
This def happens more with Java 7 for me. Rather than seeing it like 1 out of 100 at best, it is now happening about 1 out of 20. Going on vacation for a week, so not sure if I will figure this out anytime soon, but now at least I can try some things and get more rapid and trustworthy feedback.

On Thu, May 24, 2012 at 11:59 PM, Mark Miller <markrmil...@gmail.com> wrote:
> Just noticed this seems to happen fairly frequently in the java 7 windows build, but I don't seem to see it in the java 6 windows build. I'll try and use Java 7 on my win machine when I get a chance - should make it easier to experiment with fixes if I can get the same results locally.

- Mark Miller
lucidimagination.com
[jira] [Assigned] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index
[ https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned SOLR-2923:
---------------------------------

    Assignee: Mark Miller

> IllegalArgumentException when using useFilterForSortedQuery on an empty index
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-2923
>                 URL: https://issues.apache.org/jira/browse/SOLR-2923
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.6, 4.0
>            Reporter: Adrien Grand
>            Assignee: Mark Miller
>            Priority: Trivial
>
>         Attachments: SOLR-2923.patch
>
> An IllegalArgumentException can occur under the following circumstances:
> - the index is empty,
> - {{useFilterForSortedQuery}} is enabled,
> - {{queryResultCache}} is disabled.
>
> Here is what the exception and its stack trace look like (Solr trunk):
> {quote}
> numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
> java.lang.IllegalArgumentException: numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
> 	at org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
> 	at org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
> 	at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
> 	at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
> 	...
> {quote}
> To reproduce this error from a fresh copy of Solr trunk, edit {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a query which sorts against any field ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).
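The exception message in SOLR-2923 points at the missing guard: Lucene's TopFieldCollector requires numHits > 0, so the zero-hit case (here, sorting a doc set from an empty index) has to short-circuit to a count-only path instead. A simplified model of that guard (not SolrIndexSearcher's actual code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedTopDocs {
    // Return the n smallest doc ids in sorted order, but short-circuit when
    // n == 0 or the doc set is empty, instead of asking a top-N collector
    // for a zero-sized queue (which is what threw IllegalArgumentException
    // in the report above). A real fix would keep the total hit count via a
    // count-only collector on this branch.
    static List<Integer> topN(List<Integer> docs, int n) {
        if (n <= 0 || docs.isEmpty()) {
            return Collections.emptyList();
        }
        List<Integer> sorted = new ArrayList<Integer>(docs);
        Collections.sort(sorted);
        return sorted.subList(0, Math.min(n, sorted.size()));
    }
}
```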
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jochen Just updated SOLR-3489:
------------------------------

    Attachment: SOLR-3489_reproducing_config.tar.gz

Steps to reproduce:
# unpack SOLR-3489_reproducing_config.tar.gz into the solr-example directory
# start the master via {{java -Denable.master=true -Dsolr.solr.home=master -jar start.jar}}
# start the slave via {{java -Denable.slave=true -Dsolr.solr.home=slave -Djetty.port=8984 -jar start.jar}}
# add the document in master/singledoc.xml to the master
# either replicate manually or wait 60 seconds

Result:
* test.txt will be replicated
* stopwords.txt won't

> Config file replication less error prone
> ----------------------------------------
>
>                 Key: SOLR-3489
>                 URL: https://issues.apache.org/jira/browse/SOLR-3489
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>    Affects Versions: 3.6
>            Reporter: Jochen Just
>            Priority: Minor
>
>         Attachments: SOLR-3489_reproducing_config.tar.gz
>
> If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example:
> {code:xml}
> <!-- The error in the configuration is the space before stopwords.txt. Because of that, that file is not replicated -->
> <str name="confFiles">schema.xml,test.txt, stopwords.txt</str>
> {code}
> It would be nice if that space would simply be ignored.
[jira] [Comment Edited] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283456#comment-13283456 ] Jochen Just edited comment on SOLR-3489 at 5/25/12 2:14 PM:
Steps to reproduce:
# unpack [^SOLR-3489_reproducing_config.tar.gz] into the solr-example directory
# start the master via {{java -Denable.master=true -Dsolr.solr.home=master -jar start.jar}}
# start the slave via {{java -Denable.slave=true -Dsolr.solr.home=slave -Djetty.port=8984 -jar start.jar}}
# add the document in master/singledoc.xml to the master
# either replicate manually or wait 60 seconds
Result:
* test.txt will be replicated
* stopwords.txt won't
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On Fri, May 25, 2012 at 9:54 AM, Mark Miller markrmil...@gmail.com wrote:
On May 25, 2012, at 9:00 AM, Sami Siren wrote: by ignoring the exception I was trying to say that it should be ignored from the POV of the test framework, i.e. not fail the build. I now understand that it might not actually solve the issue...
Yeah, I suppose if we could tell the test framework, for this test, ignore this expected uncaught exception, that might help. Usually you can work around this type of thing more cleanly though - so I don't know if it's worth the effort or extra code - if this ends up being its only use case, it's hard to argue we add the capability. And I suspect it would mean conning Dawid to suck it up and update our test jars? I think it also has a lot of potential for abuse.
the exception-from-another-thread is just an uncaught exception handler. you can replace it with your own that handles things differently, and restore the old one back. Here's an example of one that does this when the exception is really a jvm bug: http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/miscellaneous/PatternAnalyzerTest.java
-- lucidimagination.com
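The replace-and-restore pattern described above can be sketched as follows. This is a minimal illustration of swapping the default uncaught-exception handler around a worker thread, not the PatternAnalyzerTest code itself; the class and method names are invented:

```java
public class UncaughtHandlerSwap {
    // Run a task on another thread while temporarily installing our own
    // default uncaught-exception handler, then restore the previous one.
    // Returns whatever uncaught throwable the task produced, or null.
    public static Throwable runCapturing(Runnable task) {
        Thread.UncaughtExceptionHandler saved = Thread.getDefaultUncaughtExceptionHandler();
        final Throwable[] caught = new Throwable[1];
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> caught[0] = e);
        try {
            Thread worker = new Thread(task);
            worker.start();
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            // Always restore the old handler so other tests are unaffected.
            Thread.setDefaultUncaughtExceptionHandler(saved);
        }
        return caught[0];
    }
}
```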
[jira] [Commented] (SOLR-3486) The memory size of Solr caches should be configurable
[ https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283461#comment-13283461 ] Shawn Heisey commented on SOLR-3486: It's going to take me a while to digest what you've just said, but my first thought is that I can't change the implementation without destroying the O(1) nature. The cache is implemented in two parts - a simple map (HashMap) for fast lookup, and an array of sets (LinkedHashSet[]) for fast frequency ordering. When the frequency for an entry needs to be changed, it is removed from one set and added to another. Although it's not implemented as an actual iterator method, I have code to iterate over the array. I should probably create an iterator and backwards iterator, just to eliminate some duplicate code. If I don't already have a remove method, I should be able to add one.
The memory size of Solr caches should be configurable
Key: SOLR-3486 URL: https://issues.apache.org/jira/browse/SOLR-3486 Project: Solr Issue Type: Improvement Components: search Reporter: Adrien Grand Priority: Minor Attachments: SOLR-3486.patch, SOLR-3486.patch
It is currently possible to configure the sizes of Solr caches based on the number of entries of the cache. The problem is that the memory size of cached values may vary a lot over time (depending on IndexReader.maxDoc and the queries that are run) although the JVM heap size does not. Having a configurable max size in bytes would also help optimize cache utilization, making it possible to store more values provided that they have a small memory footprint.
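The two-part structure Shawn describes - a HashMap for O(1) lookup plus an array of LinkedHashSets acting as frequency buckets - might look roughly like this. A toy sketch, not the SOLR-3486 patch; all names here are invented:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Sketch of a frequency-ordered cache skeleton: lookups hit the map in O(1);
// bumping an entry's frequency is a remove from one LinkedHashSet and an add
// to the next, both O(1), preserving insertion order within each bucket.
public class FrequencyBuckets<K, V> {
    private static class Entry<V> { V value; int freq; Entry(V v) { value = v; } }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final LinkedHashSet<K>[] buckets;

    @SuppressWarnings("unchecked")
    public FrequencyBuckets(int maxFreq) {
        buckets = new LinkedHashSet[maxFreq + 1];
        for (int i = 0; i <= maxFreq; i++) buckets[i] = new LinkedHashSet<>();
    }

    public void put(K key, V value) {
        map.put(key, new Entry<>(value));
        buckets[0].add(key);              // new entries start at frequency 0
    }

    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        if (e.freq + 1 < buckets.length) { // move key between frequency buckets
            buckets[e.freq].remove(key);
            buckets[++e.freq].add(key);
        }
        return e.value;
    }

    public int frequencyOf(K key) {
        Entry<V> e = map.get(key);
        return e == null ? -1 : e.freq;
    }
}
```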
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Just updated SOLR-3489: -- Attachment: SOLR-3489.patch The attached patch should solve that problem.
[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283467#comment-13283467 ] sebastian L. commented on LUCENE-3440: -- Hi Koji, hi Simon, if there is something to do for me, please let me know. Maybe it would be better to split the patch into several smaller ones, e.g.
1. Use getters/setters where possible in FVH
2. Make FieldFragList an interface and BaseFieldFragList an abstract class
3. Introduce SimpleFieldFragList and SimpleFragListBuilder as the default
4. Introduce WeightedFieldFragList and WeightedFragListBuilder
5. Integration into Solr
When's the 4.0 release scheduled, anyway? A patch for trunk 1342490 is on its way.
FastVectorHighlighter: IDF-weighted terms for ordered fragments
Key: LUCENE-3440 URL: https://issues.apache.org/jira/browse/LUCENE-3440 Project: Lucene - Java Issue Type: Improvement Components: modules/highlighter Reporter: sebastian L. Priority: Minor Labels: FastVectorHighlighter Fix For: 4.0 Attachments: LUCENE-3440.patch, LUCENE-3440.patch, LUCENE-3440_3.6.1-SNAPSHOT.patch, LUCENE-4.0-SNAPSHOT-3440-9.patch, weight-vs-boost_table01.html, weight-vs-boost_table02.html
The FastVectorHighlighter gives every term found in a fragment an equal weight, which causes a higher ranking for fragments with a high number of words or, in the worst case, a high number of very common words, than for fragments that contain *all* of the terms used in the original query. This patch provides ordered fragments with IDF-weighted terms: total weight = total weight + IDF for unique term per fragment * boost of query. The ranking formula should be the same as, or at least similar to, the one used in org.apache.lucene.search.highlight.QueryTermScorer. The patch is simple, but it works for us. Some ideas:
- A better approach would be moving the whole fragment scoring into a separate class.
- Switch scoring via a parameter
- Exact phrases should be given an even better score, regardless of whether a phrase query was executed or not
- edismax/dismax parameters pf, ps and pf^boost should be observed and corresponding fragments should be ranked higher
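The weighting rule in the description (each unique query term in a fragment contributes its IDF times the query boost) can be sketched as below. The IDF form used here (1 + log(N / (1 + df))) and all names are illustrative, not the patch's exact code:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Fragments covering more *distinct* query terms outrank fragments that
// merely repeat one very common term, because each unique term adds its
// IDF once and repeats add nothing.
public class WeightedFragmentScorer {
    public static double score(List<String> fragmentTerms,
                               Map<String, Integer> docFreq,
                               int numDocs, float queryBoost) {
        Set<String> seen = new HashSet<>();
        double total = 0.0;
        for (String term : fragmentTerms) {
            // seen.add returns false on duplicates, so each term counts once
            if (docFreq.containsKey(term) && seen.add(term)) {
                total += idf(docFreq.get(term), numDocs) * queryBoost;
            }
        }
        return total;
    }

    static double idf(int df, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (1 + df));
    }
}
```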
[jira] [Commented] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283469#comment-13283469 ] Jochen Just commented on SOLR-3489: --- The patch is based on branch lucene_solr_36
[jira] [Updated] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sebastian L. updated LUCENE-3440: - Attachment: LUCENE-3440.patch Patch for trunk (1342490)
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Just updated SOLR-3489: -- Attachment: (was: SOLR-3489_reproducing_config.tar.gz)
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Just updated SOLR-3489: -- Attachment: SOLR-3489_reproducing_config.tar.gz
[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283478#comment-13283478 ] Sami Siren commented on SOLR-3488: -- bq. should we instead add a create node to a queue in zookeeper? Make the overseer responsible for checking for any jobs there, completing them (if needed) and then removing the job from the queue? I like this idea; I would also refactor the current ZkController-Overseer communication to use this same technique.
Create a Collections API for SolrCloud
Key: SOLR-3488 URL: https://issues.apache.org/jira/browse/SOLR-3488 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller
[jira] [Commented] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283485#comment-13283485 ] Jack Krupansky commented on SOLR-3489: -- It would be nice to add similar protection against a space before and after the colon for aliases, as well as a check for an empty name before and after the colon.
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Just updated SOLR-3489: -- Attachment: SOLR-3489_reproducing_config.tar.gz
[jira] [Updated] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jochen Just updated SOLR-3489: -- Attachment: (was: SOLR-3489_reproducing_config.tar.gz)
[jira] [Commented] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283488#comment-13283488 ] Jochen Just commented on SOLR-3489: --- I will look into that, but not before next week, I guess
[jira] [Commented] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index
[ https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283493#comment-13283493 ] Mark Miller commented on SOLR-2923: --- patch looks good to me
IllegalArgumentException when using useFilterForSortedQuery on an empty index
Key: SOLR-2923 URL: https://issues.apache.org/jira/browse/SOLR-2923 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.6, 4.0 Reporter: Adrien Grand Assignee: Mark Miller Priority: Trivial Attachments: SOLR-2923.patch
An IllegalArgumentException can occur under the following circumstances:
- the index is empty,
- {{useFilterForSortedQuery}} is enabled,
- {{queryResultCache}} is disabled.
Here is what the exception and its stack trace look like (Solr trunk):
{quote}
numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
java.lang.IllegalArgumentException: numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
at org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
at org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
...
{quote}
To reproduce this error from a fresh copy of Solr trunk, edit {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a query which sorts against any field ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).
[jira] [Updated] (LUCENE-4072) CharFilter that Unicode-normalizes input
[ https://issues.apache.org/jira/browse/LUCENE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4072: Attachment: LUCENE-4072.patch attached is the filter, turned into a patch. however, I added an additional random test and it currently fails... will look into this more.
CharFilter that Unicode-normalizes input
Key: LUCENE-4072 URL: https://issues.apache.org/jira/browse/LUCENE-4072 Project: Lucene - Java Issue Type: New Feature Components: modules/analysis Reporter: Ippei UKAI Attachments: LUCENE-4072.patch, ippeiukai-ICUNormalizer2CharFilter-4752cad.zip
I'd like to contribute a CharFilter that Unicode-normalizes input with ICU4J. The benefit of having this process as a CharFilter is that the tokenizer can work on normalized text, while offset correction ensures that the fast vector highlighter and other offset-dependent features do not break. The implementation is available at the following repository: https://github.com/ippeiukai/ICUNormalizer2CharFilter Unfortunately this is my unpaid side project and I cannot spend much time merging my work into Lucene to make an appropriate patch. I'd appreciate it if anyone could give it a go. I'm happy to relicense it to whatever meets your needs.
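The contribution uses ICU4J's Normalizer2; as a stdlib stand-in, the sketch below uses java.text.Normalizer to show why offset correction matters: normalization can change string length, so token offsets computed over normalized text no longer match the raw input. The class name is invented for illustration:

```java
import java.text.Normalizer;

public class NormalizeDemo {
    // NFKC compatibility normalization, e.g. the single ligature character
    // U+FB01 ("fi") expands to the two characters "fi", changing the length
    // of the text and therefore every downstream offset.
    public static String nfkc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }
}
```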
[jira] [Commented] (LUCENE-3440) FastVectorHighlighter: IDF-weighted terms for ordered fragments
[ https://issues.apache.org/jira/browse/LUCENE-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283496#comment-13283496 ] Koji Sekiguchi commented on LUCENE-3440: Hi sebastian! bq. Maybe it would be better to split the patch in several smaller ones, e.g. This is a great idea and it helps me a lot! If you could provide them one by one for trunk, I think I can review the smaller patches and commit them one by one.
[jira] [Commented] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283510#comment-13283510 ] Yonik Seeley commented on SOLR-3488:
bq. should we instead add a create node to a queue in zookeeper?
Yeah, a work queue in ZK makes perfect sense. Perhaps serialize the params to a JSON map/object per line?
Possible parameters:
- name of the collection
- the config for the collection
- number of shards in the new collection
- default replication factor
Operations:
- add a collection
- remove a collection - different options here... leave cores up, bring cores down, completely remove cores (and data)
- change collection properties (replication factor, config)
- expand collection (split shards)
- add/remove a collection alias
Shard operations:
- add a shard (more for custom sharding)
- remove a shard
- change shard properties (replication factor)
- split a shard
- add/remove a shard alias
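One queue entry as discussed - one operation serialized as a small JSON object - might look like the sketch below. The field names ("operation", "numShards", etc.) and the class are hypothetical illustrations, not a committed format:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CollectionOp {
    // Build the JSON body for a hypothetical "create collection" work-queue
    // entry, covering the parameters listed in the comment above.
    public static String createCollectionJson(String name, String config,
                                              int numShards, int replicationFactor) {
        Map<String, Object> op = new LinkedHashMap<>();
        op.put("operation", "createcollection");
        op.put("name", name);
        op.put("collection.configName", config);
        op.put("numShards", numShards);
        op.put("replicationFactor", replicationFactor);
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : op.entrySet()) {
            if (!first) sb.append(",");
            first = false;
            sb.append("\"").append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof String) sb.append("\"").append(v).append("\"");
            else sb.append(v);           // numbers are emitted unquoted
        }
        return sb.append("}").toString();
    }
}
```

The overseer would then poll the queue, execute each entry, and delete the node once done, as proposed in the thread.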
[jira] [Comment Edited] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283510#comment-13283510 ] Yonik Seeley edited comment on SOLR-3488 at 5/25/12 2:58 PM:
-

bq. should we instead add a create node to a queue in zookeeper?

Yeah, a work queue in ZK makes perfect sense. Perhaps serialize the params to a JSON map/object per line?

edit: or perhaps it makes more sense for each operation to be a separate file (which is what I think you wrote anyway)

Possible parameters:
- name of the collection
- the config for the collection
- number of shards in the new collection
- default replication factor

Operations:
- add a collection
- remove a collection - different options here... leave cores up, bring cores down, completely remove cores (and data)
- change collection properties (replication factor, config)
- expand collection (split shards)
- add/remove a collection alias

Shard operations:
- add a shard (more for custom sharding)
- remove a shard
- change shard properties (replication factor)
- split a shard
- add/remove a shard alias
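To make Yonik's suggestion concrete, a purely hypothetical sketch of what ZK work-queue entries might look like, with one JSON object per operation (the operation names, field names, and values below are illustrative only, not part of any committed API):

```json
{"operation": "createcollection", "name": "collection2", "config": "myconf", "numShards": 4, "replicationFactor": 2}
{"operation": "splitshard", "collection": "collection2", "shard": "shard1"}
```

The overseer would watch the queue node, execute each operation, and delete the entry (or, per the edit above, each operation could live in its own child znode).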
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069:
-
Attachment: PrimaryKey40PerformanceTestSrc.zip
BloomFilterCodec40.patch

Segment-level Bloom filters for a 2 x speed up on rare term searches
--
Key: LUCENE-4069
URL: https://issues.apache.org/jira/browse/LUCENE-4069
Project: Lucene - Java
Issue Type: Improvement
Components: core/index
Affects Versions: 3.6
Reporter: Mark Harwood
Priority: Minor
Fix For: 3.6.1
Attachments: BloomFilterCodec40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip

An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields, e.g. primary keys on big indexes with many segments, but it also speeds up general searching in my tests.

Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments
Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU

Patch based on 3.6 codebase attached. There are no API changes currently - to play, just add a field with _blm on the end of the name to invoke the special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to the APIs to configure the service properly.
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069:
-
Attachment: BloomFilterCodec40.patch
PrimaryKey40PerformanceTestSrc.zip

I've ported this Bloom filtering code to work as a 4.0 Codec now. I see a 35% improvement over the standard Codec on random lookups on a warmed index. I also notice that the PulsingCodec is no longer faster than the standard Codec - is this news to people? I thought it was supposed to be the way forward.

My test rig (adapted from Mike's original primary key test rig here: http://blog.mikemccandless.com/2010/06/lucenes-pulsingcodec-on-primary-key.html) is attached as a zip. The new BloomFilteringCodec is also attached here as a patch. Searches against plain text fields also look to be faster (using AOL500k queries searching Wikipedia English), but obviously that particular test rig is harder to include as an attachment here.

I can open a separate JIRA issue for this 4.0 version of the code if that makes more sense.
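The fast-fail idea behind the patch can be shown with a minimal sketch (this is not Mark Harwood's code; the class and method names are hypothetical): a per-segment bit set of term hashes can answer "definitely absent" without touching the term dictionary, and only a "maybe present" answer falls through to a real disk lookup.

```java
import java.util.BitSet;

/** Minimal illustrative sketch of a per-segment Bloom filter for terms.
 *  Two cheap hash functions set/check bits; a negative answer is exact,
 *  a positive answer may be a false positive. */
public class SegmentBloomFilter {
    private final BitSet bits;
    private final int size;

    public SegmentBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    // two simple hash functions derived from String.hashCode
    private int h1(String term) { return Math.floorMod(term.hashCode(), size); }
    private int h2(String term) { return Math.floorMod(term.hashCode() * 31 + 7, size); }

    /** Called once per indexed term when the segment is written. */
    public void add(String term) {
        bits.set(h1(term));
        bits.set(h2(term));
    }

    /** false means the term is definitely not in this segment, so the
     *  term dictionary and postings need not be read at all. */
    public boolean mightContain(String term) {
        return bits.get(h1(term)) && bits.get(h2(term));
    }
}
```

Because Bloom filters have no false negatives, a query for a primary key can skip every segment whose filter says "absent", which is where the reported speed-up on many-segment indexes comes from.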
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069:
-
Attachment: (was: BloomFilterCodec40.patch)
[jira] [Updated] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Harwood updated LUCENE-4069:
-
Attachment: (was: PrimaryKey40PerformanceTestSrc.zip)
[jira] [Comment Edited] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283382#comment-13283382 ] Mark Miller edited comment on SOLR-3488 at 5/25/12 3:12 PM:

I'll post an initial patch just for create soon. It's just a start though. I've added a bunch of comments for TODOs or things to consider for the future. I'd like to start simple just to get 'something' in though.

So initially, you can create a new collection and pass an existing collection name to determine which shards it's created on. It would also be nice to be able to explicitly pass the shard urls to use, as well as to simply ask for X shards, Y replicas. In that case, perhaps the -leader- overseer could handle ensuring that. You might also want to be able to simply say: create it on all known shards.

Further things to look at:
* other commands, like remove/delete.
* what to do when some create calls fail? Should we instead add a create node to a queue in zookeeper? Make the overseer responsible for checking for any jobs there, completing them (if needed) and then removing the job from the queue? Other ideas?
[jira] [Resolved] (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
[ https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Dyer resolved SOLR-2058.
--
Resolution: Fixed
Assignee: James Dyer

Committed to Trunk, r1342681. This is the May 17, 2012 patch, which is a touched-up version of Ron Mayer's work from August 31, 2010 (thank you!).

Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax
--
Key: SOLR-2058
URL: https://issues.apache.org/jira/browse/SOLR-2058
Project: Solr
Issue Type: Improvement
Components: query parsers
Environment: n/a
Reporter: Ron Mayer
Assignee: James Dyer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2058-and-3351-not-finished.patch, SOLR-2058.patch, edismax_pf_with_slop_v2.1.patch, edismax_pf_with_slop_v2.patch, pf2_with_slop.patch

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3E
{quote}
From Ron Mayer r...@0ape.com
... my results might be even better if I had a couple different pf2s with different ps's at the same time. In particular, one with ps=0 to put a high boost on ones that have the right ordering of words - for example, ensuring that [the query] red hat black jacket boosts only documents with red hats and not black hats. And another pf2 with a more modest boost with ps=5 or so to handle the query above, also boosting docs with red baseball hat.
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3E]
{quote}
From Yonik Seeley yo...@lucidimagination.com
Perhaps fold it into the pf/pf2 syntax?
pf=text^2 // current syntax... makes phrases with a boost of 2
pf=text~1^2 // proposed syntax... makes phrases with a slop of 1 and a boost of 2
That actually seems pretty natural given the lucene query syntax - an actual boosted sloppy phrase query already looks like {{text:"foo bar"~1^2}}
-Yonik
{quote}
[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3E]
{quote}
From Chris Hostetter hossman_luc...@fucit.org
Big +1 to this idea ... the existing ps param can stick around as the default for any field that doesn't specify its own slop in the pf/pf2/pf3 fields using the ~ syntax.
-Hoss
{quote}
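Putting the committed field~slop^boost syntax together with Ron's use case, a hedged example of edismax request parameters (the field name text and the boost values are illustrative, not from the issue): the same field appears twice in pf2, once with slop 0 and a high boost to reward exact word order, and once with slop 5 and a modest boost to also reward near matches.

```
q=red hat black jacket
defType=edismax
qf=text
pf2=text~0^10 text~5^3
```

With these parameters, "red hats" documents get the big slop-0 boost, while "red baseball hat" documents still pick up the smaller slop-5 boost.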
[jira] [Resolved] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index
[ https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-2923.
---
Resolution: Fixed
Fix Version/s: 4.0

Thanks Adrien!

IllegalArgumentException when using useFilterForSortedQuery on an empty index
-
Key: SOLR-2923
URL: https://issues.apache.org/jira/browse/SOLR-2923
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.6, 4.0
Reporter: Adrien Grand
Assignee: Mark Miller
Priority: Trivial
Fix For: 4.0
Attachments: SOLR-2923.patch

An IllegalArgumentException can occur under the following circumstances:
- the index is empty,
- {{useFilterForSortedQuery}} is enabled,
- {{queryResultCache}} is disabled.

Here is what the exception and its stack trace look like (Solr trunk):
{quote}
numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
java.lang.IllegalArgumentException: numHits must be > 0; please use TotalHitCountCollector if you just need the total hit count
at org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:917)
at org.apache.solr.search.SolrIndexSearcher.sortDocSet(SolrIndexSearcher.java:1741)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1211)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:353)
...
{quote}

To reproduce this error from a fresh copy of Solr trunk, edit {{example/solr/conf/solrconfig.xml}} to disable {{queryResultCache}} and enable {{useFilterForSortedQuery}}. Then run {{ant run-example}} and issue a query which sorts against any field ({{http://localhost:8983/solr/select?q=*:*&sort=manu+desc}} for example).
[jira] [Commented] (SOLR-2923) IllegalArgumentException when using useFilterForSortedQuery on an empty index
[ https://issues.apache.org/jira/browse/SOLR-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283559#comment-13283559 ] Adrien Grand commented on SOLR-2923:

Hi Mark, thanks for the review!
[jira] [Updated] (SOLR-3488) Create a Collections API for SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-3488:
--
Attachment: SOLR-3488.patch

I'm going on vacation for a week, so here is my early work on just getting something basic going. It does not involve any overseer stuff yet. Someone should feel free to take it - commit it and iterate, or iterate in patch form - whatever makes sense. I'll help when I get back if there is more to do, and if no one makes any progress, I'll continue on it when I get back.

Currently, I've copied the core admin handler pattern and made a collections handler. There is one simple test, and currently the only way to choose which nodes the collection is put on is to give an existing template collection. The test asserts nothing at the moment - all very early work. But I imagine we will be changing direction a fair amount, so that's good I think.
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283574#comment-13283574 ] Michael McCandless commented on LUCENE-4069:

bq. I see a 35% improvement over standard Codecs on random lookups on a warmed index.

Impressive! This is for primary key lookups? It looks like the primary keys are GUID-like, right? (ie randomly generated). I wonder how the results would look if they had some structure instead (eg '%09d' % (id++))...

bq. I also notice that the PulsingCodec is no longer faster than standard Codec - is this news to people as I thought it was supposed to be the way forward?

That's baffling to me: it should only save seeks vs the Lucene40 codec, so on a cold index you should see substantial gains, and on a warm index I'd still expect some gains. Not sure what's up...

bq. I can open a separate JIRA issue for this 4.0 version of the code if that makes more sense.

I think it's fine to do it here? Really, 3.6.x is only for bug fixes now ... so I think we should commit this to trunk.

I wonder if you can wrap any other PostingsFormat (ie instead of hard-coding to Lucene40PostingsFormat)? This way users can wrap any PF they have w/ the bloom filter... Can you use FixedBitSet instead of OpenBitSet? Or is there a reason to use OpenBitSet here...?
[jira] [Updated] (SOLR-3486) The memory size of Solr caches should be configurable
[ https://issues.apache.org/jira/browse/SOLR-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated SOLR-3486:
---
Attachment: LFUMap.java

I've just uploaded LFUMap.java, based on your implementation of LFUCache. To have an LFU cache with a configurable maximum size in bytes, just wrap an instance of this class in a SizableCache. I uploaded this file to show how SizableCache could be used with different kinds of backends, but building an LFUCache is a different issue. I think we should continue the discussion on LFUCache in SOLR-3393 and only discuss the configurability of the memory size of Solr caches here. Feel free to reuse the code in LFUMap.java if you want, just beware that I didn't test it much. :-)

The memory size of Solr caches should be configurable
-
Key: SOLR-3486
URL: https://issues.apache.org/jira/browse/SOLR-3486
Project: Solr
Issue Type: Improvement
Components: search
Reporter: Adrien Grand
Priority: Minor
Attachments: LFUMap.java, SOLR-3486.patch, SOLR-3486.patch

It is currently possible to configure the sizes of Solr caches based on the number of entries in the cache. The problem is that the memory size of cached values may vary a lot over time (depending on IndexReader.maxDoc and the queries that are run) although the JVM heap size does not. Having a configurable max size in bytes would also help optimize cache utilization, making it possible to store more values provided that they have a small memory footprint.
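The idea under discussion - bounding a cache by an estimated size in bytes rather than an entry count - can be sketched as follows. This is not the attached LFUMap.java or the proposed SizableCache; the class name and the caller-supplied byte estimate are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch: a cache bounded by an estimated byte budget.
 *  The caller supplies a byte estimate per value; when the budget is
 *  exceeded, the least-frequently-used entry is evicted. */
public class SizeBoundedLfuCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long bytes;
        long hits;
        Entry(V value, long bytes) { this.value = value; this.bytes = bytes; }
    }

    private final Map<K, Entry<V>> map = new HashMap<>();
    private final long maxBytes;
    private long currentBytes;

    public SizeBoundedLfuCache(long maxBytes) { this.maxBytes = maxBytes; }

    public void put(K key, V value, long estimatedBytes) {
        Entry<V> old = map.remove(key);
        if (old != null) currentBytes -= old.bytes;
        map.put(key, new Entry<>(value, estimatedBytes));
        currentBytes += estimatedBytes;
        // evict least-frequently-used entries until we are back under budget
        while (currentBytes > maxBytes && map.size() > 1) {
            K victim = null;
            long fewestHits = Long.MAX_VALUE;
            for (Map.Entry<K, Entry<V>> e : map.entrySet()) {
                if (e.getValue().hits < fewestHits) {
                    fewestHits = e.getValue().hits;
                    victim = e.getKey();
                }
            }
            currentBytes -= map.remove(victim).bytes;
        }
    }

    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.hits++;
        return e.value;
    }

    public long sizeInBytes() { return currentBytes; }
}
```

Note the naive LFU policy here can evict a just-inserted zero-hit entry; production LFU caches typically age hit counts or protect new entries, which is exactly the kind of detail the SOLR-3393 discussion covers.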
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
Yeah, I suppose if we could tell the test framework, for this test, ignore this expected uncaught exception, that might help.

The point of handling uncaught exceptions, thread leaks, etc. at the test framework's level is in essence to capture bad tests or unanticipated conditions, not failures that we know of or can predict. So is interrupting leaked threads (because there is no other way to do it if you don't know anything about a given thread). This said, there are obviously ways to handle the above situation -- from trying to speed up jetty shutdown to capturing that uncaught error.

Why is jetty so slow to shut down? What does slow mean? Tests went from 6 minutes for me to 33 minutes. I don't think this can be explained by shutting down jetty... this seems too long. Can you provide a repeatable test case that would demonstrate the failure you mentioned? Once I have it, it'll be easier to try to come up with workarounds.

Dawid
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283583#comment-13283583 ] Mark Harwood commented on LUCENE-4069:
--

My current focus is speeding up primary key lookups, but this may have applications outside of that (Zipf tells us there is a lot of low-frequency stuff in free text). Following the principle of "the best IO is no IO", the Bloom filter helps us quickly understand which segments to even bother looking in. That has to be a win overall.

I started trying to write this Codec as a wrapper for any other Codec (it simply listens to a stream of terms and stores a bitset of recorded hashes in a .blm file). However, that was trickier than I expected - I'd need to encode a special entry in my .blm files just to know the name of the delegated codec I needed to instantiate at read time, because Lucene's normal Codec-instantiation logic would be looking for BloomCodec and I'd have to discover the delegate that was used to write all of the non-blm files.

Not looked at FixedBitSet, but I imagine that could be used instead.
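The delegation problem Mark describes - knowing at read time which implementation wrote the non-.blm files - is commonly solved by writing the delegate's name into the wrapper's own file and re-resolving it by name when reading. A minimal sketch of that pattern, under stated assumptions: this uses plain java.io on byte arrays, whereas a real Lucene codec would use IndexOutput/IndexInput and a by-name SPI lookup; class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Sketch: a wrapper format records the name of its delegate in its own
 *  file header so a reader can re-instantiate the right delegate later. */
public class DelegateHeader {
    /** Write the delegate's name ahead of the wrapper's own payload. */
    public static byte[] writeHeader(String delegateName, byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(delegateName); // e.g. "Lucene40" - whoever wrote the other files
            out.write(payload);         // the wrapper's own data (the bloom bitset, say)
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen on an in-memory stream
        }
    }

    /** Read back just the delegate name; the caller would then resolve the
     *  actual delegate by name (Lucene 4.x resolves formats this way via SPI). */
    public static String readDelegateName(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This is essentially the route the discussion converged on: the .blm file carries enough metadata that the Bloom wrapper can delegate to any PostingsFormat rather than hard-coding one.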
[jira] [Commented] (LUCENE-3131) XA Resource/Transaction support
[ https://issues.apache.org/jira/browse/LUCENE-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283588#comment-13283588 ] Michael McCandless commented on LUCENE-3131:

bq. Is XA wrapper in the roadmap?

There is no roadmap in open source. Instead, users and devs scratch their own itches and contribute patches to fix things, add new features, etc. So once someone who understands XA and Lucene contributes/iterates on a patch, then we'll have XA support... it could be someone out there has already built it but just hasn't offered it back yet...

XA Resource/Transaction support
--
Key: LUCENE-3131
URL: https://issues.apache.org/jira/browse/LUCENE-3131
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 3.1
Reporter: Magnus
Assignee: Robert Muir
Priority: Minor
Fix For: 3.1.1

Please add XAResource/XATransaction support into Lucene core.
[jira] [Created] (SOLR-3490) When DocumentObjectBinder encounters an invalid setter method, it should add that to the runtimeexception message.
Nicholas DiPiazza created SOLR-3490:
---
Summary: When DocumentObjectBinder encounters an invalid setter method, it should add that to the RuntimeException message.
Key: SOLR-3490
URL: https://issues.apache.org/jira/browse/SOLR-3490
Project: Solr
Issue Type: Improvement
Affects Versions: 3.6
Environment: All
Reporter: Nicholas DiPiazza
Priority: Minor

While trying to use QueryResponse.getBeans(Class<T> type), I have an application getting the RuntimeException: "Invalid setter method. Must have one and only one parameter." This is thrown from org.apache.solr.client.solrj.beans.DocumentObjectBinder.DocField.storeType(). I was forced to get out the debugger in order to get the name of the POJO and the setter it is referring to. Please add this information to the RuntimeException, e.g.:

throw new RuntimeException("Invalid setter method " + setter.getName() + " in class " + setter.getDeclaringClass().getName() + ". Setter method must have one and only one parameter.");
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
Yeah, I suppose if we could tell the test framework, for this test, to ignore this expected uncaught exception, that might help. This is not a problem technically; for a single test suite I'd just wrap everything with a rule that would temporarily capture unhandled exceptions, that's it. I suspect a lot of the other exceptions/problems we're seeing are due to leaked threads and unclosed jetty/zk sockets, so I'd rather work on trying to make this more efficient. I am pretty swamped with other things at the moment, but if you can give me a test case that somehow shows these long jetty shutdown times it'd be a big help. Dawid
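Dawid's "rule that would temporarily capture unhandled exceptions" can be sketched in plain Java without any test framework: save the current default uncaught-exception handler, install a capturing one for the duration of the body, and restore the original in a finally block. The names below are illustrative, not the actual randomizedtesting rule.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CapturedUncaught {
  /** Runs body while capturing (instead of reporting) uncaught exceptions
      from threads started inside it; the previous handler is always restored. */
  public static List<Throwable> capture(Runnable body) {
    List<Throwable> captured = new CopyOnWriteArrayList<>();
    Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
    Thread.setDefaultUncaughtExceptionHandler((t, e) -> captured.add(e));
    try {
      body.run();
    } finally {
      Thread.setDefaultUncaughtExceptionHandler(previous);
    }
    return captured;
  }

  /** Self-check: a thread that dies with an exception is captured, not reported. */
  public static boolean demo() {
    List<Throwable> got = capture(() -> {
      Thread t = new Thread(() -> { throw new RuntimeException("expected"); });
      t.start();
      try { t.join(); } catch (InterruptedException ignored) { }
    });
    return got.size() == 1 && "expected".equals(got.get(0).getMessage());
  }
}
```

A real rule would also have to cope with threads outliving the test, which is exactly the leaked-thread problem mentioned above.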
[jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283593#comment-13283593 ] Michael McCandless commented on LUCENE-4062: Thanks Adrien, this looks great! I'll commit soon... More fine-grained control over the packed integer implementation that is chosen --- Key: LUCENE-4062 URL: https://issues.apache.org/jira/browse/LUCENE-4062 Project: Lucene - Java Issue Type: Improvement Components: core/other Reporter: Adrien Grand Assignee: Michael McCandless Priority: Minor Labels: performance Fix For: 4.1 Attachments: LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch, LUCENE-4062.patch In order to save space, Lucene has two main PackedInts.Mutable implementations, one that is very fast and is based on a byte/short/integer/long array (Direct*) and another one which packs bits in a memory-efficient manner (Packed*). The packed implementation tends to be much slower than the direct one, which discourages some Lucene components from using it. On the other hand, if you store 21-bit integers in a Direct32, this is a space loss of (32-21)/32=35%. If you are willing to trade some space for speed, you could store 3 of these 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One advantage of this approach is that you never need to read more than one block to read or write a value, so this can be significantly faster than Packed32 and Packed64, which always need to read/write two blocks in order to avoid costly branches. I ran some tests, and for 1000 21-bit values, this implementation takes less than 2% more space and has 44% faster writes and 30% faster reads. The 12-bit version (5 values per block) has the same performance improvement and a 6% memory overhead compared to the packed implementation. 
In order to select the best implementation for a given integer size, I wrote the {{PackedInts.getMutable(valueCount, bitsPerValue, acceptableOverheadPerValue)}} method. This method selects the fastest implementation that has less than {{acceptableOverheadPerValue}} wasted bits per value. For example, if you accept an overhead of 20% ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty reasonable, here is what implementations would be selected:
* 1: Packed64SingleBlock1
* 2: Packed64SingleBlock2
* 3: Packed64SingleBlock3
* 4: Packed64SingleBlock4
* 5: Packed64SingleBlock5
* 6: Packed64SingleBlock6
* 7: Direct8
* 8: Direct8
* 9: Packed64SingleBlock9
* 10: Packed64SingleBlock10
* 11: Packed64SingleBlock12
* 12: Packed64SingleBlock12
* 13: Packed64
* 14: Direct16
* 15: Direct16
* 16: Direct16
* 17: Packed64
* 18: Packed64SingleBlock21
* 19: Packed64SingleBlock21
* 20: Packed64SingleBlock21
* 21: Packed64SingleBlock21
* 22: Packed64
* 23: Packed64
* 24: Packed64
* 25: Packed64
* 26: Packed64
* 27: Direct32
* 28: Direct32
* 29: Direct32
* 30: Direct32
* 31: Direct32
* 32: Direct32
* 33: Packed64
* 34: Packed64
* 35: Packed64
* 36: Packed64
* 37: Packed64
* 38: Packed64
* 39: Packed64
* 40: Packed64
* 41: Packed64
* 42: Packed64
* 43: Packed64
* 44: Packed64
* 45: Packed64
* 46: Packed64
* 47: Packed64
* 48: Packed64
* 49: Packed64
* 50: Packed64
* 51: Packed64
* 52: Packed64
* 53: Packed64
* 54: Direct64
* 55: Direct64
* 56: Direct64
* 57: Direct64
* 58: Direct64
* 59: Direct64
* 60: Direct64
* 61: Direct64
* 62: Direct64
Under 32 bits per value, only 13, 17 and 22-26 bits per value would still choose the slower Packed64 implementation. Allowing a 50% overhead would prevent the packed implementation from being selected for bits per value under 32. Allowing an overhead of 32 bits per value would make sure that a Direct* implementation is always selected. 
Next steps would be to:
* make lucene components use this {{getMutable}} method and let users decide what trade-off better suits them,
* write a Packed32SingleBlock implementation if necessary (I didn't do it because I have no 32-bit computer to test the performance improvements).
I think this would allow more fine-grained control over the speed/space trade-off; what do you think?
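The space numbers in the description can be checked with simple arithmetic. This standalone sketch (not Lucene's actual PackedInts code; the names are mine) computes the per-value overhead of a direct array versus the single-block packed layout described above:

```java
// Standalone arithmetic check of the overheads quoted in LUCENE-4062;
// illustrative only, not Lucene's PackedInts implementation.
public class PackedOverhead {
  /** Wasted bits per value when storing bitsPerValue-sized ints in slots of slotBits. */
  public static double directOverhead(int slotBits, int bitsPerValue) {
    return slotBits - bitsPerValue;
  }

  /** Effective bits per value when packing as many values as fit into one
      64-bit block (the Packed64SingleBlock layout described above). */
  public static double singleBlockBitsPerValue(int bitsPerValue) {
    int valuesPerBlock = 64 / bitsPerValue; // e.g. 3 values of 21 bits per long
    return 64.0 / valuesPerBlock;
  }

  public static void main(String[] args) {
    // 21-bit values in a Direct32: (32 - 21) / 32 of the space is wasted.
    System.out.println(directOverhead(32, 21) / 32);       // prints 0.34375, i.e. ~35%
    // 3 x 21-bit values per long: 64/3 - 21, i.e. ~1/3 bit of overhead per value.
    System.out.println(singleBlockBitsPerValue(21) - 21);  // ~0.333
  }
}
```

This reproduces both figures from the description: roughly 35% waste for Direct32 and 1/3 bit per value for the single-block layout.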
[jira] [Resolved] (SOLR-2822) don't run update processors twice
[ https://issues.apache.org/jira/browse/SOLR-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2822. Resolution: Fixed Assignee: Hoss Man Committed revision 1342743. don't run update processors twice - Key: SOLR-2822 URL: https://issues.apache.org/jira/browse/SOLR-2822 Project: Solr Issue Type: Sub-task Components: SolrCloud, update Reporter: Yonik Seeley Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-2822.patch, SOLR-2822.patch, SOLR-2822.patch An update will first go through processors until it gets to the point where it is forwarded to the leader (or forwarded to replicas if already on the leader). We need a way to skip over the processors that were already run (perhaps by using a processor chain dedicated to sub-updates?)
[jira] [Commented] (LUCENE-4069) Segment-level Bloom filters for a 2 x speed up on rare term searches
[ https://issues.apache.org/jira/browse/LUCENE-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283615#comment-13283615 ] Mark Harwood commented on LUCENE-4069: -- Update: I've discovered this Bloom Filter Codec currently has a bug where it doesn't handle indexes with >1 field. It's probably all tangled up in the PerField... codec logic so I need to do some more digging. Segment-level Bloom filters for a 2 x speed up on rare term searches Key: LUCENE-4069 URL: https://issues.apache.org/jira/browse/LUCENE-4069 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 3.6 Reporter: Mark Harwood Priority: Minor Fix For: 3.6.1 Attachments: BloomFilterCodec40.patch, MHBloomFilterOn3.6Branch.patch, PrimaryKey40PerformanceTestSrc.zip An addition to each segment which stores a Bloom filter for selected fields in order to give fast-fail to term searches, helping avoid wasted disk access. Best suited for low-frequency fields e.g. primary keys on big indexes with many segments, but it also speeds up general searching in my tests. Overview slideshow here: http://www.slideshare.net/MarkHarwood/lucene-bloomfilteredsegments Benchmarks based on Wikipedia content here: http://goo.gl/X7QqU Patch based on 3.6 codebase attached. There are no API changes currently - to play, just add a field with _blm on the end of the name to invoke the special indexing/querying capability. Clearly a new Field or schema declaration(!) would need adding to the APIs to configure the service properly.
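The fast-fail idea is simple: before any disk seek for a term, consult an in-memory Bloom filter built at segment-write time; a negative answer is definitive, so the lookup can be skipped entirely. A minimal illustrative sketch follows; this is not Mark's patch, and real implementations use stronger hash functions (e.g. MurmurHash) and tuned sizing.

```java
import java.util.BitSet;

// Toy segment-level Bloom filter for terms; illustrative only.
public class TermBloomFilter {
  private final BitSet bits;
  private final int size;

  public TermBloomFilter(int sizeInBits) {
    this.size = sizeInBits;
    this.bits = new BitSet(sizeInBits);
  }

  // Two cheap, correlated hash positions derived from the term's hashCode.
  // A production filter would use independent, higher-quality hashes.
  private int h1(String term) { return Math.floorMod(term.hashCode(), size); }
  private int h2(String term) { return Math.floorMod(term.hashCode() * 31 + 17, size); }

  /** Called once per term when the segment is written. */
  public void add(String term) {
    bits.set(h1(term));
    bits.set(h2(term));
  }

  /** false: the term is definitely absent, so skip the disk seek entirely.
      true: the term may be present; fall through to the normal terms-dict lookup. */
  public boolean mayContain(String term) {
    return bits.get(h1(term)) && bits.get(h2(term));
  }
}
```

The win on primary-key-style fields comes from the many segments that do not contain the key: each one answers "definitely not" from memory instead of touching the terms dictionary.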
Build failed in Jenkins: Lucene-Solr-trunk-Windows-Java6-64 #205
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/205/changes Changes: [markrmiller] SOLR-2923: IllegalArgumentException when using useFilterForSortedQuery on an empty index. -- [...truncated 10778 lines...] [junit4] Completed in 0.03s, 1 test [junit4] [junit4] Suite: org.apache.solr.util.FileUtilsTest [junit4] Completed in 0.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.TestCodecSupport [junit4] Completed in 0.26s, 3 tests [junit4] [junit4] Suite: org.apache.solr.ConvertedLegacyTest [junit4] Completed in 5.05s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.component.DebugComponentTest [junit4] Completed in 1.49s, 2 tests [junit4] [junit4] Suite: org.apache.solr.cloud.BasicDistributedZkTest [junit4] Completed in 54.73s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestRangeQuery [junit4] Completed in 9.22s, 2 tests [junit4] [junit4] Suite: org.apache.solr.search.TestRecovery [junit4] Completed in 13.61s, 9 tests [junit4] [junit4] Suite: org.apache.solr.cloud.BasicZkTest [junit4] Completed in 8.74s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.AutoCommitTest [junit4] Completed in 9.68s, 3 tests [junit4] [junit4] Suite: org.apache.solr.TestGroupingSearch [junit4] Completed in 6.01s, 12 tests [junit4] [junit4] Suite: org.apache.solr.handler.component.QueryElevationComponentTest [junit4] Completed in 7.61s, 7 tests [junit4] [junit4] Suite: org.apache.solr.update.PeerSyncTest [junit4] Completed in 5.34s, 1 test [junit4] [junit4] Suite: org.apache.solr.spelling.suggest.SuggesterFSTTest [junit4] Completed in 1.65s, 4 tests [junit4] [junit4] Suite: org.apache.solr.handler.MoreLikeThisHandlerTest [junit4] Completed in 1.30s, 1 test [junit4] [junit4] Suite: org.apache.solr.handler.StandardRequestHandlerTest [junit4] Completed in 1.06s, 1 test [junit4] [junit4] Suite: org.apache.solr.BasicFunctionalityTest [junit4] IGNORED 0.00s | BasicFunctionalityTest.testDeepPaging [junit4] Cause: Annotated @Ignore(See 
SOLR-1726) [junit4] Completed in 3.08s, 23 tests, 1 skipped [junit4] [junit4] Suite: org.apache.solr.SolrInfoMBeanTest [junit4] Completed in 0.94s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.SolrCmdDistributorTest [junit4] Completed in 2.08s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest [junit4] Completed in 1.63s, 6 tests [junit4] [junit4] Suite: org.apache.solr.core.TestSolrDeletionPolicy2 [junit4] Completed in 1.15s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.TestIndexingPerformance [junit4] Completed in 0.89s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestPseudoReturnFields [junit4] Completed in 1.44s, 13 tests [junit4] [junit4] Suite: org.apache.solr.highlight.HighlighterTest [junit4] Completed in 1.99s, 27 tests [junit4] [junit4] Suite: org.apache.solr.update.processor.UniqFieldsUpdateProcessorFactoryTest [junit4] Completed in 0.81s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.SpatialFilterTest [junit4] Completed in 1.52s, 3 tests [junit4] [junit4] Suite: org.apache.solr.servlet.NoCacheHeaderTest [junit4] Completed in 1.04s, 3 tests [junit4] [junit4] Suite: org.apache.solr.update.SolrIndexConfigTest [junit4] Completed in 1.82s, 2 tests [junit4] [junit4] Suite: org.apache.solr.search.TestDocSet [junit4] Completed in 0.63s, 2 tests [junit4] [junit4] Suite: org.apache.solr.response.TestPHPSerializedResponseWriter [junit4] Completed in 1.00s, 2 tests [junit4] [junit4] Suite: org.apache.solr.DisMaxRequestHandlerTest [junit4] Completed in 1.09s, 3 tests [junit4] [junit4] Suite: org.apache.solr.handler.JsonLoaderTest [junit4] Completed in 1.05s, 5 tests [junit4] [junit4] Suite: org.apache.solr.core.IndexReaderFactoryTest [junit4] Completed in 0.88s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.TestExtendedDismaxParser [junit4] Completed in 9.15s, 8 tests [junit4] [junit4] Suite: org.apache.solr.schema.TestCollationField [junit4] Completed in 0.45s, 8 
tests [junit4] [junit4] Suite: org.apache.solr.schema.IndexSchemaRuntimeFieldTest [junit4] Completed in 1.19s, 1 test [junit4] [junit4] Suite: org.apache.solr.core.AlternateDirectoryTest [junit4] Completed in 1.01s, 1 test [junit4] [junit4] Suite: org.apache.solr.update.processor.UpdateRequestProcessorFactoryTest [junit4] Completed in 1.06s, 1 test [junit4] [junit4] Suite: org.apache.solr.search.ReturnFieldsTest [junit4] Completed in 1.25s, 10 tests [junit4] [junit4] Suite:
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283631#comment-13283631 ] Simon Willnauer commented on LUCENE-2878: - bq. This might take a little longer, in that it will require me to actually think about what I'm doing... no worries, good job so far. Did the updated patch make sense to you? I think you had a good warmup phase; now we can go somewhat deeper! Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: Positions Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Labels: gsoc2011, gsoc2012, lucene-gsoc-11, lucene-gsoc-12, mentor Fix For: Positions Branch Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch Currently we have two somewhat separate types of queries, the one which can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do and at the end of the day they are duplicating a lot of code all over lucene. Span*Queries are also limited to other Span*Query instances such that you can not use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature: they can not score based on term proximity, since scores don't expose any positional information. All those problems bugged me for a while now, so I started working on that using the bulkpostings API. 
I would have done that first cut on trunk but TermScorer is working on a BlockReader that does not expose positions, while the one in this branch does. I started adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions and eventually Scorer#needsPayloads to create the corresponding enum on demand. Yet, currently only TermQuery / TermScorer implements this API and others simply return null instead. To show that the API really works and our BulkPostings work fine too with positions, I cut over TermSpanQuery to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the Position BulkReading implementation got some exercise, which now all works with positions :) while Payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today) including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't look into the MemoryIndex BulkPostings API yet)
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 14333 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/14333/ 1 tests failed. REGRESSION: org.apache.solr.cloud.FullSolrCloudTest.testDistribSearch Error Message: Timeout occured while waiting response from server at: http://localhost:56592/solr/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://localhost:56592/solr/collection1 at __randomizedtesting.SeedInfo.seed([B8E9683C451CA579:390FE6243243C545]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:433) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117) at org.apache.solr.cloud.FullSolrCloudTest.index_specific(FullSolrCloudTest.java:498) at org.apache.solr.cloud.FullSolrCloudTest.brindDownShardIndexSomeDocsAndRecover(FullSolrCloudTest.java:713) at org.apache.solr.cloud.FullSolrCloudTest.doTest(FullSolrCloudTest.java:550) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:680) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53) at org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605) at
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On May 25, 2012, at 12:26 PM, Dawid Weiss wrote: I don't think this can be explained by shutting down jetty... this seems too long. Can you provide a repeatable test case that would demonstrate the failure you mentioned? Once I have it, it'll be easier to try to come up with workarounds. Dawid No, I don't think it would be that easy to make a repeatable test case, so I don't think I'll have near-term time for it. This one is not really a practical issue, so it's low on my priority list. It repeats on jenkins on the rare occasion ;) Jetty seems more likely than the test framework to me - IW#close happens well before the test is over, and in the main thread, and that is what is interrupted (waiting for merges to finish)... and Jetty will send an interrupt on shutdown after the graceful shutdown timeout. Increasing that timeout will drastically lessen the chances of it happening - but we start and shut down jetty serially, and that is likely why it's so much longer - some tests use a lot of jetties. Trying to stop jetties in parallel might be one thing to try, obviously. - Mark Miller lucidimagination.com
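Mark's "stop jetties in parallel" suggestion can be sketched against a generic stop() interface rather than Jetty's actual API; the interface and names here are assumptions for illustration, though Jetty's blocking Server.stop() fits the shape.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: stop many servers concurrently instead of serially.
public class ParallelShutdown {
  /** Anything with a blocking stop(); Jetty's Server.stop() fits this shape. */
  public interface Stoppable { void stop() throws Exception; }

  /** Issues every stop() concurrently, so total shutdown time is roughly
      the slowest single server rather than the sum over all of them. */
  public static void stopAll(List<Stoppable> servers, long timeoutSec)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, servers.size()));
    for (Stoppable s : servers) {
      pool.submit(() -> {
        try { s.stop(); } catch (Exception e) { e.printStackTrace(); }
      });
    }
    pool.shutdown();
    pool.awaitTermination(timeoutSec, TimeUnit.SECONDS);
  }

  /** Self-check with three fake servers that just count their stop() calls. */
  public static boolean demo() {
    AtomicInteger stopped = new AtomicInteger();
    List<Stoppable> servers = List.of(
        stopped::incrementAndGet, stopped::incrementAndGet, stopped::incrementAndGet);
    try {
      stopAll(servers, 10);
    } catch (InterruptedException e) {
      return false;
    }
    return stopped.get() == 3;
  }
}
```

With a graceful-shutdown timeout of 30 seconds per jetty, stopping ten jetties serially can cost minutes in the worst case; issuing the stops concurrently caps the wait at roughly one timeout.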
[jira] [Resolved] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen
[ https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4062. Resolution: Fixed Fix Version/s: (was: 4.1) 4.0 Thanks Adrien! More fine-grained control over the packed integer implementation that is chosen --- Key: LUCENE-4062 URL: https://issues.apache.org/jira/browse/LUCENE-4062 Project: Lucene - Java Issue Type: Improvement Components: core/other Reporter: Adrien Grand Assignee: Michael McCandless Priority: Minor Labels: performance Fix For: 4.0
Jenkins build is back to normal : Lucene-Solr-trunk-Windows-Java6-64 #206
See http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Windows-Java6-64/206/changes
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward updated LUCENE-2878: -- Attachment: LUCENE-2878.patch New patch, implementing positions() for ReqExclScorer and ReqOptSumScorer, with a couple of basic tests. These just return Conj/Disj PositionIterators, ignoring the excluded Scorers. It works in the simple cases that I've got here, but they may need to be made more complex when we take proximity searches into account. Allow Scorer to expose positions and payloads aka. nuke spans -- Key: LUCENE-2878 URL: https://issues.apache.org/jira/browse/LUCENE-2878 Project: Lucene - Java Issue Type: Improvement Components: core/search Affects Versions: Positions Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Positions Branch
Re: [JENKINS] Solr-trunk - Build # 1865 - Failure
On May 25, 2012, at 2:07 PM, Mark Miller wrote: Trying to stop jetties in parallel might be one thing to try, obviously. But I still expected to see an ugly slowdown on many tests (e.g. even 30 seconds * 10 tests is a significant add). It may be we simply have to do it in this one test though (add to the graceful exit time) - other tests don't have enough indexing occurring to cause long end merges, I think. - Mark Miller lucidimagination.com
[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283725#comment-13283725 ] Andrzej Bialecki commented on LUCENE-4055: --- +1, this looks very good. One comment re. SegmentInfoPerCommit: this class is not extensible and contains a fixed set of attributes. In LUCENE-3837 this or a similar place would be the ideal mechanism to carry info about stacked segments, since this information is specific to a commit point. Unfortunately, there are no Map<String,String> attributes at this level, so I guess for now this type of aux data will have to be put in SegmentInfos.userData, even though it's not index-global but segment-specific. Refactor SegmentInfo / FieldInfo to make them extensible Key: LUCENE-4055 URL: https://issues.apache.org/jira/browse/LUCENE-4055 Project: Lucene - Java Issue Type: Improvement Components: core/codecs Reporter: Andrzej Bialecki Assignee: Robert Muir Fix For: 4.0 Attachments: LUCENE-4055.patch After LUCENE-4050 is done, the resulting SegmentInfo / FieldInfo classes should be made abstract so that they can be extended by Codec-s.
[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283738#comment-13283738 ] Robert Muir commented on LUCENE-4055: - Right, but I think this is correct: the codec should be responsible for encoding/decoding of inverted index segments only (the whole problem here originally was trying to have it also look after commits). So it really shouldn't be customizing things about the commit, as that creates a confusing impedance mismatch. I think things like the stacked segments in LUCENE-3837, which need to do things other than implement encoding/decoding of a segment, should live above the codec level: it's a separate concern - if someone wants updatable fields, that's unrelated to the integer compression algorithm used or whatnot.
[jira] [Commented] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283755#comment-13283755 ] Andrzej Bialecki commented on LUCENE-4055: --- bq. stacked segments in LUCENE-3837 that need to do things other than implement encoding/decoding of segment should be above the codec level .. Certainly, that's why it would make sense to put this extended info in SegmentInfoPerCommit and not in any file handled by Codec.
[jira] [Comment Edited] (LUCENE-4055) Refactor SegmentInfo / FieldInfo to make them extensible
[ https://issues.apache.org/jira/browse/LUCENE-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283755#comment-13283755 ] Andrzej Bialecki edited comment on LUCENE-4055 at 5/25/12 9:32 PM: bq. stacked segments in LUCENE-3837 that need to do things other than implement encoding/decoding of segment should be above the codec level .. Certainly, that's why it would make sense to put this extended info in SegmentInfoPerCommit and not in any file handled by Codec. My comment was about the lack of easy extensibility of the codec-independent per-segment data (SegmentInfoPerCommit - info about stacked data is per-segment and per-commit), so for now LUCENE-3837 will need to use the codec-independent index-global data (SegmentInfos). It's not ideal, but not a deal breaker either, especially since we now have version info in both of these places. was (Author: ab): bq. stacked segments in LUCENE-3837 that need to do things other than implement encoding/decoding of segment should be above the codec level .. Certainly, that's why it would make sense to put this extended info in SegmentInfoPerCommit and not in any file handled by Codec.
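The workaround discussed above is to flatten per-segment attributes into the index-global user-data map (the role SegmentInfos.userData plays), since SegmentInfoPerCommit carries no attribute map of its own. A minimal sketch of one such flattening follows; the segment-name key prefix and all class names here are illustrative assumptions, not an API from LUCENE-3837 or Lucene itself.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: emulate per-segment attributes on top of a single index-global
// Map<String,String> by prefixing each key with the segment name.
public class UserDataSketch {

    public static void putSegmentAttr(Map<String, String> userData,
                                      String segmentName, String key, String value) {
        userData.put(segmentName + "." + key, value);
    }

    public static String getSegmentAttr(Map<String, String> userData,
                                        String segmentName, String key) {
        return userData.get(segmentName + "." + key);
    }

    public static void main(String[] args) {
        Map<String, String> userData = new HashMap<>();
        // Stash hypothetical stacked-segment info for segment "_0".
        putSegmentAttr(userData, "_0", "stacked.generation", "3");
        System.out.println(getSegmentAttr(userData, "_0", "stacked.generation"));
    }
}
```

The drawback Andrzej points out is visible in the sketch: the data is logically per-segment, but it lives (and must be cleaned up) in one global map shared by every segment of the commit.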
[jira] [Resolved] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-3489. --- Resolution: Fixed Thanks for reporting. Your patch (which is identical to the trunk code) is committed to branch 3_6. Config file replication less error prone Key: SOLR-3489 URL: https://issues.apache.org/jira/browse/SOLR-3489 Project: Solr Issue Type: Improvement Components: replication (java) Affects Versions: 3.6 Reporter: Jochen Just Assignee: Jan Høydahl Priority: Minor Attachments: SOLR-3489.patch, SOLR-3489_reproducing_config.tar.gz If the listing of configuration files that should be replicated contains a space, the following file is not replicated. Example: {code:xml} <!-- The error in the configuration is the space before stopwords.txt. Because of that, the file is not replicated --> <str name="confFiles">schema.xml,test.txt, stopwords.txt</str> {code} It would be nice if that space were simply ignored.
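The fix the issue asks for amounts to trimming whitespace around each comma-separated confFiles entry before use. A sketch of that parsing follows; the class and method names are illustrative, not the actual ReplicationHandler code that was committed.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split a confFiles value on commas, trimming surrounding
// whitespace from each entry and dropping empty ones, so a stray space
// (as in "test.txt, stopwords.txt") no longer breaks replication.
public class ConfFilesParser {

    public static List<String> parseConfFiles(String confFiles) {
        List<String> names = new ArrayList<>();
        for (String name : confFiles.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The space before stopwords.txt is now ignored.
        System.out.println(parseConfFiles("schema.xml,test.txt, stopwords.txt"));
    }
}
```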
[jira] [Assigned] (SOLR-3489) Config file replication less error prone
[ https://issues.apache.org/jira/browse/SOLR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl reassigned SOLR-3489: - Assignee: Jan Høydahl