convert SuggestWord[][] to python
Hi,

    suggestWordsList = wordBreakSpellChecker.suggestWordBreaks(term, maxSugs, reader, sugMode, sortMethod)
    for suggestWords in suggestWordsList:
        print suggestWords[0]

The Javadoc here describes the method (http://lucene.apache.org/core/4_3_0/suggest/org/apache/lucene/search/spell/WordBreakSpellChecker.html):

    public SuggestWord[][] suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) throws IOException

Related Javadoc:
- SuggestWord: http://lucene.apache.org/core/4_3_0/suggest/org/apache/lucene/search/spell/SuggestWord.html
- Term: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/Term.html?is-external=true
- IndexReader: http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/index/IndexReader.html?is-external=true
- SuggestMode: http://lucene.apache.org/core/4_3_0/suggest/org/apache/lucene/search/spell/SuggestMode.html
- WordBreakSpellChecker.BreakSuggestionSortMethod: http://lucene.apache.org/core/4_3_0/suggest/org/apache/lucene/search/spell/WordBreakSpellChecker.BreakSuggestionSortMethod.html
- IOException: http://download.oracle.com/javase/6/docs/api/java/io/IOException.html?is-external=true

But how do we convert a SuggestWord[][] to something accessible from Python? Thanks!
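As far as I understand, in PyLucene the returned SuggestWord[][] maps to a nested JArray, which already supports Python iteration and indexing, and SuggestWord's public fields (string, freq, score) are readable as attributes. A minimal sketch of turning the result into plain Python data - the helper name and the namedtuple stand-in below are illustrative, not part of the Lucene or PyLucene API:

```python
from collections import namedtuple


def suggestions_to_python(suggest_words_list):
    """Convert a SuggestWord[][]-like nested sequence into plain Python lists.

    Works for any nested iterable whose leaf objects expose the public
    SuggestWord fields: string, freq and score.
    """
    return [
        [(sw.string, sw.freq, sw.score) for sw in suggest_words]
        for suggest_words in suggest_words_list
    ]


# Illustrative stand-in for org.apache.lucene.search.spell.SuggestWord,
# so the sketch can be exercised without a JVM; with PyLucene you would
# pass the result of suggestWordBreaks(...) directly.
SuggestWord = namedtuple("SuggestWord", ["string", "freq", "score"])

broken = [[SuggestWord("word", 3, 1.0), SuggestWord("break", 2, 0.5)]]
print(suggestions_to_python(broken))
```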
Re: Request for Mentor for LUCENE-2562 : Make Luke a Lucene/Solr Module
@Mark - Do you have it in a public repo?

On Tue, Jul 16, 2013 at 12:02 AM, Mark Miller markrmil...@gmail.com wrote:

My feeling is that what we need most is what I've been working on (surprise, surprise :) ). We need a simple Java app, very similar to the std Luke app. We need it to be Apache licensed all the way through. We need it to be fully integrated as a module. We need it to be straightforward enough that any of the Lucene/Solr committers can easily work on it and update it as APIs change. GWT is probably a stretch for that goal - Apache Pivot is pretty straightforward though, for any reasonable Java developer. I picked it up in absolutely no time to build the thing from scratch - modifying it is 10 times easier. The backend code is all Java; the layout and widgets are all XML.

I've been pushing towards that goal (over the years now) with Luke ALE (Apache Lucene Edition). It's not a straight port of Luke with Thinlet to Luke with Apache Pivot - Luke has 90% of its code in one huge class - I've already been working on modularizing that code as I've moved it over - not too heavily, because that would have made it difficult to keep porting code, but a good start. Now that the majority of features have been moved over, it's probably easier to keep refactoring - which is needed, because another very important missing piece is unit tests, and good unit tests will require even more refactoring of the code.

I also think a GWT version - something that could probably run nicely with Solr - would be awesome. But that's way down the line in priority for me. We need something very close to Lucene that the committers will push up the hill as they push Lucene.

- Mark

On Jul 15, 2013, at 11:15 AM, Robert Muir rcm...@gmail.com wrote:

I disagree with this completely. Solr is last priority.

On Jul 15, 2013 6:14 AM, Jack Krupansky j...@basetechnology.com wrote:

My personal thoughts/preferences/suggestions for Luke:

1. Need a clean Luke Java library – heavily unit-tested. As integrated with Lucene as possible.
2. A simple command line interface – always useful.
3. A Solr plugin handler – based on #1. Good for apps as well as the Admin UI. Nice to be able to curl a request to look at a specific doc, for example.
4. GUI fully integrated with the new Solr Web Admin UI. A separate UI... sucks.
5. Any additional, un-integrated GUI is icing on the cake and not really desirable for Solr. May be great for Elasticsearch and other Lucene-based apps, but Solr should be the #1 priority – after #1 and #2 above.

-- Jack Krupansky

*From:* Dmitry Kan dmitry.luc...@gmail.com
*Sent:* Monday, July 15, 2013 8:54 AM
*To:* dev@lucene.apache.org
*Subject:* Re: Request for Mentor for LUCENE-2562 : Make Luke a Lucene/Solr Module

Hello guys,

Indeed, the GWT port is work in progress and far from done. The driving factor here was to be able to later integrate Luke into the Solr admin as well as have a standalone webapp for non-Solr users. There is (was?) a Luke stats handler in the Solr UI that printed some stats on the index. That could be substituted with the GWT app. The code isn't yet ready to see the light. So if it makes more sense for Ajay to work on the existing JIRA with the Apache Pivot implementation, I would say go ahead. In the current port effort (the aforementioned GitHub fork) the UI is the original one, developed by Andrzej. Besides the UI rework there are plenty of things to port / verify (e.g. the Hadoop plugin) against the latest Lucene versions. See the readme.md: https://github.com/dmitrykey/luke

Whichever way's taken, hopefully we end up having stable releases of Luke :)

Dmitry Kan

On 14 July 2013 22:38, Andrzej Bialecki a...@getopt.org wrote: On 7/14/13 5:04 AM, Ajay Bhat wrote: Shawn and Andrzej, Thanks for answering my questions. I've looked over the code done by Dmitry and I'll look into what I can do to help with the UI porting in future.
I was actually thinking of doing this JIRA as a project by myself, with some assistance from the community, after getting a mentor for the ASF ICFOSS program, which I haven't found yet. It would be great if I could get one of you guys as a mentor. As the UI work has been mostly done by others like Dmitry Kan, I don't think I need to work on that much for now.

It's far from done - he just started the process.

What other work is there to be done that I can do as a project? Any new features or improvements? Regards, Ajay

On Jul 14, 2013 1:54 AM, Andrzej Bialecki a...@getopt.org wrote: On 7/13/13 8:56 PM, Shawn Heisey wrote: On 7/13/2013 3:15 AM, Ajay Bhat wrote: One more question: what version of Lucene does Luke currently support? I saw a comment on the issue page that it doesn't support the Lucene 4.1 and 4.2 trunk. The official Luke project only has
[jira] [Commented] (LUCENE-5116) IW.addIndexes doesn't prune all deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-5116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709506#comment-13709506 ] Shai Erera commented on LUCENE-5116: +1. This should be an easy and nice improvement. There's no need to spend the work adding those segments just so they are dropped on the next merge. IW.addIndexes doesn't prune all deleted segments Key: LUCENE-5116 URL: https://issues.apache.org/jira/browse/LUCENE-5116 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5116_test.patch At the least, this can easily create segments with maxDoc == 0. It seems buggy: elsewhere we prune these segments out, so it's expected to have a commit point with no segments rather than a segment with 0 documents... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[VOTE] Release 4.4
Please vote to release Lucene and Solr 4.4, built off revision 1503555 of https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_4. RC0 artifacts are available at: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC0-rev1503555 The smoke tester passes for me. Here's my +1. Steve
[jira] [Commented] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709520#comment-13709520 ] Amit Nithian commented on SOLR-4310: While it's been a few months since I looked at this patch (and unfortunately I don't work with Solr much anymore), I think the patch should be good to go insofar as the unit tests I submitted pass. Is there anything I can do to help get this moving along? If groups.ngroups is specified, the docList's numFound should be the number of groups - Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Assignee: Hoss Man Priority: Minor Fix For: 4.4 Attachments: SOLR-4310_2.patch, SOLR-4310_3.patch, SOLR-4310.patch If you group by a field, the response may look like this:

    <lst name="grouped">
      <lst name="series">
        <int name="matches">138</int>
        <int name="ngroups">1</int>
        <result name="doclist" numFound="138" start="0">
          <doc>
            <int name="id">267038365</int>
            <str name="name">Larry's Grand Ole Garage Country Dance - Pure Country</str>
          </doc>
        </result>
      </lst>
    </lst>

and if you specify group.main then the doclist becomes the result and you lose all context of the number of groups. If you want to keep your response format backwards compatible with clients (i.e. clients who don't know about the grouped format), setting group.main=true solves this BUT the numFound is the number of raw matches instead of the number of groups. This may have downstream consequences. I'd like to propose that if the user specifies ngroups=true then, when creating the returned DocSlice, we set the numFound to be the number of groups instead of the number of raw matches, to keep the response consistent with what the user would expect. -- This message is automatically generated by JIRA.
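The proposal can be illustrated client-side: given the grouped response structure from the issue (matches, ngroups, doclist), a client can pick out the group count itself. A small Python sketch, assuming the response has been parsed into nested dicts mirroring that structure - the function name is made up and this is not a Solr API:

```python
def effective_num_found(grouped_response, field, use_ngroups=True):
    """Return the total a client should report for a grouped query.

    grouped_response mirrors the "grouped" section from the issue:
    {field: {"matches": ..., "ngroups": ..., "doclist": {"numFound": ...}}}.
    With use_ngroups=True this returns the group count, matching the
    behaviour the issue proposes for numFound; otherwise it returns the
    raw match count that Solr reports today.
    """
    group = grouped_response[field]
    if use_ngroups and "ngroups" in group:
        return group["ngroups"]
    return group["doclist"]["numFound"]


# Parsed form of the XML response shown in the issue:
response = {"series": {"matches": 138, "ngroups": 1,
                       "doclist": {"numFound": 138, "start": 0, "docs": []}}}
```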
[jira] [Commented] (SOLR-4428) Update SolrUIMA wiki page
[ https://issues.apache.org/jira/browse/SOLR-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709537#comment-13709537 ] Tommaso Teofili commented on SOLR-4428: --- Hi Eva, The first thing to do would be to update the configuration samples on the wiki, which are written using the old XML format from the first patch; they should be converted to the Solr format as per the examples at: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/src/test-files/uima/solr/collection1/conf/solrconfig.xml Once the sample configurations have been updated, one can go deeper, describing each sample and what the configuration parameters do. Update SolrUIMA wiki page - Key: SOLR-4428 URL: https://issues.apache.org/jira/browse/SOLR-4428 Project: Solr Issue Type: Task Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor SolrUIMA wiki page (see http://wiki.apache.org/solr/SolrUIMA) is actually outdated and needs to be updated on the following topics: * proper XML configuration * how to use existing UIMA analyzers * what's the default configuration * how to change the default configuration
[jira] [Commented] (SOLR-4997) The splitshard api doesn't call commit on new sub shards
[ https://issues.apache.org/jira/browse/SOLR-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709561#comment-13709561 ] Shalin Shekhar Mangar commented on SOLR-4997: - I opened SOLR-5041 for the test task. The splitshard api doesn't call commit on new sub shards Key: SOLR-4997 URL: https://issues.apache.org/jira/browse/SOLR-4997 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3, 4.3.1 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 4.4 Attachments: SOLR-4997.patch, SOLR-4997.patch The splitshard api doesn't call commit on new sub shards but it happily sets them to active state, which means on a successful split the documents are not visible to searchers unless an explicit commit is called on the cluster. The coreadmin split api will still not call commit on targetCores. That is by design and we're not going to change that.
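Until the fix, documents in the new sub-shards only become visible after an explicit commit against the collection. A minimal sketch of building that request - the base URL and collection name are placeholders; '/update?commit=true' is the standard Solr update-handler commit request:

```python
def commit_url(base_url, collection):
    """Build the URL for an explicit commit against a Solr collection.

    base_url and collection are placeholders for your deployment;
    the '/update?commit=true' path is Solr's standard commit request.
    """
    return "%s/%s/update?commit=true" % (base_url.rstrip("/"), collection)


# e.g. issue it with urllib2 (Python 2, matching the era of this thread):
# import urllib2
# urllib2.urlopen(commit_url("http://localhost:8983/solr", "collection1"))
```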
[jira] [Created] (SOLR-5041) Add a test to make sure that a leader always recovers from log on startup
Shalin Shekhar Mangar created SOLR-5041: --- Summary: Add a test to make sure that a leader always recovers from log on startup Key: SOLR-5041 URL: https://issues.apache.org/jira/browse/SOLR-5041 Project: Solr Issue Type: Test Components: SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 4.5 From my comment on SOLR-4997: bq. I fixed a bug that I had introduced which skipped log recovery on startup for all leaders instead of only sub shard leaders. I caught this only because I was doing another line-by-line review of all my changes. We should have a test which catches such a condition. Add a test which tests that leaders always recover from log on startup.
[jira] [Resolved] (SOLR-4997) The splitshard api doesn't call commit on new sub shards
[ https://issues.apache.org/jira/browse/SOLR-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-4997. - Resolution: Fixed The splitshard api doesn't call commit on new sub shards Key: SOLR-4997 URL: https://issues.apache.org/jira/browse/SOLR-4997 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3, 4.3.1 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 4.4 Attachments: SOLR-4997.patch, SOLR-4997.patch The splitshard api doesn't call commit on new sub shards but it happily sets them to active state, which means on a successful split the documents are not visible to searchers unless an explicit commit is called on the cluster. The coreadmin split api will still not call commit on targetCores. That is by design and we're not going to change that.
[jira] [Commented] (LUCENE-5113) Allow for packing the pending values of our AppendingLongBuffers
[ https://issues.apache.org/jira/browse/LUCENE-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709564#comment-13709564 ] ASF subversion and git services commented on LUCENE-5113: - Commit 1503578 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1503578 ] LUCENE-5113: Added (Monotonic)AppendingLongBuffer.freeze to pack the pending values. Allow for packing the pending values of our AppendingLongBuffers Key: LUCENE-5113 URL: https://issues.apache.org/jira/browse/LUCENE-5113 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5113.patch When working with small arrays, the pending values might require substantial space. So we could allow for packing the pending values in order to save space, the drawback being that this operation will make the buffer read-only.
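The freeze-to-pack idea can be sketched in miniature: buffer values while mutable, then pack them into a compact representation and reject further appends. This toy Python class is illustrative only - it is not Lucene's (Monotonic)AppendingLongBuffer:

```python
import array


class AppendingLongBuffer:
    """Toy sketch of the freeze idea from LUCENE-5113 (not Lucene's code).

    Values are buffered in a plain list; freeze() packs them into a
    compact typed array and makes the buffer read-only.
    """

    def __init__(self):
        self._pending = []
        self._frozen = None

    def append(self, value):
        if self._frozen is not None:
            raise RuntimeError("buffer is frozen (read-only)")
        self._pending.append(value)

    def freeze(self):
        # Pack the pending values into a compact 64-bit array;
        # after this, further appends are rejected.
        self._frozen = array.array("q", self._pending)
        self._pending = None

    def __getitem__(self, i):
        return self._frozen[i] if self._frozen is not None else self._pending[i]
```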
[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709573#comment-13709573 ] Lukas Vlcek commented on LUCENE-4542: - +1 on having this merged into Lucene. Make RECURSION_CAP in HunspellStemmer configurable -- Key: LUCENE-4542 URL: https://issues.apache.org/jira/browse/LUCENE-4542 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.0 Reporter: Piotr Assignee: Chris Male Attachments: Lucene-4542-javadoc.patch, LUCENE-4542.patch, LUCENE-4542-with-solr.patch Currently there is private static final int RECURSION_CAP = 2; in the code of the class HunspellStemmer. It makes using hunspell with several dictionaries almost unusable due to bad performance (e.g. it costs 36 ms to stem a long sentence in Latvian for recursion_cap=2 and 5 ms for recursion_cap=1). It would be nice to be able to tune this number as needed. AFAIK this number (2) was chosen arbitrarily. (It's the first issue in my life, so please forgive me any mistakes.)
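The performance numbers quoted (36 ms vs 5 ms for recursion_cap 2 vs 1) follow from how a recursion cap bounds affix expansion: each extra level multiplies the number of candidates tried. A toy Python sketch with a configurable cap - the function and affix list are made up, and this is not Hunspell's actual algorithm:

```python
def expand(word, affixes, recursion_cap, _depth=0):
    """Toy affix-stripping expansion with a configurable recursion cap.

    Illustrates why RECURSION_CAP matters: the candidate count grows
    roughly with len(affixes) ** recursion_cap, so each extra level of
    recursion multiplies the work. Not the actual Hunspell algorithm.
    """
    results = [word]
    if _depth >= recursion_cap:
        return results
    for affix in affixes:
        if word.endswith(affix):
            stripped = word[: -len(affix)]
            # Recurse to strip further affixes, up to the cap.
            results.extend(expand(stripped, affixes, recursion_cap, _depth + 1))
    return results


print(expand("hopelessness", ["ness", "less"], recursion_cap=2))
```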
[jira] [Commented] (LUCENE-5113) Allow for packing the pending values of our AppendingLongBuffers
[ https://issues.apache.org/jira/browse/LUCENE-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709572#comment-13709572 ] ASF subversion and git services commented on LUCENE-5113: - Commit 1503580 from [~jpountz] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1503580 ] LUCENE-5113: Added (Monotonic)AppendingLongBuffer.freeze to pack the pending values. Allow for packing the pending values of our AppendingLongBuffers Key: LUCENE-5113 URL: https://issues.apache.org/jira/browse/LUCENE-5113 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5113.patch When working with small arrays, the pending values might require substantial space. So we could allow for packing the pending values in order to save space, the drawback being that this operation will make the buffer read-only.
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 322 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/322/ 1 tests failed. REGRESSION: org.apache.lucene.index.Test2BPostings.test Error Message: GC overhead limit exceeded Stack Trace: java.lang.OutOfMemoryError: GC overhead limit exceeded at __randomizedtesting.SeedInfo.seed([213D4E6918AFE1A5:A96971B3B6538C5D]:0) at org.apache.lucene.document.Document.indexedFieldsIterator(Document.java:315) at org.apache.lucene.document.Document.access$000(Document.java:45) at org.apache.lucene.document.Document$1.iterator(Document.java:289) at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:185) at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:265) at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:432) at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1511) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1186) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1167) at org.apache.lucene.index.Test2BPostings.test(Test2BPostings.java:76) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) Build Log: [...truncated 1433 lines...] 
BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/build.xml:396: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/build.xml:369: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/build.xml:39: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/lucene/build.xml:49: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/lucene/common-build.xml:1247: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-trunk/lucene/common-build.xml:890: There were test failures: 363 suites, 2297 tests, 1 error, 47 ignored (34 assumptions) Total time: 29 minutes 44 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Commented] (LUCENE-4542) Make RECURSION_CAP in HunspellStemmer configurable
[ https://issues.apache.org/jira/browse/LUCENE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709575#comment-13709575 ] Rafał Kuć commented on LUCENE-4542: --- +1 from me also. [~cmale] do we need updates on the patches or are they OK? Make RECURSION_CAP in HunspellStemmer configurable -- Key: LUCENE-4542 URL: https://issues.apache.org/jira/browse/LUCENE-4542 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.0 Reporter: Piotr Assignee: Chris Male Attachments: Lucene-4542-javadoc.patch, LUCENE-4542.patch, LUCENE-4542-with-solr.patch Currently there is private static final int RECURSION_CAP = 2; in the code of the class HunspellStemmer. It makes using hunspell with several dictionaries almost unusable due to bad performance (e.g. it costs 36 ms to stem a long sentence in Latvian for recursion_cap=2 and 5 ms for recursion_cap=1). It would be nice to be able to tune this number as needed. AFAIK this number (2) was chosen arbitrarily. (It's the first issue in my life, so please forgive me any mistakes.)
[jira] [Resolved] (LUCENE-5113) Allow for packing the pending values of our AppendingLongBuffers
[ https://issues.apache.org/jira/browse/LUCENE-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5113. -- Resolution: Fixed Fix Version/s: 4.5 Allow for packing the pending values of our AppendingLongBuffers Key: LUCENE-5113 URL: https://issues.apache.org/jira/browse/LUCENE-5113 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: 4.5 Attachments: LUCENE-5113.patch When working with small arrays, the pending values might require substantial space. So we could allow for packing the pending values in order to save space, the drawback being that this operation will make the buffer read-only.
[jira] [Commented] (LUCENE-5115) Make WAH8DocIdSet compute its cardinality at building time and use it for cost()
[ https://issues.apache.org/jira/browse/LUCENE-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709615#comment-13709615 ] ASF subversion and git services commented on LUCENE-5115: - Commit 1503619 from [~jpountz] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1503619 ] LUCENE-5115: WAHDocIdSet's iterator cost() function now returns the exact cardinality of the set. Make WAH8DocIdSet compute its cardinality at building time and use it for cost() Key: LUCENE-5115 URL: https://issues.apache.org/jira/browse/LUCENE-5115 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5115.patch DocIdSetIterator.cost() accuracy can be important for the performance of some queries (e.g. ConjunctionScorer). Since WAH8DocIdSet is immutable, we could compute its cardinality at building time and use it for the cost function.
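The idea is that an immutable set can pay the counting cost once, at build time, so cost() becomes exact and O(1). A toy Python sketch of that pattern - this is not the WAH8 encoding or Lucene's DocIdSet API:

```python
class FrozenDocIdSet:
    """Toy sketch of the LUCENE-5115 idea (not Lucene's WAH8DocIdSet):
    compute cardinality when the immutable set is built, then serve it
    as an exact, O(1) cost()."""

    def __init__(self, doc_ids):
        self._docs = tuple(sorted(set(doc_ids)))   # immutable once built
        self._cardinality = len(self._docs)        # paid once, at build time

    def cost(self):
        # Exact cardinality, not an estimate.
        return self._cardinality

    def __iter__(self):
        return iter(self._docs)
```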
Re: [VOTE] Release 4.4
Smoke tester passes for me, +1. Shai On Tue, Jul 16, 2013 at 9:32 AM, Steve Rowe sar...@gmail.com wrote: Please vote to release Lucene and Solr 4.4, built off revision 1503555 of https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_4. RC0 artifacts are available at: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC0-rev1503555 The smoke tester passes for me. Here's my +1. Steve
Re: [VOTE] Release 4.4
+1 Artifacts look good and smoke tester was happy. -- Adrien
[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 1800 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/1800/

2 tests failed.

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest:
  1) Thread[id=1285, name=recoveryCmdExecutor-457-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
        at java.net.Socket.connect(Socket.java:546)
        at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)

Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest (same thread and stack as above), ending in:
        at __randomizedtesting.SeedInfo.seed([1164206EE9791A42]:0)

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: There are still zombie threads that couldn't be terminated:
  1) Thread[id=1285, name=recoveryCmdExecutor-457-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4147 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4147/

2 tests failed.

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest:
  1) Thread[id=2408, name=recoveryCmdExecutor-981-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
        at java.net.Socket.connect(Socket.java:579)
        at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest (same thread and stack as above), ending in:
        at __randomizedtesting.SeedInfo.seed([8053406BCD77C098]:0)

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message: There are still zombie threads that couldn't be terminated:
  1) Thread[id=2408, name=recoveryCmdExecutor-981-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
Re: [jira] [Commented] (SOLR-4428) Update SolrUIMA wiki page
Eva: Unfortunately we had a spam-bot problem a while ago, so we locked down the Wiki. You need to be granted edit karma to change any of the Solr/Lucene Wiki pages. It's easy to get; just let us know what your Wiki login ID is, and usually Steve Rowe or I will add you as soon as we see it.

Best,
Erick

On Tue, Jul 16, 2013 at 3:00 AM, Tommaso Teofili (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/SOLR-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709537#comment-13709537 ]

Tommaso Teofili commented on SOLR-4428:
---

Hi Eva,

The first thing to do would be to update the configuration samples on the wiki, which are written using the old XML format from the first patch; they should be converted to the Solr format as per the examples at: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/src/test-files/uima/solr/collection1/conf/solrconfig.xml

Once the sample configurations have been updated, one can go deeper, describing each sample and what the configuration parameters do.

Update SolrUIMA wiki page
-
Key: SOLR-4428
URL: https://issues.apache.org/jira/browse/SOLR-4428
Project: Solr
Issue Type: Task
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Minor

The SolrUIMA wiki page (see http://wiki.apache.org/solr/SolrUIMA) is currently outdated and needs to be updated on the following topics:
* proper XML configuration
* how to use existing UIMA analyzers
* what's the default configuration
* how to change the default configuration

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5030) FuzzySuggester has to operate FSTs of Unicode-letters, not UTF-8, to work correctly for 1-byte (like English) and multi-byte (non-Latin) letters
[ https://issues.apache.org/jira/browse/LUCENE-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709695#comment-13709695 ]

Michael McCandless commented on LUCENE-5030:

Sorry for the long delay here ...

Just to verify: there is no point in passing FUZZY_UNICODE_AWARE to AnalyzingSuggester, right? In which case, I think the AnalyzingLookupFactory should not be changed? But, furthermore, I think we can isolate the changes to FuzzySuggester? E.g., move the FUZZY_UNICODE_AWARE flag down to FuzzySuggester, fix its ctor to strip that option when calling super(), move isFuzzyUnicodeAware down as well, and then override toLookupAutomaton to do the utf8 conversion + det? This way it's not even possible to send the fuzzy flag to AnalyzingSuggester.

FuzzySuggester has to operate FSTs of Unicode-letters, not UTF-8, to work correctly for 1-byte (like English) and multi-byte (non-Latin) letters
Key: LUCENE-5030
URL: https://issues.apache.org/jira/browse/LUCENE-5030
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 4.3
Reporter: Artem Lukanin
Assignee: Michael McCandless
Fix For: 5.0, 4.4
Attachments: benchmark-INFO_SEP.txt, benchmark-old.txt, benchmark-wo_convertion.txt, LUCENE-5030.patch, LUCENE-5030.patch, LUCENE-5030.patch, LUCENE-5030.patch, nonlatin_fuzzySuggester1.patch, nonlatin_fuzzySuggester2.patch, nonlatin_fuzzySuggester3.patch, nonlatin_fuzzySuggester4.patch, nonlatin_fuzzySuggester_combo1.patch, nonlatin_fuzzySuggester_combo2.patch, nonlatin_fuzzySuggester_combo.patch, nonlatin_fuzzySuggester.patch, nonlatin_fuzzySuggester.patch, nonlatin_fuzzySuggester.patch, run-suggest-benchmark.patch

There is a limitation in the current FuzzySuggester implementation: it computes edits in UTF-8 space instead of Unicode character (code point) space. This should be fixable: we'd need to fix TokenStreamToAutomaton to work in Unicode character space, then fix FuzzySuggester to do the same steps that FuzzyQuery does: do the LevN expansion in Unicode character space, then convert that automaton to UTF-8, then intersect with the suggest FST. See the discussion here: http://lucene.472066.n3.nabble.com/minFuzzyLength-in-FuzzySuggester-behaves-differently-for-English-and-Russian-td4067018.html#none
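The byte-space vs. code-point-space asymmetry behind this issue can be reproduced outside Lucene with a plain Levenshtein computation. The sketch below (illustrative Python, not Lucene code) shows that a one-letter Cyrillic typo has distance 1 over code points but distance 2 over UTF-8 bytes (each Cyrillic letter encodes to two bytes, both of which change), while an English typo costs 1 in both spaces:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance over any two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

word, typo = "мама", "мыма"
# Code-point space: one substitution.
assert levenshtein(word, typo) == 1
# UTF-8 byte space: 'а' is d0 b0 and 'ы' is d1 8b, so two bytes differ.
assert levenshtein(word.encode("utf-8"), typo.encode("utf-8")) == 2
# For ASCII-only English text the two spaces agree.
assert levenshtein("test", "tost") == 1
assert levenshtein(b"test", b"tost") == 1
```

This is why a fixed maxEdits budget behaves differently for English and Russian when edits are computed over UTF-8 bytes.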
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-4280:
Attachment: SOLR-4280-trunk.patch

I forgot I had a working patch lying around. Specify spellcheck.percentageResultsForSuggest=0.25 to force maxResultsForSuggest to be 25% of the smallest filterQuery DocSet. This allows maxResultsForSuggest to be adjusted dynamically based on the filters specified. It doesn't seem to work in a distributed environment, although the parameters are passed nicely. I haven't figured that out yet, but all shards return the same collation for undistributed requests. Tips?

spellcheck.maxResultsForSuggest based on filter query results
-
Key: SOLR-4280
URL: https://issues.apache.org/jira/browse/SOLR-4280
Project: Solr
Issue Type: Improvement
Components: spellchecker
Reporter: Markus Jelsma
Fix For: 4.4
Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch

spellcheck.maxResultsForSuggest takes a fixed number but ideally should be able to take a ratio and calculate that against the maximum number of results the filter queries return. At least in our case this would certainly add a lot of value. 99% of our end-users search within one or more filters, of which one is always unique. The number of documents for each of those unique filters varies significantly, ranging from 300 to 3.000.000 documents in which they search. The maxResultsForSuggest is set to a reasonably low value so it kind of works fine, but sometimes leads to undesired suggestions for a large subcorpus that has more misspellings. Spun off from SOLR-4278.
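The arithmetic the patch describes is simple to sketch. The following is an illustrative Python rendering of the idea (not the patch's actual Java code; the function name is hypothetical): with percentageResultsForSuggest=0.25 and filter queries whose DocSets hold 300 and 3,000,000 documents, the effective maxResultsForSuggest is 25% of the smallest DocSet.

```python
import math

def max_results_for_suggest(percentage, filter_docset_sizes):
    """Scale the smallest filter-query DocSet by the configured ratio.

    Illustrative sketch of spellcheck.percentageResultsForSuggest,
    rounding up so tiny subcorpora still get a nonzero threshold.
    """
    smallest = min(filter_docset_sizes)
    return math.ceil(percentage * smallest)

# A user searching within filters of 300 and 3,000,000 docs:
assert max_results_for_suggest(0.25, [300, 3_000_000]) == 75
# A very small subcorpus still yields a usable threshold:
assert max_results_for_suggest(0.25, [10]) == 3
```

Computing the threshold per request against the smallest filter is what lets one ratio serve subcorpora that differ in size by four orders of magnitude.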
[jira] [Created] (SOLR-5042) MoreLikeThis doesn't return a score when mlt.count is set to 10
Josh Curran created SOLR-5042:
-
Summary: MoreLikeThis doesn't return a score when mlt.count is set to 10
Key: SOLR-5042
URL: https://issues.apache.org/jira/browse/SOLR-5042
Project: Solr
Issue Type: Bug
Components: MoreLikeThis
Affects Versions: 4.3
Reporter: Josh Curran
Priority: Minor

The problem appears to be around the mlt.count within the solrconfig.xml. When this value is set to 10, the 10 values that have been identified as 'most like this' are returned with the original query; however, the 'score' field is missing. Changing mlt.count to, say, 11 and issuing the same query returns the 'score' field. This appears to be the workaround; 11 was just an arbitrary value, and 12 or 15 also work. The same problem was raised on stackoverflow: http://stackoverflow.com/questions/16513719/solr-more-like-this-dont-return-score-while-specify-mlt-count
Re: [VOTE] Release 4.4
+1 I upgraded http://jirasearch.mikemccandless.com to 4.4.0 and all looks good. And smoke tester is happy on Linux. Mike McCandless http://blog.mikemccandless.com On Tue, Jul 16, 2013 at 2:32 AM, Steve Rowe sar...@gmail.com wrote: Please vote to release Lucene and Solr 4.4, built off revision 1503555 of https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_4. RC0 artifacts are available at: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC0-rev1503555 The smoke tester passes for me. Here's my +1. Steve
Re: [VOTE] Release 4.4
I upgraded Elasticsearch to 4.4 and all tests pass. The only thing that was tricky was a custom deletion policy that now receives an empty commits list in IndexDeletionPolicy#onInit(List<IndexCommit> commits), due to some changes in how we handle empty indices (the create case). It seems pretty expert, but it might be worth documenting at the javadoc level that we also pass empty lists here. This should not block the release. +1 from my side On Tue, Jul 16, 2013 at 11:19 AM, Adrien Grand jpou...@gmail.com wrote: +1 Artifacts look good and smoke tester was happy. -- Adrien
[jira] [Commented] (LUCENE-5103) join on single-valued field with deleted docs scores too few docs
[ https://issues.apache.org/jira/browse/LUCENE-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709792#comment-13709792 ] Martijn van Groningen commented on LUCENE-5103: --- @David Thanks for fixing this! join on single-valued field with deleted docs scores too few docs - Key: LUCENE-5103 URL: https://issues.apache.org/jira/browse/LUCENE-5103 Project: Lucene - Core Issue Type: Bug Components: modules/join Affects Versions: 4.3.1 Reporter: David Smiley Assignee: David Smiley Fix For: 4.4 Attachments: LUCENE-5103_join_livedocs_bug.patch TermsIncludingScoreQuery has an inner class SVInnerScorer used when the to side of a join is single-valued. This has a nextDocOutOfOrder() method that is faulty when there are deleted documents, and a document that is deleted is matched by the join. It'll terminate with NO_MORE_DOCS prematurely. Interestingly, it _appears_ MVInnerScorer (multi-valued) was coded properly to not have this problem.
[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Jiang updated LUCENE-3069: -- Attachment: LUCENE-3069.patch Patch: revert hashCode() Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 4.4 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta.
[jira] [Updated] (SOLR-5027) CollapsingQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Attachment: SOLR-5027.patch CollapsingQParserPlugin --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group.
[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Summary: Result Set Collapse and Expand Plugins (was: CollapsingQParserPlugin) Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group.
[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations
[ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709801#comment-13709801 ] Martijn van Groningen commented on LUCENE-3972: --- I think so as well. What would be the best way to use the OrdinalMap? Just create an OrdinalsMap from a top level reader via SlowCompositeReaderWrapper#getSortedSetDocValues()? This seems the only place where OrdinalsMaps are cached. Improve AllGroupsCollector implementations -- Key: LUCENE-3972 URL: https://issues.apache.org/jira/browse/LUCENE-3972 Project: Lucene - Core Issue Type: Improvement Components: modules/grouping Reporter: Martijn van Groningen Attachments: LUCENE-3972.patch, LUCENE-3972.patch I think that the performance of TermAllGroupsCollectorm, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.
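The ArrayList-vs-BytesRefHash trade-off behind the original issue can be sketched in a few lines. The following is illustrative Python (not the Lucene implementation): collecting distinct group values with a list requires a linear membership scan per document, while a hash-based structure makes that check constant-time, which is the performance win the issue proposes.

```python
def distinct_groups_list(values):
    """ArrayList-style collection: O(n) membership test per document."""
    groups = []
    for v in values:
        if v not in groups:      # scans the whole list each time
            groups.append(v)
    return groups

def distinct_groups_hash(values):
    """BytesRefHash-style collection: O(1) expected membership test."""
    seen, groups = set(), []
    for v in values:
        if v not in seen:        # constant-time hash lookup
            seen.add(v)
            groups.append(v)
    return groups

docs = ["a", "b", "a", "c", "b", "a"]
# Both strategies produce the same distinct groups in first-seen order;
# only the per-document cost differs.
assert distinct_groups_list(docs) == distinct_groups_hash(docs) == ["a", "b", "c"]
```

With millions of documents and many distinct groups, the list variant degrades quadratically while the hash variant stays linear.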
[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces two new Solr plugins, the CollapsingQParserPlugin and the ExpandComponent. The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group. The ExpandComponent is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
was: The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group.

Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the CollapsingQParserPlugin and the ExpandComponent. The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group. The ExpandComponent is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
was: This ticket introduces two new Solr plugins, the CollapsingQParserPlugin and the ExpandComponent. The CollapsingQParserPlugin is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group. The ExpandComponent is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.

Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow. Initial syntax: fq={!collapse field=field_name} All documents in a group will be collapsed to the highest ranking document in the group. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
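Taking the parameters straight from the ticket description, a request combining the collapse filter with the expand component might be assembled like this. This is an illustrative Python sketch only; the field name product_group and the sort value are hypothetical, and the parameter names are those listed in the issue.

```python
from urllib.parse import urlencode

# Collapse result groups on a (hypothetical) product_group field, then ask
# the expand component to bring back up to 5 members per collapsed group.
params = {
    "q": "laptop",
    "fq": "{!collapse field=product_group}",  # collapse to top doc per group
    "expand": "true",                         # turn on the expand component
    "expand.field": "product_group",
    "expand.limit": "5",
    "expand.sort": "price asc",
}
query_string = urlencode(params)
# query_string would be appended to a /select request against a Solr core.
```

The {!collapse} local-params syntax keeps collapsing inside the normal fq flow, so it composes with caching and other filters rather than requiring a separate grouping code path.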
Re: Request for Mentor for LUCENE-2562 : Make Luke a Lucene/Solr Module
Yes, it's in the public repo - see my recent comment on https://issues.apache.org/jira/browse/LUCENE-2562 - it's really simple to check it out and run it. On Tue, Jul 16, 2013 at 2:07 AM, Ashish paliwalash...@gmail.com wrote: @Mark - Do you have it in public repo? On Tue, Jul 16, 2013 at 12:02 AM, Mark Miller markrmil...@gmail.com wrote: My feeling is that what we need most is what I've been working on (surprise, surprise :) ) We need a simple Java app, very similar to the std Luke app. We need it to be Apache licensed all the way through. We need it to be fully integrated as a module. We need it to be straightforward enough that any of the Lucene/Solr committers can easily work on it and update it as APIs change. GWT is probably a stretch for that goal - Apache Pivot is pretty straightforward though - for any reasonable Java developer. I picked it up in absolutely no time to build the thing from scratch - modifying it is 10 times easier. The backend code is all Java, the layout and widgets all XML. I've been pushing towards that goal (over the years now) with Luke ALE (Apache Lucene Edition). It's not a straight port of Luke with thinlet to Luke with Apache Pivot - Luke has 90% of its code in one huge class - I've already been working on modularizing that code as I've moved it over - not too heavily, because that would have made it difficult to keep porting code, but a good start. Now that the majority of features have been moved over, it's probably easier to keep refactoring - which is needed, because another very important missing piece is unit tests - and good unit tests will require even more refactoring of the code. I also think a GWT version - something that could probably run nicely with Solr - would be awesome. But way down the line in priority for me. We need something very close to Lucene that the committers will push up the hill as they push Lucene.
- Mark On Jul 15, 2013, at 11:15 AM, Robert Muir rcm...@gmail.com wrote: I disagree with this completely. Solr is last priority On Jul 15, 2013 6:14 AM, Jack Krupansky j...@basetechnology.com wrote: My personal thoughts/preferences/suggestions for Luke: 1. Need a clean Luke Java library – heavily unit-tested. As integrated with Lucene as possible. 2. A simple command line interface – always useful. 3. A Solr plugin handler – based on #1. Good for apps as well as Admin UI. Nice to be able to curl a request to look at a specific doc, for example. 4. GUI fully integrated with the new Solr Web Admin UI. A separate UI... sucks. 5. Any additional, un-integrated GUI is icing on the cake and not really desirable for Solr. May be great for Elasticsearch and other Lucene-based apps, but Solr should be the #1 priority – after #1 and #2 above. -- Jack Krupansky *From:* Dmitry Kan dmitry.luc...@gmail.com *Sent:* Monday, July 15, 2013 8:54 AM *To:* dev@lucene.apache.org *Subject:* Re: Request for Mentor for LUCENE-2562 : Make Luke a Lucene/Solr Module Hello guys, Indeed, the GWT port is work in progress and far from done. The driving factor here was to be able to later integrate Luke into the Solr admin as well as have the standalone webapp for non-Solr users. There is (was?) a Luke stats handler in the Solr UI that printed some stats on the index. That could be substituted with the GWT app. The code isn't yet ready to see the light. So if it makes more sense for Ajay to work on the existing jira with the Apache Pivot implementation, I would say go ahead. In the current port effort (the aforementioned github's fork) the UI is the original one, developed by Andrzej. Beside the UI rework there are plenty of things to port / verify (like e.g. the Hadoop plugin) against the latest Lucene versions.
See the readme.md: https://github.com/dmitrykey/luke Whichever way's taken, hopefully we end up having stable releases of Luke :) Dmitry Kan On 14 July 2013 22:38, Andrzej Bialecki a...@getopt.org wrote: On 7/14/13 5:04 AM, Ajay Bhat wrote: Shawn and Andrzej, Thanks for answering my questions. I've looked over the code done by Dmitry and I'll look into what I can do to help with the UI porting in the future. I was actually thinking of doing this JIRA as a project by myself, with some assistance from the community, after getting a mentor for the ASF ICFOSS program, which I haven't found yet. It would be great if I could get one of you guys as a mentor. As the UI work has been mostly done by others like Dmitry Kan, I don't think I need to work on that majorly for now. It's far from done - he just started the process. What other work is there to be done that I can do as a project? Any new features or improvements? Regards, Ajay On Jul 14, 2013 1:54 AM, Andrzej Bialecki a...@getopt.org wrote: On 7/13/13 8:56 PM, Shawn Heisey wrote: On 7/13/2013 3:15 AM, Ajay Bhat wrote: One more question: What version of
[jira] [Updated] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5114: --- Attachment: LUCENE-5114.patch Initial patch; tests pass but I haven't run ant precommit yet ... remove boolean useCache param from TermsEnum.seekCeil/Exact --- Key: LUCENE-5114 URL: https://issues.apache.org/jira/browse/LUCENE-5114 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-5114.patch Long ago terms dict had a cache, but it was problematic and we removed it, but the API still has a relic boolean useCache ... I think we should drop it from the API as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release 4.4
On Tue, Jul 16, 2013 at 9:44 AM, Simon Willnauer simon.willna...@gmail.com wrote: The only thing that was tricky was a custom deletion policy that now receives an empty commits list in DeletionPolicy#onInit(List<IndexCommit> commits) due to some changes in how we handle empty indices (the create case). It seems pretty expert, but it might be worth documenting at the javadoc level that we also pass empty lists here. I'll update IDP's javadocs. Mike McCandless http://blog.mikemccandless.com
[jira] [Commented] (LUCENE-5116) IW.addIndexes doesn't prune all deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-5116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709850#comment-13709850 ] Michael McCandless commented on LUCENE-5116: +1 IW.addIndexes doesn't prune all deleted segments Key: LUCENE-5116 URL: https://issues.apache.org/jira/browse/LUCENE-5116 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5116_test.patch at the least, this can easily create segments with maxDoc == 0. It seems buggy: elsewhere we prune these segments out, so it's expected to have a commit point with no segments rather than a segment with 0 documents...
[jira] [Commented] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709852#comment-13709852 ] Hoss Man commented on SOLR-4310: Amit: when you posted your last patch, you said... bq. Admittedly there are some points I still need to investigate like why the groupcount isn't set in the TopGroups during distributed search. ...is that a non-issue? is there still something there that needs fixed? If groups.ngroups is specified, the docList's numFound should be the number of groups - Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Assignee: Hoss Man Priority: Minor Fix For: 4.4 Attachments: SOLR-4310_2.patch, SOLR-4310_3.patch, SOLR-4310.patch If you group by a field, the response may look like this: lst name=grouped lst name=series int name=matches138/int int name=ngroups1/int result name=doclist numFound=138 start=0 doc int name=id267038365/int str name=name Larry's Grand Ole Garage Country Dance - Pure Country /str /doc /result /lst /lst and if you specify group.main then the doclist becomes the result and you lose all context of the number of groups. If you want to keep your response format backwards compatible with clients (i.e. clients who don't know about the grouped format), setting group.main=true solves this BUT the numFound is the number of raw matches instead of the number of groups. This may have downstream consequences. I'd like to propose that if the user specifies ngroups=true then when creating the returning DocSlice, set the numFound to be the number of groups instead of the number of raw matches to keep the response consistent with what the user would expect. -- This message is automatically generated by JIRA. 
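The distinction SOLR-4310 is about — raw match count versus distinct group count — in miniature. `GroupCounts` is an illustrative helper, not Solr code; it models each matching document by its group-field value:

```java
import java.util.*;

class GroupCounts {
    /** numFound as Solr reports it today with group.main=true: raw matches. */
    static int matches(List<String> docsByGroupField) {
        return docsByGroupField.size();
    }

    /** What the issue proposes to report when group.ngroups=true: distinct groups. */
    static long ngroups(List<String> docsByGroupField) {
        return docsByGroupField.stream().distinct().count();
    }
}
```

In the example response from the issue, `matches` would be 138 while `ngroups` is 1 — the patch makes the returned DocSlice's numFound reflect the latter.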
Re: [VOTE] Release 4.4
+1 Smoke tester is happy on mac. On Tue, Jul 16, 2013 at 12:02 PM, Steve Rowe sar...@gmail.com wrote: Please vote to release Lucene and Solr 4.4, built off revision 1503555 of https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_4. RC0 artifacts are available at: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC0-rev1503555 The smoke tester passes for me. Here's my +1. Steve -- Regards, Shalin Shekhar Mangar.
[jira] [Commented] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709857#comment-13709857 ] Robert Muir commented on LUCENE-5114: - +1 to nuking this parameter!
[jira] [Created] (LUCENE-5117) DISI.iterator() should never return null.
Robert Muir created LUCENE-5117: --- Summary: DISI.iterator() should never return null. Key: LUCENE-5117 URL: https://issues.apache.org/jira/browse/LUCENE-5117 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir If you have a Filter, you have to check for null twice: Filter.getDocIDSet() can return a null DocIDSet, and then DocIDSet.iterator() can return a null iterator. There is no reason for this: I think iterator() should never return null (consistent with terms/postings apis).
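To make the complaint concrete — a minimal sketch using simplified stand-in types (`Filter`, `DocIdSet`, `DocIdSetIterator` here are hypothetical one-method interfaces, not Lucene's real classes), showing the two null checks today's contract forces on callers and how a never-null `iterator()` removes one of them:

```java
// Stand-in types illustrating LUCENE-5117; both getDocIdSet() and
// iterator() may return null today, forcing two checks per call site.
interface DocIdSetIterator { int nextDoc(); }        // simplified stand-in
interface DocIdSet { DocIdSetIterator iterator(); }  // simplified stand-in
interface Filter { DocIdSet getDocIdSet(); }         // simplified stand-in

class NullChecks {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** A sentinel empty iterator: always exhausted, never null. */
    static final DocIdSetIterator EMPTY_ITERATOR = () -> NO_MORE_DOCS;

    /** Today's caller: two null checks before a single doc is seen. */
    static int countMatches(Filter f) {
        DocIdSet set = f.getDocIdSet();
        if (set == null) return 0;                   // null check #1
        DocIdSetIterator it = set.iterator();
        if (it == null) return 0;                    // null check #2
        int count = 0;
        while (it.nextDoc() != NO_MORE_DOCS) count++;
        return count;
    }

    /** Proposed contract: iterator() returns a sentinel like
     *  EMPTY_ITERATOR instead of null, so only the set needs checking. */
    static int countMatchesProposed(Filter f) {
        DocIdSet set = f.getDocIdSet();
        if (set == null) return 0;
        int count = 0;
        for (DocIdSetIterator it = set.iterator(); it.nextDoc() != NO_MORE_DOCS; ) count++;
        return count;
    }
}
```

Lucene's terms/postings APIs already follow the never-null convention, which is the consistency the issue asks for.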
[jira] [Commented] (LUCENE-5117) DISI.iterator() should never return null.
[ https://issues.apache.org/jira/browse/LUCENE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709861#comment-13709861 ] Michael McCandless commented on LUCENE-5117: +1
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709867#comment-13709867 ] Michael McCandless commented on LUCENE-3069: bq. However, seekExact(BytesRef, TermsState) simply 'copy' the value of termState to enum, which doesn't actually operate 'seek' on dictionary. This is normal / by design. It's so that the case of seekExact(TermState) followed by .docs or .docsAndPositions is fast. We only need to re-load the metadata if the caller then tries to do .next() {quote} bq. Maybe instead of term and meta members, we could just hold the current pair? Oh, yes, I once thought about this, but not sure: like, can the callee always makes sure that, when 'term()' is called, it will always return a valid term? The codes in MemoryPF just return 'pair.output' regardless whether pair==null, is it safe? {quote} We can't guarantee that, but I think we can just check if pair == null and return null from term()? {quote} By the way, for real data, when two outputs are not 'NO_OUTPUT', even they contains the same metadata + stats, it seems to be very seldom that their arcs can be identical on FST (increases less than 1MB for wikimedium1m if equals always return false for non-singleton argument). Therefore... yes, hashCode() isn't necessary here. {quote} Hmm, but it seems like we should implement it? Ie we do get a smaller FST when implementing it? Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 4.4 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. 
Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta.
[jira] [Commented] (SOLR-4937) SolrCloud doesn't distribute null values
[ https://issues.apache.org/jira/browse/SOLR-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709874#comment-13709874 ] Shalin Shekhar Mangar commented on SOLR-4937: - Steve -- Support for writing null values in ClientUtils was added by SOLR-4133 in Solr 4.1. Which version of Solr were you using? SolrCloud doesn't distribute null values Key: SOLR-4937 URL: https://issues.apache.org/jira/browse/SOLR-4937 Project: Solr Issue Type: Bug Reporter: Steve Davids Fix For: 4.4 When trying to overwrite field values in SolrCloud using doc.setField(fieldName, null) it produces inconsistent behavior depending on the routing of the document to a specific shard. The binary format that is sent in preserves the null, but when the DistributedProcessor forwards the message to replicas it writes the message to XML using ClientUtils.writeVal(..) which drops any null value from the XML representation. This was especially problematic when a custom processor was initially placed after the distributed processor using the previously mentioned setField(null) approach but then moved ahead of the DistributedProcessor which no longer works as expected. It appears that I now need to updated the code to: doc.setField(fieldName, Collections.singletonMap(set, null)) for it to properly distribute throughout the cloud due to the XML restrictions. The fact that the custom processor needs to change depending on it's location in reference to the DistributedProcessor is a drag. I believe there should be a requirement that you can take a SolrInputDocument - toXml - toSolrInputDocument and assert that the two SolrInputDocuments are equivalent, instead of a lossy translation to XML. -- This message is automatically generated by JIRA. 
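The workaround Steve lands on can be shown in isolation — a sketch of just the atomic-update map construction (the surrounding SolrInputDocument usage is omitted so this stays self-contained; note that `Collections.singletonMap` permits a null value, which is what survives the XML round-trip):

```java
import java.util.Collections;
import java.util.Map;

class NullSetUpdate {
    /** Build the value that survives XML serialization: an atomic-update
     *  map {"set": null} rather than a bare null field value, which
     *  ClientUtils.writeVal(..) drops from the XML representation. */
    static Map<String, Object> removeFieldValue() {
        return Collections.singletonMap("set", null);
    }
}
```

The document field would then be set with `doc.setField(fieldName, NullSetUpdate.removeFieldValue())` instead of `doc.setField(fieldName, null)`.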
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709886#comment-13709886 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1503781 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1503781 ] LUCENE-3069: remove some nocommits, update hashCode() equal()
[jira] [Commented] (LUCENE-5117) DISI.iterator() should never return null.
[ https://issues.apache.org/jira/browse/LUCENE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709890#comment-13709890 ] Adrien Grand commented on LUCENE-5117: -- +1
[jira] [Commented] (LUCENE-5117) DISI.iterator() should never return null.
[ https://issues.apache.org/jira/browse/LUCENE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709893#comment-13709893 ] Uwe Schindler commented on LUCENE-5117: --- Damn, fix this. It's horrible with those null checks! :-)
[jira] [Commented] (LUCENE-5117) DISI.iterator() should never return null.
[ https://issues.apache.org/jira/browse/LUCENE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709896#comment-13709896 ] Robert Muir commented on LUCENE-5117: - I am working on it. I am reviewing all uses of this method...
[jira] [Updated] (LUCENE-5088) Add term filter
[ https://issues.apache.org/jira/browse/LUCENE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-5088: -- Attachment: LUCENE-5088.patch Added a test. I will commit this soon. Add term filter --- Key: LUCENE-5088 URL: https://issues.apache.org/jira/browse/LUCENE-5088 Project: Lucene - Core Issue Type: Improvement Reporter: Martijn van Groningen Assignee: Martijn van Groningen Priority: Minor Attachments: LUCENE-5088.patch, LUCENE-5088.patch I think it makes sense to add a term filter: * There is a TermsFilter, but no TermFilter. * I think it is a bit more efficient than wrapping a TermQuery in a QueryWrapperFilter. * Allows the usage of DocsEnum.FLAG_NONE.
[jira] [Created] (SOLR-5043) hostname lookup in SystemInfoHandler should be refactored to not block core (re)load
Hoss Man created SOLR-5043: -- Summary: hostname lookup in SystemInfoHandler should be refactored to not block core (re)load Key: SOLR-5043 URL: https://issues.apache.org/jira/browse/SOLR-5043 Project: Solr Issue Type: Improvement Reporter: Hoss Man SystemInfoHandler currently looks up the hostname of the machine on its init, and caches it for its lifecycle -- there is a comment to the effect that the reason for this is because on some machines (notably ones with wacky DNS settings) looking up the hostname can take a long ass time in some JVMs... {noformat} // on some platforms, resolving canonical hostname can cause the thread // to block for several seconds if nameservices aren't available // so resolve this once per handler instance //(ie: not static, so core reload will refresh) {noformat} But as we move forward with a lot more multi-core, solr-cloud, dynamically updated instances, even paying this cost per core-reload is expensive. We should refactor this so that SystemInfoHandler instances init immediately, with some kind of lazy loading of the hostname info in a background thread (especially since the only real point of having that info here is for UI use, so you can keep track of what machine you are looking at)
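The refactor sketched in the issue — init immediately, resolve in the background — could look roughly like this. `LazyHostname` and the `(resolving)` placeholder are illustrative names, not Solr code; a real SystemInfoHandler would pass an `InetAddress` lookup as the supplier:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Sketch: kick off the (possibly slow) hostname lookup on a background
 *  thread at init, returning a placeholder until it completes, so core
 *  (re)load is never blocked on DNS. */
class LazyHostname {
    private final CompletableFuture<String> hostname;

    LazyHostname(Supplier<String> slowLookup) {
        // constructor returns immediately; lookup runs on a pooled thread
        hostname = CompletableFuture.supplyAsync(slowLookup);
    }

    /** Non-blocking read: placeholder until the lookup finishes. */
    String get() {
        return hostname.getNow("(resolving)");
    }

    /** Blocking read, for callers that really need the value. */
    String await() {
        return hostname.join();
    }
}
```

The UI would simply re-poll `get()`, so a slow resolver degrades the display rather than the core lifecycle.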
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709928#comment-13709928 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1503797 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1503797 ] LUCENE-3069: merge trunk changes over
[jira] [Commented] (LUCENE-5117) DISI.iterator() should never return null.
[ https://issues.apache.org/jira/browse/LUCENE-5117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709944#comment-13709944 ] Robert Muir commented on LUCENE-5117: - After investigation there are a few concerns of mine: * today, there are some uses of DISI not checking the result of iterator() * changing the API here is kind of a break (maybe it should be 5.0 only?) * I am not totally happy with the change because Weight.scorer can return a null Scorer (which is a DISI). Although this is unrelated to DISI.iterator(), it's still a potential cause for bugs. Maybe there is a better solution I'm not thinking of too...
[jira] [Created] (LUCENE-5118) spatial strategy- add multiplier to makeDistanceValueSource()
David Smiley created LUCENE-5118: Summary: spatial strategy- add multiplier to makeDistanceValueSource() Key: LUCENE-5118 URL: https://issues.apache.org/jira/browse/LUCENE-5118 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.5 SpatialStrategy has this abstract method: {code} /** * Make a ValueSource returning the distance between the center of the * indexed shape and {@code queryPoint}. If there are multiple indexed shapes * then the closest one is chosen. */ public abstract ValueSource makeDistanceValueSource(Point queryPoint); {code} I'd like to add another argument {{double multiplier}} that is internally multiplied to the result per document. It's a convenience over having the user wrap this with another ValueSource, and it'd be faster too. Typical usage would be to add a degrees-to-kilometers multiplier. The current method could be marked deprecated with a default implementation that invokes the new one with a 1.0 multiplier.
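The typical usage mentioned — a degrees-to-kilometers multiplier applied once inside the value source — in miniature. `DistanceMultiplier` and `DEG_TO_KM` are illustrative names, not from the issue; the factor of roughly 111.195 km per degree is the approximate mean-earth conversion:

```java
import java.util.function.DoubleUnaryOperator;

class DistanceMultiplier {
    /** Approximate mean-earth degrees-to-kilometers factor
     *  (illustrative value: earth mean radius * pi / 180). */
    static final double DEG_TO_KM = 111.195;

    /** Wraps a per-document distance function (in degrees) so each value
     *  is scaled inside the source, as the proposed multiplier arg would,
     *  rather than by a user-supplied wrapping ValueSource. */
    static DoubleUnaryOperator scaled(DoubleUnaryOperator distanceDegrees,
                                      double multiplier) {
        return doc -> distanceDegrees.applyAsDouble(doc) * multiplier;
    }
}
```

A multiplier of 1.0 reproduces the current behavior, which is why the deprecated method can delegate to the new one.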
[jira] [Updated] (SOLR-2345) Extend geodist() to support MultiValued lat long field
[ https://issues.apache.org/jira/browse/SOLR-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-2345: --- Fix Version/s: (was: 4.4) 4.5 Extend geodist() to support MultiValued lat long field -- Key: SOLR-2345 URL: https://issues.apache.org/jira/browse/SOLR-2345 Project: Solr Issue Type: New Feature Components: spatial Reporter: Bill Bell Assignee: David Smiley Fix For: 4.5 Attachments: SOLR-2345_geodist_refactor.patch Extend geodist() and {!geofilt} to support a multiValued lat,long field without using geohash. sort=geodist() asc
[jira] [Commented] (LUCENE-3972) Improve AllGroupsCollector implementations
[ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709974#comment-13709974 ] Robert Muir commented on LUCENE-3972: - Yes you are correct. It's currently pretty ugly. But, I don't think we should really cache this anywhere else other than a slow-wrapper (it would be wrong to do so)... Improve AllGroupsCollector implementations -- Key: LUCENE-3972 URL: https://issues.apache.org/jira/browse/LUCENE-3972 Project: Lucene - Core Issue Type: Improvement Components: modules/grouping Reporter: Martijn van Groningen Attachments: LUCENE-3972.patch, LUCENE-3972.patch I think that the performance of TermAllGroupsCollector, DVAllGroupsCollector.BR and DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead of an ArrayList.
[jira] [Updated] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5101: - Attachment: DocIdSetBenchmark.java Well spotted. Maybe I did a mistake when moving the data from the benchmark output to the charts. I modified the program so that it outputs directly the input of the charts. See the updated charts at http://people.apache.org/~jpountz/doc_id_sets.html. I also modified it so that memory uses a log scale too. make it easier to plugin different bitset implementations to CachingWrapperFilter - Key: LUCENE-5101 URL: https://issues.apache.org/jira/browse/LUCENE-5101 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch Currently this is possible, but its not so friendly: {code} protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException { if (docIdSet == null) { // this is better than returning null, as the nonnull result can be cached return EMPTY_DOCIDSET; } else if (docIdSet.isCacheable()) { return docIdSet; } else { final DocIdSetIterator it = docIdSet.iterator(); // null is allowed to be returned by iterator(), // in this case we wrap with the sentinel set, // which is cacheable. if (it == null) { return EMPTY_DOCIDSET; } else { /* INTERESTING PART */ final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); bits.or(it); return bits; /* END INTERESTING PART */ } } } {code} Is there any value to having all this other logic in the protected API? It seems like something thats not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like: {code} protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) { final FixedBitSet bits = new FixedBitSet(reader.maxDoc()); bits.or(iterator); return bits; } {code} -- This message is automatically generated by JIRA. 
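What the proposed `cacheImpl` hook boils down to, sketched with `java.util.BitSet` standing in for Lucene's FixedBitSet (`CacheSketch` is an illustrative name; the real method would take a DocIdSetIterator and AtomicReader). The subclass supplies only the bitset construction, while the null/cacheable/sentinel bookkeeping stays in the now-final caller:

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

/** Sketch of the "INTERESTING PART" extracted into its own overridable
 *  method: materialize an iterator of doc ids into a cacheable bitset. */
class CacheSketch {
    static BitSet cacheImpl(PrimitiveIterator.OfInt iterator, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);          // stand-in for FixedBitSet(maxDoc)
        while (iterator.hasNext()) {
            bits.set(iterator.nextInt());          // stand-in for bits.or(it)
        }
        return bits;
    }
}
```

A subclass wanting a different set implementation (the point of the issue) would override just this one method instead of re-implementing the whole docIdSetToCache dance.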
[jira] [Commented] (LUCENE-4311) HunspellStemFilter returns another values than Hunspell in console / command line with same dictionaries.
[ https://issues.apache.org/jira/browse/LUCENE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709978#comment-13709978 ] Lukas Vlcek commented on LUCENE-4311: - Hi Chris, I have been doing some experiments with this czech dictionary and to me it seems that it yields the best results with RECURSION_CAP = 0. Seriously! The double folding does not bring any advantage in case of this particular dictionary. In fact the dictionary is in such a good shape that it allows for direct generation of all word forms for words in dic file and only one affix rule is enough for input words to see if it matches any of the root forms, no folding needed at all. With RECURSION_CAP 1 or 2 it can generate a lot of incorrect words. The shorter the input word is the higher chance of getting incorrect (i.e. completely misleading) results up to the point where it is not useful for Lucene indexing at all. Please, can we have this fixed? I believe all is needed now is to have a look at #LUCENE-4542 and make sure the recursion level is configurable. This would be really great enhancement. HunspellStemFilter returns another values than Hunspell in console / command line with same dictionaries. - Key: LUCENE-4311 URL: https://issues.apache.org/jira/browse/LUCENE-4311 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.5, 4.0-ALPHA, 3.6.1 Environment: Apache Solr 3.5 - 4.0, Apache Tomcat 7.0 Reporter: Jan Rieger Attachments: cs_CZ.aff, cs_CZ.dic When I used HunspellStemFilter for stemming the czech language text, it returns me bad results. For example word praha returns praha and prahnout, what is not correct. So I try the same in my console (Hunspell command line) with exactly same dictionaries and it returns only praha and this is correct. Can somebody help me? -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
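The recursion-cap effect Lukas describes can be sketched with a toy affix stripper. This is not Lucene's Hunspell code; the rules, root list, and function names below are invented for illustration. With a cap of 0 each input word gets one layer of affix rules; a higher cap folds rule outputs back through the rules and can reach unrelated dictionary roots.

```python
# Toy affix stripper, loosely in the spirit of HunspellStemFilter's
# RECURSION_CAP. Rules and roots are invented; Czech morphology is
# only hinted at.

ROOTS = {"praha", "prahnout"}          # hypothetical dictionary entries
RULES = [("a", ""), ("", "nout")]      # (strip suffix, add suffix) pairs

def stems(word, cap):
    """Collect dictionary roots reachable with at most `cap` recursive folds."""
    found = set()
    def rec(w, depth):
        if w in ROOTS:
            found.add(w)
        if depth > cap:                # stop folding past the cap
            return
        for strip, add in RULES:
            if w.endswith(strip) and len(w) > len(strip):
                rec(w[: len(w) - len(strip)] + add, depth + 1)
    rec(word, 0)
    return found

print(sorted(stems("praha", cap=0)))   # ['praha']
print(sorted(stems("praha", cap=1)))   # ['praha', 'prahnout'] - over-generation
```

With cap=0 only the direct rule applications are checked against the dictionary; with cap=1 the stripped form "prah" is folded through the rules again and reaches the unrelated root "prahnout", which mirrors the praha/prahnout symptom in the report.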
[jira] [Commented] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709979#comment-13709979 ] ASF subversion and git services commented on LUCENE-5114: - Commit 1503805 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1503805 ] LUCENE-5114: remove unused useCache param remove boolean useCache param from TermsEnum.seekCeil/Exact --- Key: LUCENE-5114 URL: https://issues.apache.org/jira/browse/LUCENE-5114 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-5114.patch Long ago the terms dict had a cache, but it was problematic and we removed it; the API still has a relic boolean useCache ... I think we should drop it from the API as well.
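With the useCache flag gone, seekCeil keeps a simple contract: position the enum at the smallest term greater than or equal to the target. A minimal sketch of that contract over a sorted term list (plain Python, not the Lucene API; the function name and return values are invented to mirror SeekStatus):

```python
import bisect

def seek_ceil(terms, target):
    """Position at the smallest term >= target, mimicking TermsEnum.seekCeil.
    Returns ('FOUND', term), ('NOT_FOUND', term) or ('END', None).
    `terms` must be sorted, as terms in a terms dictionary are."""
    i = bisect.bisect_left(terms, target)
    if i == len(terms):
        return ("END", None)           # target sorts after every term
    if terms[i] == target:
        return ("FOUND", target)       # exact hit
    return ("NOT_FOUND", terms[i])     # positioned at the ceiling term

terms = ["apache", "lucene", "solr"]
print(seek_ceil(terms, "lucene"))   # ('FOUND', 'lucene')
print(seek_ceil(terms, "luke"))     # ('NOT_FOUND', 'solr')
print(seek_ceil(terms, "zoo"))      # ('END', None)
```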
[jira] [Updated] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Wooden updated SOLR-4310: -- Attachment: SOLR-4310_4.patch Added another test class to check a few more use cases, both single-core and distributed. There are also test cases written that rely on SOLR-2894, which are commented out. Fixed an issue where numFound (group count) would be 0 for a single-core edge case. Our tests suggest that distributed works just fine with 4310, unless Amit recalls what he alluded to previously. On a semi-related note: when writing the additional tests, we noticed some inconsistent behavior around rows. Not a result of the 4310 patch nor pertinent to 4310's purpose, just something we discovered. On a single core with group.limit=1 and group.main, setting rows=10 will return 10 _documents_. A distributed setup with the same params will return 10 _groups_. A commented-out failing test case is included in the patch. If others can confirm it, we can open a new JIRA ticket for it. If groups.ngroups is specified, the docList's numFound should be the number of groups - Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Assignee: Hoss Man Priority: Minor Fix For: 4.4 Attachments: SOLR-4310_2.patch, SOLR-4310_3.patch, SOLR-4310_4.patch, SOLR-4310.patch If you group by a field, the response may look like this:

<lst name="grouped">
  <lst name="series">
    <int name="matches">138</int>
    <int name="ngroups">1</int>
    <result name="doclist" numFound="138" start="0">
      <doc>
        <int name="id">267038365</int>
        <str name="name">Larry's Grand Ole Garage Country Dance - Pure Country</str>
      </doc>
    </result>
  </lst>
</lst>

and if you specify group.main then the doclist becomes the result and you lose all context of the number of groups. If you want to keep your response format backwards compatible with clients (i.e. clients who don't know about the grouped format), setting group.main=true solves this BUT the numFound is the number of raw matches instead of the number of groups. This may have downstream consequences. I'd like to propose that if the user specifies ngroups=true then, when creating the returned DocSlice, we set numFound to the number of groups instead of the number of raw matches, to keep the response consistent with what the user would expect.
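The proposal above amounts to: when ngroups=true and the grouped doclist is promoted to the main result, report the group count as numFound. A self-contained sketch of that behavior (plain Python, not Solr's grouping code; the field and function names are made up):

```python
from collections import OrderedDict

def grouped_response(docs, group_field, ngroups=False):
    """Collapse docs to one representative per group (group.limit=1,
    group.main=true style) and set numFound per the SOLR-4310 proposal:
    the group count when ngroups is requested, raw matches otherwise."""
    groups = OrderedDict()
    for doc in docs:
        groups.setdefault(doc[group_field], doc)   # keep first doc per group
    num_found = len(groups) if ngroups else len(docs)
    return {"numFound": num_found, "docs": list(groups.values())}

# 138 raw matches spread over 3 values of the "series" field:
docs = [{"id": i, "series": i % 3} for i in range(138)]
print(grouped_response(docs, "series")["numFound"])                # 138
print(grouped_response(docs, "series", ngroups=True)["numFound"])  # 3
```

The second call is the proposed behavior: numFound matches the number of docs actually returned, which is what a grouping-unaware client would expect.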
[jira] [Updated] (LUCENE-2750) add Kamikaze 3.0.1 into Lucene
[ https://issues.apache.org/jira/browse/LUCENE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-2750: - Attachment: LUCENE-2750.patch Updated patch: DISI.cost() now returns the cardinality of the set, computed at build time. add Kamikaze 3.0.1 into Lucene -- Key: LUCENE-2750 URL: https://issues.apache.org/jira/browse/LUCENE-2750 Project: Lucene - Core Issue Type: Sub-task Components: modules/other Reporter: hao yan Assignee: Adrien Grand Attachments: LUCENE-2750.patch, LUCENE-2750.patch Original Estimate: 336h Remaining Estimate: 336h Kamikaze 3.0.1 is the updated version of Kamikaze 2.0.0. It achieves significantly better performance than Kamikaze 2.0.0 in terms of both compressed size and decompression speed. The main difference between the two versions is that Kamikaze 3.0.x uses a much more efficient implementation of the PForDelta compression algorithm. My goal is to integrate this highly efficient PForDelta implementation into the Lucene Codec.
Re: svn commit: r1503808 - /lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java
Nice catch. Now there is very low constant overhead to these things, thank you!!! On Tue, Jul 16, 2013 at 10:50 AM, jpou...@apache.org wrote: Author: jpountz Date: Tue Jul 16 17:50:00 2013 New Revision: 1503808 URL: http://svn.apache.org/r1503808 Log: Fix initial sizing of MonotonicAppendingLongBuffer. Modified: lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java?rev=1503808&r1=1503807&r2=1503808&view=diff

--- lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java (original)
+++ lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/util/packed/MonotonicAppendingLongBuffer.java Tue Jul 16 17:50:00 2013
@@ -43,7 +43,7 @@ public final class MonotonicAppendingLon
    * @param pageSize the size of a single page */
   public MonotonicAppendingLongBuffer(int initialPageCount, int pageSize) {
     super(initialPageCount, pageSize);
-    averages = new float[pageSize];
+    averages = new float[initialPageCount];
   }
   /** Create an {@link MonotonicAppendingLongBuffer} with initialPageCount=16
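The one-line fix matters because averages holds one entry per page, so its length should track the page count, not the page size. A back-of-the-envelope comparison of the up-front allocation (initialPageCount=16 is the default the constructor javadoc mentions; the pageSize value here is an assumed example, not taken from the commit):

```python
# Rough overhead comparison for the averages float[] allocation,
# at 4 bytes per Java float. pageSize is an illustrative assumption.
initial_page_count = 16     # default initialPageCount per the javadoc
page_size = 1024            # assumed page size for illustration

before = page_size * 4          # float[pageSize]        -> 4096 bytes up front
after = initial_page_count * 4  # float[initialPageCount] ->   64 bytes up front
print(before, after)            # 4096 64
```

That difference is the "very low constant overhead" the reply refers to: a freshly created buffer no longer pays for a full page's worth of averages before any values are appended.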
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #388: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/388/ 2 tests failed. FAILED: org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=2757, name=recoveryCmdExecutor-1556-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:679) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=2757, name=recoveryCmdExecutor-1556-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] [...same stack trace as above...] at __randomizedtesting.SeedInfo.seed([316F4106FF3E7D59]:0) FAILED: org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=2757, name=recoveryCmdExecutor-1556-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] [...same stack trace as above...]
[jira] [Commented] (SOLR-4310) If groups.ngroups is specified, the docList's numFound should be the number of groups
[ https://issues.apache.org/jira/browse/SOLR-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710047#comment-13710047 ] Amit Nithian commented on SOLR-4310: I apologize for my lack of memory on this... if I recall, I don't think it was anything serious; perhaps something that seemed intuitive but wasn't, and I had to work around it. If your tests show that this works in distributed mode then it's probably good to go! If groups.ngroups is specified, the docList's numFound should be the number of groups - Key: SOLR-4310 URL: https://issues.apache.org/jira/browse/SOLR-4310 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.1 Reporter: Amit Nithian Assignee: Hoss Man Priority: Minor Fix For: 4.4 Attachments: SOLR-4310_2.patch, SOLR-4310_3.patch, SOLR-4310_4.patch, SOLR-4310.patch
[jira] [Commented] (LUCENE-5088) Add term filter
[ https://issues.apache.org/jira/browse/LUCENE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710068#comment-13710068 ] ASF subversion and git services commented on LUCENE-5088: - Commit 1503823 from [~martijn.v.groningen] in branch 'dev/trunk' [ https://svn.apache.org/r1503823 ] LUCENE-5088: Added TermFilter to filter docs by a specific term. Add term filter --- Key: LUCENE-5088 URL: https://issues.apache.org/jira/browse/LUCENE-5088 Project: Lucene - Core Issue Type: Improvement Reporter: Martijn van Groningen Assignee: Martijn van Groningen Priority: Minor Attachments: LUCENE-5088.patch, LUCENE-5088.patch I think it makes sense to add a term filter: * There is a TermsFilter, but no TermFilter. * I think it is a bit more efficient than wrapping a TermQuery in a QueryWrapperFilter. * Allows the usage of DocsEnum.FLAG_NONE.
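The filter-vs-query distinction behind this issue can be sketched with a toy inverted index (invented data and names, not the Lucene TermFilter/QueryWrapperFilter classes): a filter only selects the matching documents, while a scored query also computes a score per hit, which is the extra work a filter can skip, e.g. by reading postings with DocsEnum.FLAG_NONE.

```python
# Toy inverted index: term -> postings list of doc ids.
index = {
    "lucene": [0, 2, 5],
    "solr":   [1, 2],
}

def term_filter(term):
    """Filter semantics: just the matching doc ids, no scoring."""
    return set(index.get(term, []))

def term_query(term):
    """Query semantics: (doc id, score) pairs; scoring is extra work
    that a filter avoids. The scoring formula here is made up."""
    postings = index.get(term, [])
    weight = 1.0 / (1 + len(postings))   # illustrative, not Lucene's TF-IDF
    return [(doc, weight) for doc in postings]

print(sorted(term_filter("solr")))   # [1, 2]
print(term_query("lucene"))          # [(0, 0.25), (2, 0.25), (5, 0.25)]
```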
[jira] [Commented] (SOLR-4428) Update SolrUIMA wiki page
[ https://issues.apache.org/jira/browse/SOLR-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710076#comment-13710076 ] Eva Andreasson commented on SOLR-4428: -- Thanks for the pointer! So, before I continue, and with your last comment in mind, would it be preferred to do something similar to what is in the wiki already (alt. #1 below), or could I rewrite to some extent, i.e. step through an example (alt. #2 below), or both? NOTE: the postings in this comment are not final; I have some follow-up questions on them as well.

ALTERNATIVE #1:

<updateRequestProcessorChain name="uima">
  <processor class="processor class path">
    <lst name="uimaConfig">
      <lst name="runtimeParameters">
        <!-- parameters defined in the AE overriding parameters in the delegate AEs -->
        ...
      </lst>
      <str name="analysisEngine"><!-- AE class path --></str>
      <lst name="analyzeFields">
        <bool name="merge"><!-- true or false --></bool>
        <arr name="fields"><!-- field definitions --></arr>
      </lst>
      <lst name="fieldMappings">
        <lst name="type">
          <str name="name"><!-- map class name --></str>
          <lst name="mapping">
            <str name="feature"><!-- feature name --></str>
            <str name="field"><!-- field name --></str>
          </lst>
        </lst>
        ...
      </lst>
    </lst>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

ALTERNATIVE #2: (used directly from the example you provided, but with the approach of stepping through each portion)

<!-- first you need to define the path to the UIMA class ... -->
<updateRequestProcessorChain name="uima">
  <processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
    <lst name="uimaConfig">
      <!-- then you need to define any analysis engine (Q1: assuming AE stands for analysis engine?)
           parameters, which will override parameters in the delegate analysis engines.
           You will need to define the type, a name, and the parameter value... -->
      <lst name="runtimeParameters">
        <int name="ngramsize">3</int>
      </lst>
      <!-- and so on... with comments intersecting ... -->
      <str name="analysisEngine">/uima/TestAE.xml</str>
      <lst name="analyzeFields">
        <bool name="merge">false</bool>
        ...

My input would be alt. #2. It works better for a user like me: easier to read and understand, as it provides both structure and example at the same time. But I am open to either. Update SolrUIMA wiki page - Key: SOLR-4428 URL: https://issues.apache.org/jira/browse/SOLR-4428 Project: Solr Issue Type: Task Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor The SolrUIMA wiki page (see http://wiki.apache.org/solr/SolrUIMA) is currently outdated and needs to be updated on the following topics: * proper XML configuration * how to use existing UIMA analyzers * what's the default configuration * how to change the default configuration
[jira] [Comment Edited] (LUCENE-4311) HunspellStemFilter returns another values than Hunspell in console / command line with same dictionaries.
[ https://issues.apache.org/jira/browse/LUCENE-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709978#comment-13709978 ] Lukas Vlcek edited comment on LUCENE-4311 at 7/16/13 6:48 PM: -- Hi Chris, I have been doing some experiments with this Czech dictionary and to me it seems that it yields the best results with RECURSION_CAP = 0. Seriously! The double folding does not bring any advantage in the case of this particular dictionary. In fact the dictionary is in such good shape that it allows for direct generation of all word forms for words in the dic file, and only one affix rule is enough for input words to see if they match any of the root forms, i.e. best results with one folding only. With RECURSION_CAP 1 or 2 it can generate a lot of incorrect words. The shorter the input word, the higher the chance of getting incorrect (i.e. completely misleading) results, up to the point where it is not useful for Lucene indexing at all. Please, can we have this fixed? I believe all that is needed now is to have a look at #LUCENE-4542 and make sure the recursion level is configurable. This would be a really great enhancement. was (Author: lukas.vlcek): [...same comment, with "no folding needed at all" in place of "i.e. best results with one folding only"...] HunspellStemFilter returns another values than Hunspell in console / command line with same dictionaries. - Key: LUCENE-4311 URL: https://issues.apache.org/jira/browse/LUCENE-4311 Project: Lucene - Core Issue Type: Bug Components: core/other Affects Versions: 3.5, 4.0-ALPHA, 3.6.1 Environment: Apache Solr 3.5 - 4.0, Apache Tomcat 7.0 Reporter: Jan Rieger Attachments: cs_CZ.aff, cs_CZ.dic
4.4 release note drafts
I've taken a first pass at 4.4 release notes - please review/edit: https://wiki.apache.org/solr/ReleaseNote44 http://wiki.apache.org/lucene-java/ReleaseNote44 Steve
[jira] [Commented] (SOLR-4428) Update SolrUIMA wiki page
[ https://issues.apache.org/jira/browse/SOLR-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710077#comment-13710077 ] Eva Andreasson commented on SOLR-4428: -- NOTE: indentation disappeared in my posting. It will be indented per common code practice, of course. Update SolrUIMA wiki page - Key: SOLR-4428 URL: https://issues.apache.org/jira/browse/SOLR-4428 Project: Solr Issue Type: Task Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor
[JENKINS] Lucene-Solr-Tests-4.x-Java6 - Build # 1801 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java6/1801/ 2 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=1753, name=recoveryCmdExecutor-591-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) Stack 
Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=1753, name=recoveryCmdExecutor-591-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] [...same stack trace as above...] at __randomizedtesting.SeedInfo.seed([BFC3F5EE01811FDD]:0) FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=1753, name=recoveryCmdExecutor-591-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] [...same stack trace as above...]
[jira] [Updated] (LUCENE-5088) Add term filter
[ https://issues.apache.org/jira/browse/LUCENE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-5088: -- Fix Version/s: 4.5 5.0 Add term filter --- Key: LUCENE-5088 URL: https://issues.apache.org/jira/browse/LUCENE-5088 Project: Lucene - Core Issue Type: Improvement Reporter: Martijn van Groningen Assignee: Martijn van Groningen Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-5088.patch, LUCENE-5088.patch
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b96) - Build # 6593 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6593/
Java: 64bit/jdk1.8.0-ea-b96 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED: org.apache.lucene.queries.TermFilterTest.testHashCodeAndEquals

Error Message:

Stack Trace:
java.lang.AssertionError
    at __randomizedtesting.SeedInfo.seed([51787609E716C65B:2BA5584D5E07BAAE]:0)
    at org.junit.Assert.fail(Assert.java:92)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertFalse(Assert.java:68)
    at org.junit.Assert.assertFalse(Assert.java:79)
    at org.apache.lucene.queries.TermFilterTest.testHashCodeAndEquals(TermFilterTest.java:142)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:491)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at java.lang.Thread.run(Thread.java:724)

Build Log:
[...truncated 7923 lines...]
[junit4] Suite: org.apache.lucene.queries.TermFilterTest
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TermFilterTest -Dtests.method=testHashCodeAndEquals -Dtests.seed=51787609E716C65B -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=vi_VN -Dtests.timezone=Asia/Pyongyang -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 0.15s J0 | TermFilterTest.testHashCodeAndEquals
[jira] [Commented] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710094#comment-13710094 ]

ASF subversion and git services commented on LUCENE-5114:
----------------------------------------------------------

Commit 1503834 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1503834 ]
LUCENE-5114: remove unused useCache param

remove boolean useCache param from TermsEnum.seekCeil/Exact
-----------------------------------------------------------

Key: LUCENE-5114
URL: https://issues.apache.org/jira/browse/LUCENE-5114
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 5.0, 4.5
Attachments: LUCENE-5114.patch

Long ago terms dict had a cache, but it was problematic and we removed it, but the API still has a relic boolean useCache ... I think we should drop it from the API as well.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-5114.
----------------------------------------
Resolution: Fixed
[jira] [Commented] (SOLR-5036) Solr Ref Guide updates for Solr 4.4
[ https://issues.apache.org/jira/browse/SOLR-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710096#comment-13710096 ]

Hoss Man commented on SOLR-5036:
--------------------------------

I made some more progress on Cassandra's list...

* https://cwiki.apache.org/confluence/display/solr/RequestDispatcher+in+SolrConfig
** SOLR-2079 - addHttpRequestToContext verbiage
* https://cwiki.apache.org/confluence/display/solr/Query+Screen
** SOLR-3838 - fq verbiage and updated screenshot
** SOLR-4719 - json default verbiage, updated screenshot
* https://cwiki.apache.org/confluence/display/solr/Core-Specific+Tools
** update order of links to children and actual order of children in navigation
* https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
** SOLR-4941, SOLR-4934 and LUCENE-5038 - already ok; the page doesn't go into depth about mergePolicy, and the example already used the correct compound file syntax
* https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities
** SOLR-4972: Add PUT command to ZkCli tool
* https://cwiki.apache.org/confluence/display/solr/Other+Parsers
** SOLR-4785: MaxScoreQParser

But there is still a lot more that needs to be written -- for the stuff below I either don't understand the new feature well enough to document it myself, or didn't have the energy to tackle it because it needs a lot written about it and I don't know it well enough to have a good sense of how to go about it...

* New page on upgrading to 4.4
** child of https://cwiki.apache.org/confluence/display/solr/Major+Changes+from+Solr+3+to+Solr+4
** clone of https://cwiki.apache.org/confluence/display/solr/Upgrading+to+Solr+4.3
** mention SOLR-4941, SOLR-4934 and LUCENE-5038
** mention SOLR-4778 and new LogWatcher API
* https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
** SOLR-4761, SOLR-4976: Add option to plug in a merged segment warmer into solrconfig.xml. Info about segments warmed in the background is available via infostream. (Mark Miller, Ryan Ernst, Mike McCandless, Robert Muir)
*** CT: Add to https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
* https://cwiki.apache.org/confluence/display/solr/Spell+Checking
** SOLR-3240: Add spellcheck.collateMaxCollectDocs option so that when testing potential collations against the index, SpellCheckComponent will only collect n documents, thereby estimating the hit-count. This is a performance optimization in cases where exact hit-counts are unnecessary. Also, when collateExtendedResults is false, this optimization is always made. (James Dyer)
*** CT: Add to https://cwiki.apache.org/confluence/display/solr/Spell+Checking
* https://cwiki.apache.org/confluence/display/solr/Core-Specific+Tools
** SOLR-4921: Admin UI now supports adding documents to Solr (gsingers, steffkes)
*** CT: Add a new page under https://cwiki.apache.org/confluence/display/solr/Core-Specific+Tools
** make sure the child is in the correct order, both in parent page links and overall nav: https://cwiki.apache.org/confluence/pages/listpages-dirview.action?key=solr
* Completely new docs about the HDFS SolrCloud support ... somewhere
** SOLR-4916: Add support to write and read Solr index files and transaction log files to and from HDFS. (phunt, Mark Miller, Greg Chanan)
*** CT: Without studying this more, it's hard to know where this should go. It's not really SolrCloud, and it's not really a client, but depending on why it's being done it could overlap with either... If someone writes up what you'd tell someone about using it, I could give a better idea of where it fits in the existing page organization (if it does).
* https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design
** SOLR-4897: Add solr/example/example-schemaless/, an example config set for schemaless mode. (Steve Rowe)
*** CT: Schemaless mode in general needs to be added. The most likely place today is a new page under https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design
* https://cwiki.apache.org/confluence/display/solr/Core+Admin+and+Configuring+solr.xml
** SOLR-4757: Change the example to use the new solr.xml format and core discovery by directory structure. (Mark Miller)
*** CT: There is a page on solr.xml: https://cwiki.apache.org/confluence/display/solr/Core+Admin+and+Configuring+solr.xml. This should be updated to show the new format and still include information on the old format for anyone with the old format who uses this guide for reference.
** SOLR-4655: Add option to have Overseer assign generic node names so that new addresses can host shards without naming confusion. (Mark Miller, Anshum Gupta)
*** CT: I think this only needs to be added to any new content for solr.xml at https://cwiki.apache.org/confluence/display/solr/Core+Admin+and+Configuring+solr.xml
* https://cwiki.apache.org/confluence/display/solr/Collections+API
** SOLR-4693: A
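The SOLR-3240 entry above describes how spellcheck.collateMaxCollectDocs combines with the other collation parameters on a request. A sketch of such a request (the host, core name, and query values are illustrative assumptions, not taken from this thread; the spellcheck.* parameter names are from the CHANGES entry and the Solr spellcheck component):

```text
http://localhost:8983/solr/collection1/spell?q=delll+ultra+sharp
    &spellcheck=true
    &spellcheck.collate=true
    &spellcheck.maxCollationTries=10
    &spellcheck.collateMaxCollectDocs=50
```

With collateMaxCollectDocs set, collation testing stops after collecting that many documents, so the reported hit-count for a candidate collation becomes an estimate rather than an exact count.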
Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b96) - Build # 6593 - Failure!
I'll fix.

On 16 July 2013 21:05, Policeman Jenkins Server jenk...@thetaphi.de wrote:
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6593/
[quoted failure report and stack trace identical to the message above, trimmed]
[jira] [Commented] (SOLR-5036) Solr Ref Guide updates for Solr 4.4
[ https://issues.apache.org/jira/browse/SOLR-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710106#comment-13710106 ]

Shalin Shekhar Mangar commented on SOLR-5036:
---------------------------------------------

I'll document the deleteshard changes in https://cwiki.apache.org/confluence/display/solr/Collections+API

Solr Ref Guide updates for Solr 4.4
-----------------------------------

Key: SOLR-5036
URL: https://issues.apache.org/jira/browse/SOLR-5036
Project: Solr
Issue Type: Improvement
Components: documentation
Reporter: Cassandra Targett
Fix For: 4.4

In response to Hoss' email about updating the Solr Ref Guide, I thought it would be helpful if I went through the CHANGES.txt file and noted what I think might need an update (or a new page) and what pages should be reviewed for each change. I hope this helps those who aren't yet fully versed with the way it's organized. I commented on every item in CHANGES.txt, so I'll post my suggestions in a couple of comments. Many items (like bug fixes) don't really need updates, but more eyes on that would be helpful.
[jira] [Commented] (LUCENE-5088) Add term filter
[ https://issues.apache.org/jira/browse/LUCENE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710108#comment-13710108 ]

ASF subversion and git services commented on LUCENE-5088:
----------------------------------------------------------

Commit 1503837 from [~martijn.v.groningen] in branch 'dev/trunk' [ https://svn.apache.org/r1503837 ]
LUCENE-5088: Fixed test

Add term filter
---------------

Key: LUCENE-5088
URL: https://issues.apache.org/jira/browse/LUCENE-5088
Project: Lucene - Core
Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
Priority: Minor
Fix For: 5.0, 4.5
Attachments: LUCENE-5088.patch, LUCENE-5088.patch

I think it makes sense to add a term filter:
* There is a TermsFilter, but no TermFilter.
* I think it is a bit more efficient than wrapping a TermQuery in a QueryWrapperFilter.
* Allows the usage of DocsEnum.FLAG_NONE.
Re: svn commit: r1503837 - /lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
Hi Martijn,

Maybe remove the System.out?

Uwe

m...@apache.org schrieb:

Author: mvg
Date: Tue Jul 16 19:15:24 2013
New Revision: 1503837

URL: http://svn.apache.org/r1503837
Log: LUCENE-5088: Fixed test

Modified: lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java

Modified: lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java?rev=1503837&r1=1503836&r2=1503837&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java (original)
+++ lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java Tue Jul 16 19:15:24 2013
@@ -123,7 +123,7 @@ public class TermFilterTest extends Luce
       String field1 = field + i;
       String field2 = field + i + num;
       String value1 = _TestUtil.randomRealisticUnicodeString(random());
-      String value2 = _TestUtil.randomRealisticUnicodeString(random());
+      String value2 = _TestUtil.randomRealisticUnicodeString(random()) + "x"; // this must be not equal to value1
       TermFilter filter1 = termFilter(field1, value1);
       TermFilter filter2 = termFilter(field1, value2);
@@ -139,6 +139,8 @@ public class TermFilterTest extends Luce
         assertEquals(termFilter.hashCode(), otherTermFilter.hashCode());
         assertTrue(termFilter.equals(otherTermFilter));
       } else {
+        System.out.println(termFilter);
+        System.out.println(otherTermFilter);
         assertFalse(termFilter.equals(otherTermFilter));
       }
     }

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de
Re: 4.4 release note drafts
The lucene notes look great! Thanks Steve.

Mike McCandless
http://blog.mikemccandless.com

On Tue, Jul 16, 2013 at 2:49 PM, Steve Rowe sar...@gmail.com wrote:
I've taken a first pass at 4.4 release notes - please review/edit:
https://wiki.apache.org/solr/ReleaseNote44
http://wiki.apache.org/lucene-java/ReleaseNote44
Steve
Re: 4.4 release note drafts
On 7/16/2013 12:49 PM, Steve Rowe wrote:
I've taken a first pass at 4.4 release notes - please review/edit:
https://wiki.apache.org/solr/ReleaseNote44
http://wiki.apache.org/lucene-java/ReleaseNote44

I made a minor edit on the Solr page for what seemed like awkward grammar to me.

An additional thing that IMHO needs to be noted for Solr, but I am not really sure how to phrase it: core discovery mode now exists, and the standard example uses it. I haven't checked to see whether any of the other examples have been converted.

Thank you for putting effort into this!

Shawn
Re: 4.4 release note drafts
I have to check if something is still missing, but this looks fantastic and contains all the important news as far as I remember.

Uwe

Steve Rowe sar...@gmail.com schrieb:
I've taken a first pass at 4.4 release notes - please review/edit:
https://wiki.apache.org/solr/ReleaseNote44
http://wiki.apache.org/lucene-java/ReleaseNote44
Steve

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de
Re: svn commit: r1503837 - /lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
Just did this...

On 16 July 2013 21:19, Uwe Schindler u...@thetaphi.de wrote:
Hi Martijn, Maybe remove the System.out? Uwe
[quoted commit mail and diff identical to the message above, trimmed]

--
Met vriendelijke groet,
Martijn van Groningen
Re: 4.4 release note drafts
Looks good. Maybe we should add something about segments flushed by IndexWriter using CFS by default, just so we don't have lots of questions on the users list when they see CFS being used where it wasn't before. We could also mention some of the compression improvements to numeric docvalues. I can add writeups for these entries to the wiki page...

On Tue, Jul 16, 2013 at 12:28 PM, Uwe Schindler u...@thetaphi.de wrote:
I have to check if something is still missing, but this looks fantastic and contains all important news as far as I remember.
[rest of quoted reply identical to the message above, trimmed]
Re: 4.4 release note drafts
On Jul 16, 2013, at 3:26 PM, Shawn Heisey s...@elyograg.org wrote:
Core discovery mode now exists, and the standard example uses it.

I agree - I think the release notes should at a minimum point to a discussion of the changes in 4.4 and maybe also in trunk. I don't know how to write it up though.

Steve
Re: svn commit: r1503837 - /lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
Does this really fix this or just make it less likely to happen...?

On Tue, Jul 16, 2013 at 12:15 PM, m...@apache.org wrote:
       String value1 = _TestUtil.randomRealisticUnicodeString(random());
-      String value2 = _TestUtil.randomRealisticUnicodeString(random());
+      String value2 = _TestUtil.randomRealisticUnicodeString(random()) + "x"; // this must be not equal to value1
Re: 4.4 release note drafts
On Jul 16, 2013, at 3:35 PM, Robert Muir rcm...@gmail.com wrote:
Looks good, maybe we should add something about segments flushed by indexwriter using CFS by default...
[rest of quoted reply identical to the message above, trimmed]

+1, thanks.
Re: svn commit: r1503837 - /lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
This should fix this, b/c value2 (one extra character) will always be different than value1. The test initially failed b/c value1 and value2 were equal.

On 16 July 2013 21:37, Robert Muir rcm...@gmail.com wrote:
Does this really fix this or just make it less likely to happen...?
[quoted diff identical to the message above, trimmed]

--
Met vriendelijke groet,
Martijn van Groningen
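The failure mode being fixed in this thread - two independent random draws occasionally producing equal strings, which trips an assertFalse(equals(...)) check - can be demonstrated with a small self-contained sketch. This is my own illustration, not the Lucene test: the tiny two-letter alphabet and short lengths are deliberately chosen to make collisions frequent, whereas _TestUtil.randomRealisticUnicodeString makes them rare but still possible (e.g. both draws can be the empty string).

```java
import java.util.Random;

// Sketch of why two independently drawn random strings can collide.
class EqualsCollisionSketch {

    // Draw a short string (0-2 chars) from a two-letter alphabet so that
    // collisions are frequent enough to observe in a small number of trials.
    static String shortRandomString(Random random) {
        int len = random.nextInt(3);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len; i++) {
            sb.append((char) ('a' + random.nextInt(2)));
        }
        return sb.toString();
    }

    // Count how often two independent draws are equal over `trials` rounds.
    static int countCollisions(long seed, int trials) {
        Random random = new Random(seed);
        int collisions = 0;
        for (int i = 0; i < trials; i++) {
            String value1 = shortRandomString(random);
            String value2 = shortRandomString(random);
            if (value1.equals(value2)) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // With this alphabet, a substantial fraction of trials collide.
        System.out.println("collisions: " + countCollisions(42L, 100_000));
    }
}
```

The commit under discussion breaks the symmetry by appending a fixed "x" suffix to value2, so the two values no longer come from identical distributions and a collision requires value1 to end in that same suffix.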
Re: [VOTE] Release 4.4
+1 Smoker ran successfully

On 16 July 2013 17:36, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
+1 Smoke tester is happy on mac.

On Tue, Jul 16, 2013 at 12:02 PM, Steve Rowe sar...@gmail.com wrote:
Please vote to release Lucene and Solr 4.4, built off revision 1503555 of https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_4. RC0 artifacts are available at: http://people.apache.org/~sarowe/staging_area/lucene-solr-4.4.0-RC0-rev1503555
The smoke tester passes for me. Here's my +1.
Steve

--
Regards,
Shalin Shekhar Mangar.

--
Met vriendelijke groet,
Martijn van Groningen
[jira] [Commented] (SOLR-3633) web UI reports an error if CoreAdminHandler says there are no SolrCores
[ https://issues.apache.org/jira/browse/SOLR-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710147#comment-13710147 ]

ASF subversion and git services commented on SOLR-3633:
--------------------------------------------------------

Commit 1503853 from [~steffkes] in branch 'dev/trunk' [ https://svn.apache.org/r1503853 ]
SOLR-3633 - web UI reports an error if CoreAdminHandler says there are no SolrCores

web UI reports an error if CoreAdminHandler says there are no SolrCores
-----------------------------------------------------------------------

Key: SOLR-3633
URL: https://issues.apache.org/jira/browse/SOLR-3633
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.0-ALPHA
Reporter: Hoss Man
Assignee: Stefan Matheis (steffkes)
Fix For: 4.4
Attachments: SOLR-3633.patch, SOLR-3633.patch, SOLR-3633.patch, SOLR-3633.patch, SOLR-3633.patch, SOLR-3633.patch

Spun off from SOLR-3591...
* having no SolrCores is a valid situation
* independent of what may happen in SOLR-3591, the web UI should cleanly deal with there being no SolrCores, and just hide/grey out any tabs that can't be supported w/o at least one core
* even if there are no SolrCores the core admin features (ie: creating a new core) should be accessible in the UI
[jira] [Commented] (SOLR-3633) web UI reports an error if CoreAdminHandler says there are no SolrCores
[ https://issues.apache.org/jira/browse/SOLR-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710151#comment-13710151 ]

ASF subversion and git services commented on SOLR-3633:
--------------------------------------------------------

Commit 1503855 from [~steffkes] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1503855 ]
SOLR-3633 - web UI reports an error if CoreAdminHandler says there are no SolrCores (merge r1503853)
[jira] [Resolved] (SOLR-3633) web UI reports an error if CoreAdminHandler says there are no SolrCores
[ https://issues.apache.org/jira/browse/SOLR-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Matheis (steffkes) resolved SOLR-3633.
---------------------------------------------
Resolution: Fixed
Fix Version/s: (was: 4.4) 4.5, 5.0
[jira] [Resolved] (SOLR-4220) Move RequestHandler for global Information out of core-scope
[ https://issues.apache.org/jira/browse/SOLR-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Matheis (steffkes) resolved SOLR-4220.
---------------------------------------------
Resolution: Duplicate
Assignee: Stefan Matheis (steffkes) (was: Hoss Man)

i'm going to close this as a duplicate, since it is not only related .. it indeed _does_ describe the same requirement.

Move RequestHandler for global Information out of core-scope
------------------------------------------------------------

Key: SOLR-4220
URL: https://issues.apache.org/jira/browse/SOLR-4220
Project: Solr
Issue Type: Improvement
Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Stefan Matheis (steffkes)

Okay, the title perhaps wins no prize right now .. but I don't have a better idea, I'm sorry - if you do, don't hesitate to update it!

What it's all about: SOLR-3633 was created because at the moment it's not possible to use the Admin UI w/o at least one core. The reason for that is that some (as you might think) global information - which the UI shows at the top level (and which doesn't require selecting a core) - must be fetched from a core-related URL, because that's how the Solr routing works right now. Hoss and I talked about that at ApacheCon and he mentioned that this should not be the biggest change, although we need to update the tests and ensure that the thing is still working, of course.

I checked the UI for a list of functions and their related URLs:

* *Dashboard*
** solr/$first_core/admin/system?wt=json
* *Logging*
** /solr/$first_core/admin/logging?wt=json&since=N
* *Logging* / *Level*
** /solr/$first_core/admin/logging?wt=json
* *Java Properties*
** /solr/$first_core/admin/properties?wt=json
* *Threads*
** /solr/$first_core/admin/threads?wt=json

For the sake of simplicity, I'd suggest that we just move the complete handlers (regarding their URL) one level up to something like {{/solr/admin/..}}, like we already have for the zookeeper thing. Regarding the contained content, I think we could (or perhaps should?) stick with the given information/content - only the Dashboard is not using all of the provided values, but just for the fact that we have no usage for prettified RAM-usage numbers...

Let me know if the issue contains all required information; otherwise I'll try to update it according to your questions :)
TODO: Remove download links to the 3.6.2 release from the Lucene and Solr websites
I think we should stop encouraging people to download the 3.6.2 release - those big Download 3.6.2 buttons on every website page should go away. We can then also remove the 3.6.2 releases from the distribution mirrors - they will still live on at http://archive.apache.org/dist/ so people can get them if they really want them. Thoughts? Steve
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 316 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/316/

2 tests failed.

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message:
1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest:
   1) Thread[id=2635, name=recoveryCmdExecutor-1042-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
        at java.net.Socket.connect(Socket.java:546)
        at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)

Stack Trace:
com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest:
   1) Thread[id=2635, name=recoveryCmdExecutor-1042-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
        at java.net.Socket.connect(Socket.java:546)
        at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127)
        at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
        at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
        at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:679)
        at __randomizedtesting.SeedInfo.seed([C8D11F8DD5323F20]:0)

FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message:
There are still zombie threads that couldn't be terminated:
   1) Thread[id=2635, name=recoveryCmdExecutor-1042-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest]
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at
[jira] [Comment Edited] (SOLR-3076) Solr(Cloud) should support block joins
[ https://issues.apache.org/jira/browse/SOLR-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708856#comment-13708856 ] Mikhail Khludnev edited comment on SOLR-3076 at 7/16/13 9:07 PM: - [~ysee...@gmail.com] it's a [ginger cake|https://twitter.com/tastapod/status/164400600132497409]! was (Author: mkhludnev): [~ysee...@gmail.com] it's a ginger cake Solr(Cloud) should support block joins -- Key: SOLR-3076 URL: https://issues.apache.org/jira/browse/SOLR-3076 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Yonik Seeley Fix For: 5.0, 4.4 Attachments: 27M-singlesegment-histogram.png, 27M-singlesegment.png, bjq-vs-filters-backward-disi.patch, bjq-vs-filters-illegal-state.patch, child-bjqparser.patch, dih-3076.patch, dih-config.xml, parent-bjq-qparser.patch, parent-bjq-qparser.patch, Screen Shot 2012-07-17 at 1.12.11 AM.png, SOLR-3076-childDocs.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-3076.patch, SOLR-7036-childDocs-solr-fork-trunk-patched, solrconf-bjq-erschema-snippet.xml, solrconfig.xml.patch, tochild-bjq-filtered-search-fix.patch Lucene has the ability to do block joins, we should add it to Solr.
Re: svn commit: r1503837 - /lucene/dev/trunk/lucene/queries/src/test/org/apache/lucene/queries/TermFilterTest.java
: This should fix this, b/c value2 (one extra character) will always be : different than value1. The test initially failed b/c value1 and value2 were : equal. rmuir's point is that what you are describing is not guaranteed to be true, because randomRealisticUnicodeString returns strings of random lengths as well (unless you specify minLength and maxLength). The first call to randomRealisticUnicodeString(random()) might set value1 to foox, and the second call to randomRealisticUnicodeString(random()) might return foo, making the value of value2 foox as well. I think you just want something like: String value2 = value1 + "x" : On 16 July 2013 21:37, Robert Muir rcm...@gmail.com wrote: : : Does this really fix this or just make it less likely to happen...? : : : On Tue, Jul 16, 2013 at 12:15 PM, m...@apache.org wrote: : : : String value1 = _TestUtil.randomRealisticUnicodeString(random()); : - String value2 = _TestUtil.randomRealisticUnicodeString(random()); : + String value2 = _TestUtil.randomRealisticUnicodeString(random()) + : "x"; // this must be not equal to value1 : : : : : -- : Met vriendelijke groet, : : Martijn van Groningen : -Hoss
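Hoss's point can be shown with a small standalone sketch (assumption: a toy random-string generator stands in for Lucene's _TestUtil.randomRealisticUnicodeString; the class and method names here are illustrative only). Appending "x" to a second independent random draw does not guarantee inequality, but deriving value2 from value1 does:

```java
public class DistinctRandomValues {
    // Toy stand-in for _TestUtil.randomRealisticUnicodeString(random()):
    // short strings over a tiny alphabet, so collisions are easy to hit.
    static String randomString(java.util.Random r) {
        StringBuilder sb = new StringBuilder();
        int len = r.nextInt(3); // lengths 0..2
        for (int i = 0; i < len; i++) sb.append((char) ('a' + r.nextInt(2)));
        return sb.toString();
    }

    // The fix suggested in the thread: derive value2 from value1, so the
    // two values can never be equal (value2 is strictly longer).
    public static String distinctFrom(String value1) {
        return value1 + "x";
    }

    public static void main(String[] args) {
        java.util.Random r = new java.util.Random();
        String value1 = randomString(r);
        String value2 = distinctFrom(value1);
        System.out.println(!value1.equals(value2)); // always true, regardless of seed
    }
}
```

The broken version (two independent draws, one with "x" appended) fails exactly when the second draw happens to be value1 minus its last character, which random lengths make possible.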
Re: TODO: Remove download links to the 3.6.2 release from the Lucene and Solr websites
: I think we should stop encouraging people to download the 3.6.2 release : - those big Download 3.6.2 buttons on every website page should go : away. +1. -Hoss
[jira] [Commented] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710283#comment-13710283 ] David Smiley commented on LUCENE-5114: -- I'm very supportive of this change. However, isn't this a breaking change you committed to 4x? On 4x, might it make sense to leave these as overloaded deprecated methods? I think so. remove boolean useCache param from TermsEnum.seekCeil/Exact --- Key: LUCENE-5114 URL: https://issues.apache.org/jira/browse/LUCENE-5114 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-5114.patch Long ago terms dict had a cache, but it was problematic and we removed it, but the API still has a relic boolean useCache ... I think we should drop it from the API as well.
[jira] [Commented] (SOLR-4428) Update SolrUIMA wiki page
[ https://issues.apache.org/jira/browse/SOLR-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710287#comment-13710287 ] Erick Erickson commented on SOLR-4428: -- I haven't worked with this page at all, I'll let [~teofili] comment. But the second way does look a little easier to read to me too. Erick Update SolrUIMA wiki page - Key: SOLR-4428 URL: https://issues.apache.org/jira/browse/SOLR-4428 Project: Solr Issue Type: Task Reporter: Tommaso Teofili Assignee: Tommaso Teofili Priority: Minor SolrUIMA wiki page (see http://wiki.apache.org/solr/SolrUIMA) is currently outdated and needs to be updated on the following topics: * proper XML configuration * how to use existing UIMA analyzers * what's the default configuration * how to change the default configuration
[jira] [Commented] (LUCENE-5114) remove boolean useCache param from TermsEnum.seekCeil/Exact
[ https://issues.apache.org/jira/browse/LUCENE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710304#comment-13710304 ] Robert Muir commented on LUCENE-5114: - If you want to leave the old method, please please make it final... (otherwise, don't do it) In all cases the API must break, or will only invite bugs. remove boolean useCache param from TermsEnum.seekCeil/Exact --- Key: LUCENE-5114 URL: https://issues.apache.org/jira/browse/LUCENE-5114 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-5114.patch Long ago terms dict had a cache, but it was problematic and we removed it, but the API still has a relic boolean useCache ... I think we should drop it from the API as well.
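The pattern Robert is asking for can be sketched as follows (hypothetical: Lucene's real TermsEnum works on BytesRef, not String, and this `TermsEnumSketch` class is invented for illustration). The final, deprecated overload keeps 4.x source compatibility while preventing any subclass from overriding the relic signature and diverging from the new one:

```java
public abstract class TermsEnumSketch {
    // New API: no useCache flag.
    public abstract boolean seekExact(String text);

    /**
     * @deprecated the useCache flag has been a no-op since the terms-dict
     * cache was removed; use {@link #seekExact(String)}. Declared final so
     * subclasses cannot override the old signature and invite bugs.
     */
    @Deprecated
    public final boolean seekExact(String text, boolean useCache) {
        return seekExact(text); // delegate; useCache is ignored
    }

    public static void main(String[] args) {
        TermsEnumSketch e = new TermsEnumSketch() {
            @Override
            public boolean seekExact(String text) {
                return "abc".equals(text); // trivial demo implementation
            }
        };
        // Old and new signatures necessarily agree, whatever useCache says.
        System.out.println(e.seekExact("abc", true) == e.seekExact("abc"));
    }
}
```

Without `final`, a subclass could override only the two-argument form, and callers using the two signatures would see different behavior, which is exactly the bug class the comment warns about.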