Re: release votes
Hi Andi,

I don't agree that it is unimportant to make PyLucene releases. Without a ready-to-run software package, the hurdles to using PyLucene are raised. Installing PyLucene on the various platforms is already not simple, especially for beginners. In my opinion, a packaged release that has been tested by some users is a benefit to the community.

However, I can understand your arguments - there has been little feedback on your release announcements on the list recently. On the other hand, there are frequent discussions about PyLucene on the list, so I don't think interest has declined. Did you check the number of downloads of the PyLucene distributions (if that is possible at all, given the releases are distributed across the Apache mirrors)? That would be a more accurate indicator from my point of view.

I must also admit that I never understood the voting process in detail - i.e. who the PMC members are and what impact the votes of non-PMC users have. Maybe some more transparency and another call for action would help raise awareness in the community.

Just my thoughts...

regards,
Thomas

--
OrbiTeam Software GmbH & Co. KG
http://www.orbiteam.de

-----Original Message-----
From: Andi Vajda [mailto:va...@apache.org]
Sent: Thursday, April 24, 2014 02:28
To: pylucene-dev@lucene.apache.org
Subject: release votes

Hi all,

Given the tiny amount of interest the PyLucene releases generate, has it perhaps become unimportant to actually make PyLucene releases? The release votes have had an increasingly difficult time garnering the three required PMC votes to pass. Non-PMC users are also eerily quiet.

Maybe the time has come to switch to a different model:
- when a Lucene release happens, a PyLucene branch gets created with all the necessary changes to build successfully and pass all tests against this Lucene release
- users interested in PyLucene check out that branch
- done - no more releases, no more votes

JCC can continue to be released to PyPI independently, as it is today. That doesn't require any voting anyway (?).

What do the readers of this list think?

Andi..
Re: release votes
On Thu, 24 Apr 2014, Thomas Koch wrote:

> I don't agree that it is unimportant to make PyLucene releases. Without a ready-to-run software package, the hurdles to using PyLucene are raised. Installing PyLucene on the various platforms is already not simple, especially for beginners. In my opinion, a packaged release that has been tested by some users is a benefit to the community.

I agree with you that making releases is important. However, when votes are called to actually make them, it's been hard to get voters to respond.

Anyone can vote. Anyone with an interest should vote. Three PMC votes are required to make a release happen, though. But any vote, for or against, is important, PMC or not.

Lately, it's been hard to get the TWO extra PMC votes needed to make a release happen (since mine is cast when I cut the release candidate). I think this is in part _because_ no one else is showing an interest in the release and casting a vote either.

> However, I can understand your arguments - there has been little feedback on your release announcements on the list recently. On the other hand, there are frequent discussions about PyLucene on the list, so I don't think interest has declined. Did you check the number of downloads of the PyLucene distributions (if that is possible at all, given the releases are distributed across the Apache mirrors)? That would be a more accurate indicator from my point of view.

I have no idea about the number of downloads of PyLucene. JCC, however, has gotten over 2700 downloads in the past month: https://pypi.python.org/pypi/JCC/2.19

> I must also admit that I never understood the voting process in detail - i.e. who the PMC members are and what impact the votes of non-PMC users have. Maybe some more transparency and another call for action would help raise awareness in the community.

There are at least three classes in the Apache meritocracy:
- users, developers, contributors, but not committers
- committers, i.e. developers who can commit patches to the project
- PMC members, i.e. project committers who sit on the PMC (project management committee)

For more information, please see: https://www.apache.org/foundation/how-it-works.html

By the rules guiding the release of Apache projects, three PMC votes are necessary to release a tarball to the world.

The list of Lucene committers is visible here: http://lucene.apache.org/whoweare.html
Scroll down that list for the PMC membership.

Andi..
Re: release votes
On Apr 24, 2014, at 11:40 AM, Andi Vajda <va...@apache.org> wrote:

> I agree with you that making releases is important. However, when votes are called to actually make them, it's been hard to get voters to respond.
>
> Anyone can vote. Anyone with an interest should vote. Three PMC votes are required to make a release happen, though. But any vote, for or against, is important, PMC or not.
>
> Lately, it's been hard to get the TWO extra PMC votes needed to make a release happen (since mine is cast when I cut the release candidate). I think this is in part _because_ no one else is showing an interest in the release and casting a vote either.

Oh, well I for one had no idea votes from the community at large were encouraged. In that case… +1. I tested 4.7.2 against my downstream project. No issues.
Re: release votes
On Thu, Apr 24, 2014 at 2:40 PM, Andi Vajda <va...@apache.org> wrote:

> I agree with you that making releases is important. However, when votes are called to actually make them, it's been hard to get voters to respond. Anyone can vote. Anyone with an interest should vote. Three PMC votes are required to make a release happen, though. But any vote, for or against, is important, PMC or not. Lately, it's been hard to get the TWO extra PMC votes needed to make a release happen (since mine is cast when I cut the release candidate). I think this is in part _because_ no one else is showing an interest in the release and casting a vote either.

I don't think that's necessarily the case. For me (as someone who tries to vote for PyLucene releases), the problem was a combination of two things, as I did try to actually test it over the weekend:

1. being on travel, meaning stuck with a Mac OS X computer.
2. the release candidate not compiling on my Mac OS X computer, because something tries to apply -mno-fused-madd when compiling; apparently this is a common issue with Python and Mavericks?

Two things that may have nothing to do with PyLucene, but it was pretty annoying, especially for a non-Python developer :)

I am happy to try it on my Linux machine tonight!
Re: release votes
On Apr 24, 2014, at 15:44, Robert Muir <rcm...@gmail.com> wrote:

> I don't think that's necessarily the case. For me (as someone who tries to vote for PyLucene releases), the problem was a combination of two things, as I did try to actually test it over the weekend:

I sure didn't mean to blame you (or Mike), as you both usually provide the two extra PMC votes needed. No, I'm just trying to gauge what to do given the dwindling interest of the other readers of this list and the lack of interest of the 25 other PMC members. Doing releases is work, and so is testing a build for a release vote. If the process can be streamlined, I'm all for it.

> 1. being on travel, meaning stuck with a Mac OS X computer.
> 2. the release candidate not compiling on my Mac OS X computer, because something tries to apply -mno-fused-madd when compiling; apparently this is a common issue with Python and Mavericks?

To build any Python extension such as PyLucene and JCC, you need to use the same compiler that was used to build Python. On Mac, it seems that many people are running a gcc-built Python but then use clang when building extensions. This is most likely because Apple switched compilers somewhere along the way. The simplest route is to build Python from sources; it's simple and easy.

Andi..

> Two things that may have nothing to do with PyLucene, but it was pretty annoying, especially for a non-Python developer :)
>
> I am happy to try it on my Linux machine tonight!
Re: [VOTE] Release PyLucene 4.7.2-1
This vote has passed! Thank you to all, PMC or not, who cast a vote.

Andi..

On Tue, 15 Apr 2014, Andi Vajda wrote:

> The PyLucene 4.7.2-1 release, tracking today's release of Apache Lucene 4.7.2, is ready.
>
> A release candidate is available from:
> http://people.apache.org/~vajda/staging_area/
>
> A list of changes in this release can be seen at:
> http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_4_7/CHANGES
>
> PyLucene 4.7.2 is built with JCC 2.19, included in these release artifacts:
> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/CHANGES
>
> A list of Lucene Java changes can be seen at:
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_7_2/lucene/CHANGES.txt
>
> Please vote to release these artifacts as PyLucene 4.7.2-1.
>
> Thanks!
>
> Andi..
>
> ps: the KEYS file for PyLucene release signing is at:
> http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
> http://people.apache.org/~vajda/staging_area/KEYS
>
> pps: here is my +1
[jira] [Commented] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979419#comment-13979419 ]

Simon Willnauer commented on LUCENE-5628:
-----------------------------------------

I think it's worth doing the optimization here. A couple of comments:

* Can we put the exit condition into the while block instead of at the end with a break? I think it can just be while (string.length() > 0).
* Looking at the impl of State, I think we can just use an identity hash set, or maybe even an array since the IDs are within known bounds, to check the pathStates. You could even just use a bitset and mark the state ID as visited? Hmm, now that I wrote it I see your comment :) I will leave it here for discussion.
* Somewhat unrelated, but I think the State implementation has a problem since it doesn't override equals, though it should since it has a hashCode impl. I wonder whether we should either remove the hashCode or add equals, just for consistency?
* Should we rather throw IllegalState than IllegalArgument? :D
* Just for readability, it might be good to s/strings/finiteStrings/ - I had a hard time seeing when you do things on the string vs. strings.
* Is this a leftover? ==> // a.getNumberedStates();

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979420#comment-13979420 ]

Elran Dvir commented on SOLR-2894:
----------------------------------

Brett, thanks for your response.

bq. Having a mincount of -1 for the shards is correct. The reason is that while a given shard may have a count lower than mincount for a given term, the aggregate total count for that value when combined with the other shards could exceed the mincount, so we do need to know about it. For example, consider a mincount of 10. If we have 3 shards with a count of 5 for a term of Boston, we would still need to know about these because the total count would be 15, and would be higher than the mincount.

If a mincount of 1 is asked for a field, couldn't it be more efficient? Is a mincount of -1 necessary in that case?

bq. I would expect the skipRefinementAtThisLevel to be false for the top level pivot facet, and true for each other level. Are you seeing otherwise?

No. You are right.

bq. If you were to set a facet.limit of 10 for all levels of the pivot, what is the memory usage like?

The memory usage in this case is about 200 MB.

Thanks again.

Implement distributed pivot faceting
------------------------------------
Key: SOLR-2894
URL: https://issues.apache.org/jira/browse/SOLR-2894
Project: Solr
Issue Type: Improvement
Reporter: Erik Hatcher
Fix For: 4.9, 5.0
Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch

Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
[jira] [Commented] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979445#comment-13979445 ]

Robert Muir commented on LUCENE-5628:
-------------------------------------

Can we reduce the number of lines of code in the new method? It's not even comparable to the current code.

How much of the LOC is the cycle detection? Given the cost, this may not be worth it. This is expert shit, and the user can add an assert isFinite to their code.

How much of the LOC is code optimization?

Can the old code please be added to AutomatonTestUtil as slowXXX and compared against the new one with random automata?

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
[jira] [Commented] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979452#comment-13979452 ]

Robert Muir commented on LUCENE-5628:
-------------------------------------

If we want to baby the user - and I'm not sure what user we have in mind here - just invoke isFinite. I don't like the code dup, nor the precedent that unrelated code needs to deal with this.

This thing needs to get much shorter and simpler to go in. Can we make it slower to achieve that? I would make it 10 times slower if it removed half the code... without hesitation.

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
[jira] [Commented] (LUCENE-5622) Fail tests if they print, and tests.verbose is not set
[ https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979493#comment-13979493 ]

Dawid Weiss commented on LUCENE-5622:
-------------------------------------

While annotating tests that do sysouts, I came to the conclusion that it shouldn't be an all-or-nothing threshold. It should be much like the memory leak detector -- some sysout output per suite should be fine (say, 1 kB); beyond that it should start failing and suggest changing some of the sysouts to if (VERBOSE), or raising the limit by annotating the suite with a higher threshold.

This would make sense in that we could enable those checks by default without additional Jenkins jobs, special properties, etc. What do you think?

Fail tests if they print, and tests.verbose is not set
-------------------------------------------------------
Key: LUCENE-5622
URL: https://issues.apache.org/jira/browse/LUCENE-5622
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Assignee: Dawid Weiss
Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch

Some tests print so much stuff they are now undebuggable (see LUCENE-5612). I think it's bad that the test runner hides this stuff; we used to stay on top of it. Instead, when tests.verbose is false, we should install print streams (System.out/err) that fail the test instantly when printed to, because they are noisy. This will ensure that our tests don't go out of control.
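[A minimal sketch of the thresholded approach Dawid describes, assuming a delegating PrintStream installed by the test runner when tests.verbose is off; the class name and the 1 kB default are illustrative, not the actual patch:]

{code}
import java.io.PrintStream;

/** Fails a test once more than `limit` bytes are printed to stdout/stderr. */
final class BudgetedPrintStream extends PrintStream {
  private final int limit;
  private int written;

  BudgetedPrintStream(PrintStream delegate, int limit) {
    super(delegate, true);
    this.limit = limit;
  }

  @Override
  public synchronized void write(byte[] buf, int off, int len) {
    written += len;
    if (written > limit) {
      throw new AssertionError("Suite printed more than " + limit
          + " bytes; wrap sysouts in if (VERBOSE) or raise the suite's threshold.");
    }
    super.write(buf, off, len);
  }

  @Override
  public synchronized void write(int b) {
    write(new byte[] { (byte) b }, 0, 1);
  }
}
{code}

[The runner would then call System.setOut(new BudgetedPrintStream(System.out, 1024)) per suite, and a suite-level annotation could raise the limit, mirroring how the memory leak detector is tuned.]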
Re: release votes
+1

Mike McCandless
http://blog.mikemccandless.com

On Wed, Apr 23, 2014 at 8:27 PM, Andi Vajda <va...@apache.org> wrote:

> Hi all,
>
> Given the tiny amount of interest the PyLucene releases generate, has it perhaps become unimportant to actually make PyLucene releases? The release votes have had an increasingly difficult time garnering the three required PMC votes to pass. Non-PMC users are also eerily quiet.
>
> Maybe the time has come to switch to a different model:
> - when a Lucene release happens, a PyLucene branch gets created with all the necessary changes to build successfully and pass all tests against this Lucene release
> - users interested in PyLucene check out that branch
> - done - no more releases, no more votes
>
> JCC can continue to be released to PyPI independently, as it is today. That doesn't require any voting anyway (?).
>
> What do the readers of this list think?
>
> Andi..
[jira] [Commented] (LUCENE-5622) Fail tests if they print, and tests.verbose is not set
[ https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979503#comment-13979503 ]

ASF subversion and git services commented on LUCENE-5622:
----------------------------------------------------------

Commit 1589645 from [~dawidweiss] in branch 'dev/branches/LUCENE-5622' [ https://svn.apache.org/r1589645 ]

Branch for LUCENE-5622

Fail tests if they print, and tests.verbose is not set
-------------------------------------------------------
Key: LUCENE-5622
URL: https://issues.apache.org/jira/browse/LUCENE-5622
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Assignee: Dawid Weiss
Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch

Some tests print so much stuff they are now undebuggable (see LUCENE-5612). I think it's bad that the test runner hides this stuff; we used to stay on top of it. Instead, when tests.verbose is false, we should install print streams (System.out/err) that fail the test instantly when printed to, because they are noisy. This will ensure that our tests don't go out of control.
[jira] [Commented] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979519#comment-13979519 ]

Michael McCandless commented on LUCENE-5628:
--------------------------------------------

Good feedback, thanks!

bq. can we put the exit condition into the while block instead of at the end with a break? I think it can just be while (string.length() > 0)

Fixed.

bq. looking at the impl of State I think we can just use an identity hash set or maybe even an array since the IDs are within known bounds to check the pathStates? You could even just use a bitset and mark the state ID as visited? Hmm, now that I wrote it I see your comment. I will leave it here for discussion.

I switched to an IdentityHashSet. Yeah, I struggled with this, but the original method didn't set the state numbers, so I didn't want to change that. Setting the numbers does a DFS on the automaton...

bq. Somewhat unrelated but I think the State implementation has a problem since it doesn't override equals, though it should since it has a hashCode impl. I wonder whether we should either remove the hashCode or add equals, just for consistency?

I removed State.hashCode.

bq. should we rather throw IllegalState than IllegalArgument

Hmm, IAE felt right since you passed it an invalid (cyclic) argument?

bq. just for readability it might be good to s/strings/finiteStrings/ - I had a hard time seeing when you do things on the string vs. strings

I changed it to results.

bq. is this a leftover? ==> // a.getNumberedStates();

Removed.

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch, LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
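[For readers following the patch discussion, here is a self-contained sketch of the iterative shape being described: an explicit stack replaces the recursion, and the pathStates set (an identity set in the real patch) detects cycles so a cyclic automaton trips an IllegalArgumentException instead of overflowing the Java stack. The toy int-array automaton below is illustrative only, not Lucene's State/Transition API:]

{code}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

final class FiniteStringsSketch {
  /**
   * Enumerates every string accepted by the automaton without recursion.
   * next[s] holds {label, dest} pairs for state s; accept[s] marks accept states.
   */
  static Set<String> getFiniteStrings(int[][][] next, boolean[] accept, int start) {
    Set<String> results = new HashSet<>();
    Deque<int[]> stack = new ArrayDeque<>();    // frames: {state, next transition index}
    Set<Integer> pathStates = new HashSet<>();  // states on the current path (cycle check)
    StringBuilder string = new StringBuilder();
    stack.push(new int[] {start, 0});
    pathStates.add(start);
    if (accept[start]) results.add("");
    while (!stack.isEmpty()) {
      int[] frame = stack.peek();
      int state = frame[0];
      if (frame[1] == next[state].length) {     // all transitions tried: backtrack
        stack.pop();
        pathStates.remove(state);
        if (string.length() > 0) string.setLength(string.length() - 1);
        continue;
      }
      int[] t = next[state][frame[1]++];        // t = {label, dest}
      if (!pathStates.add(t[1])) {
        throw new IllegalArgumentException("input automaton must not contain cycles");
      }
      string.append((char) t[0]);
      if (accept[t[1]]) results.add(string.toString());
      stack.push(new int[] {t[1], 0});
    }
    return results;
  }
}
{code}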
[jira] [Updated] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5628:
---------------------------------------

Attachment: LUCENE-5628.patch

New patch.

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch, LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
[jira] [Commented] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979520#comment-13979520 ]

Michael McCandless commented on LUCENE-5628:
--------------------------------------------

bq. Can we reduce the number of lines of code in the new method? It's not even comparable to the current code.

I'll see if I can simplify it somehow...

bq. How much of the LOC is the cycle detection?

This is really a minuscule part of it: just look for whoever touches the pathStates.

bq. How much of the LOC is code optimization?

It's not optimization; in fact, I imagine this impl is slower.

bq. Can the old code please be added to AutomatonTestUtil as slowXXX and compared against the new one with random automata?

Great idea, I did that.

SpecialOperations.getFiniteStrings should not recurse
-----------------------------------------------------
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch, LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which, when used by AnalyzingSuggester, is per character in each token. This can lead to stack overflows if you have a long suggestion.
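[Robert's cross-check idea might look roughly like the following inside a LuceneTestCase subclass; slowGetFiniteStrings stands for the preserved recursive copy in AutomatonTestUtil, and that method name is assumed here rather than taken from the patch:]

{code}
import java.util.Random;

import org.apache.lucene.util.LuceneTestCase;
import org.apache.lucene.util.automaton.Automaton;
import org.apache.lucene.util.automaton.AutomatonTestUtil;
import org.apache.lucene.util.automaton.SpecialOperations;

public class TestGetFiniteStrings extends LuceneTestCase {
  public void testAgainstSlowReference() {
    Random random = random();
    for (int iter = 0; iter < 100; iter++) {
      Automaton a = AutomatonTestUtil.randomAutomaton(random);
      if (!SpecialOperations.isFinite(a)) {
        continue;  // only finite languages can be fully enumerated
      }
      // slowGetFiniteStrings is the old recursive impl, kept for testing only
      assertEquals(AutomatonTestUtil.slowGetFiniteStrings(a, -1),
                   SpecialOperations.getFiniteStrings(a, -1));
    }
  }
}
{code}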
[jira] [Created] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
Isabel Mendonca created LUCENE-5629:
------------------------------------

Summary: Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
Key: LUCENE-5629
URL: https://issues.apache.org/jira/browse/LUCENE-5629
Project: Lucene - Core
Issue Type: New Feature
Components: core/index, core/queryparser, core/search
Affects Versions: 4.7.1, 4.7
Environment: Operating system: Windows 8.1; Software platform: Eclipse Kepler 4.3.2
Reporter: Isabel Mendonca
Priority: Minor
Fix For: 4.7.1, 4.7

We have observed that Lucene does not check whether the same Similarity function is used during indexing and searching. The same problem exists for the Analyzer that is used. This may lead to poor or misleading results. So we decided to create an XML file during indexing that stores information such as the Analyzer and the Similarity function that were used, as well as the version of Lucene that was used. This XML file will always be available to the users. At search time, we will retrieve this information using SAX parsing and check whether the utils used for searching match those used for indexing. If not, a warning message will be displayed to the user.
[jira] [Commented] (SOLR-5340) Add support for named snapshots
[ https://issues.apache.org/jira/browse/SOLR-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979552#comment-13979552 ]

Noble Paul commented on SOLR-5340:
----------------------------------

Sorry to raise this concern now. In deletebackup, isn't it possible to do the check of whether the snapshot exists in the same thread and give a response back right away? It is much better than polling the status later, I guess. Even the backup command should do basic checks of the location etc. before the call returns.

Add support for named snapshots
-------------------------------
Key: SOLR-5340
URL: https://issues.apache.org/jira/browse/SOLR-5340
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Affects Versions: 4.5
Reporter: Mike Schrag
Assignee: Noble Paul
Attachments: SOLR-5340.patch, SOLR-5340.patch

It would be really nice if Solr supported named snapshots. Right now, if you snapshot a SolrCloud cluster, every node potentially records a slightly different timestamp. Correlating those back together to effectively restore the entire cluster to a consistent snapshot is pretty tedious.
[jira] [Updated] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
[ https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Isabel Mendonca updated LUCENE-5629:
------------------------------------

Affects Version/s: (was: 4.7.1)
                   (was: 4.7)

Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
--------------------------------------------------------------------------------------------------------------------------
Key: LUCENE-5629
URL: https://issues.apache.org/jira/browse/LUCENE-5629
Project: Lucene - Core
Issue Type: New Feature
Components: core/index, core/queryparser, core/search
Environment: Operating system: Windows 8.1; Software platform: Eclipse Kepler 4.3.2
Reporter: Isabel Mendonca
Priority: Minor
Labels: features, patch
Fix For: 4.8, 4.9, 5.0
Original Estimate: 672h
Remaining Estimate: 672h

We have observed that Lucene does not check whether the same Similarity function is used during indexing and searching. The same problem exists for the Analyzer that is used. This may lead to poor or misleading results. So we decided to create an XML file during indexing that stores information such as the Analyzer and the Similarity function that were used, as well as the version of Lucene that was used. This XML file will always be available to the users. At search time, we will retrieve this information using SAX parsing and check whether the utils used for searching match those used for indexing. If not, a warning message will be displayed to the user.
[jira] [Updated] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
[ https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Isabel Mendonca updated LUCENE-5629:
------------------------------------

Fix Version/s: (was: 4.7.1)
               (was: 4.7)
               5.0
               4.9
               4.8

Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
--------------------------------------------------------------------------------------------------------------------------
Key: LUCENE-5629
URL: https://issues.apache.org/jira/browse/LUCENE-5629
Project: Lucene - Core
Issue Type: New Feature
Components: core/index, core/queryparser, core/search
Environment: Operating system: Windows 8.1; Software platform: Eclipse Kepler 4.3.2
Reporter: Isabel Mendonca
Priority: Minor
Labels: features, patch
Fix For: 4.8, 4.9, 5.0
Original Estimate: 672h
Remaining Estimate: 672h

We have observed that Lucene does not check whether the same Similarity function is used during indexing and searching. The same problem exists for the Analyzer that is used. This may lead to poor or misleading results. So we decided to create an XML file during indexing that stores information such as the Analyzer and the Similarity function that were used, as well as the version of Lucene that was used. This XML file will always be available to the users. At search time, we will retrieve this information using SAX parsing and check whether the utils used for searching match those used for indexing. If not, a warning message will be displayed to the user.
Re: [VOTE] Lucene/Solr 4.8.0 RC1
+1

Smoke tester says: SUCCESS! [1:26:32.631579]

Elasticsearch is happy with the RC as well.

Thanks, Uwe!

On Wed, Apr 23, 2014 at 11:08 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> +1
>
> SUCCESS! [0:44:41.170815]
>
> Mike McCandless
> http://blog.mikemccandless.com
>
> On Tue, Apr 22, 2014 at 2:47 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>
>> Hi,
>>
>> I prepared the first release candidate of Lucene and Solr 4.8.0. The artifacts can be found here:
>> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC1-rev1589150/
>>
>> It took a bit longer because we had to fix some remaining bugs regarding NativeFSLockFactory, which did not work correctly and leaked file handles. I also updated the instructions about the preferred Java update versions. See also Mike's blog post: http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/
>>
>> Please check the artifacts and give your vote in the next 72 hrs.
>>
>> My +1 will hopefully come a little bit later, because Solr tests are failing constantly on my release build and smoke tester machine. The reason seems to be a lack of file handles. A standard Ubuntu configuration has 1024 file handles, and I want a release to pass with that common default configuration. Instead, org.apache.solr.cloud.TestMiniSolrCloudCluster.testBasics always fails with crazy error messages (not about too few file handles - more that Jetty cannot start up, cannot bind ports, or various other stuff). This did not happen when smoking 4.7.x releases. I will now run the smoker again without HDFS (via build.properties), and if that also fails, then once again with more file handles.
>>
>> But we really have to fix our tests so that they succeed with the default configuration of 1024 file handles. We can configure that in Jenkins (so the Jenkins job first sets ulimit -n 1024 and then runs ant). But this should not block the release. I just say: I gave up running those Solr tests, sorry! Anybody else can test that stuff!
>>
>> Uwe
>>
>> P.S.: Here's my smoker command line:
>> $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 python3.2 -u smokeTestRelease.py 'http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC1-rev1589150/' 1589150 4.8.0 tmp
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979580#comment-13979580 ]

Noble Paul commented on SOLR-5681:
----------------------------------

bq. workQueue.peekTopN(10);

If a task is being processed, this will always return immediately with one item, and the loop would continue without a pause, hogging CPU/ZK traffic. You will need to ensure that the call returns only if the available items in the queue are different from the ones being processed.

Make the OverseerCollectionProcessor multi-threaded
---------------------------------------------------
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681.patch

Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting anything long-running would have it block the processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby not process a create-collection task (which would stay queued in ZK), though both tasks are mutually exclusive.

Here are a few of the challenges:
* Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time.
* ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task is only removed from the workQueue on completion, so that in case of a failure the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead, i.e. get the 2nd task from the queue while the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing.

Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue.
[jira] [Comment Edited] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979580#comment-13979580 ]

Noble Paul edited comment on SOLR-5681 at 4/24/14 11:26 AM:
------------------------------------------------------------

bq. workQueue.peekTopN(10);

If a task is being processed, this will always return immediately with one item, and the loop would continue without a pause, hogging CPU/ZK traffic. You will need to ensure that the call returns only if the available items in the queue are different from the ones being processed.

Make the OverseerCollectionProcessor multi-threaded
---------------------------------------------------
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681.patch

Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting anything long-running would have it block the processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby not process a create-collection task (which would stay queued in ZK), though both tasks are mutually exclusive.

Here are a few of the challenges:
* Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time.
* ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task is only removed from the workQueue on completion, so that in case of a failure the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead, i.e. get the 2nd task from the queue while the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing.

Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue.
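[A sketch of the non-spinning behavior Noble asks for: only hand out peeked tasks that are not already in flight, and wait on the queue (woken by a ZooKeeper watch) instead of looping. The TaskQueue/Task types are illustrative stand-ins, not Solr's DistributedQueue API:]

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

interface Task { String id(); }
interface TaskQueue { List<Task> peekTopN(int n); }

final class NonSpinningPeek {
  /** Peeks up to n tasks that are not in flight; blocks when nothing new is available. */
  static List<Task> peekFresh(TaskQueue queue, int n, Set<String> inFlight)
      throws InterruptedException {
    synchronized (queue) {
      while (true) {
        List<Task> fresh = new ArrayList<>();
        for (Task t : queue.peekTopN(n)) {
          if (!inFlight.contains(t.id())) fresh.add(t);
        }
        if (!fresh.isEmpty()) return fresh;
        queue.wait(1000);  // re-check periodically; a ZK watch can notify sooner
      }
    }
  }
}
{code}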
[jira] [Created] (SOLR-6010) Wrong highlighting while querying by date range with wild card in the end range
Mohammad Abul Khaer created SOLR-6010:
--------------------------------------

Summary: Wrong highlighting while querying by date range with wild card in the end range
Key: SOLR-6010
URL: https://issues.apache.org/jira/browse/SOLR-6010
Project: Solr
Issue Type: Bug
Components: highlighter, query parsers
Affects Versions: 4.0
Environment: java version 1.7.0_45; Java(TM) SE Runtime Environment (build 1.7.0_45-b18); Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode); Linux 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mohammad Abul Khaer

Solr is returning wrong highlights when I have a date range query with a wildcard *in the end range*. For example, my query *q* is:

{noformat}
(story) +activatedate:[* TO 2014-04-24T09:55:00Z] +expiredate:[2014-04-24T09:55:00Z TO *]
{noformat}

In the above query, activatedate and expiredate are date fields. Their definition in the schema file is as follows:

{code}
<field name="activatedate" type="date" indexed="true" stored="false" omitNorms="true"/>
<field name="expiredate" type="date" indexed="true" stored="false" omitNorms="true"/>
{code}

In the query result I am getting wrong highlighting information. Only the highlighting result is shown below:

{code}
"highlighting": {
  "article:3605": {
    "title": ["The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"],
    "summary": ["<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>."]
  },
  "article:3604": {
    "title": ["The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"],
    "summary": ["<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>.."]
  }
}
{code}

It should highlight only the word *story*, but it is highlighting a lot of other words as well. What I noticed is that this happens only if I have a wildcard * in the end range. If I change the above query and set a fixed date in the end range instead of *, then Solr returns correct highlights. The modified query is shown below:

{noformat}
(story) +activatedate:[* TO 2014-04-24T09:55:00Z] +expiredate:[2014-04-24T09:55:00Z TO 3014-04-24T09:55:00Z]
{noformat}

I guess it's a bug in Solr. If I use the filter query *fq* instead of the normal query *q*, then the highlighting result is OK for both queries.
[jira] [Updated] (SOLR-6010) Wrong highlighting while querying by date range with wild card in the end range
[ https://issues.apache.org/jira/browse/SOLR-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Abul Khaer updated SOLR-6010:
--------------------------------------

Description: Solr is returning wrong highlights when I have a date range query with a wildcard *in the end range*. For example, my query *q* is:

{noformat}
(porta) +activatedate:[* TO 2014-04-24T09:55:00Z] +expiredate:[2014-04-24T09:55:00Z TO *]
{noformat}

In the above query, activatedate and expiredate are date fields. Their definition in the schema file is as follows:

{code}
<field name="activatedate" type="date" indexed="true" stored="false" omitNorms="true"/>
<field name="expiredate" type="date" indexed="true" stored="false" omitNorms="true"/>
{code}

In the query result I am getting wrong highlighting information. Only the highlighting result is shown below:

{code}
"highlighting": {
  "article:3605": {
    "title": ["The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"],
    "summary": ["<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>."]
  },
  "article:3604": {
    "title": ["The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"],
    "summary": ["<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>.."]
  }
}
{code}

It should highlight only the word *porta*, but it is highlighting a lot of other words as well. What I noticed is that this happens only if I have a wildcard * in the end range. If I change the above query and set a fixed date in the end range instead of *, then Solr returns correct highlights. The modified query is shown below:

{noformat}
(porta) +activatedate:[* TO 2014-04-24T09:55:00Z] +expiredate:[2014-04-24T09:55:00Z TO 3014-04-24T09:55:00Z]
{noformat}

I guess it's a bug in Solr. If I use the filter query *fq* instead of the normal query *q*, then the highlighting result is OK for both queries.

Wrong highlighting while querying by date range with wild card in the end range
--------------------------------------------------------------------------------
Key: SOLR-6010
URL: https://issues.apache.org/jira/browse/SOLR-6010
Project: Solr
Issue Type: Bug
Components: highlighter, query parsers
Affects Versions: 4.0
Environment: java version 1.7.0_45; Java(TM) SE Runtime Environment (build 1.7.0_45-b18); Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode); Linux 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mohammad Abul Khaer
Labels: date, highlighting, range, solr
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979630#comment-13979630 ]

Mark Miller commented on SOLR-5473:
-----------------------------------

bq. The latest patch is not a final one.

I have to stick with my veto - the API changes are way too crazy. There is no quick fix here. I'm going to insist on my code veto and that this gets on track in a branch.

bq. The objective of that patch was to get the naming right.

But it was barely even a start. However, the names are not even the problem, which is why this needs way more work. Even if we rename the methods, it's still all super crazy compared to what we have, straddling two worlds in a way where both are ugly and the combination is just whacked. I realize this was done because doing it while keeping sensible APIs is harder given what you would like to do, but that's not a good enough reason.

I'm also not sold on the new watch approach. I am sold on splitting up clusterstate.json, but you have tied a lot of other stuff into this commit that is much more controversial.

I'm sticking to my code veto, and this should be reverted and moved to a branch.

Make one state.json per collection
----------------------------------
Key: SOLR-5473
URL: https://issues.apache.org/jira/browse/SOLR-5473
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
Fix For: 5.0
Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log

As defined in the parent issue, store the states of each collection under the /collections/<collectionname>/state.json node.
Re: [VOTE] Lucene/Solr 4.8.0 RC1
+1. SUCCESS! [0:37:04.608776]

--
Mark Miller
about.me/markrmiller

On April 22, 2014 at 2:47:50 PM, Uwe Schindler (u...@thetaphi.de) wrote:

> Hi,
>
> I prepared the first release candidate of Lucene and Solr 4.8.0. The artifacts can be found here:
> http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC1-rev1589150/
>
> It took a bit longer because we had to fix some remaining bugs regarding NativeFSLockFactory, which did not work correctly and leaked file handles. I also updated the instructions about the preferred Java update versions. See also Mike's blog post: http://www.elasticsearch.org/blog/java-1-7u55-safe-use-elasticsearch-lucene/
>
> Please check the artifacts and give your vote in the next 72 hrs.
[jira] [Commented] (LUCENE-5610) Add Terms min/max
[ https://issues.apache.org/jira/browse/LUCENE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979743#comment-13979743 ]

ASF subversion and git services commented on LUCENE-5610:
----------------------------------------------------------

Commit 1589729 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1589729 ]

LUCENE-5610: add Terms.getMin/Max

Add Terms min/max
-----------------
Key: LUCENE-5610
URL: https://issues.apache.org/jira/browse/LUCENE-5610
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Attachments: LUCENE-5610.patch, LUCENE-5610.patch, LUCENE-5610.patch, LUCENE-5610.patch, LUCENE-5610.patch

Having upper/lower bounds on terms could be useful for various optimizations in the future, e.g. to accelerate sorting (if a segment can't compete, don't even search it), and so on. It's pretty obvious how to get the smallest term, but the maximum term for a field is tricky; worst case, you can do it in ~log(N) time by binary searching the term space.
[jira] [Commented] (SOLR-6010) Wrong highlighting while querying by date range with wild card in the end range
[ https://issues.apache.org/jira/browse/SOLR-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979749#comment-13979749 ]

Ahmet Arslan commented on SOLR-6010:
------------------------------------

Can't you set {{hl.requireFieldMatch}} to true?

Wrong highlighting while querying by date range with wild card in the end range
--------------------------------------------------------------------------------
Key: SOLR-6010
URL: https://issues.apache.org/jira/browse/SOLR-6010
Project: Solr
Issue Type: Bug
Components: highlighter, query parsers
Affects Versions: 4.0
Environment: java version 1.7.0_45; Java(TM) SE Runtime Environment (build 1.7.0_45-b18); Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode); Linux 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Mohammad Abul Khaer
Labels: date, highlighting, range, solr
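[Ahmet's suggestion expressed through SolrJ might look like the sketch below; hl.requireFieldMatch is a standard highlighting parameter that restricts highlighting to terms that actually matched in the highlighted field, and the query and field list are taken from the report:]

{code}
import org.apache.solr.client.solrj.SolrQuery;

final class RequireFieldMatchExample {
  static SolrQuery build() {
    SolrQuery q = new SolrQuery(
        "(porta) +activatedate:[* TO 2014-04-24T09:55:00Z]"
        + " +expiredate:[2014-04-24T09:55:00Z TO *]");
    q.setHighlight(true);
    q.set("hl.fl", "title,summary");
    // Without this, the range clauses can cause unrelated words to be marked.
    q.set("hl.requireFieldMatch", true);
    return q;
  }
}
{code}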
[jira] [Commented] (LUCENE-5610) Add Terms min/max
[ https://issues.apache.org/jira/browse/LUCENE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979788#comment-13979788 ]

ASF subversion and git services commented on LUCENE-5610:

Commit 1589749 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1589749 ]
LUCENE-5610: add Terms.getMin/Max
[jira] [Commented] (LUCENE-5610) Add Terms min/max
[ https://issues.apache.org/jira/browse/LUCENE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979792#comment-13979792 ]

ASF subversion and git services commented on LUCENE-5610:

Commit 1589752 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1589752 ]
LUCENE-5610: improve CheckIndex checking; javadocs
[jira] [Resolved] (LUCENE-5610) Add Terms min/max
[ https://issues.apache.org/jira/browse/LUCENE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-5610.
Resolution: Fixed
Fix Version/s: 5.0, 4.9
[JENKINS] Lucene-Solr-4.8-Linux (64bit/jdk1.7.0_60-ea-b14) - Build # 57 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.8-Linux/57/ Java: 64bit/jdk1.7.0_60-ea-b14 -XX:+UseCompressedOops -XX:+UseSerialGC 1 tests failed. FAILED: org.apache.solr.handler.clustering.DistributedClusteringComponentTest.testDistribSearch Error Message: Timeout occured while waiting response from server at: https://127.0.0.1:57149 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: https://127.0.0.1:57149 at __randomizedtesting.SeedInfo.seed([4D372A03589DF004:CCD1A41B2FC29038]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:562) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102) at org.apache.solr.BaseDistributedSearchTestCase.indexDoc(BaseDistributedSearchTestCase.java:440) at org.apache.solr.BaseDistributedSearchTestCase.index(BaseDistributedSearchTestCase.java:429) at org.apache.solr.handler.clustering.DistributedClusteringComponentTest.doTest(DistributedClusteringComponentTest.java:36) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:871) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) 
at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[jira] [Updated] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse
[ https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5628:
Attachment: LUCENE-5628.patch

New patch, with some simplification: I moved all the hairy logic about the next label/transition into the PathNode. I think this helps. I put a nocommit to use Stack instead of PathNode[] ... this would be simpler (push/pop instead of .get/.remove); the only downside is it would mean a new Java object on each push, vs. now where it re-uses them.

SpecialOperations.getFiniteStrings should not recurse
Key: LUCENE-5628
URL: https://issues.apache.org/jira/browse/LUCENE-5628
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5628.patch, LUCENE-5628.patch, LUCENE-5628.patch

Today it consumes one Java stack frame per transition, which when used by AnalyzingSuggester means one frame per character in each token. This can lead to stack overflows if you have a long suggestion.
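To make the shape of the fix concrete, here is an illustrative version of the pattern (my own sketch, not Lucene's actual code): one heap-allocated frame per depth level, holding the index of the next transition to try, replaces one Java stack frame per transition. It assumes an acyclic transition graph, as getFiniteStrings effectively requires for a finite language.
{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class IterativePaths {
  private static final class Frame {        // analogous in spirit to the patch's PathNode
    final int state;
    int nextTransition;                     // next outgoing edge to explore from this state
    Frame(int state) { this.state = state; }
  }

  /** Enumerate all start-to-accept paths without Java recursion. */
  public static List<List<Integer>> allPaths(int[][] edges, int start, boolean[] accept) {
    List<List<Integer>> results = new ArrayList<List<Integer>>();
    Deque<Frame> stack = new ArrayDeque<Frame>();
    stack.push(new Frame(start));
    while (!stack.isEmpty()) {
      Frame top = stack.peek();
      if (top.nextTransition == 0 && accept[top.state]) { // emit only on first visit
        List<Integer> path = new ArrayList<Integer>();
        for (Frame f : stack) path.add(f.state);          // deque iterates deepest-first
        Collections.reverse(path);                        // restore root-to-leaf order
        results.add(path);
      }
      if (top.nextTransition < edges[top.state].length) {
        stack.push(new Frame(edges[top.state][top.nextTransition++])); // descend, no recursion
      } else {
        stack.pop();                                      // all transitions tried: backtrack
      }
    }
    return results;
  }
}
{code}
This is also why the Stack-vs-PathNode[] tradeoff Mike mentions exists: a preallocated array can reuse frames across descents, while a push/pop collection allocates a fresh frame object each time.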
[jira] [Updated] (LUCENE-5611) Simplify the default indexing chain
[ https://issues.apache.org/jira/browse/LUCENE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5611:
Attachment: LUCENE-5611.patch

New patch; I think it's ready. I fixed all nocommits and javadocs. I removed the specialization for String/NumericField; it gave decent performance gains, but we should pursue it separately.

Simplify the default indexing chain
Key: LUCENE-5611
URL: https://issues.apache.org/jira/browse/LUCENE-5611
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0
Attachments: LUCENE-5611.patch, LUCENE-5611.patch

I think Lucene's current indexing chain has too many classes / hierarchies / abstractions, making it look much more complex than it really should be, and discouraging users from experimenting/innovating with their own indexing chains. Also, if it were easier to understand and approach, new developers would be more likely to try to improve it ... it really should be simpler. So I'm exploring a pared-back indexing chain, and have a starting patch that I think is looking OK: it seems more approachable than the current indexing chain, or at least has fewer strange classes. I also thought this could give some speedup for tiny documents (a more common use of Lucene lately), and it looks like, with the evil optimizations, this is a ~25% speedup for Geonames docs. Even without those evil optimizations it's a bit faster. This is very much a work in progress with nocommits, and there are some behavior changes, e.g. the new chain requires all fields to have the same term-vector options (rather than auto-upgrading all fields with the same name, as the current chain does)...
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anshum Gupta updated SOLR-5681:
Attachment: SOLR-5681.patch

Another patch with a few things fixed: got rid of a bit of the hard-coded logic and a probable multi-threading race condition. Also, the main thread loop now continues if there's nothing new in the work queue.

Make the OverseerCollectionProcessor multi-threaded
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681.patch, SOLR-5681.patch

Right now the OverseerCollectionProcessor is single-threaded, i.e. submitting anything long-running blocks the processing of other, mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby delay a create-collection task (which would stay queued in ZK), even though the two tasks are mutually exclusive. Here are a few of the challenges (the exclusivity rule is sketched in code after this message):

* Mutual exclusivity: only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let one task per collection run at a time.
* ZK distributed queue to feed tasks: the OCP consumes tasks from a queue. A task is only removed from the workQueue on completion, so that in case of a failure the new Overseer can re-consume the same task and retry. A queue is not the right data structure for looking ahead, i.e. getting the 2nd task from the queue while the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing.

Proposed solutions for task management:

* Task funnel and peekAfter(): the parent thread is responsible for getting the request and passing it to a new thread (or one from the pool). The parent method uses peekAfter(last element) instead of peek(); peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: while offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it's removed from the queue. At the end, the cleanup is done from the workQueue.
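A hedged sketch of the "one task per collection at a time" rule (names and structure are mine, not the patch's): the parent thread peeks the next queue entry and only hands it to the pool if no other task for that collection is in flight.
{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CollectionTaskRunner {
  private final Set<String> running =
      Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
  private final ExecutorService pool = Executors.newFixedThreadPool(10);

  /** Returns false if a task for this collection is already in flight;
   *  the caller then looks ahead (peekAfter-style) to the next queue entry. */
  public boolean trySubmit(final String collection, final Runnable task) {
    if (!running.add(collection)) {
      return false;                    // mutually exclusive: same collection already busy
    }
    pool.submit(new Runnable() {
      @Override public void run() {
        try {
          task.run();
        } finally {
          running.remove(collection);  // release the per-collection "lock"
        }
      }
    });
    return true;
  }
}
{code}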
[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 19681 - Failure!
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/19681/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestTerms.testTermMinMaxRandom Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([3F2528AD7379C20F:7356A7D91FE5196A]:0) at org.apache.lucene.util.UnicodeUtil.UTF8toUTF16(UnicodeUtil.java:563) at org.apache.lucene.codecs.lucene3x.TermInfosWriter.compareToLastTerm(TermInfosWriter.java:187) at org.apache.lucene.codecs.lucene3x.TermInfosWriter.add(TermInfosWriter.java:217) at org.apache.lucene.codecs.lucene3x.PreFlexRWFieldsWriter$PreFlexTermsWriter.finishTerm(PreFlexRWFieldsWriter.java:209) at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:548) at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85) at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116) at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53) at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:465) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:518) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:628) at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2942) at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3101) at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3068) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:320) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:257) at org.apache.lucene.index.TestTerms.testTermMinMaxRandom(TestTerms.java:84) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[jira] [Created] (SOLR-6011) inOrder does not work with the complexphrase parser
Yonik Seeley created SOLR-6011:
Summary: inOrder does not work with the complexphrase parser
Key: SOLR-6011
URL: https://issues.apache.org/jira/browse/SOLR-6011
Project: Solr
Issue Type: Bug
Reporter: Yonik Seeley
Priority: Critical

{code}
{!complexphrase}vol* high*
{code}
does not match the Solr document containing "... high volume web ..." (this is correct). But adding inOrder=false still fails to make it match:
{code}
{!complexphrase inOrder=false}vol* high*
{code}
[jira] [Commented] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
[ https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979875#comment-13979875 ]

Ahmet Arslan commented on LUCENE-5629:

bq. The same problem exists for the Analyzer that is used.

Can't we use different analyzers for indexing and searching? e.g. WordDelimiterFilter, SynonymFilter, NGramFilter, etc.

Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching
Key: LUCENE-5629
URL: https://issues.apache.org/jira/browse/LUCENE-5629
Project: Lucene - Core
Issue Type: New Feature
Components: core/index, core/queryparser, core/search
Environment: Operating system: Windows 8.1; Software platform: Eclipse Kepler 4.3.2
Reporter: Isabel Mendonca
Priority: Minor
Labels: features, patch
Fix For: 4.8, 4.9, 5.0
Original Estimate: 672h
Remaining Estimate: 672h

We have observed that Lucene does not check whether the same Similarity function is used during indexing and searching. The same problem exists for the Analyzer that is used. This may lead to poor or misleading results. So we decided to create an XML file during indexing that will store information such as the Analyzer and the Similarity function that were used, as well as the version of Lucene that was used. This XML file will always be available to the users. At search time, we will retrieve this information using SAX parsing and check whether the utilities used for searching match those used for indexing. If not, a warning message will be displayed to the user.
Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 19681 - Failure!
I'll dig.

Mike McCandless
http://blog.mikemccandless.com

On Thu, Apr 24, 2014 at 11:51 AM, buil...@flonkings.com wrote:
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/19681/
1 tests failed. REGRESSION: org.apache.lucene.index.TestTerms.testTermMinMaxRandom
[jira] [Closed] (LUCENE-5629) Comparing the Version of Lucene, the Analyzer and the similarity function that are being used for indexing and searching.
[ https://issues.apache.org/jira/browse/LUCENE-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson closed LUCENE-5629.
Resolution: Not a Problem

Closing; if you still think this is a problem we can re-open. Allowing different analyzers at index and query time is a deliberate decision. Otherwise all the effort that went into allowing independent index and query analysis chains could have been avoided. In particular, synonyms are often defined at index time but not at query time.
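For anyone landing on this issue later, the asymmetry Erick describes is straightforward to wire up. A sketch against the 4.8-era analysis API (the field name and the one-entry synonym map are illustrative assumptions):
{code:java}
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.synonym.SynonymFilter;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.util.CharsRef;
import org.apache.lucene.util.Version;

public class SplitAnalysis {
  public static void main(String[] args) throws Exception {
    SynonymMap.Builder b = new SynonymMap.Builder(true);
    b.add(new CharsRef("tv"), new CharsRef("television"), true); // illustrative synonym
    final SynonymMap synonyms = b.build();

    // Index-time chain expands synonyms...
    Analyzer indexAnalyzer = new Analyzer() {
      @Override protected TokenStreamComponents createComponents(String field, Reader reader) {
        Tokenizer src = new StandardTokenizer(Version.LUCENE_48, reader);
        TokenStream tok = new LowerCaseFilter(Version.LUCENE_48, src);
        tok = new SynonymFilter(tok, synonyms, true);
        return new TokenStreamComponents(src, tok);
      }
    };
    // ...the query-time chain deliberately does not.
    Analyzer queryAnalyzer = new StandardAnalyzer(Version.LUCENE_48);

    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_48, indexAnalyzer);
    QueryParser parser = new QueryParser(Version.LUCENE_48, "body", queryAnalyzer);
    // index documents with iwc, parse user queries with parser
  }
}
{code}
A query for "tv" then matches documents that only ever said "television", without the query side having to expand anything.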
[jira] [Commented] (LUCENE-5610) Add Terms min/max
[ https://issues.apache.org/jira/browse/LUCENE-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979885#comment-13979885 ]

ASF subversion and git services commented on LUCENE-5610:

Commit 1589782 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1589782 ]
LUCENE-5610: don't use Lucene3x codec (the test writes arbitrary binary terms)
Re: [JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 19681 - Failure!
I committed a fix, disabling Lucene3x for that test case. I didn't use @SuppressCodecs because the other tests should work fine w/ Lucene3x.

Mike McCandless
http://blog.mikemccandless.com

On Thu, Apr 24, 2014 at 11:57 AM, Michael McCandless luc...@mikemccandless.com wrote:
I'll dig.
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979898#comment-13979898 ]

Brett Lucey commented on SOLR-2894:

Andrew actually raised that question to me yesterday as well, and I spent a little bit of time looking into it. For the initial request to a shard, we only lower the mincount if the facet limit is set to something other than -1. In your case, this would be 10 for the top-level pivot. We know we will (at most) get back 15 terms from each shard in this case. Because we are only faceting on a limited number of terms, having a mincount of 0 here gives us the benefit of potentially avoiding refinement. In refinement requests, we still need to know when a shard has responded to us with its count for a term, so the mincount is -1 in that case because we are interested in the term even if the count is zero. It allows us to mark the shard as having responded and continue on. It's possible that we might be able to change this, but at the point of refinement it's a rather targeted request, so I don't expect a significant benefit from doing so. In your case, with the facet limit being -1 on f2-f5, no refinement would be performed anyway.

When we designed this implementation, the most important factor for us was speed, and we were willing to get it at a cost of memory. By making these changes, we reduced queries which previously took around 70 seconds for us down to around 600 milliseconds. I suspect that the biggest factor in the poor memory utilization is the wide-open nature of using a facet.limit of -1, especially on a pivot so deep. Keep in mind that for each level of depth you add to a pivot, the memory and time required grow exponentially. Don't forget that if you are querying a node and all of the shards are located within the same Java VM, you are incurring the memory cost of both shards plus the node responding to the user query, all within the same heap. I took a quick look at the code today while waiting for some other processes to finish, and I don't see any obvious low-hanging fruit to free up a small amount of memory.

Implement distributed pivot faceting
Key: SOLR-2894
URL: https://issues.apache.org/jira/browse/SOLR-2894
Project: Solr
Issue Type: Improvement
Reporter: Erik Hatcher
Fix For: 4.9, 5.0
Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch

Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented.
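In code form, the rule Brett describes (as later corrected in his edited comment below) comes down to a few lines. This is a paraphrase of the described behavior, not the patch itself:
{code:java}
public class PivotMincount {
  /** Effective facet.mincount sent to a shard, per the description above. */
  static int effectiveMincount(boolean isRefinementRequest, int facetLimit) {
    if (isRefinementRequest) {
      return -1; // report the term even at count 0, so the coordinator can
                 // mark this shard as having answered for that term
    }
    // Initial pass: a bounded limit drops mincount to 0 (may avoid refinement);
    // an unbounded limit (-1) keeps mincount at 1 to bound the response size.
    return facetLimit == -1 ? 1 : 0;
  }
}
{code}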
[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979898#comment-13979898 ]

Brett Lucey edited comment on SOLR-2894 at 4/24/14 4:20 PM:

Andrew actually raised that question to me yesterday as well, and I spent a little bit of time looking into it. For the initial request to a shard, we only lower the mincount to 0 if the facet limit is set to something other than -1. If the facet limit is -1, we lower the mincount to 1. In your case, the limit would be 10 for the top-level pivot, so we know we will (at most) get back 15 terms from each shard. Because we are only faceting on a limited number of terms, having a mincount of 0 here gives us the benefit of potentially avoiding refinement. In refinement requests, we still need to know when a shard has responded to us with its count for a term, so the mincount is -1 in that case because we are interested in the term even if the count is zero. It allows us to mark the shard as having responded and continue on. It's possible that we might be able to change this, but at the point of refinement it's a rather targeted request, so I don't expect a significant benefit from doing so. In your case, with the facet limit being -1 on f2-f5, no refinement would be performed anyway.

When we designed this implementation, the most important factor for us was speed, and we were willing to get it at a cost of memory. By making these changes, we reduced queries which previously took around 70 seconds for us down to around 600 milliseconds. I suspect that the biggest factor in the poor memory utilization is the wide-open nature of using a facet.limit of -1, especially on a pivot so deep. Keep in mind that for each level of depth you add to a pivot, the memory and time required grow exponentially. Don't forget that if you are querying a node and all of the shards are located within the same Java VM, you are incurring the memory cost of both shards plus the node responding to the user query, all within the same heap. I took a quick look at the code today while waiting for some other processes to finish, and I don't see any obvious low-hanging fruit to free up a small amount of memory.
RE: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java
Why not use the @SuppressCodecs annotation?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-Original Message-
From: mikemcc...@apache.org [mailto:mikemcc...@apache.org]
Sent: Thursday, April 24, 2014 6:08 PM
To: comm...@lucene.apache.org
Subject: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java

Author: mikemccand
Date: Thu Apr 24 16:07:30 2014
New Revision: 1589782

URL: http://svn.apache.org/r1589782
Log: LUCENE-5610: don't use Lucene3x codec (the test writes arbitrary binary terms)

Modified: lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java?rev=1589782&r1=1589781&r2=1589782&view=diff

--- lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java (original)
+++ lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java Thu Apr 24 16:07:30 2014
@@ -20,6 +20,8 @@ package org.apache.lucene.index;
 import java.util.*;
 import org.apache.lucene.analysis.CannedBinaryTokenStream;
+import org.apache.lucene.codecs.Codec;
+import org.apache.lucene.codecs.lucene3x.Lucene3xCodec;
 import org.apache.lucene.document.Document;
 import org.apache.lucene.document.DoubleField;
 import org.apache.lucene.document.Field;
@@ -51,6 +53,7 @@ public class TestTerms extends LuceneTes
 }

 public void testTermMinMaxRandom() throws Exception {
+    assumeFalse("test writes binary terms", Codec.getDefault() instanceof Lucene3xCodec);
     Directory dir = newDirectory();
     RandomIndexWriter w = new RandomIndexWriter(random(), dir);
     int numDocs = atLeast(100);
[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979898#comment-13979898 ]

Brett Lucey edited comment on SOLR-2894 at 4/24/14 4:21 PM:

Andrew actually raised that question to me yesterday as well, and I spent a little bit of time looking into it. For the initial request to a shard, we only lower the mincount to 0 if the facet limit is set to something other than -1. If the facet limit is -1, we lower the mincount to 1. In your case, the limit would be 10 for the top-level pivot, so we know we will (at most) get back 15 terms from each shard. Because we are only faceting on a limited number of terms, having a mincount of 0 here gives us the benefit of potentially avoiding refinement. In refinement requests, we still need to know when a shard has responded to us with its count for a term, so the mincount is -1 in that case because we are interested in the term even if the count is zero. It allows us to mark the shard as having responded and continue on. It's possible that we might be able to change this, but at the point of refinement it's a rather targeted request, so I don't expect a significant benefit from doing so. In your case, with the facet limit being -1 on f2-f5, no refinement would be performed anyway.

When we designed this implementation, the most important factor for us was speed, and we were willing to get it at a cost of memory. By making these changes, we reduced queries which previously took around 70 seconds for us down to around 600 milliseconds. I suspect that the biggest factor in the poor memory utilization is the wide-open nature of using a facet.limit of -1, especially on a pivot so deep. Keep in mind that for each level of depth you add to a pivot, the memory and time required grow exponentially. Don't forget that if you are querying a node and all of the shards are located within the same Java VM, you are incurring the memory cost of both shards plus the node responding to the user query, all within the same heap. I took a quick look at the code today while waiting for some other processes to finish, and I don't see any obvious low-hanging fruit to free up memory.
RE: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java
: Why not use the @SuppressCodecs annotation?

mike mentioned this in his reply to the jenkins failure...

"I committed a fix, disabling Lucene3x for that test case. I didn't use @SuppressCodecs because the other tests should work fine w/ Lucene3x."

..perhaps putting this in the test as a comment would be useful?

-Hoss
http://www.lucidworks.com/
Re: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java
Yeah, I didn't want to disable the full test, just that one method, because I want Terms.getMin/Max testing for Lucene3x too. Would be nice if we could @SuppressCodecs for just one method ... I'll add a comment.

Mike McCandless
http://blog.mikemccandless.com

On Thu, Apr 24, 2014 at 12:26 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
..perhaps putting this in the test as a comment would be useful?
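For reference, the per-method skip plus the promised comment would look roughly like this inside TestTerms (a sketch; the committed comment wording may differ):
{code:java}
public void testTermMinMaxRandom() throws Exception {
  // Lucene3x (the pre-4.0 back-compat codec) cannot write the arbitrary
  // binary terms this test generates; the class's other tests are fine
  // with it, so skip here instead of a class-level @SuppressCodecs.
  assumeFalse("test writes binary terms",
              Codec.getDefault() instanceof Lucene3xCodec);
  // ... rest of the test unchanged ...
}
{code}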
[jira] [Created] (SOLR-6012) Dutch language stemming issues
Ashokkumar Balasubramanian created SOLR-6012:
Summary: Dutch language stemming issues
Key: SOLR-6012
URL: https://issues.apache.org/jira/browse/SOLR-6012
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.5
Environment: Linux
Reporter: Ashokkumar Balasubramanian
Priority: Minor

I am trying to search for the Dutch word "Brievenbussen". This is the proper Dutch word and it should result in some matches, but it returns 0 matches. The word "Brievenbusen" (with one 's' removed) does return matches. The problem is that the stemmer doesn't take the vowel in the last syllable 'bus' into account: when the character before the final consonant is a vowel (here 'u'), the consonant is doubled in the plural, so the proper Dutch word is "Brievenbussen". Can you please confirm whether this is a problem with the 3.5 version? Please let me know if you need more information.
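One way to verify this report before triaging (a hedged sketch against the 3.5-era analysis API; the exact output depends on the analyzer and version): run both the singular and the plural through the Dutch analysis chain and compare the emitted terms. If they stem to different strings, indexed plurals won't match singular queries, and vice versa.
{code:java}
import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.nl.DutchAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class DutchStemCheck {
  public static void main(String[] args) throws Exception {
    DutchAnalyzer analyzer = new DutchAnalyzer(Version.LUCENE_35);
    for (String word : new String[] {"brievenbus", "Brievenbussen"}) {
      TokenStream ts = analyzer.tokenStream("f", new StringReader(word));
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        System.out.println(word + " -> " + term.toString()); // compare stems
      }
      ts.end();
      ts.close();
    }
  }
}
{code}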
Re: release votes
On Apr 24, 2014, at 8:07 AM, Thomas Koch k...@orbiteam.de wrote:

Hi Andi, I don't agree that it is unimportant to make PyLucene releases. Without a ready-to-run software package the hurdles to use PyLucene are raised. It is already not quite simple (for beginners) to install PyLucene on the various platforms. Having a packaged release that is tested by some users provides a benefit to the community in my opinion.

Relatedly, I have a pull request open to add a pylucene formula to homebrew. So Mac users will be able to simply 'brew install pylucene'. That would be more consistent with having releases.
Re: 4.8 Solr Ref Guide Release Plan
I'm still trying to document some stuff (async core admin calls) but I'm having issues saving that stuff. Once I'm able to save that, I should be done with everything that's on my mind for 4.8 documentation. On Wed, Apr 23, 2014 at 10:33 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : 2) I'll review the TODO list around 24 hours after the first Lucene/Solr 4.8 : RC VOTE is called -- if it doesn't look like anyone is in the middle of : working on stuff, I'll go ahead and cut a ref-guide RC. If it looks like FYI: Tim Potter reached out to me that he's working on documenting the REST Manager stuff today -- so I'll plan on doing the RC around 34 hours from now. If you see any low hanging fruit, jump on it today. -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://www.anshumgupta.net
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979944#comment-13979944 ] Ahmet Arslan commented on SOLR-6011: {code} {!complexphrase df=manu inOrder=true}high* vol* {code} is explained as {code} weight(spanNear([spanOr([manu:high]), spanOr([manu:volume])], 0, true) in 31) {code} and {code} {!complexphrase df=manu inOrder=false}high* vol* {code} is explained as {code} weight(spanNear([spanOr([manu:high]), spanOr([manu:volume])], 0, false) in 31) {code} It looks like the local param {{inOrder}} is correctly propagated to the constructor of {{SpanNearQuery}}. However, both queries return the following example document. {code:xml} <doc><field name="id">100-435805</field><field name="manu">high volume web</field></doc> {code} On the other hand {code} {!complexphrase df=manu inOrder=true}vol* high* {code} and {code} {!complexphrase df=manu inOrder=true}vol* high* {code} do not return the example document. Weird... inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
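For context, here is a small sketch of what the two explains above correspond to at the Lucene level: the same SpanNearQuery clauses with only the inOrder flag flipped (slop 0, field "manu", as in the example; the class and main method are illustrative):
{code}
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class InOrderDemo {
  public static void main(String[] args) {
    SpanQuery[] clauses = new SpanQuery[] {
        new SpanTermQuery(new Term("manu", "high")),
        new SpanTermQuery(new Term("manu", "volume"))
    };
    // inOrder=true: "high" must precede "volume" within the slop
    SpanNearQuery ordered = new SpanNearQuery(clauses, 0, true);
    // inOrder=false: either order should match "high volume web"
    SpanNearQuery unordered = new SpanNearQuery(clauses, 0, false);
    System.out.println(ordered + " vs " + unordered);
  }
}
{code}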
Re: 4.8 Solr Ref Guide Release Plan
: I'm still trying to document some stuff (async core admin calls) but I'm : having issues saving that stuff. : Once I'm able to save that, I should be done with everything that's on my : mind for 4.8 documentation. Yeah ... cwiki seems to be having some performance issues at the moment, so releasing is on hold until it stabilizes and we can finish off some of the final edits and Uwe can update the macro that handles the 4.8 javadoc links. -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-6011: --- Attachment: SOLR-6011.patch Here's a test that fails. I changed the testcase to not use separate schema / solrconfig files (it's crazy to add extra files for this). It was also necessary to switch to a stock solrconfig to reproduce the bugs I was seeing. After a quick look, it looks like hashCode / equals are not implemented correctly (they do not take into account inOrder) and hence the query cache can return incorrect results. I'll work on a fix. inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Attachments: SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
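As background for the fix, here is a generic sketch of the class of bug described above: when equals/hashCode ignore a field such as inOrder, a query cache keyed on the query cannot distinguish the two variants (illustrative class, not Lucene's actual ComplexPhraseQuery):
{code}
// If equals/hashCode omitted inOrder, the inOrder=true and inOrder=false
// variants would collide in a query cache and return each other's results.
public class ExampleQuery {
  private final String phrase;
  private final boolean inOrder;

  public ExampleQuery(String phrase, boolean inOrder) {
    this.phrase = phrase;
    this.inOrder = inOrder;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof ExampleQuery)) return false;
    ExampleQuery other = (ExampleQuery) o;
    // Both fields must participate, otherwise the cache collision occurs.
    return inOrder == other.inOrder && phrase.equals(other.phrase);
  }

  @Override
  public int hashCode() {
    return 31 * phrase.hashCode() + (inOrder ? 1 : 0);
  }
}
{code}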
[jira] [Comment Edited] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979969#comment-13979969 ] Noble Paul edited comment on SOLR-5473 at 4/24/14 5:13 PM: --- hi [~markrmil...@gmail.com] I'm working on a reverse patch for this, and we can move the further development to a dedicated branch Meanwhile I would like to know what your concerns are on the following * The idea of splitting the clusterstate.json itself * The external interface . The public API, the changes we make to the zk nodes (I mean all the public things that impact the user) * The idea of selectively watching states . And other implementation details, Or any other particular solutions which you think are better * The API's which are 'undesirable' I would like to work towards a consensus and resolve this was (Author: noble.paul): hi [~markrmil...@gmail.com] I'm work on a reverse patch for this, and we can move the further development to a dedicated branch Meanwhile I would like to know what your concerns are on the following * The idea of splitting the clusterstate.json itself * The external interface . The public API, the changes we make to the zk nodes (I mean all the public things that impact the user) * The idea of selectively watching states . And other implementation details, Or any other particular solutions which you think are better * The API's which are 'undesirable' I would like to work towards a consensus and resolve this Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979969#comment-13979969 ] Noble Paul commented on SOLR-5473: -- hi [~markrmil...@gmail.com] I'm working on a reverse patch for this, and we can move the further development to a dedicated branch. Meanwhile I would like to know what your concerns are on the following * The idea of splitting the clusterstate.json itself * The external interface . The public API, the changes we make to the zk nodes (I mean all the public things that impact the user) * The idea of selectively watching states . And other implementation details, or any other particular solutions which you think are better * The API's which are 'undesirable' I would like to work towards a consensus and resolve this. Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979983#comment-13979983 ] Mark Miller commented on SOLR-5473: --- Thank you. bq. The idea of splitting the clusterstate.json itself I have no problem with this. bq. The external interface I've already gone into a lot of the points. We can dig more into what is not clear from the above. bq. The idea of selectively watching states This is also probably fine, though I'm not sure it's right to change it by default, and I'm not sure it should be so tied into the idea of splitting the clusterstate.json itself. Breaking things into parts makes it easier to digest and build and document properly. bq. The API's which are 'undesirable' I go into that above - again, I can answer specific questions. Look at all the get collection methods - look at all the crazy different behaviors depending on what you call - look at the lack of documentation. Future developers will be hopelessly lost. Anyway, I've brought up enough issues above to get started on understanding what some of the current problems are. If you look at the API's now, you can see it's just a mess. It all seems to work okay, and that is good, but it needs to be done thoughtfully as well, and I don't think anyone can easily deal with the API's as they are. bq. I would like to work towards a consensus and resolve this I'm sure that can be done - I do think there is a lot to do and it's too core to rush it in. I think a good approach would be to break it up and do things in discrete parts - e.g. splitting up clusterstate.json seems independent of a lot of these other changes. That part is not the most important part though - mostly, we have to get to some well documented API's that make sense - especially on 5x where we don't even necessarily have back compat concerns. Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-6011: --- Attachment: SOLR-6011.patch OK, here's an updated patch that fixes the issue. inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene/Solr 4.8.0 RC1
I ran into a very serious bug during my manual testing of 4.8 that I think warrants a respin. https://issues.apache.org/jira/browse/SOLR-6011 In a normal Solr setup, incorrect results are returned from complex phrase queries if inOrder is ever changed for the same query. This would be maddening for most users to try and track down. -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5819) Investigate & reduce size of ref-guide PDF
[ https://issues.apache.org/jira/browse/SOLR-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980009#comment-13980009 ] Hoss Man commented on SOLR-5819: Since the CWIKI upgrade still hasn't happened, I used rmuir's iText as the basis for a new micro-project on github... https://github.com/hossman/pdf-shrinker ...people building the ref guide can manually run this to reduce the PDF size until the CWIKI upgrade is complete. Investigate & reduce size of ref-guide PDF -- Key: SOLR-5819 URL: https://issues.apache.org/jira/browse/SOLR-5819 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: img-0007.png, img-0008.png, img-0009.png, img-0010.png, img-0011.png, img-0012.png, img-0013.png, img-0014.png As noted on the solr-user mailing list in response to the ANNOUNCE about the 4.7 ref guide, the sizes of the 4.4, 4.5 & 4.6 PDF files were all under 5MB, but the 4.7 PDF was 30MB. opening this issue to track trying to reduce this -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5819) Investigate & reduce size of ref-guide PDF
[ https://issues.apache.org/jira/browse/SOLR-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-5819: --- Description: As noted on the solr-user mailing list in response to the ANNOUNCE about the 4.7 ref guide, the sizes of the 4.4, 4.5 & 4.6 PDF files were all under 5MB, but the 4.7 PDF was 30MB. We've determined that the root cause is a bug in Confluence 5.0 (related to duplicating images) that is fixed in 5.4.3 -- the next version Infra currently plans to upgrade to. Until such time as the upgrade is finished, a workaround is to use a manual PDF-shrinking tool such as this one to eliminate the duplication... https://github.com/hossman/pdf-shrinker was: As noted on the solr-user mailing list in response to the ANNOUNCE about the 4.7 ref guide, the sizes of the 4.4, 4.5 & 4.6 PDF files were all under 5MB, but the 4.7 PDF was 30MB. opening this issue to track trying to reduce this Investigate & reduce size of ref-guide PDF -- Key: SOLR-5819 URL: https://issues.apache.org/jira/browse/SOLR-5819 Project: Solr Issue Type: Improvement Reporter: Hoss Man Attachments: img-0007.png, img-0008.png, img-0009.png, img-0010.png, img-0011.png, img-0012.png, img-0013.png, img-0014.png As noted on the solr-user mailing list in response to the ANNOUNCE about the 4.7 ref guide, the sizes of the 4.4, 4.5 & 4.6 PDF files were all under 5MB, but the 4.7 PDF was 30MB. We've determined that the root cause is a bug in Confluence 5.0 (related to duplicating images) that is fixed in 5.4.3 -- the next version Infra currently plans to upgrade to. Until such time as the upgrade is finished, a workaround is to use a manual PDF-shrinking tool such as this one to eliminate the duplication... https://github.com/hossman/pdf-shrinker -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980013#comment-13980013 ] Noble Paul commented on SOLR-5473: -- bq. The external interface bq. I've already gone into a lot of the points. We can dig more into what is not clear from the above. The stateFormat attribute in the collection, and the state node having the .json suffix. Please add anything I have missed. bq. The idea of selectively watching states bq. This is also probably fine, though I'm not sure it's right to change it by default, and I'm not sure it should be so tied into the idea of splitting the clusterstate.json itself. Breaking things into parts makes it easier to digest and build and document properly. I'm not sure if it is possible to split them completely. The moment I split the states, my choices are # all nodes watch all the collections # nodes selectively watch collections # nodes watch no collections and read them all in real time One or more of the three solutions needs to be newly built into the system, and I have only added the 2nd because I thought only that would be useful. Do you think the other solutions are worthwhile to build, or can you think of a better solution I probably would have missed? bq. The API's which are 'undesirable' I will take another look at these. Meanwhile, if you can visualize what the API's should look like, please post them here. Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [VOTE] Lucene/Solr 4.8.0 RC1
OK, I'll wait for the fix. Is this a new bug in 4.8? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, April 24, 2014 7:33 PM To: Lucene/Solr Dev Subject: Re: [VOTE] Lucene/Solr 4.8.0 RC1 I ran into a very serious bug during my manual testing of 4.8 that I think warrants a respin. https://issues.apache.org/jira/browse/SOLR-6011 In a normal Solr setup, incorrect results are returned from complex phrase queries if inOrder is ever changed for the same query. This would be maddening for most users to try and track down. -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6011: Fix Version/s: 5.0 4.9 4.8 inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene/Solr 4.8.0 RC1
On Thu, Apr 24, 2014 at 1:57 PM, Uwe Schindler u...@thetaphi.de wrote: OK, I'll wait for the fix. Is this a new bug in 4.8? OK, thanks. It's an old bug in Lucene, but a new bug in Solr (since complex phrase queries weren't exposed before). I'll commit now. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5626: -- Priority: Blocker (was: Major) SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5626: -- Fix Version/s: 4.8 SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reopened LUCENE-5626: --- As we respin 4.8, I will backport this one, too, because otherwise it could happen that somebody (like me) hits this while smoke testing... SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread "main" java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980035#comment-13980035 ] ASF subversion and git services commented on LUCENE-5626: - Commit 1589811 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_8' [ https://svn.apache.org/r1589811 ] Merged revision(s) 1589397 from lucene/dev/branches/branch_4x: Merged revision(s) 1589394 from lucene/dev/trunk: LUCENE-5626: Fix bug in SimpleFSLockFactory's obtain() that sometimes throwed IOException (ERROR_ACESS_DENIED) on Windows if the lock file was created concurrently SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
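A minimal sketch of the pattern the commit message describes -- not the committed code, which may additionally record the exception for diagnostics: a createNewFile() race on Windows can surface as ERROR_ACCESS_DENIED, so obtain() treats the IOException as "lock not acquired" instead of propagating it.
{code}
import java.io.File;
import java.io.IOException;

public class LockObtainSketch {
  public static boolean obtain(File lockFile) {
    try {
      // Atomically creates the file; returns false if it already exists.
      return lockFile.createNewFile();
    } catch (IOException e) {
      // Lost a concurrent-creation race; report the lock as unavailable
      // so the caller can retry, rather than failing with an exception.
      return false;
    }
  }
}
{code}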
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980037#comment-13980037 ] ASF subversion and git services commented on SOLR-6011: --- Commit 1589812 from [~yo...@apache.org] in branch 'dev/trunk' [ https://svn.apache.org/r1589812 ] SOLR-6011: ComplexPhraseQuery hashCode/equals fix for inOrder inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980046#comment-13980046 ] ASF subversion and git services commented on LUCENE-5626: - Commit 1589813 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1589813 ] LUCENE-5626: Move changes entry SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
[ https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron LaBella updated SOLR-6013: Attachment: 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch Attaching the patch for review/comments. Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility -- Key: SOLR-6013 URL: https://issues.apache.org/jira/browse/SOLR-6013 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 4.7 Reporter: Aaron LaBella Fix For: 4.8 Attachments: 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch Original Estimate: 1h Remaining Estimate: 1h This is similar to issue 5981: the Evaluator class is declared abstract, yet the parseParams method is package-private? Surely this is an oversight, as I wouldn't expect everyone writing their own evaluators to have to deal with parsing the parameters. Similarly, I needed to refactor DateFormatEvaluator because I need to do some custom date math/parsing and it wasn't written in a way that I can extend it. Please review/apply my attached patch to the next version of Solr, ie: 4.8 or 4.9 if I must wait. Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980049#comment-13980049 ] ASF subversion and git services commented on LUCENE-5626: - Commit 1589814 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1589814 ] Merged revision(s) 1589813 from lucene/dev/trunk: LUCENE-5626: Move changes entry SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980054#comment-13980054 ] ASF subversion and git services commented on LUCENE-5626: - Commit 1589824 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_8' [ https://svn.apache.org/r1589824 ] Merged revision(s) 1589814 from lucene/dev/branches/branch_4x: Merged revision(s) 1589813 from lucene/dev/trunk: LUCENE-5626: Move changes entry (merge props only) SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
Aaron LaBella created SOLR-6013: --- Summary: Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility Key: SOLR-6013 URL: https://issues.apache.org/jira/browse/SOLR-6013 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 4.7 Reporter: Aaron LaBella Fix For: 4.8 Attachments: 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch This is similar to issue 5981: the Evaluator class is declared abstract, yet the parseParams method is package-private? Surely this is an oversight, as I wouldn't expect everyone writing their own evaluators to have to deal with parsing the parameters. Similarly, I needed to refactor DateFormatEvaluator because I need to do some custom date math/parsing and it wasn't written in a way that I can extend it. Please review/apply my attached patch to the next version of Solr, ie: 4.8 or 4.9 if I must wait. Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
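A hypothetical sketch of the kind of extension the patch is meant to enable, assuming the DIH Evaluator contract of a single evaluate(String, Context) method; the class name and date logic below are illustrative, not part of Solr:
{code}
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Evaluator;

public class FiscalDateEvaluator extends Evaluator {
  @Override
  public String evaluate(String expression, Context context) {
    // With parseParams() made visible to subclasses, the parameter parsing
    // done by DateFormatEvaluator would not need to be re-implemented
    // here -- which is exactly the visibility change the patch asks for.
    return expression; // placeholder for custom date math/formatting
  }
}
{code}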
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980053#comment-13980053 ] Timothy Potter commented on SOLR-5473: -- Thought I'd add my 2 cents on this one as I've worked on some of this code and want to get a better sense of how to move forward. Reverting and moving out to a branch sounds like a good idea. In general, I think it would be good to split the discussion about this topic into 3 sections: 1) overall design / architecture, 2) implementation and impact on public API, 3) testing. Moving forward we should start with identifying where we have common ground in these areas and which aspects are more controversial and need more hashing out between us. Here's what I think I know, but please correct where I'm off-base: 1) Overall Design / Architecture It sounds like we're all on-board with splitting cluster state into a per-collection state znode. Do we intend to support both formats or do we intend to just migrate to the split approach? I think the answer is the latter: going forward, SolrCloud will keep state in a separate znode per collection. Noble's idea is that once the state is split, cores only need to watch the znode for the collection/shard they are linked to. In other words, each SolrCore watches a specific state znode and thus does not receive any state change updates for other collections. In terms of what's watched and what is not watched, this patch includes code from 5474 (as they were too intimately tied together to keep separated), which doesn't watch collection state changes on the client side. Instead the client relies on a _stateVer_ check during request processing and receives an error from the server if the client state is stale. I too think this is a little controversial / confusing and maybe we don't have to keep that as part of this solution. It was our mistake to merge those two into a single patch. We originally were thinking 5474 was needed to keep the number of watchers on a znode to a minimum in the event of many clients using many collections. However, I do think this feature can be split out and dealt with in a better way, if at all. In other words, split state znodes are watched from both the server and the client side. Are there any other things, design / architecture wise, that are controversial? 2) Implementation (and API impact) This seems like the biggest area of contention right now. The main issue is that the API changes still give the impression of two state tracking formats, whereas we really only want one format. The common ground here is that there should be no mention of 'external' in any public method or state format for that matter, right? Noble: Assuming we're moving forward with stateFormat == 2 and the unified /clusterstate.json is going away, is it possible to not change any of the existing public methods? In other words, we're changing the internals of where state is kept, so why does that have to impact the public API? If not, let's come up with a plan for each change and how we can minimize the impact of this. It seems to me that we need to be more diligent about API impacts of this change and focus on not breaking the public view of cluster state as much as possible. It would be helpful to have a bullet list of API impacts that are needed for this so we don't have to scour the patch looking for all possible changes. 3) Testing I just wanted to mention that we've been doing a fair amount of integration testing with hundreds of external collections per cluster.
So I realize this is a big change, but we have been testing this extensively in our QA labs. I only mention this so that others know that we have been concentrating on hardening this feature over the past couple of months. Once we sort out the API problems, I'm confident that this approach will be solid. To recap, I see a lot of common ground here, and to move forward we need to move this out to a branch and off trunk, where we'll focus on cleaning up the API impacts of this work and support only the split format going forward (with a migration plan for existing installations). We also want to revisit the thinking behind not watching state changes on the client, as that wasn't clear in the patch to this point. Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
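As a rough illustration of the "selectively watching states" idea discussed above, here is a hedged sketch using the raw ZooKeeper API rather than Solr's ZkStateReader (whose actual interfaces are exactly what is under discussion): a core registers a watch only on its own collection's state znode, so it never sees other collections' updates.
{code}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class CollectionStateWatcher implements Watcher {
  private final ZooKeeper zk;
  private final String path; // e.g. /collections/<name>/state.json

  public CollectionStateWatcher(ZooKeeper zk, String collection) {
    this.zk = zk;
    this.path = "/collections/" + collection + "/state.json";
  }

  public byte[] readAndWatch() throws Exception {
    // Registers this object as the watch; only changes to this one
    // collection's state znode will trigger process().
    return zk.getData(path, this, new Stat());
  }

  @Override
  public void process(WatchedEvent event) {
    // ZooKeeper watches are one-shot: re-read and re-register on change.
    try { readAndWatch(); } catch (Exception ignored) {}
  }
}
{code}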
[jira] [Resolved] (LUCENE-5626) SimpleFSLockFactory access denied on windows.
[ https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-5626. --- Resolution: Fixed Backported to 4.8 for respin. SimpleFSLockFactory access denied on windows. --- Key: LUCENE-5626 URL: https://issues.apache.org/jira/browse/LUCENE-5626 Project: Lucene - Core Issue Type: Bug Components: core/store Reporter: Robert Muir Assignee: Uwe Schindler Priority: Blocker Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5626.patch, LUCENE-5626.patch This happened twice in jenkins: {noformat} [lockStressTest2] Exception in thread main java.io.IOException: Access is denied [lockStressTest2] at java.io.WinNTFileSystem.createFileExclusively(Native Method) [lockStressTest2] at java.io.File.createNewFile(File.java:1012) [lockStressTest2] at org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135) {noformat} My windows machine got struck by lightning, so I cannot fix this easily. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980062#comment-13980062 ] ASF subversion and git services commented on SOLR-6011: --- Commit 1589826 from [~yo...@apache.org] in branch 'dev/branches/lucene_solr_4_8' [ https://svn.apache.org/r1589826 ] SOLR-6011: ComplexPhraseQuery hashCode/equals fix for inOrder inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5981) Please change method visibility of getSolrWriter in DataImportHandler to public (or at least protected)
[ https://issues.apache.org/jira/browse/SOLR-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980061#comment-13980061 ] Aaron LaBella commented on SOLR-5981: - I'm not seeing this fix in the git mirror of lucene-solr. I'm also wondering why it was moved from 4.8 into 4.9; I thought it was ready to go? Thanks. Please change method visibility of getSolrWriter in DataImportHandler to public (or at least protected) --- Key: SOLR-5981 URL: https://issues.apache.org/jira/browse/SOLR-5981 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 4.0 Environment: Linux 3.13.9-200.fc20.x86_64 Solr 4.6.0 Reporter: Aaron LaBella Assignee: Shawn Heisey Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-5981.patch Original Estimate: 1h Remaining Estimate: 1h I've been using the org.apache.solr.handler.dataimport.DataImportHandler for a bit and it's an excellent model and architecture. I'd like to extend the usage of it to plug in my own DIHWriter, but the code doesn't allow for it. Please change ~line 227 in the DataImportHandler class to be: public SolrWriter getSolrWriter instead of: private SolrWriter getSolrWriter or, at a minimum, protected, so that I can extend DataImportHandler and override this method. Thank you *sincerely* in advance for the quick turn-around on this. If the change can be made in 4.6.0 and upstream, that'd be ideal. Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980060#comment-13980060 ] ASF subversion and git services commented on SOLR-6011: --- Commit 1589825 from [~yo...@apache.org] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1589825 ] SOLR-6011: ComplexPhraseQuery hashCode/equals fix for inOrder inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene/Solr 4.8.0 RC1
On Thu, Apr 24, 2014 at 2:03 PM, Yonik Seeley yo...@heliosearch.com wrote: On Thu, Apr 24, 2014 at 1:57 PM, Uwe Schindler u...@thetaphi.de wrote: OK, I'll wait for the fix. Is this a new bug in 4.8? OK, thanks. It's an old bug in Lucene, but a new bug in Solr (since complex phrase queries weren't exposed before). I'll commit now. Actually, it looks like it was a new bug in Lucene as well, since inOrder was just added for 4.8 in https://issues.apache.org/jira/browse/LUCENE-3758 Anyway, the fix is now in the 4.8 branch. -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980067#comment-13980067 ] Uwe Schindler commented on SOLR-6011: - Somehow, because of a merge conflict, the changes entry got lost in 4.x and 4.8. Sorry for the concurrent merge! inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980071#comment-13980071 ] Yonik Seeley commented on SOLR-6011: That's OK... I realized that it was a new bug for both lucene and solr, so no need to be in CHANGES at all... inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-6011. Resolution: Fixed inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/inde x/TestTerms.java
On Thu, Apr 24, 2014 at 12:31 PM, Michael McCandless luc...@mikemccandless.com wrote: Yeah, I didn't want to disable the full test, just that one method, because I want Terms.getMin/Max testing for Lucene3x too. Would be nice if we could @SuppressCodecs for just one method ... That's impossible: many tests create their indexes in @BeforeClass, so codec selection really must be a per-class thing. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980087#comment-13980087 ] ASF subversion and git services commented on SOLR-6011: --- Commit 1589832 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1589832 ] SOLR-6011: Remove changes entry also in trunk inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6011) inOrder does not work with the complexphrase parser
[ https://issues.apache.org/jira/browse/SOLR-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13980090#comment-13980090 ] Uwe Schindler commented on SOLR-6011: - OK, I removed the changes entry in trunk, too. inOrder does not work with the complexphrase parser --- Key: SOLR-6011 URL: https://issues.apache.org/jira/browse/SOLR-6011 Project: Solr Issue Type: Bug Reporter: Yonik Seeley Priority: Critical Fix For: 4.8, 4.9, 5.0 Attachments: SOLR-6011.patch, SOLR-6011.patch {code} {!complexphrase}vol* high* does not match the Solr document containing ... high volume web ... (this is correct) But adding inOrder=false still fails to make it match. {!complexphrase inOrder=false}vol* high* {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/inde x/TestTerms.java
Another possibility that works is to move this test to a separate class, annotated with @SuppressCodecs (if it does not depend on indexes created in @BeforeClass). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Thursday, April 24, 2014 8:41 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1589782 - /lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestTerms.java On Thu, Apr 24, 2014 at 12:31 PM, Michael McCandless luc...@mikemccandless.com wrote: Yeah, I didn't want to disable the full test, just that one method, because I want Terms.getMin/Max testing for Lucene3x too. Would be nice if we could @SuppressCodecs for just one method ... That's impossible: many tests create their indexes in @BeforeClass, so codec selection really must be a per-class thing. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
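A sketch of the alternative Uwe suggests: the binary-terms method moves into its own test class, and the class-level annotation keeps Lucene3x from ever being selected for it (the class name below is hypothetical):
{code}
import org.apache.lucene.util.LuceneTestCase;
import org.apache.lucene.util.LuceneTestCase.SuppressCodecs;

// Applies to every method in the class, which is why it only helps once
// the method lives in a class of its own.
@SuppressCodecs("Lucene3x")
public class TestTermsBinary extends LuceneTestCase {
  public void testTermMinMaxRandom() throws Exception {
    // ... index random binary terms and verify Terms.getMin()/getMax() ...
  }
}
{code}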
RE: [VOTE] Lucene/Solr 4.8.0 RC1
OK, I'll wait a bit and respin before going to bed (to give Jenkins a chance to test it finally) :-) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Thursday, April 24, 2014 8:31 PM To: Lucene/Solr Dev Subject: Re: [VOTE] Lucene/Solr 4.8.0 RC1 On Thu, Apr 24, 2014 at 2:03 PM, Yonik Seeley yo...@heliosearch.com wrote: On Thu, Apr 24, 2014 at 1:57 PM, Uwe Schindler u...@thetaphi.de wrote: OK, I'll wait for the fix. Is this a new bug in 4.8? OK, thanks. It's an old bug in Lucene, but a new bug in Solr (since complex phrase queries weren't exposed before). I'll commit now. Actually, it looks like it was a new bug in Lucene as well, since inOrder was just added for 4.8 in https://issues.apache.org/jira/browse/LUCENE-3758 Anyway, the fix is now in the 4.8 branch. -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1589838 - in /lucene/dev/branches/branch_4x: ./ lucene/ lucene/analysis/ lucene/analysis/common/src/resources/META-INF/services/ lucene/analysis/common/src/test/org/apache/lucene/anal
Can you commit this to 4.8? Otherwise the uppercase factory does not work in Solr. There is still time to do this. Uwe

On 24 April 2014 21:18:40 MESZ, rm...@apache.org wrote:

Author: rmuir
Date: Thu Apr 24 19:18:39 2014
New Revision: 1589838
URL: http://svn.apache.org/r1589838
Log: fix TestAllAnalyzersHaveFactories to actually work, and add missing SPI entry

Modified:
    lucene/dev/branches/branch_4x/ (props changed)
    lucene/dev/branches/branch_4x/lucene/ (props changed)
    lucene/dev/branches/branch_4x/lucene/analysis/ (props changed)
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/resources/META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory
    lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestAllAnalyzersHaveFactories.java

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/resources/META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/resources/META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory?rev=1589838&r1=1589837&r2=1589838&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/resources/META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/resources/META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory Thu Apr 24 19:18:39 2014
@@ -30,6 +30,7 @@ org.apache.lucene.analysis.compound.Hyph
 org.apache.lucene.analysis.core.LowerCaseFilterFactory
 org.apache.lucene.analysis.core.StopFilterFactory
 org.apache.lucene.analysis.core.TypeTokenFilterFactory
+org.apache.lucene.analysis.core.UpperCaseFilterFactory
 org.apache.lucene.analysis.cz.CzechStemFilterFactory
 org.apache.lucene.analysis.de.GermanLightStemFilterFactory
 org.apache.lucene.analysis.de.GermanMinimalStemFilterFactory

Modified: lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestAllAnalyzersHaveFactories.java
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestAllAnalyzersHaveFactories.java?rev=1589838&r1=1589837&r2=1589838&view=diff
==============================================================================
--- lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestAllAnalyzersHaveFactories.java (original)
+++ lucene/dev/branches/branch_4x/lucene/analysis/common/src/test/org/apache/lucene/analysis/core/TestAllAnalyzersHaveFactories.java Thu Apr 24 19:18:39 2014
@@ -130,6 +130,7 @@ public class TestAllAnalyzersHaveFactori
         || crazyComponents.contains(c)
         || oddlyNamedComponents.contains(c)
         || deprecatedDuplicatedComponents.contains(c)
+        || c.isAnnotationPresent(Deprecated.class) // deprecated ones are typically back compat hacks
         || !(Tokenizer.class.isAssignableFrom(c) || TokenFilter.class.isAssignableFrom(c) || CharFilter.class.isAssignableFrom(c))
       ) {
         continue;
@@ -151,7 +152,7 @@ public class TestAllAnalyzersHaveFactori
       }
       assertSame(c, instance.create(new StringReader("")).getClass());
     } catch (IllegalArgumentException e) {
-      if (!e.getMessage().contains("SPI")) {
+      if (!e.getMessage().contains("SPI") || e.getMessage().contains("does not exist")) {
         throw e;
       }
       // TODO: For now pass because some factories have not yet a default config that always works
@@ -173,7 +174,7 @@ public class TestAllAnalyzersHaveFactori
         assertSame(c, createdClazz);
       }
     } catch (IllegalArgumentException e) {
-      if (!e.getMessage().contains("SPI")) {
+      if (!e.getMessage().contains("SPI") || e.getMessage().contains("does not exist")) {
         throw e;
       }
       // TODO: For now pass because some factories have not yet a default config that always works
@@ -195,7 +196,7 @@ public class TestAllAnalyzersHaveFactori
         assertSame(c, createdClazz);
       }
     } catch (IllegalArgumentException e) {
-      if (!e.getMessage().contains("SPI")) {
+      if (!e.getMessage().contains("SPI") || e.getMessage().contains("does not exist")) {
         throw e;
       }
       // TODO: For now pass because some factories have not yet a default config that always works

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de
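For context: analysis factories are resolved by name through Java's service-provider (SPI) mechanism, which is why the missing one-line services entry broke UpperCaseFilterFactory in Solr. A minimal sketch of such a lookup, assuming the Lucene 4.x analysis API (the demo class name and the luceneMatchVersion value are illustrative, not part of the commit):

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.util.TokenFilterFactory;

// Hypothetical demo (not from the commit): resolve a factory by its SPI
// name, essentially what Solr does for a <filter .../> entry in a schema.
public class SpiLookupDemo {
  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put("luceneMatchVersion", "4.8");
    // Scans META-INF/services/org.apache.lucene.analysis.util.TokenFilterFactory.
    // Before r1589838 the "upperCase" entry was missing, so this threw an
    // IllegalArgumentException instead of returning the factory.
    TokenFilterFactory factory = TokenFilterFactory.forName("upperCase", params);
    System.out.println(factory.getClass().getName());
  }
}
{code}

Before this commit, the "upperCase" lookup threw an IllegalArgumentException whose message ends in "does not exist", which is exactly the message the fixed test now refuses to swallow.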
[jira] [Created] (LUCENE-5630) Improve TestAllAnalyzersHaveFactories
Robert Muir created LUCENE-5630: --- Summary: Improve TestAllAnalyzersHaveFactories Key: LUCENE-5630 URL: https://issues.apache.org/jira/browse/LUCENE-5630 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir This test wasn't working at all; it would always pass. It is sensitive to the strings inside exception messages, so if we change those, it might silently stop working. It would be great to improve this test to be less fragile. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
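To see why string matching makes the test fragile, here is a sketch of the old pattern, reconstructed from the diff above rather than quoted verbatim (the demo class and the luceneMatchVersion value are placeholders for the configuration the real test builds):

{code:java}
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.util.TokenFilterFactory;

// Reconstructed sketch: the old check swallowed every
// IllegalArgumentException whose message mentioned "SPI".
public class FragileCheckDemo {
  public static void main(String[] argv) {
    Map<String, String> args = new HashMap<>();
    args.put("luceneMatchVersion", "4.8");
    try {
      TokenFilterFactory.forName("upperCase", args);
    } catch (IllegalArgumentException e) {
      if (!e.getMessage().contains("SPI")) {
        throw e; // only failures whose message lacks "SPI" fail the test
      }
      // Swallowed: "A SPI class ... with name 'upperCase' does not exist"
      // also mentions "SPI", so a missing factory never failed the test.
    }
  }
}
{code}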
[jira] [Updated] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
[ https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron LaBella updated SOLR-6013: Attachment: 0001-add-getters-for-datemathparser.patch One more small patch after fully testing the changes for extensibility. Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility -- Key: SOLR-6013 URL: https://issues.apache.org/jira/browse/SOLR-6013 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Affects Versions: 4.7 Reporter: Aaron LaBella Fix For: 4.8 Attachments: 0001-add-getters-for-datemathparser.patch, 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch Original Estimate: 1h Remaining Estimate: 1h This is similar to issue 5981: the Evaluator class is declared abstract, yet the parseParams method is package-private? Surely this is an oversight, as I wouldn't expect everyone writing their own evaluators to have to deal with parsing the parameters. Similarly, I needed to refactor DateFormatEvaluator because I need to do some custom date math/parsing and it wasn't written in a way that I could extend it. Please review/apply my attached patch to the next version of Solr, i.e. 4.8, or 4.9 if I must wait. Thanks! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
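For readers who have not extended DIH before, a rough sketch of the kind of subclass the visibility fix enables. UpperCaseEvaluator is a made-up example; the Evaluator/Context/parseParams signatures follow the Solr 4.x DataImportHandler API as I understand it, and calling parseParams from a subclass is precisely what was impossible while it stayed package-private:

{code:java}
import java.util.List;
import java.util.Locale;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Evaluator;

// Hypothetical custom evaluator, only writable outside the DIH package
// once parseParams is no longer package-private.
public class UpperCaseEvaluator extends Evaluator {
  @Override
  public String evaluate(String expression, Context context) {
    // Parse the ${...} argument list the same way built-in evaluators do.
    List<Object> params = parseParams(expression, context.getVariableResolver());
    Object value = params.isEmpty() ? "" : params.get(0);
    return String.valueOf(value).toUpperCase(Locale.ROOT);
  }
}
{code}

Such a class would be wired in via a <function name="upper" class="com.example.UpperCaseEvaluator"/> element in data-config.xml and invoked as ${dataimporter.functions.upper(...)}, following the usual DIH evaluator registration.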
[jira] [Commented] (LUCENE-5559) Argument validation for TokenFilters having numeric constructor parameter(s)
[ https://issues.apache.org/jira/browse/LUCENE-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980175#comment-13980175 ] Ahmet Arslan commented on LUCENE-5559: -- Pinging [~rcmuir] in case there is interest in the last patch, which covers two overlooked TokenFilters: {{CapitalizationFilter}} and {{CodepointCountFilter}}. Argument validation for TokenFilters having numeric constructor parameter(s) Key: LUCENE-5559 URL: https://issues.apache.org/jira/browse/LUCENE-5559 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.7 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch Some TokenFilters have numeric arguments in their constructors. They should throw {{IllegalArgumentException}} for negative or meaningless values. Here are some examples that demonstrate invalid/meaningless arguments:
{code:xml}
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="-10" />
{code}
{code:xml}
<filter class="solr.LengthFilterFactory" min="-5" max="-1" />
{code}
{code:xml}
<filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="-3" />
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
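The requested validation is simple to picture. The filter below is a made-up example (not from any of the attached patches) showing the fail-fast constructor check the issue asks for:

{code:java}
import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Hypothetical filter illustrating the pattern: validate numeric
// constructor arguments eagerly and fail loudly.
public final class BoundedTokenFilter extends TokenFilter {
  private final int maxTokenCount;
  private int count;

  public BoundedTokenFilter(TokenStream in, int maxTokenCount) {
    super(in);
    if (maxTokenCount < 1) {
      // Fail at configuration time instead of silently misbehaving later.
      throw new IllegalArgumentException(
          "maxTokenCount must be greater than zero (got " + maxTokenCount + ")");
    }
    this.maxTokenCount = maxTokenCount;
  }

  @Override
  public boolean incrementToken() throws IOException {
    // Emit at most maxTokenCount tokens, then stop.
    if (count < maxTokenCount && input.incrementToken()) {
      count++;
      return true;
    }
    return false;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    count = 0;
  }
}
{code}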
[jira] [Commented] (LUCENE-5559) Argument validation for TokenFilters having numeric constructor parameter(s)
[ https://issues.apache.org/jira/browse/LUCENE-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980187#comment-13980187 ] Robert Muir commented on LUCENE-5559: - Oops, looks like I missed this patch? Thanks Ahmet, I will take care of it. Argument validation for TokenFilters having numeric constructor parameter(s) Key: LUCENE-5559 URL: https://issues.apache.org/jira/browse/LUCENE-5559 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.7 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch Some TokenFilters have numeric arguments in their constructors. They should throw {{IllegalArgumentException}} for negative or meaningless values. Here are some examples that demonstrate invalid/meaningless arguments:
{code:xml}
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="-10" />
{code}
{code:xml}
<filter class="solr.LengthFilterFactory" min="-5" max="-1" />
{code}
{code:xml}
<filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="-3" />
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5559) Argument validation for TokenFilters having numeric constructor parameter(s)
[ https://issues.apache.org/jira/browse/LUCENE-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980202#comment-13980202 ] Ahmet Arslan commented on LUCENE-5559: -- bq. looks like I missed this patch? No, actually I found those two after your commit. Argument validation for TokenFilters having numeric constructor parameter(s) Key: LUCENE-5559 URL: https://issues.apache.org/jira/browse/LUCENE-5559 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.7 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.8, 5.0 Attachments: LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch, LUCENE-5559.patch Some TokenFilters have numeric arguments in their constructors. They should throw {{IllegalArgumentException}} for negative or meaningless values. Here are some examples that demonstrate invalid/meaningless arguments:
{code:xml}
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="-10" />
{code}
{code:xml}
<filter class="solr.LengthFilterFactory" min="-5" max="-1" />
{code}
{code:xml}
<filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="-3" />
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5630) Improve TestAllAnalyzersHaveFactories
[ https://issues.apache.org/jira/browse/LUCENE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5630: -- Attachment: LUCENE-5630.patch This patch fixes the issue. In fact the whole check was wrong and really too fragile. The new approach is 100% safe:
- Separately look up the class before doing anything. If this throws any exception, the component is really missing.
- Then do the same checks as before (to actually verify that instantiation works), but don't check the message; that's easier. The newInstance method throws IAE, which wraps a NoSuchMethodException, so just check IAE#getCause().
I will commit this fix to all three branches. Improve TestAllAnalyzersHaveFactories - Key: LUCENE-5630 URL: https://issues.apache.org/jira/browse/LUCENE-5630 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5630.patch This test wasn't working at all; it would always pass. It is sensitive to the strings inside exception messages, so if we change those, it might silently stop working. It would be great to improve this test to be less fragile. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
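A sketch of the two-step check described above, assuming the Lucene 4.x SPI entry points lookupClass and forName (the helper name and assertion style are mine; the exact code in LUCENE-5630.patch may differ):

{code:java}
import java.util.Map;

import org.apache.lucene.analysis.util.TokenFilterFactory;

// Hedged sketch of the safer check: resolve the SPI name first, then
// instantiate and inspect IAE#getCause() instead of matching messages.
public class FactoryCheck {
  static void checkFactory(Class<?> expected, String simpleName, Map<String, String> args) {
    // Step 1: resolve the SPI name on its own. If this throws, the
    // component is genuinely missing from META-INF/services.
    Class<? extends TokenFilterFactory> clazz = TokenFilterFactory.lookupClass(simpleName);
    if (!expected.equals(clazz)) {
      throw new AssertionError("SPI entry points at wrong class: " + clazz.getName());
    }
    // Step 2: instantiate. newInstance wraps a NoSuchMethodException in an
    // IllegalArgumentException when no suitable constructor/config exists.
    try {
      TokenFilterFactory instance = TokenFilterFactory.forName(simpleName, args);
      if (!expected.equals(instance.getClass())) {
        throw new AssertionError("forName returned wrong class: " + instance.getClass().getName());
      }
    } catch (IllegalArgumentException e) {
      if (!(e.getCause() instanceof NoSuchMethodException)) {
        throw e; // anything else is a real failure
      }
      // Tolerated: this factory has no default config that always works.
    }
  }
}
{code}

The point of the design is that the failure mode no longer depends on wording: a missing SPI entry fails step 1 unconditionally, while step 2 distinguishes "no usable default constructor" from every other error by exception type rather than by message text.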
[jira] [Updated] (LUCENE-5630) Improve TestAllAnalyzersHaveFactories
[ https://issues.apache.org/jira/browse/LUCENE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5630: -- Fix Version/s: 5.0 4.9 4.8 Improve TestAllAnalyzersHaveFactories - Key: LUCENE-5630 URL: https://issues.apache.org/jira/browse/LUCENE-5630 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Reporter: Robert Muir Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5630.patch This test wasn't working at all; it would always pass. It is sensitive to the strings inside exception messages, so if we change those, it might silently stop working. It would be great to improve this test to be less fragile. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5630) Improve TestAllAnalyzersHaveFactories
[ https://issues.apache.org/jira/browse/LUCENE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980219#comment-13980219 ] Robert Muir commented on LUCENE-5630: - Looks great, thanks! Improve TestAllAnalyzersHaveFactories - Key: LUCENE-5630 URL: https://issues.apache.org/jira/browse/LUCENE-5630 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5630.patch This test wasn't working at all; it would always pass. It is sensitive to the strings inside exception messages, so if we change those, it might silently stop working. It would be great to improve this test to be less fragile. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-5630) Improve TestAllAnalyzersHaveFactories
[ https://issues.apache.org/jira/browse/LUCENE-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-5630: - Assignee: Uwe Schindler Improve TestAllAnalyzersHaveFactories - Key: LUCENE-5630 URL: https://issues.apache.org/jira/browse/LUCENE-5630 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.8, 4.9, 5.0 Attachments: LUCENE-5630.patch This test wasn't working at all; it would always pass. It is sensitive to the strings inside exception messages, so if we change those, it might silently stop working. It would be great to improve this test to be less fragile. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org