Re: lucene 2.9 sorting algorithm

2009-10-14 Thread Yonik Seeley
Interesting idea... though one further piece of info in the mix is that large segments are typically processed first, and tend to fill up the priority queue. Conversion from one segment to another is only done as needed... only the bottom slot is converted automatically when the segment is switche

lucene 2.9 sorting algorithm

2009-10-14 Thread John Wang
Hi guys: Looking at the 2.9 sorting algorithm, and while trying to understand FieldComparator class, I was wondering about the following optimization: (I am using StringOrdValComparator as an example) Currently we have 1 instance of per segment data structure, e.g. (ords,vals etc.), and we kee

Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-14 Thread Michael Busch
I will send the latest version of the draft to java-user in a few days if nobody objects. Michael On 10/13/09 3:10 PM, Michael Busch wrote: OK, I made the draft a bit "more neutral" by pointing out the downsides more clearly. However, I think we have to explain reasons for and against the change,

[jira] Resolved: (LUCENE-1963) ArabicAnalyzer: Lowercase before Stopfilter

2009-10-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1963. Resolution: Fixed Committed on 2.9.x. Thanks Robert! > ArabicAnalyzer: Lowercase

Re: 2.9.1

2009-10-14 Thread Mark Miller
Contrib committers do not have karma for branches - just the trunk contrib area. I assume it's just because of how the karma is granted - not wildcard-based, e.g. */contrib/*. Michael McCandless wrote: > Ooh, I'll go commit that one (though it's kinda weird that you're not > able to do so)... > > Any

Re: 2.9.1

2009-10-14 Thread Michael McCandless
Ooh, I'll go commit that one (though it's kinda weird that you're not able to do so)... Any others? Mike On Wed, Oct 14, 2009 at 5:45 PM, Robert Muir wrote: > can someone take a look at LUCENE-1963?  :) > (there were no objections to back-porting the fix, but i do not have > permission to do it

Re: 2.9.1

2009-10-14 Thread Jason Rutherglen
Let's cut a release with this scorer bug fix? On Wed, Oct 14, 2009 at 2:39 PM, Michael McCandless wrote: > I can cut the 2.9.1 release, but... should we wait a bit to see > whether other issues come up? Or do it, now? > > If there are any issues you've already fixed on trunk but think should > al

Re: 2.9.1

2009-10-14 Thread Robert Muir
can someone take a look at LUCENE-1963? :) (there were no objections to back-porting the fix, but i do not have permission to do it) On Wed, Oct 14, 2009 at 5:39 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > I can cut the 2.9.1 release, but... should we wait a bit to see > whether

2.9.1

2009-10-14 Thread Michael McCandless
I can cut the 2.9.1 release, but... should we wait a bit to see whether other issues come up? Or do it, now? If there are any issues you've already fixed on trunk but think should also be included in 2.9.1, please back-port them! Mike

[jira] Resolved: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch resolved LUCENE-1979. Resolution: Fixed Committed revision 825288. Thanks, Uwe! > Remove remaining deprecations

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765753#action_12765753 ] Uwe Schindler commented on LUCENE-1979: do it! > Remove remaining deprecations fro

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765752#action_12765752 ] Michael Busch commented on LUCENE-1979: Shall I commit it or are you going to, Uwe?

[jira] Resolved: (LUCENE-1983) IndexInput not closed by MultiLevelSkipListReader

2009-10-14 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved LUCENE-1983. Resolution: Duplicate Marking as duplicate of LUCENE-686 > IndexInput not closed by MultiLeve

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765701#action_12765701 ] Uwe Schindler commented on LUCENE-1979: +1 all tests pass here, too! > Remove rema

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765690#action_12765690 ] Michael Busch commented on LUCENE-1979: {quote} Seems that in the bw branch one mor

[jira] Issue Comment Edited: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765679#action_12765679 ] Uwe Schindler edited comment on LUCENE-1979 at 10/14/09 11:43 AM:

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765683#action_12765683 ] Uwe Schindler commented on LUCENE-1979: Seems that in the bw branch one more docCou

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765679#action_12765679 ] Uwe Schindler commented on LUCENE-1979: Thanks! Your patch for TermVectorAccessor i

[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1979: Attachment: LUCENE-1979-2-bw.patch And the fix for the bw-branch. Running all tests again now

[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1979: Attachment: LUCENE-1979-2.patch Fixes the bug in TermVectorAccessor. > Remove remaining depre

[jira] Updated: (LUCENE-1983) IndexInput not closed by MultiLevelSkipListReader

2009-10-14 Thread Gui Forget (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gui Forget updated LUCENE-1983: Attachment: LUCENE-1983.patch Easy fix > IndexInput not closed by MultiLevelSkipListReader

[jira] Created: (LUCENE-1983) IndexInput not closed by MultiLevelSkipListReader

2009-10-14 Thread Gui Forget (JIRA)
IndexInput not closed by MultiLevelSkipListReader. Key: LUCENE-1983 URL: https://issues.apache.org/jira/browse/LUCENE-1983 Project: Lucene - Java Issue Type: Bug Components: Index

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765611#action_12765611 ] Michael Busch commented on LUCENE-1979: This one fails in the bw-branch: org.apache

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765610#action_12765610 ] Michael Busch commented on LUCENE-1979: I just ran all tests. test-contrib and test

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765593#action_12765593 ] Uwe Schindler commented on LUCENE-1979: My patch has a bug in TestTermVectorAccesso

[jira] Updated: (LUCENE-1963) ArabicAnalyzer: Lowercase before Stopfilter

2009-10-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1963: Attachment: LUCENE-1963_branch.patch it's been a few days, no one objected to applying this fix to

[jira] Updated: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken

2009-10-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1976: Attachment: LUCENE-1976.patch Attached patch. Now, for an NRT reader, isCurrent ret

Re: LUCENE-1124 broken (?)

2009-10-14 Thread Mark Miller
Mark Miller wrote: > Timo Nentwig wrote: > >> Hi! >> >> Consider "abcd" and fuzzy factor 0.75: changing 1 character equals a >> Levenshtein distance of exactly 0.75. So isn't it wrong to abandon >> term > (1/(1-minSimilarity)) >> >> ? >> >> Wouldn't >= be correct?

Re: LUCENE-1124 broken (?)

2009-10-14 Thread Mark Miller
Timo Nentwig wrote: > Hi! > > Consider "abcd" and fuzzy factor 0.75: changing 1 character equals a > Levenshtein distance of exactly 0.75. So isn't it wrong to abandon > term > (1/(1-minSimilarity)) > > ? > > Wouldn't >= be correct?

LUCENE-1124 broken (?)

2009-10-14 Thread Timo Nentwig
Hi! Consider "abcd" and fuzzy factor 0.75: changing 1 character equals a Levenshtein distance of exactly 0.75. So isn't it wrong to abandon term > (1/(1-minSimilarity)) ? Wouldn't >= be correct?
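The arithmetic behind the question can be checked with a small, self-contained sketch. It is not Lucene's code. It assumes the "fuzzy factor" is a normalized similarity of the form 1 - editDistance / min(length), so that one changed character in a four-character term gives 0.75. The class and method names (`FuzzySimilaritySketch`, `editDistance`, `similarity`, `lengthThreshold`) are hypothetical and exist only for illustration.

```java
// Illustrative sketch only; this is NOT Lucene's FuzzyQuery/FuzzyTermEnum code.
// Assumption: similarity = 1 - editDistance / min(len(a), len(b)).
class FuzzySimilaritySketch {

    // Classic dynamic-programming Levenshtein distance
    // (insertion, deletion and substitution each cost 1).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalized similarity in [.., 1]; 1.0 means identical strings.
    static float similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(a, b) / shorter;
    }

    // The length bound quoted in the message: 1 / (1 - minSimilarity).
    static float lengthThreshold(float minSimilarity) {
        return 1.0f / (1.0f - minSimilarity);
    }

    public static void main(String[] args) {
        String term = "abcd";
        float minSimilarity = 0.75f;
        System.out.println("similarity after one edit: " + similarity(term, "abcx"));
        System.out.println("length threshold: " + lengthThreshold(minSimilarity));
        System.out.println("length >  threshold: " + (term.length() > lengthThreshold(minSimilarity)));
        System.out.println("length >= threshold: " + (term.length() >= lengthThreshold(minSimilarity)));
    }
}
```

Under these assumptions, "abcd" sits exactly on the boundary: its length equals the threshold, and a single edit yields a similarity equal to the minimum. Whether `>` or `>=` is the right length check therefore depends on whether the downstream similarity test accepts or rejects a term whose similarity equals the minimum, which this sketch does not settle.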

[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1974: Attachment: LUCENE-1974.patch I've modified TestBoolean2 to show the bug (attached p

[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765544#action_12765544 ] Michael McCandless commented on LUCENE-1974: As a test, to tease out more corn

[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-14 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765545#action_12765545 ] Michael McCandless commented on LUCENE-1974: bq. Though we have spent about a

[jira] Resolved: (LUCENE-1756) contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test

2009-10-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-1756. Resolution: Fixed Committed revision 825112. > contrib/memory: PatternAnalyzerTest is a very, v

[jira] Resolved: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-14 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-1966. Resolution: Fixed Committed revision 825110. Thanks Basem! > Arabic Analyzer: Stopwords list n

Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-14 Thread Grant Ingersoll
I think this is a good thing. Ultimately, we just need to decide, but getting user feedback is also important. -Grant On Oct 13, 2009, at 6:10 PM, Michael Busch wrote: OK, I made the draft a bit "more neutral" by pointing out the downsides more clearly. However, I think we have to explain reaso

Re: [jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-14 Thread Michael McCandless
I just tried this (I increased the numDeletedDocs by adding 1 to the original counts), but it doesn't hit this bug, I believe because all the deleted docs are created after the real docs. It does provoke new failures, but they all seem to be false failures [floating point precision issues], eg

[jira] Issue Comment Edited: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765483#action_12765483 ] Uwe Schindler edited comment on LUCENE-1979 at 10/14/09 2:29 AM:

[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1979: Attachment: LUCENE-1979-2-bw.patch LUCENE-1979-2.patch Here the patch for the

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765459#action_12765459 ] Uwe Schindler commented on LUCENE-1979: javac only shows a warning, if you override

[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765454#action_12765454 ] Michael Busch commented on LUCENE-1979: For now I fixed all those that the compiler

[jira] Reopened: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-14 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reopened LUCENE-1979: I think there are some more deprecations (hidden). A search on deprecated finds some more files