RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
I wonder why this commit is needed. It only affects the core classes, not the tests. To compile correct backwards tests it should not matter whether the methods exist or not.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: busc...@apache.org [mailto:busc...@apache.org]
> Sent: Tuesday, October 13, 2009 9:00 AM
> To: java-comm...@lucene.apache.org
> Subject: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
>
> Author: buschmi
> Date: Tue Oct 13 06:59:40 2009
> New Revision: 824611
>
> URL: http://svn.apache.org/viewvc?rev=824611&view=rev
> Log:
> More fixes that were accidentially left out in the previous commit
>
> Modified:
>     lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java
>     lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java
>     lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java
>     lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNotQuery.java
>     lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanOrQuery.java
>
> Modified: lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java
> URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java?rev=824611&r1=824610&r2=824611&view=diff
> ==============================================================================
> --- lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java (original)
> +++ lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java Tue Oct 13 06:59:40 2009
> @@ -94,11 +94,6 @@
>      return maskedQuery.getSpans(reader);
>    }
>
> -  /** @deprecated use {@link #extractTerms(Set)} instead. */
> -  public Collection getTerms() {
> -    return maskedQuery.getTerms();
> -  }
> -
>    public void extractTerms(Set terms) {
>      maskedQuery.extractTerms(terms);
>    }
>
> Modified: lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java
> URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java?rev=824611&r1=824610&r2=824611&view=diff
> ==============================================================================
> --- lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java (original)
> +++ lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java Tue Oct 13 06:59:40 2009
> @@ -47,12 +47,6 @@
>
>    public String getField() { return match.getField(); }
>
> -  /** Returns a collection of all terms matched by this query.
> -   * @deprecated use extractTerms instead
> -   * @see #extractTerms(Set)
> -   */
> -  public Collection getTerms() { return match.getTerms(); }
> -
>    public String toString(String field) {
>      StringBuffer buffer = new StringBuffer();
>      buffer.append("spanFirst(");
>
> Modified: lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java
> URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java?rev=824611&r1=824610&r2=824611&view=diff
> ==============================================================================
> --- lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java (original)
> +++ lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java Tue Oct 13 06:59:40 2009
> @@ -80,20 +80,6 @@
>
>    public String getField() { return field; }
>
> -  /** Returns a collection of all terms matched by this query.
> -   * @deprecated use extractTerms instead
> -   * @see #extractTerms(Set)
> -   */
> -  public Collection getTerms() {
> -    Collection terms = new ArrayList();
> -    Iterator i = clauses.iterator();
> -    while (i.hasNext()) {
> -      SpanQuery clause = (SpanQuery)i.next();
> -      terms.addAll(clause.getTerms());
> -    }
> -    return terms;
> -  }
> -
>    public void extractTerms(Set terms) {
>      Iterator i = clauses.iterator();
>      while (i.hasNext()) {
>
> Modified:
>     lucene/java/branch
Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
Yes, that's indeed the case, see LUCENE-1529.

Michael

On 10/13/09 12:25 AM, Michael Busch wrote:
> To your question: Wasn't there a fix recently to test-tag to test
> drop-in backwards-compatibility? Which means that it compiles the
> tests first against the sources of the back-compat branch, but then
> runs them against the new trunk JAR? That's why this commit is
> necessary I think.
Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
It was weird - I ran all the tests before I did the previous commit and it worked fine. Then after committing I wanted to double-check by running 'ant test-tag' and got the compile errors.

I think something is wrong with my Eclipse and/or svn. But I also switched from tortoise to command-line recently - so maybe I'm just clumsy. Anyway, the new tag is working now, sorry for the noise.

To your question: Wasn't there a fix recently to test-tag to test drop-in backwards-compatibility? Which means that it compiles the tests first against the sources of the back-compat branch, but then runs them against the new trunk JAR? That's why this commit is necessary I think.

Michael

On 10/13/09 12:18 AM, Uwe Schindler wrote:
> I wonder why this commit is needed. It only affects the core classes,
> not the tests. To compile correct backwards tests it should not be
> important if the methods exist or not.
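The compile-against-the-branch, run-against-trunk idea described in this message can be illustrated with a small sketch. Everything here is hypothetical: `SketchOldQuery`, `SketchNewQuery` and `DropInCheck` are stand-ins invented for this example, not Lucene classes, and the reflection lookup is only an analogue of what the JVM does when it links already compiled test classes against a newer JAR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Set;

// Hypothetical stand-in for the API shape in the back-compat branch.
class SketchOldQuery {
  public Collection getTerms() { return new ArrayList(); }
  public void extractTerms(Set terms) { }
}

// Hypothetical stand-in for the API shape on trunk: getTerms() is gone.
class SketchNewQuery {
  public void extractTerms(Set terms) { }
}

final class DropInCheck {
  /** Returns true if the type still exposes a public method with this name and parameter types. */
  static boolean hasMethod(Class<?> type, String name, Class<?>... params) {
    try {
      type.getMethod(name, params);
      return true;
    } catch (NoSuchMethodException e) {
      // A compiled caller of this method would fail to link against this type.
      return false;
    }
  }
}
```

Under these assumptions, a test that still calls `getTerms()` compiles against the old shape but cannot resolve the method on the new one, while a test that only calls `extractTerms(Set)` resolves on both.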
RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
Hi Michael,

I fixed it here, should I commit?

Your problem was maybe that you thought the backwards test code must compile against trunk. But it's vice versa. I reverted everything and only removed the getTerms() checks in the backwards branch. Now it works and the backwards testing is correct.

The general rule applies here: when the backwards test code checks a deprecated API, just remove the check. No need to rewrite the test for that. It is tested by the main tests. The main purpose of the backwards branch is to test drop-in binary compatibility.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
Yes, that's why we do the tests. This makes it possible to test compiled Java 1.4 code against the new Java 1.5 Lucene core with generics, and to verify that no upper generics boundaries (e.g. by things like ) are violated.

But if you rewrite the tests to use only the API of Lucene 3.0 and no deprecated methods, they should pass, and it has no effect if an additional deprecated method is still available in the branch's code. If we had to remove all deprecated code from the backwards branch as well, we would not need the branch at all. So this commit is definitely not needed (and I tested it, it works without). In the backwards branch we should only fix the tests, never the core code. Changing the core code is counterproductive.

There were some edge cases where we had backwards-incompatible changes in 2.9. But this is definitely not a backwards break.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Michael Busch [mailto:busch...@gmail.com]
> Sent: Tuesday, October 13, 2009 9:30 AM
> To: java-dev@lucene.apache.org
>
> Yes that's indeed the case, see LUCENE-1529.
>
> Michael
>
> On 10/13/09 12:25 AM, Michael Busch wrote:
> > To your question: Wasn't there a fix recently to test-tag to test
> > drop-in backwards-compatibility? Which means that it compiles the
> > tests first against the sources of the back-compat branch, but then
> > runs them against the new trunk JAR? That's why this commit is
> > necessary I think.
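One way to picture the generics point in this message: code written without generics can hand any object to a generified method, and a violated upper bound only shows up when the code actually runs. The sketch below is an illustration under assumptions, not Lucene code. The archived message lost whatever followed "things like", so the bounded wildcard `? extends Number` and the names `BoundsSketch`, `sum` and `legacyCall` are invented for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class BoundsSketch {
  // Hypothetical generified API with an upper bound on its element type.
  static int sum(Collection<? extends Number> values) {
    int total = 0;
    for (Number n : values) {   // the compiler inserts a cast to Number here
      total += n.intValue();
    }
    return total;
  }

  // Caller written in pre-generics style: raw types, as Java 1.4 code would be.
  @SuppressWarnings({"rawtypes", "unchecked"})
  static int legacyCall(Object element) {
    List raw = new ArrayList();
    raw.add(element);
    return sum(raw);            // compiles with only an unchecked warning
  }
}
```

Calling `legacyCall` with an `Integer` works, while calling it with a `String` compiles just as well but throws a `ClassCastException` inside `sum`. That kind of failure is only visible when old-style callers are executed against the generified code, which is the situation the backwards tests create.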
RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
I found the reason why it broke:

In your first commit you changed the following in the backwards branch main code:

+Set terms = new HashSet();
+qr.extractTerms(terms);
+assertEquals(1, terms.size());

And the backwards branch core and tests are compiled with Java 1.4 - boom. So the general rule: Never change the main code in the backwards branch, only the tests, and use only the old *public* API where possible. If you have to change the main code, you have a backwards break. If you only test some internal implementations in 2.9 (not public API), remove the tests in 2.9.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
You're right of course!

I made the changes to both testcases in the back-compat branch first, but I shouldn't have committed the changes to JustCompileSearchSpans - that was my mistake. And then I forgot for a minute about LUCENE-1529 (when I added the test-tag feature initially it *compiled* the tests against the trunk JAR) and didn't think the right solution would be to just revert JustCompileSearchSpans and thus had to make the other changes.

Oh well, now I guess I'll not forget anymore :) Thanks for bearing with me and sorry for the noise.

Michael

On 10/13/09 12:48 AM, Uwe Schindler wrote:
> So general rule: Never change the main code branch, only the tests in
> backwards and use where possible only the old *public* API. If you
> have to change the main code you have a backwards break.
Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.
Yeah please go ahead! Thanks for fixing. I have it here working too now - I just took the lucene_2_9_back_compat_tests_20091011 tag and made only the fix to TestFieldMaskingSpanQuery (without Java 1.5 code of course ;) ) and *not* the changes to JustCompileSearchSpans and test-tag is passing now against current trunk. I think that's the same you have now, right? Please go ahead and commit... today is not my day - I should go to bed :) Michael On 10/13/09 1:05 AM, Uwe Schindler wrote: Hi Michael, I fixed it here, should I commit? Your problem was maybe that you thought the backwards test code must compile against trunk. But it's vice versa. I reverted everything and only removed the getTerms() checks in the backwards branch. Now it works and the backwards testing is correct. Here the general rule applied: The backwards test code was checking against a deprecated API, just remove it. No need to rewrite the test for that. It is tested by the main tests. The main case of the backwards branch is to test drop-in binary compatibility. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, October 13, 2009 9:49 AM To: java-dev@lucene.apache.org Subject: RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java I found the reason why it broke: You changed in the backwards branch main code in your first commit the following: +Set terms = new HashSet(); +qr.extractTerms(terms); +assertEquals(1, terms.size()); And the backwards branch core and test is compiled with Java 1.4 - bumm. So general rule: Never change the main code branch, only the tests in backwards and use where possible only the old *public* API. If you have to change the main code you have a backwards break. 
If you only test some internal implementations in 2.9 (not public API), remove the tests in 2.9. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, October 13, 2009 9:43 AM To: java-dev@lucene.apache.org Subject: RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java Yes, thats why we do the tests. By this it is possible to test compiled Java 1.4 code against new Java 1.5 lucene core with generics and test, that no upper generics boundaries (e.g. by things like) are violated. But if you rewrite the tests to only use the API of lucene 3.0 and no deprecated methods it should pass and it has no effect, if an additional deprecated method is still available in the branch's code. If we have to remove all deprecated code also from the backwards branch, we would not need the branch at all. So this commit is definitely not needed (and I tested it, it works without). In the backwards branch we should only fix the tests, never the core code. If we do it, it is contra-productive. There were some edge cases, when we have backwards-incompatible changes in 2.9. But this is definitely not a backwards break. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael Busch [mailto:busch...@gmail.com] Sent: Tuesday, October 13, 2009 9:30 AM To: java-dev@lucene.apache.org Subject: Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java Yes that's indeed the case, see LUCENE-1529. 
Michael On 10/13/09 12:25 AM, Michael Busch wrote: It was weird - I ran all the tests before I did the previous commit and it worked fine. Then after committing I wanted to doublecheck by running 'ant test-tag' and got the compile errors. I think something is wrong with my eclipse and/or svn. But I also switched from tortoise to command-line recently - so maybe I'm just clumsy. Anyway, the new tag is working now, sorry for the noise. To your question: Wasn't there a fix recently to test-tag to test drop-in backwards-compatibility? Which means that it compiles the tests first against the sources of the back-compat branch, but then runs them against the new trunk JAR? That's why this commit is necessary I think. Michael On 10/13/09 12:18 AM, Uwe Schindler wrote: I wonder why this commit is needed. It only affects the core classes,
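The drop-in compatibility idea discussed in this thread can be sketched as follows: test code written in Java 1.4 style (raw types only) is run against core code that has since been generified. This is only a minimal illustration, not the actual back-compat test. SketchQuery and SketchLegacyCaller are hypothetical stand-ins and not Lucene classes; only the extractTerms name is borrowed from the snippet quoted in the thread.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a core class that was generified in the newer code base.
final class SketchQuery {
    // New-style signature; after erasure the bytecode signature is extractTerms(Set).
    void extractTerms(Set<String> terms) {
        terms.add("field:value");
    }
}

// Hypothetical stand-in for old test code written in Java 1.4 style (raw types only).
final class SketchLegacyCaller {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int countTerms(SketchQuery query) {
        Set terms = new HashSet();   // raw types, as Java 1.4 source would have to use
        query.extractTerms(terms);   // unchecked call, resolves to the same erased method
        return terms.size();
    }
}
```

Because generic type arguments are erased at compile time, the raw-typed caller and the generified method agree on the same method signature, which is what makes this kind of binary drop-in test possible at all.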
[jira] Created: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken
isCurrent() and getVersion() on an NRT reader are broken Key: LUCENE-1976 URL: https://issues.apache.org/jira/browse/LUCENE-1976 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 2.9 Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Fix For: 3.1 Right now isCurrent() will always return true for an NRT reader and getVersion() will always return the version of the last commit. This is because the NRT reader holds the live segmentInfos. I think isCurrent() should return "false" when any further changes have occurred with the writer, else true. This is actually fairly easy to determine, since the writer tracks how many docs & deletions are buffered in RAM and these counters only increase with each change. getVersion should return the version as of when the reader was created. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
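The behaviour proposed in LUCENE-1976 could look roughly like the sketch below: the reader remembers the writer's change counter and version at the moment it was opened, and compares against the live counter later. This is an assumption-laden illustration and not the committed fix. SketchWriter, SketchNrtReader and all of their methods are hypothetical names, and the single changeCount field stands in for the writer's buffered doc and deletion counters mentioned in the issue.

```java
// Hypothetical stand-in for the writer; not Lucene's IndexWriter.
final class SketchWriter {
    private long changeCount = 0; // only ever increases, one step per buffered change
    private long version = 1;     // assumed here to advance with every change

    synchronized void addDocument() { changeCount++; version++; }
    synchronized void deleteDocument() { changeCount++; version++; }
    synchronized long changeCount() { return changeCount; }

    // Opens a reader that snapshots the counter and the version as of now.
    synchronized SketchNrtReader openReader() {
        return new SketchNrtReader(this, changeCount, version);
    }
}

// Hypothetical stand-in for a near-real-time reader.
final class SketchNrtReader {
    private final SketchWriter writer;
    private final long changeCountAtOpen;
    private final long versionAtOpen;

    SketchNrtReader(SketchWriter writer, long changeCountAtOpen, long versionAtOpen) {
        this.writer = writer;
        this.changeCountAtOpen = changeCountAtOpen;
        this.versionAtOpen = versionAtOpen;
    }

    // False as soon as the writer has seen any change since this reader was opened.
    boolean isCurrent() { return writer.changeCount() == changeCountAtOpen; }

    // The version as of reader creation, not the live writer version.
    long getVersion() { return versionAtOpen; }
}
```

The point of the snapshot is that the reader no longer depends on the live segmentInfos for either answer, which is the cause of the bug described above.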
[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
[ https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1972: -- Summary: Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic (was: Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort) > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of > deprecated sort logic > > > Key: LUCENE-1972 > URL: https://issues.apache.org/jira/browse/LUCENE-1972 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
[ https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1972: -- Attachment: LUCENE-1972-bw.patch LUCENE-1972.patch This patch removes the ExtendedFieldCache backwards-compatibility (bw) layer. It also removes the AUTO and CUSTOM caches. Because of that, a lot of SortField logic was also changed and deprecations removed (not yet complete; HitCollector is still there). But with this patch most of the deprecated sort logic is removed (old Collectors, old sorting collectors, legacy search, ...). I also converted the Sort() ctors/setSort methods to varargs and changed the tests. It's now easier to use. Will commit when all tests have run again and nobody complains. This patch may fail to remove some dead code, but that should be done later, when the authors of the new Search API look over it more closely. > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of > deprecated sort logic > > > Key: LUCENE-1972 > URL: https://issues.apache.org/jira/browse/LUCENE-1972 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1972-bw.patch, LUCENE-1972.patch > > > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
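To illustrate why varargs constructors and setters are easier to use, here is a minimal sketch of the pattern. It does not reproduce the patch: SketchSort and SketchSortField are hypothetical stand-ins for the real Sort and SortField classes, and the fieldNames() helper exists only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sort criterion; not Lucene's SortField.
final class SketchSortField {
    final String field;
    SketchSortField(String field) { this.field = field; }
}

// Hypothetical stand-in for a sort specification; not Lucene's Sort.
final class SketchSort {
    private SketchSortField[] fields;

    // Varargs: callers may pass one field, several fields, or an existing array.
    SketchSort(SketchSortField... fields) { setSort(fields); }

    void setSort(SketchSortField... fields) {
        this.fields = fields.clone(); // defensive copy of the caller's array
    }

    // Helper for the example only: the field names in sort order.
    List<String> fieldNames() {
        List<String> names = new ArrayList<String>();
        for (SketchSortField f : fields) {
            names.add(f.field);
        }
        return names;
    }
}
```

With this shape, `new SketchSort(new SketchSortField("title"), new SketchSortField("date"))` needs no explicit array, while code that already builds a `SketchSortField[]` can still pass it unchanged.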
[jira] Resolved: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
[ https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1972. --- Resolution: Fixed Committed revision: 824699 > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of > deprecated sort logic > > > Key: LUCENE-1972 > URL: https://issues.apache.org/jira/browse/LUCENE-1972 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1972-bw.patch, LUCENE-1972.patch > > > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
OK I will cut a branch & commit Mark's last patch onto it, unless anyone has objections soonish... I'll also branch (twig?) the back compat branch so we can commit the patch there as well. Mike On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote: > > SVN is about as good at merging branches as any of us are with a patch > and trunk unfortunately. But that can still be somewhat more convenient > than all these huge patches, with different people at different stages. > > Depends on how many people end up working on this though. Any more than > 2, and I think the branch has got to be worth it. > > From my perspective, it doesn't make any of the merging process any > easier - but it can be easier than juggling all these patches - you have > a central code base that can always be targeted for current merging. > > Michael Busch wrote: >> I think it's supposed to work pretty good - though I have no personal >> experience with merging branches with svn. >> >> I think we should try it - then we'll know! :) >> >> Michael >> >> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote: >>> [ >>> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 >>> ] >>> >>> Michael McCandless commented on LUCENE-1458: >>> >>> >>> bq. Shall we create a flexible-indexing branch and commit this? >>> >>> I think this is a good idea. >>> >>> But I haven't played heavily w/ svn& branching. EG if we branch >>> now, and trunk moves fast (which it still is w/ deprecation >>> removals), are we going to have conflicts? Or... is svn good about >>> merging branches? 
>>> >>> Further steps towards flexible indexing --- Key: LUCENE-1458 URL: https://issues.apache.org/jira/browse/LUCENE-1458 Project: Lucene - Java Issue Type: New Feature Components: Index Affects Versions: 2.9 Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Attachments: LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2 I attached a very rough checkpoint of my current patch, to get early feedback. All tests pass, though back compat tests don't pass due to changes to package-private APIs plus certain bugs in tests that happened to work (eg call TermPostions.nextPosition() too many times, which the new API asserts against). [Aside: I think, when we commit changes to package-private APIs such that back-compat tests don't pass, we could go back, make a branch on the back-compat tag, commit changes to the tests to use the new package private APIs on that branch, then fix nightly build to use the tip of that branch?o] There's still plenty to do before this is committable! This is a rather large change: * Switches to a new more efficient terms dict format. This still uses tii/tis files, but the tii only stores term& long offset (not a TermInfo). At seek points, tis encodes term& freq/prox offsets absolutely instead of with deltas delta. Also, tis/tii are structured by field, so we don't have to record field number in every term. . On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB). . 
RAM usage when loading terms dict index is significantly less since we only load an array of offsets and an array of String (no more TermInfo array). It should be faster to init too. . This part is basically done. * Introduces modular reader codec that strongly decouples terms dict from docs/positions readers. EG there is no more TermInfo used when reading the new format. . There's nice symmetry now between reading& writing in the codec chain -- the current docs/prox format is captured in: {code} FormatPostingsTermsDictWriter/Reader FormatPostingsDocsWriter/Reader (.frq file) and FormatPostingsPositionsWriter/Reader (.prx file). {code} This part is basically done. * Introduces a new "flex" API for iterating
[jira] Updated: (LUCENE-1977) Remove MultiTermQuery.getTerm()
[ https://issues.apache.org/jira/browse/LUCENE-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1977: -- Attachment: LUCENE-1977.patch Here the patch. This also fixes the highlighter problem with NumericRange. > Remove MultiTermQuery.getTerm() > --- > > Key: LUCENE-1977 > URL: https://issues.apache.org/jira/browse/LUCENE-1977 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1977.patch > > > Removes the field and methods in MTQ that return the pattern term. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1977) Remove MultiTermQuery.getTerm()
Remove MultiTermQuery.getTerm() --- Key: LUCENE-1977 URL: https://issues.apache.org/jira/browse/LUCENE-1977 Project: Lucene - Java Issue Type: Task Components: Search Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 3.0 Removes the field and methods in MTQ that return the pattern term. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1978) Remove HitCollector
Remove HitCollector --- Key: LUCENE-1978 URL: https://issues.apache.org/jira/browse/LUCENE-1978 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Assignee: Uwe Schindler Remove the rest of HitCollectors -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1978) Remove HitCollector
[ https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1978: -- Attachment: LUCENE-1978-bw.patch LUCENE-1978.patch Attached is the patch. Will commit when the full test suite has run again. > Remove HitCollector > --- > > Key: LUCENE-1978 > URL: https://issues.apache.org/jira/browse/LUCENE-1978 > Project: Lucene - Java > Issue Type: Task >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch > > > Remove the rest of HitCollectors -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-1977) Remove MultiTermQuery.getTerm()
[ https://issues.apache.org/jira/browse/LUCENE-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1977. --- Resolution: Fixed Committed revision: 824771 > Remove MultiTermQuery.getTerm() > --- > > Key: LUCENE-1977 > URL: https://issues.apache.org/jira/browse/LUCENE-1977 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1977.patch > > > Removes the field and methods in MTQ that return the pattern term. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
[ https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765082#action_12765082 ] Uwe Schindler commented on LUCENE-1929: --- This is also fixed in trunk, but differently, since MTQ.getTerm() is not available there. > Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery > -- > > Key: LUCENE-1929 > URL: https://issues.apache.org/jira/browse/LUCENE-1929 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Affects Versions: 2.9 >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 2.9.1 > > Attachments: LUCENE-1929.patch > > > Sucks. Will throw a NullPointer exception. > Only NumericRangeQuery will throw the exception. > RangeQuery just won't highlight. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
[ https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765082#action_12765082 ] Uwe Schindler edited comment on LUCENE-1929 at 10/13/09 7:11 AM: - This is also fixed in trunk, but differently, since MTQ.getTerm() is not available there (LUCENE-1977) was (Author: thetaphi): This is also fixed in trunk, but differently, since MTQ.getTerm() is not available there. > Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery > -- > > Key: LUCENE-1929 > URL: https://issues.apache.org/jira/browse/LUCENE-1929 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Affects Versions: 2.9 >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 2.9.1 > > Attachments: LUCENE-1929.patch > > > Sucks. Will throw a NullPointer exception. > Only NumericRangeQuery will throw the exception. > RangeQuery just won't highlight. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-1978) Remove HitCollector
[ https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1978. --- Resolution: Fixed Fix Version/s: 3.0 Committed revision: 824781 > Remove HitCollector > --- > > Key: LUCENE-1978 > URL: https://issues.apache.org/jira/browse/LUCENE-1978 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch > > > Remove the rest of HitCollectors -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1978) Remove HitCollector
[ https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1978: -- Component/s: Search > Remove HitCollector > --- > > Key: LUCENE-1978 > URL: https://issues.apache.org/jira/browse/LUCENE-1978 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch > > > Remove the rest of HitCollectors -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
I can trunk it once more if you'd like - it's already pretty out of date :) If you haven't started anyway ... Michael McCandless wrote: > OK I will cut a branch & commit Mark's last patch onto it, unless > anyone has objections soonish... > > I'll also branch (twig?) the back compat branch so we can commit the > patch there as well. > > Mike > > On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote: > >> SVN is about as good at merging branches as any of us are with a patch >> and trunk unfortunately. But that can still be somewhat more convenient >> than all these huge patches, with different people at different stages. >> >> Depends on how many people end up working on this though. Any more than >> 2, and I think the branch has got to be worth it. >> >> From my perspective, it doesn't make any of the merging process any >> easier - but it can be easier than juggling all these patches - you have >> a central code base that can always be targeted for current merging. >> >> Michael Busch wrote: >> >>> I think it's supposed to work pretty good - though I have no personal >>> experience with merging branches with svn. >>> >>> I think we should try it - then we'll know! :) >>> >>> Michael >>> >>> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote: >>> [ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 ] Michael McCandless commented on LUCENE-1458: bq. Shall we create a flexible-indexing branch and commit this? I think this is a good idea. But I haven't played heavily w/ svn& branching. EG if we branch now, and trunk moves fast (which it still is w/ deprecation removals), are we going to have conflicts? Or... is svn good about merging branches? 
> Further steps towards flexible indexing > --- > > Key: LUCENE-1458 > URL: https://issues.apache.org/jira/browse/LUCENE-1458 > Project: Lucene - Java > Issue Type: New Feature > Components: Index > Affects Versions: 2.9 > Reporter: Michael McCandless > Assignee: Michael McCandless > Priority: Minor > Attachments: LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2 > > > I attached a very rough checkpoint of my current patch, to get early > feedback. All tests pass, though back compat tests don't pass due to > changes to package-private APIs plus certain bugs in tests that > happened to work (eg call TermPostions.nextPosition() too many times, > which the new API asserts against). > [Aside: I think, when we commit changes to package-private APIs such > that back-compat tests don't pass, we could go back, make a branch on > the back-compat tag, commit changes to the tests to use the new > package private APIs on that branch, then fix nightly build to use the > tip of that branch?o] > There's still plenty to do before this is committable! This is a > rather large change: >* Switches to a new more efficient terms dict format. This still > uses tii/tis files, but the tii only stores term& long offset > (not a TermInfo). At seek points, tis encodes term& freq/prox > offsets absolutely instead of with deltas delta. Also, tis/tii > are structured by field, so we don't have to record field number > in every term. > . 
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB > -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB). > . > RAM usage when loading terms dict index is significantly less > since we only load an array of offsets and an array of String (no > more TermInfo array). It should be faster to init too. > . > This part is basically done. >* Introduces modular reader codec that strongly decouples terms dict > from docs/positions readers. EG there is no more TermInfo used > when reading the new format. > . > There's nice symmetry now between reading& writing in the codec > chain -- the current docs/prox format
[jira] Commented: (LUCENE-1973) Remove deprecated query components
[ https://issues.apache.org/jira/browse/LUCENE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765093#action_12765093 ] Uwe Schindler commented on LUCENE-1973: --- There are still some of them: - explain() in Scorer (I do not know what to do exactly here, I use explain() very seldom) - idf() in Similarity ...and some more > Remove deprecated query components > -- > > Key: LUCENE-1973 > URL: https://issues.apache.org/jira/browse/LUCENE-1973 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler > Fix For: 3.0 > > > Remove deprecated query components around HitCollector -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
[ https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1972: -- Attachment: LUCENE-1972-2.patch Some small additional deprecated removals after finishing the rest. Will commit now. > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of > deprecated sort logic > > > Key: LUCENE-1972 > URL: https://issues.apache.org/jira/browse/LUCENE-1972 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1972-2.patch, LUCENE-1972-bw.patch, > LUCENE-1972.patch > > > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1959) Index Splitter
[ https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765097#action_12765097 ] Andrzej Bialecki commented on LUCENE-1959: --- Indeed, thanks for the fix - I'll commit this. > Index Splitter > -- > > Key: LUCENE-1959 > URL: https://issues.apache.org/jira/browse/LUCENE-1959 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Affects Versions: 2.9 >Reporter: Jason Rutherglen >Assignee: Michael McCandless >Priority: Trivial > Fix For: 3.0 > > Attachments: LUCENE-1959.patch, LUCENE-1959.patch, > mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, > mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch > > > If an index has multiple segments, this tool allows splitting those segments > into separate directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
[ https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765098#action_12765098 ] Uwe Schindler commented on LUCENE-1972: --- Committed revision: 824792 > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of > deprecated sort logic > > > Key: LUCENE-1972 > URL: https://issues.apache.org/jira/browse/LUCENE-1972 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 3.0 > > Attachments: LUCENE-1972-2.patch, LUCENE-1972-bw.patch, > LUCENE-1972.patch > > > Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (LUCENE-1973) Remove deprecated query components
[ https://issues.apache.org/jira/browse/LUCENE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765093#action_12765093 ] Uwe Schindler edited comment on LUCENE-1973 at 10/13/09 7:57 AM: - There are still some of them left: - explain() in Scorer (I do not know exactly what to do here, I use explain() very rarely) - idf() in Similarity - IndexSearcher.fieldSortDoTrackScores / IS.fieldSortDoMaxScore - BoostingTermQuery - MultiValueSource (what to do with it?) - BooleanQuery scoreDocOutOfOrder & others (LUCENE-944) I am not familiar with all of these, so I do not want to fix them myself. was (Author: thetaphi): There are still some of them: - explain() in Scorer (I do not know what to do exactly here, I use explain() very seldom) - idf() in Similarity ...and some more > Remove deprecated query components > -- > > Key: LUCENE-1973 > URL: https://issues.apache.org/jira/browse/LUCENE-1973 > Project: Lucene - Java > Issue Type: Task > Components: Search >Reporter: Uwe Schindler > Fix For: 3.0 > > > Remove deprecated query components around HitCollector -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1959) Index Splitter
[ https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101 ] Andrzej Bialecki commented on LUCENE-1959: --- Committed revision 824798. > Index Splitter > -- > > Key: LUCENE-1959 > URL: https://issues.apache.org/jira/browse/LUCENE-1959 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Affects Versions: 2.9 >Reporter: Jason Rutherglen >Assignee: Michael McCandless >Priority: Trivial > Fix For: 3.0 > > Attachments: LUCENE-1959.patch, LUCENE-1959.patch, > mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, > mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch > > > If an index has multiple segments, this tool allows splitting those segments > into separate directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
Yes please! Mike On Tue, Oct 13, 2009 at 10:40 AM, Mark Miller wrote: > I can trunk it once more if you'd like - it's already pretty out of date :) > > If you haven't started anyway ... > > > Michael McCandless wrote: >> OK I will cut a branch & commit Mark's last patch onto it, unless >> anyone has objections soonish... >> >> I'll also branch (twig?) the back compat branch so we can commit the >> patch there as well. >> >> Mike >> >> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote: >> >>> SVN is about as good at merging branches as any of us are with a patch >>> and trunk unfortunately. But that can still be somewhat more convenient >>> than all these huge patches, with different people at different stages. >>> >>> Depends on how many people end up working on this though. Any more than >>> 2, and I think the branch has got to be worth it. >>> >>> From my perspective, it doesn't make any of the merging process any >>> easier - but it can be easier than juggling all these patches - you have >>> a central code base that can always be targeted for current merging. >>> >>> Michael Busch wrote: >>> I think it's supposed to work pretty well - though I have no personal experience with merging branches with svn. I think we should try it - then we'll know! :) Michael On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote: > [ > https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 > ] > > Michael McCandless commented on LUCENE-1458: > > > bq. Shall we create a flexible-indexing branch and commit this? > > I think this is a good idea. > > But I haven't played heavily w/ svn & branching. EG if we branch > now, and trunk moves fast (which it still is w/ deprecation > removals), are we going to have conflicts? Or... is svn good about > merging branches?
RE: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
I think the big changes in the o.a.l.search package are over... :-) - Worked the whole day on it. Merging branches with TortoiseSVN works really well, you can even edit the conflicts directly in the diff view. Used it when fixing the IR/IW hell deprecations in the BW branch. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Tuesday, October 13, 2009 5:01 PM > To: java-dev@lucene.apache.org > Subject: Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing > > Yes please! > > Mike > > On Tue, Oct 13, 2009 at 10:40 AM, Mark Miller wrote: > > I can trunk it once more if you'd like - it's already pretty out of date :) > > > > If you haven't started anyway ... > > > > > > Michael McCandless wrote: > >> OK I will cut a branch & commit Mark's last patch onto it, unless > >> anyone has objections soonish... > >> > >> I'll also branch (twig?) the back compat branch so we can commit the > >> patch there as well. > >> > >> Mike > >> > >> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote: > >> > >>> SVN is about as good at merging branches as any of us are with a patch > >>> and trunk unfortunately. But that can still be somewhat more convenient > >>> than all these huge patches, with different people at different stages. > >>> > >>> Depends on how many people end up working on this though. Any more than > >>> 2, and I think the branch has got to be worth it. > >>> > >>> From my perspective, it doesn't make any of the merging process any > >>> easier - but it can be easier than juggling all these patches - you have > >>> a central code base that can always be targeted for current merging. > >>> > >>> Michael Busch wrote: > >>> > I think it's supposed to work pretty well - though I have no personal experience with merging branches with svn. > > I think we should try it - then we'll know!
Re: [jira] Commented: (LUCENE-1959) Index Splitter
Hmm ... doing some heavy merging so it might be me, but there also might be a test failure with this now and some of the trunk changes ... Andrzej Bialecki (JIRA) wrote: > [ > https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101 > ] > > Andrzej Bialecki commented on LUCENE-1959: > --- > > Committed revision 824798. > > >> Index Splitter >> -- >> >> Key: LUCENE-1959 >> URL: https://issues.apache.org/jira/browse/LUCENE-1959 >> Project: Lucene - Java >> Issue Type: New Feature >> Components: Index >>Affects Versions: 2.9 >>Reporter: Jason Rutherglen >>Assignee: Michael McCandless >>Priority: Trivial >> Fix For: 3.0 >> >> Attachments: LUCENE-1959.patch, LUCENE-1959.patch, >> mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, >> mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch >> >> >> If an index has multiple segments, this tool allows splitting those segments >> into separate directories. >> > > -- - Mark http://www.lucidimagination.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Resolved: (LUCENE-1959) Index Splitter
[ https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1959. Resolution: Fixed Thanks Andrzej! > Index Splitter > -- > > Key: LUCENE-1959 > URL: https://issues.apache.org/jira/browse/LUCENE-1959 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Affects Versions: 2.9 >Reporter: Jason Rutherglen >Assignee: Michael McCandless >Priority: Trivial > Fix For: 3.0 > > Attachments: LUCENE-1959.patch, LUCENE-1959.patch, > mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, > mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch > > > If an index has multiple segments, this tool allows splitting those segments > into separate directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1959) Index Splitter
I think it was me - ran by itself with eclipse - must have been an incremental compile issue or something. Mark Miller wrote: > Hmm ... doing some heavy merging so it might be me, but there also might > be a test failure with this now and some of the trunk changes ... > > Andrzej Bialecki (JIRA) wrote: > >> [ >> https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101 >> ] >> >> Andrzej Bialecki commented on LUCENE-1959: >> --- >> >> Committed revision 824798. >> >> >> >>> Index Splitter >>> -- >>> >>> Key: LUCENE-1959 >>> URL: https://issues.apache.org/jira/browse/LUCENE-1959 >>> Project: Lucene - Java >>> Issue Type: New Feature >>> Components: Index >>>Affects Versions: 2.9 >>>Reporter: Jason Rutherglen >>>Assignee: Michael McCandless >>>Priority: Trivial >>> Fix For: 3.0 >>> >>> Attachments: LUCENE-1959.patch, LUCENE-1959.patch, >>> mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, >>> mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch >>> >>> >>> If an index has multiple segments, this tool allows splitting those >>> segments into separate directories. >>> >>> >> >> > > > -- - Mark http://www.lucidimagination.com - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1458: Attachment: LUCENE-1458.patch Latest to trunk - still issues with GC and the reopen thread safety test (unless the test is run in isolation). Must be a tweak needed, but I'm not sure what. I'm closing the thread locals when the StandardTermsDictReader is closed - I don't see a way to improve on that yet. > Further steps towards flexible indexing > --- > > Key: LUCENE-1458 > URL: https://issues.apache.org/jira/browse/LUCENE-1458 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Affects Versions: 2.9 >Reporter: Michael McCandless >Assignee: Michael McCandless >Priority: Minor > Attachments: LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2 > > > I attached a very rough checkpoint of my current patch, to get early > feedback. All tests pass, though back compat tests don't pass due to > changes to package-private APIs plus certain bugs in tests that > happened to work (eg call TermPostions.nextPosition() too many times, > which the new API asserts against). 
> [Aside: I think, when we commit changes to package-private APIs such > that back-compat tests don't pass, we could go back, make a branch on > the back-compat tag, commit changes to the tests to use the new > package-private APIs on that branch, then fix nightly build to use the > tip of that branch?] > There's still plenty to do before this is committable! This is a > rather large change: > * Switches to a new more efficient terms dict format. This still > uses tii/tis files, but the tii only stores term & long offset > (not a TermInfo). At seek points, tis encodes term & freq/prox > offsets absolutely instead of with deltas. Also, tis/tii > are structured by field, so we don't have to record field number > in every term. > . > On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB > -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB). > . > RAM usage when loading terms dict index is significantly less > since we only load an array of offsets and an array of String (no > more TermInfo array). It should be faster to init too. > . > This part is basically done. > * Introduces modular reader codec that strongly decouples terms dict > from docs/positions readers. EG there is no more TermInfo used > when reading the new format. > . > There's nice symmetry now between reading & writing in the codec > chain -- the current docs/prox format is captured in: > {code} > FormatPostingsTermsDictWriter/Reader > FormatPostingsDocsWriter/Reader (.frq file) and > FormatPostingsPositionsWriter/Reader (.prx file). > {code} > This part is basically done. > * Introduces a new "flex" API for iterating through the fields, > terms, docs and positions: > {code} > FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum > {code} > This replaces TermEnum/Docs/Positions. SegmentReader emulates the > old API on top of the new API to keep back-compat. > > Next steps: > * Plug in new codecs (pulsing, pfor) to exercise the modularity / > fix any hidden assumptions.
> * Expose new API out of IndexReader, deprecate old API but emulate > old API on top of new one, switch all core/contrib users to the > new API. > * Maybe switch to AttributeSources as the base class for TermsEnum, > DocsEnum, PostingsEnum -- this would give readers API flexibility > (not just index-file-format flexibility). EG if someone wanted > to store payload at the term-doc level instead of > term-doc-position level, you could just add a new attribute. > * Test performance & iterate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
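[Editor's sketch] The LUCENE-1458 description above says that at seek points the tis file stores freq/prox offsets absolutely rather than as deltas. The following is only a minimal illustration of that general idea, not the patch's actual file format: the class name SeekPointOffsets, the methods encode and decodeAt, and the interval parameter are all hypothetical and do not come from Lucene. It stores an absolute value at every interval-th entry and a delta everywhere else, so a reader can start decoding at a seek point without any earlier state.

```java
import java.util.Arrays;

// Hypothetical illustration only; these names are not part of Lucene.
public class SeekPointOffsets {

    /** Stores an absolute offset at every interval-th entry and a delta from the previous entry elsewhere. */
    public static long[] encode(long[] offsets, int interval) {
        long[] out = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            boolean seekPoint = (i % interval == 0);
            out[i] = seekPoint ? offsets[i] : offsets[i] - offsets[i - 1];
        }
        return out;
    }

    /** Recovers the offset at index target, starting from the nearest seek point at or before it. */
    public static long decodeAt(long[] encoded, int interval, int target) {
        int seek = (target / interval) * interval; // nearest seek point at or before target
        long value = encoded[seek];                // absolute value, no earlier state needed
        for (int i = seek + 1; i <= target; i++) {
            value += encoded[i];                   // apply deltas up to the target entry
        }
        return value;
    }

    public static void main(String[] args) {
        long[] offsets = {100, 130, 180, 260, 300, 345};
        long[] encoded = encode(offsets, 3);
        System.out.println(Arrays.toString(encoded));
        System.out.println(decodeAt(encoded, 3, 4));
    }
}
```

With absolute values at the seek points, a lookup for entry 4 reads entries 3 and 4 only, which is the property the description relies on when it keeps just an array of offsets in RAM.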
Re: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l
On 10/13/09 7:28 AM, uschind...@apache.org wrote: @@ -115,7 +95,6 @@ * Applications should usually call {@link Searcher#search(Query)} or * {@link Searcher#search(Query,Filter)} instead. * @throws BooleanQuery.TooManyClauses - * @deprecated use {@link #search(Weight, Filter, int)} instead. */ TopDocs search(Weight weight, Filter filter, int n) throws IOException; Was this method just accidentally deprecated? Michael - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
Shall we first remove the remaining deprecations from the indexer package? There are not many more left, shouldn't be much work. Michael On 10/13/09 5:47 AM, Michael McCandless wrote: OK I will cut a branch& commit Mark's last patch onto it, unless anyone has objections soonish... I'll also branch (twig?) the back compat branch so we can commit the patch there as well. Mike On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote: SVN is about as good at merging branches as any of us are with a patch and trunk unfortunately. But that can still be somewhat more convenient than all these huge patches, with different people at different stages. Depends on how many people end up working on this though. Any more than 2, and I think the branch has got to be worth it. From my perspective, it doesn't make any of the merging process any easier - but it can be easier than juggling all these patches - you have a central code base that can always be targeted for current merging. Michael Busch wrote: I think it's supposed to work pretty good - though I have no personal experience with merging branches with svn. I think we should try it - then we'll know! :) Michael On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 ] Michael McCandless commented on LUCENE-1458: bq. Shall we create a flexible-indexing branch and commit this? I think this is a good idea. But I haven't played heavily w/ svn&branching. EG if we branch now, and trunk moves fast (which it still is w/ deprecation removals), are we going to have conflicts? Or... is svn good about merging branches? 
RE: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l
I think this was a mistake. Especially because the hint to the replacement method is the method itself. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael Busch [mailto:busch...@gmail.com] > Sent: Tuesday, October 13, 2009 6:42 PM > To: java-dev@lucene.apache.org > Subject: Re: svn commit: r824781 - in /lucene/java/trunk: ./ > contrib/memory/src/java/org/apache/lucene/index/memory/ > contrib/memory/src/test/org/apache/lucene/index/memory/ > contrib/remote/src/java/org/apache/lucene/search/ > contrib/surround/src/test/org/apache/luce > > On 10/13/09 7:28 AM, uschind...@apache.org wrote: > > @@ -115,7 +95,6 @@ > > * Applications should usually call {@link Searcher#search(Query)} or > > * {@link Searcher#search(Query,Filter)} instead. > > * @throws BooleanQuery.TooManyClauses > > - * @deprecated use {@link #search(Weight, Filter, int)} instead. > > */ > > TopDocs search(Weight weight, Filter filter, int n) throws IOException; > > > > Was this method just accidentally deprecated? > > Michael > > - > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1979) Remove remaining deprecations from indexer package
Remove remaining deprecations from indexer package -- Key: LUCENE-1979 URL: https://issues.apache.org/jira/browse/LUCENE-1979 Project: Lucene - Java Issue Type: Task Components: Index Reporter: Michael Busch Priority: Minor Fix For: 3.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1980) Fix javadocs after deprecation removal
Fix javadocs after deprecation removal -- Key: LUCENE-1980 URL: https://issues.apache.org/jira/browse/LUCENE-1980 Project: Lucene - Java Issue Type: Task Reporter: Uwe Schindler Fix For: 3.0 There are a lot of @links in Javadocs to methods/classes that no longer exist. javadoc target prints tons of warnings. We should fix that. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l
Right. I was confused about that too. Michael On 10/13/09 9:43 AM, Uwe Schindler wrote: I think this was a mistake. Especially because the hint to the replacement method is the method itself. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael Busch [mailto:busch...@gmail.com] Sent: Tuesday, October 13, 2009 6:42 PM To: java-dev@lucene.apache.org Subject: Re: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/lucene/search/ contrib/surround/src/test/org/apache/luce On 10/13/09 7:28 AM, uschind...@apache.org wrote: @@ -115,7 +95,6 @@ * Applications should usually call {@link Searcher#search(Query)} or * {@link Searcher#search(Query,Filter)} instead. * @throws BooleanQuery.TooManyClauses - * @deprecated use {@link #search(Weight, Filter, int)} instead. */ TopDocs search(Weight weight, Filter filter, int n) throws IOException; Was this method just accidentally deprecated? Michael - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Created: (LUCENE-1981) Allow access to entries in the field cache
Allow access to entries in the field cache -- Key: LUCENE-1981 URL: https://issues.apache.org/jira/browse/LUCENE-1981 Project: Lucene - Java Issue Type: New Feature Components: Search Affects Versions: 2.9 Reporter: Tom Hill Priority: Minor If the required data is already in the field cache, and therefore in RAM, it seems unnecessary to go to disk for it. We have a case where we need one field from a large number (500-1000) of scattered documents in a fairly large index (50-100 million docs). The seek time to collect that data from disk is prohibitive, so we'd like to read it from the cache instead. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
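[Editor's sketch] The LUCENE-1981 request above amounts to reading one field for a scattered set of documents from an array that is already in memory, instead of seeking to each stored document on disk. A rough sketch of that access pattern follows. It assumes the cached field is available as a plain String array indexed by document number; the class FieldCacheLookupSketch and the method valuesFor are hypothetical names for illustration and are not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the String[] stands in for a per-field cache held in RAM.
public class FieldCacheLookupSketch {

    /** Returns the cached value for each requested document number, using array lookups only. */
    public static List<String> valuesFor(String[] cachedValues, int[] docIds) {
        List<String> out = new ArrayList<>(docIds.length);
        for (int docId : docIds) {
            out.add(cachedValues[docId]); // O(1) per document, no disk seek
        }
        return out;
    }

    public static void main(String[] args) {
        String[] cached = {"alpha", "beta", "gamma", "delta"}; // one entry per document
        int[] scattered = {3, 1};                               // documents the caller needs
        System.out.println(valuesFor(cached, scattered));
    }
}
```

For the 500-1000 scattered documents mentioned in the issue, this is 500-1000 array reads, which is why the cache looks attractive compared with one disk seek per document.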
[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765149#action_12765149 ] Mark Miller commented on LUCENE-1458: - Whoops - I double-checked the wrong index splitter test - the multi-pass one is throwing a null pointer exception for me - I don't think it's related to this patch, but I haven't checked. > Further steps towards flexible indexing > --- > > Key: LUCENE-1458 > URL: https://issues.apache.org/jira/browse/LUCENE-1458 > Project: Lucene - Java > Issue Type: New Feature > Components: Index >Affects Versions: 2.9 >Reporter: Michael McCandless >Assignee: Michael McCandless >Priority: Minor > Attachments: LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, > LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, > LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, > LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2 > > > I attached a very rough checkpoint of my current patch, to get early > feedback. All tests pass, though back compat tests don't pass due to > changes to package-private APIs plus certain bugs in tests that > happened to work (eg call TermPostions.nextPosition() too many times, > which the new API asserts against). > [Aside: I think, when we commit changes to package-private APIs such > that back-compat tests don't pass, we could go back, make a branch on > the back-compat tag, commit changes to the tests to use the new > package-private APIs on that branch, then fix nightly build to use the > tip of that branch?] > There's still plenty to do before this is committable!
This is a > rather large change: > * Switches to a new more efficient terms dict format. This still > uses tii/tis files, but the tii only stores term & long offset > (not a TermInfo). At seek points, tis encodes term & freq/prox > offsets absolutely instead of with deltas delta. Also, tis/tii > are structured by field, so we don't have to record field number > in every term. > . > On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB > -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB). > . > RAM usage when loading terms dict index is significantly less > since we only load an array of offsets and an array of String (no > more TermInfo array). It should be faster to init too. > . > This part is basically done. > * Introduces modular reader codec that strongly decouples terms dict > from docs/positions readers. EG there is no more TermInfo used > when reading the new format. > . > There's nice symmetry now between reading & writing in the codec > chain -- the current docs/prox format is captured in: > {code} > FormatPostingsTermsDictWriter/Reader > FormatPostingsDocsWriter/Reader (.frq file) and > FormatPostingsPositionsWriter/Reader (.prx file). > {code} > This part is basically done. > * Introduces a new "flex" API for iterating through the fields, > terms, docs and positions: > {code} > FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum > {code} > This replaces TermEnum/Docs/Positions. SegmentReader emulates the > old API on top of the new API to keep back-compat. > > Next steps: > * Plug in new codecs (pulsing, pfor) to exercise the modularity / > fix any hidden assumptions. > * Expose new API out of IndexReader, deprecate old API but emulate > old API on top of new one, switch all core/contrib users to the > new API. > * Maybe switch to AttributeSources as the base class for TermsEnum, > DocsEnum, PostingsEnum -- this would give readers API flexibility > (not just index-file-format flexibility). 
EG if someone wanted > to store payload at the term-doc level instead of > term-doc-position level, you could just add a new attribute. > * Test performance & iterate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
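A note on the "absolute at seek points, deltas elsewhere" idea from the issue description above. The sketch below is only an illustration of that general encoding trick, not the tis/tii file format from the patch: the class name, the plain long[] layout and the interval parameter are all assumptions made up for this example. It shows why storing an absolute value at every seek point lets a reader jump into the middle of the list without decoding from the beginning.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Lucene's terms dict format.
class SeekPointOffsetsSketch {

    /**
     * Encodes sorted offsets as deltas from the previous entry, except that
     * every interval-th entry (a "seek point") is stored as an absolute value.
     */
    static long[] encode(long[] offsets, int interval) {
        long[] out = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            out[i] = (i % interval == 0) ? offsets[i] : offsets[i] - offsets[i - 1];
        }
        return out;
    }

    /**
     * Recovers offsets[index] by starting at the nearest seek point at or
     * before index and summing deltas from there, so at most interval - 1
     * deltas are read.
     */
    static long decodeAt(long[] encoded, int interval, int index) {
        int seekPoint = (index / interval) * interval;
        long value = encoded[seekPoint];
        for (int i = seekPoint + 1; i <= index; i++) {
            value += encoded[i];
        }
        return value;
    }

    public static void main(String[] args) {
        long[] offsets = {100, 130, 135, 200, 260, 300, 301};
        long[] encoded = encode(offsets, 3);
        System.out.println(Arrays.toString(encoded)); // prints [100, 30, 5, 200, 60, 40, 301]
        System.out.println(decodeAt(encoded, 3, 5));  // prints 300
    }
}
```

The trade-off is the usual one: a smaller interval means more absolute values (larger file) but fewer deltas to sum after a seek.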
[jira] Updated: (LUCENE-1981) Allow access to entries in the field cache
[ https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Hill updated LUCENE-1981: - Attachment: lucene-1981.patch Here's a sample implementation. There are a number of possible ways to do this, but this seemed pretty minimally invasive. Adds one method to IndexReader and subclasses. > Allow access to entries in the field cache > -- > > Key: LUCENE-1981 > URL: https://issues.apache.org/jira/browse/LUCENE-1981 > Project: Lucene - Java > Issue Type: New Feature > Components: Search >Affects Versions: 2.9 >Reporter: Tom Hill >Priority: Minor > Attachments: lucene-1981.patch > > > If the data required is already in the field cache, it seems unnecessary to > go to the disk for it, if the data is already in RAM. > We have a case where we need one field from a large number (500 -1000) of > scattered documents in a fairly large index (50-100m docs), and seek time to > collect the data from disk is prohibitive, so we'd like to grab the data from > the cache, instead. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-944) Remove deprecated methods in BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-944: - Attachment: lucene-944-bw.patch lucene-944.patch Tiny change in QueryUtils#checkSkipTo() to keep it more consistent to how it worked before. Also attaching the back-compat patch. Note that I have to make the change to checkSkipTo() there too, because it was not changed before to do the search per-segment. Now more tests actually run this check, exposing this problem. All tests pass now. > Remove deprecated methods in BooleanQuery > - > > Key: LUCENE-944 > URL: https://issues.apache.org/jira/browse/LUCENE-944 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Paul Elschot >Assignee: Michael Busch >Priority: Minor > Fix For: 3.0 > > Attachments: BooleanQuery20070626.patch, lucene-944-bw.patch, > lucene-944.patch, lucene-944.patch, lucene-944.patch > > > Remove deprecated methods setUseScorer14 and getUseScorer14 in BooleanQuery, > and adapt javadocs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1981) Allow access to entries in the field cache
[ https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765157#action_12765157 ] Yonik Seeley commented on LUCENE-1981: -- We shouldn't tie IndexReader/SegmentReader to the fieldCache. All of the public APIs already exist to use the FieldCache instead of document().
[jira] Issue Comment Edited: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765149#action_12765149 ] Mark Miller edited comment on LUCENE-1458 at 10/13/09 10:45 AM: Whoops - double check the wrong index splitter test - the multi pass one is throwing a null pointer exception for me - don't think its related to this patch, but I havn't checked. *edit* Okay, just checked - it is this patch. Looks like perhaps something to do with LegacyFieldsEnum? Something that isnt being hit by core tests at the moment (I didnt run through all the backcompat tests with this yet, since that failed) was (Author: markrmil...@gmail.com): Whoops - double check the wrong index splitter test - the multi pass one is throwing a null pointer exception for me - don't think its related to this patch, but I havn't checked.
[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1606: Attachment: LUCENE-1606.patch updated patch to trunk: * add support for optional regex features * remove recursion * improve performance for worst-case regexp/wildcard/FSM * improved docs & test * remove the fuzzy impl, NFA->DFA too slow for this, maybe a later addition. > Automaton Query/Filter (scalable regex) > --- > > Key: LUCENE-1606 > URL: https://issues.apache.org/jira/browse/LUCENE-1606 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Reporter: Robert Muir >Assignee: Robert Muir >Priority: Minor > Fix For: 3.0 > > Attachments: automaton.patch, automatonMultiQuery.patch, > automatonmultiqueryfuzzy.patch, automatonMultiQuerySmart.patch, > automatonWithWildCard.patch, automatonWithWildCard2.patch, LUCENE-1606.patch, > LUCENE-1606.patch > > > Attached is a patch for an AutomatonQuery/Filter (name can change if its not > suitable). > Whereas the out-of-box contrib RegexQuery is nice, I have some very large > indexes (100M+ unique tokens) where queries are quite slow, 2 minutes, etc. > Additionally all of the existing RegexQuery implementations in Lucene are > really slow if there is no constant prefix. This implementation does not > depend upon constant prefix, and runs the same query in 640ms. > Some use cases I envision: > 1. lexicography/etc on large text corpora > 2. looking for things such as urls where the prefix is not constant (http:// > or ftp://) > The Filter uses the BRICS package (http://www.brics.dk/automaton/) to convert > regular expressions into a DFA. Then, the filter "enumerates" terms in a > special way, by using the underlying state machine. Here is my short > description from the comments: > The algorithm here is pretty basic. Enumerate terms but instead of a > binary accept/reject do: > > 1. 
Look at the portion that is OK (did not enter a reject state in the > DFA) > 2. Generate the next possible String and seek to that. > the Query simply wraps the filter with ConstantScoreQuery. > I did not include the automaton.jar inside the patch but it can be downloaded > from http://www.brics.dk/automaton/ and is BSD-licensed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
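The two-step enumeration described in the LUCENE-1606 message above (keep the portion of the term that did not reject, then generate the next possible string and seek to it) can be sketched roughly as follows. This is a toy under stated assumptions, not the code from the patch: the Dfa class here is a hypothetical stand-in for a BRICS automaton, a TreeSet stands in for the sorted term dictionary, and TreeSet.ceiling() plays the role of the seek. Any transition that exists is treated as possibly leading to an accept state, which is safe but may seek less aggressively than a real implementation would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; not the AutomatonQuery/Filter code from the patch.
class DfaTermEnumSketch {

    /** Toy DFA. State 0 is the start state; a missing transition means "reject". */
    static final class Dfa {
        final List<TreeMap<Character, Integer>> transitions = new ArrayList<>();
        final List<Boolean> accepting = new ArrayList<>();

        int addState(boolean accept) {
            transitions.add(new TreeMap<>());
            accepting.add(accept);
            return transitions.size() - 1;
        }

        void addTransition(int from, char c, int to) {
            transitions.get(from).put(c, to);
        }
    }

    /** Returns the terms accepted by the DFA, seeking past ranges that cannot match. */
    static List<String> matchingTerms(TreeSet<String> terms, Dfa dfa) {
        List<String> hits = new ArrayList<>();
        String term = terms.isEmpty() ? null : terms.first();
        while (term != null) {
            // states[i] is the DFA state after consuming the first i chars of term.
            int[] states = new int[term.length() + 1];
            int ok = 0; // step 1: length of the prefix that did not reject
            while (ok < term.length()) {
                Integer next = dfa.transitions.get(states[ok]).get(term.charAt(ok));
                if (next == null) {
                    break;
                }
                states[++ok] = next;
            }
            if (ok == term.length()) {
                // Whole term consumed: record it if accepted, then move to the next term.
                if (dfa.accepting.get(states[ok])) {
                    hits.add(term);
                }
                term = terms.higher(term);
                continue;
            }
            // Step 2: rejected at position ok. Backtrack to the deepest position that
            // has a transition on a larger char and build the next possible string.
            String target = null;
            for (int i = ok; i >= 0 && target == null; i--) {
                Character bigger = dfa.transitions.get(states[i]).higherKey(term.charAt(i));
                if (bigger != null) {
                    target = term.substring(0, i) + bigger;
                }
            }
            // "Seek": jump to the first indexed term at or after the target, if any.
            term = (target == null) ? null : terms.ceiling(target);
        }
        return hits;
    }

    public static void main(String[] args) {
        // DFA for the regular expression ab+
        Dfa dfa = new Dfa();
        int s0 = dfa.addState(false);
        int s1 = dfa.addState(false);
        int s2 = dfa.addState(true);
        dfa.addTransition(s0, 'a', s1);
        dfa.addTransition(s1, 'b', s2);
        dfa.addTransition(s2, 'b', s2);

        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("a", "aa", "aaz", "ab", "abb", "abc", "b", "cab"));
        System.out.println(matchingTerms(terms, dfa)); // prints [ab, abb]
    }
}
```

In this run "aa" rejects at its second character, so the loop seeks straight to "ab" and never looks at "aaz"; after "abc" rejects there is no larger string the DFA could accept, so "b" and "cab" are never visited either. That skipping is the reason the approach does not depend on a constant prefix.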
[jira] Commented: (LUCENE-1342) 64bit JVM crashes on Linux
[ https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765180#action_12765180 ] Amit Nithian commented on LUCENE-1342: -- I just encountered this error in our own QA environment. The last 3 days our JVM has been dying around 3AM with this bug and I am running 1.6.0_12. What OS/hardware environments are causing problems? I am running CentOS 5.2 and I'll attach my crash dump too. Has anyone seen any info on the Sun lists about this? I perused the change logs from 13-16 and didn't see anything specific to this unless it was listed as something else. > 64bit JVM crashes on Linux > -- > > Key: LUCENE-1342 > URL: https://issues.apache.org/jira/browse/LUCENE-1342 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 2.0.0 > Environment: 2.6.18-53.el5 x86_64 GNU/Linux > Java(TM) SE Runtime Environment (build 1.6.0_04-b12) >Reporter: Kevin Richards > Attachments: hs_err_pid10565.log, hs_err_pid21301.log, > hs_err_pid27882.log, jvmerror.log > > > Whilst running lucene in our QA environment we received the following > exception. This problem was also reported here : > http://confluence.atlassian.com/display/KB/JSP-20240+-+POSSIBLE+64+bit+JDK+1.6+update+4+may+have+HotSpot+problems. > Is this a JVM problem or a problem in Lucene. 
> # > # An unexpected error has been detected by Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x2adb9e3f, pid=2275, tid=1085356352 > # > # Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed mode linux-amd64) > # Problematic frame: > # V [libjvm.so+0x1fce3f] > # > # If you would like to submit a bug report, please visit: > # http://java.sun.com/webapps/bugreport/crash.jsp > # > --- T H R E A D --- > Current thread (0x2aab0007f000): JavaThread "CompilerThread0" daemon > [_thread_in_vm, id=2301, stack(0x40a13000,0x40b14000)] > siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), > si_addr=0x > Registers: > RAX=0x, RBX=0x2aab0007f000, RCX=0x, > RDX=0x2aab00309aa0 > RSP=0x40b10f60, RBP=0x40b10fb0, RSI=0x2aaab37d1ce8, > RDI=0x2aaad000 > R8 =0x2b40cd88, R9 =0x0ffc, R10=0x2b40cd90, > R11=0x2b410810 > R12=0x2aab00ae60b0, R13=0x2aab0a19cc30, R14=0x40b112f0, > R15=0x2aab00ae60b0 > RIP=0x2adb9e3f, EFL=0x00010246, CSGSFS=0x0033, > ERR=0x0004 > TRAPNO=0x000e > Top of Stack: (sp=0x40b10f60) > 0x40b10f60: 2aab0007f000 > 0x40b10f70: 2aab0a19cc30 0001 > 0x40b10f80: 2aab0007f000 > 0x40b10f90: 40b10fe0 2aab0a19cc30 > 0x40b10fa0: 2aab0a19cc30 2aab00ae60b0 > 0x40b10fb0: 40b10fe0 2ae9c2e4 > 0x40b10fc0: 2b413210 2b413350 > 0x40b10fd0: 40b112f0 2aab09796260 > 0x40b10fe0: 40b110e0 2ae9d7d8 > 0x40b10ff0: 2b40f3d0 2aab08c2a4c8 > 0x40b11000: 40b11940 2aab09796260 > 0x40b11010: 2aab09795b28 > 0x40b11020: 2aab08c2a4c8 2aab009b9750 > 0x40b11030: 2aab09796260 40b11940 > 0x40b11040: 2b40f3d0 2023 > 0x40b11050: 40b11940 2aab09796260 > 0x40b11060: 40b11090 2b0f199e > 0x40b11070: 40b11978 2aab08c2a458 > 0x40b11080: 2b413210 2023 > 0x40b11090: 40b110e0 2b0f1fcf > 0x40b110a0: 2023 2aab09796260 > 0x40b110b0: 2aab08c2a3c8 40b123b0 > 0x40b110c0: 2aab08c2a458 40b112f0 > 0x40b110d0: 2b40f3d0 2aab00043670 > 0x40b110e0: 40b11160 2b0e808d > 0x40b110f0: 2aab000417c0 2aab009b66a8 > 0x40b11100: 2aab009b9750 > 0x40b0: 40b112f0 2aab009bb360 > 0x40b11120: 0003 40b113d0 > 0x40b11130: 
01002aab0052d0c0 40b113d0 > 0x40b11140: 00b3 40b112f0 > 0x40b11150: 40b113d0 2aab08c2a108 > Instructions: (pc=0x2adb9e3f) > 0x2adb9e2f: 48 89 5d b0 49 8b 55 08 49 8b 4c 24 08 48 8b 32 > 0x2adb9e3f: 4c 8b 21 8b 4e 1c 49 8d 7c 24 10 89 cb 4a 39 34 > Stack: [0x40a13000,0x40b14000], sp=0x40b10f60, free > space=1015k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native >
[jira] Updated: (LUCENE-1342) 64bit JVM crashes on Linux
[ https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amit Nithian updated LUCENE-1342: - Attachment: hs_err_pid13693.log > 64bit JVM crashes on Linux > -- > > Key: LUCENE-1342 > URL: https://issues.apache.org/jira/browse/LUCENE-1342 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 2.0.0 > Environment: 2.6.18-53.el5 x86_64 GNU/Linux > Java(TM) SE Runtime Environment (build 1.6.0_04-b12) >Reporter: Kevin Richards > Attachments: hs_err_pid10565.log, hs_err_pid13693.log, > hs_err_pid21301.log, hs_err_pid27882.log, jvmerror.log > > > Whilst running lucene in our QA environment we received the following > exception. This problem was also reported here : > http://confluence.atlassian.com/display/KB/JSP-20240+-+POSSIBLE+64+bit+JDK+1.6+update+4+may+have+HotSpot+problems. > Is this a JVM problem or a problem in Lucene. > # > # An unexpected error has been detected by Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x2adb9e3f, pid=2275, tid=1085356352 > # > # Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed mode linux-amd64) > # Problematic frame: > # V [libjvm.so+0x1fce3f] > # > # If you would like to submit a bug report, please visit: > # http://java.sun.com/webapps/bugreport/crash.jsp > # > --- T H R E A D --- > Current thread (0x2aab0007f000): JavaThread "CompilerThread0" daemon > [_thread_in_vm, id=2301, stack(0x40a13000,0x40b14000)] > siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), > si_addr=0x > Registers: > RAX=0x, RBX=0x2aab0007f000, RCX=0x, > RDX=0x2aab00309aa0 > RSP=0x40b10f60, RBP=0x40b10fb0, RSI=0x2aaab37d1ce8, > RDI=0x2aaad000 > R8 =0x2b40cd88, R9 =0x0ffc, R10=0x2b40cd90, > R11=0x2b410810 > R12=0x2aab00ae60b0, R13=0x2aab0a19cc30, R14=0x40b112f0, > R15=0x2aab00ae60b0 > RIP=0x2adb9e3f, EFL=0x00010246, CSGSFS=0x0033, > ERR=0x0004 > TRAPNO=0x000e > Top of Stack: (sp=0x40b10f60) > 0x40b10f60: 2aab0007f000 > 0x40b10f70: 2aab0a19cc30 0001 > 0x40b10f80: 
2aab0007f000 > 0x40b10f90: 40b10fe0 2aab0a19cc30 > 0x40b10fa0: 2aab0a19cc30 2aab00ae60b0 > 0x40b10fb0: 40b10fe0 2ae9c2e4 > 0x40b10fc0: 2b413210 2b413350 > 0x40b10fd0: 40b112f0 2aab09796260 > 0x40b10fe0: 40b110e0 2ae9d7d8 > 0x40b10ff0: 2b40f3d0 2aab08c2a4c8 > 0x40b11000: 40b11940 2aab09796260 > 0x40b11010: 2aab09795b28 > 0x40b11020: 2aab08c2a4c8 2aab009b9750 > 0x40b11030: 2aab09796260 40b11940 > 0x40b11040: 2b40f3d0 2023 > 0x40b11050: 40b11940 2aab09796260 > 0x40b11060: 40b11090 2b0f199e > 0x40b11070: 40b11978 2aab08c2a458 > 0x40b11080: 2b413210 2023 > 0x40b11090: 40b110e0 2b0f1fcf > 0x40b110a0: 2023 2aab09796260 > 0x40b110b0: 2aab08c2a3c8 40b123b0 > 0x40b110c0: 2aab08c2a458 40b112f0 > 0x40b110d0: 2b40f3d0 2aab00043670 > 0x40b110e0: 40b11160 2b0e808d > 0x40b110f0: 2aab000417c0 2aab009b66a8 > 0x40b11100: 2aab009b9750 > 0x40b0: 40b112f0 2aab009bb360 > 0x40b11120: 0003 40b113d0 > 0x40b11130: 01002aab0052d0c0 40b113d0 > 0x40b11140: 00b3 40b112f0 > 0x40b11150: 40b113d0 2aab08c2a108 > Instructions: (pc=0x2adb9e3f) > 0x2adb9e2f: 48 89 5d b0 49 8b 55 08 49 8b 4c 24 08 48 8b 32 > 0x2adb9e3f: 4c 8b 21 8b 4e 1c 49 8d 7c 24 10 89 cb 4a 39 34 > Stack: [0x40a13000,0x40b14000], sp=0x40b10f60, free > space=1015k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > V [libjvm.so+0x1fce3f] > V [libjvm.so+0x2df2e4] > V [libjvm.so+0x2e07d8] > V [libjvm.so+0x52b08d] > V [libjvm.so+0x524914] > V [libjvm.so+0x51c0ea] > V [libjvm.so+0x519f77] > V [libjvm.so+0x519e7c] > V [libjvm.so+0x519ad5] > V [libjvm.so+0x1e0cf4] > V [libjvm.so+0x2a0bc0] > V [libjvm.so+0x528e03] > V [libjvm.so+0x51c0ea] > V [libjvm.so+0x519f77] > V [libjvm.so+0x519e7c] > V [libjvm.so+0x519ad5] > V [libjvm.s
[jira] Resolved: (LUCENE-944) Remove deprecated methods in BooleanQuery
[ https://issues.apache.org/jira/browse/LUCENE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch resolved LUCENE-944. -- Resolution: Fixed Committed revision 824870.
[jira] Updated: (LUCENE-1756) contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test
[ https://issues.apache.org/jira/browse/LUCENE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-1756:
--------------------------------

    Lucene Fields: [New, Patch Available]  (was: [New])
    Fix Version/s: 3.0
         Assignee: Robert Muir

Assigning this one to myself. If there aren't any objections to the fix, I would like to commit it soon.

> contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-1756
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1756
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/*
>            Reporter: Hoss Man
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: LUCENE-1756.patch
>
>
> While working on something else I started getting consistent
> IllegalStateExceptions from PatternAnalyzerTest -- but only when running the
> test from the top level.
> Digging into the test, I've found numerous things that are very scary...
> * instead of using assertions to test that token streams match, it throws an
> IllegalStateException when they don't, and then logs a bunch of info about
> the token streams to System.out -- having assertion messages that tell you
> *exactly* what doesn't match would make a lot more sense.
> * it builds up a list of files to analyze using paths that it evaluates
> relative to the current working directory -- which means you get different
> files depending on whether you run the tests from the contrib level, or from
> the top level build file
> * the list of files it looks for includes: "../../*.txt", "../../*.html",
> "../../*.xml" ... so not only do you get different results when you run the
> tests in the contrib vs at the top level, but different people running the
> tests via the top level build file will get different results depending on
> what types of text, html, and xml files they happen to have two directories
> above where they checked out lucene.
> * the test comments indicate that its purpose is to show that
> PatternAnalyzer produces the same tokens as other analyzers - but point out
> this will fail for WhitespaceAnalyzer because of the 255 character token
> limit WhitespaceTokenizer imposes -- the test then proceeds to compare
> PatternAnalyzer to WhitespaceTokenizer, guaranteeing a test failure for anyone
> who happens to have a text file containing more than 255 characters of
> non-whitespace in a row somewhere in "../../" (in my case: my bookmarks.html
> file, and the hex encoded favicon.gif images)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
Draft for java-user mail about backwards-compatibility policy changes
Hi all,

I wrote a draft for a mail I'd like to send to java-user to get some feedback about the proposed changes to our backwards-compatibility policy we discussed here and on LUCENE-1698. Let me know what you think please!

Michael


Hello Lucene users:

In the past we have discussed our backwards-compatibility policy frequently on the Lucene developer mailing list and we are very tempted to make some significant changes. In this mail I'd like to outline the proposed changes to get some feedback from the user community.

Our current backwards-compatibility policy regarding API changes states that we can only make changes that break backwards-compatibility in major releases (3.0, 4.0, etc.); the next major release is the upcoming 3.0.

Given how often we made major releases in the past in Lucene this means that deprecated APIs need to stay in Lucene for a very long time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0 before we can remove it. This means that the code gets very cluttered and adding new features gets somewhat more difficult, as attention has to be paid to properly support the old *and* new APIs for quite a long time.

The current policy also leads to delaying the last minor release before a major release (e.g. 2.9), because the developers consider it the last chance for a long time to introduce new APIs and deprecate old ones.

The proposal now is to change this policy so that an API can only be removed if it was deprecated in at least one release, which can be a major *or* minor release. E.g. if we deprecate an API and release it with 3.1, we can remove it with the 3.2 release.

For users this means of course that a simple jar drop-in replacement won't be possible anymore with almost every Lucene release (excluding bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if you're using a non-deprecated API it will be in the next release.

Note that of course these proposed changes do not affect backwards-compatibility with old index formats. I.e. it will still be possible to read all 3.X indexes with any Lucene 4.X version.

Our main goal is to find the right balance between backwards-compatibility support for all the Lucene users out there and fast and productive development of new features.

If we get positive feedback here we will call a vote on the development mailing list where the committers have to officially decide whether to make these changes or not. Note that in any case the changes will take effect *after* the 3.0 release.

On behalf of the Lucene developers,
Michael Busch
[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765204#action_12765204 ] Mark Miller commented on LUCENE-1458: - Looks pretty simple - the field is not getting set with LegacyFieldsEnum.
Re: Draft for java-user mail about backwards-compatibility policy changes
Looks good!

Mike

On Tue, Oct 13, 2009 at 3:07 PM, Michael Busch wrote:
> Hi all,
>
> I wrote a draft for a mail I'd like to send to java-user to get some
> feedback about the proposed changes to our backwards-compatibility policy we
> discussed here and on LUCENE-1698.
> Let me know what you think please!
>
> Michael
>
>
> Hello Lucene users:
>
> In the past we have discussed our backwards-compatibility policy
> frequently on the Lucene developer mailing list and we are very tempted
> to make some significant changes. In this mail I'd like to outline the
> proposed changes to get some feedback from the user community.
>
> Our current backwards-compatibility policy regarding API changes
> states that we can only make changes that break
> backwards-compatibility in major releases (3.0, 4.0, etc.); the next
> major release is the upcoming 3.0.
>
> Given how often we made major releases in the past in Lucene this
> means that deprecated APIs need to stay in Lucene for a very long
> time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0
> before we can remove it. This means that the code gets very cluttered
> and adding new features gets somewhat more difficult, as attention has
> to be paid to properly support the old *and* new APIs for quite a long
> time.
>
> The current policy also leads to delaying a last minor release before
> a major release (e.g. 2.9), because the developers consider it the
> last chance for a long time to introduce new APIs and deprecate old ones.
>
> The proposal now is to change this policy so that an API can
> only be removed if it was deprecated in at least one release, which
> can be a major *or* minor release. E.g. if we deprecate an API and
> release it with 3.1, we can remove it with the 3.2 release.
>
> For users this means of course that a simple jar drop-in replacement
> won't be possible anymore with almost every Lucene release (excluding
> bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if
> you're using a non-deprecated API it will be in the next release.
>
> Note that of course these proposed changes do not affect
> backwards-compatibility with old index formats. I.e. it will still be
> possible to read all 3.X indexes with any Lucene 4.X version.
>
> Our main goal is to find the right balance between
> backwards-compatibility support for all the Lucene users out there and
> fast and productive development of new features. If we get positive
> feedback here we will call a vote on the development mailing list where
> the committers have to officially decide whether to make these changes or
> not.
>
> Note that in any case the changes will take effect *after* the 3.0
> release.
>
> On behalf of the Lucene developers,
> Michael Busch
Re: Draft for java-user mail about backwards-compatibility policy changes
I think it should be clearer that the devs have not come to an agreement on this change yet, regardless of the community's input.

Michael McCandless wrote:
> Looks good!
>
> Mike

--
- Mark

http://www.lucidimagination.com
Re: Draft for java-user mail about backwards-compatibility policy changes
For the record - I still don't see what we gain other than confusion. The major numbers don't have any significant meaning in terms of features or advancements.

If we want to remove deprecations faster after deprecating in 4.1, we should just not release 4.2, 4.3, 4.4, 4.5, and then 4.9. We should go from 4.1 to 4.9, or 4.1, 4.2, then 4.9. We have always just chosen how long we were stuck with stuff by how fast we decided to skip the dots.

Mark Miller wrote:
> I think it should be clearer that the devs have not come to an
> agreement on this change yet, regardless of the community's input.
>
> Michael McCandless wrote:
>> Looks good!
>>
>> Mike

--
- Mark

http://www.lucidimagination.com
Re: Draft for java-user mail about backwards-compatibility policy changes
I think I'm against sending such a request for feedback - and I think we already know what the results will be.

The email reads like "we want to do this, OK?" - and the beneficiaries of what is a volunteer effort are likely to respond overwhelmingly "OK!". One could take the reverse position and probably get just as many positive responses.

Devs should decide, and if feedback is needed to help that, a neutral way of asking should be used.

-Yonik
http://www.lucidimagination.com

On Tue, Oct 13, 2009 at 3:07 PM, Michael Busch wrote:
> Hi all,
>
> I wrote a draft for a mail I'd like to send to java-user to get some
> feedback about the proposed changes to our backwards-compatibility policy we
> discussed here and on LUCENE-1698.
> Let me know what you think please!
>
> Michael
Re: Draft for java-user mail about backwards-compatibility policy changes
On 10/13/09 1:11 PM, Mark Miller wrote:
> I think it should be clearer that the devs have not come to an
> agreement on this change yet, regardless of the community's input.

OK I made a few changes near the end to make that clearer. How's it now?

Draft:

Hello Lucene users:

In the past we have discussed our backwards-compatibility policy
frequently on the Lucene developer mailing list and we are very tempted
to make some significant changes. In this mail I'd like to outline the
proposed changes to get some feedback from the user community.

Our current backwards-compatibility policy regarding API changes
states that we can only make changes that break
backwards-compatibility in major releases (3.0, 4.0, etc.); the next
major release is the upcoming 3.0.

Given how often we made major releases in the past in Lucene this
means that deprecated APIs need to stay in Lucene for a very long
time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0
before we can remove it. This means that the code gets very cluttered
and adding new features gets somewhat more difficult, as attention has
to be paid to properly support the old *and* new APIs for quite a long
time.

The current policy also leads to delaying a last minor release before
a major release (e.g. 2.9), because the developers consider it the
last chance for a long time to introduce new APIs and deprecate old ones.

The proposal now is to change this policy so that an API can
only be removed if it was deprecated in at least one release, which
can be a major *or* minor release. E.g. if we deprecate an API and
release it with 3.1, we can remove it with the 3.2 release.

For users this means of course that a simple jar drop-in replacement
won't be possible anymore with almost every Lucene release (excluding
bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if
you're using a non-deprecated API it will be in the next release.

Note that of course these proposed changes do not affect
backwards-compatibility with old index formats. I.e. it will still be
possible to read all 3.X indexes with any Lucene 4.X version.

Our main goal is to find the right balance between
backwards-compatibility support for all the Lucene users out there and
fast and productive development of new features. The developers haven't
come to an agreement on this proposal yet, hence we'd like to ask the
user community for feedback to help us make a decision. After we have
gathered some feedback here we will call a vote on the development
mailing list where the committers have to officially decide whether to
make these changes or not.

Note that in any case the changes will take effect *after* the 3.0
release.

On behalf of the Lucene developers,
Michael Busch
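The difference between the current and the proposed removal rule can be made concrete with a small sketch. It is only a model of the two rules as the draft words them: the class `RemovalPolicySketch`, the enum names, and the release list are hypothetical, and the list is assumed to contain feature releases only (no bugfix releases such as 2.9.1).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical model of the two deprecation-removal rules discussed in the draft. */
final class RemovalPolicySketch {
    enum Policy {
        MAJOR_RELEASES_ONLY,   // current rule: removal only in a later major release
        AFTER_ONE_RELEASE      // proposed rule: removal in any release after the deprecating one
    }

    /** Major number of a "major.minor" version string such as "3.1". */
    static int major(String version) {
        return Integer.parseInt(version.substring(0, version.indexOf('.')));
    }

    /** Earliest release in the ordered list that may remove an API deprecated in {@code deprecatedIn}. */
    static Optional<String> earliestRemoval(List<String> releases, String deprecatedIn, Policy policy) {
        int start = releases.indexOf(deprecatedIn);
        if (start < 0) {
            throw new IllegalArgumentException("unknown release: " + deprecatedIn);
        }
        for (int i = start + 1; i < releases.size(); i++) {
            String candidate = releases.get(i);
            if (policy == Policy.AFTER_ONE_RELEASE || major(candidate) > major(deprecatedIn)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();   // no later release qualifies yet
    }

    public static void main(String[] args) {
        List<String> releases = Arrays.asList("3.0", "3.1", "3.2", "3.3", "4.0");
        System.out.println(earliestRemoval(releases, "3.1", Policy.MAJOR_RELEASES_ONLY));
        System.out.println(earliestRemoval(releases, "3.1", Policy.AFTER_ONE_RELEASE));
    }
}
```

With this assumed release list, an API deprecated in 3.1 could be removed in 4.0 under the current rule and in 3.2 under the proposed one, matching the example in the draft.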
Re: Draft for java-user mail about backwards-compatibility policy changes
On 10/13/09 1:18 PM, Yonik Seeley wrote:
> I think I'm against sending such a request for feedback - and I think
> we already know what the results will be.

I've mentioned it several times on java-dev and LUCENE-1698 that I'd like to ask the user community and nobody objected.

> The email reads like "we want to do this, OK?" - and the beneficiaries
> of what is a volunteer effort are likely to respond overwhelmingly
> "OK!". One could take the reverse position and probably get just as
> many positive responses.
>
> Devs should decide, and if feedback is needed to help that, a neutral
> way of asking should be used.

Do you want to draft a new mail?

Michael

> -Yonik
> http://www.lucidimagination.com
Re: Draft for java-user mail about backwards-compatibility policy changes
On Tue, 13 Oct 2009, Mark Miller wrote:
> For the record - I still don't see what we gain but confusion. The major
> numbers don't have any significant meaning in terms of features or
> advancements.

That's a perception we don't have control over. A release incrementing the major release number is considered major whether we like to think so or not. If that release only contains backwards compatibility breaking changes and nothing much else to talk about, it is not that major and is likely to cause disappointment.

Andi..

> If we want to remove deprecations faster after deprecating in 4.1, we
> should just not release 4.2,4.3,4.4,4.5, and then 4.9. We should go from
> 4.1 to 4.9, or 4.1,4.2, then 4.9. We have always just chosen how long we
> were stuck with stuff by how fast we decided to skip the dots.
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
On 10/13/09 9:43 AM, Michael Busch wrote:

Shall we first remove the remaining deprecations from the indexer package? There are not many more left, shouldn't be much work.

I wasn't quick enough for you :) Working on LUCENE-1979 now - that will be the first test on how good svn merge is!

Michael

Michael

On 10/13/09 5:47 AM, Michael McCandless wrote:

OK I will cut a branch & commit Mark's last patch onto it, unless anyone has objections soonish... I'll also branch (twig?) the back compat branch so we can commit the patch there as well.

Mike

On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote:

SVN is about as good at merging branches as any of us are with a patch and trunk unfortunately. But that can still be somewhat more convenient than all these huge patches, with different people at different stages. Depends on how many people end up working on this though. Any more than 2, and I think the branch has got to be worth it. From my perspective, it doesn't make any of the merging process any easier - but it can be easier than juggling all these patches - you have a central code base that can always be targeted for current merging.

Michael Busch wrote:

I think it's supposed to work pretty good - though I have no personal experience with merging branches with svn. I think we should try it - then we'll know! :)

Michael

On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:

[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 ]

Michael McCandless commented on LUCENE-1458:

bq. Shall we create a flexible-indexing branch and commit this?

I think this is a good idea. But I haven't played heavily w/ svn & branching. EG if we branch now, and trunk moves fast (which it still is w/ deprecation removals), are we going to have conflicts? Or... is svn good about merging branches?
Re: Draft for java-user mail about backwards-compatibility policy changes
On Tue, Oct 13, 2009 at 4:25 PM, Michael Busch wrote:
> I've mentioned it several times on java-dev and LUCENE-1698 that I'd like to
> ask the user community and nobody objected.

It's the old polling problem - how you ask influences the outcome (as I said below), and you didn't say exactly how you were going to ask before.

>> The email reads like "we want to do this, OK?" - and the beneficiaries
>> of what is a volunteer effort are likely to respond overwhelmingly
>> "OK!". One could take the reverse position and probably get just as
>> many positive responses.
>>
>> Devs should decide, and if feedback is needed to help that, a neutral
>> way of asking should be used.
>>
>
> Do you want to draft a new mail?

Only if I was sure I wanted feedback :-)

Which do you prefer as a back compatibility policy for Lucene:

A) best effort drop-in back compatibility for minor version numbers
   (e.g. v3.5 will be compatible with v3.2)

B) best effort drop-in back compatibility for the next minor version
   number only, and deprecations may be removed after one minor release
   (e.g. v3.3 will be compatible with v3.2, but v3.4 may not be)

In either case forward index format compatibility would be maintained for an entire major version and the previous (e.g. v3.5 would be able to read an index written by v2.2)

-Yonik
http://www.lucidimagination.com
[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765234#action_12765234 ]

Michael McCandless commented on LUCENE-1458:

OK I think I've committed Mark's last patch onto this branch:

  https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458

and I also branched the 2.9 back-compat branch and committed the last back compat patch:

  https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458_2_9_back_compat_tests

Mark can you check it out & see if I missed anything?

> Further steps towards flexible indexing
> ---------------------------------------
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Index
> Affects Versions: 2.9
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2,
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback. All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
> * Switches to a new more efficient terms dict format. This still
>   uses tii/tis files, but the tii only stores term & long offset
>   (not a TermInfo). At seek points, tis encodes term & freq/prox
>   offsets absolutely instead of with deltas. Also, tis/tii
>   are structured by field, so we don't have to record field number
>   in every term.
>   .
>   On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
>   -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
>   .
>   RAM usage when loading terms dict index is significantly less
>   since we only load an array of offsets and an array of String (no
>   more TermInfo array). It should be faster to init too.
>   .
>   This part is basically done.
> * Introduces modular reader codec that strongly decouples terms dict
>   from docs/positions readers. EG there is no more TermInfo used
>   when reading the new format.
>   .
>   There's nice symmetry now between reading & writing in the codec
>   chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
>   This part is basically done.
> * Introduces a new "flex" API for iterating through the fields,
>   terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
>   This replaces TermEnum/Docs/Positions. SegmentReader emulates the
>   old API on top of the new API to keep back-compat.
>
> Next steps:
> * Plug in new codecs (pulsing, pfor) to exercise the modularity /
>   fix any hidden assumptions.
> * Expose new API out of IndexReader, deprecate old API but emulate
>   old API on top of new one, switch all core/contrib users to the
>   new API.
> * Maybe switch to AttributeSources as the base class for TermsEnum,
>   DocsEnum, PostingsEnum -- this would give readers API flexibility
>   (not just index-file-format flexibility). EG if someone wanted
>   to store payload at the term-doc level instead of
>   term-doc-position level, you could just add a new attribute.
> * Test performance & iterate.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
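The "flex" enumeration chain named in the issue description (FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum) can be pictured as four nested iterators. The sketch below is not the API on the flex_1458 branch: only the four type names come from the description, while every method name, the end-of-iteration conventions and the in-memory backing maps are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified sketch. Only the four type names come from the
// issue description; every method name and signature here is an assumption.
final class FlexChainSketch {
    interface PostingsEnum { int nextPosition(); }                      // -1 when exhausted
    interface DocsEnum { int nextDoc(); PostingsEnum positions(); }     // -1 when exhausted
    interface TermsEnum { String nextTerm(); DocsEnum docs(); }         // null when exhausted
    interface FieldProducer { String nextField(); TermsEnum terms(); }  // null when exhausted

    static PostingsEnum positionsOver(final int[] positions) {
        return new PostingsEnum() {
            private int i = 0;
            public int nextPosition() { return i < positions.length ? positions[i++] : -1; }
        };
    }

    static DocsEnum docsOver(TreeMap<Integer, int[]> docs) {
        final Iterator<Map.Entry<Integer, int[]>> it = docs.entrySet().iterator();
        return new DocsEnum() {
            private int[] current = new int[0];
            public int nextDoc() {
                if (!it.hasNext()) return -1;
                Map.Entry<Integer, int[]> e = it.next();
                current = e.getValue();
                return e.getKey();
            }
            public PostingsEnum positions() { return positionsOver(current); }
        };
    }

    static TermsEnum termsOver(TreeMap<String, TreeMap<Integer, int[]>> terms) {
        final Iterator<Map.Entry<String, TreeMap<Integer, int[]>>> it = terms.entrySet().iterator();
        return new TermsEnum() {
            private TreeMap<Integer, int[]> current = new TreeMap<>();
            public String nextTerm() {
                if (!it.hasNext()) return null;
                Map.Entry<String, TreeMap<Integer, int[]>> e = it.next();
                current = e.getValue();
                return e.getKey();
            }
            public DocsEnum docs() { return docsOver(current); }
        };
    }

    static FieldProducer fieldsOver(TreeMap<String, TreeMap<String, TreeMap<Integer, int[]>>> index) {
        final Iterator<Map.Entry<String, TreeMap<String, TreeMap<Integer, int[]>>>> it =
                index.entrySet().iterator();
        return new FieldProducer() {
            private TreeMap<String, TreeMap<Integer, int[]>> current = new TreeMap<>();
            public String nextField() {
                if (!it.hasNext()) return null;
                Map.Entry<String, TreeMap<String, TreeMap<Integer, int[]>>> e = it.next();
                current = e.getValue();
                return e.getKey();
            }
            public TermsEnum terms() { return termsOver(current); }
        };
    }

    /** Walks fields -> terms -> docs -> positions and renders one line per (term, doc). */
    static List<String> walk(FieldProducer fields) {
        List<String> out = new ArrayList<>();
        for (String field = fields.nextField(); field != null; field = fields.nextField()) {
            TermsEnum terms = fields.terms();
            for (String term = terms.nextTerm(); term != null; term = terms.nextTerm()) {
                DocsEnum docs = terms.docs();
                for (int doc = docs.nextDoc(); doc != -1; doc = docs.nextDoc()) {
                    PostingsEnum positions = docs.positions();
                    StringBuilder sb = new StringBuilder(field + ":" + term + " doc=" + doc + " pos=");
                    for (int p = positions.nextPosition(); p != -1; p = positions.nextPosition()) {
                        sb.append(p).append(' ');
                    }
                    out.add(sb.toString().trim());
                }
            }
        }
        return out;
    }

    /** Tiny in-memory index: field -> term -> doc id -> positions. */
    static TreeMap<String, TreeMap<String, TreeMap<Integer, int[]>>> sample() {
        TreeMap<Integer, int[]> lucene = new TreeMap<>();
        lucene.put(1, new int[] {0, 4});
        lucene.put(3, new int[] {2});
        TreeMap<Integer, int[]> flex = new TreeMap<>();
        flex.put(2, new int[] {1});
        TreeMap<String, TreeMap<Integer, int[]>> body = new TreeMap<>();
        body.put("lucene", lucene);
        TreeMap<String, TreeMap<Integer, int[]>> title = new TreeMap<>();
        title.put("flex", flex);
        TreeMap<String, TreeMap<String, TreeMap<Integer, int[]>>> index = new TreeMap<>();
        index.put("body", body);
        index.put("title", title);
        return index;
    }
}
```

Calling FlexChainSketch.walk(FlexChainSketch.fieldsOver(FlexChainSketch.sample())) visits the postings field by field. The point of the nesting is that each level only knows about the level directly below it, which is the decoupling the description is after.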
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
Woops sorry I missed that! Yes this'll be our first test :) Mike On Tue, Oct 13, 2009 at 4:58 PM, Michael Busch wrote: > On 10/13/09 9:43 AM, Michael Busch wrote: >> >> Shall we first remove the remaining deprecations from the indexer package? >> There are not many more left, shouldn't be much work. >> > > I wasn't quick enough for you :) Working on LUCENE-1979 now - that will be > the first test on how good svn merge is! > > Michael > >> Michael >> >> On 10/13/09 5:47 AM, Michael McCandless wrote: >>> >>> OK I will cut a branch& commit Mark's last patch onto it, unless >>> anyone has objections soonish... >>> >>> I'll also branch (twig?) the back compat branch so we can commit the >>> patch there as well. >>> >>> Mike >>> >>> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller >>> wrote: SVN is about as good at merging branches as any of us are with a patch and trunk unfortunately. But that can still be somewhat more convenient than all these huge patches, with different people at different stages. Depends on how many people end up working on this though. Any more than 2, and I think the branch has got to be worth it. From my perspective, it doesn't make any of the merging process any easier - but it can be easier than juggling all these patches - you have a central code base that can always be targeted for current merging. Michael Busch wrote: > > I think it's supposed to work pretty good - though I have no personal > experience with merging branches with svn. > > I think we should try it - then we'll know! :) > > Michael > > On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote: >> >> [ >> >> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 >> ] >> >> Michael McCandless commented on LUCENE-1458: >> >> >> bq. Shall we create a flexible-indexing branch and commit this? >> >> I think this is a good idea. >> >> But I haven't played heavily w/ svn& branching. 
EG if we branch >> now, and trunk moves fast (which it still is w/ deprecation >> removals), are we going to have conflicts? Or... is svn good about >> merging branches?
[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
[ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765237#action_12765237 ] Uwe Schindler commented on LUCENE-1458: By the way, a lot of these PriorityQueues can be generified like in trunk to remove the unneeded casts in lessThan, pop, insert,... everywhere.
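To make the generification point concrete, here is a minimal sketch of a queue whose element type is a type parameter, so that lessThan and pop need no casts. SketchPriorityQueue, TermFreq and TermFreqQueue are hypothetical names for illustration; this is not Lucene's own PriorityQueue class, only the same idea reduced to a small binary heap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic min-heap; illustrates the idea, not Lucene's implementation.
abstract class SketchPriorityQueue<T> {
    private final List<T> heap = new ArrayList<T>();

    /** Subclasses compare typed elements directly; no cast from Object is needed. */
    protected abstract boolean lessThan(T a, T b);

    final void add(T element) {
        heap.add(element);
        int i = heap.size() - 1;
        while (i > 0) {                       // sift the new element up
            int parent = (i - 1) / 2;
            if (!lessThan(heap.get(i), heap.get(parent))) break;
            swap(i, parent);
            i = parent;
        }
    }

    /** Removes and returns the least element, or null if the queue is empty. */
    final T pop() {
        if (heap.isEmpty()) return null;
        T top = heap.get(0);
        T last = heap.remove(heap.size() - 1);
        if (!heap.isEmpty()) {
            heap.set(0, last);
            int i = 0;
            while (true) {                    // sift the moved element down
                int left = 2 * i + 1;
                int right = left + 1;
                int smallest = i;
                if (left < heap.size() && lessThan(heap.get(left), heap.get(smallest))) smallest = left;
                if (right < heap.size() && lessThan(heap.get(right), heap.get(smallest))) smallest = right;
                if (smallest == i) break;
                swap(i, smallest);
                i = smallest;
            }
        }
        return top;
    }

    private void swap(int a, int b) {
        T tmp = heap.get(a);
        heap.set(a, heap.get(b));
        heap.set(b, tmp);
    }
}

// Hypothetical element type for the example.
final class TermFreq {
    final String term;
    final int freq;
    TermFreq(String term, int freq) { this.term = term; this.freq = freq; }
}

final class TermFreqQueue extends SketchPriorityQueue<TermFreq> {
    @Override
    protected boolean lessThan(TermFreq a, TermFreq b) {
        return a.freq < b.freq;               // typed access, no (TermFreq) cast
    }
}
```

With a raw Object-based queue, lessThan(Object, Object) would have to cast both arguments and every caller of pop() would have to cast the result; with the type parameter the compiler checks both.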
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
No problem! I'm excited about the new branch! Have to try to write some codecs now... Michael On 10/13/09 2:09 PM, Michael McCandless wrote: Woops sorry I missed that! Yes this'll be our first test :) Mike
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
I've added missing enums classes, but everything else is looking good so far. Michael McCandless (JIRA) wrote: > [ > https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765234#action_12765234 > ] > > Michael McCandless commented on LUCENE-1458: > > > OK I think I've committed Mark's last patch onto this branch: > > https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458 > > and I also branched the 2.9 back-compat branch and committed the last back > compat patch: > > > https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458_2_9_back_compat_tests > > Mark can you check it out & see if I missed anything? -- - Mark http://www.lucidimagination.com --
[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765246#action_12765246 ]

Michael McCandless commented on LUCENE-1969:

Patch looks good!

How do I run the tests? When I cd to contrib/kamikaze and run "ant test" I get this output:

{code}
download-ivy:
     [echo] installing ivy...
      [get] Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.0.0-beta1/ivy-2.0.0-beta1.jar
      [get] To: /lucene/kami.1969/contrib/kamikaze/ivy/ivy.jar
      [get] Not modified - so not downloaded

install-ivy:

resolve:
No ivy:settings found for the default reference 'ivy.instance'. A default instance will be used
no settings file found, using default...
[ivy:retrieve] :: Ivy 2.0.0-beta1 - 20071206070608 :: http://ant.apache.org/ivy/ ::
:: loading settings :: url = jar:file:/lucene/kami.1969/contrib/kamikaze/ivy/ivy.jar!/org/apache/ivy/core/settings/ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: com.kamikaze#kamikaze;work...@rhumba
[ivy:retrieve]  confs: [master, test]
[ivy:retrieve]  found log4j#log4j;1.2.15 in public
[ivy:retrieve]  found org.apache.lucene#lucene-core;2.9.0 in public
[ivy:retrieve]  found junit#junit;4.5 in public
[ivy:retrieve] :: resolution report :: resolve 277ms :: artifacts dl 9ms
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      master      |   2   |   0   |   0   |   0   ||   2   |   0   |
        |       test       |   1   |   0   |   0   |   0   ||   1   |   0   |
        ---------------------------------------------------------------------
[ivy:retrieve] :: retrieving :: com.kamikaze#kamikaze
[ivy:retrieve]  confs: [master, test]
[ivy:retrieve]  0 artifacts copied, 3 already retrieved (0kB/10ms)

init:

compile:

compile-test:

test:

BUILD FAILED
/lucene/kami.1969/contrib/kamikaze/build.xml:88: Test com.kamikaze.test.TestDocIdSetSuite failed

Total time: 2 seconds
{code}

> adding kamikaze to lucene contrib
> ---------------------------------
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/*
> Affects Versions: 2.9
> Reporter: John Wang
> Attachments: kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing
Excellent, thanks! Mike On Tue, Oct 13, 2009 at 5:32 PM, Mark Miller wrote: > I've added missing enums classes, but everything else is looking good so > far.
[jira] Resolved: (LUCENE-1981) Allow access to entries in the field cache
[ https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved LUCENE-1981. -- Resolution: Invalid > Allow access to entries in the field cache > -- > > Key: LUCENE-1981 > URL: https://issues.apache.org/jira/browse/LUCENE-1981 > Project: Lucene - Java > Issue Type: New Feature > Components: Search >Affects Versions: 2.9 >Reporter: Tom Hill >Priority: Minor > Attachments: lucene-1981.patch > > > If the data required is already in the field cache, it seems unnecessary to > go to the disk for it, if the data is already in RAM. > We have a case where we need one field from a large number (500 -1000) of > scattered documents in a fairly large index (50-100m docs), and seek time to > collect the data from disk is prohibitive, so we'd like to grab the data from > the cache, instead. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Assigned: (LUCENE-1937) Add more methods to manipulate QueryNodeProcessorPipeline elements
[ https://issues.apache.org/jira/browse/LUCENE-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani reassigned LUCENE-1937: Assignee: (was: Adriano Crestani) > Add more methods to manipulate QueryNodeProcessorPipeline elements > -- > > Key: LUCENE-1937 > URL: https://issues.apache.org/jira/browse/LUCENE-1937 > Project: Lucene - Java > Issue Type: Improvement > Components: contrib/* >Affects Versions: 2.9 >Reporter: Adriano Crestani >Priority: Minor > Fix For: 3.1 > > Attachments: LUCENE-1937.patch, LUCENE-1937_10_13_2009.patch > > > QueryNodeProcessorPipeline allows the user to define a list of processors to > process a query tree. However, it's not very flexible when the user wants to > extend/modify an already created pipeline, because it only provides an add > method, which only allows the user to append a new processor to the pipeline. > So, I propose to add new methods to manipulate the processors in a pipeline. I > think the methods should not take an index position when modifying the > pipeline, because the index position in a pipeline does not mean anything; a > processor only has meaning when it's after or before another processor. > Therefore, I suggest the methods should always refer to another processor > when inserting into or modifying the pipeline. For example, insertAfter(processor, > newProcessor), which will insert the "newProcessor" after the "processor". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
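The relative-insertion idea proposed in this issue can be sketched as follows. PipelineSketch is a hypothetical stand-in, not the QueryNodeProcessorPipeline class or the attached patch; insertAfter is the method name suggested in the description, while insertBefore, asList and the boolean return values are assumptions added for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pipeline that positions processors relative to each other
// instead of by numeric index. Not the actual contrib/queryparser class.
final class PipelineSketch<P> {
    private final List<P> processors = new ArrayList<P>();

    /** Appends a processor at the end, like the existing add method described in the issue. */
    void add(P processor) {
        processors.add(processor);
    }

    /** Inserts newProcessor right after anchor; returns false if anchor is not in the pipeline. */
    boolean insertAfter(P anchor, P newProcessor) {
        int i = processors.indexOf(anchor);
        if (i < 0) return false;
        processors.add(i + 1, newProcessor);
        return true;
    }

    /** Inserts newProcessor right before anchor; returns false if anchor is not in the pipeline. */
    boolean insertBefore(P anchor, P newProcessor) {
        int i = processors.indexOf(anchor);
        if (i < 0) return false;
        processors.add(i, newProcessor);
        return true;
    }

    /** Snapshot of the current order, for inspection. */
    List<P> asList() {
        return new ArrayList<P>(processors);
    }
}
```

Callers name the neighbour they care about and never handle an index, so their code keeps working when unrelated processors are added earlier in the pipeline.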
[jira] Updated: (LUCENE-1937) Add more methods to manipulate QueryNodeProcessorPipeline elements
[ https://issues.apache.org/jira/browse/LUCENE-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani updated LUCENE-1937: Attachment: LUCENE-1937_10_13_2009.patch New patch, now QueryNodeProcessorPipeline implements List interface
[jira] Assigned: (LUCENE-1938) Precedence query parser using the contrib/queryparser framework
[ https://issues.apache.org/jira/browse/LUCENE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adriano Crestani reassigned LUCENE-1938: Assignee: (was: Adriano Crestani) > Precedence query parser using the contrib/queryparser framework > --- > > Key: LUCENE-1938 > URL: https://issues.apache.org/jira/browse/LUCENE-1938 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: Adriano Crestani >Priority: Minor > Fix For: 3.1 > > Attachments: LUCENE-1938.patch > > > Extend the current StandardQueryParser on contrib so it supports boolean > precedence -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: Draft for java-user mail about backwards-compatibility policy changes
OK, I made the draft a bit "more neutral" by pointing out the downsides clearer. However, I think we have to explain reasons for and against the change, otherwise people who didn't follow these discussions on java-dev will have no idea why we actually want to make a change at all. I added your sentences near the end. How's it now? Also looking at the 2.9 CHANGES.txt we have a pretty long compat-break section, so the drop-in replacement guarantee isn't really one anymore, which I didn't even mention in this draft. Draft: Hello Lucene users: In the past we have discussed our backwards-compatibility policy frequently on the Lucene developer mailinglist and we are thinking about making some significant changes. In this mail I'd like to outline the proposed changes to get some feedback from the user community. Our current backwards-compatibility policy regarding API changes states that we can only make changes that break backwards-compatibility in major releases (3.0, 4.0, etc.); the next major release is the upcoming 3.0. Given how often we made major releases in the past in Lucene this means that deprecated APIs need to stay in Lucene for a very long time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0 before we can remove it. This means that the code gets very cluttered and adding new features gets somewhat more difficult, as attention has to be paid to properly support the old *and* new APIs for a quite long time. The current policy also leads to delaying a last minor release before a major release (e.g. 2.9), because the developers consider it as the last chance for a long time to introduce new APIs and deprecate old ones. The proposal now is to change this policy in a way, so that an API can only be removed if it was deprecated in at least one release, which can be a major *or* minor release. E.g. if we deprecate an API and release it with 3.1, we can remove it with the 3.2 release. 
The obvious downside of this proposal is that a simple jar drop-in replacement will not be possible anymore with almost every Lucene release (excluding bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if you're using a non-deprecated API it will be in the next release. Note that of course these proposed changes do not affect backwards-compatibility with old index formats. I.e. it will still be possible to read all 3.X indexes with any Lucene 4.X version. Our main goal is to find the right balance between backwards-compatibility support for all the Lucene users out there and fast and productive development of new features. The developers haven't come to an agreement on this proposal yet. Potentially giving up the drop-in replacement promise that Lucene could make in the past is the main reason for the struggle the developers are in and why we'd like to ask the user community for feedback to help us make a decision. After we have gathered some feedback here we will call a vote on the development mailing list where the committers have to officially decide whether to make these changes or not. So please tell us which you prefer as a back compatibility policy for Lucene: A) best effort drop-in back compatibility for minor version numbers (e.g. v3.5 will be compatible with v3.2) B) best effort drop-in back compatibility for the next minor version number only, and deprecations may be removed after one minor release (e.g. v3.3 will be compat with v3.2, but not v3.4) Note that in any case the changes will take effect *after* the 3.0 release. On behalf of the Lucene developers, Michael Busch On 10/13/09 2:05 PM, Yonik Seeley wrote: On Tue, Oct 13, 2009 at 4:25 PM, Michael Busch wrote: I've mentioned it several times on java-dev and LUCENE-1698 that I'd like to ask the user community and nobody objected. It's the old polling problem - how you ask influences the outcome (as I said below), and you didn't say exactly how you were going to ask before. 
The email reads like "we want to do this, OK?" - and the beneficiaries of what is a volunteer effort are likely to respond overwhelmingly "OK!". One could take the reverse position and probably get just as many positive responses. Devs should decide, and if feedback is needed to help that, a neutral way of asking should be used. Do you want to draft a new mail? Only if I was sure I wanted feedback :-) Which do you prefer as a back compatibility policy for Lucene: A) best effort drop-in back compatibility for minor version numbers (e.g. v3.5 will be compatible with v3.2) B) best effort drop-in back compatibility for the next minor version number only, and deprecations may be removed after one minor release (e.g. v3.3 will be compat with v3.2, but not v3.4) In either case forward index format compatibility would be maintained for an entire major version and the previous (e.g. v3.5 would be able to read an index written by v2.2) http://www.lucidimagination.com -Yonik -
[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1974: - Attachment: LUCENE-1974.test.patch This is the same as the previously attached test, but I've simplified it (to me) and revamped it to be a patch that can be applied to 2.9.0. I can confirm that it fails for me (against 2.9.0) and seems to suggest a weird hit collection bug somewhere in the BooleanScorer or Prefix scoring code (a prefix query works, a boolean query containing term queries works, but a boolean query containing a prefix query fails to find all the expected matches). Unless I'm missing something really silly, this suggests a pretty heinous bug somewhere in the core scoring code. > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the latter query can not find the doc=11377 doc ? > the problem can be reproduced by the code in the attachment .
[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1974: - Attachment: LUCENE-1974.test.patch tweaked test so that it can be applied to 2.4.1 (by removing readOnly param from IndexSearcher constructor) verified this test passes against 2.4.1 ... it's a new bug in 2.9.0 > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the later query can not find the doc=11377 doc ? > the problem can be repreduced by the code in the attachment . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
(possible) heinous scoring bug in 2.9.0: LUCENE-1974 ? ? ?
Can someone smarter than me review the patch in LUCENE-1974... https://issues.apache.org/jira/browse/LUCENE-1974 ...on the surface this seems to suggest a pretty serious error somewhere in the low level scoring code when a BooleanQuery is involved. (If this really is a bug, and not just me overlooking some flaw in the test, then it probably warrants an urgent 2.9.1 release) -Hoss
[jira] Assigned: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1974: -- Assignee: Michael McCandless > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the later query can not find the doc=11377 doc ? > the problem can be repreduced by the code in the attachment . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: (possible) heinous scoring bug in 2.9.0: LUCENE-1974 ? ? ?
I'm looking at it... Mike On Tue, Oct 13, 2009 at 7:06 PM, Chris Hostetter wrote: > > Can someone smarter than me review the patch in LUCENE-1974... > > https://issues.apache.org/jira/browse/LUCENE-1974 > > ...on the surface this seems to suggest a pretty serious error somewhere in > the low level scoring code when a BooleanQuery is involved. > > > (If this really is a bug, and not just me overlooking some flaw in the test, > then it probably warrants an urgent 2.9.1 release) > > > > -Hoss
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765299#action_12765299 ] Michael McCandless commented on LUCENE-1974: Hmm... seems to be a bug in BooleanScorer... if you call static BooleanQuery.setAllowDocsOutOfOrder(false) the test passes (so that's a viable workaround it seems). > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the later query can not find the doc=11377 doc ? > the problem can be repreduced by the code in the attachment . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765303#action_12765303 ] Robert Muir commented on LUCENE-1974: -
Hoss man, I played with this a little; maybe this is all obvious though:
* the test passes if you set BooleanQuery.setAllowDocsOutOfOrder(false) [it's BooleanScorer, not BooleanScorer2]
* to simplify things, you can use a ConstantScoreQuery of a single term instead of PrefixQuery to trigger it
I agree with the comment in the original test: if you trace the execution, the problem is that it doesn't actually refill the queue with the second doc (which is docid 11,000 or something). This is because .score() is being called on the subscorer with an end limit of 8192 or so.
{code}
// refill the queue
more = false;
...
if (subScorerDocID != NO_MORE_DOCS) {
  more |= sub.scorer.score(sub.collector, end, subScorerDocID);
...
} while (current != null || more);
{code}
> BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the latter query can not find the doc=11377 doc ? > the problem can be reproduced by the code in the attachment .
[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765306#action_12765306 ] John Wang commented on LUCENE-1969: --- My bad! The build.xml was not updated with the package name changes. I will post the fixed build.xml. > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang > Attachments: kamikaze-contrib.patch > > > Adding kamikaze to lucene contrib
[jira] Updated: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Wang updated LUCENE-1969: -- Attachment: build.xml updated build.xml with package name changes. > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang > Attachments: build.xml, kamikaze-contrib.patch > > > Adding kamikaze to lucene contrib -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1979: -- Attachment: lucene-1979.patch
Removes almost all deprecations from the indexer package. The only things left are:
* IndexReader#getFieldCacheKey() - what do we do with that one?
* calls to IndexInput#readChars() and #skipChars() - I think we have to keep those until 4.0?
All core & contrib tests pass. It'd be good if someone could review this patch though.
> Remove remaining deprecations from indexer package > -- > > Key: LUCENE-1979 > URL: https://issues.apache.org/jira/browse/LUCENE-1979 > Project: Lucene - Java > Issue Type: Task > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 3.0 > > Attachments: lucene-1979.patch > >
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765310#action_12765310 ] Michael McCandless commented on LUCENE-1974:
Ugh, this is the bug:
{code}
Index: src/java/org/apache/lucene/search/Scorer.java
===
--- src/java/org/apache/lucene/search/Scorer.java (revision 824846)
+++ src/java/org/apache/lucene/search/Scorer.java (working copy)
@@ -87,7 +87,7 @@
       collector.collect(doc);
       doc = nextDoc();
     }
-    return doc == NO_MORE_DOCS;
+    return doc != NO_MORE_DOCS;
   }

   /** Returns the score of the current document matching the query.
{code}
I'll commit shortly, to trunk & 2.9 branch.
> BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the latter query can not find the doc=11377 doc ? > the problem can be reproduced by the code in the attachment .
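To see why the sign of that comparison matters, here is a much simplified, self-contained sketch of window-by-window scoring. It is not the real BooleanScorer or Scorer code: the class ChunkedScoringSketch, its nested SubScorer, and the buggy flag are made up for illustration, and a single sub-scorer stands in for the whole clause list. The only things taken from the thread are the 8192 window size, the NO_MORE_DOCS sentinel, and the contract that score(collector, max, firstDocID) reports whether more docs remain.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the Lucene implementation. */
class ChunkedScoringSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;
  static final int CHUNK = 8192; // window size mentioned in the thread

  /** Minimal iterator over a sorted array of matching doc IDs. */
  static class SubScorer {
    private final int[] docs;
    private final boolean buggy;
    private int pos = 0;

    SubScorer(int[] docs, boolean buggy) {
      this.docs = docs;
      this.buggy = buggy;
    }

    int docID() {
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
      pos++;
      return docID();
    }

    /** Collects docs below max and reports whether more docs remain. */
    boolean score(List<Integer> hits, int max, int firstDocID) {
      int doc = firstDocID;
      while (doc < max) {
        hits.add(doc);
        doc = nextDoc();
      }
      // The buggy variant inverts the answer: it says "no more docs"
      // exactly when a doc beyond the current window is still pending.
      return buggy ? doc == NO_MORE_DOCS : doc != NO_MORE_DOCS;
    }
  }

  /** Scores one window after another until the sub-scorer reports it is done. */
  static List<Integer> collectAll(int[] docs, boolean buggy) {
    SubScorer sub = new SubScorer(docs, buggy);
    List<Integer> hits = new ArrayList<Integer>();
    int end = 0;
    boolean more;
    do {
      end += CHUNK;
      more = false;
      int first = sub.docID();
      if (first != NO_MORE_DOCS) {
        more |= sub.score(hits, end, first);
      }
    } while (more);
    return hits;
  }
}
```

With the doc IDs from the report, collectAll(new int[] {5137, 11377}, true) stops after the first window and never reaches 11377, while the corrected return value lets the loop continue into the second window and collect both docs.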
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765315#action_12765315 ] Michael Busch commented on LUCENE-1974: --- It's also concerning that no unit test catches this... > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the later query can not find the doc=11377 doc ? > the problem can be repreduced by the code in the attachment . -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765316#action_12765316 ] Michael McCandless commented on LUCENE-1979: bq. IndexReader#getFieldCacheKey() - what do we do with that one? I think we should undeprecate it. I had deprecated it thinking LUCENE-831 would land. > Remove remaining deprecations from indexer package > -- > > Key: LUCENE-1979 > URL: https://issues.apache.org/jira/browse/LUCENE-1979 > Project: Lucene - Java > Issue Type: Task > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 3.0 > > Attachments: lucene-1979.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765318#action_12765318 ] Michael McCandless commented on LUCENE-1969: Excellent, I can now run ant test; thanks. Except, it runs a bunch of tests that seem to be succeeding, yet I get a BUILD FAILED at the end. I'll attach the full output. Is there some way to shorten these tests without losing [much] coverage? It's great how thorough they are, but it took my machine 6 min 22 sec to run, which would be a big addition to the build time. > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang > Attachments: build.xml, kamikaze-contrib.patch > > > Adding kamikaze to lucene contrib
[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765319#action_12765319 ] Michael McCandless commented on LUCENE-1969: Also, John, have you started the software grant? I think you need to fill this in: http://www.apache.org/licenses/software-grant.txt and then list the files contained (in the current patch) and get that to Grant (I think?), and then post the md5 of that patch here. > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang > Attachments: build.xml, kamikaze-contrib.patch > > > Adding kamikaze to lucene contrib -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1969: --- Attachment: kamikaze.test.out Output when I ran "ant test". > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang > Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out > > > Adding kamikaze to lucene contrib -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765320#action_12765320 ] Michael Busch commented on LUCENE-1979: --- OK, will do! > Remove remaining deprecations from indexer package > -- > > Key: LUCENE-1979 > URL: https://issues.apache.org/jira/browse/LUCENE-1979 > Project: Lucene - Java > Issue Type: Task > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 3.0 > > Attachments: lucene-1979.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Assigned: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1969: -- Assignee: Michael McCandless > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang >Assignee: Michael McCandless > Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out > > > Adding kamikaze to lucene contrib -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1979: -- Attachment: lucene-1979-bw.patch
Patch for the back-compat trunk. Hmm, everything passes, except this one:
{noformat}
[junit] java.lang.NoSuchMethodError: org.apache.lucene.index.SnapshotDeletionPolicy.snapshot()Lorg/apache/lucene/index/IndexCommitPoint;
[junit] at org.apache.lucene.TestSnapshotDeletionPolicy.testReuseAcrossWriters(TestSnapshotDeletionPolicy.java:82)
[junit] at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:206)
[junit] Test org.apache.lucene.TestSnapshotDeletionPolicy FAILED
{noformat}
Here drop-in replacement doesn't seem to work. The method snapshot() of SnapshotDeletionPolicy was changed to return IndexCommit instead of IndexCommitPoint. IndexCommit used to implement the deprecated IndexCommitPoint, which this patch removes. So the tests are compiled against snapshot() returning IndexCommitPoint in the bw-branch, and when run against the method returning IndexCommit of trunk it fails with the exception above.
> Remove remaining deprecations from indexer package > -- > > Key: LUCENE-1979 > URL: https://issues.apache.org/jira/browse/LUCENE-1979 > Project: Lucene - Java > Issue Type: Task > Components: Index >Reporter: Michael Busch >Priority: Minor > Fix For: 3.0 > > Attachments: lucene-1979-bw.patch, lucene-1979.patch > >
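The NoSuchMethodError above follows from how the JVM links calls: a compiled call site names the method by its full descriptor, and the descriptor includes the return type. A small sketch of that point, using only java.lang.invoke.MethodType from the JDK (Java 7 or later). The types OldCommitPoint and NewCommit are hypothetical stand-ins for IndexCommitPoint and IndexCommit, not the Lucene classes.

```java
import java.lang.invoke.MethodType;

/** Illustrative sketch only; the nested types are made-up stand-ins. */
class DescriptorSketch {
  /** Stand-in for the old, deprecated return type. */
  interface OldCommitPoint {}

  /** Stand-in for the new return type. */
  static class NewCommit {}

  /** JVM descriptor of a no-argument method with the given return type. */
  static String descriptor(Class<?> returnType) {
    return MethodType.methodType(returnType).toMethodDescriptorString();
  }

  public static void main(String[] args) {
    // Same method name and parameters, but two different descriptors,
    // so a caller compiled against the first cannot link to the second.
    System.out.println("snapshot" + descriptor(OldCommitPoint.class));
    System.out.println("snapshot" + descriptor(NewCommit.class));
  }
}
```

The two printed descriptors differ only in the return type, which is enough for a test class compiled against the old signature to fail at link time even though recompiling the same source against the new signature would succeed.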
[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib
[ https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765324#action_12765324 ] Yonik Seeley commented on LUCENE-1969: -- As a package name, perhaps something like "docset" is more appropriate and descriptive? > adding kamikaze to lucene contrib > - > > Key: LUCENE-1969 > URL: https://issues.apache.org/jira/browse/LUCENE-1969 > Project: Lucene - Java > Issue Type: New Feature > Components: contrib/* >Affects Versions: 2.9 >Reporter: John Wang >Assignee: Michael McCandless > Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out > > > Adding kamikaze to lucene contrib -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765327#action_12765327 ] Michael McCandless commented on LUCENE-1974: bq. It's also concerning that no unit test catches this... I agree. I'll commit tangfulin & Hoss's test case. I think the other tests do not catch it because the error only happens if the docID is over 8192 (the chunk size that BooleanScorer uses). Most of our tests work on smaller sets of docs. > BooleanQuery can not find all matches in special condition > -- > > Key: LUCENE-1974 > URL: https://issues.apache.org/jira/browse/LUCENE-1974 > Project: Lucene - Java > Issue Type: Bug > Components: Query/Scoring >Affects Versions: 2.9 >Reporter: tangfulin >Assignee: Michael McCandless > Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, > LUCENE-1974.test.patch > > > query: (name:tang*) > doc=5137 score=1.0 doc:Document> > doc=11377 score=1.0 doc:Document> > query: name:tang* name:notexistnames > doc=5137 score=0.048133932 doc:Document> > It is two queries on the same index, one is just a prefix query in a > boolean query, and the other is a prefix query plus a term query in a > boolean query, all with Occur.SHOULD . > what I wonder is why the latter query can not find the doc=11377 doc ? > the problem can be reproduced by the code in the attachment .
[jira] Resolved: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1974.

Resolution: Fixed
Fix Version/s: 3.0, 2.9.1

Thanks tangfulin and Hoss! I think we need to spin 2.9.1 for this.
[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch updated LUCENE-1979:

Attachment: lucene-1979.patch

Same patch as before, but with IndexReader#getFieldCacheKey() undeprecated.

Is it correct that we keep IndexInput#readChars() and IndexInput#skipChars() for index format compatibility?

> Remove remaining deprecations from indexer package
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
> Issue Type: Task
> Components: Index
> Reporter: Michael Busch
> Priority: Minor
> Fix For: 3.0
> Attachments: lucene-1979-bw.patch, lucene-1979.patch, lucene-1979.patch
[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition
[ https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765332#action_12765332 ]

Yonik Seeley commented on LUCENE-1974:

bq. It's also concerning that no unit test catches this...

I've said it before, I'll say it again... anything of sufficient complexity really benefits from random tests to hit boundary cases that one would not have thought to code for. We have quite a few in Solr, but not enough. We obviously don't have enough in Lucene either.

One other simple tactic I've used in Solr to increase the chance of hitting boundary conditions is to make sure many segments are created by default (bad for performance, good for testing), and that cache sizes, window sizes, etc. are small so that they are crossed more often by more tests.
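The two tactics in this comment (random inputs, plus deliberately small window sizes) can be sketched without any Lucene dependency. The code below is an assumption-laden illustration, not Lucene or Solr test code: `RandomBoundaryTest`, `windowedUnion`, and `bruteForceUnion` are hypothetical names. It computes an OR over sorted docID lists one small window at a time and compares the result with a brute-force union, so window boundaries are crossed in nearly every trial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

/** Illustration only: randomized comparison of a windowed union against a brute-force one. */
final class RandomBoundaryTest {

    /** Union of sorted docID lists, computed one fixed-size window at a time. */
    static List<Integer> windowedUnion(int[][] postings, int windowSize) {
        List<Integer> hits = new ArrayList<>();
        int[] pos = new int[postings.length];      // next unread index in each list
        boolean[] bucket = new boolean[windowSize];
        while (true) {
            // The smallest unread docID decides which window is processed next.
            int min = Integer.MAX_VALUE;
            for (int i = 0; i < postings.length; i++) {
                if (pos[i] < postings[i].length) {
                    min = Math.min(min, postings[i][pos[i]]);
                }
            }
            if (min == Integer.MAX_VALUE) {
                return hits;                       // every list is exhausted
            }
            int base = (min / windowSize) * windowSize;
            int end = base + windowSize;
            // Mark every docID of every list that falls inside [base, end).
            for (int i = 0; i < postings.length; i++) {
                while (pos[i] < postings[i].length && postings[i][pos[i]] < end) {
                    bucket[postings[i][pos[i]] - base] = true;
                    pos[i]++;
                }
            }
            // Emit marked docIDs in order and reset the bucket table.
            for (int slot = 0; slot < windowSize; slot++) {
                if (bucket[slot]) {
                    hits.add(base + slot);
                    bucket[slot] = false;
                }
            }
        }
    }

    /** Reference result: sorted set union with no windowing at all. */
    static List<Integer> bruteForceUnion(int[][] postings) {
        TreeSet<Integer> all = new TreeSet<>();
        for (int[] list : postings) {
            for (int docId : list) {
                all.add(docId);
            }
        }
        return new ArrayList<>(all);
    }

    /** Random sorted, duplicate-free docIDs in [0, maxDoc). */
    static int[] randomPostings(Random random, int maxDoc) {
        TreeSet<Integer> docs = new TreeSet<>();
        int count = random.nextInt(maxDoc + 1);
        for (int i = 0; i < count; i++) {
            docs.add(random.nextInt(maxDoc));
        }
        int[] out = new int[docs.size()];
        int i = 0;
        for (int docId : docs) {
            out[i++] = docId;
        }
        return out;
    }

    /** Runs seeded random trials with small windows; returns how many trials disagreed. */
    static int countMismatches(long seed, int trials) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int t = 0; t < trials; t++) {
            int windowSize = 1 + random.nextInt(16);   // small on purpose
            int clauses = 1 + random.nextInt(4);
            int[][] postings = new int[clauses][];
            for (int c = 0; c < clauses; c++) {
                postings[c] = randomPostings(random, 200);
            }
            if (!windowedUnion(postings, windowSize).equals(bruteForceUnion(postings))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(42L, 500));
    }
}
```

Fixing the seed keeps any failure reproducible, and keeping the window far smaller than the docID range means a boundary bug would surface within a handful of trials instead of needing thousands of documents.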
[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package
[ https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765336#action_12765336 ]

Michael McCandless commented on LUCENE-1979:

bq. Is it correct that we keep IndexInput#readChars() and IndexInput#skipChars() for index format compatibility?

Yes, we need to keep them. It was in 2.4 (LUCENE-510) when we switched to writing strings as UTF8, so any index created by e.g. 2.3 (which we must be able to read through at least 3.9) will need these methods.
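To make the compatibility point concrete, here is a rough sketch of what a char-oriented reader has to do. It assumes the pre-2.4 format stored each Java char in one to three bytes (Java's "modified UTF-8") and recorded string length in chars, not bytes; that description is an assumption from memory, not something stated in this thread. `LegacyChars` is a hypothetical stand-alone class operating on a byte array, not the real IndexInput implementation.

```java
/** Illustration only: decodes a char-counted, one-to-three-bytes-per-char encoding. */
final class LegacyChars {

    /**
     * Decodes exactly {@code length} chars (not bytes) starting at {@code bytes[offset]}.
     * The number of bytes consumed is only known once decoding has finished.
     */
    static String readChars(byte[] bytes, int offset, int length) {
        char[] out = new char[length];
        int p = offset;
        for (int i = 0; i < length; i++) {
            byte b = bytes[p++];
            if ((b & 0x80) == 0) {
                // One byte: 0xxxxxxx
                out[i] = (char) (b & 0x7F);
            } else if ((b & 0xE0) != 0xE0) {
                // Two bytes: 110xxxxx 10xxxxxx
                out[i] = (char) (((b & 0x1F) << 6) | (bytes[p++] & 0x3F));
            } else {
                // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out[i] = (char) (((b & 0x0F) << 12)
                        | ((bytes[p++] & 0x3F) << 6)
                        | (bytes[p++] & 0x3F));
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // 'h' (1 byte), e-acute (2 bytes), euro sign (3 bytes): 3 chars in 6 bytes.
        byte[] stored = {'h', (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        String decoded = readChars(stored, 0, 3);
        System.out.println(decoded.length());
    }
}
```

Because the stored length counts chars, a reader cannot skip such a string by byte count alone, which is one plausible reason both a read and a skip method need to stay available for old indexes.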