date:20091013

RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Uwe Schindler

I wonder why this commit is needed. It only affects the core classes, not the
tests. To compile the backwards tests correctly, it should not matter whether
the methods exist or not.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: busc...@apache.org [mailto:busc...@apache.org]
> Sent: Tuesday, October 13, 2009 9:00 AM
> To: java-comm...@lucene.apache.org
> Subject: svn commit: r824611 - in
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> 
> Author: buschmi
> Date: Tue Oct 13 06:59:40 2009
> New Revision: 824611
> 
> URL: http://svn.apache.org/viewvc?rev=824611&view=rev
> Log:
> More fixes that were accidentially left out in the previous commit
> 
> Modified:
> 
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/FieldMaskingSpanQuery.java
> 
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanFirstQuery.java
> 
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanNearQuery.java
> 
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanNotQuery.java
> 
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanOrQuery.java
> 
> Modified:
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/FieldMaskingSpanQuery.java
> URL:
> http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_t
> ests/src/java/org/apache/lucene/search/spans/FieldMaskingSpanQuery.java?re
> v=824611&r1=824610&r2=824611&view=diff
> ==
> 
> ---
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/FieldMaskingSpanQuery.java (original)
> +++
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/FieldMaskingSpanQuery.java Tue Oct 13 06:59:40 2009
> @@ -94,11 +94,6 @@
>  return maskedQuery.getSpans(reader);
>}
> 
> -  /** @deprecated use {@link #extractTerms(Set)} instead. */
> -  public Collection getTerms() {
> -return maskedQuery.getTerms();
> -  }
> -
>public void extractTerms(Set terms) {
>  maskedQuery.extractTerms(terms);
>}
> 
> Modified:
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanFirstQuery.java
> URL:
> http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_t
> ests/src/java/org/apache/lucene/search/spans/SpanFirstQuery.java?rev=82461
> 1&r1=824610&r2=824611&view=diff
> ==
> 
> ---
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanFirstQuery.java (original)
> +++
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanFirstQuery.java Tue Oct 13 06:59:40 2009
> @@ -47,12 +47,6 @@
> 
>public String getField() { return match.getField(); }
> 
> -  /** Returns a collection of all terms matched by this query.
> -   * @deprecated use extractTerms instead
> -   * @see #extractTerms(Set)
> -   */
> -  public Collection getTerms() { return match.getTerms(); }
> -
>public String toString(String field) {
>  StringBuffer buffer = new StringBuffer();
>  buffer.append("spanFirst(");
> 
> Modified:
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanNearQuery.java
> URL:
> http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9_back_compat_t
> ests/src/java/org/apache/lucene/search/spans/SpanNearQuery.java?rev=824611
> &r1=824610&r2=824611&view=diff
> ==
> 
> ---
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanNearQuery.java (original)
> +++
> lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
> ne/search/spans/SpanNearQuery.java Tue Oct 13 06:59:40 2009
> @@ -80,20 +80,6 @@
> 
>public String getField() { return field; }
> 
> -  /** Returns a collection of all terms matched by this query.
> -   * @deprecated use extractTerms instead
> -   * @see #extractTerms(Set)
> -   */
> -  public Collection getTerms() {
> -Collection terms = new ArrayList();
> -Iterator i = clauses.iterator();
> -while (i.hasNext()) {
> -  SpanQuery clause = (SpanQuery)i.next();
> -  terms.addAll(clause.getTerms());
> -}
> -return terms;
> -  }
> -
>public void extractTerms(Set terms) {
>   Iterator i = clauses.iterator();
>   while (i.hasNext()) {
> 
> Modified:
> lucene/java/branch

Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Michael Busch


Yes that's indeed the case, see LUCENE-1529.

 Michael

On 10/13/09 12:25 AM, Michael Busch wrote:
It was weird - I ran all the tests before I did the previous commit 
and it worked fine. Then after committing I wanted to doublecheck by 
running 'ant test-tag' and got the compile errors.


I think something is wrong with my eclipse and/or svn. But I also 
switched from tortoise to command-line recently - so maybe I'm just 
clumsy. Anyway, the new tag is working now, sorry for the noise.


To your question: Wasn't there a fix recently to test-tag to test 
drop-in backwards-compatibility? Which means that it compiles the 
tests first against the sources of the back-compat branch, but then 
runs them against the new trunk JAR? That's why this commit is 
necessary I think.


 Michael

On 10/13/09 12:18 AM, Uwe Schindler wrote:
I wonder why this commit is needed. It only affects the core classes, not the
tests. To compile correct backwards tests it should not be important if the
methods exist or not.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de




Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Michael Busch

It was weird - I ran all the tests before I did the previous commit and
they worked fine. Then, after committing, I wanted to double-check by running
'ant test-tag' and got the compile errors.


I think something is wrong with my Eclipse and/or svn. But I also
switched from Tortoise to the command line recently - so maybe I'm just
clumsy. Anyway, the new tag is working now, sorry for the noise.


To your question: Wasn't there a fix recently to test-tag to test
drop-in backwards compatibility? That means it compiles the tests
first against the sources of the back-compat branch, but then runs them
against the new trunk JAR. That's why this commit is necessary, I think.
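
To illustrate the compile-against-branch, run-against-trunk idea: a minimal
sketch, assuming two made-up classes (BranchQuery and TrunkQuery are
hypothetical stand-ins, not Lucene classes). It lists the methods a test
could have been compiled against in the branch but would not find when run
against trunk. It only compares names and parameter types, so it is a rough
approximation of real binary-compatibility checking, not what test-tag does.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a class in the back-compat branch sources.
class BranchQuery {
  public Collection getTerms() { return new ArrayList(); }
  public void extractTerms(Set terms) { }
}

// Hypothetical stand-in for the same class in the trunk JAR,
// assuming the deprecated method has been removed there.
class TrunkQuery {
  public void extractTerms(Set terms) { }
}

class DropInCheck {
  /**
   * Returns the names of methods declared by compiledAgainst that have no
   * declared counterpart (same name and parameter types) in runAgainst.
   * Return types are not compared in this sketch.
   */
  static List<String> missing(Class<?> compiledAgainst, Class<?> runAgainst) {
    List<String> result = new ArrayList<String>();
    for (Method m : compiledAgainst.getDeclaredMethods()) {
      try {
        runAgainst.getDeclaredMethod(m.getName(), m.getParameterTypes());
      } catch (NoSuchMethodException e) {
        result.add(m.getName());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(missing(BranchQuery.class, TrunkQuery.class));
  }
}
```

A test that still calls a method reported here would compile against the
branch but fail to link at run time, which is presumably why such calls have
to go from the branch's tests rather than from its core classes.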


 Michael

On 10/13/09 12:18 AM, Uwe Schindler wrote:

I wonder why this commit is needed. It only affects the core classes, not the
tests. To compile correct backwards tests it should not be important if the
methods exist or not.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


   


RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Uwe Schindler

Hi Michael,

I fixed it here, should I commit?

Your problem was maybe that you thought the backwards test code must compile
against trunk. But it's the other way round. I reverted everything and only
removed the getTerms() checks in the backwards branch. Now it works and the
backwards testing is correct.

Here the general rule applies: if the backwards test code was checking against
a deprecated API, just remove it. No need to rewrite the test for that. It
is tested by the main tests. The main purpose of the backwards branch is to
test drop-in binary compatibility.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Tuesday, October 13, 2009 9:49 AM
> To: java-dev@lucene.apache.org
> Subject: RE: svn commit: r824611 - in
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> 
> I found the reason why it broke:
> 
> You changed in the backwards branch main code in your first commit the
> following:
> 
> +Set terms = new HashSet();
> +qr.extractTerms(terms);
> +assertEquals(1, terms.size());
> 
> And the backwards branch core and test is compiled with Java 1.4 - bumm.
> So
> general rule: Never change the main code branch, only the tests in
> backwards
> and use where possible only the old *public* API. If you have to change
> the
> main code you have a backwards break. If you only test some internal
> implementations in 2.9 (not public API), remove the tests in 2.9.
> 
> Uwe
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> 
> > -Original Message-
> > From: Uwe Schindler [mailto:u...@thetaphi.de]
> > Sent: Tuesday, October 13, 2009 9:43 AM
> > To: java-dev@lucene.apache.org
> > Subject: RE: svn commit: r824611 - in
> >
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> > ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> > SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> >
> > Yes, thats why we do the tests. By this it is possible to test compiled
> > Java
> > 1.4 code against new Java 1.5 lucene core with generics and test, that
> no
> > upper generics boundaries (e.g. by things like )
> are
> > violated.
> >
> > But if you rewrite the tests to only use the API of lucene 3.0 and no
> > deprecated methods it should pass and it has no effect, if an additional
> > deprecated method is still available in the branch's code. If we have to
> > remove all deprecated code also from the backwards branch, we would not
> > need
> > the branch at all. So this commit is definitely not needed (and I tested
> > it,
> > it works without). In the backwards branch we should only fix the tests,
> > never the core code. If we do it, it is contra-productive.
> >
> > There were some edge cases, when we have backwards-incompatible changes
> in
> > 2.9. But this is definitely not a backwards break.
> >
> > -
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> > > -Original Message-
> > > From: Michael Busch [mailto:busch...@gmail.com]
> > > Sent: Tuesday, October 13, 2009 9:30 AM
> > > To: java-dev@lucene.apache.org
> > > Subject: Re: svn commit: r824611 - in
> > >
> >
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> > > ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> > > SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> > >
> > > Yes that's indeed the case, see LUCENE-1529.
> > >
> > >   Michael
> > >
> > > On 10/13/09 12:25 AM, Michael Busch wrote:
> > > > It was weird - I ran all the tests before I did the previous commit
> > > > and it worked fine. Then after committing I wanted to doublecheck by
> > > > running 'ant test-tag' and got the compile errors.
> > > >
> > > > I think something is wrong with my eclipse and/or svn. But I also
> > > > switched from tortoise to command-line recently - so maybe I'm just
> > > > clumsy. Anyway, the new tag is working now, sorry for the noise.
> > > >
> > > > To your question: Wasn't there a fix recently to test-tag to test
> > > > drop-in backwards-compatibility? Which means that it compiles the
> > > > tests first against the sources of the back-compat branch, but then
> > > > runs them against the new trunk JAR? That's why this commit is
> > > > necessary I think.
> > > >
> > > >  Michael
> > > >
> > > > On 10/13/09 12:18 AM, Uwe Schindler wrote:
> > > >> I wonder why this commit is needed. It only affects the core
> classes,
> > > >> not th
> > > >> tests. To compile correct backwards tests it should not be
> important
> > > >> if the
> > > >> methods exist or not.
> > > >>
> > > >> -
> > > >> Uwe Schindler
> > > >> H.-H.-Meier-Allee 63

RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Uwe Schindler

Yes, that's why we do the tests. This makes it possible to test compiled Java
1.4 code against the new Java 1.5 Lucene core with generics and to check that no
upper generics boundaries (e.g. by things like ) are
violated.

But if you rewrite the tests to use only the API of Lucene 3.0 and no
deprecated methods, they should pass, and it has no effect if an additional
deprecated method is still available in the branch's code. If we had to
remove all deprecated code from the backwards branch as well, we would not need
the branch at all. So this commit is definitely not needed (and I tested it,
it works without). In the backwards branch we should only fix the tests,
never the core code. Doing otherwise is counterproductive.

There were some edge cases when we had backwards-incompatible changes in
2.9. But this is definitely not a backwards break.
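
A minimal sketch of why an upper bound matters for code compiled under 1.4,
assuming a made-up class (GenerifiedApi is hypothetical, and the bound shown
is only an illustration, not a reconstruction of the example missing from the
sentence above). A generic parameter erases to its upper bound, so a method
generified with a bound gets a different erased signature than one taking
Object, while a plain Set parameter keeps the signature an old caller links to.

```java
import java.lang.reflect.Method;
import java.util.Set;

// Hypothetical class for illustration; not part of Lucene.
class GenerifiedApi {
  // Erases to extractTerms(Set): the same signature a raw,
  // pre-generics caller was compiled against.
  public void extractTerms(Set<String> terms) { terms.add("t"); }

  // Erases to pick(Comparable), not pick(Object): a caller compiled
  // against an older pick(Object) would no longer link to this method.
  public <T extends Comparable<T>> T pick(T value) { return value; }
}

class ErasureDemo {
  /** True if GenerifiedApi has a public method with exactly this erased parameter type. */
  static boolean hasMethod(String name, Class<?> param) {
    try {
      Method m = GenerifiedApi.class.getMethod(name, param);
      return m != null;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(hasMethod("extractTerms", Set.class));
    System.out.println(hasMethod("pick", Comparable.class));
    System.out.println(hasMethod("pick", Object.class));
  }
}
```

The lookup by Object fails because reflection matches the erased parameter
type exactly, which is the same descriptor the JVM uses when linking calls
from already compiled classes.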

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Michael Busch [mailto:busch...@gmail.com]
> Sent: Tuesday, October 13, 2009 9:30 AM
> To: java-dev@lucene.apache.org
> Subject: Re: svn commit: r824611 - in
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> 
> Yes that's indeed the case, see LUCENE-1529.
> 
>   Michael
> 
> On 10/13/09 12:25 AM, Michael Busch wrote:
> > It was weird - I ran all the tests before I did the previous commit
> > and it worked fine. Then after committing I wanted to doublecheck by
> > running 'ant test-tag' and got the compile errors.
> >
> > I think something is wrong with my eclipse and/or svn. But I also
> > switched from tortoise to command-line recently - so maybe I'm just
> > clumsy. Anyway, the new tag is working now, sorry for the noise.
> >
> > To your question: Wasn't there a fix recently to test-tag to test
> > drop-in backwards-compatibility? Which means that it compiles the
> > tests first against the sources of the back-compat branch, but then
> > runs them against the new trunk JAR? That's why this commit is
> > necessary I think.
> >
> >  Michael
> >
> > On 10/13/09 12:18 AM, Uwe Schindler wrote:
> >> I wonder why this commit is needed. It only affects the core classes,
> >> not th
> >> tests. To compile correct backwards tests it should not be important
> >> if the
> >> methods exist or not.
> >>
> >> -
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: u...@thetaphi.de

RE: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Uwe Schindler

I found the reason why it broke:

In your first commit you changed the following in the main code of the
backwards branch:

+Set terms = new HashSet();
+qr.extractTerms(terms);
+assertEquals(1, terms.size());

And the backwards branch core and tests are compiled with Java 1.4 - boom. So
the general rule: never change the main code of the branch, only the tests in
backwards, and use only the old *public* API where possible. If you have to
change the main code, you have a backwards break. If you only test some
internal implementations in 2.9 (not public API), remove the tests in 2.9.
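
For comparison, a minimal sketch of the same kind of check written so that a
1.4-level compiler would accept it, assuming a made-up query class
(FakeSpanQuery is hypothetical, not the Lucene class). The three lines as
archived above show no type parameters; the archive may have dropped
angle-bracketed text, but that is an assumption. The sketch uses raw types
only and avoids any Java 1.5 syntax inside the check itself.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a span query; not the Lucene class.
class FakeSpanQuery {
  private final String term;

  FakeSpanQuery(String term) { this.term = term; }

  // Raw Set on purpose: this is the shape of a pre-generics public API.
  public void extractTerms(Set terms) { terms.add(term); }
}

class RawTypeCheck {
  /** Collects the query's terms using only syntax valid at source level 1.4. */
  static int countTerms(FakeSpanQuery q) {
    Set terms = new HashSet();   // no type parameters
    q.extractTerms(terms);
    return terms.size();
  }

  public static void main(String[] args) {
    System.out.println(countTerms(new FakeSpanQuery("field:value")));
  }
}
```

A current compiler reports unchecked warnings for the raw types but still
compiles the code, so a check written this way should build on either side.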

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Tuesday, October 13, 2009 9:43 AM
> To: java-dev@lucene.apache.org
> Subject: RE: svn commit: r824611 - in
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> 
> Yes, thats why we do the tests. By this it is possible to test compiled
> Java
> 1.4 code against new Java 1.5 lucene core with generics and test, that no
> upper generics boundaries (e.g. by things like ) are
> violated.
> 
> But if you rewrite the tests to only use the API of lucene 3.0 and no
> deprecated methods it should pass and it has no effect, if an additional
> deprecated method is still available in the branch's code. If we have to
> remove all deprecated code also from the backwards branch, we would not
> need
> the branch at all. So this commit is definitely not needed (and I tested
> it,
> it works without). In the backwards branch we should only fix the tests,
> never the core code. If we do it, it is contra-productive.
> 
> There were some edge cases, when we have backwards-incompatible changes in
> 2.9. But this is definitely not a backwards break.
> 
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
> 
> > -Original Message-
> > From: Michael Busch [mailto:busch...@gmail.com]
> > Sent: Tuesday, October 13, 2009 9:30 AM
> > To: java-dev@lucene.apache.org
> > Subject: Re: svn commit: r824611 - in
> >
> /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
> > ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
> > SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java
> >
> > Yes that's indeed the case, see LUCENE-1529.
> >
> >   Michael
> >
> > On 10/13/09 12:25 AM, Michael Busch wrote:
> > > It was weird - I ran all the tests before I did the previous commit
> > > and it worked fine. Then after committing I wanted to doublecheck by
> > > running 'ant test-tag' and got the compile errors.
> > >
> > > I think something is wrong with my eclipse and/or svn. But I also
> > > switched from tortoise to command-line recently - so maybe I'm just
> > > clumsy. Anyway, the new tag is working now, sorry for the noise.
> > >
> > > To your question: Wasn't there a fix recently to test-tag to test
> > > drop-in backwards-compatibility? Which means that it compiles the
> > > tests first against the sources of the back-compat branch, but then
> > > runs them against the new trunk JAR? That's why this commit is
> > > necessary I think.
> > >
> > >  Michael
> > >
> > > On 10/13/09 12:18 AM, Uwe Schindler wrote:
> > >> I wonder why this commit is needed. It only affects the core classes,
> > >> not th
> > >> tests. To compile correct backwards tests it should not be important
> > >> if the
> > >> methods exist or not.
> > >>
> > >> -
> > >> Uwe Schindler
> > >> H.-H.-Meier-Allee 63, D-28213 Bremen
> > >> http://www.thetaphi.de
> > >> eMail: u...@thetaphi.de
> > >>
> > >>

Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Michael Busch

You're right, of course! I made the changes to both test cases in the
back-compat branch first, but I shouldn't have committed the changes to
JustCompileSearchSpans - that was my mistake. And then I forgot for a
minute about LUCENE-1529 (when I added the test-tag feature initially, it
*compiled* the tests against the trunk JAR) and didn't realize the right
solution was to just revert JustCompileSearchSpans, and thus made the
other changes.


Oh well, I guess I won't forget it again :)

Thanks for bearing with me and sorry for the noise.

 Michael

On 10/13/09 12:48 AM, Uwe Schindler wrote:

I found the reason why it broke:

You changed in the backwards branch main code in your first commit the
following:

+Set  terms = new HashSet();
+qr.extractTerms(terms);
+assertEquals(1, terms.size());

And the backwards branch core and test is compiled with Java 1.4 - bumm. So
general rule: Never change the main code branch, only the tests in backwards
and use where possible only the old *public* API. If you have to change the
main code you have a backwards break. If you only test some internal
implementations in 2.9 (not public API), remove the tests in 2.9.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


   

-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Tuesday, October 13, 2009 9:43 AM
To: java-dev@lucene.apache.org
Subject: RE: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

Yes, that's why we do the tests. This makes it possible to test compiled
Java 1.4 code against the new Java 1.5 Lucene core with generics and to check
that no upper generics boundaries (e.g. by things like) are violated.

But if you rewrite the tests to use only the API of Lucene 3.0 and no
deprecated methods, they should pass, and it has no effect if an additional
deprecated method is still available in the branch's code. If we had to
remove all deprecated code from the backwards branch as well, we would not
need the branch at all. So this commit is definitely not needed (and I tested
it, it works without). In the backwards branch we should only fix the tests,
never the core code. If we do, it is counterproductive.

There were some edge cases where we had backwards-incompatible changes in
2.9. But this is definitely not a backwards break.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 

-Original Message-
From: Michael Busch [mailto:busch...@gmail.com]
Sent: Tuesday, October 13, 2009 9:30 AM
To: java-dev@lucene.apache.org
Subject: Re: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

Yes that's indeed the case, see LUCENE-1529.

   Michael

On 10/13/09 12:25 AM, Michael Busch wrote:
   

It was weird - I ran all the tests before I did the previous commit
and it worked fine. Then after committing I wanted to doublecheck by
running 'ant test-tag' and got the compile errors.

I think something is wrong with my eclipse and/or svn. But I also
switched from tortoise to command-line recently - so maybe I'm just
clumsy. Anyway, the new tag is working now, sorry for the noise.

To your question: Wasn't there a fix recently to test-tag to test
drop-in backwards-compatibility? Which means that it compiles the
tests first against the sources of the back-compat branch, but then
runs them against the new trunk JAR? That's why this commit is
necessary I think.

  Michael

On 10/13/09 12:18 AM, Uwe Schindler wrote:
 

I wonder why this commit is needed. It only affects the core classes,
not the tests. To compile correct backwards tests it should not be important
if the methods exist or not.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


   

-Original Message-
From: busc...@apache.org [mailto:busc...@apache.org]
Sent: Tuesday, October 13, 2009 9:00 AM
To: java-comm...@lucene.apache.org
Subject: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

Author: buschmi
Date: Tue Oct 13 06:59:40 2009
New Revision: 824611

URL: http://svn.apache.org/viewvc?rev=824611&view=rev
Log:
More fixes that were accidentially left out in the previous commit

Modified:

lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luce
ne/search/spans/FieldMaskingSpanQuery.java

lucene/java/branches/luc

Re: svn commit: r824611 - in /lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/lucene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java SpanNearQuery.java SpanNotQuery.

2009-10-13 Thread Michael Busch


Yeah please go ahead! Thanks for fixing.

I have it working here too now - I just took the 
lucene_2_9_back_compat_tests_20091011 tag and made only the fix to 
TestFieldMaskingSpanQuery (without Java 1.5 code of course ;) ) and 
*not* the changes to JustCompileSearchSpans and test-tag is passing now 
against current trunk. I think that's the same you have now, right?


Please go ahead and commit... today is not my day - I should go to bed :)

 Michael


On 10/13/09 1:05 AM, Uwe Schindler wrote:

Hi Michael,

I fixed it here, should I commit?

Your problem was maybe that you thought the backwards test code must compile
against trunk. But it's the other way round. I reverted everything and only
removed the getTerms() checks in the backwards branch. Now it works and the
backwards testing is correct.

The general rule applies here: the backwards test code was checking against
a deprecated API, so just remove it. No need to rewrite the test for that. It
is tested by the main tests. The main purpose of the backwards branch is to
test drop-in binary compatibility.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


   

-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Tuesday, October 13, 2009 9:49 AM
To: java-dev@lucene.apache.org
Subject: RE: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

I found the reason why it broke:

In your first commit you changed the following in the main code of the
backwards branch:

+Set<Term> terms = new HashSet<Term>();
+qr.extractTerms(terms);
+assertEquals(1, terms.size());

And the backwards branch core and tests are compiled with Java 1.4 - boom. So the
general rule: Never change the main code in the backwards branch, only the tests,
and where possible use only the old *public* API. If you have to change the
main code, you have a backwards break. If you only test some internal
implementations in 2.9 (not public API), remove the tests in 2.9.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 

-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Tuesday, October 13, 2009 9:43 AM
To: java-dev@lucene.apache.org
Subject: RE: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

Yes, that's why we do the tests. This makes it possible to test compiled
Java 1.4 code against the new Java 1.5 Lucene core with generics and to check
that no upper generics boundaries (e.g. by things like) are violated.

But if you rewrite the tests to use only the API of Lucene 3.0 and no
deprecated methods, they should pass, and it has no effect if an additional
deprecated method is still available in the branch's code. If we had to
remove all deprecated code from the backwards branch as well, we would not
need the branch at all. So this commit is definitely not needed (and I tested
it, it works without). In the backwards branch we should only fix the tests,
never the core code. If we do, it is counterproductive.

There were some edge cases where we had backwards-incompatible changes in
2.9. But this is definitely not a backwards break.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

   

-Original Message-
From: Michael Busch [mailto:busch...@gmail.com]
Sent: Tuesday, October 13, 2009 9:30 AM
To: java-dev@lucene.apache.org
Subject: Re: svn commit: r824611 - in
/lucene/java/branches/lucene_2_9_back_compat_tests/src/java/org/apache/luc
ene/search/spans: FieldMaskingSpanQuery.java SpanFirstQuery.java
SpanNearQuery.java SpanNotQuery.java SpanOrQuery.java

Yes that's indeed the case, see LUCENE-1529.

   Michael

On 10/13/09 12:25 AM, Michael Busch wrote:
 

It was weird - I ran all the tests before I did the previous commit
and it worked fine. Then after committing I wanted to doublecheck by
running 'ant test-tag' and got the compile errors.

I think something is wrong with my eclipse and/or svn. But I also
switched from tortoise to command-line recently - so maybe I'm just
clumsy. Anyway, the new tag is working now, sorry for the noise.

To your question: Wasn't there a fix recently to test-tag to test
drop-in backwards-compatibility? Which means that it compiles the
tests first against the sources of the back-compat branch, but then
runs them against the new trunk JAR? That's why this commit is
necessary I think.

  Michael

On 10/13/09 12:18 AM, Uwe Schindler wrote:
   

I wonder why this commit is needed. It only affects the core classes,

[jira] Created: (LUCENE-1976) isCurrent() and getVersion() on an NRT reader are broken

2009-10-13 Thread Michael McCandless (JIRA)

isCurrent() and getVersion() on an NRT reader are broken


 Key: LUCENE-1976
 URL: https://issues.apache.org/jira/browse/LUCENE-1976
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.1


Right now isCurrent() will always return true for an NRT reader and 
getVersion() will always return the version of the last commit.  This is 
because the NRT reader holds the live segmentInfos.

I think isCurrent() should return "false" when any further changes have 
occurred with the writer, else true.   This is actually fairly easy to 
determine, since the writer tracks how many docs & deletions are buffered in 
RAM and these counters only increase with each change.

getVersion should return the version as of when the reader was created.
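
The proposed behaviour could be sketched roughly as follows. This is only an illustration under assumptions: the Writer and Reader classes, changeCount() and the way the version changes on commit are hypothetical stand-ins, not Lucene's IndexWriter or IndexReader internals.

```java
public class NrtCurrencySketch {

    // Hypothetical stand-in for the writer; not Lucene's IndexWriter.
    static class Writer {
        private long bufferedDocs;     // only ever increases in this sketch
        private long bufferedDeletes;  // only ever increases in this sketch
        private long version = 1;      // assumed to change on commit

        void addDocument()    { bufferedDocs++; }
        void deleteDocument() { bufferedDeletes++; }
        void commit()         { version++; }

        long changeCount()    { return bufferedDocs + bufferedDeletes; }
        long currentVersion() { return version; }

        Reader openReader()   { return new Reader(this); }
    }

    // Hypothetical stand-in for an NRT reader opened from the writer.
    static class Reader {
        private final Writer writer;
        private final long changeCountAtOpen;
        private final long versionAtOpen;

        Reader(Writer writer) {
            this.writer = writer;
            this.changeCountAtOpen = writer.changeCount();
            this.versionAtOpen = writer.currentVersion();
        }

        // False as soon as the writer has seen any further change.
        boolean isCurrent() {
            return writer.changeCount() == changeCountAtOpen;
        }

        // The version as of when this reader was created.
        long getVersion() {
            return versionAtOpen;
        }
    }

    public static void main(String[] args) {
        Writer writer = new Writer();
        Reader reader = writer.openReader();
        System.out.println(reader.isCurrent());   // true
        writer.addDocument();
        writer.commit();
        System.out.println(reader.isCurrent());   // false
        System.out.println(reader.getVersion());  // 1
    }
}
```

Snapshotting both values in the reader's constructor keeps the check cheap: isCurrent() is a single comparison against a counter the writer already maintains.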

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1972:
--

Summary: Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and 
lot's of deprecated sort logic  (was: Remove (deprecated) ExtendedFieldCache 
and Auto/Custom caches and sort)

> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of 
> deprecated sort logic
> 
>
> Key: LUCENE-1972
> URL: https://issues.apache.org/jira/browse/LUCENE-1972
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
>
> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort


[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1972:
--

Attachment: LUCENE-1972-bw.patch
LUCENE-1972.patch

This patch removes the ExtendedFieldCache bw layer. It also removes the AUTO and 
CUSTOM caches.

Because of that, lots of SortField logic was also changed and 
deprecations removed (not yet complete, HitCollector is still there). But with 
this patch most of the deprecated sort logic is removed (old Collectors, old 
sorting collectors, legacy search,...)

I also converted the Sort() ctors/setSort methods to varargs and changed the 
tests. It's now easier to use.

Will commit when all tests have been run again and nobody complains. This patch may 
fail to remove some dead code, but that should be done later, when the 
inventors of the new Search API take a closer look at it.
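
To illustrate why the varargs conversion mentioned above is easier to use, here is a small sketch. The Sort and SortField classes below are simplified, hypothetical stand-ins written for this example, not the Lucene classes touched by the patch.

```java
public class VarargsSortSketch {

    // Hypothetical stand-in for a sort criterion; not Lucene's SortField.
    static class SortField {
        final String field;

        SortField(String field) {
            this.field = field;
        }
    }

    // Hypothetical stand-in for Sort with a varargs constructor and setter.
    static class Sort {
        private SortField[] fields;

        Sort(SortField... fields) {
            setSort(fields);
        }

        void setSort(SortField... fields) {
            this.fields = fields.clone();
        }

        int size() {
            return fields.length;
        }
    }

    public static void main(String[] args) {
        // Pre-varargs call style: passing an explicit array still compiles.
        Sort oldStyle = new Sort(new SortField[] { new SortField("title"), new SortField("date") });
        // Varargs call style: no array boilerplate at the call site.
        Sort newStyle = new Sort(new SortField("title"), new SortField("date"));
        System.out.println(oldStyle.size() + " " + newStyle.size()); // prints 2 2
    }
}
```

Existing callers that pass an array keep compiling unchanged, so a conversion of this kind is source compatible for them.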

> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of 
> deprecated sort logic
> 
>
> Key: LUCENE-1972
> URL: https://issues.apache.org/jira/browse/LUCENE-1972
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1972-bw.patch, LUCENE-1972.patch
>
>
> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort


[jira] Resolved: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-1972.
---

Resolution: Fixed

Committed revision: 824699

> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of 
> deprecated sort logic
> 
>
> Key: LUCENE-1972
> URL: https://issues.apache.org/jira/browse/LUCENE-1972
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1972-bw.patch, LUCENE-1972.patch
>
>
> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort


Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael McCandless

OK I will cut a branch & commit Mark's last patch onto it, unless
anyone has objections soonish...

I'll also branch (twig?) the back compat branch so we can commit the
patch there as well.

Mike

On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller  wrote:
>
> SVN is about as good at merging branches as any of us are with a patch
> and trunk unfortunately. But that can still be somewhat more convenient
> than all these huge patches, with different people at different stages.
>
> Depends on how many people end up working on this though. Any more than
> 2, and I think the branch has got to be worth it.
>
> From my perspective, it doesn't make any of the merging process any
> easier - but it can be easier than juggling all these patches - you have
> a central code base that can always be targeted for current merging.
>
> Michael Busch wrote:
>> I think it's supposed to work pretty good - though I have no personal
>> experience with merging branches with svn.
>>
>> I think we should try it - then we'll know! :)
>>
>>  Michael
>>
>> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:
>>>      [
>>> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799
>>> ]
>>>
>>> Michael McCandless commented on LUCENE-1458:
>>> 
>>>
>>> bq. Shall we create a flexible-indexing branch and commit this?
>>>
>>> I think this is a good idea.
>>>
>>> But I haven't played heavily w/ svn & branching.  EG if we branch
>>> now, and trunk moves fast (which it still is w/ deprecation
>>> removals), are we going to have conflicts?  Or... is svn good about
>>> merging branches?
>>>
>>>
 Further steps towards flexible indexing
 ---

                  Key: LUCENE-1458
                  URL: https://issues.apache.org/jira/browse/LUCENE-1458
              Project: Lucene - Java
           Issue Type: New Feature
           Components: Index
     Affects Versions: 2.9
             Reporter: Michael McCandless
             Assignee: Michael McCandless
             Priority: Minor
          Attachments: LUCENE-1458-back-compat.patch,
 LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
 LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
 LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch,
 LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
 LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
 LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
 LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
 LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
 LUCENE-1458.tar.bz2


 I attached a very rough checkpoint of my current patch, to get early
 feedback.  All tests pass, though back compat tests don't pass due to
 changes to package-private APIs plus certain bugs in tests that
 happened to work (eg call TermPositions.nextPosition() too many times,
 which the new API asserts against).
 [Aside: I think, when we commit changes to package-private APIs such
 that back-compat tests don't pass, we could go back, make a branch on
 the back-compat tag, commit changes to the tests to use the new
 package private APIs on that branch, then fix nightly build to use the
 tip of that branch?]
 There's still plenty to do before this is committable! This is a
 rather large change:
    * Switches to a new more efficient terms dict format.  This still
      uses tii/tis files, but the tii only stores term & long offset
      (not a TermInfo).  At seek points, tis encodes term & freq/prox
      offsets absolutely instead of with deltas.  Also, tis/tii
      are structured by field, so we don't have to record field number
      in every term.
 .
      On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
      ->  0.64 MB) and tis file is 9% smaller (75.5 MB ->  68.5 MB).
 .
      RAM usage when loading terms dict index is significantly less
      since we only load an array of offsets and an array of String (no
      more TermInfo array).  It should be faster to init too.
 .
      This part is basically done.
    * Introduces modular reader codec that strongly decouples terms dict
      from docs/positions readers.  EG there is no more TermInfo used
      when reading the new format.
 .
      There's nice symmetry now between reading & writing in the codec
      chain -- the current docs/prox format is captured in:
 {code}
 FormatPostingsTermsDictWriter/Reader
 FormatPostingsDocsWriter/Reader (.frq file) and
 FormatPostingsPositionsWriter/Reader (.prx file).
 {code}
      This part is basically done.
    * Introduces a new "flex" API for iterating

[jira] Updated: (LUCENE-1977) Remove MultiTermQuery.getTerm()

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1977:
--

Attachment: LUCENE-1977.patch

Here is the patch. This also fixes the highlighter problem with NumericRange.

> Remove MultiTermQuery.getTerm()
> ---
>
> Key: LUCENE-1977
> URL: https://issues.apache.org/jira/browse/LUCENE-1977
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1977.patch
>
>
> Removes the field and methods in MTQ that return the pattern term.


[jira] Created: (LUCENE-1977) Remove MultiTermQuery.getTerm()

2009-10-13 Thread Uwe Schindler (JIRA)

Remove MultiTermQuery.getTerm()
---

 Key: LUCENE-1977
 URL: https://issues.apache.org/jira/browse/LUCENE-1977
 Project: Lucene - Java
  Issue Type: Task
  Components: Search
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 3.0


Removes the field and methods in MTQ that return the pattern term.


[jira] Created: (LUCENE-1978) Remove HitCollector

2009-10-13 Thread Uwe Schindler (JIRA)

Remove HitCollector
---

 Key: LUCENE-1978
 URL: https://issues.apache.org/jira/browse/LUCENE-1978
 Project: Lucene - Java
  Issue Type: Task
Reporter: Uwe Schindler
Assignee: Uwe Schindler


Remove the rest of HitCollectors


[jira] Updated: (LUCENE-1978) Remove HitCollector

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1978:
--

Attachment: LUCENE-1978-bw.patch
LUCENE-1978.patch

Attached is the patch. Will commit when the full test suite has run again.

> Remove HitCollector
> ---
>
> Key: LUCENE-1978
> URL: https://issues.apache.org/jira/browse/LUCENE-1978
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch
>
>
> Remove the rest of HitCollectors


[jira] Resolved: (LUCENE-1977) Remove MultiTermQuery.getTerm()

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-1977.
---

Resolution: Fixed

Committed revision: 824771

> Remove MultiTermQuery.getTerm()
> ---
>
> Key: LUCENE-1977
> URL: https://issues.apache.org/jira/browse/LUCENE-1977
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1977.patch
>
>
> Removes the field and methods in MTQ that return the pattern term.


[jira] Commented: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765082#action_12765082
 ] 

Uwe Schindler commented on LUCENE-1929:
---

This is also fixed in trunk, but differently, since MTQ.getTerm() is not available there.

> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
> --
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/highlighter
>Affects Versions: 2.9
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
>
> Sucks. Will throw a NullPointer exception. 
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.


[jira] Issue Comment Edited: (LUCENE-1929) Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765082#action_12765082
 ] 

Uwe Schindler edited comment on LUCENE-1929 at 10/13/09 7:11 AM:
-

This is also fixed in trunk, but differently, since MTQ.getTerm() is not available there 
(LUCENE-1977)

  was (Author: thetaphi):
This is also fixed in trunk, but differently, since MTQ.getTerm() is not 
available there.
  
> Highlighter doesn't support NumericRangeQuery or deprecated RangeQuery
> --
>
> Key: LUCENE-1929
> URL: https://issues.apache.org/jira/browse/LUCENE-1929
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/highlighter
>Affects Versions: 2.9
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9.1
>
> Attachments: LUCENE-1929.patch
>
>
> Sucks. Will throw a NullPointer exception. 
> Only NumericRangeQuery will throw the exception.
> RangeQuery just won't highlight.


[jira] Resolved: (LUCENE-1978) Remove HitCollector

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-1978.
---

   Resolution: Fixed
Fix Version/s: 3.0

Committed revision: 824781

> Remove HitCollector
> ---
>
> Key: LUCENE-1978
> URL: https://issues.apache.org/jira/browse/LUCENE-1978
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch
>
>
> Remove the rest of HitCollectors


[jira] Updated: (LUCENE-1978) Remove HitCollector

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1978:
--

Component/s: Search

> Remove HitCollector
> ---
>
> Key: LUCENE-1978
> URL: https://issues.apache.org/jira/browse/LUCENE-1978
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1978-bw.patch, LUCENE-1978.patch
>
>
> Remove the rest of HitCollectors


Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller

I can trunk it once more if you'd like - it's already pretty out of date :)

If you haven't started anyway ...


Michael McCandless wrote:
> OK I will cut a branch & commit Mark's last patch onto it, unless
> anyone has objections soonish...
>
> I'll also branch (twig?) the back compat branch so we can commit the
> patch there as well.
>
> Mike
>
> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller  wrote:
>   
>> SVN is about as good at merging branches as any of us are with a patch
>> and trunk unfortunately. But that can still be somewhat more convenient
>> than all these huge patches, with different people at different stages.
>>
>> Depends on how many people end up working on this though. Any more than
>> 2, and I think the branch has got to be worth it.
>>
>> From my perspective, it doesn't make any of the merging process any
>> easier - but it can be easier than juggling all these patches - you have
>> a central code base that can always be targeted for current merging.
>>
>> Michael Busch wrote:
>> 
>>> I think it's supposed to work pretty good - though I have no personal
>>> experience with merging branches with svn.
>>>
>>> I think we should try it - then we'll know! :)
>>>
>>>  Michael
>>>
>>> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:
>>>   
  [
 https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799
 ]

 Michael McCandless commented on LUCENE-1458:
 

 bq. Shall we create a flexible-indexing branch and commit this?

 I think this is a good idea.

 But I haven't played heavily w/ svn & branching.  EG if we branch
 now, and trunk moves fast (which it still is w/ deprecation
 removals), are we going to have conflicts?  Or... is svn good about
 merging branches?


 
> Further steps towards flexible indexing
> ---
>
>  Key: LUCENE-1458
>  URL: https://issues.apache.org/jira/browse/LUCENE-1458
>  Project: Lucene - Java
>   Issue Type: New Feature
>   Components: Index
> Affects Versions: 2.9
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Priority: Minor
>  Attachments: LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
> LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>* Switches to a new more efficient terms dict format.  This still
>  uses tii/tis files, but the tii only stores term & long offset
>  (not a TermInfo).  At seek points, tis encodes term & freq/prox
>  offsets absolutely instead of with deltas.  Also, tis/tii
>  are structured by field, so we don't have to record field number
>  in every term.
> .
>  On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
>  ->  0.64 MB) and tis file is 9% smaller (75.5 MB ->  68.5 MB).
> .
>  RAM usage when loading terms dict index is significantly less
>  since we only load an array of offsets and an array of String (no
>  more TermInfo array).  It should be faster to init too.
> .
>  This part is basically done.
>* Introduces modular reader codec that strongly decouples terms dict
>  from docs/positions readers.  EG there is no more TermInfo used
>  when reading the new format.
> .
>  There's nice symmetry now between reading & writing in the codec
>  chain -- the current docs/prox format

[jira] Commented: (LUCENE-1973) Remove deprecated query components

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765093#action_12765093
 ] 

Uwe Schindler commented on LUCENE-1973:
---

There are still some of them:
- explain() in Scorer (I do not know exactly what to do here; I rarely use 
explain())
- idf() in Similarity
...and some more

> Remove deprecated query components
> --
>
> Key: LUCENE-1973
> URL: https://issues.apache.org/jira/browse/LUCENE-1973
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
> Fix For: 3.0
>
>
> Remove deprecated query components around HitCollector

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic

2009-10-13 Thread Uwe Schindler (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-1972:
--

Attachment: LUCENE-1972-2.patch

Some small additional deprecation removals after finishing the rest. Will commit 
now.

> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of 
> deprecated sort logic
> 
>
> Key: LUCENE-1972
> URL: https://issues.apache.org/jira/browse/LUCENE-1972
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1972-2.patch, LUCENE-1972-bw.patch, 
> LUCENE-1972.patch
>
>
> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort

[jira] Commented: (LUCENE-1959) Index Splitter

2009-10-13 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765097#action_12765097
 ] 

Andrzej Bialecki  commented on LUCENE-1959:
---

Indeed, thanks for the fix - I'll commit this.

> Index Splitter
> --
>
> Key: LUCENE-1959
> URL: https://issues.apache.org/jira/browse/LUCENE-1959
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Jason Rutherglen
>Assignee: Michael McCandless
>Priority: Trivial
> Fix For: 3.0
>
> Attachments: LUCENE-1959.patch, LUCENE-1959.patch, 
> mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, 
> mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch
>
>
> If an index has multiple segments, this tool allows splitting those segments 
> into separate directories.  

[jira] Commented: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765098#action_12765098
 ] 

Uwe Schindler commented on LUCENE-1972:
---

Committed revision: 824792

> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of 
> deprecated sort logic
> 
>
> Key: LUCENE-1972
> URL: https://issues.apache.org/jira/browse/LUCENE-1972
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
>Assignee: Uwe Schindler
> Fix For: 3.0
>
> Attachments: LUCENE-1972-2.patch, LUCENE-1972-bw.patch, 
> LUCENE-1972.patch
>
>
> Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort

[jira] Issue Comment Edited: (LUCENE-1973) Remove deprecated query components

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765093#action_12765093
 ] 

Uwe Schindler edited comment on LUCENE-1973 at 10/13/09 7:57 AM:
-

There are still some of them:
- explain() in Scorer (I do not know what to do exactly here, I use explain() 
very seldom)
- idf() in Similarity
- IndexSearcher.fieldSortDoTrackScores / IS.fieldSortDoMaxScore
- BoostingTermQuery
- MultiValueSource (what to do with it?)
- BooleanQuery scoreDocOutOfOrder & others (LUCENE-944)

I am not familiar with all of these, so I do not want to fix them.

  was (Author: thetaphi):
There are still some of them:
- explain() in Scorer (I do not know what to do exactly here, I use explain() 
very seldom)
- idf() in Similarity
...and some more
  
> Remove deprecated query components
> --
>
> Key: LUCENE-1973
> URL: https://issues.apache.org/jira/browse/LUCENE-1973
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Search
>Reporter: Uwe Schindler
> Fix For: 3.0
>
>
> Remove deprecated query components around HitCollector

[jira] Commented: (LUCENE-1959) Index Splitter

2009-10-13 Thread Andrzej Bialecki (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101
 ] 

Andrzej Bialecki  commented on LUCENE-1959:
---

Committed revision 824798.

> Index Splitter
> --
>
> Key: LUCENE-1959
> URL: https://issues.apache.org/jira/browse/LUCENE-1959
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Jason Rutherglen
>Assignee: Michael McCandless
>Priority: Trivial
> Fix For: 3.0
>
> Attachments: LUCENE-1959.patch, LUCENE-1959.patch, 
> mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, 
> mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch
>
>
> If an index has multiple segments, this tool allows splitting those segments 
> into separate directories.  

Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael McCandless

Yes please!

Mike

On Tue, Oct 13, 2009 at 10:40 AM, Mark Miller wrote:
> I can trunk it once more if you'd like - it's already pretty out of date :)
>
> If you haven't started anyway ...
>
>
> Michael McCandless wrote:
>> OK I will cut a branch & commit Mark's last patch onto it, unless
>> anyone has objections soonish...
>>
>> I'll also branch (twig?) the back compat branch so we can commit the
>> patch there as well.
>>
>> Mike
>>
>> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller  wrote:
>>
>>> SVN is about as good at merging branches as any of us are with a patch
>>> and trunk unfortunately. But that can still be somewhat more convenient
>>> than all these huge patches, with different people at different stages.
>>>
>>> Depends on how many people end up working on this though. Any more than
>>> 2, and I think the branch has got to be worth it.
>>>
>>> From my perspective, it doesn't make any of the merging process any
>>> easier - but it can be easier than juggling all these patches - you have
>>> a central code base that can always be targeted for current merging.
>>>
>>> Michael Busch wrote:
>>>
>>>> I think it's supposed to work pretty good - though I have no personal
>>>> experience with merging branches with svn.
>>>>
>>>> I think we should try it - then we'll know! :)
>>>>
>>>>  Michael
>>>>
>>>> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:
>>>>
>>>>> [
>>>>> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799
>>>>> ]
>>>>>
>>>>> Michael McCandless commented on LUCENE-1458:
>>>>>
>>>>> bq. Shall we create a flexible-indexing branch and commit this?
>>>>>
>>>>> I think this is a good idea.
>>>>>
>>>>> But I haven't played heavily w/ svn & branching.  EG if we branch
>>>>> now, and trunk moves fast (which it still is w/ deprecation
>>>>> removals), are we going to have conflicts?  Or... is svn good about
>>>>> merging branches?
>
>
>

RE: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Uwe Schindler

I think the big changes in the o.a.l.search package are over... :-) - I worked
the whole day on it.

Merging branches with TortoiseSVN works really well; you can even edit the
conflicts directly in the diff view. I used it when fixing the IR/IW hell
deprecations in the BW branch.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

Re: [jira] Commented: (LUCENE-1959) Index Splitter

2009-10-13 Thread Mark Miller

Hmm ... doing some heavy merging so it might be me, but there also might
be a test failure with this now and some of the trunk changes ...

Andrzej Bialecki (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101
>  ] 
>
> Andrzej Bialecki  commented on LUCENE-1959:
> ---
>
> Committed revision 824798.
>


-- 
- Mark

http://www.lucidimagination.com




[jira] Resolved: (LUCENE-1959) Index Splitter

2009-10-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1959.


Resolution: Fixed

Thanks Andrzej!

> Index Splitter
> --
>
> Key: LUCENE-1959
> URL: https://issues.apache.org/jira/browse/LUCENE-1959
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Jason Rutherglen
>Assignee: Michael McCandless
>Priority: Trivial
> Fix For: 3.0
>
> Attachments: LUCENE-1959.patch, LUCENE-1959.patch, 
> mp-splitter-inline.patch, mp-splitter.patch, mp-splitter2.patch, 
> mp-splitter3.patch, mp-splitter4.patch, mp-splitter5.patch
>
>
> If an index has multiple segments, this tool allows splitting those segments 
> into separate directories.  

Re: [jira] Commented: (LUCENE-1959) Index Splitter

2009-10-13 Thread Mark Miller

I think it was me - the test passes when run by itself in Eclipse - must have
been an incremental compile issue or something.

Mark Miller wrote:
> Hmm ... doing some heavy merging so it might be me, but there also might
> be a test failure with this now and some of the trunk changes ...
>
> Andrzej Bialecki (JIRA) wrote:
>   
>> [ 
>> https://issues.apache.org/jira/browse/LUCENE-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765101#action_12765101
>>  ] 
>>
>> Andrzej Bialecki  commented on LUCENE-1959:
>> ---
>>
>> Committed revision 824798.
>>


-- 
- Mark

http://www.lucidimagination.com




[jira] Updated: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated LUCENE-1458:


Attachment: LUCENE-1458.patch

Latest to trunk - still issues with GC and the reopen thread safety test 
(unless the test is run in isolation).

Must be a tweak needed, but I'm not sure what. I'm closing the thread locals 
when the StandardTermsDictReader is closed - I don't see a way to improve on 
that yet.

> Further steps towards flexible indexing
> ---
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>   * Switches to a new more efficient terms dict format.  This still
> uses tii/tis files, but the tii only stores term & long offset
> (not a TermInfo).  At seek points, tis encodes term & freq/prox
> offsets absolutely instead of with deltas.  Also, tis/tii
> are structured by field, so we don't have to record field number
> in every term.
> .
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
> -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
> .
> RAM usage when loading terms dict index is significantly less
> since we only load an array of offsets and an array of String (no
> more TermInfo array).  It should be faster to init too.
> .
> This part is basically done.
>   * Introduces modular reader codec that strongly decouples terms dict
> from docs/positions readers.  EG there is no more TermInfo used
> when reading the new format.
> .
> There's nice symmetry now between reading & writing in the codec
> chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
> This part is basically done.
>   * Introduces a new "flex" API for iterating through the fields,
> terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
> This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
> old API on top of the new API to keep back-compat.
> 
> Next steps:
>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
> fix any hidden assumptions.
>   * Expose new API out of IndexReader, deprecate old API but emulate
> old API on top of new one, switch all core/contrib users to the
> new API.
>   * Maybe switch to AttributeSources as the base class for TermsEnum,
> DocsEnum, PostingsEnum -- this would give readers API flexibility
> (not just index-file-format flexibility).  EG if someone wanted
> to store payload at the term-doc level instead of
> term-doc-position level, you could just add a new attribute.
>   * Test performance & iterate.
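
The seek-point idea in the first bullet of the description (absolute values at seek points, deltas in between) can be sketched roughly as follows. This is only an illustration and not code from the patch: the class name OffsetCodecSketch, the interval parameter and the fixed 8-byte longs are assumptions made here for brevity, and a real terms dict would use variable-length ints and also store the term bytes.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; not Lucene's actual tis/tii format. */
class OffsetCodecSketch {

  /**
   * Writes every offset as a delta from its predecessor, except at each
   * interval-th entry (a "seek point"), which is written as an absolute value.
   */
  static byte[] encode(long[] offsets, int interval) {
    ByteBuffer out = ByteBuffer.allocate(offsets.length * Long.BYTES);
    long previous = 0;
    for (int i = 0; i < offsets.length; i++) {
      boolean seekPoint = (i % interval == 0);
      out.putLong(seekPoint ? offsets[i] : offsets[i] - previous);
      previous = offsets[i];
    }
    return out.array();
  }

  /**
   * Recovers the offset at index by jumping to the nearest seek point at or
   * before it and summing deltas from there; nothing earlier is read.
   */
  static long decodeAt(byte[] encoded, int index, int interval) {
    ByteBuffer in = ByteBuffer.wrap(encoded);
    int seekPoint = (index / interval) * interval;
    long value = in.getLong(seekPoint * Long.BYTES); // absolute value
    for (int i = seekPoint + 1; i <= index; i++) {
      value += in.getLong(i * Long.BYTES);           // deltas after the seek point
    }
    return value;
  }
}
```

Because a reader that lands on a seek point never needs state from earlier entries, the index only has to remember where the seek points are, which is presumably why the tii can get away with storing just a term and a long offset.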

Re: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l

2009-10-13 Thread Michael Busch


On 10/13/09 7:28 AM, uschind...@apache.org wrote:

> @@ -115,7 +95,6 @@
>     * Applications should usually call {@link Searcher#search(Query)} or
>     * {@link Searcher#search(Query,Filter)} instead.
>     * @throws BooleanQuery.TooManyClauses
> -   * @deprecated use {@link #search(Weight, Filter, int)} instead.
>     */
>    TopDocs search(Weight weight, Filter filter, int n) throws IOException;


Was this method just accidentally deprecated?

 Michael

Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael Busch

Shall we first remove the remaining deprecations from the indexer 
package? There are not many more left, shouldn't be much work.


 Michael

On 10/13/09 5:47 AM, Michael McCandless wrote:
> OK I will cut a branch & commit Mark's last patch onto it, unless
> anyone has objections soonish...
>
> I'll also branch (twig?) the back compat branch so we can commit the
> patch there as well.
>
> Mike
>
> On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote:
>> SVN is about as good at merging branches as any of us are with a patch
>> and trunk unfortunately. But that can still be somewhat more convenient
>> than all these huge patches, with different people at different stages.
>>
>> Depends on how many people end up working on this though. Any more than
>> 2, and I think the branch has got to be worth it.
>>
>> From my perspective, it doesn't make any of the merging process any
>> easier - but it can be easier than juggling all these patches - you have
>> a central code base that can always be targeted for current merging.
>>
>> Michael Busch wrote:
>>> I think it's supposed to work pretty good - though I have no personal
>>> experience with merging branches with svn.
>>>
>>> I think we should try it - then we'll know! :)
>>>
>>>  Michael
>>>
>>> On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:
>>>> [
>>>> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799
>>>> ]
>>>>
>>>> Michael McCandless commented on LUCENE-1458:
>>>>
>>>> bq. Shall we create a flexible-indexing branch and commit this?
>>>>
>>>> I think this is a good idea.
>>>>
>>>> But I haven't played heavily w/ svn & branching.  EG if we branch
>>>> now, and trunk moves fast (which it still is w/ deprecation
>>>> removals), are we going to have conflicts?  Or... is svn good about
>>>> merging branches?


 


RE: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l

2009-10-13 Thread Uwe Schindler

I think this was a mistake. Especially because the hint to the replacement
method is the method itself.
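
A check for this kind of slip could look roughly like the sketch below. It is not part of Lucene or of the javadoc tool; the class SelfDeprecationCheckSketch and its method are hypothetical names used only for illustration, and the regex covers just the simple "@deprecated use {@link #name(args)}" form with unqualified parameter types.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; flags a @deprecated hint that links back to the same method. */
class SelfDeprecationCheckSketch {

  // Captures the method name and argument list of the first {@link #name(args)}
  // that follows an @deprecated tag.
  private static final Pattern HINT =
      Pattern.compile("@deprecated[^{]*\\{@link\\s+#(\\w+)\\(([^)]*)\\)\\s*\\}");

  /**
   * Returns true if the javadoc text deprecates a method in favour of a link
   * whose name and parameter types match the method's own signature.
   */
  static boolean pointsAtItself(String javadoc, String methodName, String... paramTypes) {
    Matcher m = HINT.matcher(javadoc);
    if (!m.find()) {
      return false; // no @deprecated hint with a same-class method link
    }
    String linkedArgs = m.group(2).replaceAll("\\s+", "");
    String ownArgs = String.join(",", paramTypes);
    return m.group(1).equals(methodName) && linkedArgs.equals(ownArgs);
  }
}
```

For the javadoc line from the commit, calling pointsAtItself with the name "search" and the parameter types Weight, Filter, int should report a self-reference, which is the case discussed here.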

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Michael Busch [mailto:busch...@gmail.com]
> Sent: Tuesday, October 13, 2009 6:42 PM
> To: java-dev@lucene.apache.org
> Subject: Re: svn commit: r824781 - in /lucene/java/trunk: ./
> contrib/memory/src/java/org/apache/lucene/index/memory/
> contrib/memory/src/test/org/apache/lucene/index/memory/
> contrib/remote/src/java/org/apache/lucene/search/
> contrib/surround/src/test/org/apache/luce
> 
> On 10/13/09 7:28 AM, uschind...@apache.org wrote:
> > @@ -115,7 +95,6 @@
> >     * Applications should usually call {@link Searcher#search(Query)} or
> >     * {@link Searcher#search(Query,Filter)} instead.
> >     * @throws BooleanQuery.TooManyClauses
> > -   * @deprecated use {@link #search(Weight, Filter, int)} instead.
> >     */
> >    TopDocs search(Weight weight, Filter filter, int n) throws IOException;
> >
> 
> Was this method just accidentally deprecated?
> 
>   Michael
> 

[jira] Created: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael Busch (JIRA)

Remove remaining deprecations from indexer package
--

 Key: LUCENE-1979
 URL: https://issues.apache.org/jira/browse/LUCENE-1979
 Project: Lucene - Java
  Issue Type: Task
  Components: Index
Reporter: Michael Busch
Priority: Minor
 Fix For: 3.0




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (LUCENE-1980) Fix javadocs after deprecation removal

2009-10-13 Thread Uwe Schindler (JIRA)

Fix javadocs after deprecation removal
--

 Key: LUCENE-1980
 URL: https://issues.apache.org/jira/browse/LUCENE-1980
 Project: Lucene - Java
  Issue Type: Task
Reporter: Uwe Schindler
 Fix For: 3.0


The Javadocs contain many @link references to methods and classes that no
longer exist, so the javadoc target prints tons of warnings. We should fix that.


Re: svn commit: r824781 - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/memory/src/test/org/apache/lucene/index/memory/ contrib/remote/src/java/org/apache/l

2009-10-13 Thread Michael Busch


Right. I was confused about that too.

 Michael

On 10/13/09 9:43 AM, Uwe Schindler wrote:
> I think this was a mistake. Especially because the hint to the replacement
> method is the method itself.

[jira] Created: (LUCENE-1981) Allow access to entries in the field cache

2009-10-13 Thread Tom Hill (JIRA)

Allow access to entries in the field cache
--

 Key: LUCENE-1981
 URL: https://issues.apache.org/jira/browse/LUCENE-1981
 Project: Lucene - Java
  Issue Type: New Feature
  Components: Search
Affects Versions: 2.9
Reporter: Tom Hill
Priority: Minor


If the required data is already in the field cache, and therefore in RAM, it
seems unnecessary to go to disk for it.

We have a case where we need one field from a large number (500-1000) of
scattered documents in a fairly large index (50-100M docs). The seek time to
collect the data from disk is prohibitive, so we'd like to grab the data from
the cache instead.





[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765149#action_12765149
 ] 

Mark Miller commented on LUCENE-1458:
-

Whoops - I double-checked the wrong index splitter test. The multi-pass one is
throwing a null pointer exception for me. I don't think it's related to this
patch, but I haven't checked.

> Further steps towards flexible indexing
> ---
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>   * Switches to a new more efficient terms dict format.  This still
> uses tii/tis files, but the tii only stores term & long offset
> (not a TermInfo).  At seek points, tis encodes term & freq/prox
> offsets absolutely instead of with deltas.  Also, tis/tii
> are structured by field, so we don't have to record field number
> in every term.
> .
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
> -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
> .
> RAM usage when loading terms dict index is significantly less
> since we only load an array of offsets and an array of String (no
> more TermInfo array).  It should be faster to init too.
> .
> This part is basically done.
>   * Introduces modular reader codec that strongly decouples terms dict
> from docs/positions readers.  EG there is no more TermInfo used
> when reading the new format.
> .
> There's nice symmetry now between reading & writing in the codec
> chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
> This part is basically done.
>   * Introduces a new "flex" API for iterating through the fields,
> terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
> This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
> old API on top of the new API to keep back-compat.
> 
> Next steps:
>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
> fix any hidden assumptions.
>   * Expose new API out of IndexReader, deprecate old API but emulate
> old API on top of new one, switch all core/contrib users to the
> new API.
>   * Maybe switch to AttributeSources as the base class for TermsEnum,
> DocsEnum, PostingsEnum -- this would give readers API flexibility
> (not just index-file-format flexibility).  EG if someone wanted
> to store payload at the term-doc level instead of
> term-doc-position level, you could just add a new attribute.
>   * Test performance & iterate.
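
The point in the description about encoding offsets "absolutely instead of
with deltas" at seek points can be illustrated with a small sketch. It is not
the patch's file format code: the class name, the method names and the fixed
interval are assumptions made up for this illustration. It only shows why an
absolute value at every seek point lets a reader start decoding there without
reading the entries before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not Lucene's actual terms dict code. */
final class SeekPointOffsets {

    /**
     * Stores every interval-th entry as an absolute offset (a seek point)
     * and every other entry as a delta from the previous entry.
     */
    static long[] encode(long[] offsets, int interval) {
        long[] out = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            boolean seekPoint = (i % interval == 0);
            out[i] = seekPoint ? offsets[i] : offsets[i] - offsets[i - 1];
        }
        return out;
    }

    /** Decodes from a seek point onwards without touching earlier entries. */
    static List<Long> decodeFrom(long[] encoded, int interval, int seekPoint) {
        if (seekPoint % interval != 0) {
            throw new IllegalArgumentException("not a seek point: " + seekPoint);
        }
        List<Long> result = new ArrayList<>();
        long current = 0;
        for (int i = seekPoint; i < encoded.length; i++) {
            // Absolute value at seek points, running sum of deltas elsewhere.
            current = (i % interval == 0) ? encoded[i] : current + encoded[i];
            result.add(current);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] offsets = {100, 130, 175, 260, 300};
        long[] encoded = encode(offsets, 2);
        System.out.println(Arrays.toString(encoded));
        System.out.println(decodeFrom(encoded, 2, 2));
    }
}
```

With pure delta encoding, decoding entry 2 would require summing entries 0 and
1 first. The absolute entries cost some space but make each seek point
self-contained.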


[jira] Updated: (LUCENE-1981) Allow access to entries in the field cache

2009-10-13 Thread Tom Hill (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Hill updated LUCENE-1981:
-

Attachment: lucene-1981.patch

Here's a sample implementation. There are a number of possible ways to do this, 
but this seemed pretty minimally invasive.

Adds one method to IndexReader and subclasses.



[jira] Updated: (LUCENE-944) Remove deprecated methods in BooleanQuery

2009-10-13 Thread Michael Busch (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-944:
-

Attachment: lucene-944-bw.patch
lucene-944.patch

Tiny change in QueryUtils#checkSkipTo() to keep it more consistent with how it
worked before.

Also attaching the back-compat patch. Note that I have to make the change to 
checkSkipTo() there too, because it was not changed before to do the search 
per-segment. Now more tests actually run this check, exposing this problem.

All tests pass now.

> Remove deprecated methods in BooleanQuery
> -
>
> Key: LUCENE-944
> URL: https://issues.apache.org/jira/browse/LUCENE-944
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Paul Elschot
>Assignee: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: BooleanQuery20070626.patch, lucene-944-bw.patch, 
> lucene-944.patch, lucene-944.patch, lucene-944.patch
>
>
> Remove deprecated methods setUseScorer14 and getUseScorer14 in BooleanQuery, 
> and adapt javadocs.


[jira] Commented: (LUCENE-1981) Allow access to entries in the field cache

2009-10-13 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765157#action_12765157
 ] 

Yonik Seeley commented on LUCENE-1981:
--

We shouldn't tie IndexReader/SegmentReader to the fieldCache.
All of the public APIs already exist to use the FieldCache instead of  
document().


[jira] Issue Comment Edited: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765149#action_12765149
 ] 

Mark Miller edited comment on LUCENE-1458 at 10/13/09 10:45 AM:


Whoops - I double-checked the wrong index splitter test. The multi-pass one is
throwing a null pointer exception for me. I don't think it's related to this
patch, but I haven't checked.

*edit*

Okay, just checked - it is this patch. Looks like perhaps something to do with
LegacyFieldsEnum? Something that isn't being hit by core tests at the moment (I
didn't run through all the backcompat tests with this yet, since that failed).

  was (Author: markrmil...@gmail.com):
Whoops - I double-checked the wrong index splitter test. The multi-pass one is
throwing a null pointer exception for me. I don't think it's related to this
patch, but I haven't checked.
  

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-10-13 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-1606:


Attachment: LUCENE-1606.patch

updated patch to trunk:
* add support for optional regex features
* remove recursion
* improve performance for worst-case regexp/wildcard/FSM
* improved docs & test 
* remove the fuzzy impl, NFA->DFA too slow for this, maybe a later addition.


> Automaton Query/Filter (scalable regex)
> ---
>
> Key: LUCENE-1606
> URL: https://issues.apache.org/jira/browse/LUCENE-1606
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 3.0
>
> Attachments: automaton.patch, automatonMultiQuery.patch, 
> automatonmultiqueryfuzzy.patch, automatonMultiQuerySmart.patch, 
> automatonWithWildCard.patch, automatonWithWildCard2.patch, LUCENE-1606.patch, 
> LUCENE-1606.patch
>
>
> Attached is a patch for an AutomatonQuery/Filter (name can change if it's not 
> suitable).
> Whereas the out-of-box contrib RegexQuery is nice, I have some very large 
> indexes (100M+ unique tokens) where queries are quite slow, 2 minutes, etc. 
> Additionally all of the existing RegexQuery implementations in Lucene are 
> really slow if there is no constant prefix. This implementation does not 
> depend upon constant prefix, and runs the same query in 640ms.
> Some use cases I envision:
>  1. lexicography/etc on large text corpora
>  2. looking for things such as urls where the prefix is not constant (http:// 
> or ftp://)
> The Filter uses the BRICS package (http://www.brics.dk/automaton/) to convert 
> regular expressions into a DFA. Then, the filter "enumerates" terms in a 
> special way, by using the underlying state machine. Here is my short 
> description from the comments:
>  The algorithm here is pretty basic. Enumerate terms but instead of a 
> binary accept/reject do:
>   
>  1. Look at the portion that is OK (did not enter a reject state in the 
> DFA)
>  2. Generate the next possible String and seek to that.
> The Query simply wraps the filter with ConstantScoreQuery.
> I did not include the automaton.jar inside the patch but it can be downloaded 
> from http://www.brics.dk/automaton/ and is BSD-licensed.
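
The two-step enumeration described above (keep the portion of the term the DFA
still accepts, then generate the next possible string and seek to it) can be
sketched roughly as follows. This is not the code from the patch and does not
use the BRICS API. The hand-built DFA, the class name and the method names are
all assumptions for illustration, and a sorted TreeSet stands in for the term
dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: a hand-built DFA stands in for the BRICS automaton. */
final class AutomatonSeekSketch {
    // transitions.get(s) maps an input char to the next state; state 0 is the start state.
    private final List<TreeMap<Character, Integer>> transitions = new ArrayList<>();
    private final Set<Integer> accepting = new HashSet<>();

    int addState(boolean accept) {
        transitions.add(new TreeMap<>());
        int id = transitions.size() - 1;
        if (accept) accepting.add(id);
        return id;
    }

    void addTransition(int from, char c, int to) {
        transitions.get(from).put(c, to);
    }

    /** Returns the accepted terms, seeking past ranges the DFA can never accept. */
    List<String> matches(TreeSet<String> terms) {
        List<String> hits = new ArrayList<>();
        String term = terms.isEmpty() ? null : terms.first();
        while (term != null) {
            // states[i] is the DFA state after consuming term[0..i).
            int[] states = new int[term.length() + 1];
            int i = 0;
            while (i < term.length()) {
                Integer next = transitions.get(states[i]).get(term.charAt(i));
                if (next == null) break; // entered a reject state at position i
                states[++i] = next;
            }
            if (i == term.length()) {
                // Whole term consumed: accept or not, then just advance.
                if (accepting.contains(states[i])) hits.add(term);
                term = terms.higher(term);
                continue;
            }
            // Step 1: term[0..i) is the portion that is still OK.
            // Step 2: find the smallest string greater than the term that the
            // DFA could still accept, backing up one char at a time if needed.
            String seekTo = null;
            for (int p = i; p >= 0 && seekTo == null; p--) {
                Character c = transitions.get(states[p]).higherKey(term.charAt(p));
                if (c != null) seekTo = term.substring(0, p) + c;
            }
            term = (seekTo == null) ? null : terms.ceiling(seekTo);
        }
        return hits;
    }

    public static void main(String[] args) {
        // DFA for the pattern [bc]a+t
        AutomatonSeekSketch dfa = new AutomatonSeekSketch();
        int s0 = dfa.addState(false), s1 = dfa.addState(false);
        int s2 = dfa.addState(false), s3 = dfa.addState(true);
        dfa.addTransition(s0, 'b', s1);
        dfa.addTransition(s0, 'c', s1);
        dfa.addTransition(s1, 'a', s2);
        dfa.addTransition(s2, 'a', s2);
        dfa.addTransition(s2, 't', s3);
        TreeSet<String> terms = new TreeSet<>(
                Arrays.asList("ant", "baat", "bat", "bet", "cat", "cot", "dog"));
        System.out.println(dfa.matches(terms));
    }
}
```

In this toy run the term "bet" is rejected at its second character, so the
loop seeks straight to the first term at or after "c" instead of testing every
term in between. That skipping is what removes the dependence on a constant
prefix.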


[jira] Commented: (LUCENE-1342) 64bit JVM crashes on Linux

2009-10-13 Thread Amit Nithian (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765180#action_12765180
 ] 

Amit Nithian commented on LUCENE-1342:
--

I just encountered this error in our own QA environment. The last 3 days our 
JVM has been dying around 3AM with this bug and I am running 1.6.0_12. What 
OS/hardware environments are causing problems? I am running CentOS 5.2 and I'll 
attach my crash dump too.

Has anyone seen any info on the Sun lists about this? I perused the change logs 
from 13-16 and didn't see anything specific to this unless it was listed as 
something else.


[jira] Updated: (LUCENE-1342) 64bit JVM crashes on Linux

2009-10-13 Thread Amit Nithian (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Nithian updated LUCENE-1342:
-

Attachment: hs_err_pid13693.log

> 64bit JVM crashes on Linux
> --
>
> Key: LUCENE-1342
> URL: https://issues.apache.org/jira/browse/LUCENE-1342
> Project: Lucene - Java
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: 2.6.18-53.el5 x86_64  GNU/Linux
> Java(TM) SE Runtime Environment (build 1.6.0_04-b12)
>Reporter: Kevin Richards
> Attachments: hs_err_pid10565.log, hs_err_pid13693.log, 
> hs_err_pid21301.log, hs_err_pid27882.log, jvmerror.log
>
>
> Whilst running lucene in our QA environment we received the following 
> exception. This problem was also reported here : 
> http://confluence.atlassian.com/display/KB/JSP-20240+-+POSSIBLE+64+bit+JDK+1.6+update+4+may+have+HotSpot+problems.
> Is this a JVM problem or a problem in Lucene.
> #
> # An unexpected error has been detected by Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x2adb9e3f, pid=2275, tid=1085356352
> #
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b19 mixed mode linux-amd64)
> # Problematic frame:
> # V  [libjvm.so+0x1fce3f]
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> ---  T H R E A D  ---
> Current thread (0x2aab0007f000):  JavaThread "CompilerThread0" daemon 
> [_thread_in_vm, id=2301, stack(0x40a13000,0x40b14000)]
> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), 
> si_addr=0x
> Registers:
> RAX=0x, RBX=0x2aab0007f000, RCX=0x, 
> RDX=0x2aab00309aa0
> RSP=0x40b10f60, RBP=0x40b10fb0, RSI=0x2aaab37d1ce8, 
> RDI=0x2aaad000
> R8 =0x2b40cd88, R9 =0x0ffc, R10=0x2b40cd90, 
> R11=0x2b410810
> R12=0x2aab00ae60b0, R13=0x2aab0a19cc30, R14=0x40b112f0, 
> R15=0x2aab00ae60b0
> RIP=0x2adb9e3f, EFL=0x00010246, CSGSFS=0x0033, 
> ERR=0x0004
>   TRAPNO=0x000e
> Top of Stack: (sp=0x40b10f60)
> 0x40b10f60:   2aab0007f000 
> 0x40b10f70:   2aab0a19cc30 0001
> 0x40b10f80:   2aab0007f000 
> 0x40b10f90:   40b10fe0 2aab0a19cc30
> 0x40b10fa0:   2aab0a19cc30 2aab00ae60b0
> 0x40b10fb0:   40b10fe0 2ae9c2e4
> 0x40b10fc0:   2b413210 2b413350
> 0x40b10fd0:   40b112f0 2aab09796260
> 0x40b10fe0:   40b110e0 2ae9d7d8
> 0x40b10ff0:   2b40f3d0 2aab08c2a4c8
> 0x40b11000:   40b11940 2aab09796260
> 0x40b11010:   2aab09795b28 
> 0x40b11020:   2aab08c2a4c8 2aab009b9750
> 0x40b11030:   2aab09796260 40b11940
> 0x40b11040:   2b40f3d0 2023
> 0x40b11050:   40b11940 2aab09796260
> 0x40b11060:   40b11090 2b0f199e
> 0x40b11070:   40b11978 2aab08c2a458
> 0x40b11080:   2b413210 2023
> 0x40b11090:   40b110e0 2b0f1fcf
> 0x40b110a0:   2023 2aab09796260
> 0x40b110b0:   2aab08c2a3c8 40b123b0
> 0x40b110c0:   2aab08c2a458 40b112f0
> 0x40b110d0:   2b40f3d0 2aab00043670
> 0x40b110e0:   40b11160 2b0e808d
> 0x40b110f0:   2aab000417c0 2aab009b66a8
> 0x40b11100:    2aab009b9750
> 0x40b0:   40b112f0 2aab009bb360
> 0x40b11120:   0003 40b113d0
> 0x40b11130:   01002aab0052d0c0 40b113d0
> 0x40b11140:   00b3 40b112f0
> 0x40b11150:   40b113d0 2aab08c2a108 
> Instructions: (pc=0x2adb9e3f)
> 0x2adb9e2f:   48 89 5d b0 49 8b 55 08 49 8b 4c 24 08 48 8b 32
> 0x2adb9e3f:   4c 8b 21 8b 4e 1c 49 8d 7c 24 10 89 cb 4a 39 34 
> Stack: [0x40a13000,0x40b14000],  sp=0x40b10f60,  free 
> space=1015k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> V  [libjvm.so+0x1fce3f]
> V  [libjvm.so+0x2df2e4]
> V  [libjvm.so+0x2e07d8]
> V  [libjvm.so+0x52b08d]
> V  [libjvm.so+0x524914]
> V  [libjvm.so+0x51c0ea]
> V  [libjvm.so+0x519f77]
> V  [libjvm.so+0x519e7c]
> V  [libjvm.so+0x519ad5]
> V  [libjvm.so+0x1e0cf4]
> V  [libjvm.so+0x2a0bc0]
> V  [libjvm.so+0x528e03]
> V  [libjvm.so+0x51c0ea]
> V  [libjvm.so+0x519f77]
> V  [libjvm.so+0x519e7c]
> V  [libjvm.so+0x519ad5]
> V  [libjvm.s

[jira] Resolved: (LUCENE-944) Remove deprecated methods in BooleanQuery

2009-10-13 Thread Michael Busch (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch resolved LUCENE-944.
--

Resolution: Fixed

Committed revision 824870.


[jira] Updated: (LUCENE-1756) contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test

2009-10-13 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-1756:


Lucene Fields: [New, Patch Available]  (was: [New])
Fix Version/s: 3.0
 Assignee: Robert Muir

Assigning this one to myself. If there are no objections to the fix, I would
like to commit it soon.

> contrib/memory: PatternAnalyzerTest is a very, very, VERY, bad unit test
> 
>
> Key: LUCENE-1756
> URL: https://issues.apache.org/jira/browse/LUCENE-1756
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/*
>Reporter: Hoss Man
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 3.0
>
> Attachments: LUCENE-1756.patch
>
>
> While working on something else, I started getting consistent 
> IllegalStateExceptions from PatternAnalyzerTest -- but only when running the 
> test from the top level.
> Digging into the test, I've found numerous things that are very scary...
> * instead of using assertions to test that token streams match, it throws an 
> IllegalStateException when they don't, and then logs a bunch of info about 
> the token streams to System.out -- having assertion messages that tell you 
> *exactly* what doesn't match would make a lot more sense.
> * it builds up a list of files to analyze using paths that it evaluates 
> relative to the current working directory -- which means you get different 
> files depending on whether you run the tests from the contrib level, or from 
> the top level build file
> * the list of files it looks for includes: "../../*.txt", "../../*.html", 
> "../../*.xml" ... so not only do you get different results when you run the 
> tests in the contrib vs at the top level, but different people running the 
> tests via the top level build file will get different results depending on 
> what types of text, html, and xml files they happen to have two directories 
> above where they checked out lucene.
> * the test comments indicate that its purpose is to show that 
> PatternAnalyzer produces the same tokens as other analyzers - but point out 
> this will fail for WhitespaceAnalyzer because of the 255 character token 
> limit WhitespaceTokenizer imposes -- the test then proceeds to compare 
> PatternAnalyzer to WhitespaceTokenizer, guaranteeing a test failure for anyone 
> who happens to have a text file containing more than 255 characters of 
> non-whitespace in a row somewhere in "../../" (in my case: my bookmarks.html 
> file, and the hex encoded favicon.gif images)
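
As a rough illustration of the first point in the list above, a comparison
helper can report the exact position and content of the first mismatch instead
of throwing a bare IllegalStateException. This is only a sketch, not the
attached patch: the class and method names are made up, and plain string lists
stand in for the token streams.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; string lists stand in for analyzer token streams. */
final class TokenAssert {

    /** Fails with a message naming the first token that differs. */
    static void assertTokensEqual(String label, List<String> expected, List<String> actual) {
        int n = Math.min(expected.size(), actual.size());
        for (int i = 0; i < n; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                throw new AssertionError(label + ": token " + i + " differs: expected <"
                        + expected.get(i) + "> but was <" + actual.get(i) + ">");
            }
        }
        if (expected.size() != actual.size()) {
            throw new AssertionError(label + ": expected " + expected.size()
                    + " tokens but got " + actual.size());
        }
    }

    public static void main(String[] args) {
        assertTokensEqual("identical streams",
                Arrays.asList("quick", "brown", "fox"),
                Arrays.asList("quick", "brown", "fox"));
        try {
            assertTokensEqual("whitespace vs pattern",
                    Arrays.asList("quick", "brown", "fox"),
                    Arrays.asList("quick", "brwn", "fox"));
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A failure message of this kind shows which analyzer pair and which token
position disagree, so nothing has to be dug out of System.out.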


Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Michael Busch


Hi all,

I wrote a draft for a mail I'd like to send to java-user to get some 
feedback about the proposed changes to our backwards-compatibility 
policy we discussed here and on LUCENE-1698.

Let me know what you think please!

 Michael


Hello Lucene users:

In the past we have discussed our backwards-compatibility policy
frequently on the Lucene developer mailing list and we are very tempted
to make some significant changes. In this mail I'd like to outline the
proposed changes to get some feedback from the user community.

Our current backwards-compatibility policy regarding API changes
states that we can only make changes that break
backwards-compatibility in major releases (3.0, 4.0, etc.); the next
major release is the upcoming 3.0.

Given how infrequently Lucene has made major releases in the past, this
means that deprecated APIs need to stay in Lucene for a very long
time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0
before we can remove it. This means that the code gets very cluttered
and adding new features gets somewhat more difficult, as attention has
to be paid to properly support the old *and* new APIs for quite a long
time.

The current policy also leads to delaying a last minor release before
a major release (e.g. 2.9), because the developers consider it as the
last chance for a long time to introduce new APIs and deprecate old ones.

The proposal now is to change this policy so that an API can
only be removed if it was deprecated in at least one release, which
can be a major *or* minor release. E.g. if we deprecate an API and
release it with 3.1, we can remove it with the 3.2 release.

For users this means of course that a simple jar drop-in replacement
won't be possible anymore with almost every Lucene release (excluding
bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if
you're using a non-deprecated API it will be in the next release.

Note that of course these proposed changes do not affect
backwards-compatibility with old index formats. I.e. it will still be
possible to read all 3.X indexes with any Lucene 4.X version.

Our main goal is to find the right balance between
backwards-compatibility support for all the Lucene users out there and
fast and productive development of new features. If we get positive
feedback here we will call a vote on the development mailing list where
the committers have to officially decide whether to make these changes
or not.

Note that in any case the changes will take effect *after* the 3.0
release.

On behalf of the Lucene developers,
 Michael Busch
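
To make the deprecation cycle described in the draft concrete, here is a
small sketch, assuming a made-up API. ExampleCounter, countTokens and
numTokens are hypothetical and are not Lucene classes or methods; the version
numbers in the comments simply follow the 3.1 / 3.2 / 4.0 example from the
draft.

```java
/** Hypothetical API used only to illustrate the deprecation cycle; not a Lucene class. */
final class ExampleCounter {

  private ExampleCounter() {}

  /** New API, imagined as introduced in release 3.1. */
  static int countTokens(String text) {
    String trimmed = text.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
  }

  /**
   * Old API, imagined as deprecated in 3.1. Under the proposed policy it
   * could be deleted in 3.2; under the current policy not before 4.0.
   *
   * @deprecated use {@link #countTokens(String)} instead
   */
  @Deprecated
  static int numTokens(String text) {
    return countTokens(text); // delegate so old and new callers agree
  }
}
```

While both methods exist, every change has to keep them in agreement, which
is the maintenance cost the draft refers to.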

[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765204#action_12765204
 ] 

Mark Miller commented on LUCENE-1458:
-

Looks pretty simple - the field is not getting set with LegacyFieldsEnum.

> Further steps towards flexible indexing
> ---
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>   * Switches to a new more efficient terms dict format.  This still
> uses tii/tis files, but the tii only stores term & long offset
> (not a TermInfo).  At seek points, tis encodes term & freq/prox
> offsets absolutely instead of with deltas.  Also, tis/tii
> are structured by field, so we don't have to record field number
> in every term.
> .
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
> -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
> .
> RAM usage when loading terms dict index is significantly less
> since we only load an array of offsets and an array of String (no
> more TermInfo array).  It should be faster to init too.
> .
> This part is basically done.
>   * Introduces modular reader codec that strongly decouples terms dict
> from docs/positions readers.  EG there is no more TermInfo used
> when reading the new format.
> .
> There's nice symmetry now between reading & writing in the codec
> chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
> This part is basically done.
>   * Introduces a new "flex" API for iterating through the fields,
> terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
> This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
> old API on top of the new API to keep back-compat.
> 
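
The seek-point idea from the terms dict item above (absolute values at seek
points, deltas in between) can be sketched as follows. This is only an
illustration under stated assumptions: SeekPointOffsets, encode, decodeAt
and the interval parameter are hypothetical and do not reflect the actual
tis/tii layout in the patch.

```java
/** Hypothetical illustration; not the actual tis/tii file format. */
final class SeekPointOffsets {

  private SeekPointOffsets() {}

  /**
   * Encodes ascending offsets: every interval-th entry is stored as-is
   * (a seek point), all others as the delta from the previous offset.
   * Assumes interval >= 1.
   */
  static long[] encode(long[] offsets, int interval) {
    long[] out = new long[offsets.length];
    for (int i = 0; i < offsets.length; i++) {
      out[i] = (i % interval == 0) ? offsets[i] : offsets[i] - offsets[i - 1];
    }
    return out;
  }

  /** Decodes the offset at index, starting from the nearest seek point at or before it. */
  static long decodeAt(long[] encoded, int interval, int index) {
    int seek = index - (index % interval); // nearest seek point
    long value = encoded[seek];            // absolute: no earlier entries needed
    for (int i = seek + 1; i <= index; i++) {
      value += encoded[i];                 // apply deltas up to the target
    }
    return value;
  }
}
```

Because a seek point carries an absolute value, a reader can jump to it and
decode forward without touching anything stored before it.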
> Next steps:
>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
> fix any hidden assumptions.
>   * Expose new API out of IndexReader, deprecate old API but emulate
> old API on top of new one, switch all core/contrib users to the
> new API.
>   * Maybe switch to AttributeSources as the base class for TermsEnum,
> DocsEnum, PostingsEnum -- this would give readers API flexibility
> (not just index-file-format flexibility).  EG if someone wanted
> to store payload at the term-doc level instead of
> term-doc-position level, you could just add a new attribute.
>   * Test performance & iterate.
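
As a rough picture of the fields -> terms -> docs -> positions nesting that
the new "flex" API iterates over, here is a toy in-memory model. It is a
sketch only: FlexWalkSketch, add and walk are hypothetical names, nested
sorted maps stand in for the enum classes, and none of this mirrors the
signatures in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy stand-in for the field -> term -> doc -> position nesting; all names are hypothetical. */
final class FlexWalkSketch {

  private FlexWalkSketch() {}

  /** Records that term occurs in field of document doc at the given positions. */
  static void add(Map<String, Map<String, Map<Integer, int[]>>> index,
                  String field, String term, int doc, int... positions) {
    index.computeIfAbsent(field, f -> new TreeMap<>())
         .computeIfAbsent(term, t -> new TreeMap<>())
         .put(doc, positions);
  }

  /** Walks fields, then terms, then docs, and reports each doc's positions. */
  static List<String> walk(Map<String, Map<String, Map<Integer, int[]>>> index) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Map<String, Map<Integer, int[]>>> field : index.entrySet()) {
      for (Map.Entry<String, Map<Integer, int[]>> term : field.getValue().entrySet()) {
        for (Map.Entry<Integer, int[]> doc : term.getValue().entrySet()) {
          out.add(field.getKey() + ":" + term.getKey()
              + " doc=" + doc.getKey()
              + " positions=" + Arrays.toString(doc.getValue()));
        }
      }
    }
    return out;
  }
}
```

Each loop level corresponds to one stage of the chain named in the issue;
in the real design each stage would be a separate enum object supplied by
the codec rather than a map.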


Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Michael McCandless

Looks good!

Mike


Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Mark Miller

I think it should be more clear that the devs have not come to an
agreement on this change yet, regardless of the community's input.


-- 
- Mark

http://www.lucidimagination.com





Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Mark Miller

For the record - I still don't see what we gain but confusion.

The major numbers don't have any significant meaning in terms of
features or advancements.

If we want to remove deprecations faster after deprecating in 4.1, we
should just not release 4.2, 4.3, 4.4, and 4.5, and go straight to 4.9.

We should go from 4.1 to 4.9, or 4.1, 4.2, then 4.9. We have always just
chosen how long we were stuck with stuff by how fast we decided to skip
the dots.




-- 
- Mark

http://www.lucidimagination.com





Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Yonik Seeley

I think I'm against sending such a request for feedback - and I think
we already know what the results will be.
The email reads like "we want to do this, OK?" - and the beneficiaries
of what is a volunteer effort are likely to respond overwhelmingly
"OK!".  One could take the reverse position and probably get just as
many positive responses.

Devs should decide, and if feedback is needed to help that, a neutral
way of asking should be used.

-Yonik
http://www.lucidimagination.com


Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Michael Busch


On 10/13/09 1:11 PM, Mark Miller wrote:
> I think it should be more clear that the devs have not come to an
> agreement on this change yet, regardless of the community's input.

OK I made a few changes near the end to make that clearer. How's it now?

Draft:

Hello Lucene users:

In the past we have discussed our backwards-compatibility policy
frequently on the Lucene developer mailing list and we are very tempted
to make some significant changes. In this mail I'd like to outline the
proposed changes to get some feedback from the user community.

Our current backwards-compatibility policy regarding API changes
states that we can only make changes that break
backwards-compatibility in major releases (3.0, 4.0, etc.); the next
major release is the upcoming 3.0.

Given how infrequently Lucene has made major releases in the past, this
means that deprecated APIs need to stay in Lucene for a very long
time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0
before we can remove it. This means that the code gets very cluttered
and adding new features gets somewhat more difficult, as attention has
to be paid to properly support the old *and* new APIs for quite a long
time.

The current policy also leads to delaying a last minor release before
a major release (e.g. 2.9), because the developers consider it as the
last chance for a long time to introduce new APIs and deprecate old ones.

The proposal now is to change this policy so that an API can
only be removed if it was deprecated in at least one release, which
can be a major *or* minor release. E.g. if we deprecate an API and
release it with 3.1, we can remove it with the 3.2 release.

For users this means of course that a simple jar drop-in replacement
won't be possible anymore with almost every Lucene release (excluding
bugfix releases, e.g. 2.9.0->2.9.1). However, you can be sure that if
you're using a non-deprecated API it will be in the next release.

Note that of course these proposed changes do not affect
backwards-compatibility with old index formats. I.e. it will still be
possible to read all 3.X indexes with any Lucene 4.X version.

Our main goal is to find the right balance between
backwards-compatibility support for all the Lucene users out there and
fast and productive development of new features.

The developers haven't come to an agreement on this proposal yet, hence
we'd like to ask the user community for feedback to help us make a
decision. After we have gathered some feedback here we will call a vote on
the development mailing list where the committers have to officially decide
whether to make these changes or not.

Note that in any case the changes will take effect *after* the 3.0
release.

On behalf of the Lucene developers,
 Michael Busch





Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Michael Busch


On 10/13/09 1:18 PM, Yonik Seeley wrote:
> I think I'm against sending such a request for feedback - and I think
> we already know what the results will be.

I've mentioned it several times on java-dev and LUCENE-1698 that I'd
like to ask the user community and nobody objected.

> The email reads like "we want to do this, OK?" - and the beneficiaries
> of what is a volunteer effort are likely to respond overwhelmingly
> "OK!".  One could take the reverse position and probably get just as
> many positive responses.
>
> Devs should decide, and if feedback is needed to help that, a neutral
> way of asking should be used.

Do you want to draft a new mail?

 Michael



Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Andi Vajda



On Tue, 13 Oct 2009, Mark Miller wrote:

> For the record - I still don't see what we gain but confusion.
>
> The major numbers don't have any significant meaning in terms of
> features or advancements.

That's a perception we don't have control over.

A release incrementing the major release number is considered major whether
we like to think so or not.

If that release only contains backwards compatibility breaking changes and
nothing much else to talk about, it is not that major and is likely to cause
disappointment.

Andi..

> If we want to remove deprecations faster after deprecating in 4.1, we
> should just not release 4.2, 4.3, 4.4, and 4.5, and go straight to 4.9.
>
> We should go from 4.1 to 4.9, or 4.1, 4.2, then 4.9. We have always just
> chosen how long we were stuck with stuff by how fast we decided to skip
> the dots.



Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael Busch


On 10/13/09 9:43 AM, Michael Busch wrote:
> Shall we first remove the remaining deprecations from the indexer
> package? There are not many more left, shouldn't be much work.

I wasn't quick enough for you :) Working on LUCENE-1979 now - that will
be the first test on how good svn merge is!

 Michael

>  Michael

On 10/13/09 5:47 AM, Michael McCandless wrote:

OK I will cut a branch & commit Mark's last patch onto it, unless
anyone has objections soonish...

I'll also branch (twig?) the back compat branch so we can commit the
patch there as well.

Mike

On Mon, Oct 12, 2009 at 10:50 PM, Mark Miller wrote:

SVN is about as good at merging branches as any of us are with a patch
and trunk unfortunately. But that can still be somewhat more convenient
than all these huge patches, with different people at different stages.

Depends on how many people end up working on this though. Any more than
2, and I think the branch has got to be worth it.

From my perspective, it doesn't make any of the merging process any
easier - but it can be easier than juggling all these patches - you have
a central code base that can always be targeted for current merging.

Michael Busch wrote:

I think it's supposed to work pretty good - though I have no personal
experience with merging branches with svn.

I think we should try it - then we'll know! :)

  Michael

On 10/12/09 12:32 PM, Michael McCandless (JIRA) wrote:

  [ https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764799#action_12764799 ]

Michael McCandless commented on LUCENE-1458:

bq. Shall we create a flexible-indexing branch and commit this?

I think this is a good idea.

But I haven't played heavily w/ svn & branching.  EG if we branch
now, and trunk moves fast (which it still is w/ deprecation
removals), are we going to have conflicts?  Or... is svn good about
merging branches?



Further steps towards flexible indexing
---

  Key: LUCENE-1458
  URL: 
https://issues.apache.org/jira/browse/LUCENE-1458

  Project: Lucene - Java
   Issue Type: New Feature
   Components: Index
 Affects Versions: 2.9
 Reporter: Michael McCandless
 Assignee: Michael McCandless
 Priority: Minor
  Attachments: LUCENE-1458-back-compat.patch,
LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch,
LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch,
LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch,
LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2,
LUCENE-1458.tar.bz2


I attached a very rough checkpoint of my current patch, to get early
feedback.  All tests pass, though back compat tests don't pass due to
changes to package-private APIs plus certain bugs in tests that
happened to work (eg call TermPositions.nextPosition() too many times,
which the new API asserts against).
[Aside: I think, when we commit changes to package-private APIs such
that back-compat tests don't pass, we could go back, make a branch on
the back-compat tag, commit changes to the tests to use the new
package private APIs on that branch, then fix nightly build to use the
tip of that branch?]
There's still plenty to do before this is committable! This is a
rather large change:
* Switches to a new more efficient terms dict format.  This still
  uses tii/tis files, but the tii only stores term & long offset
  (not a TermInfo).  At seek points, tis encodes term & freq/prox
  offsets absolutely instead of with deltas.  Also, tis/tii
  are structured by field, so we don't have to record field number
  in every term.
.
  On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
  -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
.
  RAM usage when loading terms dict index is significantly less
  since we only load an array of offsets and an array of String (no
  more TermInfo array).  It should be faster to init too.
.
  This part is basically done.
* Introduces modular reader codec that strongly decouples terms dict
  from docs/positions readers.  EG there is no more TermInfo used
  when reading the new format.
.
  There's nice symmetry now between reading & writing in the codec
  chain -- the current docs/prox format is captured in:
{code}
FormatPostingsTermsDictWriter/Reader
FormatPostingsDocsWriter/Reader (.frq file) and
FormatPostingsPositionsWriter/Reader (.prx file).
{code}
  This part is basically done.
* Introduces a new "fl

Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Yonik Seeley

On Tue, Oct 13, 2009 at 4:25 PM, Michael Busch  wrote:
> I've mentioned it several times on java-dev and LUCENE-1698 that I'd like to
> ask the user
> community and nobody objected.

It's the old polling problem - how you ask influences the outcome (as
I said below), and you didn't say exactly how you were going to ask
before.

>> The email reads like "we want to do this, OK?" - and the beneficiaries
>> of what is a volunteer effort are likely to respond overwhelmingly
>> "OK!".  One could take the reverse position and probably get just as
>> many positive responses.
>>
>>   Devs should decide, and if feedback is needed to help that, a neutral
>> way of asking should be used.
>>
>
> Do you want to draft a new mail?

Only if I was sure I wanted feedback :-)

Which do you prefer as a back compatibility policy for Lucene:
A) best effort drop-in back compatibility for minor version numbers
(e.g. v3.5 will be compatible with v3.2)
B) best effort drop-in back compatibility for the next minor version
number only, and deprecations may be removed after one minor release
(e.g. v3.3 will be compatible with v3.2, but v3.4 need not be)

In either case forward index format compatibility would be maintained
for an entire major version and the previous (e.g. v3.5 would be able
to read an index written by v2.2)
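
The two options and the shared index-format rule can be sketched as
simple version checks. This is only an illustration: the class
BackCompatOptionsSketch and its method names are hypothetical, versions
are reduced to (major, minor) pairs, and the rule that a reader never
reads an index written by a newer release is an assumption added here,
not something stated in the poll text.

```java
/** Hypothetical helper comparing the two poll options; versions are (major, minor) pairs. */
final class BackCompatOptionsSketch {
    /** Option A: drop-in compatible with any earlier minor release of the same major version. */
    static boolean dropInUnderA(int oldMajor, int oldMinor, int newMajor, int newMinor) {
        return newMajor == oldMajor && newMinor >= oldMinor;
    }

    /** Option B: drop-in compatible only with the same or the immediately preceding minor release. */
    static boolean dropInUnderB(int oldMajor, int oldMinor, int newMajor, int newMinor) {
        return newMajor == oldMajor && newMinor >= oldMinor && newMinor - oldMinor <= 1;
    }

    /** Either option: an index is readable within its major version and by the next major version. */
    static boolean canReadIndex(int writerMajor, int writerMinor, int readerMajor, int readerMinor) {
        // Assumption: a reader is never expected to read an index written by a newer release.
        boolean readerNotOlder = readerMajor > writerMajor
                || (readerMajor == writerMajor && readerMinor >= writerMinor);
        return readerNotOlder && readerMajor - writerMajor <= 1;
    }

    public static void main(String[] args) {
        System.out.println(dropInUnderA(3, 2, 3, 5)); // true
        System.out.println(dropInUnderB(3, 2, 3, 3)); // true
        System.out.println(dropInUnderB(3, 2, 3, 4)); // false
        System.out.println(canReadIndex(2, 2, 3, 5)); // true
        System.out.println(canReadIndex(2, 2, 4, 0)); // false
    }
}
```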

http://www.lucidimagination.com
-Yonik


[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765234#action_12765234
 ] 

Michael McCandless commented on LUCENE-1458:


OK I think I've committed Mark's last patch onto this branch:

  https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458

and I also branched the 2.9 back-compat branch and committed the last back 
compat patch:

  
https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458_2_9_back_compat_tests

Mark can you check it out & see if I missed anything?

> Further steps towards flexible indexing
> ---
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>   * Switches to a new more efficient terms dict format.  This still
> uses tii/tis files, but the tii only stores term & long offset
> (not a TermInfo).  At seek points, tis encodes term & freq/prox
> offsets absolutely instead of with deltas.  Also, tis/tii
> are structured by field, so we don't have to record field number
> in every term.
> .
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
> -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
> .
> RAM usage when loading terms dict index is significantly less
> since we only load an array of offsets and an array of String (no
> more TermInfo array).  It should be faster to init too.
> .
> This part is basically done.
>   * Introduces modular reader codec that strongly decouples terms dict
> from docs/positions readers.  EG there is no more TermInfo used
> when reading the new format.
> .
> There's nice symmetry now between reading & writing in the codec
> chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
> This part is basically done.
>   * Introduces a new "flex" API for iterating through the fields,
> terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
> This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
> old API on top of the new API to keep back-compat.
> 
> Next steps:
>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
> fix any hidden assumptions.
>   * Expose new API out of IndexReader, deprecate old API but emulate
> old API on top of new one, switch all core/contrib users to the
> new API.
>   * Maybe switch to AttributeSources as the base class for TermsEnum,
> DocsEnum, PostingsEnum -- this would give readers API flexibility
> (not just index-file-format flexibility).  EG if someone wanted
> to store payload at the term-doc level instead of
> term-doc-position level, you could just add a new attribute.
>   * Test performance & iterate.
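
The four-level iteration named in the issue description (fields, then
terms, then docs, then positions) can be pictured with a toy in-memory
index. The sketch below is not the flex API itself: FlexChainSketch and
its methods are hypothetical, and nested TreeMaps stand in for the
FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum chain, whose real
signatures are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical toy index: field -> term -> docID -> positions. Not the real Lucene classes. */
final class FlexChainSketch {
    private final TreeMap<String, TreeMap<String, TreeMap<Integer, List<Integer>>>> index =
            new TreeMap<>();

    /** Records one occurrence of a term in a field of a document. */
    void add(String field, String term, int doc, int position) {
        index.computeIfAbsent(field, f -> new TreeMap<>())
             .computeIfAbsent(term, t -> new TreeMap<>())
             .computeIfAbsent(doc, d -> new ArrayList<>())
             .add(position);
    }

    /** Walks the index one level per loop: fields, terms, docs, positions. */
    List<String> walk() {
        List<String> out = new ArrayList<>();
        for (String field : index.keySet()) {                       // field level
            TreeMap<String, TreeMap<Integer, List<Integer>>> terms = index.get(field);
            for (String term : terms.keySet()) {                    // term level
                TreeMap<Integer, List<Integer>> docs = terms.get(term);
                for (Integer doc : docs.keySet()) {                 // doc level
                    for (Integer pos : docs.get(doc)) {             // position level
                        out.add(field + ":" + term + " doc=" + doc + " pos=" + pos);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        FlexChainSketch sketch = new FlexChainSketch();
        sketch.add("body", "lucene", 7, 3);
        sketch.add("body", "lucene", 7, 9);
        sketch.add("body", "index", 2, 0);
        sketch.add("title", "lucene", 7, 1);
        for (String line : sketch.walk()) {
            System.out.println(line);
        }
    }
}
```

Each loop only needs the level directly above it, which is the kind of
decoupling between terms dict and docs/positions readers that the
issue describes.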

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael McCandless

Woops sorry I missed that!

Yes this'll be our first test :)

Mike

On Tue, Oct 13, 2009 at 4:58 PM, Michael Busch  wrote:
> On 10/13/09 9:43 AM, Michael Busch wrote:
>>
>> Shall we first remove the remaining deprecations from the indexer package?
>> There are not many more left, shouldn't be much work.
>>
>
> I wasn't quick enough for you :) Working on LUCENE-1979 now - that will be
> the first test on how good svn merge is!
>
>  Michael
>
>>  Michael

[jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765237#action_12765237
 ] 

Uwe Schindler commented on LUCENE-1458:
---

By the way, a lot of these PriorityQueues can be generified like in trunk to 
remove the unneeded casts in lessThan, pop, insert,... everywhere.
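
As an illustration of what generifying buys, here is a minimal sketch
of a queue with a typed lessThan hook. GenericQueueSketch and IntQueue
are hypothetical classes written for this example; they only mimic the
lessThan/insert/pop shape mentioned above and are not Lucene's
PriorityQueue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified binary min-heap with a typed comparison hook. */
abstract class GenericQueueSketch<T> {
    private final List<T> heap = new ArrayList<>();

    /** Subclasses compare typed elements directly, so no casts from Object are needed. */
    protected abstract boolean lessThan(T a, T b);

    /** Adds an element and sifts it up to its place. */
    void insert(T element) {
        heap.add(element);
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (!lessThan(heap.get(i), heap.get(parent))) {
                break;
            }
            swap(i, parent);
            i = parent;
        }
    }

    /** Removes and returns the least element, or null if the queue is empty. */
    T pop() {
        if (heap.isEmpty()) {
            return null;
        }
        T top = heap.get(0);
        T last = heap.remove(heap.size() - 1);
        if (!heap.isEmpty()) {
            heap.set(0, last);
            int i = 0;
            while (true) {
                int left = 2 * i + 1;
                int right = left + 1;
                int smallest = i;
                if (left < heap.size() && lessThan(heap.get(left), heap.get(smallest))) {
                    smallest = left;
                }
                if (right < heap.size() && lessThan(heap.get(right), heap.get(smallest))) {
                    smallest = right;
                }
                if (smallest == i) {
                    break;
                }
                swap(i, smallest);
                i = smallest;
            }
        }
        return top;
    }

    private void swap(int a, int b) {
        T tmp = heap.get(a);
        heap.set(a, heap.get(b));
        heap.set(b, tmp);
    }
}

/** Example subclass: the comparison works on Integer without any cast. */
final class IntQueue extends GenericQueueSketch<Integer> {
    @Override
    protected boolean lessThan(Integer a, Integer b) {
        return a < b;
    }
}
```

With a raw (non-generic) queue, lessThan would receive Object
arguments and pop would return Object, so every caller and subclass
would need a cast; the type parameter removes them.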

> Further steps towards flexible indexing
> ---
>
> Key: LUCENE-1458
> URL: https://issues.apache.org/jira/browse/LUCENE-1458
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Index
>Affects Versions: 2.9
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Attachments: LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2
>
>
> I attached a very rough checkpoint of my current patch, to get early
> feedback.  All tests pass, though back compat tests don't pass due to
> changes to package-private APIs plus certain bugs in tests that
> happened to work (eg call TermPositions.nextPosition() too many times,
> which the new API asserts against).
> [Aside: I think, when we commit changes to package-private APIs such
> that back-compat tests don't pass, we could go back, make a branch on
> the back-compat tag, commit changes to the tests to use the new
> package private APIs on that branch, then fix nightly build to use the
> tip of that branch?]
> There's still plenty to do before this is committable! This is a
> rather large change:
>   * Switches to a new more efficient terms dict format.  This still
> uses tii/tis files, but the tii only stores term & long offset
> (not a TermInfo).  At seek points, tis encodes term & freq/prox
> offsets absolutely instead of with deltas.  Also, tis/tii
> are structured by field, so we don't have to record field number
> in every term.
> .
> On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
> -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
> .
> RAM usage when loading terms dict index is significantly less
> since we only load an array of offsets and an array of String (no
> more TermInfo array).  It should be faster to init too.
> .
> This part is basically done.
>   * Introduces modular reader codec that strongly decouples terms dict
> from docs/positions readers.  EG there is no more TermInfo used
> when reading the new format.
> .
> There's nice symmetry now between reading & writing in the codec
> chain -- the current docs/prox format is captured in:
> {code}
> FormatPostingsTermsDictWriter/Reader
> FormatPostingsDocsWriter/Reader (.frq file) and
> FormatPostingsPositionsWriter/Reader (.prx file).
> {code}
> This part is basically done.
>   * Introduces a new "flex" API for iterating through the fields,
> terms, docs and positions:
> {code}
> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
> {code}
> This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
> old API on top of the new API to keep back-compat.
> 
> Next steps:
>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
> fix any hidden assumptions.
>   * Expose new API out of IndexReader, deprecate old API but emulate
> old API on top of new one, switch all core/contrib users to the
> new API.
>   * Maybe switch to AttributeSources as the base class for TermsEnum,
> DocsEnum, PostingsEnum -- this would give readers API flexibility
> (not just index-file-format flexibility).  EG if someone wanted
> to store payload at the term-doc level instead of
> term-doc-position level, you could just add a new attribute.
>   * Test performance & iterate.
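
The first bullet of the quoted issue description says that offsets are
written absolutely at seek points instead of as deltas. The sketch
below illustrates that general idea only; SeekPointEncodingSketch, the
fixed interval, and the long[] layout are assumptions made for this
example and do not describe the actual tis/tii file format.

```java
/** Hypothetical illustration of delta encoding with absolute values at seek points. */
final class SeekPointEncodingSketch {
    /** Encodes offsets as deltas, but writes the absolute value at every interval-th entry. */
    static long[] encode(long[] offsets, int interval) {
        long[] out = new long[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            boolean seekPoint = i % interval == 0;
            out[i] = seekPoint ? offsets[i] : offsets[i] - offsets[i - 1];
        }
        return out;
    }

    /** Decodes entry target by starting at the nearest preceding seek point. */
    static long decodeAt(long[] encoded, int interval, int target) {
        int start = (target / interval) * interval; // index of the seek point
        long value = encoded[start];                // absolute, needs no earlier entries
        for (int i = start + 1; i <= target; i++) {
            value += encoded[i];                    // apply deltas up to the target
        }
        return value;
    }

    public static void main(String[] args) {
        long[] offsets = {100, 130, 170, 250, 260, 300};
        long[] encoded = encode(offsets, 3);
        System.out.println(java.util.Arrays.toString(encoded)); // [100, 30, 40, 250, 10, 40]
        System.out.println(decodeAt(encoded, 3, 4));            // 260
    }
}
```

Because the value at a seek point is absolute, a reader can jump
straight to it and decode forward without replaying the deltas that
came before.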


Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael Busch


No problem!  I'm excited about the new branch!
Have to try to write some codecs now...

 Michael

On 10/13/09 2:09 PM, Michael McCandless wrote:
> Woops sorry I missed that!
>
> Yes this'll be our first test :)
>
> Mike
>
> On Tue, Oct 13, 2009 at 4:58 PM, Michael Busch wrote:
>> On 10/13/09 9:43 AM, Michael Busch wrote:
>>> Shall we first remove the remaining deprecations from the indexer package?
>>> There are not many more left, shouldn't be much work.
>>
>> I wasn't quick enough for you :) Working on LUCENE-1979 now - that will be
>> the first test on how good svn merge is!
>>
>>  Michael
>>
>>>  Michael


Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Mark Miller

I've added missing enums classes, but everything else is looking good so
far.

Michael McCandless (JIRA) wrote:
> [ 
> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765234#action_12765234
>  ] 
>
> Michael McCandless commented on LUCENE-1458:
> 
>
> OK I think I've committed Mark's last patch onto this branch:
>
>   https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458
>
> and I also branched the 2.9 back-compat branch and committed the last back 
> compat patch:
>
>   
> https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458_2_9_back_compat_tests
>
> Mark can you check it out & see if I missed anything?


-- 
- Mark

http://www.lucidimagination.com




--

[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765246#action_12765246
 ] 

Michael McCandless commented on LUCENE-1969:


Patch looks good!

How do I run the tests?  When I cd to contrib/kamikaze and run "ant test" I get 
this output:

{code}
download-ivy:
 [echo] installing ivy...
  [get] Getting: 
http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.0.0-beta1/ivy-2.0.0-beta1.jar
  [get] To: /lucene/kami.1969/contrib/kamikaze/ivy/ivy.jar
  [get] Not modified - so not downloaded

install-ivy:

resolve:
No ivy:settings found for the default reference 'ivy.instance'.  A default 
instance will be used
no settings file found, using default...
[ivy:retrieve] :: Ivy 2.0.0-beta1 - 20071206070608 :: 
http://ant.apache.org/ivy/ ::
:: loading settings :: url = 
jar:file:/lucene/kami.1969/contrib/kamikaze/ivy/ivy.jar!/org/apache/ivy/core/settings/ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: com.kamikaze#kamikaze;work...@rhumba
[ivy:retrieve]  confs: [master, test]
[ivy:retrieve]  found log4j#log4j;1.2.15 in public
[ivy:retrieve]  found org.apache.lucene#lucene-core;2.9.0 in public
[ivy:retrieve]  found junit#junit;4.5 in public
[ivy:retrieve] :: resolution report :: resolve 277ms :: artifacts dl 9ms
-
|  |modules||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-
|  master  |   2   |   0   |   0   |   0   ||   2   |   0   |
|   test   |   1   |   0   |   0   |   0   ||   1   |   0   |
-
[ivy:retrieve] :: retrieving :: com.kamikaze#kamikaze
[ivy:retrieve]  confs: [master, test]
[ivy:retrieve]  0 artifacts copied, 3 already retrieved (0kB/10ms)

init:

compile:

compile-test:

test:

BUILD FAILED
/lucene/kami.1969/contrib/kamikaze/build.xml:88: Test 
com.kamikaze.test.TestDocIdSetSuite failed

Total time: 2 seconds
{code}

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

Re: [jira] Commented: (LUCENE-1458) Further steps towards flexible indexing

2009-10-13 Thread Michael McCandless

Excellent, thanks!

Mike

On Tue, Oct 13, 2009 at 5:32 PM, Mark Miller  wrote:
> I've added missing enums classes, but everything else is looking good so
> far.
>
> Michael McCandless (JIRA) wrote:
>>     [ 
>> https://issues.apache.org/jira/browse/LUCENE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765234#action_12765234
>>  ]
>>
>> Michael McCandless commented on LUCENE-1458:
>> 
>>
>> OK I think I've committed Mark's last patch onto this branch:
>>
>>   https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458
>>
>> and I also branched the 2.9 back-compat branch and committed the last back 
>> compat patch:
>>
>>   
>> https://svn.apache.org/repos/asf/lucene/java/branches/flex_1458_2_9_back_compat_tests
>>
>> Mark can you check it out & see if I missed anything?
>>
>>
>>> Further steps towards flexible indexing
>>> ---
>>>
>>>                 Key: LUCENE-1458
>>>                 URL: https://issues.apache.org/jira/browse/LUCENE-1458
>>>             Project: Lucene - Java
>>>          Issue Type: New Feature
>>>          Components: Index
>>>    Affects Versions: 2.9
>>>            Reporter: Michael McCandless
>>>            Assignee: Michael McCandless
>>>            Priority: Minor
>>>         Attachments: LUCENE-1458-back-compat.patch, 
>>> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
>>> LUCENE-1458-back-compat.patch, LUCENE-1458-back-compat.patch, 
>>> LUCENE-1458-back-compat.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
>>> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
>>> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
>>> LUCENE-1458.patch, LUCENE-1458.patch, LUCENE-1458.patch, 
>>> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
>>> LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, LUCENE-1458.tar.bz2, 
>>> LUCENE-1458.tar.bz2
>>>
>>>
>>> I attached a very rough checkpoint of my current patch, to get early
>>> feedback.  All tests pass, though back compat tests don't pass due to
>>> changes to package-private APIs plus certain bugs in tests that
>>> happened to work (eg call TermPositions.nextPosition() too many times,
>>> which the new API asserts against).
>>> [Aside: I think, when we commit changes to package-private APIs such
>>> that back-compat tests don't pass, we could go back, make a branch on
>>> the back-compat tag, commit changes to the tests to use the new
>>> package private APIs on that branch, then fix nightly build to use the
>>> tip of that branch?]
>>> There's still plenty to do before this is committable! This is a
>>> rather large change:
>>>   * Switches to a new more efficient terms dict format.  This still
>>>     uses tii/tis files, but the tii only stores term & long offset
>>>     (not a TermInfo).  At seek points, tis encodes term & freq/prox
>>>     offsets absolutely instead of with deltas.  Also, tis/tii
>>>     are structured by field, so we don't have to record field number
>>>     in every term.
>>> .
>>>     On first 1 M docs of Wikipedia, tii file is 36% smaller (0.99 MB
>>>     -> 0.64 MB) and tis file is 9% smaller (75.5 MB -> 68.5 MB).
>>> .
>>>     RAM usage when loading terms dict index is significantly less
>>>     since we only load an array of offsets and an array of String (no
>>>     more TermInfo array).  It should be faster to init too.
>>> .
>>>     This part is basically done.
>>>   * Introduces modular reader codec that strongly decouples terms dict
>>>     from docs/positions readers.  EG there is no more TermInfo used
>>>     when reading the new format.
>>> .
>>>     There's nice symmetry now between reading & writing in the codec
>>>     chain -- the current docs/prox format is captured in:
>>> {code}
>>> FormatPostingsTermsDictWriter/Reader
>>> FormatPostingsDocsWriter/Reader (.frq file) and
>>> FormatPostingsPositionsWriter/Reader (.prx file).
>>> {code}
>>>     This part is basically done.
>>>   * Introduces a new "flex" API for iterating through the fields,
>>>     terms, docs and positions:
>>> {code}
>>> FieldProducer -> TermsEnum -> DocsEnum -> PostingsEnum
>>> {code}
>>>     This replaces TermEnum/Docs/Positions.  SegmentReader emulates the
>>>     old API on top of the new API to keep back-compat.
>>>
>>> Next steps:
>>>   * Plug in new codecs (pulsing, pfor) to exercise the modularity /
>>>     fix any hidden assumptions.
>>>   * Expose new API out of IndexReader, deprecate old API but emulate
>>>     old API on top of new one, switch all core/contrib users to the
>>>     new API.
>>>   * Maybe switch to AttributeSources as the base class for TermsEnum,
>>>     DocsEnum, PostingsEnum -- this would give readers API flexibility
>>>     (not just index-file-format flexibility).  EG if someone wanted
>>>     to store payload at the term-doc level instead of
>>>     term-doc-position level, you could just add a ne

[jira] Resolved: (LUCENE-1981) Allow access to entries in the field cache

2009-10-13 Thread Yonik Seeley (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved LUCENE-1981.
--

Resolution: Invalid

> Allow access to entries in the field cache
> --
>
> Key: LUCENE-1981
> URL: https://issues.apache.org/jira/browse/LUCENE-1981
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: Search
>Affects Versions: 2.9
>Reporter: Tom Hill
>Priority: Minor
> Attachments: lucene-1981.patch
>
>
> If the data required is already in the field cache, it seems unnecessary to 
> go to the disk for it, if the data is already in RAM.
> We have a case where we need one field from a large number (500 -1000) of 
> scattered documents in a fairly large index (50-100m docs), and seek time to 
> collect the data from disk is prohibitive, so we'd like to grab the data from 
> the cache, instead.

[jira] Assigned: (LUCENE-1937) Add more methods to manipulate QueryNodeProcessorPipeline elements

2009-10-13 Thread Adriano Crestani (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adriano Crestani reassigned LUCENE-1937:


Assignee: (was: Adriano Crestani)

> Add more methods to manipulate QueryNodeProcessorPipeline elements
> --
>
> Key: LUCENE-1937
> URL: https://issues.apache.org/jira/browse/LUCENE-1937
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Adriano Crestani
>Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-1937.patch, LUCENE-1937_10_13_2009.patch
>
>
> QueryNodeProcessorPipeline allows the user to define a list of processors to 
> process a query tree. However, it's not very flexible when the user wants to 
> extend/modify an already created pipeline, because it only provides an add 
> method, which only allows the user to append a new processor to the pipeline.
> So, I propose to add new methods to manipulate the processor in a pipeline. I 
> think the methods should not consider an index position when modifying the 
> pipeline, since the index position in a pipeline does not mean anything; a 
> processor only has meaning when it's after or before another processor. 
> Therefore, I suggest the methods should always consider another processor 
> when inserting/modifying the pipeline. For example, insertAfter(processor, 
> newProcessor), which will insert the "newProcessor" after the "processor".

[jira] Updated: (LUCENE-1937) Add more methods to manipulate QueryNodeProcessorPipeline elements

2009-10-13 Thread Adriano Crestani (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adriano Crestani updated LUCENE-1937:
-

Attachment: LUCENE-1937_10_13_2009.patch

New patch, now QueryNodeProcessorPipeline implements List interface

> Add more methods to manipulate QueryNodeProcessorPipeline elements
> --
>
> Key: LUCENE-1937
> URL: https://issues.apache.org/jira/browse/LUCENE-1937
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Adriano Crestani
>Assignee: Adriano Crestani
>Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-1937.patch, LUCENE-1937_10_13_2009.patch
>
>
> QueryNodeProcessorPipeline allows the user to define a list of processors to 
> process a query tree. However, it's not very flexible when the user wants to 
> extend/modify an already created pipeline, because it only provides an add 
> method, which only allows the user to append a new processor to the pipeline.
> So, I propose to add new methods to manipulate the processor in a pipeline. I 
> think the methods should not consider an index position when modifying the 
> pipeline, since the index position in a pipeline does not mean anything; a 
> processor only has meaning when it's after or before another processor. 
> Therefore, I suggest the methods should always consider another processor 
> when inserting/modifying the pipeline. For example, insertAfter(processor, 
> newProcessor), which will insert the "newProcessor" after the "processor".
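A minimal sketch of the relative-insertion idea described above, for readers who want to see it in code. This is not the contrib/queryparser class and not the attached patch: the class name ProcessorPipelineSketch, the generic type parameter, and the insertBefore/asList helpers are assumptions made up for illustration. Only insertAfter(processor, newProcessor) is named in the issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a processor pipeline; NOT the real QueryNodeProcessorPipeline. */
class ProcessorPipelineSketch<P> {
  private final List<P> processors = new ArrayList<>();

  /** Appends to the end, which is all the existing add method allows. */
  void add(P processor) {
    processors.add(processor);
  }

  /**
   * Inserts newProcessor directly after the first occurrence of processor.
   * Returns false (and changes nothing) if processor is not in the pipeline.
   */
  boolean insertAfter(P processor, P newProcessor) {
    int i = processors.indexOf(processor);
    if (i < 0) {
      return false;
    }
    processors.add(i + 1, newProcessor);
    return true;
  }

  /** Hypothetical counterpart: inserts newProcessor directly before processor. */
  boolean insertBefore(P processor, P newProcessor) {
    int i = processors.indexOf(processor);
    if (i < 0) {
      return false;
    }
    processors.add(i, newProcessor);
    return true;
  }

  /** Returns a copy so callers cannot modify the pipeline by index. */
  List<P> asList() {
    return new ArrayList<>(processors);
  }
}
```

The caller never passes a numeric position; it only names a neighbouring processor, which is the property the issue asks for. Having the pipeline implement List, as the new patch does, exposes index-based methods as well, so the two approaches are not equivalent.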

[jira] Assigned: (LUCENE-1938) Precedence query parser using the contrib/queryparser framework

2009-10-13 Thread Adriano Crestani (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adriano Crestani reassigned LUCENE-1938:


Assignee: (was: Adriano Crestani)

> Precedence query parser using the contrib/queryparser framework
> ---
>
> Key: LUCENE-1938
> URL: https://issues.apache.org/jira/browse/LUCENE-1938
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: Adriano Crestani
>Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-1938.patch
>
>
> Extend the current StandardQueryParser on contrib so it supports boolean 
> precedence

Re: Draft for java-user mail about backwards-compatibility policy changes

2009-10-13 Thread Michael Busch

OK, I made the draft a bit "more neutral" by pointing out the downsides 
more clearly. However, I think we have to explain reasons for and against the 
change, otherwise people who didn't follow these discussions on java-dev 
will have no idea why we actually want to make a change at all. I added 
your sentences near the end. How's it now?


Also looking at the 2.9 CHANGES.txt we have a pretty long compat-break 
section, so the drop-in replacement guarantee isn't really one anymore, 
which I didn't even mention in this draft.


Draft:

Hello Lucene users:

In the past we have discussed our backwards-compatibility policy
frequently on the Lucene developer mailing list and we are thinking about
making some significant changes. In this mail I'd like to outline the
proposed changes to get some feedback from the user community.

Our current backwards-compatibility policy regarding API changes
states that we can only make changes that break
backwards-compatibility in major releases (3.0, 4.0, etc.); the next
major release is the upcoming 3.0.

Given how often we made major releases in the past in Lucene this
means that deprecated APIs need to stay in Lucene for a very long
time. E.g. if we deprecate an API in 3.1 we'll have to wait until 4.0
before we can remove it. This means that the code gets very cluttered
and adding new features gets somewhat more difficult, as attention has
to be paid to properly support the old *and* new APIs for a quite long
time.

The current policy also leads to delaying a last minor release before
a major release (e.g. 2.9), because the developers consider it as the
last chance for a long time to introduce new APIs and deprecate old ones.

The proposal now is to change this policy so that an API can
only be removed if it was deprecated in at least one release, which
can be a major *or* minor release. E.g. if we deprecate an API and
release it with 3.1, we can remove it with the 3.2 release.

The obvious downside of this proposal is that a simple jar drop-in
replacement will no longer be possible for most Lucene releases
(excluding bugfix releases, e.g. 2.9.0->2.9.1). However, you can be
sure that if you're using a non-deprecated API it will be in the next
release.

Note that of course these proposed changes do not affect
backwards-compatibility with old index formats. I.e. it will still be
possible to read all 3.X indexes with any Lucene 4.X version.

Our main goal is to find the right balance between
backwards-compatibility support for all the Lucene users out there and
fast and productive development of new features.

The developers haven't come to an agreement on this proposal yet.
Potentially giving up the drop-in replacement promise that Lucene
could make in the past is the main reason for the struggle the developers
are in and why we'd like to ask the user community for feedback to
help us make a decision. After we have gathered some feedback here we will
call a vote on the development mailing list where the committers have
to officially decide whether to make these changes or not.

So please tell us which you prefer as a back compatibility policy for
Lucene:
A) best effort drop-in back compatibility for minor version numbers
(e.g. v3.5 will be compatible with v3.2)
B) best effort drop-in back compatibility for the next minor version
number only, and deprecations may be removed after one minor release
(e.g. v3.3 will be compat with v3.2, but not v3.4)

Note that in any case the changes will take effect *after* the 3.0
release.

On behalf of the Lucene developers,
 Michael Busch



On 10/13/09 2:05 PM, Yonik Seeley wrote:

On Tue, Oct 13, 2009 at 4:25 PM, Michael Busch  wrote:
   

I've mentioned it several times on java-dev and LUCENE-1698 that I'd like to
ask the user
community and nobody objected.
 

It's the old polling problem - how you ask influences the outcome (as
I said below), and you didn't say exactly how you were going to ask
before.

   

The email reads like "we want to do this, OK?" - and the beneficiaries
of what is a volunteer effort are likely to respond overwhelmingly
"OK!".  One could take the reverse position and probably get just as
many positive responses.

   Devs should decide, and if feedback is needed to help that, a neutral
way of asking should be used.

   

Do you want to draft a new mail?
 

Only if I was sure I wanted feedback :-)

Which do you prefer as a back compatibility policy for Lucene:
A) best effort drop-in back compatibility for minor version numbers
(e.g. v3.5 will be compatible with v3.2)
B) best effort drop-in back compatibility for the next minor version
number only, and deprecations may be removed after one minor release
(e.g. v3.3 will be compat with v3.2, but not v3.4)

In either case forward index format compatibility would be maintained
for an entire major version and the previous (e.g. v3.5 would be able
to read an index written by v2.2)

http://www.lucidimagination.com
-Yonik


[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1974:
-

Attachment: LUCENE-1974.test.patch

this is the same as the previously attached test but i've simplified it (to me) 
and revamped it to be a patch that can be applied to 2.9.0.

I can confirm that it fails for me (against 2.9.0) and seems to suggest a weird 
hit collection bug somewhere in the BooleanScorer or Prefix scoring code 

(a prefix query works, a boolean query containing term queries works, but a 
boolean query containing a prefix query fails to find all the expected matches)

Unless i'm missing something really silly, this suggests a pretty heinous bug 
somewhere in the core scoring code.

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1974:
-

Attachment: LUCENE-1974.test.patch

tweaked test so that it can be applied to 2.4.1 (by removing readOnly param 
from IndexSearcher constructor)

verified this test passes against 2.4.1 ... it's a new bug in 2.9.0

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

(possible) heinous scoring bug in 2.9.0: LUCENE-1974 ? ? ?

2009-10-13 Thread Chris Hostetter



Can someone smarter than me review the patch in LUCENE-1974...

https://issues.apache.org/jira/browse/LUCENE-1974

...on the surface this seems to suggest a pretty serious error somewhere 
in the low level scoring code when a BooleanQuery is involved.



(If this really is a bug, and not just me overlooking some flaw in the 
test, then it probably warrants an urgent 2.9.1 release)




-Hoss


[jira] Assigned: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-1974:
--

Assignee: Michael McCandless

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

Re: (possible) heinous scoring bug in 2.9.0: LUCENE-1974 ? ? ?

2009-10-13 Thread Michael McCandless

I'm looking at it...

Mike

On Tue, Oct 13, 2009 at 7:06 PM, Chris Hostetter
 wrote:
>
> Can someone smarter than me review the patch in LUCENE-1974...
>
> https://issues.apache.org/jira/browse/LUCENE-1974
>
> ...on the surface this seems to suggest a pretty serious error somewhere in
> the low level scoring code when a BooleanQuery is involved.
>
>
> (If this really is a bug, and not just me overlooking some flaw in the test,
> then it probably warrants an urgent 2.9.1 release)
>
>
>
> -Hoss

[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765299#action_12765299
 ] 

Michael McCandless commented on LUCENE-1974:


Hmm... seems to be a bug in BooleanScorer... if you call static 
BooleanQuery.setAllowDocsOutOfOrder(false) the test passes (so that's a viable 
workaround it seems).

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765303#action_12765303
 ] 

Robert Muir commented on LUCENE-1974:
-

Hoss man, I played with this a little; maybe this is all obvious though:
* test passes if you set BooleanQuery.setAllowDocsOutOfOrder(false) [so it's 
BooleanScorer, not BooleanScorer2]
* to simplify things, you can use a ConstantScoreQuery of a single term instead 
of PrefixQuery to trigger it

I agree with the comment in the original test: if you trace the execution, the 
problem is that it doesn't actually refill the queue with the second doc (which is 
docid 11,000 or something). This is because .score() is being called on the 
subscorer with an end limit of 8192 or so.

{code}
// refill the queue
  more = false;
...
if (subScorerDocID != NO_MORE_DOCS) {
  more |= sub.scorer.score(sub.collector, end, subScorerDocID);
 ...   
} while (current != null || more);
{code}



> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread John Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765306#action_12765306
 ] 

John Wang commented on LUCENE-1969:
---

My bad! The build.xml was not updated with the package name changes. I will 
post the fixed build.xml.

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib

[jira] Updated: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread John Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Wang updated LUCENE-1969:
--

Attachment: build.xml

updated build.xml with package name changes.

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: build.xml, kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib

[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael Busch (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-1979:
--

Attachment: lucene-1979.patch

Removes almost all deprecations from the indexer package. The only things left 
are:

* IndexReader#getFieldCacheKey() - what do we do with that one?
* calls to IndexInput#readChars() and #skipChars() - I think we have to keep 
those until 4.0?

All core & contrib tests pass. It'd be good if someone could review this patch 
though.

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979.patch
>
>


[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765310#action_12765310
 ] 

Michael McCandless commented on LUCENE-1974:


Ugh, this is the bug:

{code}
Index: src/java/org/apache/lucene/search/Scorer.java
===
--- src/java/org/apache/lucene/search/Scorer.java   (revision 824846)
+++ src/java/org/apache/lucene/search/Scorer.java   (working copy)
@@ -87,7 +87,7 @@
   collector.collect(doc);
   doc = nextDoc();
 }
-return doc == NO_MORE_DOCS;
+return doc != NO_MORE_DOCS;
   }
   
   /** Returns the score of the current document matching the query.

{code}

I'll commit shortly, to trunk & 2.9 branch.
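
For readers following along, here is a toy model of why the inverted return value loses hits. It is a sketch under stated assumptions, not Lucene's Scorer or BooleanScorer: the class names, the window size and the doc ids are all made up. What it takes from the thread is the contract that a bounded score call reports whether more docs remain, and a caller that scores window by window and stops once a window is empty and no more docs are reported.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a bounded scoring loop; NOT Lucene's actual Scorer or BooleanScorer. */
class WindowedScoringSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Minimal doc-id iterator over a sorted array of doc ids. */
  static final class ToyScorer {
    private final int[] docs;
    private final boolean buggy;
    private int pos = 0;

    ToyScorer(int[] sortedDocs, boolean buggy) {
      this.docs = sortedDocs;
      this.buggy = buggy;
    }

    int docID() {
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
      pos++;
      return docID();
    }

    /** Collects every doc below max, then reports whether more docs remain. */
    boolean score(List<Integer> collector, int max) {
      int doc = docID();
      while (doc < max) {
        collector.add(doc);
        doc = nextDoc();
      }
      // The buggy variant answers "are we exhausted?" instead of "is there more?".
      return buggy ? doc == NO_MORE_DOCS : doc != NO_MORE_DOCS;
    }
  }

  /** Scores in fixed-size windows; stops when a window adds nothing and no more docs are reported. */
  static List<Integer> collectAll(int[] sortedDocs, int windowSize, boolean buggy) {
    ToyScorer scorer = new ToyScorer(sortedDocs, buggy);
    List<Integer> hits = new ArrayList<>();
    int end = 0;
    int before;
    boolean more;
    do {
      before = hits.size();
      end += windowSize;
      more = scorer.score(hits, end);
    } while (hits.size() > before || more);
    return hits;
  }
}
```

With doc ids {5, 5000} and a window of 1024, the fixed variant collects both docs, while the buggy variant stops at the first empty window and returns only doc 5. That matches the symptom in this issue, where a later doc is silently dropped.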

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> What I wonder is why the latter query cannot find doc=11377.
> The problem can be reproduced by the code in the attachment.

[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765315#action_12765315
 ] 

Michael Busch commented on LUCENE-1974:
---

It's also concerning that no unit test catches this...

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> These are two queries on the same index: one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD.
> What I wonder is why the latter query cannot find doc=11377?
> The problem can be reproduced with the code in the attachment.


[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765316#action_12765316
 ] 

Michael McCandless commented on LUCENE-1979:


bq. IndexReader#getFieldCacheKey() - what do we do with that one?

I think we should undeprecate it.  I had deprecated it thinking LUCENE-831 
would land.

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979.patch
>
>



[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765318#action_12765318
 ] 

Michael McCandless commented on LUCENE-1969:


Excellent, I can now run ant test; thanks.

Except, it runs a bunch of tests that seem to be succeeding, yet I get a BUILD 
FAILED at the end.  I'll attach the full output.

Is there some way to shorten these tests without losing [much] coverage?  It's 
great how thorough they are, but it took my machine 6 min 22 sec  to run 
which'd be a big addition to the build time.

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: build.xml, kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib


[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765319#action_12765319
 ] 

Michael McCandless commented on LUCENE-1969:


Also, John, have you started the software grant?  I think you need to fill this 
in:

http://www.apache.org/licenses/software-grant.txt

and then list the files contained (in the current patch) and get that to Grant 
(I think?), and then post the md5 of that patch here.

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: build.xml, kamikaze-contrib.patch
>
>
> Adding kamikaze to lucene contrib


[jira] Updated: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1969:
---

Attachment: kamikaze.test.out

Output when I ran "ant test".

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
> Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out
>
>
> Adding kamikaze to lucene contrib


[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael Busch (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765320#action_12765320
 ] 

Michael Busch commented on LUCENE-1979:
---

OK, will do!

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979.patch
>
>



[jira] Assigned: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-1969:
--

Assignee: Michael McCandless

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
>Assignee: Michael McCandless
> Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out
>
>
> Adding kamikaze to lucene contrib


[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael Busch (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-1979:
--

Attachment: lucene-1979-bw.patch

Patch for the back-compat trunk.

Hmm, everything passes, except this one:

{noformat}
[junit] java.lang.NoSuchMethodError: 
org.apache.lucene.index.SnapshotDeletionPolicy.snapshot()Lorg/apache/lucene/index/IndexCommitPoint;
[junit] at 
org.apache.lucene.TestSnapshotDeletionPolicy.testReuseAcrossWriters(TestSnapshotDeletionPolicy.java:82)
[junit] at 
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:206)
[junit] Test org.apache.lucene.TestSnapshotDeletionPolicy FAILED
{noformat}

Here drop-in replacement doesn't seem to work. The method snapshot() of 
SnapshotDeletionPolicy was changed to return IndexCommit instead of 
IndexCommitPoint. IndexCommit used to implement the deprecated 
IndexCommitPoint, which this patch removes.

So the tests are compiled against snapshot() returning IndexCommitPoint in the 
bw-branch, and when run against the method returning IndexCommit of trunk it 
fails with the exception above.
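
A minimal sketch of why the return-type change breaks already-compiled tests: the JVM links a call by method name plus descriptor, and the descriptor includes the return type. The two nested types below are empty stand-ins (assumptions for illustration), not the real org.apache.lucene.index classes.

```java
import java.lang.invoke.MethodType;

/** Illustration only: stand-in types, not the real Lucene classes. */
class DescriptorSketch {
    interface IndexCommitPoint {}          // stand-in for the removed interface
    static abstract class IndexCommit {}   // stand-in for the new return type

    /** JVM method descriptor of a no-arg method returning the given type. */
    static String descriptor(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Test classes compiled in the bw-branch reference snapshot() with this descriptor:
        System.out.println("snapshot" + descriptor(IndexCommitPoint.class));
        // Trunk only provides snapshot() with this one, so linking fails at run time:
        System.out.println("snapshot" + descriptor(IndexCommit.class));
    }
}
```

Because the two descriptors differ, the old call site cannot resolve against the new method without recompiling, which matches the NoSuchMethodError in the trace above.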

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979-bw.patch, lucene-1979.patch
>
>



[jira] Commented: (LUCENE-1969) adding kamikaze to lucene contrib

2009-10-13 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765324#action_12765324
 ] 

Yonik Seeley commented on LUCENE-1969:
--

As a package name, perhaps something like "docset" is more appropriate and 
descriptive?

> adding kamikaze to lucene contrib
> -
>
> Key: LUCENE-1969
> URL: https://issues.apache.org/jira/browse/LUCENE-1969
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 2.9
>Reporter: John Wang
>Assignee: Michael McCandless
> Attachments: build.xml, kamikaze-contrib.patch, kamikaze.test.out
>
>
> Adding kamikaze to lucene contrib


[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765327#action_12765327
 ] 

Michael McCandless commented on LUCENE-1974:


bq. It's also concerning that no unit test catches this...

I agree.  I'll commit tangfulin & Hoss's test case.

I think the other tests do not catch it because the error only happens if the 
docID is over 8192 (the chunk size that BooleanScorer uses).  Most of our tests 
work on smaller sets of docs.
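
To illustrate the boundary, here is a toy model of window-by-window scoring. It is a simplified sketch, not Lucene's actual BooleanScorer or Scorer code; the class, the `buggy` flag, and the `maxDoc` parameter are hypothetical names introduced for the example. Only the 8192 window size, the `NO_MORE_DOCS` sentinel, and the `==` versus `!=` return value come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of windowed scoring; not Lucene's real classes. */
class ChunkedScoringSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int CHUNK = 8192; // window size mentioned above

    /** Minimal iterator over sorted doc IDs. */
    static final class SubScorer {
        private final int[] docs;
        private int pos = 0;
        SubScorer(int... docs) { this.docs = docs; }
        int doc() { return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }
        int nextDoc() { pos++; return doc(); }

        /** Collects docs below max; the result should mean "more docs remain". */
        boolean score(List<Integer> out, int max, boolean buggy) {
            int doc = doc();
            while (doc < max) {
                out.add(doc);
                doc = nextDoc();
            }
            // buggy == true reproduces the inverted comparison from the patch
            return buggy ? doc == NO_MORE_DOCS : doc != NO_MORE_DOCS;
        }
    }

    /** Scores one window at a time until the sub-scorer reports it is done. */
    static List<Integer> collectAll(boolean buggy, int maxDoc, int... docs) {
        SubScorer sub = new SubScorer(docs);
        List<Integer> hits = new ArrayList<>();
        int end = 0;
        boolean more;
        do {
            end += CHUNK;
            more = sub.score(hits, end, buggy);
        } while (more && end < maxDoc);
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(collectAll(false, 20000, 5137, 11377)); // [5137, 11377]
        System.out.println(collectAll(true, 20000, 5137, 11377));  // [5137]
        System.out.println(collectAll(true, 20000, 5137));         // [5137]
    }
}
```

In this model the inverted result only loses hits when a match lies beyond the first window, so an index whose docIDs all stay below 8192 never shows the problem.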

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> These are two queries on the same index: one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD.
> What I wonder is why the latter query cannot find doc=11377?
> The problem can be reproduced with the code in the attachment.


[jira] Resolved: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1974.


   Resolution: Fixed
Fix Version/s: 3.0
   2.9.1

Thanks tangfulin and Hoss!  I think we need to spin 2.9.1 for this.

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> These are two queries on the same index: one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD.
> What I wonder is why the latter query cannot find doc=11377?
> The problem can be reproduced with the code in the attachment.


[jira] Updated: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael Busch (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Busch updated LUCENE-1979:
--

Attachment: lucene-1979.patch

Same patch as before, but with IndexReader#getFieldCacheKey() undeprecated.

Is it correct that we keep IndexInput#readChars() and IndexInput#skipChars() 
for index format compatibility?

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979-bw.patch, lucene-1979.patch, 
> lucene-1979.patch
>
>



[jira] Commented: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765332#action_12765332
 ] 

Yonik Seeley commented on LUCENE-1974:
--

bq. It's also concerning that no unit test catches this... 

I've said it before, I'll say it again... anything of sufficient complexity 
really benefits from random tests to hit boundary cases that one would not have 
thought to code for.  We have quite a few in Solr, but not enough.  We 
obviously don't have enough in Lucene either.

One other simple tactic I've used in Solr to increase the chance of hitting 
boundary conditions is to make sure many segments are created by default (bad 
for performance, good for testing), and that cache sizes, window sizes, etc are 
small so that they are crossed more often by more tests.
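
A rough sketch of that tactic, assuming a toy windowed collector rather than real Lucene or Solr code (all names here are hypothetical): pick random doc sets and deliberately tiny window sizes, then compare the windowed result against the plain sorted list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

/** Randomized check with small windows; a toy model, not Lucene test code. */
class RandomWindowTestSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Collects sorted docs window by window; 'buggy' inverts the "more docs remain" result. */
    static List<Integer> collect(int[] docs, int window, int maxDoc, boolean buggy) {
        List<Integer> hits = new ArrayList<>();
        int pos = 0;
        boolean more = true;
        for (int end = window; more; end += window) {
            while (pos < docs.length && docs[pos] < end) {
                hits.add(docs[pos++]);
            }
            int doc = pos < docs.length ? docs[pos] : NO_MORE_DOCS;
            more = buggy ? doc == NO_MORE_DOCS : doc != NO_MORE_DOCS;
            if (end >= maxDoc) {
                break; // every possible docID has been covered
            }
        }
        return hits;
    }

    /** Returns how many random trials produced hits that differ from the expected list. */
    static int countFailures(long seed, int trials, boolean buggy) {
        Random random = new Random(seed);
        int failures = 0;
        for (int t = 0; t < trials; t++) {
            int maxDoc = 1 + random.nextInt(200);
            int window = 1 + random.nextInt(16); // small on purpose, so boundaries are crossed often
            TreeSet<Integer> set = new TreeSet<>();
            int n = random.nextInt(20);
            for (int i = 0; i < n; i++) {
                set.add(random.nextInt(maxDoc));
            }
            int[] docs = set.stream().mapToInt(Integer::intValue).toArray();
            List<Integer> expected = new ArrayList<>(set);
            if (!collect(docs, window, maxDoc, buggy).equals(expected)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("correct variant failures: " + countFailures(42L, 1000, false));
        System.out.println("inverted variant failures: " + countFailures(42L, 1000, true));
    }
}
```

With windows this small, nearly every trial has a match past the first window, so the inverted return value is caught right away instead of needing an index with more than 8192 docs.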



> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
>Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> These are two queries on the same index: one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD.
> What I wonder is why the latter query cannot find doc=11377?
> The problem can be reproduced with the code in the attachment.


[jira] Commented: (LUCENE-1979) Remove remaining deprecations from indexer package

2009-10-13 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765336#action_12765336
 ] 

Michael McCandless commented on LUCENE-1979:


bq. Is it correct that we keep IndexInput#readChars() and 
IndexInput#skipChars() for index format compatibility?

Yes, we need to keep them.

It was in 2.4 (LUCENE-510) that we switched to writing strings as UTF-8, so any 
index created by e.g. 2.3 (which we must be able to read through at least 3.9) 
will need these methods.

> Remove remaining deprecations from indexer package
> --
>
> Key: LUCENE-1979
> URL: https://issues.apache.org/jira/browse/LUCENE-1979
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Index
>Reporter: Michael Busch
>Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1979-bw.patch, lucene-1979.patch, 
> lucene-1979.patch
>
>


