Dave, 

On Tuesday 05 July 2005 20:54, Paul Elschot wrote:
> On Tuesday 05 July 2005 14:35, Dave Kor wrote:
...
> > 
> > Hopefully, this explains what I am trying to achieve with Lucene and why I
> > need to match repeated sub-queries. I would really appreciate it if anyone
> > has a solution, a quickfix or can guide me in hacking up something workable.
> 
> So, in an ordered SpanNearQuery, you want repeated subqueries not to match the
> same text/tokens, which boils down to non overlapping matches.
> 
> I had another look at NearSpans.java, and I'm afraid there is no quick fix
> for this (but I'd like to be proven wrong).
> Spans can match ordered/unordered and overlapping/nonoverlapping.
> Currently there is no parameter for the overlap, and I don't know how
> SpanNearQuery behaves with respect to overlapping matches.
> There is no special case for equal subqueries, which is probably ok, but
> when overlaps are allowed care should be taken not to use equal subqueries.
> 
> On hacking up something workable: it would be good to get this
> bug out of NearSpans.

This might be a fix: it reduces the number of cases that are considered ordered
matches, and it passes all unit tests here:

  private boolean matchIsOrdered() {
    SpansCell spansCell = (SpansCell) ordered.get(0); 
    int lastStart = spansCell.start(); // no need to compare doc nrs here.
    int lastEnd = spansCell.end();
    for (int i = 1; i < ordered.size(); i++) {
      spansCell = (SpansCell) ordered.get(i);
      int start = spansCell.start();
      int end = spansCell.end();
      if ((start < lastStart) || ((start == lastStart) && (end <= lastEnd))) {
        return false; // also equal begin and end is not ordered.
      }
      lastStart = start;
      lastEnd = end;
    }
    return true;
  }

Could you replace the matchIsOrdered() method with the one above
and see whether you can still reproduce the "Unexpected: ordered"
exception?

There is some interplay between the matchIsOrdered() method and
the lessThan() method in CellQueue, which also uses the SpansCell index,
and I hope this gets it right.
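
To make it easier to see which cases the stricter check accepts and rejects,
here is a small standalone sketch of the same comparison. It does not use the
Lucene classes: the class name OrderedSpansSketch and the representation of
each span as an int pair {start, end} are assumptions made only for this
illustration.

```java
public class OrderedSpansSketch {

    // Hypothetical stand-in for the method above: each row of 'spans' is
    // {start, end}, in the order of the subqueries.
    static boolean matchIsOrdered(int[][] spans) {
        int lastStart = spans[0][0];
        int lastEnd = spans[0][1];
        for (int i = 1; i < spans.length; i++) {
            int start = spans[i][0];
            int end = spans[i][1];
            // Not ordered when the next span starts earlier, or starts at the
            // same position without ending later (this includes identical spans).
            if ((start < lastStart) || ((start == lastStart) && (end <= lastEnd))) {
                return false;
            }
            lastStart = start;
            lastEnd = end;
        }
        return true;
    }

    public static void main(String[] args) {
        // Two disjoint spans in increasing position.
        System.out.println(matchIsOrdered(new int[][] {{0, 1}, {2, 3}}));
        // The same span twice, as with a repeated subquery matching the same tokens.
        System.out.println(matchIsOrdered(new int[][] {{2, 3}, {2, 3}}));
        // Same start, but the second span ends later.
        System.out.println(matchIsOrdered(new int[][] {{2, 3}, {2, 4}}));
        // The second span starts before the first.
        System.out.println(matchIsOrdered(new int[][] {{2, 3}, {1, 4}}));
    }
}
```

Note that this comparison only rejects identical spans and spans that are out
of order. Two spans with different starts that still overlap, for example
{0, 3} followed by {1, 2}, are still treated as ordered by this check.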

Regards,
Paul Elschot

