apologies for replying to myself, but another nice side-effect of this fix is that it virtually eliminates the potential for an OutOfMemoryError, which was a problem i encountered on extremely large fields (over 10000 terms) while i was profiling the RangeQuery class.
i can get into specifics if need be, any thoughts?
alex
On Fri, 2002-11-08 at 15:54, Alex Winston wrote:
> thanks for the reply, my apologies for not explaining myself very
> clearly, it has been a long day.
>
> you expressed exactly our situation, unfortunately this is not an option
> because we want to have multiple ranges for each document as well,
> there is a possible extension of what you suggested but that is a last
> resort. kinda crazy i know, but you have to meet requirements :).
>
> but i also had a thought while i was looking through the lucene code,
> and any comments are welcome.
>
> i may be very mistaken because it has been a long day but if you look at
> the current cvs version of RangeQuery it appears that even if a match is
> found it will continue to iterate over terms within a field, and in my
> case it is on the order of thousands. if i add a break after a match
> has been found it appears as though the search is improved on avg an
> order of magnitude, my math has left me so i cannot be theoretical at
> the moment. i have unit tested the change on my side and on the lucene
> side and it works. note: one hard example is that a query went from 20
> seconds to .5 seconds. any initial thoughts as to whether there is a
> case where this would not work?
>
> beginning line 164:
> TermQuery tq = new TermQuery(term); // found a match
> tq.setBoost(boost); // set the boost
> q.add(tq, false, false); // add to q
> break; // ADDED!
>
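For readers following along, here is a rough, self-contained sketch of the effect of the added break. It is not the Lucene source: the sorted list of strings stands in for the field's term enumeration, the class and method names (RangeBreakSketch, collectClauses) are made up for illustration, and the inclusive bounds and the early exit past the upper bound are assumptions. It only shows how many clauses end up in the query with and without the break, which is where both the speedup and the lower memory use would come from.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeBreakSketch {

    // Collects the terms that would each become a TermQuery clause.
    // sortedTerms is a hypothetical stand-in for the field's terms in sorted order;
    // stopAtFirstMatch mimics the break added after the first match.
    static List<String> collectClauses(List<String> sortedTerms,
                                       String lower, String upper,
                                       boolean stopAtFirstMatch) {
        List<String> clauses = new ArrayList<>();
        for (String term : sortedTerms) {
            if (term.compareTo(upper) > 0) {
                break; // past the upper bound (assumed inclusive), nothing further can match
            }
            if (term.compareTo(lower) >= 0) {
                clauses.add(term); // found a match
                if (stopAtFirstMatch) {
                    break; // the added break: stop after one clause
                }
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Build 10000 zero-padded terms so string order matches numeric order.
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            terms.add(String.format("%05d", i));
        }
        int without = collectClauses(terms, "00100", "09000", false).size();
        int with = collectClauses(terms, "00100", "09000", true).size();
        System.out.println("clauses without break: " + without);
        System.out.println("clauses with break:    " + with);
    }
}
```

One thing the sketch makes visible: with the break, only the first matching term contributes a clause, so only documents containing that one term can match. Whether that is safe depends on how the field was indexed.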
>
> On Fri, 2002-11-08 at 15:09, Mike Barry wrote:
> > Alex,
> >
> > It is rather confusing. It sounds like you've indexed
> > a field that can be between two values (let's say
> > E-J) and then when you have a search term such as G
> > you want the docs containing E-J (or A-H or F-K but not
> > A-C nor J-Z)
> >
> > Just off the top of my head but could you index the upper and
> > lower bounds as separate fields then when you search do a
> > compound query:
> >
> > lower_bound:{ - search_term } AND upper_bound:{ search_term - }
> >
> > just a thought.
> > -MikeB.
> >
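The two-field idea above can be sketched without Lucene as a plain comparison: a document matches when its lower bound is at or before the search term and its upper bound is at or after it. Everything here (BoundsSketch, Doc, search) is a hypothetical illustration, and treating the bounds as inclusive is an assumption; the query syntax in the message may have intended exclusive ranges. Like the suggestion itself, it handles only one range per document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoundsSketch {

    // Hypothetical stand-in for a document indexed with two separate fields.
    static final class Doc {
        final String name;
        final String lowerBound;
        final String upperBound;

        Doc(String name, String lowerBound, String upperBound) {
            this.name = name;
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }
    }

    // Returns the names of docs whose [lowerBound, upperBound] contains searchTerm.
    // Both comparisons are inclusive, which is an assumption of this sketch.
    static List<String> search(List<Doc> docs, String searchTerm) {
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            boolean lowerOk = d.lowerBound.compareTo(searchTerm) <= 0;
            boolean upperOk = d.upperBound.compareTo(searchTerm) >= 0;
            if (lowerOk && upperOk) {
                hits.add(d.name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("E-J", "E", "J"),
            new Doc("A-H", "A", "H"),
            new Doc("F-K", "F", "K"),
            new Doc("A-C", "A", "C"),
            new Doc("J-Z", "J", "Z"));
        System.out.println(search(docs, "G"));
    }
}
```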
> >
> > Alex Winston wrote:
> >
> > > i was hoping that someone could briefly review my current solution to a
> > > problem that we have encountered to see if anyone could suggest a
> > > possible alternative, because as it stands we have pushed lucene past
> > > its current limits.
> > >
> > > PROBLEM:
> > >
> > > we wanted to represent a range of values for a particular field
> > > that is searchable over a particular range.
> > >
> > > an example follows for clarification:
> > > we wanted to store a range of chapters and verses of a book for a
> > > particular document, and in turn search to see if a query range includes
> > > the range that is represented in the index.
> > >
> > > if this is unclear please ask for clarification
> > >
> > > IMPRACTICAL SOLUTION:
> > >
> > > although this solution seems somewhat impractical it is all we could
> > > come up with.
> > >
> > > our solution involved storing each possible range value within the term
> > > which would allow for RangeQuerys to be performed on this particular
> > > field. for very small ranges this seems somewhat practical after
> > > profiling. although once the field ranges began to span multiple
> > > chapters and verses, the search times became unreasonable because we
> > > were storing thousands of entries for each representative range.
> > >
> > > i can elaborate on anything that is unclear,
> > > but any thoughts on a possible alternative solution within lucene that
> > > we overlooked would be extremely helpful.
> > >
> > >
> > > alex
