The problem arises because there are multiple ranges defined in the document 
and it is not easy to test the start/end value pairs when they are held as 
independent values in separate fields. AFAIK there is currently no query 
implementation for testing position relationships in words from more than one 
field.
There is one sneaky way however, to record and query range information in the 
one field - you can (ab)use the position information stored by Lucene to encode 
data such as time and then query for ranges using SpanQueries.

By writing a custom analyzer you can artificially "place" words in a required 
location in a field by setting the position increment value appropriately.
If you chose to think of the extent of the field as representing time e.g. the 
hours or minutes in a day/week you can post pairs of "start" and "end" words at 
an appropriate location to effectively record the range of time or date 
information.

This solves the problem of recording pairs of ranges that can be queried.

Now for your query - your example query is actually a collection of ranges 
rather than one single range (just Thursdays) so this would need to be 
expressed as multiple SpanQueries testing the areas of the document which 
represent the time period of interest to see if they contain a start and end 
pair. 

Worth a try?

Mark

----- Original Message ----
From: Vijay Santhanam <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Sunday, 18 February, 2007 11:43:39 PM
Subject: Multiple time ranges in a document

Hello,

 

I'm using a RangeFilter to find "Event" documents (with Start and End lucene
friendly formatted date fields) that match a Users time range query. This
works perfectly in sub-second times at decent loads, but I'm having trouble
searching multiple performances in the one document. Indexing them is no
problem, because I can add extra terms to the start and end fields.

 

Here's a situation that doesn't work to well with the RangeFilter:-

 

Let's say a comedian has a regular gig every Monday for the next 3 weeks,
from 7pm-9pm. So, the start field will be 200702191900, 200702261900,
200703051900. And, the end field will be 200702192100, 200702262100,
200703052100.

If someone searches for an event on Thursday anytime during his 3 week
stint, the comedian's event will show up, because the Range Filter will
consider the lowest term of the start field and the highest term of the end
field.

 

Also, sorting by start or end fields will break, but I could write my own
SortComparatorSource to fix that.

 

How could I get around the filter problem? I could write my own filter, but
it would need to keep track of both fields, and their respective term
positions for each field.

 

Thanks for your help,

-Vijay

 






        
        
                
___________________________________________________________ 
New Yahoo! Mail is the ultimate force in competitive emailing. Find out more at 
the Yahoo! Mail Championships. Plus: play games and win prizes. 
http://uk.rd.yahoo.com/evt=44106/*http://mail.yahoo.net/uk

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to