On Jun 30, 2005, at 11:27 PM, Chris Lu wrote:
Mark, your suggestion will incur another trip to the database. And
if the search results is large, filtering in DB by pk is not really
good.
Chris - I disagree with that last comment. It can be a great
solution when the filter is cached. Certainly building a filter for
every search would be inefficient, but filters are really best when
cached.
Erik, your original "date" field is good when there is not many
dates(<1024) in the database. Otherwise, Range Query can not handle
it.
Not quite correct... it would not matter how many dates were in the
index/database, only how many were within the range used by
RangeQuery. The original requirement was a years worth of days,
which at most would be 366 days and I suspect someone looking for
hotel room availability would be narrowing things down to a week or
month.
My suggestion is, use "year" + "month" + "day" three fields to
store date. And when searching, for example, any date that's
greater than 2005-06-30, you can use this query to search: ( year >
2005 ) or ( year=2005 and month>=6) or ( year=2005 and month=6 and
day > 30 ).
It's a combination of BooleanQuery, TermQuery, and RangeQuery.
How would you represent multiple dates for a Document using that
scheme? Wasn't that one of the original requirements?
This may seem cumbersome, but it can save one trip to database, and
circumvent Lucene's limitation.
One trip to the DB *once* with the results cached is mighty
inexpensive in the grand scheme of things. Mark's point is something
I agree with (and wrote about in the custom filter example in Lucene
in Action) - some information makes good sense to stay in a
relational database when its too volatile to put in a Lucene index.
Building a filter to access a DB, with the results cached is a good
solution to the specified problem, I think. Certainly there are many
ways to solve it though.
Erik
Chris Lu
http://www.dbsight.net
Erik Hatcher wrote:
I second Mark's suggestion over the alternative I posted. My
alternative was merely to invert the field structure originally
described, but using a Filter for the volatile information is wiser.
Erik
On Jun 29, 2005, at 9:58 AM, mark harwood wrote:
Presumably there is also a free-text element to the
search or you wouldn't be using Lucene.
Multiple fields is not the way to go.
A single Lucene field could contain multiple terms (
the available dates) but I still don't think that's
the best solution.
The availability info is likely to be pretty volatile
and you always want up-to-date info so I would prefer
to hit a database for this. If you keep a DB primary
key to Lucene doc id look-up cached in memory you can
quickly construct a Lucene filter from the database
results and therefore only show Lucene results for
available rooms.
Cheers
Mark
___________________________________________________________
How much free photo storage do you get? Store your holiday
snaps for FREE with Yahoo! Photos http://uk.photos.yahoo.com
--------------------------------------------------------------------
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]