Re: Is RangeQuery more efficient than DateFilter?
I've added some of the information from this thread to the wiki:

http://wiki.apache.org/jakarta-lucene/DateRangeQueries

If you wish to add more information, go right ahead, but since I added this info, I believe it's ultimately my responsibility to maintain it.

sv

On Mon, 29 Mar 2004, Kevin A. Burton wrote:

> Erik Hatcher wrote:
>
> > One more point... caching is done by the IndexReader used for the
> > search, so you will need to keep that instance (i.e. the
> > IndexSearcher) around to benefit from the caching.
>
> Great... Damn... looked at the source of CachingWrapperFilter and it
> makes sense. Thanks for the pointer. The results were pretty amazing.
> Here are the results before and after. Times are in millis:
>
> Before caching the Field:
>
> Searching for Jakarta:
> 2238
> 1910
> 1899
> 1901
> 1904
> 1906
>
> After caching the field:
> 2253
> 10
> 6
> 8
> 6
> 6
>
> That's a HUGE difference :)
>
> I'm very happy :)

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Is RangeQuery more efficient than DateFilter?
Erik Hatcher wrote:

> One more point... caching is done by the IndexReader used for the
> search, so you will need to keep that instance (i.e. the
> IndexSearcher) around to benefit from the caching.

Great... Damn... looked at the source of CachingWrapperFilter and it makes sense. Thanks for the pointer. The results were pretty amazing. Here are the results before and after. Times are in millis:

Before caching the Field:

Searching for Jakarta:
2238
1910
1899
1901
1904
1906

After caching the field:
2253
10
6
8
6
6

That's a HUGE difference :)

I'm very happy :)

--
Please reply using PGP. http://peerfear.org/pubkey.asc
NewsMonster - http://www.newsmonster.org/
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
Re: Is RangeQuery more efficient than DateFilter?
On Mar 29, 2004, at 8:41 AM, Erik Hatcher wrote:

> On Mar 29, 2004, at 4:25 AM, Kevin A. Burton wrote:
>
> > I have a 7G index. A query for a random term comes back fast (300ms)
> > when I'm not using a DateFilter, but when I add the DateFilter it
> > takes 2.6 seconds. Way too long. I assume this is because the filter
> > API does a post process so it has to read fields off disk.
> >
> > Is it possible to do this with a RangeQuery? For example, you could
> > create a "days since January 1, 1970" field and do a range query
> > between 5 and 10... and then add the original field as well.
>
> Are you keeping DateFilter around for more than one search? The
> drawback to pure DateFilter is that it does not cache, so each search
> re-enumerates the terms in the range. In fact, DateFilter by itself is
> practically of no use, I think.
>
> If you have a set of canned date ranges, there are two approaches
> worth considering: DateFilter wrapped by a CachingWrapperFilter, or a
> RangeQuery wrapped in a QueryFilter (which does cache).
> Performance-wise, I don't really think there is much (any?) difference
> in these two approaches, so take your pick. Once the bit sets are
> cached in a filter, searches will be quite fast.

One more point... caching is done by the IndexReader used for the search, so you will need to keep that instance (i.e. the IndexSearcher) around to benefit from the caching.

Erik
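To illustrate why the reader instance has to be kept around, here is a minimal, Lucene-free sketch in Java of a caching wrapper that stores one bit set per reader instance. This is not the CachingWrapperFilter source: FakeReader, DayRangeFilter, and CachingWrapper are hypothetical names invented for this example, and the inner filter scans a plain array instead of enumerating index terms.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;

final class CachingFilterSketch {

    /** Hypothetical stand-in for an index reader: one day number per document. */
    static final class FakeReader {
        final int[] dayOfDoc;
        FakeReader(int[] dayOfDoc) { this.dayOfDoc = dayOfDoc; }
    }

    /** Hypothetical uncached filter: recomputes the bit set on every call. */
    static final class DayRangeFilter {
        final int from;
        final int to;
        int scans = 0; // counts how often the expensive work runs

        DayRangeFilter(int from, int to) { this.from = from; this.to = to; }

        BitSet bits(FakeReader reader) {
            scans++;
            BitSet result = new BitSet(reader.dayOfDoc.length);
            for (int doc = 0; doc < reader.dayOfDoc.length; doc++) {
                int day = reader.dayOfDoc[doc];
                if (day >= from && day <= to) {
                    result.set(doc); // document passes the date range
                }
            }
            return result;
        }
    }

    /** Caching wrapper: the cache key is the reader instance itself. */
    static final class CachingWrapper {
        private final DayRangeFilter delegate;
        // Weak keys let a cached bit set be collected once its reader is dropped.
        private final Map<FakeReader, BitSet> cache = new WeakHashMap<>();

        CachingWrapper(DayRangeFilter delegate) { this.delegate = delegate; }

        synchronized BitSet bits(FakeReader reader) {
            return cache.computeIfAbsent(reader, delegate::bits);
        }
    }

    public static void main(String[] args) {
        int[] days = {3, 5, 7, 10, 12};
        DayRangeFilter filter = new DayRangeFilter(5, 10);
        CachingWrapper cached = new CachingWrapper(filter);

        FakeReader reader = new FakeReader(days);
        cached.bits(reader);
        cached.bits(reader); // same reader instance: served from the cache
        System.out.println("scans after reusing the reader: " + filter.scans);

        cached.bits(new FakeReader(days)); // new instance: cache miss
        System.out.println("scans after opening a new reader: " + filter.scans);
    }
}
```

Because the cache is keyed on the reader instance, opening a fresh reader (or searcher) for every query would recompute the bit set each time, which matches the advice above to keep one instance around.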
Re: Is RangeQuery more efficient than DateFilter?
On Mar 29, 2004, at 4:25 AM, Kevin A. Burton wrote:

> I have a 7G index. A query for a random term comes back fast (300ms)
> when I'm not using a DateFilter, but when I add the DateFilter it
> takes 2.6 seconds. Way too long. I assume this is because the filter
> API does a post process so it has to read fields off disk.
>
> Is it possible to do this with a RangeQuery? For example, you could
> create a "days since January 1, 1970" field and do a range query
> between 5 and 10... and then add the original field as well.

Are you keeping DateFilter around for more than one search? The drawback to pure DateFilter is that it does not cache, so each search re-enumerates the terms in the range. In fact, DateFilter by itself is practically of no use, I think.

If you have a set of canned date ranges, there are two approaches worth considering: DateFilter wrapped by a CachingWrapperFilter, or a RangeQuery wrapped in a QueryFilter (which does cache). Performance-wise, I don't really think there is much (any?) difference in these two approaches, so take your pick. Once the bit sets are cached in a filter, searches will be quite fast.

Erik
Is RangeQuery more efficient than DateFilter?
I have a 7G index. A query for a random term comes back fast (300ms) when I'm not using a DateFilter, but when I add the DateFilter it takes 2.6 seconds. Way too long. I assume this is because the filter API does a post process so it has to read fields off disk.

Is it possible to do this with a RangeQuery? For example, you could create a "days since January 1, 1970" field and do a range query between 5 and 10... and then add the original field as well.

I have to make some app changes so I figured I would ask here before moving forward.

Kevin
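A minimal sketch of the "days since January 1, 1970" field proposed above, written in plain Java with no Lucene dependency. It assumes the value is indexed as a string term and that range bounds are compared lexicographically, so the day count is zero-padded to a fixed width. The class name EpochDayField, the method encode, and the width of 7 digits are assumptions made for this example only.

```java
import java.time.LocalDate;
import java.util.Locale;

final class EpochDayField {

    /**
     * Encodes a date as a zero-padded count of days since 1970-01-01.
     * Fixed width keeps string order identical to numeric order.
     */
    static String encode(LocalDate date) {
        long days = date.toEpochDay();
        if (days < 0) {
            // Negative values would break string ordering in this simple scheme.
            throw new IllegalArgumentException("dates before 1970-01-01 are not supported in this sketch");
        }
        return String.format(Locale.ROOT, "%07d", days);
    }

    public static void main(String[] args) {
        String lower = encode(LocalDate.of(1970, 1, 6));  // 5 days after the epoch
        String upper = encode(LocalDate.of(1970, 1, 11)); // 10 days after the epoch
        String doc = encode(LocalDate.of(1970, 1, 8));    // 7 days after the epoch

        // Inclusive range check done purely on the string terms.
        boolean inRange = lower.compareTo(doc) <= 0 && doc.compareTo(upper) <= 0;
        System.out.println(lower + " .. " + upper + " contains " + doc + ": " + inRange);
    }
}
```

Without the padding, a term such as "10" would sort before "5" as a string, so a range from 5 to 10 would not select the intended days.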