Hello Uwe,

You bring up some very valid points. We did not use those classes because we did not know how to use them. Given what you say, it may make sense to rework our implementation as a user-level library: a package of utilities that can be used together to implement date range searching easily, such as a date range parser, a date range index, and so on.
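For illustration, here is a minimal sketch of what the date range parser piece of such a utility set could look like. The class name `DateRangeParser`, the `START..END` input syntax, and the choice of UTC are all assumptions made for this example; none of them come from the project or from Lucene. The output is a pair of inclusive bounds in milliseconds since 1970, which is one of the numeric encodings of a date mentioned later in this thread.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

/** Hypothetical helper: turns "2012-04-01..2012-04-30" into inclusive epoch-millisecond bounds. */
final class DateRangeParser {

    /** Inclusive lower and upper bounds, in milliseconds since 1970-01-01T00:00:00Z. */
    static final class Range {
        final long startMillis;
        final long endMillis;

        Range(long startMillis, long endMillis) {
            this.startMillis = startMillis;
            this.endMillis = endMillis;
        }
    }

    /** Parses "START..END", where both sides are ISO dates (yyyy-MM-dd), interpreted in UTC. */
    static Range parse(String text) {
        String[] parts = text.trim().split("\\.\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected START..END, got: " + text);
        }
        LocalDate start = LocalDate.parse(parts[0].trim());
        LocalDate end = LocalDate.parse(parts[1].trim());
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end date is before start date: " + text);
        }
        // First millisecond of the start day.
        long startMillis = start.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        // Last millisecond of the end day, so that the whole end day is included.
        long endMillis = end.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli() - 1;
        return new Range(startMillis, endMillis);
    }
}
```

A call such as `DateRangeParser.parse("2012-04-01..2012-04-30")` yields two `long` values that could then be handed to whatever numeric range mechanism the index uses.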
Also, while it is indeed much faster to use ConjunctionScorer, bear in mind that our algorithm only intersects against the set of results returned by IndexSearcher, which means we are probably intersecting a smaller subset. The approach you describe runs a date range query against the entire index, runs a term query against the entire index, and then intersects the two result sets, which may be slower. I imagine most users don't look beyond the top twenty documents anyway, so there is no reason to query the entire index for every document that falls in the date range. A "lazy" (term used loosely) loading type of solution may be best, because if you really break it down, a date range is more like a filter on a set of results than something you have to query against the entire database.

Given the above, a combination of the two ideas may be the best implementation. I will continue to think about it. Again, these are just some ideas I am throwing around; I can't speak in absolute terms because I do not know Lucene very well.

Any and all feedback is appreciated, and once again, thank you for your input.

-John

On Apr 30, 2012, at 3:03 AM, Uwe Schindler wrote:

> Hi,
>
> Thanks for your input. One citation from your report:
>
> "These types of searches are uncommon, and thus programmers don't optimize
> for this case. Lucene, for example, has the ability to filter search results
> using date-ranges, but it is a slow, naive algorithm implemented through
> lexicographic range searching on a custom field. Which is a user level hack
> that works ineffectively. There are no known other ways of performing a date
> range search."
>
> Since Lucene 2.9 / Solr 1.4, Lucene can handle numerical ranges without "a
> slow, naive algorithm", see NumericRangeQuery and NumericField. As every
> date can be represented as a number (e.g. year as integer, or milliseconds
> since 1970 as long, ...), date searches can be done easily with Lucene (and
> very fast, because the intersection between the NumericRangeQuery and the
> TermQuery is done using ConjunctionScorer, which does *not* naively iterate
> the postings).
>
> Did you consider this in your implementation?
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
>
>> -----Original Message-----
>> From: John Mercouris [mailto:[email protected]]
>> Sent: Monday, April 30, 2012 9:24 AM
>> To: [email protected]
>> Subject: Date Range Query Feature Implementation
>>
>> Hello we (John Mercouris & Nick Zivkovic) have implemented date range
>> search functionality into Lucene as part of a class project. The implementation
>> is detailed in the PDF attached. The source is available for download from
>> github at the URL: git://github.com/cs429-ir/date-range-search.git
>>
>> We hope that you find this useful,
>>
>> -John & Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
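To make the two strategies discussed in this thread concrete, the following is a simplified sketch in plain Java. It is not Lucene's ConjunctionScorer code and does not use any Lucene API; the class `DateIntersectionSketch`, its method names, and the array-based data layout are hypothetical. `leapfrog` shows the general skip-ahead idea behind intersecting two sorted document ID lists without walking every posting, and `topKInRange` shows the "lazy" alternative of filtering an already ranked hit list by date and stopping once enough hits are found.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; document IDs and dates are held in plain arrays. */
final class DateIntersectionSketch {

    /** Intersects two ascending doc-ID arrays, skipping ahead instead of stepping one by one. */
    static List<Integer> leapfrog(int[] a, int[] b) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                out.add(a[i]);
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = advance(a, i, b[j]); // jump a forward to the first ID >= b[j]
            } else {
                j = advance(b, j, a[i]); // jump b forward to the first ID >= a[i]
            }
        }
        return out;
    }

    /** Binary search for the first index at or after 'from' whose value is >= target. */
    static int advance(int[] docs, int from, int target) {
        int lo = from;
        int hi = docs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /**
     * Lazy post-filter: walks hits in rank order, keeps those whose date lies in
     * [start, end], and stops as soon as k hits have been collected.
     * dateByDoc[doc] is assumed to hold the date of document 'doc' as a long.
     */
    static List<Integer> topKInRange(int[] rankedDocs, long[] dateByDoc,
                                     long start, long end, int k) {
        List<Integer> out = new ArrayList<>();
        for (int doc : rankedDocs) {
            if (out.size() == k) {
                break;
            }
            long date = dateByDoc[doc];
            if (date >= start && date <= end) {
                out.add(doc);
            }
        }
        return out;
    }
}
```

One trade-off the sketch makes visible: the post-filter only sees the hits it is given, so if the ranked list was already truncated before filtering, it can return fewer than k documents even when more matching documents exist in the index.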
