Great points, Nick. I wanted to think about this overnight and take a fresh look in the morning, and here's my take on it.
TestLegacyDateRange tests the now-deprecated way of expressing date and time information in documents. The DateField type goes away in 3.0 in favor of DateTools.

The inclusive case of "default:[1/1/2002 TO 1/4/2002]" verifies that all of the dates fall within the range inclusively. The time of 1/4/2002 23:59:59.999 is truncated to just the date portion, 1/4/2002.

The logical comparison

1/1/2002 <= datevalue AND datevalue < 1/5/2002

would be the exclusive-range equivalent, "default:[1/1/2002 TO 1/5/2002}", inclusive on the lower bound and exclusive on the upper bound. Both cases are supported, and the choice of approach is left to the user of Lucene.

Legacy date ranges are compared on the string value of the date (lexicographically), so if you had a query of mod_date:[20020101 TO 20030101] and a document with a mod_date field value of 20030101120000000, which technically falls into the range, it would never surface due to the differing date resolution.

Michael

From: Nicholas Paldino [.NET/C# MVP] [mailto:casper...@caspershouse.com]
Sent: Tuesday, November 17, 2009 7:14 PM
To: lucene-net-dev@incubator.apache.org
Subject: TestDateRange and TestLegacyDateRange - Do they pass in Java, if so, how?

After applying the patch that I submitted for LUCENENET-277, most of the tests under TestQueryParser run and pass.

Two notable standouts are TestDateRange and TestLegacyDateRange. If you apply the patches from LUCENENET-278, then the tests still do not pass, but they do not pass because the text representations of the queries don't match, not because the queries can't be created (LUCENENET-278 addresses the issue of not being able to create a range query from a date, which is the first step to getting these tests to pass).

The question is, there doesn't seem to be a conversion issue, more of a bad test case. The test cases compare the inclusive and exclusive form of the range query (using "{}" and "[]") and using dates.
However, in order to test for the exclusive case (curly brackets).

For example, in the TestDateRange test, the following query is generated for the inclusive case (en-US date format):

default:[1/1/2002 TO 1/4/2002]

But in generating the result to compare against, it uses a time of 1/4/2002 23:59:59.999. I find this to be wrong. First, for an inclusive range of dates, the logical comparison should be:

1/1/2002 <= datevalue AND datevalue < 1/5/2002

Note that the second comparison is the NEXT day, along with a less than comparison. Since you can approach the next day in infinitely decreasing increments, but never actually get to the next day, this comparison is future-proof in all cases, no matter what the resolution is when it comes to the measurement of time.

Using one millisecond before midnight of the next day is an error in the test. This is possibly an issue in Lucene itself.

Basically, the question is, for inclusive ranges involving dates, is this test case correct? I would say no, since this is what the documentation at http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Range Searches states (emphasis mine):

Range Searches

Range Queries allow one to match documents whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically.

mod_date:[20020101 TO 20030101]

If the sorting is done lexicographically, then in the example above, wouldn't any value that has a time component greater than midnight for 1/1/2003 not fall within this case? In other words, if you had a value of 20030101120000000 (noon on 1/1/2003), then that will not be included, since lexicographically, it comes after 20030101.

That being said, why is 1/4/2002 23:59:59.999 being used as a test case in this case for inclusive values? Shouldn't it just be 1/4/2002 (converted into an encoded value of course) and let the bracket format decide the rest?

- Nick
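
To make the lexicographic point in this thread concrete, here is a minimal Python sketch. It is not Lucene or Lucene.Net code: the helper name in_range is hypothetical, and it only assumes that range bounds and field values are compared as plain strings, as the quoted documentation describes.

```python
def in_range(value: str, lower: str, upper: str, inclusive: bool = True) -> bool:
    """Hypothetical helper: compare terms as plain strings, the way a
    lexicographic range check would. Not a Lucene API."""
    if inclusive:
        return lower <= value <= upper
    return lower < value < upper


lower, upper = "20020101", "20030101"

# Day-resolution value equal to the upper bound: inside the inclusive range.
on_upper_bound = in_range("20030101", lower, upper)

# Noon on 1/1/2003 at millisecond resolution: the longer string sorts
# after "20030101", so the inclusive range does not match it.
noon_upper_day = in_range("20030101120000000", lower, upper)

# Noon on 1/1/2002: sorts after "20020101" and before "20030101", so it matches.
noon_lower_day = in_range("20020101120000000", lower, upper)

# Truncating the value to day resolution before comparing restores the match.
truncated = in_range("20030101120000000"[:8], lower, upper)

print(on_upper_bound, noon_upper_day, noon_lower_day, truncated)
```

Under those assumptions, a value stored at a finer resolution than the query bounds falls out of an inclusive range only at the upper bound, which is why the indexed values and the query need to use the same resolution.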