Hi Chris,

It was a long night for our Solr server because we rebuilt the complete 
index using well-formed date strings. The date field is now stored as well, so 
that we can see whether something went wrong :-)

But our problems are not solved completely. Now I can give you a very exact 
description of the current problem (and of the reason why we used malformed 
date values before).

Let's imagine we have 3 records with the following date values:
1. 2006-03-04T12:23:19Z
2. 2007-08-12T19:07:03Z
3. 2008-09-16T12:56:19Z

Here are some queries and the results we get back:
- "date:[2005-01-01T00:00:00Z TO NOW]" or
  "date:[2005-01-01T00:00:00Z TO 2008-09-18T09:45:00Z]": 1 and 2 (incorrect)
- "date:[2005-01-01T00:00:00Z TO 20080918T09:45:00Z]": 1, 2, 3 (correct)
- "date:[2005-01-01T00:00:00Z TO 2007-12-31T23:59:59Z]": only 1 (incorrect)
- "date:[2005-01-01T00:00:00Z TO 20071231T23:59:59Z]": 1 and 2 (correct)
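To make the expected results easy to check, here is a minimal sketch in 
Python that computes which of the three records an inclusive date range 
should match. It does not talk to Solr at all; the names RECORDS and 
expected_matches are only made up for this illustration.

```python
from datetime import datetime, timezone


def parse(value):
    # Parse a UTC date string such as 2006-03-04T12:23:19Z.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# The three example records from this mail, keyed by record number.
RECORDS = {
    1: parse("2006-03-04T12:23:19Z"),
    2: parse("2007-08-12T19:07:03Z"),
    3: parse("2008-09-16T12:56:19Z"),
}


def expected_matches(lower, upper):
    # Inclusive on both ends, like date:[lower TO upper].
    lo, hi = parse(lower), parse(upper)
    return [n for n, d in sorted(RECORDS.items()) if lo <= d <= hi]


print(expected_matches("2005-01-01T00:00:00Z", "2008-09-18T09:45:00Z"))  # [1, 2, 3]
print(expected_matches("2005-01-01T00:00:00Z", "2007-12-31T23:59:59Z"))  # [1, 2]
```

These are the results the well-formed queries should return.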

So, as you can see, using "-" in the second parameter of the range query on 
the date field causes an error: records that should be found are missing. 
Using a malformed date value without "-" returns the correct records.

When "-" is used in the second parameter, all records from the year contained 
in that parameter are no longer found. This behavior is reproducible on 
different systems, both CentOS and Debian, so it must be a problem in Solr or 
in Lucene (the query parser) itself.

Our next step is to test our scenario with Solr 1.3; if the problem isn't 
fixed there, we will use timestamps instead of the date format. But maybe this 
is a general problem in Solr that should be fixed, because in other cases 
other users cannot apply a workaround and will get wrong (incomplete) results 
for their queries.
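
For the timestamp workaround, a rough sketch in Python of the conversion we 
have in mind: each date string is turned into seconds since the Unix epoch 
(UTC), so that range bounds contain no "-" at all. The function name 
to_epoch_seconds is hypothetical, and how such a value would be indexed in 
Solr (field name and type) is still open.

```python
from datetime import datetime, timezone


def to_epoch_seconds(value):
    # Interpret the trailing Z as UTC and return whole seconds since 1970-01-01.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# Convert the three example records and one upper bound for a range query.
for date in ("2006-03-04T12:23:19Z", "2007-08-12T19:07:03Z", "2008-09-16T12:56:19Z"):
    print(date, to_epoch_seconds(date))
print("upper bound:", to_epoch_seconds("2007-12-31T23:59:59Z"))
```

Because the values are plain integers, their numeric order is the same as the 
chronological order of the original dates.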

Best regards,
Christian
