I'm doing something similar for dates/times/timestamps. What I'm actually trying to do is find the appointments that 'now' falls within, where each appointment is a from/to pair of date/times, i.e. timestamps.
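To make the condition concrete, here is a minimal sketch of the filter, assuming the appointments are stored as Unix timestamps in seconds. The field names `start_ts` and `end_ts` and the helper `now_in_range_query` are made up for illustration and are not part of any Solr schema.

```python
from datetime import datetime, timezone


def now_in_range_query(now, start_field="start_ts", end_field="end_ts"):
    """Build a Solr-style range filter: start <= now AND end >= now.

    Assumes both (hypothetical) fields hold Unix timestamps in seconds.
    """
    ts = int(now.timestamp())
    # [* TO x] is an open-ended range up to x; [x TO *] is open-ended from x.
    return f"{start_field}:[* TO {ts}] AND {end_field}:[{ts} TO *]"


if __name__ == "__main__":
    print(now_in_range_query(datetime.now(timezone.utc)))
```

If the two fields were Solr date fields instead, the literal NOW could take the place of the number in both clauses.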
Fairly simple search of: What items have a start time BEFORE now, and an end time AFTER now?

My thoughts were to store:

   unix time stamp BIGINTS (64 bit)
   "ISO_DATE ISO_TIME" strings

Which is going to be faster:

   1/ Indexing?
   2/ Searching?

How does the 'tint' field mentioned below apply?

Dennis Gearon

Signature Warning
----------------
EARTH has a Right To Life,
otherwise we all die.

Read 'Hot, Flat, and Crowded'
Laugh at http://www.yert.com/film.php

--- On Wed, 9/8/10, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> From: Jonathan Rochkind <rochk...@jhu.edu>
> Subject: Re: How to import data with a different date format
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Wednesday, September 8, 2010, 10:27 AM
>
> Just throwing it out there, I'd consider a different approach for
> an actual real app, although it might not be easier to get up
> quickly. (For quickly, yeah, I'd just store it as a string, more on
> that at bottom).
>
> If none of your dates have times, they're all just full days, I'm
> not sure you really need the date type at all.
>
> Convert the date to number-of-days since epoch integer. (Most
> languages will have a way to do this, but I don't know about pure
> XSLT). Store _that_ in a 1.4 'int' field. On top of that, make it a
> "tint" (precision non-zero) for faster range queries.
>
> But now your actual interface will have to convert from "number of
> days since epoch" to a displayable date. (And if you allow user
> input, convert the input to number-of-days-since-epoch before making
> a range query or fq, but you'd have to do that anyway even with solr
> dates, users aren't going to be entering W3CDate raw, I don't
> think).
>
> That is probably the most efficient way to have solr handle it --
> using an actual date field type gives you a lot more precision than
> you need, which is going to hurt performance on range queries.
> Which you can compensate for with trie date, sure, but if you don't
> really need that precision to begin with, why use it? Also, the
> extra precision can end up doing unexpected things and making it
> easier to have bugs (for range queries on that high-precision stuff,
> you need to make sure your start date has 00:00:00 set and your end
> date has 23:59:59 set, to do what you probably expect). If you
> aren't going to use the extra precision, it makes everything a lot
> simpler to not use a date field.
>
> Alternately, for your "get this done quick" method, yeah, I'd just
> store it as a string. With a string exactly as you've specified,
> sorting and range queries won't work how you'd want. But if you can
> make it a string of the format "yyyy/mm/dd" instead (always
> two-digit month and day), then you can even sort and do range
> queries on your string dates. For the quick and dirty prototype, I'd
> just do that. In fact, while this might make range queries and
> sorting _slightly_ slower than if you use an int or a tint, this
> might really be good enough even for a real app (hey, it's what lots
> of people did before the trie-based fields existed).
>
> Jonathan
>
> Erick Erickson wrote:
> > I think Markus is spot-on given the fact that you have 2 days.
> > Using a string field is quickest.
> >
> > However, if you absolutely MUST have functioning dates, there are
> > three options I can think of:
> > 1> can you make your XSLT transform the dates? Confession; I'm
> >    XSLT-ignorant
> > 2> use DIH and DateFormatTransformer, see:
> >    http://wiki.apache.org/solr/DataImportHandler#DateFormatTransformer
> >    you can walk a directory importing all the XML files with
> >    FileDataSource.
> > 3> you could write a program to do this manually.
> >
> > But given the time constraints, I suspect your time would be
> > better spent doing the other stuff and just using string as per
> > Markus.
> > I have no clue how SOLR-savvy you are, so pardon if this is
> > something you already know. But lots of people trip up over the
> > "string" field type, which is NOT tokenized. You usually want
> > "text" unless it's some sort of ID.... So it might be worth it to
> > do some searching earlier rather than later <G>....
> >
> > Best
> > Erick
> >
> > On Wed, Sep 8, 2010 at 12:34 PM, Markus Jelsma
> > <markus.jel...@buyways.nl> wrote:
> >
> >> No. The DateField [1] will not accept it any other way. You
> >> could, however, fool your boss and dump your dates in an ordinary
> >> string field. But then you cannot use some of the nice date
> >> features.
> >>
> >> [1]:
> >> http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
> >>
> >> -----Original message-----
> >> From: Rico Lelina <rlel...@yahoo.com>
> >> Sent: Wed 08-09-2010 17:36
> >> To: solr-user@lucene.apache.org;
> >> Subject: How to import data with a different date format
> >>
> >> Hi,
> >>
> >> I am attempting to import some of our data into SOLR. I did it
> >> the quickest way I know because I literally only have 2 days to
> >> import the data and do some queries for a proof-of-concept.
> >>
> >> So I have this data in XML format and I wrote a short XSLT script
> >> to convert it to the format in solr/example/exampledocs (except I
> >> retained the element names, so I had to modify schema.xml in the
> >> conf directory). So far so good -- the import works and I can
> >> search the data. One of my immediate problems is that there is a
> >> date field with the format MM/DD/YYYY. Looking at schema.xml, it
> >> seems SOLR accepts only full date fields -- everything seems to
> >> be mandatory, including the Z for Zulu/UTC time according to the
> >> doc. Is there a way to specify the date format?
> >>
> >> Thanks very much.
> >> Rico
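
For reference, the three representations discussed in this thread for an MM/DD/YYYY value (days since epoch, a sortable "yyyy/mm/dd" string, and the full UTC form with the trailing Z) could be produced roughly like this. It is only a sketch of the "write a program to do this" option; the helper names are invented for illustration, and midnight UTC is an assumption for dates that carry no time.

```python
from datetime import date, datetime

EPOCH = date(1970, 1, 1)


def parse_us_date(text):
    """Parse an MM/DD/YYYY string into a date object."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def days_since_epoch(d):
    """Whole days since 1970-01-01, suitable for an int/tint field."""
    return (d - EPOCH).days


def sortable_string(d):
    """Zero-padded yyyy/mm/dd, so string order matches date order."""
    return d.strftime("%Y/%m/%d")


def solr_utc_date(d):
    """Full date-time form with trailing Z; midnight UTC is assumed."""
    return d.strftime("%Y-%m-%dT00:00:00Z")


if __name__ == "__main__":
    d = parse_us_date("09/08/2010")
    print(days_since_epoch(d), sortable_string(d), solr_utc_date(d))
```

The zero padding is what makes the string variant usable for sorting and range queries: "2010/09/08" sorts before "2010/10/01", whereas the unpadded "2010/9/8" would not.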