John Song <[EMAIL PROTECTED]> wrote on 17/01/2007 11:09:40: > ultimately, everything is text search. For decimal number, what you > do is to write a customized analyzer which multiple the number by > some factor, round it to a long and then use NumberTools to convert > that into a text string. Here is what I did for latitude/longitude > search: multiple it by 10e6. >
For most cases this would be sufficient. Won't work if multiplying by precision takes beyond Long.MAX. I think an alternative that would work in that case too could be to prefix X with P, where: X == original number. K == number of digits left side of decimal point. P == lexicographically valid representation of K. P is defined as "n0"+D+"n", where D is a one char representation of K, that is: '0', .. '9','a'..'z'. Examples for values of K, prefix P, and result string S: X=.17 K=0 P=n00n S=n00n_.17 X=0.170 K=0 P=n00n S=n00n_.17 (no leading/trailing redundant zeros) X=5.8 K=1 P=n01n S=n01n_5.8 X=17.66 K=2 P=n02n S=n02n_17.66 ... X=123456789.22 K=9 P=n09n S=n09n_123456789.22 X=1234567890.22 K=10 P=n0an S=n0an_1234567890.22 X=12345678901.22 K=11 P=n0bn S=n0bn_12345678901.22 X=123456789012.22 K=12 P=n0cn S=n0cn_123456789012.22 You get the idea - the prefix prevents the sorting error. No special care required for the fraction part. Negative values can work using 'm' instead of 'n' in the prefix, e.g.: X=-123456789.22 K=9 P=n09n S=m09m_123456789.22 This should work for long values as well. I like the fixed length prefix. But code for this would need to be written... > john > > ----- Original Message ---- > From: Jiho Han <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Wednesday, January 17, 2007 10:13:47 AM > Subject: Searching/indexing date/time values or numeric values? > > Is there a way to index/search so that a query could be written to > search on a field using arithmetic comparison operators? > > What I mean is if I had a date/time field called CREATEDATE, I would > search for all documents where: > > CREATEDATE > "1/1/2007" > > The above is obvisouly pseudo-query expression. I did find something > called Range searches on the query syntax documentation page and it says > the sorting is done lexicographically. I guess that means it's sorted > by letter. I would then need to store all my date/time values in a > format like yyyymmdd hh:mm:ss. > And search, CREATEDATE:[20070101 00:00:00 TO 20070118 00:00:00], where > the second date/time value is something like midnight tonight. > > But what about a decimal value? If I have a VERSION field where values > are like 1.0, 2.5, 11.3, etc. That wouldn't work. Because the values > would be sorted: > 1.0 > 11.3 > 2.5 > In that order. And if I do VERSION:[1.0 TO 3.0], search would return > all 3 of them. The only workaround seems to be prepending 0's and that > would also only work as long as the maximum digits for the interger part > is known ahead of time. > > Can someone verify/suggest ways to make this work? > Thanks > > Jiho Han > Senior Software Engineer > Infinity Info Systems > The Sales Technology Experts > Tel: 212.563.4400 x6375 > Fax: 212.760.0540 > [EMAIL PROTECTED] > www.infinityinfo.com <http://www.infinityinfo.com/> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > ____________________________________________________________________________________ > Cheap talk? > Check out Yahoo! Messenger's low PC-to-Phone call rates. > http://voice.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]