Interesting question. Does zero-padding make primary key lookups faster or 
slower in Lucene?

From my tests, non-padded keys appear to be quicker to look up than 
zero-padded ones (tested with random access on indexes of varying sizes, up to 
5 million unique keys).
However, I imagine there could come a point where zero-padding performs 
better, because the number of binary chops required to find records among 
non-padded keys could become inefficient due to the lexicographic sort order. 
I'm sure there's a theoretical way of proving or disproving this notion, but it 
might be quicker to just run up a test rig for the scale of index you need and 
benchmark it.
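
To make the sort-order point concrete, here is a minimal sketch in plain Java 
(no Lucene involved) that sorts the same ids as non-padded and as zero-padded 
strings. The class name PaddingDemo, the helper pad, and the width of 11 digits 
are assumptions chosen for illustration; they are not what NumberTools produces 
and say nothing about how Lucene's term dictionary behaves internally.

```java
import java.util.Arrays;

public class PaddingDemo {
    // Hypothetical helper: left-pad a non-negative id with zeros to 11 digits.
    // The width is an arbitrary choice for this illustration.
    static String pad(long id) {
        return String.format("%011d", id);
    }

    // Convert the ids to strings (padded or not) and sort them the way
    // String.compareTo orders them, i.e. lexicographically.
    static String[] sortedKeys(long[] ids, boolean padded) {
        String[] keys = new String[ids.length];
        for (int i = 0; i < ids.length; i++) {
            keys[i] = padded ? pad(ids[i]) : Long.toString(ids[i]);
        }
        Arrays.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        long[] ids = {1, 2, 10, 100, 20};
        // Non-padded keys: prints [1, 10, 100, 2, 20], not numeric order.
        System.out.println(Arrays.toString(sortedKeys(ids, false)));
        // Zero-padded keys: all the same length, so they come out in
        // numeric order (1, 2, 10, 20, 100).
        System.out.println(Arrays.toString(sortedKeys(ids, true)));
    }
}
```

This only shows how the two key styles are ordered and how long each key is. 
Whether that ordering helps or hurts lookup speed at a given index size is 
exactly what a benchmark against a real index would have to answer.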

Cheers
Mark


----- Original Message ----
From: Cam Bazz <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 16 January, 2008 3:37:44 PM
Subject: NumberTools

Hello,

When storing fields to serve as ids, is it better to use
NumberTools.longToString(id) or just store the id as a field?
I have noticed that using NumberTools to store a number as a string
makes range queries easier; however, you end up storing a long string.
Considering millions of ids, would it be faster to store them just as a
string representing the number? (It would take less space, since you
store 1, 2 instead of 00000000001, etc.)
Best Regards,
-C.B.




