Interesting question. Does zero-padding make primary key lookups faster or slower in Lucene?
From my tests, non-padded keys appear to be quicker to look up than zero-padded ones (tested with random access on indexes of varying sizes, up to 5 million unique keys).

However, I imagine there could come a point where zero-padding performs better, because the number of binary chops required to find records among non-padded keys could become inefficient due to the lexicographic sort order.

I'm sure there's a theoretical way of proving or disproving this, but it might be quicker to set up a test rig at the scale of index you need and benchmark it.

Cheers
Mark

----- Original Message ----
From: Cam Bazz <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, 16 January, 2008 3:37:44 PM
Subject: NumberTools

Hello,

When storing fields to serve as IDs, is it better to use NumberTools.longToString(id) or just store the id as a field?

I have noticed that using NumberTools to store a number as a string makes range queries easier; however, you end up storing a long string. Considering millions of ids, would it be faster to store them just as a string representing the number? (It would take less space, since you store 1, 2 instead of 00000000001, etc.)

Best Regards,
-C.B.
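
On the sort-order point raised in the reply, here is a minimal sketch of how plain and zero-padded decimal keys order lexicographically, and of an exact-match lookup by binary search over each. It is plain Java and does not use Lucene: `PaddedKeyOrder`, `zeroPad`, `sortedKeys` and `contains` are hypothetical names made up for illustration, and `zeroPad` is simple decimal padding, not the encoding that NumberTools.longToString produces. It says nothing about Lucene's actual lookup cost, which still needs a benchmark at the target index size.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PaddedKeyOrder {

    // Hypothetical helper: left-pads a non-negative id with zeros to a fixed width.
    // Plain decimal padding, NOT the encoding used by Lucene's NumberTools.
    public static String zeroPad(long id, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", id);
    }

    // Returns the ids 1..n as strings in lexicographic (String.compareTo) order.
    // The width argument is only used when padded is true.
    public static List<String> sortedKeys(int n, boolean padded, int width) {
        List<String> keys = new ArrayList<>();
        for (long id = 1; id <= n; id++) {
            keys.add(padded ? zeroPad(id, width) : Long.toString(id));
        }
        Collections.sort(keys);
        return keys;
    }

    // Exact-match lookup by binary search; the list must already be sorted.
    public static boolean contains(List<String> sorted, String key) {
        return Collections.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        // Plain keys: lexicographic order differs from numeric order.
        System.out.println(sortedKeys(12, false, 0));
        // prints [1, 10, 11, 12, 2, 3, 4, 5, 6, 7, 8, 9]

        // Zero-padded keys: lexicographic order matches numeric order.
        System.out.println(sortedKeys(12, true, 4));
        // prints [0001, 0002, 0003, 0004, 0005, 0006, 0007, 0008, 0009, 0010, 0011, 0012]
    }
}
```

In this sketch an exact-match lookup succeeds in either ordering, as long as the key is written in the same form as the stored keys. The visible differences are key length and whether the sorted order matches numeric order, which is what matters for range queries.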