: > I can see in FieldDocSortedHitQueue where the case statement deals with
: > the various types of SortField, but at that point it's comparing FieldDoc
: > objects whose fields[i] is expected to already be an "Integer" object.
: > where is that "Integer" object parsed from the String value of the field?
: >
:
: Surely, by using the number -> string algorithm I showed earlier this
: would not be a problem. Did I miss something?

I haven't worked through the math to prove to myself that your algorithm
is a viable way of expressing any Integer as a 4 byte String, such that
any two Integers sort lexicographically correctly as Strings ... but let's
assume that I have, and that it works perfectly.

So now any RangeFilter or RangeQuery (which operate on String term values)
will work ... what about sorting?

Well, the basis of your idea is custom code to format the Integer as a
String, mainly...

:   public static String convertTotText(int input)
:   {
:       int unsigned = input + Integer.MIN_VALUE;
:       char c2 = (char) (unsigned & 0x0000FFFF);
:       char c1 = (char) (unsigned >> 16 & 0x0000FFFF);
:       return new String(new char[] {c1, c2});
:   }

Since those values are Strings, the Lucene sorting code is not going to
look at them and recognize them as numbers. Even if you specified
SortField.INT, the default parser (wherever it is) isn't going to be able
to make heads or tails of them -- so it's going to have to sort them as
Strings, which is slower than sorting them as Integers -- even if they are
only 4 bytes long (unless I'm wrong, which is entirely possible ... I
haven't tested it).

Now, if we could override the method used to parse the Term values from
Strings to Integers (using a user-specified NumberFormat as I proposed),
then we could rename your "convertTotText" method to "format", write a
corresponding "parse" method, and everything would work smashingly.

-Hoss
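
For what it's worth, a format/parse pair along those lines might look roughly like the sketch below. The class name SortableIntText and both method names are made up for illustration; nothing here is an existing Lucene API, and there is currently no hook to plug it into the sorting code. It only demonstrates that the encoding can be reversed and that String.compareTo agrees with int ordering for the sample values. It says nothing about how such chars behave once written to an index.

```java
public class SortableIntText {

    // Same idea as convertTotText: adding Integer.MIN_VALUE flips the sign bit,
    // so the smallest int becomes 0 and the largest becomes 0xFFFFFFFF.
    public static String format(int input) {
        int unsigned = input + Integer.MIN_VALUE;
        char c1 = (char) (unsigned >>> 16);    // high 16 bits
        char c2 = (char) (unsigned & 0xFFFF);  // low 16 bits
        return new String(new char[] {c1, c2});
    }

    // Hypothetical inverse: rebuild the shifted value, then undo the shift.
    public static int parse(String text) {
        if (text.length() != 2) {
            throw new IllegalArgumentException("expected exactly 2 chars");
        }
        int unsigned = (text.charAt(0) << 16) | text.charAt(1);
        return unsigned - Integer.MIN_VALUE;   // wraps around, undoing format()
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 42, Integer.MAX_VALUE};
        for (int i = 0; i < samples.length; i++) {
            String s = format(samples[i]);
            if (parse(s) != samples[i]) {
                throw new AssertionError("round trip failed for " + samples[i]);
            }
            // String.compareTo compares chars as unsigned 16-bit values.
            if (i > 0 && format(samples[i - 1]).compareTo(s) >= 0) {
                throw new AssertionError("ordering failed at " + samples[i]);
            }
        }
        System.out.println("round trip and ordering hold for the samples");
    }
}
```

With a parse method like this available to the sort code, the terms could be turned back into ints once and compared as ints, which is the whole point of wanting the parser to be overridable.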