P.S. Meant to add that it's almost always a win, if you have the space, to do any pre-computing you can at index time, you only have to pay the cost once.
Or you could normalize the field, simply add spaces, i.e. left-pad each number with the appropriate number of spaces so all digit sequences are the same width, so Bay 1 Bay 10 Or even left-pad the digits with 0, the user's wouldn't see that unless they used terms component or something. On Wed, Nov 27, 2013 at 8:51 AM, Erick Erickson <erickerick...@gmail.com>wrote: > If this is your complete pattern, can you index two > different fields, one text and one numeric, say > name_text_sort that holds Bay > name_int_sort (make sure it's a number field!) > that holds the 1, 2, 11, etc. > > Then just do a primary sort on name_text_sort and > secondary sort on name_int_sort? > > FWIW, > Erick > > > On Tue, Nov 26, 2013 at 4:11 AM, Umashanker, Srividhya < > srividhya.umashan...@hp.com> wrote: > >> >> >>> What are you intending to do? >> >> [VIDHYA] a field with following values should be sorted in "Natural >> Order" >> >> Name field has Bay 1, Bay10, Bay 11, bay 2, Bay 3 >> should be sorted as Bay 1, bay 2, Bay 3, Bay10, Bay 11 >> >> >> -----Original Message----- >> From: Umashanker, Srividhya >> Sent: Tuesday, November 26, 2013 2:26 PM >> To: Uwe Schindler >> Cc: java-user@lucene.apache.org >> Subject: RE: Alphanumeric Field Comparison : Lucene 4.5 >> >> We do have a duplicate field for every indexed field. >> >> 1> field stores text with exact case (used for case sensitive search) >> 2>lowercased text (used for case insensitive search) >> >> Let me find some example for the collator analyzer. >> >> >> >> Here is the .java lost attachment >> >> >> >> package index.search.util; >> >> >> >> import java.io.IOException; >> >> import java.io.Serializable; >> >> import java.util.Comparator; >> >> >> >> import org.apache.lucene.index.AtomicReaderContext; >> >> import org.apache.lucene.index.BinaryDocValues; >> >> import org.apache.lucene.search.FieldCache; >> >> import org.apache.lucene.search.FieldComparator; >> >> import org.apache.lucene.search.FieldComparatorSource; >> >> import org.apache.lucene.util.Bits; >> >> import org.apache.lucene.util.BytesRef; >> >> >> >> /** >> >> * This class is a FieldComparatorSource having an alpha numeric string >> >> * comparator. The comparator class will compare fields by alpha numeric >> values >> >> * and gives the result to SortField which will handle the ascending and >> >> * descending order. >> >> */ >> >> public class AlphaNumericFieldComparatorSource extends >> FieldComparatorSource implements Serializable >> >> { >> >> >> >> private static final long serialVersionUID = 1L; >> >> >> >> /** >> >> * Comparator class to compare the alpha-numeric field values. >> >> */ >> >> private static class AlphaNumericFieldComparator extends >> FieldComparator<String> implements Comparator<String> >> >> { >> >> private final String[] values; >> >> private BinaryDocValues docTerms; >> >> private final String field; >> >> private String bottom; >> >> private final BytesRef tempBR = new BytesRef(); >> >> private Bits docsWithField; >> >> private int charIndex1, charIndex2, strLen1, strLen2; >> >> // just used internally in this comparator >> >> private static final byte[] MISSING_BYTES = new byte[0]; >> >> >> >> AlphaNumericFieldComparator() >> >> { >> >> values = null; >> >> field = null; >> >> } >> >> >> >> AlphaNumericFieldComparator(final int numHits, final String field) >> >> { >> >> values = new String[numHits]; >> >> this.field = field; >> >> } >> >> >> >> @Override >> >> public int compareBottom(final int doc) >> >> { >> >> docTerms.get(doc, tempBR); >> >> if (tempBR.length == 0 && docsWithField.get(doc) == false) >> >> { >> >> tempBR.bytes = MISSING_BYTES; >> >> } >> >> if (bottom.getBytes() == MISSING_BYTES) >> >> { >> >> if (tempBR.bytes == MISSING_BYTES) >> >> { >> >> return 0; >> >> } >> >> return -1; >> >> } >> >> else if (tempBR.bytes == MISSING_BYTES) >> >> { >> >> return 1; >> >> } >> >> return compare(bottom, tempBR.utf8ToString()); >> >> } >> >> >> >> @Override >> >> public void copy(final int slot, final int doc) >> >> { >> >> BytesRef ref = new BytesRef(); >> >> if (values[slot] != null && values[slot].length() > 0) >> >> { >> >> ref = new BytesRef(values[slot].getBytes()); >> >> } >> >> docTerms.get(doc, ref); >> >> if (ref.length == 0 && docsWithField.get(doc) == false) >> >> { >> >> values[slot] = ""; >> >> } >> >> else >> >> { >> >> values[slot] = ref.utf8ToString(); >> >> } >> >> } >> >> >> >> @Override >> >> public FieldComparator<String> setNextReader(final >> AtomicReaderContext context) >> >> throws IOException >> >> { >> >> docTerms = FieldCache.DEFAULT.getTerms(context.reader(), >> field, true); >> >> docsWithField = >> FieldCache.DEFAULT.getDocsWithField(context.reader(), field); >> >> return this; >> >> } >> >> >> >> @Override >> >> public void setBottom(final int bottom) >> >> { >> >> this.bottom = values[bottom]; >> >> } >> >> >> >> @Override >> >> public String value(final int slot) >> >> { >> >> return values[slot]; >> >> } >> >> >> >> @Override >> >> public int compareValues(final String val1, final String val2) >> >> { >> >> if (val1 == null) >> >> { >> >> if (val2 == null) >> >> { >> >> return 0; >> >> } >> >> return -1; >> >> } >> >> else if (val2 == null) >> >> { >> >> return 1; >> >> } >> >> return compare(val1, val2); >> >> } >> >> >> >> @Override >> >> public int compareDocToValue(final int doc, final String value) >> >> { >> >> docTerms.get(doc, tempBR); >> >> if (tempBR.length == 0 && docsWithField.get(doc) == false) >> >> { >> >> tempBR.bytes = MISSING_BYTES; >> >> } >> >> return compare(tempBR.utf8ToString(), value); >> >> } >> >> >> >> /** >> >> * Method to compare 2 alpha numeric strings >> >> * >> >> * @param s1 >> >> * @param s2 >> >> * @return >> >> */ >> >> @Override >> >> public int compare(final String string1, final String string2) >> >> { >> >> final String strVal1 = string1; >> >> final String strVal2 = string2; >> >> strLen1 = strVal1.length(); >> >> strLen2 = strVal2.length(); >> >> charIndex1 = charIndex2 = 0; >> >> >> >> if (strLen1 == 0) >> >> { >> >> return strLen2 == 0 ? 0 : -1; >> >> } >> >> else if (strLen2 == 0) >> >> { >> >> return 1; >> >> } >> >> >> >> while (charIndex1 < strLen1 && charIndex2 < strLen2) >> >> { >> >> final char char1 = strVal1.charAt(charIndex1); >> >> final char char2 = strVal2.charAt(charIndex2); >> >> int result = 0; >> >> >> >> if (Character.isDigit(char1)) >> >> { >> >> result = Character.isDigit(char2) ? >> compareDigits(strVal1, strVal2) : -1; >> >> } >> >> else if (Character.isLetter(char1)) >> >> { >> >> result = Character.isLetter(char2) ? >> compareAlphabetsAndOthers(strVal1, strVal2) : 1; >> >> } >> >> else >> >> { >> >> result = Character.isDigit(char2) ? 1 : >> Character.isLetter(char2) ? -1 : >> >> compareAlphabetsAndOthers(strVal1, strVal2); >> >> } >> >> >> >> if (result != 0) >> >> { >> >> return result; >> >> } >> >> } >> >> return strLen1 - strLen2; >> >> } >> >> >> >> /** >> >> * Method to compare only digits >> >> * >> >> * @return >> >> */ >> >> private int compareDigits(final String string1, final String >> string2) >> >> { >> >> int diff = 0; >> >> int zeroCount1 = 0, zeroCount2 = 0; >> >> char char1 = (char) 0, char2 = (char) 0; >> >> >> >> // Count the leading zeros and compare it later. >> >> while (charIndex1 < strLen1 && (char1 = >> string1.charAt(charIndex1++)) == '0') >> >> { >> >> zeroCount1++; >> >> } >> >> while (charIndex2 < strLen2 && (char2 = >> string2.charAt(charIndex2++)) == '0') >> >> { >> >> zeroCount2++; >> >> } >> >> >> >> while (true) >> >> { >> >> final boolean endOfDigits1 = (char1 == 0) || >> !Character.isDigit(char1); >> >> final boolean endOfDigits2 = (char2 == 0) || >> !Character.isDigit(char2); >> >> >> >> /* >> >> * If one sequence contains more significant digits than >> the >> >> * other, it's a larger number. In case the sequesnces >> have >> >> * equal lengths, we need to compare digits at each >> position; >> >> * the first >> >> * unequal pair determines which is the bigger number. >> >> */ >> >> >> >> if (endOfDigits1 && endOfDigits2) >> >> { >> >> return diff != 0 ? diff : -(zeroCount1 - zeroCount2); >> >> } >> >> else if (endOfDigits1) >> >> { >> >> return -1; >> >> } >> >> else if (endOfDigits2) >> >> { >> >> return 1; >> >> } >> >> else if (diff == 0 && char1 != char2) >> >> { >> >> diff = char1 - char2; >> >> } >> >> >> >> char1 = charIndex1 < strLen1 ? >> string1.charAt(charIndex1++) : (char) 0; >> >> char2 = charIndex2 < strLen2 ? >> string2.charAt(charIndex2++) : (char) 0; >> >> } >> >> } >> >> >> >> /** >> >> * Method to compare letters and special characters >> >> * >> >> * @param isLetters >> >> * @return >> >> */ >> >> private int compareAlphabetsAndOthers(final String string1, final >> String string2) >> >> { >> >> final char char1 = string1.charAt(charIndex1++); >> >> final char char2 = string2.charAt(charIndex2++); >> >> >> >> return (char1 == char2) ? 0 : (char1 - char2); >> >> } >> >> >> >> @Override >> >> public int compare(final int slot1, final int slot2) >> >> { >> >> final String val1 = values[slot1]; >> >> final String val2 = values[slot2]; >> >> if (val1 == null) >> >> { >> >> if (val2 == null) >> >> { >> >> return 0; >> >> } >> >> return -1; >> >> } >> >> else if (val2 == null) >> >> { >> >> return 1; >> >> } >> >> >> >> return compare(val1, val2); >> >> } >> >> } >> >> >> >> /* >> >> * @see >> >> * >> org.apache.lucene.search.FieldComparatorSource#newComparator(java.lang >> >> * .String, int, int, boolean) >> >> */ >> >> @Override >> >> public FieldComparator<String> newComparator(final String fieldname, >> final int numHits, final int sortPos, final boolean reversed) >> >> throws IOException >> >> { >> >> return new AlphaNumericFieldComparator(numHits, fieldname); >> >> } >> >> >> >> /** >> >> * Method to return alpha-numeric comparator for collection sort >> >> * >> >> * @return comparator >> >> */ >> >> public Comparator<String> getAlphaNumericComparator() >> >> { >> >> return new AlphaNumericFieldComparator(); >> >> } >> >> } >> >> >> >> >> >> -----Original Message----- >> From: Uwe Schindler [mailto:u...@thetaphi.de] >> Sent: Tuesday, November 12, 2013 1:57 PM >> To: java-user@lucene.apache.org >> Subject: RE: Alphanumeric Field Comparison : Lucene 4.5 >> >> >> >> Hi, >> >> >> >> >> >> What are you intending to do? The example code is lost! >> >> >> >> In general, to sort alphanumeric/lexical on a human readable field, you >> would use the collation functionalities (needs a separate field for sorting >> containing the collation keys) provided by Lucene. >> >> >> >> Use >> http://lucene.apache.org/core/4_5_1/analyzers-common/org/apache/lucene/collation/CollationKeyAnalyzer.htmlto >> index the field and then you can do a simple native sort on this field >> (SortField.STRING). >> >> >> >> >> >> Uwe >> >> >> >> >> >> ----- >> >> >> >> Uwe Schindler >> >> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen >> >> >> >> <http://www.thetaphi.de/> http://www.thetaphi.de >> >> >> >> eMail: u...@thetaphi.de<mailto:u...@thetaphi.de> >> >> >> >> >> >> From: Umashanker, Srividhya [mailto:srividhya.umashan...@hp.com] >> >> Sent: Tuesday, November 12, 2013 5:00 AM >> >> To: java-user@lucene.apache.org<mailto:java-user@lucene.apache.org> >> >> Subject: Alphanumeric Field Comparison : Lucene 4.5 >> >> >> >> >> >> Group – >> >> >> >> >> >> We are looking at sorting lucene doc’s based on a field in alphanumeric >> order, as we expect fields to have Alpha numeric characters. >> >> >> >> Attached is the AlphaNumericFieldComparatorSource and below is the >> snippet of its usage. >> >> >> >> >> >> final SortField sortField_id = new SortField(FieldName._id.name(), new >> AlphaNumericFieldComparatorSource(), false); >> >> >> >> >> >> Is anyone using an easier approach or please share other alternatives >> that you have tried. >> >> >> >> >> >> Thanks! >> >> >> >> >> >> -Vidhya >> >> >> >