Erick: (sorry, I misspelled your name in my last email)

I tried a bunch of solutions, but none worked as I expected. Basically, none of them sorts the documents by the pattern the way I expect. This is my simplified code:

public class PatternFieldComparatorSource extends FieldComparatorSource {

    private String pattern;
    private boolean ascending = false;

    public PatternFieldComparatorSource(String pattern, boolean ascending) {
        this.ascending = ascending;
        this.pattern = pattern;
    }

    public FieldComparator newComparator(String fieldname, int numHits, int sortPos, boolean reversed) throws IOException {
        return new PatternFieldComparator(numHits, fieldname);
    }

    class PatternFieldComparator extends FieldComparator {

        private final int[] values;
        private int[] currentReaderValues;
        private final String field;
        private int bottom; // Value of bottom of queue

        PatternFieldComparator(int numHits, String field) {
            values = new int[numHits];
            this.field = field;
        }

        public int compare(int slot1, int slot2) {
            // TODO: there are sneaky non-branch ways to compute
            // -1/+1/0 sign
            // Cannot return values[slot1] - values[slot2] because that
            // may overflow
            final int v1 = values[slot1];
            final int v2 = values[slot2];
            if (v1 > v2) {
                return 1;
            } else if (v1 < v2) {
                return -1;
            } else {
                return 0;
            }
        }

        public int compareBottom(int doc) {
            // TODO: there are sneaky non-branch ways to compute
            // -1/+1/0 sign
            // Cannot return bottom - values[slot2] because that
            // may overflow
            final int v2 = currentReaderValues[doc];
            if (bottom > v2) {
                return 1;
            } else if (bottom < v2) {
                return -1;
            } else {
                return 0;
            }
        }

        public void copy(int slot, int doc) {
            values[slot] = currentReaderValues[doc];
        }

        public void setNextReader(IndexReader reader, int docBase) throws IOException {
            currentReaderValues = FieldCache.DEFAULT.getInts(reader, field, new FieldCache.IntParser() {
                public final int parseInt(final String val) {
                    return getValueByPattern(val);
                }
            });
        }

        public void setBottom(final int bottom) {
            this.bottom = values[bottom];
        }

        public Comparable value(int slot) {
            return values[slot];
        }
    }

    private Integer getValueByPattern(String text) {
        // If the pattern is not present, return the max or min possible value
        // (depending on whether the sort is ascending or descending).
        int value = !ascending ? Integer.MAX_VALUE : Integer.MIN_VALUE;
        // If the pattern is present...
        if (text.contains(pattern)) {
            value = Integer.parseInt(...); // extract the value and return
        }
        return new Integer(value);
    }
}

My code does not sort correctly, and I can't find an explanation for why.

Thanks,
Víctor

On Sat, Jan 17, 2015 at 9:04 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Ah, OK. H.L. Mencken wrote something like:
> "For every complex problem there is a solution
> that is simple, elegant, and wrong". I specialize in these...
>
> I don't have a good answer for your question then. How
> is what you're trying failing?
>
> Best,
> Erick
>
> On Fri, Jan 16, 2015 at 4:59 PM, Victor Podberezski
> <vpodberez...@cms-medios.com> wrote:
> > Erik, Thanks for your reply.
> >
> > I wrote a simplification of the problem. The values in the field that
> > can be sorted are not only "val1, val2, ..."; they can also be
> > "patternX1, patternX2", etc.
> >
> > In that case I need to sort according to different criteria. There are
> > a lot of different patterns, but not too many documents in the result
> > of the query filter.
> > For that reason I think the best way is a custom FieldComparator.
> >
> > Thanks
> > Víctor Podberezski
> >
> > On Fri, Jan 16, 2015 at 9:31 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Personally I would do this on the ingestion side with a new field.
> >> That is, analyze the input field when you were indexing the doc,
> >> extract the min value from any numbers, and put that in a
> >> new field. Then it's simply sorting by the new field. This is likely
> >> to be much more performant than reprocessing this at query
> >> time in a comparator.
> >>
> >> FWIW,
> >> Erick
> >>
> >> On Fri, Jan 16, 2015 at 4:00 PM, Victor Podberezski
> >> <vpodberez...@cms-medios.com> wrote:
> >> > I need a hand with a custom comparator.
> >> >
> >> > I have a field filled with words separated by spaces. Some words have
> >> > numbers inside.
> >> >
> >> > I need to extract those numbers and sort the documents by this number.
> >> > If there is more than one number, I need the lowest.
> >> >
> >> > For example:
> >> >
> >> > doc1 "val2 aaaa val3" --> 2, 3 --> 2
> >> > doc2 "val5 aaaa val1" --> 5, 1 --> 1
> >> > doc3 "val7 bbbbb val5" --> 7, 5 --> 5
> >> >
> >> > The sorted results have to be:
> >> >
> >> > doc2
> >> > doc1
> >> > doc3
> >> >
> >> > How can I achieve this?
> >> >
> >> > I have trouble migrating a functional solution from Lucene 2.4 to Lucene
> >> > 3.9 or higher (migration from ScoreDocComparator to FieldComparator).
> >> >
> >> > I tried this:
> >> >
> >> > public void setNextReader(IndexReader reader, int docBase) throws IOException {
> >> >     currentReaderValues = FieldCache.DEFAULT.getInts(reader, field, new FieldCache.IntParser() {
> >> >         public final int parseInt(final String val) {
> >> >             return extractNumber(val);
> >> >         }
> >> >     });
> >> > }
> >> >
> >> > and the rest is the same as the IntComparator,
> >> > but this is not working.
> >> >
> >> > Does anybody have an idea of how to resolve this problem?
> >> > Thanks,
> >> >
> >> > Víctor Podberezski
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>
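
A note on the extraction step described in the original question (take every number embedded in the field's words and keep the lowest): below is a minimal sketch in plain Java with no Lucene dependency. The class name MinNumberExtractor, the method lowestNumber, and the `missing` fallback parameter are hypothetical names chosen for illustration; they are not part of the code in this thread or of Lucene. It assumes the numbers are non-negative runs of digits and that it is handed the complete field text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class MinNumberExtractor {

    // Matches any run of decimal digits, e.g. the "2" in "val2".
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the lowest number embedded anywhere in text,
    // or `missing` when the text contains no usable number.
    static int lowestNumber(String text, int missing) {
        int lowest = missing;
        boolean found = false;
        Matcher m = DIGITS.matcher(text);
        while (m.find()) {
            int n;
            try {
                n = Integer.parseInt(m.group());
            } catch (NumberFormatException e) {
                continue; // digit run does not fit in an int; skip it
            }
            if (!found || n < lowest) {
                lowest = n;
                found = true;
            }
        }
        return lowest;
    }

    public static void main(String[] args) {
        // The three example documents from the original question.
        System.out.println(lowestNumber("val2 aaaa val3", Integer.MAX_VALUE));  // 2
        System.out.println(lowestNumber("val5 aaaa val1", Integer.MAX_VALUE));  // 1
        System.out.println(lowestNumber("val7 bbbbb val5", Integer.MAX_VALUE)); // 5
    }
}
```

Passing Integer.MAX_VALUE as `missing` pushes documents without a number to the end of an ascending sort. One thing worth checking when plugging such a function into FieldCache.IntParser: as far as I understand, the parser is called once per indexed term, so on a tokenized field it would see one word at a time rather than the whole field text.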