On Fri, Aug 12, 2011 at 9:53 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> It turned out that there is a sorting issue with solr 3.3.
> As far as I could trace it down currently:
>
> 4 docs in the index and a search for *:*
>
> sorting on field dccreator_sort in descending order
>
> http://localhost:8983/solr/select?fsv=true&sort=dccreator_sort%20desc&indent=on&version=2.2&q=*%3A*&start=0&rows=10&fl=dccreator_sort
>
> result is:
> ----------
> <lst name="sort_values">
>   <arr name="dccreator_sort">
>     <str>convertitovistitutonazionaled</str>
>     <str>莊國鴻chuangkuohung</str>
>     <str>zzzzzyyyyyywwwwwwwxxxxxxx</str>
>     <str>abdelhadiyasserabdelfattah</str>
>   </arr>
> </lst>
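As a reference point outside Solr, the four reported sort values can be ordered by their raw UTF-8 bytes. This is only a minimal sketch, assuming that plain byte-wise comparison is the ordering expected of the field; the helper name `utf8_desc` is made up for illustration and is not part of Solr or Lucene.

```python
# Sketch: order the four reported sort values by their UTF-8 bytes, descending.
# Assumption: byte-wise UTF-8 comparison is the ordering expected for this field.

reported = [
    "convertitovistitutonazionaled",
    "莊國鴻chuangkuohung",
    "zzzzzyyyyyywwwwwwwxxxxxxx",
    "abdelhadiyasserabdelfattah",
]


def utf8_desc(values):
    """Return values sorted by UTF-8 byte sequence, highest first (hypothetical helper)."""
    return sorted(values, key=lambda s: s.encode("utf-8"), reverse=True)


expected = utf8_desc(reported)
for position, value in enumerate(expected, start=1):
    print(position, value)
```

Under that assumption the string beginning with `convertito` lands at position 3, after the CJK-prefixed string and the `zzzzz` string, rather than at position 1 as in the reported result.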
Hmmm, are the docs sorted incorrectly too, or is it the sort_values
that are incorrect? All variants of string sorting should be well
tested... see TestSort.testSort()

> fieldType:
> ----------
> <fieldType name="alphaOnlySortLim" class="solr.TextField"
>            sortMissingLast="true" omitNorms="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([\x20-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E])" replacement=""
>             replace="all"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="(.{1,30})(.{31,})" replacement="$1" replace="all"/>
>   </analyzer>
> </fieldType>
>
> field:
> ------
> <field name="dccreator_sort" type="alphaOnlySortLim" indexed="true"
>        stored="true" />
>
>
> According to documentation the sorting is UTF8 but _why_ is the first string
> at position 1 and _not_ at position 3 as it should be?
>
>
> Following sorting through the code is somewhat difficult.
> Any hint where to look for or where to start debugging?

Sorting.getStringSortField()

Can you reproduce this with a smaller test that we could use to debug/fix?

-Yonik
http://www.lucidimagination.com
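When building a smaller test, it may help to compute the sort key that the quoted `alphaOnlySortLim` chain would produce for a given raw value. The following is a rough approximation in Python, not Solr code: `approx_sort_key` is a hypothetical helper, the sample input is invented, and Python's `str.lower()` and `re` module are assumed to behave closely enough to Lucene's lowercase filter and `java.util.regex` for plain inputs like this one.

```python
import re

# Characters removed by the first PatternReplaceFilterFactory in the quoted fieldType.
STRIP = re.compile(r"[\x20-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]")
# Second PatternReplaceFilterFactory pattern, copied as-is; "$1" becomes r"\1" in Python.
LIMIT = re.compile(r"(.{1,30})(.{31,})")


def approx_sort_key(raw):
    """Approximate the alphaOnlySortLim chain (hypothetical helper, not Solr code)."""
    token = raw.lower()              # LowerCaseFilterFactory
    token = token.strip()            # TrimFilterFactory
    token = STRIP.sub("", token)     # first pattern filter, replace="all"
    return LIMIT.sub(r"\1", token)   # second pattern filter, replace="all"


# Hypothetical raw field value, invented for illustration only.
print(approx_sort_key("  Abdel-Hadi, Yasser Abdel Fattah "))
```

For this invented input the sketch prints `abdelhadiyasserabdelfattah`. Keys computed this way can be fed into a small unit test and compared against the order Solr actually returns.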