Greetings,

I have a custom analyzer that converts Library of Congress call numbers into normalized strings:

  <fieldType name="LCNormalized" class="solr.TextField" sortMissingLast="true" omitNorms="true">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="org.vufind.solr.analysis.LCCNormalizeFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="callnumber-normalized" type="LCNormalized" indexed="true" stored="true" />

Thus, values like:

  PQ239.Z56
  PQ239.H63 2008
  PQ239.S62 1982
  PQ239.B68 1983
  PQ2390.S35 A5
  PQ2390.S35 B8 1898
  PQ2389 .R65 F3 1854 t.1
  PQ239.A7 1969
  PQ2.N6 1959
  PQ22.A4 D47 1949
  PQ238.L57 1985

become:

  PQ 0239.000000 Z0.560000
  PQ 0239.000000 H0.630000 002008
  PQ 0239.000000 S0.620000 001982
  PQ 0239.000000 B0.680000 001983
  PQ 2390.000000 S0.350000 A0.500000
  PQ 2390.000000 S0.350000 B0.800000 001898
  PQ 2389.000000 R0.650000 F0.300000 001854 T.000001
  PQ 0002.000000 N0.600000 001959
  PQ 0022.000000 A0.400000 D0.470000 001949
  PQ 0238.000000 L0.570000 001985

This allows items to be accurately sorted by call number.

I would also like to perform range searches on the normalised call number. However, whereas

  callnumber-normalized=[DS+TO+FE]

correctly lists items with call numbers between DS and FE, starting with DT* and finishing with FD*, the query

  callnumber-normalized=[DS763+TO+FE]

incorrectly starts at DT* and finishes with FD*.

Can anyone explain why this might be the case? Looking at
http://wiki.apache.org/solr/MultitermQueryAnalysis#Current_components_that_implement_MultiTermAwareComponent,
would I have to add one of the MultiTermAware factories to make this work?

Thanks,
Luke

--
Luke O'Sullivan
Systems Developer
Web Team
Swansea University, Singleton Park, Swansea SA2 8PP, UK
l.osulli...@swansea.ac.uk
01792 602772
@l_os_cymru
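
One possible explanation for the range behaviour described above, offered as a sketch rather than a diagnosis: if the range endpoint DS763 reaches the index without passing through the normalizing filter, it is compared as a raw string against normalized terms such as "DS 0763.000000". A space sorts before the digit 7, so every "DS ..." term falls below the raw lower bound and the first match is a DT term. The Python below only imitates plain lexicographic term comparison. The indexed terms are hypothetical examples written in the normalized format shown above, and range_query is an illustrative helper, not a Solr API.

```python
# Hypothetical indexed terms, in the normalized format shown in the post.
indexed_terms = [
    "DS 0700.000000 A0.100000",
    "DS 0763.000000 B0.200000",
    "DS 0800.000000 C0.300000",
    "DT 0001.000000 A0.100000",
    "FD 0010.000000 A0.100000",
    "FE 0001.000000 A0.100000",
]


def range_query(terms, lower, upper):
    """Return the terms t with lower <= t <= upper, compared as plain strings."""
    return [t for t in sorted(terms) if lower <= t <= upper]


# Assumption: the endpoint is NOT normalized, so "DS763" is used as typed.
raw = range_query(indexed_terms, "DS763", "FE")

# Same query with the lower endpoint normalized by hand to the indexed format.
normalized = range_query(indexed_terms, "DS 0763.000000", "FE")

print("raw endpoint:       ", raw)
print("normalized endpoint:", normalized)
```

In this sketch the raw endpoint skips every DS term, while the hand-normalized endpoint begins at DS 0763. If that is what is happening, the endpoint would need to be normalized before the comparison, whether by the client or by the query-time analysis chain.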