You need to switch the order. Do synonyms and expansion first, then
shingles.

Have you tried using analysis.jsp?
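
Roughly what I mean for the index analyzer, as an untested sketch. It only
reorders the filters from your own config and assumes the same files and
attributes; nothing else is changed:

```xml
<analyzer type="index">
  <charFilter class="solr.HTMLStripCharFilterFactory"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- synonyms first, so expansion sees the plain whitespace tokens -->
  <filter class="solr.SynonymFilterFactory"
          synonyms="goodForSynonyms.txt" ignoreCase="true" expand="true"/>
  <!-- then build the 1- and 2-token shingles -->
  <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
          outputUnigrams="true"/>
  <!-- keepword filter stays last -->
  <filter class="solr.KeepWordFilterFactory"
          words="goodForKeepWords.txt" ignoreCase="true"/>
</analyzer>
```

The query analyzer would need the same reordering. Whether the shingles
then come out the way you want over the expanded synonyms is worth
checking step by step in analysis.jsp.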

On 2/5/11 10:31 AM, "lee carroll" <lee.a.carr...@googlemail.com> wrote:

>Just to add: things are not going as expected even before the keepword
>filter. The synonym list is not being expanded for shingles. I think I
>don't understand term position...
>
>On 5 February 2011 16:08, lee carroll <lee.a.carr...@googlemail.com>
>wrote:
>
>> Hi List
>> I'm trying to achieve the following
>>
>> text in "this aisle contains preserves and savoury spreads"
>>
>> desired index entry for a field to be used for faceting (i.e. a strict
>> set of normalised terms)
>> is "jam" "savoury spreads", i.e. two facet terms
>>
>> current set up for the field is
>>
>> <fieldType name="facet" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="index">
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
>>             outputUnigrams="true"/>
>>     <filter class="solr.SynonymFilterFactory"
>>             synonyms="goodForSynonyms.txt" ignoreCase="true" expand="true"/>
>>     <filter class="solr.KeepWordFilterFactory"
>>             words="goodForKeepWords.txt" ignoreCase="true"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
>>             outputUnigrams="true"/>
>>     <filter class="solr.SynonymFilterFactory"
>>             synonyms="goodForSynonyms.txt" ignoreCase="true" expand="true"/>
>>     <filter class="solr.KeepWordFilterFactory"
>>             words="goodForKeepWords.txt" ignoreCase="true"/>
>>   </analyzer>
>> </fieldType>
>>
>> The thinking here is:
>> get rid of any markup nonsense
>> split into tokens based on whitespace => "this" "aisle" "contains"
>> "preserves" "and" "savoury" "spreads"
>> produce shingles of 1 or 2 tokens => "this", "this aisle", "aisle",
>> "aisle contains", "contains", "contains preserves", "preserves",
>> "preserves and", "and", "and savoury", "savoury", "savoury spreads",
>> "spreads"
>>
>> expand synonyms using a synonym file (preserves -> jam) =>
>>
>> "this", "this aisle", "aisle", "aisle contains", "contains",
>> "contains preserves", "preserves", "jam", "preserves and", "and",
>> "and savoury", "savoury", "savoury spreads", "spreads"
>>
>> produce a normalised term list using a keepword file containing "jam"
>> and "savoury spreads"
>>
>> which should place "jam" "savoury spreads" into the index field facet.
>>
>> However, I don't get "savoury spreads" in the index. From analysis.jsp,
>> everything goes to plan up to the last step, where the keepword filter
>> does not keep the phrase "savoury spreads". I've tried naively quoting
>> the phrase in the keepword file :-)
>>
>> What is the best way to achieve the above? Is this the correct approach,
>> or is there a better way?
>>
>> thanks in advance lee

