Re: Solr french search optimisation

fbrisbart Thu, 23 May 2013 00:30:18 -0700

You could also consider using a SynonymFilter if you can list the
misspelled words.

That's a quick and dirty solution, but it's easier to add a
"pomppe -> pompe" entry to a synonym list than to tune a phonetic filter.
NB: reindexing is required whenever the synonyms file changes.
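
As a rough sketch of the synonym approach (the file name synonyms_fr.txt and the sample entries are assumptions for illustration, not part of the configuration posted in this thread), the filter could be added to the text_fr analyzer after lowercasing and before stemming. Note that Solr's synonyms file writes one-way mappings with "=>":

```xml
<!-- Sketch only: goes into the "text_fr" analyzer chain, after
     LowerCaseFilterFactory and before SnowballPorterFilterFactory.
     The file name synonyms_fr.txt is a placeholder. -->
<filter class="solr.SynonymFilterFactory"
        synonyms="synonyms_fr.txt" ignoreCase="true" expand="false"/>

<!-- Example lines for synonyms_fr.txt (plain text, one mapping per line):
       pomppe => pompe
       chaler => chaleur
       solere => solaire
     The MappingCharFilterFactory in this field type strips accents before
     tokenizing, so the entries are written without accents. -->
```

If the filter sits in the index analyzer, documents have to be reindexed after each change to the file, as noted above. If it is only in the query analyzer, reloading the core should be enough for the new mappings to apply.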

Franck Brisbart

On Thursday, 23 May 2013 at 08:59 +0200, Cristian Cascetta wrote:
> Hello,
> 
> I think you're confusing three different things:
> 
> 1) schema and field definitions govern precision/recall: analyzing a
> field differently changes which results match and how they are ranked
> 2) the "pomppe a chaler" problem is more a spellchecking problem
> http://wiki.apache.org/solr/SpellCheckComponent
> 3) "solère" and "solaire" is a phonetic search problem
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory
> 
> Hope this helps a little,
> 
> cristian
> 
> 
> 2013/5/23 It-forum <it-fo...@meseo.fr>
> 
> > Hello again,
> >
> > Could anyone help me, please?
> >
> > David
> >
> > On 22/05/2013 18:09, It-forum wrote:
> >
> >  Hello to all,
> >>
> >> I'm trying to set up Solr 4.2 to index and search French content.
> >>
> >> I defined a special field type for French content:
> >>
> >>         <fieldType name="text_fr" class="solr.TextField"
> >>                    positionIncrementGap="100">
> >>                 <analyzer type="index">
> >>                     <charFilter class="solr.MappingCharFilterFactory"
> >>                                 mapping="mapping-ISOLatin1Accent.txt"/>
> >>                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>                     <filter class="solr.WordDelimiterFilterFactory"
> >>                             generateWordParts="1" generateNumberParts="1"
> >>                             catenateWords="1" catenateNumbers="1"
> >>                             catenateAll="0" splitOnCaseChange="1"/>
> >>                     <filter class="solr.LowerCaseFilterFactory"/>
> >>                     <filter class="solr.SnowballPorterFilterFactory"
> >>                             language="French" protected="protwords.txt"/>
> >>                 </analyzer>
> >>
> >>                 <analyzer type="query">
> >>                     <charFilter class="solr.MappingCharFilterFactory"
> >>                                 mapping="mapping-ISOLatin1Accent.txt"/>
> >>                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>                     <filter class="solr.WordDelimiterFilterFactory"
> >>                             generateWordParts="1" generateNumberParts="1"
> >>                             catenateWords="0" catenateNumbers="0"
> >>                             catenateAll="0" splitOnCaseChange="1"/>
> >>                     <filter class="solr.LowerCaseFilterFactory"/>
> >>                     <filter class="solr.SnowballPorterFilterFactory"
> >>                             language="French" protected="protwords.txt"/>
> >>                 </analyzer>
> >>         </fieldType>
> >>
> >>
> >> Unfortunately, this field does not behave as I want.
> >>
> >> I'd like to get results even when a word is misspelled.
> >>
> >> For example, I would like "pomppe a chaler" to return the same results
> >> as "Pompe à chaleur", and "solère" the same results as "solaire".
> >>
> >> I can't find the right way to define a field type that achieves this.
> >>
> >> Thanks in advance for your help; do not hesitate to ask for more
> >> information if needed.
> >>
> >> Regards
> >>
> >> David
> >>
> >>
> >>
> >
