Re: Solr french search optimisation

It-forum Thu, 23 May 2013 01:42:32 -0700

Hello,

Thanks, Cristian, for the details.

I totally agree with your explanation: these are two different aspects that I need to solve.


Could you clarify a few more things:

- Should the SpellCheckComponent and the phonetic filter be used while indexing, or only while querying?

- Does the spellcheck component only return the corrected spelling, or is it also used to search the results?

- If I want to handle spelling, phonetic, and stemming problems in French, can I use a single field, or should I use several fields with different filters?
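
On the last point, one common approach is to keep several fields with different analysis chains and fill them with copyField, rather than making one field do everything. The following is a minimal schema.xml sketch, not a tested configuration. The field names (content, content_phonetic, spell) and the type names (text_fr_phonetic, text_spell) are hypothetical and do not come from the thread. DoubleMetaphone is only an example encoder; it is tuned for English, so how well it fits French would need testing.

```xml
<!-- Hypothetical type: phonetic matching, no stemming.
     A single <analyzer> applies at both index and query time,
     so both sides are encoded the same way. -->
<fieldType name="text_fr_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>

<!-- Hypothetical type: unstemmed terms, intended as a spellcheck dictionary source -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names; text_fr is the stemmed type from the original message -->
<field name="content" type="text_fr" indexed="true" stored="true"/>
<field name="content_phonetic" type="text_fr_phonetic" indexed="true" stored="false"/>
<field name="spell" type="text_spell" indexed="true" stored="false"/>

<copyField source="content" dest="content_phonetic"/>
<copyField source="content" dest="spell"/>
```

A query could then search content and content_phonetic together, for example through the edismax qf parameter with a lower boost on the phonetic field, so that exact and stemmed matches rank above purely phonetic ones.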


Regards

David


On 23/05/2013 08:59, Cristian Cascetta wrote:

Hello,

I think you're confusing three different things:

1) schema and field definitions are about precision/recall: analyzing
a field differently changes the search results and their ranking
2) the "pomppe a chaler" problem is more of a spellchecking problem
http://wiki.apache.org/solr/SpellCheckComponent
3) "solère" and "solaire" is a phonetic search problem
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory
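
For point 2, a spellcheck setup in solrconfig.xml might look like the sketch below. It assumes the schema has an unstemmed field named spell whose type is named text_spell; both names are hypothetical and not from the thread. The /select handler shown here is likewise only an illustration of where the component is attached.

```xml
<!-- Spellcheck component reading its dictionary from the (hypothetical) "spell" field -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_spell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>

<!-- Attach the component to a search handler and enable it by default -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With this kind of setup the component returns suggestions (and, with spellcheck.collate, a rewritten query string) alongside the normal results. It does not change which documents match; the client decides whether to re-run the search with the suggested query.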

Hope this helps a little,

cristian


2013/5/23 It-forum <it-fo...@meseo.fr>

Hello again,

Could anyone help me, please?

David

On 22/05/2013 18:09, It-forum wrote:

  Hello to all,

I'm trying to set up Solr 4.2 to index and search French content.

I defined a special field type for French content:

         <fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
                 <analyzer type="index">
                     <charFilter class="solr.MappingCharFilterFactory"
                                 mapping="mapping-ISOLatin1Accent.txt"/>
                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                     <filter class="solr.WordDelimiterFilterFactory"
                             generateWordParts="1" generateNumberParts="1"
                             catenateWords="1" catenateNumbers="1"
                             catenateAll="0" splitOnCaseChange="1"/>
                     <filter class="solr.LowerCaseFilterFactory"/>
                     <filter class="solr.SnowballPorterFilterFactory"
                             language="French" protected="protwords.txt"/>
                 </analyzer>

                 <analyzer type="query">
                     <charFilter class="solr.MappingCharFilterFactory"
                                 mapping="mapping-ISOLatin1Accent.txt"/>
                     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                     <filter class="solr.WordDelimiterFilterFactory"
                             generateWordParts="1" generateNumberParts="1"
                             catenateWords="0" catenateNumbers="0"
                             catenateAll="0" splitOnCaseChange="1"/>
                     <filter class="solr.LowerCaseFilterFactory"/>
                     <filter class="solr.SnowballPorterFilterFactory"
                             language="French" protected="protwords.txt"/>
                 </analyzer>
         </fieldType>


Unfortunately, this field does not behave as I would like.

I'd like to get results even when a word is misspelled.

For example, I want to get the same results typing "pomppe a chaler" as
typing "Pompe à chaleur", or typing "solère" as typing "solaire".

I can't find the right way to define a field type that achieves this.

Thanks in advance for your help; do not hesitate to ask for more
information if needed.

Regards

David
