Hello,
Thanks, Cristian, for the details.
I totally agree with your explanation: these are two different aspects
that I need to solve.
Could you clarify a few more things:
- Should SpellCheckComponent and the phonetic filter be used while
indexing, or only while querying?
- Does the spellcheck component only return the corrected spelling, or is
it also used to search the results?
- If I want to solve spelling, phonetic, and stemming problems for French,
can I use a single field, or should I use several fields with
different filters?
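
For the last question, one common pattern is to keep one field per kind of analysis and fill them with copyField. The following is only a minimal schema.xml sketch under stated assumptions: the names text_fr_phonetic, content, content_phonetic and content_spell are hypothetical, text_general is assumed to be a lightly analyzed type such as the one in the Solr example schema, and DoubleMetaphone is tuned mainly for English, so its fit for French would need testing.

```xml
<!-- schema.xml fragment (sketch; all field and type names are hypothetical) -->

<!-- Phonetic type: a single <analyzer> applies at both index and query time,
     so indexed terms and query terms are encoded the same way. -->
<fieldType name="text_fr_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" keeps only the phonetic code, not the original token -->
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
  </analyzer>
</fieldType>

<!-- Stemmed field, using the text_fr type from the original message -->
<field name="content" type="text_fr" indexed="true" stored="true"/>
<!-- Phonetic copy, searched alongside the stemmed field -->
<field name="content_phonetic" type="text_fr_phonetic" indexed="true" stored="false"/>
<!-- Unstemmed copy, intended as the source of spelling suggestions -->
<field name="content_spell" type="text_general" indexed="true" stored="false"/>

<copyField source="content" dest="content_phonetic"/>
<copyField source="content" dest="content_spell"/>
```

With a layout like this, a query would typically target both content and content_phonetic (for example through the qf parameter of the dismax or edismax parser), while the spellchecker reads its dictionary from the unstemmed field so that suggestions are real words and not stems.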
Regards
David
On 23/05/2013 08:59, Cristian Cascetta wrote:
Hello,
I think you're confusing three different things:
1) schema and field definitions are about precision/recall: analyzing
a field differently changes the search results and their ranking
2) the "pomppe a chaler" problem is more of a spellchecking problem
http://wiki.apache.org/solr/SpellCheckComponent
3) "solère" vs. "solaire" is a phonetic search problem
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory
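
As an illustration of point 2, here is a minimal solrconfig.xml sketch of how the spellcheck component is usually wired up. It is an assumption-laden example, not a tested setup: the field name content_spell is hypothetical (it stands for an indexed, unstemmed field), text_general is assumed to exist as in the Solr example schema, and the accuracy value is only a starting point.

```xml
<!-- solrconfig.xml fragment (sketch; content_spell is a hypothetical field) -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Field type whose analyzer is applied to the query terms being checked -->
  <str name="queryAnalyzerFieldType">text_general</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- Suggestions are drawn from the terms already indexed in this field -->
    <str name="field">content_spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <float name="accuracy">0.5</float>
  </lst>
</searchComponent>

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- Ask for a rewritten query built from the best suggestions -->
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The component takes its dictionary from indexed terms and returns suggestions (and, with collation, a corrected query string) next to the normal results at query time. It does not re-run the search with the corrected terms by itself; the client has to resubmit the collated query.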
Hope this helps a little,
cristian
2013/5/23 It-forum <it-fo...@meseo.fr>
Hello again,
Could anyone help me, please?
David
On 22/05/2013 18:09, It-forum wrote:
Hello to all,
I'm trying to set up Solr 4.2 to index and search French content.
I defined a special fieldType for French content:
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="French" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="French" protected="protwords.txt"/>
  </analyzer>
</fieldType>
Unfortunately, this field does not behave as I would like.
I'd like to be able to get results from misspelled words.
E.g.: I want typing "pomppe a chaler" to return the same results as typing
"Pompe à chaleur", and likewise for "solère" and "solaire".
I can't find the right way to create a fieldType that achieves this.
Thanks in advance for your help; do not hesitate to ask for more
information if needed.
Regards
David