Thorsten Scherler wrote:
On Thu, 2007-02-01 at 16:35 +0100, Manuel Albela Miranda wrote:
Thorsten Scherler wrote:
On Thu, 2007-02-01 at 12:37 +0100, Manuel Albela Miranda wrote:
Hello everybody,

Do you know if there is a way to search with and without accents without duplicating a field?

I have a large index (60 GB) and don't want two fields with the same content, one with accents and one without, because this field is the biggest in the index.

Again, hope you can help me.
Try something like this in your schema.xml:
<fieldtype name="stringSimilar" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>
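
For the type to take effect, a field has to reference it. A minimal sketch, assuming a hypothetical field called "content" (the name is a placeholder, not something taken from your schema):

```xml
<!-- "content" is a hypothetical field name; substitute your own large field -->
<field name="content" type="stringSimilar" indexed="true" stored="true"/>
```

As far as I know, the analyzer chain only changes the terms that go into the index. With stored="true", the original text, accents included, should be what comes back in the results.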

HTH

salu2

Thank you very much.

Regards.

Manu

Hi Thorsten,

First of all, thank you for your message. I've been working on the schema.xml file with the lines you sent me. Now I can filter the query, but the problem is that I have accents in my index, so when I search for words with accents, Solr only searches for the word without them, and I need both. I don't know if there is a way to do this.

Well, it is not nice but you could use fuzzy search.

e.g. q=Órden~0.75

That will find more matches. See recent threads around fuzzy search.

The schema change above works well, but only once the index has been
rebuilt, which in your case means reindexing the WHOLE 60 GB.

salu2

Yes, I was considering that, but there is a problem. If I remove the accents from the index, the search results will not have those accents either, so the results will not be good enough.

I'll have to check the performance of fuzzy search, but I don't think it would work for me.

Thank you again.

Regards.

Manu.
