Hi,

Before applying the tokenizer, you can replace your special symbols with some
placeholder phrase to preserve them, and after tokenization you can map them back.

For example:
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="(\+)" replacement="xxx"/>
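A fuller sketch of how that could sit in the analyzer chain, mapping 'i+d' to a
single placeholder token before StandardTokenizer runs (the field type name and
the placeholder token here are only examples, adjust them to your schema):

<fieldType name="text_es_ca" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- protect 'i+d' from being split by rewriting it before tokenization;
         the placeholder 'investigacionydesarrollo' is just an example -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="i\+d" replacement="investigacionydesarrollo"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Apply the same charFilter on both the index and query analyzers so the query
'i+d' is rewritten the same way at search time.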


Thanks,
Zahid iqbal

On Mon, May 22, 2017 at 12:57 AM, Fundera Developer <
funderadevelo...@outlook.com> wrote:

> Hi all,
>
> I am a bit stuck on a problem that I feel must be easy to solve. In
> Spanish it is common to find the term 'i+d'. We are working with Solr 5.5,
> and StandardTokenizer splits it into 'i' and 'd'. Since our index holds
> documents in both Spanish and Catalan, and 'i' is a frequent word in
> Catalan, a user who searches for 'i+d' gets Catalan documents as results.
>
> I have tried to use the SynonymFilter, with something like:
>
> i+d => investigacionYdesarrollo
>
> But it does not seem to change anything.
>
> Is there a way I could set an exception to the Tokenizer so that it does
> not split this word?
>
> Thanks in advance!
>
>
