subject:"Indexing word with plus sign"

Re: Indexing word with plus sign

2017-05-24 Thread Fundera Developer

Thank you very much Erick! You're right! The "Char" part in PatternReplaceCharFilterFactory misled me, and I thought it was just for character replacements. Once I had gone through the documentation of CharFilters (my fault...) I realized that I could use the very same regex I was using with the

Re: Indexing word with plus sign

2017-05-23 Thread Walter Underwood

That was on Solr 1.3, so I’m pretty sure it was the whitespace tokenizer. The synonym substitution for “+/-” was done in client code and indexing code, outside of Solr. We also sanitized queries to remove all query syntax characters. wunder Walter Underwood wun...@wunderwood.org
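A minimal sketch of the client-side approach described above, outside of Solr. The function names and the exact character set are assumptions made for illustration: the set below follows the special characters listed for the Lucene classic query parser, not necessarily what was used on Solr 1.3. The same `normalize` step would be applied to text before indexing and to user queries.

```python
import re

# Synonym taken from the thread: the band name "+/-" becomes "plusminus".
SYNONYMS = {"+/-": "plusminus"}

# Assumption: characters treated as query syntax by the classic query parser.
QUERY_SYNTAX = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def normalize(text):
    """Apply the synonym substitutions; use at index time and at query time."""
    for raw, word in SYNONYMS.items():
        text = text.replace(raw, word)
    return text


def sanitize_query(query):
    """Normalize first, then blank out any remaining query syntax characters."""
    cleaned = QUERY_SYNTAX.sub(" ", normalize(query))
    # Collapse the runs of spaces left behind by the substitution.
    return " ".join(cleaned.split())
```

The substitution has to run before the sanitizing step, otherwise "+/-" would be stripped to nothing before it could be mapped.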

Re: Indexing word with plus sign

2017-05-23 Thread Fundera Developer

Thanks Walter!! Out of curiosity, do you remember which Tokenizer you were using in that case? Thanks! On 23/05/17 at 20:02, Walter Underwood wrote: Years ago at Netflix, I had to deal with a DVD from a band named “+/-”. I gave up and translated that to “plusminus” at index

Re: Indexing word with plus sign

2017-05-23 Thread Walter Underwood

Years ago at Netflix, I had to deal with a DVD from a band named “+/-”. I gave up and translated that to “plusminus” at index and query time. http://plusmin.us/ Luckily, “.hack//Sign” and other related dot-hack anime matched if I just deleted all the punctuation. And

Re: Indexing word with plus sign

2017-05-23 Thread Erick Erickson

You need to distinguish between PatternReplaceCharFilterFactory and PatternReplaceFilterFactory. The first one is applied to the entire input _before_ tokenization. The second is applied _after_ tokenization to individual tokens, and by that time it's too late. It's an easy thing to miss. And at
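The ordering difference can be illustrated outside Solr with a small simulation. This is only a sketch: `tokenize` is a crude stand-in for StandardTokenizer (it keeps runs of word characters and drops punctuation), and the pattern and the replacement word "plus" are assumptions, not the regex used in the thread.

```python
import re

PLUS_BETWEEN_WORD_CHARS = re.compile(r"(\w)\+(\w)")


def tokenize(text):
    # Stand-in tokenizer (assumption): lowercases and splits on non-word characters.
    return re.findall(r"\w+", text.lower())


def char_filter(text):
    # Char-filter stage: sees the whole raw input before tokenization.
    return PLUS_BETWEEN_WORD_CHARS.sub(r"\1plus\2", text)


def token_filter(tokens):
    # Token-filter stage: sees each token only after the tokenizer has run.
    return [PLUS_BETWEEN_WORD_CHARS.sub(r"\1plus\2", t) for t in tokens]


text = "Proyectos de i+d"

# Char filter first: the '+' is rewritten while "i+d" is still intact.
with_char_filter = tokenize(char_filter(text))

# Token filter afterwards: the '+' is already gone, so nothing matches.
with_token_filter = token_filter(tokenize(text))

print(with_char_filter)
print(with_token_filter)
```

In the second pipeline the tokenizer has already produced separate "i" and "d" tokens, so the pattern never sees a '+' to replace.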

Re: Indexing word with plus sign

2017-05-23 Thread Fundera Developer

I have also tried this option, using a PatternReplaceFilterFactory, like this: but it gets processed AFTER the Tokenizer, so by the time it executes there is no longer an "i+d" token, only two independent tokens, "i" and "d". Is there a way I could make the filter execute before the Tokenizer? I

Re: Indexing word with plus sign

2017-05-22 Thread Rick Leir

Fundera, You need a regex which matches a '+' with non-blank chars before and after. It should not replace a '+' preceded by white space; that is important in Solr. This is not a perfect solution, but might improve matters for you. Cheers -- Rick On May 22, 2017 1:58:21 PM EDT, Fundera
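One way to express such a regex is with lookarounds, shown here in Python as a sketch. The replacement word "plus" and the function name are assumptions for illustration. The pattern only matches a '+' that has a non-blank character immediately before and after it, so a '+' at the start of the input or after white space is left alone.

```python
import re

# A '+' with a non-blank character directly before and directly after it.
INNER_PLUS = re.compile(r"(?<=\S)\+(?=\S)")


def replace_inner_plus(text, replacement="plus"):
    """Replace only a '+' embedded inside a word, e.g. the one in 'i+d'."""
    return INNER_PLUS.sub(replacement, text)


print(replace_inner_plus("i+d"))
print(replace_inner_plus("+solr +lucene"))
```

As noted above, this is not a perfect solution: a '+' at the end of a word has no non-blank character after it and is not matched.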

Re: Indexing word with plus sign

2017-05-22 Thread Fundera Developer

Thank you Zahid and Erick, I was going to try the CharFilter suggestion, but then I had doubts. I understand the indexing process, and how an occurrence of 'i+d' would be handled, but what happens at query time? If I use the same filter, I could remove '+' chars that are added by the user to identify

Re: Indexing word with plus sign

2017-05-22 Thread Erick Erickson

You can also use any of the other tokenizers, WhitespaceTokenizer for instance. There are a couple that use regular expressions, etc. See: https://cwiki.apache.org/confluence/display/solr/Tokenizers Each one has its considerations. WhitespaceTokenizer won't, for instance, separate out

Re: Indexing word with plus sign

2017-05-22 Thread Muhammad Zahid Iqbal

Hi, Before applying the tokenizer, you can replace your special symbols with some phrase to preserve them, and after tokenizing you can replace it back. For example: Thanks, Zahid iqbal On Mon, May 22, 2017 at 12:57 AM, Fundera Developer < funderadevelo...@outlook.com> wrote: > Hi all, > > I am a
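The original example did not survive in the archive. As a rough stand-in, the round trip can be sketched in Python under stated assumptions: the placeholder string "xxplusxx" and all function names are hypothetical, and `tokenize` only approximates a tokenizer that drops punctuation.

```python
import re

# Hypothetical placeholder: any string the tokenizer keeps whole and that
# is unlikely to occur in real text.
PLACEHOLDER = "xxplusxx"


def protect(text):
    # Before tokenizing: swap a '+' between word characters for the placeholder.
    return re.sub(r"(\w)\+(\w)", rf"\1{PLACEHOLDER}\2", text)


def tokenize(text):
    # Stand-in tokenizer (assumption): lowercases and splits on non-word characters.
    return re.findall(r"\w+", text.lower())


def restore(tokens):
    # After tokenizing: put the '+' back inside each token.
    return [t.replace(PLACEHOLDER, "+") for t in tokens]


tokens = restore(tokenize(protect("Inversión en I+D")))
print(tokens)
```

The placeholder must be lowercase here because the stand-in tokenizer lowercases its input before the restore step runs.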

Indexing word with plus sign

2017-05-21 Thread Fundera Developer

Hi all, I am a bit stuck on a problem that I feel must be easy to solve. In Spanish it is common to find the term 'i+d'. We are working with Solr 5.5, and StandardTokenizer splits it into 'i' and 'd', and sometimes, as the index holds documents in both Spanish and Catalan, and in Catalan it is


11 matches
