Re: Position increment in WordDelimiterFilter.

2016-01-20 Thread Alessandro Benedetti
On 19 January 2016 at 05:41, Modassar Ather wrote:
> Thanks Shawn for your explanation.
>
> Everything else about the analysis looks correct to me, and the positions
> you see are needed for a phrase query to work correctly.
>
> Here the "WiFi device" will not be searched as there is a gap in b

Re: Position increment in WordDelimiterFilter.

2016-01-18 Thread Modassar Ather
Thanks Shawn for your explanation.

> Everything else about the analysis looks correct to me, and the positions
> you see are needed for a phrase query to work correctly.

Here the "WiFi device" will not be searched as there is a gap in between because Fi is at position 2. The document containing WiFi

Re: Position increment in WordDelimiterFilter.

2016-01-18 Thread Shawn Heisey
On 1/18/2016 6:21 AM, Modassar Ather wrote:
> Can you please send us tokens you get (and positions) when you analyze
> *WiFi device*
>
> Tokens generated and their respective positions.
>
> WiFi 1
> Wi 1
> WiFi 1
> Fi 2
> device

Re: Position increment in WordDelimiterFilter.

2016-01-18 Thread Modassar Ather
> Can you please send us tokens you get (and positions) when you analyze
> *WiFi device*

Tokens generated and their respective positions:

WiFi 1
Wi 1
WiFi 1
Fi 2
device 3

Best,
Modassar

On Fri, Jan 15, 2016 at 6:25 PM, E
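The positions reported for "WiFi device" can be checked against a simplified model of phrase matching. The sketch below is an illustration only, not Lucene or Solr code: it assumes a phrase matches when its terms sit at strictly consecutive positions (slop 0) and it ignores query-time analysis. The names `build_positions` and `phrase_matches` are hypothetical helpers introduced for this example.

```python
from collections import defaultdict

# Tokens and positions as reported in the thread for "WiFi device".
TOKENS = [("WiFi", 1), ("Wi", 1), ("WiFi", 1), ("Fi", 2), ("device", 3)]


def build_positions(tokens):
    """Map each term to the set of positions at which it occurs."""
    index = defaultdict(set)
    for term, pos in tokens:
        index[term].add(pos)
    return index


def phrase_matches(index, terms):
    """Return True if the terms occur at strictly consecutive positions."""
    if not terms:
        return False
    return any(
        all(start + i in index.get(term, ()) for i, term in enumerate(terms))
        for start in index.get(terms[0], ())
    )


index = build_positions(TOKENS)
print(phrase_matches(index, ["WiFi", "device"]))      # False: WiFi is at 1, device at 3
print(phrase_matches(index, ["Fi", "device"]))        # True: positions 2 and 3
print(phrase_matches(index, ["Wi", "Fi", "device"]))  # True: positions 1, 2, 3
```

Under these assumptions the unsplit term WiFi and device are two positions apart, which is the gap described in the thread, while the split parts line up with device.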

Re: Position increment in WordDelimiterFilter.

2016-01-15 Thread Emir Arnautovic
Can you please send us tokens you get (and positions) when you analyze *WiFi device*

On 15.01.2016 13:15, Modassar Ather wrote:
Are you saying that WiFi Wi-Fi and Wi Fi should not match each other? I am using WhiteSpaceTokenizer in my analysis chain so wi fi becomes two different token. Please

Re: Position increment in WordDelimiterFilter.

2016-01-15 Thread Modassar Ather
> Are you saying that WiFi Wi-Fi and Wi Fi should not match each other?

I am using WhitespaceTokenizer in my analysis chain, so wi fi becomes two different tokens. Please refer to the examples given in my previous mail about the issues faced. Wi Fi are two terms which will match, but what happens if for a co

Re: Position increment in WordDelimiterFilter.

2016-01-15 Thread Emir Arnautovic
Modassar,

Are you saying that WiFi, Wi-Fi and Wi Fi should not match each other? Why do you use WordDelimiterFilter? Can you give us a few examples where it is useful?

Thanks,
Emir

On 15.01.2016 05:13, Modassar Ather wrote:
Thanks for your responses. It seems to me that you don't want to split

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Modassar Ather
Thanks for your responses.

> It seems to me that you don't want to split on numbers.

It is not only with numbers. Even if you try to analyze WiFi, it will create 4 tokens, one of which will be at position 2. So basically the issue is with the position increment, which causes a few of the queries to behave unexpec

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Jack Krupansky
Which release of Solr are you using? Last year (or so) there was a Lucene change that had the effect of keeping all terms for WDF at the same position. There was also some discussion about whether this was either a bug or a bug fix, but I don't recall any resolution. -- Jack Krupansky On Thu, Jan

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Emir Arnautovic
Hi,

It seems to me that you don't want to split on numbers. Maybe there are other cases where you need to, so it is turned on. If there are such cases, I would suggest you create a test with expectations so you can check what works best for you. It is highly likely that you will not be able t

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Binoy Dalal
> Irrespective of it what I want to understand why there is an increment in
> position. Should not all the terms be at same position as they are yielded
> from the same term/token?

No, they won't. The positions are incremented because typically these splits are used in phrase queries which Solr might aut

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Modassar Ather
Thanks for your responses.

> Why do you think it should be at position 1? In that case searching for
> "3 d" would not find anything. Is it what you expect?

During search some of the results returned are not wanted. Following is the example.

Search query: "3d image"
Search results with 3-d image/3 d i

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Binoy Dalal
I've tried out your settings and here's what I get:

3d 1
3 1
d 2
3d 2

1) Can you confirm if you've made a typo while typing out your results?
2) You'll get the d and 3d as 2 since they're the 2nd token once 3d is split. Try the same thing with d3 and you'll get 3 and d3 at position 2.

On Thu,

Re: Position increment in WordDelimiterFilter.

2016-01-14 Thread Emir Arnautovic
Hi Modassar,

Why do you think it should be at position 1? In that case searching for "3 d" would not find anything. Is it what you expect?

Thanks,
Emir

On 14.01.2016 10:15, Modassar Ather wrote:
Hi, I have following definition for WordDelimiterFilter. The analysis of 3d shows following fo

Position increment in WordDelimiterFilter.

2016-01-14 Thread Modassar Ather
Hi,

I have the following definition for WordDelimiterFilter. The analysis of 3d shows the following four tokens and their positions:

token  position
3d     1
3      1
3d     1
d      2

Please help me understand why d is at 2? Should not it also be at position
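For readers following the question, one way to picture the reported positions: in Lucene each token carries a position increment relative to the previous token, and an increment of 0 stacks a token on the same position. The sketch below is only an illustration in Python, not Solr code. The increments in it are an assumption inferred from the positions quoted in this message, not values read from the analyzer, and `absolute_positions` is a hypothetical helper.

```python
from itertools import accumulate

# (term, position increment) pairs. The increments are an assumption inferred
# from the reported positions for "3d"; they are not taken from Solr output.
TOKENS = [("3d", 1), ("3", 0), ("3d", 0), ("d", 1)]


def absolute_positions(tokens):
    """Turn per-token position increments into absolute positions."""
    terms = [term for term, _ in tokens]
    positions = accumulate(inc for _, inc in tokens)
    return list(zip(terms, positions))


print(absolute_positions(TOKENS))
# [('3d', 1), ('3', 1), ('3d', 1), ('d', 2)]
```

With these assumed increments, only d advances the position, which reproduces the table in the question.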