Solr : Stopwords at query time

2013-04-11 Thread meghana
In Solr, I have text in the format below.

1s: This is very nice day. 4s: Christmas is about to come 7s: and christmas
preparation is just on 12s: this is awesome!! 

I want tokens like '1s:', '4s:', or anything of the form 'ns:' to be neither
indexed nor searchable. To do so, I have added a stop words filter to my text
field definition.

Below is my field type definition:
---
<fieldType name="text_en_splitting" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>


and the stopwords.txt file contains the words:
---
1s:
2s:
... 
...
... 
1s:

--
When I search with q=109s: it returns 0 results, as expected. If I search for
109s, it should also return 0 results, but surprisingly Solr does not do so
and returns documents that have 109s: in their text.

My understanding is that if the token 109s: is not indexed, then 109s is not
indexed either, and since 109s is not in the index, a search for it should
return nothing.
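
To make the two queries concrete, here is a rough Python model of the first stages of the chain described above (whitespace tokenizer, stop filter, word-delimiter split, lowercasing). It is only a sketch, not Solr's implementation: synonyms, protected words, catenation and stemming are left out, `word_delimiter` is a crude stand-in for WordDelimiterFilterFactory, and the contents of `STOPWORDS` are an assumption (entries "1s:" through "1000s:"), since the actual stopwords.txt is elided in the post.

```python
import re

# Assumed stop set: the real stopwords.txt is elided in the post, so this
# hypothetically holds "<n>s:" for n = 1..1000.
STOPWORDS = {f"{n}s:" for n in range(1, 1001)}


def whitespace_tokenize(text):
    """Split on whitespace only, so punctuation stays attached to the token."""
    return text.split()


def stop_filter(tokens, stopwords=STOPWORDS, ignore_case=True):
    """Drop tokens that equal a stop word. The comparison is on the whole token."""
    kept = []
    for token in tokens:
        key = token.lower() if ignore_case else token
        if key not in stopwords:
            kept.append(token)
    return kept


def word_delimiter(tokens):
    """Crude stand-in for the word delimiter: keep runs of letters or digits."""
    parts = []
    for token in tokens:
        parts.extend(re.findall(r"[A-Za-z]+|[0-9]+", token))
    return parts


def analyze(text, stopwords=STOPWORDS):
    """Tokenize, remove stop words, split into parts, then lowercase."""
    tokens = stop_filter(whitespace_tokenize(text), stopwords)
    return [part.lower() for part in word_delimiter(tokens)]


print(analyze("109s:"))   # removed as a stop word
print(analyze("109s"))    # not a stop word, so it is split into parts
print(analyze("1500s:"))  # not in the assumed stop set, so it is split too
```

In this model the query 109s is never equal to the stop word 109s:, so it survives the stop filter and is split into the parts 109 and s. A token of the form ns: that is missing from the stop list would be split into the same kind of parts at index time, which is one possible way such a query could match. That is only a hypothesis here; the analysis output for the real field type is what would confirm or rule it out.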

But Solr does not seem to behave this way. Can anybody explain this
behavior, and what changes I should make to meet my requirement?

Thanks




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Stopwords-at-query-time-tp4055249.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr : Stopwords at query time

2013-04-11 Thread Upayavira
I'd suggest using the Analysis tab in the admin UI to unpick what is
going on. You can play with scenarios there without having to waste
round trips indexing stuff.
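
The same analysis can usually be requested over HTTP as well. The sketch below only builds the request URL, and it rests on assumptions: that the field analysis handler is registered at `/analysis/field` in solrconfig.xml, and that the core lives at `http://localhost:8983/solr/collection1` (a hypothetical address, substitute your own). The `analysis_url` helper is made up for illustration.

```python
from urllib.parse import urlencode


def analysis_url(base, field_type, index_text, query_text):
    """Build a field-analysis request for one index-time value and one query."""
    params = {
        "analysis.fieldtype": field_type,   # field type to analyze against
        "analysis.fieldvalue": index_text,  # text run through the index analyzer
        "analysis.query": query_text,       # text run through the query analyzer
        "analysis.showmatch": "true",       # ask for matching tokens to be flagged
        "wt": "json",
    }
    return f"{base}/analysis/field?{urlencode(params)}"


# Hypothetical base URL; the field type name is the one from the question.
url = analysis_url(
    "http://localhost:8983/solr/collection1",
    "text_en_splitting",
    "109s: this is awesome!!",
    "109s",
)
print(url)
```

Opening that URL in a browser (or fetching it with any HTTP client) should list the tokens produced after each tokenizer and filter stage, for both the indexed text and the query.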

Upayavira
