Re: Solr pattern tokenizer

2015-02-10 Thread Erick Erickson
Please do not do this. By having such different tokenizers in your index and query time fieldType definition, I pretty much guarantee that you will have endless problems and spend forever chasing your tail trying to solve them. Please do yourself a favor and take the time to get to know the admin/

Re: Solr pattern tokenizer

2015-02-10 Thread Nivedita
I tried solving the issue. It works for a query like CHQ PAID-INWARD TRANHDFC LTD 00036529, but if HDFC LTD is preceded by an underscore(-) or any digi

Re: Solr pattern tokenizer

2015-02-02 Thread Jim . Musil
It looks to me like you simply want to split the incoming query by the hyphen, so that it searches for exact codes like this: "CHQ PAID" "INWARD TRAN" "HDFC LTD". If that's true, I'd either just change the query at the client to do what you want, or look into something like the PatternTokenizer:
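
For illustration only, the client-side option mentioned above (splitting the query on hyphens before sending it to Solr) might look roughly like the sketch below. It is plain Python, not Solr's PatternTokenizer, and the function name split_on_hyphen is hypothetical.

```python
import re


def split_on_hyphen(text):
    """Split text on hyphens, trimming whitespace and dropping empty pieces."""
    return [part.strip() for part in re.split(r"-", text) if part.strip()]


# Each piece could then be sent as a separate quoted phrase.
phrases = split_on_hyphen("CHQ PAID-INWARD TRAN-HDFC LTD")
print(phrases)  # ['CHQ PAID', 'INWARD TRAN', 'HDFC LTD']
```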

Re: Solr pattern tokenizer

2015-02-02 Thread Erick Erickson
You do not have WordDelimiterFilterFactory in your index-time analysis chain. And you're using different tokenizers in the two cases. This will almost certainly lead to "surprising" results unless you completely and thoroughly understand all the nuances here. I _strongly_ recommend you do not do t
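
A rough sketch of why mismatched analysis chains cause trouble, under heavy assumptions: the two functions below are hypothetical stand-ins (whitespace splitting at index time, hyphen splitting at query time), not the poster's actual fieldType or any Solr API. They only show that tokens produced one way at query time may never exist in the index.

```python
def index_analyzer(text):
    """Hypothetical index-time chain: lowercase, split on whitespace."""
    return text.lower().split()


def query_analyzer(text):
    """Hypothetical query-time chain: lowercase, split on hyphens."""
    return [p.strip().lower() for p in text.split("-") if p.strip()]


doc = "CHQ PAID-INWARD TRAN-HDFC LTD"
indexed = set(index_analyzer(doc))
queried = query_analyzer(doc)

# Query tokens that have no counterpart among the indexed tokens.
missing = [t for t in queried if t not in indexed]
print(missing)  # ['chq paid', 'inward tran', 'hdfc ltd']
```

Even though the query text is identical to the document text, none of the query-side tokens match an indexed token in this toy setup.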

Re: Solr pattern tokenizer

2015-02-02 Thread Dikshant Shahi
Why have you created ngrams of size 3? Do you also want matches in case of spelling mistakes? If you want 2 consecutive tokens to match, you can create shingles. Please refer to this link: https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-ShingleFilter Thanks, Dikshant O
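
To make the shingle idea concrete, here is a minimal sketch in plain Python of what word shingles of size 2 look like. It is not Solr's ShingleFilter, which has further options not modeled here, and the function name shingles is hypothetical.

```python
def shingles(tokens, size=2):
    """Return every run of `size` consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


pairs = shingles(["CHQ", "PAID", "INWARD", "TRAN"])
print(pairs)  # ['CHQ PAID', 'PAID INWARD', 'INWARD TRAN']
```

Unlike character ngrams, these units are whole adjacent words, so a match requires the two tokens to appear next to each other.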