How to retain % sign against numbers in lucene indexing/ search
Hi Group,

I am facing a requirement change to get the % sign retained in searches, e.g.

Sample search docs:
1. Number of boys 50
2. My score was 50%
3. 40-50% for pass score

Search query: 50%

Expected results: Doc-2 and Doc-3, i.e.
My score was 50%
40-50% for pass score

Actual result: All 3 documents

On the implementation front, I am using a set of filters such as LowerCaseFilter and EnglishPossessiveFilter in addition to the base tokenizer, StandardTokenizer.

My analysis suggests that StandardTokenizer strips off the % sign, hence the behavior. Has anyone faced a similar requirement? Any help/guidance is highly appreciated.

*Warm Regards,*
*Amitesh K*
Fwd: How to retain % sign against numbers in lucene indexing/ search
-- Forwarded message --
From: Amitesh Kumar
Date: Wed, Jul 12, 2023 at 7:03 AM
Subject: How to retain % sign against numbers in lucene indexing/ search
How to retain % sign next to number during tokenization
I am facing a requirement change to get the % sign retained in searches, e.g.

Sample search docs:
1. Number of boys 50
2. My score was 50%
3. 40-50% for pass score

Search query: 50%

Expected results: Doc-2 and Doc-3, i.e.
1. My score was 50%
2. 40-50% for pass score

Actual result: All 3 documents (because the tokenizer strips off the % both during indexing and during searching, and hence matches all docs with 50 in them).

On the implementation front, I am using a set of filters such as LowerCaseFilter and EnglishPossessiveFilter in addition to the base tokenizer, StandardTokenizer.

Per my analysis, StandardTokenizer strips off the % sign, hence the behavior. Has anyone faced a similar requirement? Any help/guidance is highly appreciated.

Regards,
Amitesh
Re: How to retain % sign next to number during tokenization
Sorry for duplicating the question.

On Tue, Jul 18, 2023 at 19:09 Amitesh Kumar wrote:

> I am facing a requirement change to get % sign retained in searches. e.g.
>
> Sample search docs:
> 1. Number of boys 50
> 2. My score was 50%
> 3. 40-50% for pass score
>
> Search query: 50%
> Expected results: Doc-2, Doc-3 i.e.
> 1. My score was 50%
> 2. 40-50% for pass score
>
> Actual result: All 3 documents (possibly because tokenizer strips off the
> % both during indexing as well as searching and hence matches all docs
> with 50 in it.)
>
> On the implementation front, I am using a set of filters like
> lowerCaseFilter, EnglishPossessiveFilter etc in addition to base tokenizer
> StandardTokenizer.
>
> Per my analysis, StandardTokenizer strips off the % sign and hence the
> behavior. Has someone faced similar requirement? Any help/guidance is
> highly appreciated.
>
> Regards
> Amitesh

--
Regards
Amitesh
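
The behavior described in the question can be reproduced outside Lucene. The following is a minimal sketch in plain Java (java.util.regex only, no Lucene classes), assuming tokens are compared by exact equality. The names PercentTokenDemo, DROP_PERCENT, KEEP_PERCENT, tokenize and search are hypothetical and exist only for this illustration; DROP_PERCENT merely approximates a tokenizer that discards the % sign and is not StandardTokenizer itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentTokenDemo {
    // Assumption: approximates the reported behavior, where '%' never
    // becomes part of a token (digits or letters only).
    static final Pattern DROP_PERCENT = Pattern.compile("\\d+|\\p{L}+");

    // Hypothetical alternative: a run of digits may carry one trailing '%'.
    static final Pattern KEEP_PERCENT = Pattern.compile("\\d+%?|\\p{L}+");

    // Lowercase the text and return every match of the pattern as a token.
    static List<String> tokenize(Pattern pattern, String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Return the docs whose token list contains every query token.
    static List<String> search(Pattern pattern, List<String> docs, String query) {
        List<String> queryTokens = tokenize(pattern, query);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (tokenize(pattern, doc).containsAll(queryTokens)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "Number of boys 50",
                "My score was 50%",
                "40-50% for pass score");

        // '%' dropped on both sides: the query becomes "50" and matches all docs.
        System.out.println(search(DROP_PERCENT, docs, "50%"));

        // '%' kept on both sides: only the docs containing the token "50%" match.
        System.out.println(search(KEEP_PERCENT, docs, "50%"));
    }
}
```

Note the trade-off in this sketch: once "50%" is a single token, a plain query for 50 no longer matches the documents that contain 50%. Lucene ships a PatternTokenizer that is driven by a java.util.regex.Pattern, so a pattern along these lines might be usable there, but how it would interact with the existing filter chain is not tested here.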