How to retain % sign against numbers in lucene indexing/ search

2023-07-12 Thread Amitesh Kumar
Hi Group,

I am facing a requirement change to retain the % sign in searches, e.g.:

Sample search docs:
1. Number of boys 50
2. My score was 50%
3. 40-50% for pass score

Search query: 50%
Expected results: Doc-2, Doc-3 i.e.
My score was 50%
40-50% for pass score

Actual result: All 3 documents

On the implementation front, I am using a set of filters like
LowerCaseFilter, EnglishPossessiveFilter etc. in addition to the base
tokenizer, StandardTokenizer.

My analysis suggests that StandardTokenizer strips off the % sign, hence
the behavior. Has anyone faced a similar requirement? Any help/guidance is
highly appreciated.
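
The effect described above can be reproduced outside Lucene with a small
sketch. It is plain Python, not Lucene code: the two tokenizer functions and
search() are hypothetical stand-ins, and the first one only approximates the
observed behavior (punctuation such as % dropped, text lowercased) rather
than StandardTokenizer's actual rules.

```python
import re

# The three sample documents from the question.
DOCS = {
    1: "Number of boys 50",
    2: "My score was 50%",
    3: "40-50% for pass score",
}

def tokenize_dropping_percent(text):
    # Assumption: keep only runs of letters/digits, so '%' and '-' vanish.
    return re.findall(r"[a-z0-9]+", text.lower())

def tokenize_keeping_percent(text):
    # Same idea, but a '%' directly after digits stays part of the token.
    return re.findall(r"[0-9]+%|[a-z0-9]+", text.lower())

def search(query, tokenizer):
    # A doc matches if it shares at least one token with the query;
    # the same tokenizer is applied at "index" and "query" time.
    query_tokens = set(tokenizer(query))
    return sorted(
        doc_id for doc_id, text in DOCS.items()
        if query_tokens & set(tokenizer(text))
    )

print(search("50%", tokenize_dropping_percent))
print(search("50%", tokenize_keeping_percent))
```

With the first tokenizer the query collapses to the bare token 50, which
occurs in every document; with the second, only the documents containing
50% match.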

*Warm Regards,*
*Amitesh  K*


Fwd: How to retain % sign against numbers in lucene indexing/ search

2023-07-13 Thread Amitesh Kumar
*Warm Regards,*
*Amitesh  K*


-- Forwarded message -
From: Amitesh Kumar 
Date: Wed, Jul 12, 2023 at 7:03 AM
Subject: How to retain % sign against numbers in lucene indexing/ search
To: 




How to retain % sign next to number during tokenization

2023-07-18 Thread Amitesh Kumar
I am facing a requirement change to get % sign retained in searches. e.g.

Sample search docs:
1. Number of boys 50
2. My score was 50%
3. 40-50% for pass score

Search query: 50%
Expected results: Doc-2, Doc-3 i.e.
1. My score was 50%
2. 40-50% for pass score

Actual result: All 3 documents (because the tokenizer strips off the %
during both indexing and searching, so the query matches every doc with
50 in it).

On the implementation front, I am using a set of filters like
LowerCaseFilter, EnglishPossessiveFilter etc. in addition to the base
tokenizer, StandardTokenizer.

Per my analysis, StandardTokenizer strips off the % sign, hence the
behavior. Has anyone faced a similar requirement? Any help/guidance is highly
appreciated.
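
One possible direction, sketched under assumptions: rewrite a % that follows
a digit into a word-like marker before tokenization, on both the indexing and
the query side, so the tokenizer no longer has a symbol to drop. The code
below is plain Python, not Lucene; protect_percent(), the "pct" marker and
the regex tokenizer are all hypothetical. In Lucene a similar rewrite might
be done with a char filter such as MappingCharFilter or
PatternReplaceCharFilter placed ahead of the tokenizer, but that has not been
verified against this setup.

```python
import re

# The three sample documents from the question.
DOCS = {
    1: "Number of boys 50",
    2: "My score was 50%",
    3: "40-50% for pass score",
}

def protect_percent(text):
    # Hypothetical pre-tokenization rewrite: "50%" becomes "50pct".
    return re.sub(r"(\d)%", r"\1pct", text)

def analyze(text):
    # Assumed stand-in for a tokenizer that drops punctuation:
    # rewrite first, then keep only lowercase letter/digit runs.
    return re.findall(r"[a-z0-9]+", protect_percent(text).lower())

def search(query):
    # The same analysis chain runs for documents and for the query.
    query_tokens = set(analyze(query))
    return sorted(
        doc_id for doc_id, text in DOCS.items()
        if query_tokens & set(analyze(text))
    )

print(analyze("40-50% for pass score"))
print(search("50%"))
print(search("50"))
```

One trade-off of this design: a literal search for the marker text (here
"50pct") would also match the percent documents, so the marker should be
something unlikely to occur in real content.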

-- 
Regards,
Amitesh
Sent from Gmail Mobile
(Please ignore typos)


Re: How to retain % sign next to number during tokenization

2023-07-18 Thread Amitesh Kumar
Sorry for duplicating the question.

On Tue, Jul 18, 2023 at 19:09 Amitesh Kumar  wrote:

> I am facing a requirement change to get % sign retained in searches. e.g.
>
> Sample search docs:
> 1. Number of boys 50
> 2. My score was 50%
> 3. 40-50% for pass score
>
> Search query: 50%
> Expected results: Doc-2, Doc-3 i.e.
> 1. My score was 50%
> 2. 40-50% for pass score
>
> Actual result: All 3 documents
> (possibly because tokenizer strips off the % both during indexing as well
> as searching and hence matches all docs with 50 in it.)
>
> On the implementation front, I am using a set of filters like
> lowerCaseFilter, EnglishPossessiveFilter etc in addition to base tokenizer
> StandardTokenizer.
>
> Per my analysis, StandardTokenizer strips off the % sign and hence the
> behavior. Has someone faced similar requirement? Any help/guidance is highly
> appreciated.
>
> Regards
> Amitesh
> --
> Regards,
> Amitesh
> Sent from Gmail Mobile
> (Please ignore typos)
>
-- 
Regards
Amitesh