Hi,

I have used StandardAnalyzer in my code and it works fine. One challenge 
I face is that this analyzer, by default, tokenizes on some special 
characters such as the hyphen, in addition to the SPACE character.

I want to tokenize only on the SPACE character. Could you please suggest how I 
can achieve this?

I found the example below when I googled for it. What I want to use is the 
WhitespaceTokenizer so that the data is not manipulated in any way. I 
understand that in this case a search for "mechanisms" won't return results, 
because the indexed token keeps the period (.) at the end. I then want to 
address this by using wildcard searches.

Data: 1097-0215 (i.v) product-123 anti-virus, we investigated the mechanisms. 
2266-73 In the present study

Tokens generated with StandardTokenizer:
[1097-0215] [i.v] [product-123] [anti] [virus] [we] [investigated] [the] 
[mechanisms] [2266-73] [In] [the] [present] [study]

Tokens generated with WhitespaceTokenizer:
[1097-0215] [(i.v)] [product-123] [anti-virus,] [we] [investigated] [the] 
[mechanisms.] [2266-73] [In] [the] [present] [study]

Note: I have tried the WhitespaceAnalyzer, which by default tokenizes ONLY on 
whitespace, but my wildcard searches didn't work as expected with it, whereas 
wildcard searches worked fine with StandardAnalyzer.
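
To make the intended behaviour concrete, here is a minimal plain-Java sketch 
of what I am after. It does not use Lucene at all: the class 
WhitespaceTokenDemo and its methods are hypothetical names made up for this 
illustration, and the prefix check only approximates what a trailing-wildcard 
query such as mechanisms* would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: plain Java, not Lucene's WhitespaceTokenizer.
class WhitespaceTokenDemo {

    // Split on runs of whitespace only; punctuation stays attached to the tokens.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Exact match against whole tokens, roughly what a plain term search does.
    static boolean matchesExact(List<String> tokens, String term) {
        return tokens.contains(term);
    }

    // Approximation of a trailing-wildcard search ("prefix*"): true if any
    // token starts with the given prefix. The comparison is case-sensitive.
    static boolean matchesPrefix(List<String> tokens, String prefix) {
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String data = "1097-0215 (i.v) product-123 anti-virus, we investigated "
                + "the mechanisms. 2266-73 In the present study";
        List<String> tokens = tokenize(data);
        System.out.println(tokens);
        // The exact term misses because the stored token is "mechanisms."
        System.out.println(matchesExact(tokens, "mechanisms"));
        // The prefix form finds it.
        System.out.println(matchesPrefix(tokens, "mechanisms"));
    }
}
```

In this sketch the exact lookup for "mechanisms" fails while the prefix lookup 
succeeds, which is the behaviour I expected from wildcard searches. A trailing 
wildcard alone would still miss tokens with leading punctuation such as 
"(i.v)", and the match is case-sensitive, so "Present" would not find 
"present".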

Please provide your inputs.

Regards,
Raghu

