Enhance StandardTokenizer to support words which will not be tokenized

ami dudu Wed, 03 Jun 2009 04:07:52 -0700

Hi, I'm using StandardTokenizer, which does a great job for me, but I need to
enhance it so that words like "c++", "c#", and ".net" are kept as-is and not
tokenized into "c" or "net".
I know that there are other tokenizers, such as KeywordTokenizer and
WhitespaceTokenizer, but they do not include the StandardTokenizer logic.
Any ideas on the best way to add this enhancement?
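
To make the desired behavior concrete, here is a rough sketch in plain Java.
It deliberately uses no Lucene API: the class name ProtectedTermTokenizer and
the hard-coded term list are hypothetical, and the regular expression is only
a crude stand-in for the real StandardTokenizer grammar. It shows the idea of
trying the protected terms first and falling back to ordinary word splitting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
public class ProtectedTermTokenizer {

    // First alternative: protected terms, which must not be glued to a
    // letter or digit on either side. Second alternative: plain
    // alphanumeric runs (a simplification of StandardTokenizer's rules).
    private static final Pattern TOKEN = Pattern.compile(
        "(?<![A-Za-z0-9])(?:c\\+\\+|c#|\\.net)(?![A-Za-z0-9])|[A-Za-z0-9]+",
        Pattern.CASE_INSENSITIVE);

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            // Lower-case every token so "C++" and "c++" end up identical.
            tokens.add(m.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [i, know, c++, c#, .net, well]
        System.out.println(tokenize("I know C++, C# and .NET well"));
    }
}
```

With this sketch, "c++", "c#" and ".net" survive as single tokens, while text
such as "abc++" still yields only "abc". What I'm looking for is the
equivalent behavior inside (or on top of) StandardTokenizer itself.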


Thanks,
Amid
-- 
View this message in context: 
http://www.nabble.com/Enhance-StandardTokenizer-to-support-words-which-will-not-be-tokenized-tp23849495p23849495.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


