I'm not sure you can easily marry a generated JFlex grammar with a runtime-provided list of protected words. I took a different approach: my tokenizer emits punctuation as separate tokens, and a TokenFilter later either glues them to the adjacent text tokens or drops them from the stream.
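A rough sketch of that two-stage idea, in plain Java with no Lucene classes so it runs on its own. Everything here (ProtectedWordGlue, Tok, tokenize, glue, maxParts) is a made-up name for illustration and not the actual tokenizer or filter I use. In a real analyzer the second stage would live in a TokenFilter; the sketch assumes tokens are glued only when their offsets touch, with no whitespace between them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ProtectedWordGlue {

    /** Hypothetical stand-in for a Lucene token: text plus character offsets. */
    public static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        boolean isPunctuation() {
            return text.length() == 1 && !Character.isLetterOrDigit(text.charAt(0));
        }
    }

    /** Stage 1: emit letter/digit runs and single punctuation characters as tokens. */
    public static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetterOrDigit(c)) {
                int j = i;
                while (j < input.length() && Character.isLetterOrDigit(input.charAt(j))) {
                    j++;
                }
                out.add(new Tok(input.substring(i, j), i, j));
                i = j;
            } else {
                out.add(new Tok(String.valueOf(c), i, i + 1));
                i++;
            }
        }
        return out;
    }

    /**
     * Stage 2: glue up to maxParts touching tokens when their concatenation is a
     * protected word (longest match wins); drop punctuation that was not glued.
     * protectedWords is expected to be lower-case.
     */
    public static List<String> glue(List<Tok> tokens, Set<String> protectedWords, int maxParts) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int bestEnd = -1;
            String bestText = null;
            StringBuilder sb = new StringBuilder();
            int limit = Math.min(tokens.size(), i + maxParts);
            for (int j = i; j < limit; j++) {
                // Stop at a gap (whitespace) between consecutive tokens.
                if (j > i && tokens.get(j).start != tokens.get(j - 1).end) {
                    break;
                }
                sb.append(tokens.get(j).text);
                if (protectedWords.contains(sb.toString().toLowerCase(Locale.ROOT))) {
                    bestEnd = j;
                    bestText = sb.toString();
                }
            }
            if (bestEnd >= 0) {
                out.add(bestText);
                i = bestEnd + 1;
            } else {
                if (!tokens.get(i).isPunctuation()) {
                    out.add(tokens.get(i).text);
                }
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("c++", "c#", ".net"));
        List<Tok> tokens = tokenize("I use c++, C# and .net daily.");
        System.out.println(glue(tokens, words, 3));
        // prints [I, use, c++, C#, .net, daily]
    }
}
```

The protected list is an ordinary Set handed in at runtime, so the grammar (here, the trivial tokenize method) never has to know about it. The sketch also skips the rest of the StandardTokenizer logic (e-mail addresses, acronyms, and so on), which a real implementation would still need.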

On Wed, Jun 3, 2009 at 20:10, Grant Ingersoll <gsing...@apache.org> wrote:
> You'd have to modify the JFlex grammar. I'd suggest adding in a generic
> "protected words" approach whereby you can pass in a list of protected
> words.
>
> This would be a nice patch/improvement.
>
> -Grant
>
> On Jun 3, 2009, at 4:07 AM, ami dudu wrote:
>
>> Hi, I'm using a StandardTokenizer which do great job for me but i need to
>> enhance it somehow to consider words like "c++" "c#", ".net" as is and not
>> tokenized it into "c" or "net".
>> I know that there are other tokenizers such as KeywordTokenizer and
>> WhitespaceTokenizer but they do not include the StandardTokenizer logic.
>> Any ideas on what is the best way to add this enhancement?
>>
>> Thanks,
>> Amid
>> --
>> View this message in context:
>> http://www.nabble.com/Enhance-StandardTokenizer-to-support-words-which-will-not-be-tokenized-tp23849495p23849495.html
>> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785