Standard tokenizer with punctuation output
------------------------------------------
Key: LUCENE-889
URL: https://issues.apache.org/jira/browse/LUCENE-889
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 2.1
Reporter: Karl Wettin
Priority: Trivial
This patch makes the StandardTokenizer emit punctuation tokens (comma, period,
question mark and exclamation point) and makes the StandardFilter remove them
again.
(I needed the punctuation tokens for text classification.)
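
To illustrate the idea, here is a minimal sketch in plain Java. It is not the
attached patch and does not use the Lucene API: the class PunctuationSketch
and its methods tokenize, filterPunctuation and isPunctuation are hypothetical
names made up for this example, and the word-splitting rule (letters and
digits only) is a simplifying assumption, far cruder than the real
StandardTokenizer grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene code and not the patch itself.
public class PunctuationSketch {

    // The four punctuation marks named in the issue description.
    private static final String PUNCTUATION = ",.?!";

    // True if the token is exactly one of the four punctuation marks.
    static boolean isPunctuation(String token) {
        return token.length() == 1 && PUNCTUATION.indexOf(token.charAt(0)) >= 0;
    }

    // Stand-in for the tokenizer stage: emits runs of letters/digits as words
    // and each of the four punctuation marks as a token of its own.
    // All other characters (whitespace, quotes, ...) only separate tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
                continue;
            }
            if (word.length() > 0) {          // a word just ended
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (PUNCTUATION.indexOf(c) >= 0) { // keep punctuation as a token
                tokens.add(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    // Stand-in for the filter stage: drops the punctuation tokens again.
    static List<String> filterPunctuation(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!isPunctuation(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> raw = tokenize("Is this spam? Yes, it is!");
        System.out.println(raw);                    // words and punctuation
        System.out.println(filterPunctuation(raw)); // words only
    }
}
```

In this sketch, a consumer that wants the punctuation (a text classifier, for
example) would read the output of tokenize directly, while a consumer that
applies filterPunctuation sees word tokens only.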