What do people think about making this configurable? At the moment it's a constant that can't be altered. I see at least one situation in the field where very long payloads are being added (look, it's special) with a custom tokenizer that subclasses CharTokenizer, and CharTokenizer silently truncates the incoming "word" at that hard-coded limit.
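For anyone without the code in front of them, here is a self-contained sketch of the behavior in question — not Lucene's actual implementation, just an illustration of how a hard-coded maximum (MAX_WORD_LEN, 255 in CharTokenizer as I read it) cuts a long run of token characters off with no way to raise the limit:

```java
// Illustrative sketch only -- not Lucene's CharTokenizer source.
// The point: the cap is a constant, so a subclass can't widen it.
public class TruncationSketch {
    static final int MAX_WORD_LEN = 255; // the hard-coded constant in question

    // Roughly what CharTokenizer does for one token: accumulate token
    // characters, but stop growing the term once it hits MAX_WORD_LEN.
    static String tokenize(String input) {
        StringBuilder term = new StringBuilder();
        for (int i = 0; i < input.length() && term.length() < MAX_WORD_LEN; i++) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                break; // end of this "word"
            }
            term.append(c);
        }
        return term.toString();
    }

    public static void main(String[] args) {
        String longPayload = "x".repeat(300); // one very long "word"
        System.out.println(tokenize(longPayload).length()); // prints 255
    }
}
```

So a 300-character payload comes out as a 255-character term, and there's no constructor argument or protected hook to change that.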
Using KeywordTokenizer can get around this, since it has a constructor that takes a buffer length — but KeywordTokenizer obviously doesn't let you, well, parse tokens. Should I raise a JIRA, or are there good reasons this is hard-coded?

Erick