Yonik Seeley wrote:
> 1) If we are deprecating some methods like String termText(), how about at the same time deprecating "String type"? If we want lightweight per-token metadata for communication between filters, an int or a long used as a bitvector (32 or 64 independent boolean vars per token) would be much more useful than a single String.
There are analysis components that use the type string, e.g., StandardFilter and similar things in Nutch. How would you replace such uses? Add a bit for each token type? Is that really that much more useful?
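To make the comparison concrete, here is a minimal sketch of the two approaches side by side. It does not use Lucene's classes: the flag constants, the class name TokenFlagsSketch, and both helper methods are hypothetical, and the "<ACRONYM>" string is only an illustrative type label. It shows a filter-style check that strips dots from acronyms, once keyed on a type string and once keyed on a bit in an int.

```java
public class TokenFlagsSketch {
    // Hypothetical flag bits, one per token type (assumption: not a Lucene API).
    public static final int ALPHANUM   = 1 << 0;
    public static final int APOSTROPHE = 1 << 1;
    public static final int ACRONYM    = 1 << 2;
    public static final int HOST       = 1 << 3;

    // True if the given bit is set in the token's flag word.
    public static boolean hasFlag(int flags, int flag) {
        return (flags & flag) != 0;
    }

    // Type-string approach: a token carries exactly one label.
    public static String stripByType(String text, String type) {
        return "<ACRONYM>".equals(type) ? text.replace(".", "") : text;
    }

    // Bitvector approach: a token can carry several independent flags at once.
    public static String stripByFlags(String text, int flags) {
        return hasFlag(flags, ACRONYM) ? text.replace(".", "") : text;
    }

    public static void main(String[] args) {
        System.out.println(stripByType("I.B.M.", "<ACRONYM>"));          // IBM
        System.out.println(stripByFlags("I.B.M.", ACRONYM | ALPHANUM));  // IBM
        System.out.println(stripByFlags("example.com", HOST));           // example.com
    }
}
```

In this sketch each existing token type costs one bit, so the replacement is mechanical, but the int only pays off over the String when a token needs more than one flag at a time.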
Doug