On Nov 19, 2007 7:02 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> Yonik Seeley wrote:
> > 1) If we are deprecating some methods like String termText(), how
> > about at the same time deprecating "String type"?  If we want
> > lightweight per-token metadata for communication between filters, an
> > int or a long used as a bitvector (32 or 64 independent boolean vars
> > per token) would be much more useful than a single String.
>
> There are tokenizers that use the type string, e.g., StandardFilter &
> similar things in Nutch.  How would you replace such uses?  Add a bit
> for each token type?  Is that really that much more useful?

It is, given that it enables a token to have more than one type at once.
The benefit is probably relatively minor, since few people would use
it, and I wouldn't have brought it up except that it could piggy-back
on the other recent changes to Token.
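
To make the idea concrete, here is a rough sketch of what per-token bit
flags could look like. This is not the existing Token API: the class
name TokenFlags, the constants, and the set/clear/has methods are all
hypothetical, chosen only to illustrate one bit per type.

```java
// Hypothetical sketch only; none of these names exist in Lucene's Token.
public class TokenFlags {
    // One bit per type, so a token can carry several types at once.
    // The type names here are illustrative assumptions.
    public static final int ALPHANUM = 1 << 0;
    public static final int ACRONYM  = 1 << 1;
    public static final int HOST     = 1 << 2;

    // 32 independent boolean flags packed into a single int.
    private int flags;

    // Turn on every bit present in mask.
    public void set(int mask) {
        flags |= mask;
    }

    // Turn off every bit present in mask.
    public void clear(int mask) {
        flags &= ~mask;
    }

    // True only if all bits in mask are set.
    public boolean has(int mask) {
        return (flags & mask) == mask;
    }

    public int getFlags() {
        return flags;
    }
}
```

A filter could then mark a token as both ACRONYM and HOST with two
set() calls, which a single type String cannot express without some
ad hoc encoding.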

-Yonik
