I've read quickly through the analyser's code, but I'm in no way a Lucene master. If I understood your statement correctly, you are saying that the number of token objects created would be multiplied by roughly 1.5 for each tokeniser the analyser uses. A potential "optimisation" is that the term string could often be reused, since it's immutable as well.
Actually, I was saying that's the absolute worst case. It wouldn't surprise me if the actual effect were only a 10 or 15% increase in object creation during tokenization, not only for the reason you state, but also because there may well be other per-token object creations that we're not seeing.
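
To make the string-reuse point concrete, here is a rough sketch of what an immutable token could look like. The class and method names are made up for illustration; this is not the actual Lucene Token API:

    // Hypothetical immutable token -- illustrative only, not the real
    // org.apache.lucene.analysis.Token API.
    public final class ImmutableToken {
        private final String termText;   // shared by reference, never copied
        private final int startOffset;
        private final int endOffset;

        public ImmutableToken(String termText, int startOffset, int endOffset) {
            this.termText = termText;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        public String termText()  { return termText; }
        public int startOffset()  { return startOffset; }
        public int endOffset()    { return endOffset; }

        // A filter that adjusts offsets but keeps the term can hand the same
        // String to the new token, so the only extra allocation per token is
        // the small wrapper object itself, not the character data.
        public ImmutableToken withOffsets(int start, int end) {
            return new ImmutableToken(termText, start, end);
        }
    }

Since String is itself immutable, sharing it between token instances is safe, which is why the extra allocations would mostly be the small token objects rather than copies of the text.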

Personally, I believe it would be cleaner to make it immutable (I think that's why this thread started), so +1.
Yup.

Immutability -- good.  Mutability just to save a few cycles -- bad.
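
For what it's worth, a filter written against a token like the sketch above would simply return a new instance instead of mutating the one it received -- again with made-up names, building on that sketch:

    // Hypothetical lower-casing filter step, assuming the ImmutableToken sketch above.
    public final class LowerCaseStep {
        public ImmutableToken next(ImmutableToken t) {
            String lowered = t.termText().toLowerCase();
            // When nothing changes, toLowerCase() may return the original String,
            // so in that case not even the character data is duplicated.
            return new ImmutableToken(lowered, t.startOffset(), t.endOffset());
        }
    }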


--
Brian Goetz
Quiotix Corporation
[EMAIL PROTECTED]           Tel: 650-843-1300            Fax: 650-324-8032

http://www.quiotix.com

