On 11/2/06, Nadav Har'El <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 01, 2006, Yonik Seeley wrote about "Re: Indexing floating point number":
> > >   longer strings than Solr's NumberTools. Moving to base 100 or even 256
> > >   (as I suggest in the comments) can eliminate this difference.
> >
> > Or higher, depending on what you are optimizing for.
> > If you are going to hold String instances in memory (like the
> > FieldCache does) it's probably better to use more of those 16 bit
> > chars.
>
> This is a good point. I was thinking byte arrays, but indeed terms are
> Strings, not byte arrays.

For now :-)
I think byte arrays may be in Lucene's future:
http://www.nabble.com/storing-term-text-internally-as-byte-array-and-bytecount-as-prefix%2C-etc.-tf1540279.html#a4220593

And if we kept UTF-8 in byte arrays, the next logical question would
be, why not just allow binary in there too :-)
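
To make the "use more of those 16 bit chars" idea concrete, here is a
minimal sketch, not Solr's NumberTools and not anything that exists in
Lucene. The class name SortableLong and the choice of 15 bits per char
are assumptions for illustration; 15 bits keeps every char below 0x8000
and so clear of the surrogate range. A long becomes a fixed-width
5-char String whose String.compareTo order matches numeric order.

```java
// Hypothetical helper, for illustration only.
final class SortableLong {
    private static final int BITS = 15;               // payload bits per char (assumed)
    private static final int CHARS = 5;               // ceil(64 / 15)
    private static final int MASK = (1 << BITS) - 1;  // 0x7FFF

    // Encode a signed long as a fixed-width, order-preserving String.
    static String encode(long value) {
        // Flip the sign bit so that unsigned order equals signed order.
        long u = value ^ Long.MIN_VALUE;
        char[] out = new char[CHARS];
        // Fill from the least significant digit backwards (big-endian digits).
        for (int i = CHARS - 1; i >= 0; i--) {
            out[i] = (char) (u & MASK);
            u >>>= BITS;
        }
        return new String(out);
    }

    // Inverse of encode().
    static long decode(String s) {
        long u = 0;
        for (int i = 0; i < CHARS; i++) {
            u = (u << BITS) | s.charAt(i);
        }
        return u ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, -5L, 0L, 3L, Long.MAX_VALUE};
        for (int i = 0; i < samples.length; i++) {
            String term = encode(samples[i]);
            // Round trip must be lossless.
            if (decode(term) != samples[i]) throw new AssertionError("round trip");
            // Term order must follow numeric order.
            if (i > 0 && encode(samples[i - 1]).compareTo(term) >= 0) {
                throw new AssertionError("ordering");
            }
        }
        System.out.println("all checks passed");
    }
}
```

The trade-off goes the other way once terms are UTF-8 bytes: a char in
the range 0x0800 to 0x7FFF takes three bytes in UTF-8, so a packing
tuned for in-memory Strings is not automatically the most compact one
on disk or in a byte array.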

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
