StandardTermsDictReader.java

Robert Muir Sun, 22 Nov 2009 13:07:28 -0800

I guess this is where I point out that Unicode and Java are optimized for
UTF-16 processing. So while I agree with byte[] being available in
places like this for flex indexing, I'm already nervous about seeing code and
optimizations that only work well with Latin-1 and are very slow or buggy
for anything else.
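
To make the concern concrete, here is a minimal sketch (not code from the
flex branch; the class and method names are hypothetical) of how char[]-style
processing that looks fine for Latin-1 text goes wrong on a supplementary
character. It lowercases a term once per UTF-16 char and once per code
point, using only standard java.lang methods:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
public class CodePointDemo {

    // Lowercases one UTF-16 char at a time. Surrogate halves have no case
    // mapping of their own, so supplementary characters pass through unchanged.
    static String lowerByChar(String s) {
        char[] buf = s.toCharArray();
        for (int i = 0; i < buf.length; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return new String(buf);
    }

    // Lowercases one code point at a time, stepping over surrogate pairs.
    static String lowerByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'A' followed by U+10400 (DESERET CAPITAL LETTER LONG I),
        // which is encoded as the surrogate pair D801 DC00 in UTF-16.
        String term = "A\uD801\uDC00";

        System.out.println("UTF-16 chars: " + term.length());
        System.out.println("code points:  " + term.codePointCount(0, term.length()));
        System.out.println("UTF-8 bytes:  " + term.getBytes(StandardCharsets.UTF_8).length);

        // Only the code-point version maps U+10400 to its lowercase U+10428.
        System.out.println("per-char lowercased U+10400?       "
                + lowerByChar(term).equals("a\uD801\uDC28"));
        System.out.println("per-code-point lowercased U+10400? "
                + lowerByCodePoint(term).equals("a\uD801\uDC28"));
    }
}
```

The per-char loop is correct for anything in the Basic Multilingual Plane,
which is why this kind of bug tends to go unnoticed until non-BMP text shows
up. Doing the same work directly on UTF-8 byte[] would mean handling one to
four bytes per character instead of one or two chars.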


On Sun, Nov 22, 2009 at 3:58 PM, Michael McCandless <[email protected]> wrote:

> On Sun, Nov 22, 2009 at 3:52 PM, Robert Muir <[email protected]> wrote:
> >
> > On Sun, Nov 22, 2009 at 3:50 PM, Michael McCandless
> > <[email protected]> wrote:
> >>
> >> Yeah I think there will be lots of optimizing we can do, after flex lands.
> >>
> >> Maybe stick w/ String for now?  But open an issue, today, to remind us
> >> to cutover to char[] post-flex?
> >
> > ok, i'll create one.
>
> Thanks.
>
> >> Doing all processing in UTF8 is tantalizing too ;)  This would mean no
> >> conversion of the terms data on iterating from the terms dict...
> >
> > let's please not go this route :) it's gonna be enough trouble fixing the
> > char[]-based code for Unicode 4, forget about byte[]
>
> I'll defer to you ;)
>
> Mike


-- 
Robert Muir
[email protected]

Re: svn commit: r883088 - in /lucene/java/branches/flex_1458/src/java/org/apache/lucene/index: TermRef.java codecs/standard/StandardTermsDictReader.java
