Wow, another issue caught by random testing!

On Fri, May 14, 2010 at 1:42 AM, Robert Muir <[email protected]> wrote:
> the problem is a logic bug (i.e. i have no clue how to really fix it
> except to switch over to a UTF-8 sort order).
>
> in converting automaton to utf-8/32, and trying to emulate the utf-16
> term dictionary order, the byte transition ranges (although sorted in
> utf-16 order) are themselves in utf-8/32 order: e.g. a byte range of
> 0xe0-0xef is problematic during enumeration since the 0xee-0xef
> component should be "sorted last" in utf-16 order.
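For readers following along, the ordering mismatch described above can be reproduced with plain JDK calls. This is only an illustrative sketch, not code from Lucene: the class and method names are made up for the example, and it assumes Java 9+ for Arrays.compareUnsigned. It compares U+E000 (UTF-8 lead byte 0xee) with U+10000 (UTF-8 lead byte 0xf0, encoded in UTF-16 as the surrogate pair D800 DC00).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Lucene.
final class SortOrderDemo {

    // Compare two strings by their UTF-8 bytes, treating each byte as unsigned.
    static int compareUtf8(String a, String b) {
        return Arrays.compareUnsigned(
                a.getBytes(StandardCharsets.UTF_8),
                b.getBytes(StandardCharsets.UTF_8));
    }

    // String.compareTo compares UTF-16 code units, so surrogate pairs
    // (0xD800-0xDFFF) sort before code units in 0xE000-0xFFFF.
    static int compareUtf16(String a, String b) {
        return a.compareTo(b);
    }

    public static void main(String[] args) {
        String bmpHigh = "\uE000";                                      // UTF-8: EE 80 80
        String supplementary = new String(Character.toChars(0x10000)); // UTF-8: F0 90 80 80

        // UTF-8 byte order: U+E000 sorts before U+10000 (0xEE < 0xF0).
        System.out.println(compareUtf8(bmpHigh, supplementary) < 0);   // prints "true"

        // UTF-16 code unit order: U+E000 sorts after U+10000 (0xE000 > 0xD800).
        System.out.println(compareUtf16(bmpHigh, supplementary) > 0);  // prints "true"
    }
}
```

The two comparisons disagree, which is why a single byte range such as 0xe0-0xef cannot be walked in one pass when the term dictionary is in UTF-16 order: the 0xee-0xef part belongs after the 0xf0-0xf4 lead bytes.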

Ugh.  I suppose we could forcefully split such edges?  (We'd have to
fix reduce to not consolidate them).

Or just cut over to UTF-8 order for trunk.
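To make the edge-splitting idea concrete, here is a rough sketch of what forcing a split at the 0xee boundary might look like for a single lead-byte range. It is an assumption-laden illustration only: EdgeSplitSketch and split are hypothetical names, the ranges are plain int pairs rather than real automaton transitions, and it says nothing about how reduce would need to change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not Lucene's automaton API.
final class EdgeSplitSketch {

    // UTF-8 lead bytes 0xEE-0xEF encode U+E000-U+FFFF, which sort after
    // surrogate pairs in UTF-16 code unit order.
    static final int BMP_HIGH_MIN = 0xEE;
    static final int BMP_HIGH_MAX = 0xEF;

    // Split the inclusive byte range [min, max] so that the 0xEE-0xEF
    // portion, if present, becomes its own range. Assumes min <= max.
    static List<int[]> split(int min, int max) {
        List<int[]> out = new ArrayList<>();
        if (min < BMP_HIGH_MIN) {
            // Portion below 0xEE.
            out.add(new int[] {min, Math.min(max, BMP_HIGH_MIN - 1)});
        }
        int lo = Math.max(min, BMP_HIGH_MIN);
        int hi = Math.min(max, BMP_HIGH_MAX);
        if (lo <= hi) {
            // Portion inside 0xEE-0xEF.
            out.add(new int[] {lo, hi});
        }
        if (max > BMP_HIGH_MAX) {
            // Portion above 0xEF.
            out.add(new int[] {Math.max(min, BMP_HIGH_MAX + 1), max});
        }
        return out;
    }
}
```

With this sketch, split(0xE0, 0xEF) yields [0xE0, 0xED] and [0xEE, 0xEF], so an enumerator could visit the second piece last. A range that does not touch 0xEE-0xEF comes back unchanged.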

> i know a workaround until we switch over, but it's gonna cause wasted
> seeks at the least (it's just wrong).

This is the FIXME you committed, right?  I.e., always seek...

Mike
