Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Michael McCandless
Wow another issue caught by random testing! On Fri, May 14, 2010 at 1:42 AM, Robert Muir rcm...@gmail.com wrote: the problem is a logic bug (e.g. i have no clue how to really fix except to switch over to a UTF-8 sort order). in converting automaton to utf-8/32, and trying to emulate the

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Robert Muir
On Fri, May 14, 2010 at 5:14 AM, Michael McCandless luc...@mikemccandless.com wrote: Or just cutover to UTF8 order for trunk. I would really prefer we go this route, instead of trying to do any hacks at this point! This is the FIXME you committed, right? I.e., always seek... Yeah, I can't even

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Yonik Seeley
On Fri, May 14, 2010 at 7:29 AM, Robert Muir rcm...@gmail.com wrote: On Fri, May 14, 2010 at 5:14 AM, Michael McCandless luc...@mikemccandless.com wrote: Or just cutover to UTF8 order for trunk. I would really prefer we go this route, instead of trying to do any hacks at this point! Sounds

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Robert Muir
On Fri, May 14, 2010 at 10:59 AM, Yonik Seeley yo...@lucidimagination.com wrote: So it seems like the biggest issue we might have in cutting over would be the field cache and sorting? Instead of using String.compareTo, we need one that compares as UTF-32 (or longer term, don't even create
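
To illustrate the point about String.compareTo: it compares UTF-16 code units, so a supplementary character (encoded as a surrogate pair starting at 0xD800) sorts before BMP characters in the range U+E000 to U+FFFF, even though its code point is larger. A minimal sketch of a comparator that compares by code point (UTF-32 order) instead might look like the following. The class name `CodePointOrder` and the constant `CODE_POINT_ORDER` are made up for this example and are not Lucene APIs.

```java
import java.util.Comparator;

public class CodePointOrder {
    /**
     * Hypothetical comparator (not part of Lucene): orders strings by
     * Unicode code point instead of by UTF-16 code unit.
     */
    public static final Comparator<String> CODE_POINT_ORDER = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            // Decode one full code point from each string (a surrogate
            // pair is read as a single value above 0xFFFF).
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    public static void main(String[] args) {
        String bmp = "\uFFFD";                                        // U+FFFD
        String supplementary = new String(Character.toChars(0x10000)); // U+10000

        // UTF-16 code unit order: 0xFFFD > 0xD800, so bmp sorts after.
        System.out.println("compareTo sign:       "
                + Integer.signum(bmp.compareTo(supplementary)));
        // Code point order: 0xFFFD < 0x10000, so bmp sorts before.
        System.out.println("code point order sign: "
                + Integer.signum(CODE_POINT_ORDER.compare(bmp, supplementary)));
    }
}
```

The two orders only disagree when supplementary characters are compared against BMP characters at or above U+E000, which is presumably why random testing is what surfaces this kind of bug.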

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Yonik Seeley
On Fri, May 14, 2010 at 11:21 AM, Michael McCandless luc...@mikemccandless.com wrote: On Fri, May 14, 2010 at 10:59 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, May 14, 2010 at 7:29 AM, Robert Muir rcm...@gmail.com wrote: On Fri, May 14, 2010 at 5:14 AM, Michael McCandless

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Robert Muir
On Fri, May 14, 2010 at 11:21 AM, Michael McCandless luc...@mikemccandless.com wrote: Actually, I think on changing to unicode codepoint order, the StringIndex returned by FieldCache would in fact be sorted in codepoint order (even though it's still a String[]), because it just enums the terms

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Yonik Seeley
On Fri, May 14, 2010 at 11:23 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, May 14, 2010 at 11:21 AM, Michael McCandless luc...@mikemccandless.com wrote: On Fri, May 14, 2010 at 10:59 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, May 14, 2010 at 7:29 AM, Robert Muir

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-14 Thread Michael McCandless
On Fri, May 14, 2010 at 11:23 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, May 14, 2010 at 11:21 AM, Michael McCandless luc...@mikemccandless.com wrote: On Fri, May 14, 2010 at 10:59 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, May 14, 2010 at 7:29 AM, Robert Muir

Build failed in Hudson: Lucene-trunk #1187

2010-05-13 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Lucene-trunk/1187/changes Changes: [mikemccand] LUCENE-2393: add total TF tracking to HighFreqTerms tool [mikemccand] LUCENE-2459: fix FilterIndexReader to (by default) emulate flex API on top of pre-flex API [mikemccand] LUCENE-2449: fix DBLRU

Re: Build failed in Hudson: Lucene-trunk #1187

2010-05-13 Thread Robert Muir
The problem is a logic bug (i.e., I have no clue how to really fix it except to switch over to a UTF-8 sort order). In converting the automaton to UTF-8/32, and trying to emulate the UTF-16 term dictionary order, the byte transition ranges (although sorted in UTF-16 order) are themselves in UTF-8/32
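
As a rough illustration of the mismatch between the two orders discussed above, the sketch below sorts the same three terms once in UTF-16 code unit order and once by their UTF-8 bytes (compared as unsigned values). It is not the automaton code from the thread; the class `Utf8VsUtf16Order` and its methods are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8VsUtf16Order {
    /** Compares two strings by their UTF-8 encodings, treating bytes as unsigned. */
    public static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Mask to 0..255 so bytes >= 0x80 are not treated as negative.
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    /** Returns a copy of the terms sorted in UTF-16 code unit order. */
    public static String[] sortUtf16(String[] terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy); // String.compareTo compares UTF-16 code units
        return copy;
    }

    /** Returns a copy of the terms sorted by unsigned UTF-8 byte order. */
    public static String[] sortUtf8(String[] terms) {
        String[] copy = terms.clone();
        Arrays.sort(copy, Utf8VsUtf16Order::compareUtf8);
        return copy;
    }

    public static void main(String[] args) {
        // U+10000 is the surrogate pair D800 DC00 in UTF-16 and the
        // bytes F0 90 80 80 in UTF-8; U+FFFD is EF BF BD in UTF-8.
        String[] terms = { "\uD800\uDC00", "\uFFFD", "a" };
        for (String[] order : new String[][] { sortUtf16(terms), sortUtf8(terms) }) {
            StringBuilder sb = new StringBuilder();
            for (String t : order) {
                sb.append(String.format("U+%04X ", t.codePointAt(0)));
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

For well-formed input, unsigned UTF-8 byte order is the same as code point (UTF-32) order, so a term dictionary sorted in UTF-16 order and byte ranges laid out in UTF-8/32 order disagree on exactly these terms.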