First of all, thanks for taking the time to do all of this debugging!

My guess is this might be related to https://issues.apache.org/jira/browse/LUCENE-2565

Does it fail if you apply Mike's patch?

On Mon, Jul 26, 2010 at 3:40 PM, Shai Erera <ser...@gmail.com> wrote:

> I don't know what was the thing w/ the strings generated before, but now I
> ran the test again w/ the same seed and it generates the same strings. So at
> least it seems there are no problems w/ the Random class :).
>
> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
> ideas why? What does the test check anyway?
>
> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
> failure comes from
>
> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>
> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") is
> [108, 69]. It just ignores the middle character. Perhaps that's why the test
> fails?
>
> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>
> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
> passes.
>
> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
> result :). I'll dig some more into this character, and why the IBM and SUN
> JVMs return different byte[] representations for the same sequence of
> characters. If you already spot the problem, please let me know.
>
> BTW, the test calls _TestUtil.getRandomMultiplier on every loop iteration,
> which goes and checks a system property. Perhaps we can extract it to a
> variable, or include a static constant in LuceneTestCase(J4) or something?
>
> Shai
>
> On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <rcm...@gmail.com> wrote:
>
>> maybe there is a bug in ibm's random generator :)
>>
>> On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
>>
>>> That's VERY spooky that w/ a fixed seed you see different random
>>> regexps being made.
>>>
>>> Mike
>>>
>>> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <ser...@gmail.com> wrote:
>>> > Ok I've dug deeper into the test. I set the random seed to
>>> > -9029631602016965389L in setUp(), and discovered that on the 4th iteration
>>> > it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
>>> > different strings every time I run the test, even though it uses the same
>>> > Random object w/ the same seed ...
>>> >
>>> > Anyway, one of the regexes that failed was this "l.E" (w/o the quotes) and I
>>> > think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
>>> >
>>> > Shai
>>> >
>>> > On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <rcm...@gmail.com> wrote:
>>> >>
>>> >> sounds nasty... it's good you are running the tests with this different
>>> >> jvm...
>>> >>
>>> >> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <ser...@gmail.com> wrote:
>>> >>>
>>> >>> Tried to run it w/ SUN JRE6 and it succeeds! I've tried several times
>>> >>> and it succeeds every time. However, when I revert back to IBM's, it fails
>>> >>> immediately.
>>> >>>
>>> >>> I can help w/ the debug, if you give me a hint where to look :).
>>> >>>
>>> >>> Shai
>>> >>>
>>> >>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <ser...@gmail.com> wrote:
>>> >>>>
>>> >>>> Sorry for the delayed response.
>>> >>>>
>>> >>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>> >>>> fails (amazing!), w/ different seeds.
>>> >>>> More seeds that fail:
>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>>> >>>>
>>> >>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>> >>>>
>>> >>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>> >>>> open a separate one?
>>> >>>>
>>> >>>> Shai
>>> >>>>
>>> >>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>>> >>>>>
>>> >>>>> On a more general note...
>>> >>>>>
>>> >>>>> Any time any of you out there hit an "odd" test failure, please please
>>> >>>>> please do just what Shai did: take it to the dev list!
>>> >>>>>
>>> >>>>> Think of Lucene's unit tests like SETI :) We are desperately seeking
>>> >>>>> bugs, and you and your machine may just be lucky enough to find one...
>>> >>>>> go forth and buy expensive new power hungry computers just so you can
>>> >>>>> run the random tests over and over, seeking the bugs!
>>> >>>>>
>>> >>>>> But be sure to include that random seed when you do hit a failure...
>>> >>>>>
>>> >>>>> Mike
>>> >>>>>
>>> >>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <rcm...@gmail.com> wrote:
>>> >>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>> >>>>> > IBM JVM or another environment that might help us figure it out?
>>> >>>>> >
>>> >>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
>>> >>>>> >>
>>> >>>>> >> Hmmm this means a bug is lurking. This is the power of random testing
>>> >>>>> >> (that every time we all run tests, we're testing different "paths"
>>> >>>>> >> through the code)....
>>> >>>>> >>
>>> >>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would cause
>>> >>>>> >> this!
>>> >>>>> >>
>>> >>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>> >>>>> >> which is odd. I'll run a stress test to see if I can tickle the
>>> >>>>> >> bug... can you open a Jira issue so we don't lose track?
>>> >>>>> >>
>>> >>>>> >> Mike
>>> >>>>> >>
>>> >>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <ser...@gmail.com> wrote:
>>> >>>>> >> > Hi
>>> >>>>> >> >
>>> >>>>> >> > I was running tests on trunk (after merging the changes from
>>> >>>>> >> > LUCENE-2537) and received this error message:
>>> >>>>> >> >
>>> >>>>> >> > expected:<true> but was:<false>
>>> >>>>> >> >
>>> >>>>> >> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>> >>>>> >> >     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>> >>>>> >> >     at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>> >>>>> >> >     at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>> >>>>> >> >
>>> >>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
>>> >>>>> >> >
>>> >>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>> >>>>> >> > before?
>>> >>>>> >> >
>>> >>>>> >> > Shai
>>> >>>>> >> >
>>> >>>>> >>
>>> >>>>> >> ---------------------------------------------------------------------
>>> >>>>> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> >>>>> >> For additional commands, e-mail: dev-h...@lucene.apache.org
>>> >>>>> >
>>> >>>>> > --
>>> >>>>> > Robert Muir
>>> >>>>> > rcm...@gmail.com

--
Robert Muir
rcm...@gmail.com
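
A note on the byte arrays reported in the thread: \uD9FF falls in the high-surrogate range (U+D800 to U+DBFF), and in "\u006C\uD9FF\u0045" it is not followed by a low surrogate, so the string has no valid UTF-8 encoding. The Javadoc for String.getBytes(String charsetName) leaves the behavior unspecified when the string cannot be encoded, which is one plausible reason two JVMs could disagree here. The sketch below is illustrative only and is not taken from the Lucene test: the class and method names are hypothetical, and it uses java.nio.charset.StandardCharsets, which needs Java 7 or later (newer than the JVMs discussed above). It detects an unpaired surrogate and encodes with a strict CharsetEncoder that reports malformed input instead of silently replacing or dropping it.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Lucene.
final class LoneSurrogateDemo {

    // Returns true if s contains a high surrogate without a following low
    // surrogate, or a low surrogate without a preceding high surrogate.
    static boolean hasUnpairedSurrogate(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i++; // valid pair, skip the low half
                } else {
                    return true;
                }
            } else if (Character.isLowSurrogate(c)) {
                return true;
            }
        }
        return false;
    }

    // Encodes to UTF-8 and throws instead of substituting or dropping
    // characters that cannot be encoded.
    static byte[] encodeStrict(String s) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) {
        // Same string as in the thread: 'l', an unpaired high surrogate, 'E'.
        String s = "\u006C\uD9FF\u0045";
        System.out.println("unpaired surrogate: " + hasUnpairedSurrogate(s));
        // Lenient encoding; the thread reports that the result differed by JVM.
        System.out.println("lenient: " + Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
        try {
            System.out.println("strict: " + Arrays.toString(encodeStrict(s)));
        } catch (CharacterCodingException e) {
            System.out.println("strict encoder rejected the input: " + e);
        }
    }
}
```

A test that feeds random strings through getBytes could use a check like hasUnpairedSurrogate to skip or separately handle such inputs, so that its result does not depend on how a particular JVM treats malformed UTF-16.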