Re: Question about usage of LuceneTestCase

2018-08-27 Thread Tomoko Uchida
> I haven't looked closely into what exactly that "useFactory(null)" call
> does, but it's probably worth getting to the bottom of the failures and,
> *IF* they are tied to some specific dir type or codec, using annotations to
> suppress them -- rather than just eliminating all directory randomization.

Thanks for your advice.

I have not looked into the "useFactory()" method yet (I have not had
enough time for this).
I will take time to examine the details of our failed tests, and try to
find Lucene/Solr test cases around Directories or Codecs to refer
to.

Thanks,
Tomoko
On Tue, Aug 28, 2018 at 2:15 Chris Hostetter wrote:
>
>
> : Current version of Luke supports FS based directory implementations only.
> : (I think it will be better if future versions support non-FS based custom
> : implementations, such as HdfsDirectoryFactory for users who need it.)
> : Disabling the randomization, at least for now, sounds reasonable to me too.
> : I'll try this way.
>
> Be careful with this assumption...
>
> The randomization of directory types isn't just about things like "let's
> try a RAM Dir"; it also includes things like "let's randomize a dir that
> simulates Windows filesystem quirks" -- stuff that would be very handy to
> test with a reusable tool like Luke, where you expect users to run on a
> variety of platforms / filesystems.
>
> I haven't looked closely into what exactly that "useFactory(null)" call
> does, but it's probably worth getting to the bottom of the failures and,
> *IF* they are tied to some specific dir type or codec, using annotations to
> suppress them -- rather than just eliminating all directory randomization.
>
>
> -Hoss
> http://www.lucidworks.com/
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>


-- 
Tomoko Uchida




Re: Question about usage of LuceneTestCase

2018-08-27 Thread Chris Hostetter


: Current version of Luke supports FS based directory implementations only.
: (I think it will be better if future versions support non-FS based custom
: implementations, such as HdfsDirectoryFactory for users who need it.)
: Disabling the randomization, at least for now, sounds reasonable to me too.
: I'll try this way.

Be careful with this assumption...

The randomization of directory types isn't just about things like "let's
try a RAM Dir"; it also includes things like "let's randomize a dir that
simulates Windows filesystem quirks" -- stuff that would be very handy to
test with a reusable tool like Luke, where you expect users to run on a
variety of platforms / filesystems.

I haven't looked closely into what exactly that "useFactory(null)" call
does, but it's probably worth getting to the bottom of the failures and,
*IF* they are tied to some specific dir type or codec, using annotations to
suppress them -- rather than just eliminating all directory randomization.


-Hoss
http://www.lucidworks.com/




Re: Question about usage of LuceneTestCase

2018-08-22 Thread Tomoko Uchida
> You don't really have to figure out exactly what the combinations are,
> just execute the test with the "reproduce with" flags set, cut/paste
> the error message at the root of your local Solr source tree in a
> command prompt.

> ant test  -Dtestcase=CommitsImplTest
> -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895
> -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de
> -Dtests.timezone=Africa/Kigali -Dtests.asserts=true
> -Dtests.file.encoding=UTF-8

Thanks for correcting that. :)

> I doubt there's any real point in exercising Luke on non-FS based
> indexes, so disabling the randomization of the filesystem seems fine.

> @BeforeClass
> public static void beforeTriLevelCompositeIdRoutingTest() throws Exception {
>   useFactory(null); // uses Standard or NRTCaching, FS based anyway.
> }

The current version of Luke supports only FS-based directory implementations.
(I think it would be better if future versions supported non-FS-based custom
implementations, such as HdfsDirectoryFactory, for users who need them.)
Disabling the randomization, at least for now, sounds reasonable to me too.
I'll try it this way.



> It looks to me as if this test is asserting that the segment in an index it
> just created has some attributes, but in fact it does not. Perhaps there is
> a codec that does not store any attributes with its segments, and Luke does
> not expect this, and maybe the codec is being selected randomly by the
> RandomIndexWriter?

Thanks for your investigation! I'll catch up with you.

Regards,
Tomoko

On Thu, Aug 23, 2018 at 6:03 Michael Sokolov wrote:

> It looks to me as if this test is asserting that the segment in an index it
> just created has some attributes, but in fact it does not. Perhaps there is
> a codec that does not store any attributes with its segments, and Luke does
> not expect this, and maybe the codec is being selected randomly by the
> RandomIndexWriter?
>
> On Wed, Aug 22, 2018 at 4:54 PM Michael Sokolov 
> wrote:
>
> > Here's a seed that fails for me consistently in IntelliJ:
> > "FEF692F43FE50191:656E22441676701C" running CommitsImplTest. Warning: I
> > have a bunch of local changes that might have perturbed the randomness so
> > possibly it might not reproduce for others.  I just run the tests, open the
> > "Edit Configurations" dialog, paste in
> > -Dtests.seed=FEF692F43FE50191:656E22441676701C in the VM options box, and
> > then I can get the test to fail every time, it seems
> >
> > On Wed, Aug 22, 2018 at 1:11 PM Erick Erickson 
> > wrote:
> >
> >> bq. My understanding at this point is (though it may be a repeat of your
> >> words,)
> >> first we should find out the combinations behind the failures.
> >> If there are any particular patterns, there could be bugs, so we should
> >> fix
> >> it.
> >>
> >> You don't really have to figure out exactly what the combinations are,
> >> just execute the test with the "reproduce with" flags set, cut/paste
> >> the error message at the root of your local Solr source tree in a
> >> command prompt.
> >>
> >> ant test  -Dtestcase=CommitsImplTest
> >> -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895
> >> -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de
> >> -Dtests.timezone=Africa/Kigali -Dtests.asserts=true
> >> -Dtests.file.encoding=UTF-8
> >>
> >> That should reproduce exactly the same results from random() and
> >> (hopefully) reliably reproduce the problem. Not sure how to mavenize
> >> it, but you shouldn't need to if you have Solr locally. If it fails
> >> every time, you can debug. I've had some luck just defining the
> >> tests.seed in my IDE and running the test there (I use IntelliJ, but
> >> I'm sure Eclipse and Netbeans etc. have an equivalent way to do
> >> things). If just setting the seed as a sysvar in your IDE doesn't do
> >> the trick, you can always define all of them in the IDE.
> >>
> >> Even setting all the sysvars in the IDE doesn't always work. That is
> >> executing the entire test from the command line can consistently fail
> >> but defining all the sysvars in the IDE succeeds. But when it does
> >> fail in the IDE it makes things _much_ easier ;)
> >>
> >> Second question:
> >>
> >> I doubt there's any real point in exercising Luke on non-FS based
> >> indexes, so disabling the randomization of the filesystem seems fine.
> >>
> >> See SolrTestCaseJ4, the "useFactory" method. You can do something like
> >> this in your test:
> >>
> >> @BeforeClass
> >> public static void beforeTriLevelCompositeIdRoutingTest() throws
> >> Exception {
> >>   useFactory(null); // uses Standard or NRTCaching, FS based anyway.
> >> }
> >>
> >> or even:
> >>
> >> useFactory("solr.StandardDirectoryFactory");
> >>
> >> I'm not sure about
> >> useFactory("org.apache.solr.core.HdfsDirectoryFactory");
> >>
> >> Or if you're really adventurous:
> >>
> >> @BeforeClass
> >> public static void beforeTriLevelCompositeIdRoutingTest() throws
> >> Exception {
> >>   switch (random().nextInt(2)) {
> >>

Re: Question about usage of LuceneTestCase

2018-08-22 Thread Michael Sokolov
It looks to me as if this test is asserting that the segment in an index it
just created has some attributes, but in fact it does not. Perhaps there is
a codec that does not store any attributes with its segments, and Luke does
not expect this, and maybe the codec is being selected randomly by the
RandomIndexWriter?
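
If that guess is right, both the display code and its test would need to treat "no attributes" as a valid state rather than an error. Here is a minimal sketch of what that could look like, assuming the attributes arrive as a Map<String, String> (which is what I believe SegmentInfo.getAttributes() hands back). The class name SegmentAttributesView, the describe() helper and the sample "mode" key are hypothetical and are not taken from Luke's code.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

final class SegmentAttributesView {

    /** Renders segment attributes for display; an absent or empty map is a valid input. */
    static String describe(Map<String, String> attributes) {
        if (attributes == null || attributes.isEmpty()) {
            // A codec that stores no attributes ends up here instead of failing.
            return "(no attributes)";
        }
        StringBuilder sb = new StringBuilder();
        // Copy into a TreeMap so the output order is stable whatever map type we were given.
        for (Map.Entry<String, String> e : new TreeMap<String, String>(attributes).entrySet()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical inputs: one segment without attributes, one with a single attribute.
        System.out.println(describe(Collections.<String, String>emptyMap()));
        System.out.println(describe(Collections.singletonMap("mode", "BEST_SPEED")));
    }
}
```

A test written against something like this would assert on the attributes only when the randomly selected codec is known to write them, instead of assuming they are always present.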

On Wed, Aug 22, 2018 at 4:54 PM Michael Sokolov  wrote:

> Here's a seed that fails for me consistently in IntelliJ:
> "FEF692F43FE50191:656E22441676701C" running CommitsImplTest. Warning: I
> have a bunch of local changes that might have perturbed the randomness so
> possibly it might not reproduce for others.  I just run the tests, open the
> "Edit Configurations" dialog, paste in
> -Dtests.seed=FEF692F43FE50191:656E22441676701C in the VM options box, and
> then I can get the test to fail every time, it seems
>
> On Wed, Aug 22, 2018 at 1:11 PM Erick Erickson 
> wrote:
>
>> bq. My understanding at this point is (though it may be a repeat of your
>> words,)
>> first we should find out the combinations behind the failures.
>> If there are any particular patterns, there could be bugs, so we should
>> fix
>> it.
>>
>> You don't really have to figure out exactly what the combinations are,
>> just execute the test with the "reproduce with" flags set, cut/paste
>> the error message at the root of your local Solr source tree in a
>> command prompt.
>>
>> ant test  -Dtestcase=CommitsImplTest
>> -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895
>> -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de
>> -Dtests.timezone=Africa/Kigali -Dtests.asserts=true
>> -Dtests.file.encoding=UTF-8
>>
>> That should reproduce exactly the same results from random() and
>> (hopefully) reliably reproduce the problem. Not sure how to mavenize
>> it, but you shouldn't need to if you have Solr locally. If it fails
>> every time, you can debug. I've had some luck just defining the
>> tests.seed in my IDE and running the test there (I use IntelliJ, but
>> I'm sure Eclipse and Netbeans etc. have an equivalent way to do
>> things). If just setting the seed as a sysvar in your IDE doesn't do
>> the trick, you can always define all of them in the IDE.
>>
>> Even setting all the sysvars in the IDE doesn't always work. That is
>> executing the entire test from the command line can consistently fail
>> but defining all the sysvars in the IDE succeeds. But when it does
>> fail in the IDE it makes things _much_ easier ;)
>>
>> Second question:
>>
>> I doubt there's any real point in exercising Luke on non-FS based
>> indexes, so disabling the randomization of the filesystem seems fine.
>>
>> See SolrTestCaseJ4, the "useFactory" method. You can do something like
>> this in your test:
>>
>> @BeforeClass
>> public static void beforeTriLevelCompositeIdRoutingTest() throws
>> Exception {
>>   useFactory(null); // uses Standard or NRTCaching, FS based anyway.
>> }
>>
>> or even:
>>
>> useFactory("solr.StandardDirectoryFactory");
>>
>> I'm not sure about
>> useFactory("org.apache.solr.core.HdfsDirectoryFactory");
>>
>> Or if you're really adventurous:
>>
>> @BeforeClass
>> public static void beforeTriLevelCompositeIdRoutingTest() throws Exception {
>>   switch (random().nextInt(2)) {
>>     case 0:
>>       useFactory(null); // uses Standard or NRTCaching, FS based anyway.
>>       break;
>>     case 1:
>>       useFactory("org.apache.solr.core.HdfsDirectoryFactory");
>>       break;
>>     // I guess whatever else you wanted...
>>   }
>> }
>>
>>
>> Frankly in this case I'd:
>>
>> 1> see if executing the full reproduce line consistently fails and if so
>> 2> try using the above to disable other filesystems. If that
>> consistently succeeds, consider it done.
>>
>> Since Luke is intended to be used on an existing index I don't see
>> much use in randomizing for edge cases. But that pre-supposes that
>> it's a problem with some of the directory implementations of course...
>>
>> Best,
>> Erick
>>
>> On Wed, Aug 22, 2018 at 8:13 AM, Tomoko Uchida
>>  wrote:
>> > Can I ask one more question?
>> >
>> > 4> If Mike's intuition is right that it's one of the file system
>> > randomizations that occasionally gets hit _and_ you determine that that's
>> > an invalid test case (and for Luke, requiring only the FS-based tests
>> > may be fine), I'm pretty sure you can disable
>> > that randomization for your specific tests.
>> >
>> > As you may know, Luke calls relatively low-level Lucene APIs (such as
>> > o.a.l.i.IndexCommit or SegmentInfos) to show commit points, segment files,
>> > etc. (The "Commits" tab does this.)
>> > I am not sure when we could or should disable randomization; could you
>> > give me any cues? Real test cases that disable randomization would also
>> > help me, so I will search the Lucene/Solr code base.
>> >
>> > Thanks,
>> > Tomoko
>> >
>> > On Wed, Aug 22, 2018 at 21:58 Tomoko Uchida wrote:
>> >
>> >> Thanks for your kind explanations,
>> >>
>> >> sorry of course I know what is 

Re: Question about usage of LuceneTestCase

2018-08-22 Thread Michael Sokolov
Here's a seed that fails for me consistently in IntelliJ:
"FEF692F43FE50191:656E22441676701C" running CommitsImplTest. Warning: I
have a bunch of local changes that might have perturbed the randomness so
possibly it might not reproduce for others.  I just run the tests, open the
"Edit Configurations" dialog, paste in
-Dtests.seed=FEF692F43FE50191:656E22441676701C in the VM options box, and
then I can get the test to fail every time, it seems

On Wed, Aug 22, 2018 at 1:11 PM Erick Erickson 
wrote:

> bq. My understanding at this point is (though it may be a repeat of your
> words,)
> first we should find out the combinations behind the failures.
> If there are any particular patterns, there could be bugs, so we should fix
> it.
>
> You don't really have to figure out exactly what the combinations are,
> just execute the test with the "reproduce with" flags set, cut/paste
> the error message at the root of your local Solr source tree in a
> command prompt.
>
> ant test  -Dtestcase=CommitsImplTest
> -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895
> -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de
> -Dtests.timezone=Africa/Kigali -Dtests.asserts=true
> -Dtests.file.encoding=UTF-8
>
> That should reproduce exactly the same results from random() and
> (hopefully) reliably reproduce the problem. Not sure how to mavenize
> it, but you shouldn't need to if you have Solr locally. If it fails
> every time, you can debug. I've had some luck just defining the
> tests.seed in my IDE and running the test there (I use IntelliJ, but
> I'm sure Eclipse and Netbeans etc. have an equivalent way to do
> things). If just setting the seed as a sysvar in your IDE doesn't do
> the trick, you can always define all of them in the IDE.
>
> Even setting all the sysvars in the IDE doesn't always work. That is
> executing the entire test from the command line can consistently fail
> but defining all the sysvars in the IDE succeeds. But when it does
> fail in the IDE it makes things _much_ easier ;)
>
> Second question:
>
> I doubt there's any real point in exercising Luke on non-FS based
> indexes, so disabling the randomization of the filesystem seems fine.
>
> See SolrTestCaseJ4, the "useFactory" method. You can do something like
> this in your test:
>
> @BeforeClass
> public static void beforeTriLevelCompositeIdRoutingTest() throws Exception
> {
>   useFactory(null); // uses Standard or NRTCaching, FS based anyway.
> }
>
> or even:
>
> useFactory("solr.StandardDirectoryFactory");
>
> I'm not sure about useFactory("org.apache.solr.core.HdfsDirectoryFactory");
>
> Or if you're really adventurous:
>
> @BeforeClass
> public static void beforeTriLevelCompositeIdRoutingTest() throws Exception {
>   switch (random().nextInt(2)) {
>     case 0:
>       useFactory(null); // uses Standard or NRTCaching, FS based anyway.
>       break;
>     case 1:
>       useFactory("org.apache.solr.core.HdfsDirectoryFactory");
>       break;
>     // I guess whatever else you wanted...
>   }
> }
>
>
> Frankly in this case I'd:
>
> 1> see if executing the full reproduce line consistently fails and if so
> 2> try using the above to disable other filesystems. If that
> consistently succeeds, consider it done.
>
> Since Luke is intended to be used on an existing index I don't see
> much use in randomizing for edge cases. But that pre-supposes that
> it's a problem with some of the directory implementations of course...
>
> Best,
> Erick
>
> On Wed, Aug 22, 2018 at 8:13 AM, Tomoko Uchida
>  wrote:
> > Can I ask one more question?
> >
> > 4> If Mike's intuition is right that it's one of the file system
> > randomizations that occasionally gets hit _and_ you determine that that's
> > an invalid test case (and for Luke, requiring only the FS-based tests
> > may be fine), I'm pretty sure you can disable
> > that randomization for your specific tests.
> >
> > As you may know, Luke calls relatively low-level Lucene APIs (such as
> > o.a.l.i.IndexCommit or SegmentInfos) to show commit points, segment files,
> > etc. (The "Commits" tab does this.)
> > I am not sure when we could or should disable randomization; could you
> > give me any cues? Real test cases that disable randomization would also
> > help me, so I will search the Lucene/Solr code base.
> >
> > Thanks,
> > Tomoko
> >
> > On Wed, Aug 22, 2018 at 21:58 Tomoko Uchida wrote:
> >
> >> Thanks for your kind explanations,
> >>
> >> Sorry, of course I know what a randomization seed is,
> >> but your description and instructions are exactly what I wanted.
> >>
> >> > The randomization can cause different
> >> > combinations of "stuff" to happen. Say the locale is randomized to
> >> > Turkish and a token is also randomly generated that breaks _only_ with
> >> > that combination. You'd never explicitly be able to test all of those
> >> > kinds of combinations, thus the random() function. And there may be
> >> > many calls to random() by the time a test is run.
> >>
> >> My 

Re: Question about usage of LuceneTestCase

2018-08-22 Thread Erick Erickson
bq. My understanding at this point is (though it may be a repeat of your words,)
first we should find out the combinations behind the failures.
If there are any particular patterns, there could be bugs, so we should fix
it.

You don't really have to figure out exactly what the combinations are,
just execute the test with the "reproduce with" flags set: cut/paste
the "reproduce with" line from the error message into a command prompt
at the root of your local Solr source tree.

ant test  -Dtestcase=CommitsImplTest
-Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de
-Dtests.timezone=Africa/Kigali -Dtests.asserts=true
-Dtests.file.encoding=UTF-8

That should reproduce exactly the same results from random() and
(hopefully) reliably reproduce the problem. Not sure how to mavenize
it, but you shouldn't need to if you have Solr locally. If it fails
every time, you can debug. I've had some luck just defining the
tests.seed in my IDE and running the test there (I use IntelliJ, but
I'm sure Eclipse and Netbeans etc. have an equivalent way to do
things). If just setting the seed as a sysvar in your IDE doesn't do
the trick, you can always define all of them in the IDE.

Even setting all the sysvars in the IDE doesn't always work. That is,
executing the entire test from the command line can consistently fail
while the same test succeeds in the IDE with all the sysvars defined.
But when it does fail in the IDE it makes things _much_ easier ;)
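
To make the "same seed, same results" point concrete, here is a minimal sketch in plain Java. It uses java.util.Random directly purely for illustration; the framework's random() is managed by the test runner and is not the same object, and the class name SeedDemo and the draw() helper are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SeedDemo {

    /** Draws three "random" test decisions from a generator seeded with the given value. */
    static List<String> draw(long seed) {
        Random random = new Random(seed);
        List<String> decisions = new ArrayList<>();
        decisions.add("nextInt=" + random.nextInt(100));
        decisions.add("nextInt=" + random.nextInt(100));
        decisions.add("nextBoolean=" + random.nextBoolean());
        return decisions;
    }

    public static void main(String[] args) {
        // Seed borrowed from the "reproduce with" line, read here as a hex long (illustration only).
        long seed = 0x35AF58F652536895L;
        List<String> firstRun = draw(seed);
        List<String> secondRun = draw(seed);
        System.out.println("first run:  " + firstRun);
        System.out.println("second run: " + secondRun);
        // Two generators built from the same seed yield the same sequence of decisions.
        System.out.println("identical:  " + firstRun.equals(secondRun));
    }
}
```

The two runs come out identical because every decision is derived from the seed, which is why pinning tests.seed is usually the first thing to try. It is also why a test should take its randomness from the framework's random() rather than from its own unseeded Random.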

Second question:

I doubt there's any real point in exercising Luke on non-FS based
indexes, so disabling the randomization of the filesystem seems fine.

See SolrTestCaseJ4, the "useFactory" method. You can do something like
this in your test:

@BeforeClass
public static void beforeTriLevelCompositeIdRoutingTest() throws Exception {
  useFactory(null); // uses Standard or NRTCaching, FS based anyway.
}

or even:

useFactory("solr.StandardDirectoryFactory");

I'm not sure about useFactory("org.apache.solr.core.HdfsDirectoryFactory");

Or if you're really adventurous:

@BeforeClass
public static void beforeTriLevelCompositeIdRoutingTest() throws Exception {
  switch (random().nextInt(2)) {
    case 0:
      useFactory(null); // uses Standard or NRTCaching, FS based anyway.
      break;
    case 1:
      useFactory("org.apache.solr.core.HdfsDirectoryFactory");
      break;
    // I guess whatever else you wanted...
  }
}


Frankly in this case I'd:

1> see if executing the full reproduce line consistently fails and if so
2> try using the above to disable other filesystems. If that
consistently succeeds, consider it done.

Since Luke is intended to be used on an existing index I don't see
much use in randomizing for edge cases. But that pre-supposes that
it's a problem with some of the directory implementations of course...

Best,
Erick

On Wed, Aug 22, 2018 at 8:13 AM, Tomoko Uchida
 wrote:
> Can I ask one more question?
>
> 4> If Mike's intuition is right that it's one of the file system
> randomizations that occasionally gets hit _and_ you determine that that's
> an invalid test case (and for Luke, requiring only the FS-based tests
> may be fine), I'm pretty sure you can disable
> that randomization for your specific tests.
>
> As you may know, Luke calls relatively low-level Lucene APIs (such as
> o.a.l.i.IndexCommit or SegmentInfos) to show commit points, segment files,
> etc. (The "Commits" tab does this.)
> I am not sure when we could or should disable randomization; could you
> give me any cues? Real test cases that disable randomization would also
> help me, so I will search the Lucene/Solr code base.
>
> Thanks,
> Tomoko
>
> On Wed, Aug 22, 2018 at 21:58 Tomoko Uchida wrote:
>
>> Thanks for your kind explanations,
>>
>> Sorry, of course I know what a randomization seed is,
>> but your description and instructions are exactly what I wanted.
>>
>> > The randomization can cause different
>> > combinations of "stuff" to happen. Say the locale is randomized to
>> > Turkish and a token is also randomly generated that breaks _only_ with
>> > that combination. You'd never explicitly be able to test all of those
>> > kinds of combinations, thus the random() function. And there may be
>> > many calls to random() by the time a test is run.
>>
>> My understanding at this point is (though it may be a repeat of your
>> words,)
>> first we should find out the combinations behind the failures.
>> If there are any particular patterns, there could be bugs, so we should
>> fix it.
>>
>> Thanks,
>> Tomoko
>>
>> On Wed, Aug 22, 2018 at 14:59 Erick Erickson wrote:
>>
>>> The pseudo-random generator in the Lucene test framework is used to
>>> randomize lots of test conditions, we're talking about the file system
>>> implementation here, but there are lots of others. Whenever you see a
>>> call to random().whatever, that's the call to the framework's method.
>>>
>>> But here's the thing. The randomization can cause different
>>> combinations of "stuff" to happen. Say the locale is 

Re: Question about usage of LuceneTestCase

2018-08-22 Thread Tomoko Uchida
Can I ask one more question?

4> If Mike's intuition is right that it's one of the file system
randomizations that occasionally gets hit _and_ you determine that that's
an invalid test case (and for Luke, requiring only the FS-based tests
may be fine), I'm pretty sure you can disable
that randomization for your specific tests.

As you may know, Luke calls relatively low-level Lucene APIs (such as
o.a.l.i.IndexCommit or SegmentInfos) to show commit points, segment files,
etc. (The "Commits" tab does this.)
I am not sure when we could or should disable randomization; could you
give me any cues? Real test cases that disable randomization would also
help me, so I will search the Lucene/Solr code base.

Thanks,
Tomoko

On Wed, Aug 22, 2018 at 21:58 Tomoko Uchida wrote:

> Thanks for your kind explanations,
>
> Sorry, of course I know what a randomization seed is,
> but your description and instructions are exactly what I wanted.
>
> > The randomization can cause different
> > combinations of "stuff" to happen. Say the locale is randomized to
> > Turkish and a token is also randomly generated that breaks _only_ with
> > that combination. You'd never explicitly be able to test all of those
> > kinds of combinations, thus the random() function. And there may be
> > many calls to random() by the time a test is run.
>
> My understanding at this point is (though it may be a repeat of your
> words,)
> first we should find out the combinations behind the failures.
> If there are any particular patterns, there could be bugs, so we should
> fix it.
>
> Thanks,
> Tomoko
>
> On Wed, Aug 22, 2018 at 14:59 Erick Erickson wrote:
>
>> The pseudo-random generator in the Lucene test framework is used to
>> randomize lots of test conditions, we're talking about the file system
>> implementation here, but there are lots of others. Whenever you see a
>> call to random().whatever, that's the call to the framework's method.
>>
>> But here's the thing. The randomization can cause different
>> combinations of "stuff" to happen. Say the locale is randomized to
>> Turkish and a token is also randomly generated that breaks _only_ with
>> that combination. You'd never explicitly be able to test all of those
>> kinds of combinations, thus the random() function. And there may be
>> many calls to random() by the time a test is run.
>>
>> Here's the key. When "seeded" with the same number, the calls to
>> random() produce the exact same output every time. So say with seed 1 I
>> get
>> nextInt() - 1
>> nextInt() - 67
>> nextBoolean() - true
>>
>> Whenever I use 1 as the seed, I'll get exactly the above. However, if
>> I use 2 as a seed, I might get
>> nextInt() - 93
>> nextInt() - 63
>> nextBoolean() - false
>>
>> So the short form is
>>
>> 1. randomization is used to try out various combinations.
>>
>> 2. using a particular seed guarantees that the randomization is
>> repeatable.
>>
>> 3.  when a test fails with a particular seed, running the test with
>> the _same_ seed will produce the same conditions, hopefully allowing
>> that particular error resulting from that particular combination to be
>> reproduced reliably (and fixed).
>>
>> 4. at least that's the theory and in practice it works quite well.
>> There is no _guarantee_ that the test will fail using the same seed,
>> sometimes the failures are a result of subtle timing etc, which is not
>> under control of the randomization. I breathe a sigh of relief,
>> though, when a test _does_ reproduce with a particular seed 'cause
>> then I have a hope of knowing the issue is actually fixed ;).
>>
>>
>> Best,
>> Erick
>>
>> On Tue, Aug 21, 2018 at 3:56 PM, Tomoko Uchida
>>  wrote:
>> > Thanks a lot for your information & insights,
>> >
>> > I will try to reproduce the errors and investigate the results.
>> > And maybe I should learn more about the internals of the test framework;
>> > I'm not familiar with it and still do not understand what "seed" means
>> > exactly in this context.
>> >
>> > Regards,
>> > Tomoko
>> >
>> > On Wed, Aug 22, 2018 at 1:05 Erick Erickson wrote:
>> >
>> >> Couple of things (and I know you've been around for a while, so pardon
>> >> me if it's all old hat to you):
>> >>
>> >> 1> if you run the entire "reproduce with" line and can get a
>> >> consistent failure, then you are half way there, nothing is as
>> >> frustrating as not getting failures reliably. The critical bit is
>> >> often the -Dtests.seed. As Michael mentioned, there are various
>> >> randomizations done for _many_ things in Lucene tests using a random
>> >> generator.  tests.seed, well, seeds that generator so it produces the
>> >> same numbers every time it's run with that seed. You'll see lots of
>> >> calls to a static random() method. I'll add that if you want to
>> >> use randomness in your tests, use that method and do _not_ use a local
>> >> instance of Java's Random.
>> >>
>> >> 2> Mike: You say IntelliJ succeeds. But that'll use a new random()
>> >> seed. Once you run a test, in the upper right (on my version at
>> 

Re: Question about usage of LuceneTestCase

2018-08-22 Thread Tomoko Uchida
Thanks for your kind explanations,

Sorry, of course I know what a randomization seed is,
but your description and instructions are exactly what I wanted.

> The randomization can cause different
> combinations of "stuff" to happen. Say the locale is randomized to
> Turkish and a token is also randomly generated that breaks _only_ with
> that combination. You'd never explicitly be able to test all of those
> kinds of combinations, thus the random() function. And there may be
> many calls to random() by the time a test is run.

My understanding at this point (though it may just repeat your words) is that
we should first find out the combinations behind the failures.
If there are any particular patterns, there could be bugs, and we should fix
them.

Thanks,
Tomoko

On Wed, Aug 22, 2018 at 14:59 Erick Erickson wrote:

> The pseudo-random generator in the Lucene test framework is used to
> randomize lots of test conditions, we're talking about the file system
> implementation here, but there are lots of others. Whenever you see a
> call to random().whatever, that's the call to the framework's method.
>
> But here's the thing. The randomization can cause different
> combinations of "stuff" to happen. Say the locale is randomized to
> Turkish and a token is also randomly generated that breaks _only_ with
> that combination. You'd never explicitly be able to test all of those
> kinds of combinations, thus the random() function. And there may be
> many calls to random() by the time a test is run.
>
> Here's the key. When "seeded" with the same number, the calls to
> random() produce the exact same output every time. So say with seed 1 I
> get
> nextInt() - 1
> nextInt() - 67
> nextBoolean() - true
>
> Whenever I use 1 as the seed, I'll get exactly the above. However, if
> I use 2 as a seed, I might get
> nextInt() - 93
> nextInt() - 63
> nextBoolean() - false
>
> So the short form is
>
> 1. randomization is used to try out various combinations.
>
> 2. using a particular seed guarantees that the randomization is repeatable.
>
> 3.  when a test fails with a particular seed, running the test with
> the _same_ seed will produce the same conditions, hopefully allowing
> that particular error resulting from that particular combination to be
> reproduced reliably (and fixed).
>
> 4. at least that's the theory and in practice it works quite well.
> There is no _guarantee_ that the test will fail using the same seed,
> sometimes the failures are a result of subtle timing etc, which is not
> under control of the randomization. I breathe a sigh of relief,
> though, when a test _does_ reproduce with a particular seed 'cause
> then I have a hope of knowing the issue is actually fixed ;).
>
>
> Best,
> Erick
>
> On Tue, Aug 21, 2018 at 3:56 PM, Tomoko Uchida
>  wrote:
> > Thanks a lot for your information & insights,
> >
> > I will try to reproduce the errors and investigate the results.
> > And maybe I should learn more about the internals of the test framework;
> > I'm not familiar with it and still do not understand what "seed" means
> > exactly in this context.
> >
> > Regards,
> > Tomoko
> >
> > On Wed, Aug 22, 2018 at 1:05 Erick Erickson wrote:
> >
> >> Couple of things (and I know you've been around for a while, so pardon
> >> me if it's all old hat to you):
> >>
> >> 1> if you run the entire "reproduce with" line and can get a
> >> consistent failure, then you are half way there, nothing is as
> >> frustrating as not getting failures reliably. The critical bit is
> >> often the -Dtests.seed. As Michael mentioned, there are various
> >> randomizations done for _many_ things in Lucene tests using a random
> >> generator.  tests.seed, well, seeds that generator so it produces the
> >> same numbers every time it's run with that seed. You'll see lots of
> >> calls to a static random() method. I'll add that if you want to
> >> use randomness in your tests, use that method and do _not_ use a local
> >> instance of Java's Random.
> >>
> >> 2> Mike: You say IntelliJ succeeds. But that'll use a new random()
> >> seed. Once you run a test, in the upper right (on my version at
> >> least), IntelliJ will show you a little box with the test name and you
> >> can "edit configurations" on it. I often have luck by editing the
> >> configuration and adding the test seed to the "VM option" box for the
> >> test, just the "-Dtests.seed=35AF58F652536895" part. You can add all
> >> of the -D flags in the "reproduce with" line if you want, but often
> >> just the seed works for me. If that works and you track it down, do
> >> remember to take that seed _out_ of the "VM options" box rather than
> >> forget it as I have done ;)
> >>
> >> 3> Mark Miller's beasting script can be used to run a zillion tests
> >> over night: https://gist.github.com/markrmiller/dbdb792216dc98b018ad
> >>
> >> 4> If Mike's intuition that it's one of the file system randomizations
> >> that occasionally gets hit _and_ you determine that that's an invalid
> >> test case (and for Luke requiring that 

Re: Question about usage of LuceneTestCase

2018-08-21 Thread Erick Erickson
The pseudo-random generator in the Lucene test framework is used to
randomize lots of test conditions. We're talking about the file system
implementation here, but there are lots of others. Whenever you see a
call to random().whatever, that's a call to the framework's method.

But here's the thing. The randomization can cause different
combinations of "stuff" to happen. Say the locale is randomized to
Turkish and a token is also randomly generated that breaks _only_ with
that combination. You'd never explicitly be able to test all of those
kinds of combinations, thus the random() function. And there may be
many calls to random() by the time a test is run.

Here's the key. When "seeded" with the same number, the calls to
random() produce the exact same output every time. So say with seed 1 I
get
nextInt() - 1
nextInt() - 67
nextBoolean() - true

Whenever I use 1 as the seed, I'll get exactly the above. However, if
I use 2 as a seed, I might get
nextInt() - 93
nextInt() - 63
nextBoolean() - false

So the short form is

1. randomization is used to try out various combinations.

2. using a particular seed guarantees that the randomization is repeatable.

3.  when a test fails with a particular seed, running the test with
the _same_ seed will produce the same conditions, hopefully allowing
that particular error resulting from that particular combination to be
reproduced reliably (and fixed).

4. at least that's the theory, and in practice it works quite well.
There is no _guarantee_ that the test will fail using the same seed;
sometimes the failures are a result of subtle timing etc., which is not
under control of the randomization. I breathe a sigh of relief,
though, when a test _does_ reproduce with a particular seed 'cause
then I have a hope of knowing the issue is actually fixed ;).
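
A minimal Java sketch of points 2 and 3, for illustration only. It uses plain
java.util.Random rather than the test framework's random(), and the names
SeedDemo and draw are made up for this example:

```java
import java.util.Random;

// Illustration only: plain java.util.Random, not the Lucene test framework.
class SeedDemo {
    // Makes the same three kinds of draws as the example above and
    // returns them as one comma-separated string.
    static String draw(long seed) {
        Random r = new Random(seed);
        return r.nextInt(100) + "," + r.nextInt(100) + "," + r.nextBoolean();
    }

    public static void main(String[] args) {
        System.out.println("seed 1: " + draw(1L));
        System.out.println("seed 1: " + draw(1L)); // same seed, identical line
        System.out.println("seed 2: " + draw(2L)); // different seed, different sequence
    }
}
```

The two "seed 1" lines always match each other, which is the property that
makes a failing combination repeatable.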


Best,
Erick

On Tue, Aug 21, 2018 at 3:56 PM, Tomoko Uchida
 wrote:
> Thanks a lot for your information & insights,
>
> I will try to reproduce the errors and investigate the results.
> And, maybe I should learn more about internal of the test framework,
> I'm not familiar with it and still do not understand what does "seed" means
> exactly in this context.
>
> Regards,
> Tomoko
>
> 2018年8月22日(水) 1:05 Erick Erickson :
>
>> Couple of things (and I know you've been around for a while, so pardon
>> me if it's all old hat to you):
>>
>> 1> if you run the entire "reproduce with" line and can get a
>> consistent failure, then you are half way there, nothing is as
>> frustrating as not getting failures reliably. The critical bit is
>> often the -Dtests.seed. As Michael mentioned, there are various
>> randomizations done for _many_ things in Lucene tests using a random
>> generator.  tests.seed, well, seeds that generator so it produces the
>> same numbers every time it's run with that seed. You'll see lots of
>> calls to a static random() method. I'll add that if you want to
>> use randomness in your tests, use that method and do _not_ use a local
>> instance of Java's Random.
>>
>> 2> Mike: You say IntelliJ succeeds. But that'll use a new random()
>> seed. Once you run a test, in the upper right (on my version at
>> least), IntelliJ will show you a little box with the test name and you
>> can "edit configurations" on it. I often have luck by editing the
>> configuration and adding the test seed to the "VM option" box for the
>> test, just the "-Dtests.seed=35AF58F652536895" part. You can add all
>> of the -D flags in the "reproduce with" line if you want, but often
>> just the seed works for me. If that works and you track it down, do
>> remember to take that seed _out_ of the "VM options" box rather than
>> forget it as I have done ;)
>>
>> 3> Mark Miller's beasting script can be used to run a zillion tests
>> over night: https://gist.github.com/markrmiller/dbdb792216dc98b018ad
>>
>> 4> If Mike's intuition that it's one of the file system randomizations
>> that occasionally gets hit _and_ you determine that that's an invalid
>> test case (and for Luke requiring that the FS-based tests are all
>> that are necessary may be fine) I'm pretty sure you can disable
>> that randomization for your specific tests.
>>
>> Best,
>> Erick
>>
>> On Tue, Aug 21, 2018 at 7:47 AM, Tomoko Uchida
>>  wrote:
>> > Hi, Mike
>> >
>> > Thanks for sharing your experiments.
>> >
>> >> CommitsImplTest.testListCommits
>> >> CommitsImplTest.testGetCommit_generation_notfound
>> >> CommitsImplTest.testGetSegments
>> >> DocumentsImplTest.testGetDocumentFIelds
>> >
>> > I also found CommitsImplTest and DocumentsImplTest fail frequently,
>> > especially CommitsImplTest is unhappy with lucene test framework (I pointed
>> > that in my previous post.)
>> >
>> >> I wonder if this is somehow related to running mvn from command line vs
>> >> running in IntelliJ since previously I was doing the latter
>> >
>> > In my personal experience, when I was running those suspicious tests on
>> > IntelliJ IDEA, they were always green - but I am not sure that `mvn test`
>> > 

Re: Question about usage of LuceneTestCase

2018-08-21 Thread Tomoko Uchida
Thanks a lot for your information and insights.

I will try to reproduce the errors and investigate the results.
Maybe I should also learn more about the internals of the test framework;
I'm not familiar with it and still do not understand what "seed" means
exactly in this context.

Regards,
Tomoko

2018年8月22日(水) 1:05 Erick Erickson :

> Couple of things (and I know you've been around for a while, so pardon
> me if it's all old hat to you):
>
> 1> if you run the entire "reproduce with" line and can get a
> consistent failure, then you are half way there, nothing is as
> frustrating as not getting failures reliably. The critical bit is
> often the -Dtests.seed. As Michael mentioned, there are various
> randomizations done for _many_ things in Lucene tests using a random
> generator.  tests.seed, well, seeds that generator so it produces the
> same numbers every time it's run with that seed. You'll see lots of
> calls to a static random() method. I'll add that if you want to
> use randomness in your tests, use that method and do _not_ use a local
> instance of Java's Random.
>
> 2> Mike: You say IntelliJ succeeds. But that'll use a new random()
> seed. Once you run a test, in the upper right (on my version at
> least), IntelliJ will show you a little box with the test name and you
> can "edit configurations" on it. I often have luck by editing the
> configuration and adding the test seed to the "VM option" box for the
> test, just the "-Dtests.seed=35AF58F652536895" part. You can add all
> of the -D flags in the "reproduce with" line if you want, but often
> just the seed works for me. If that works and you track it down, do
> remember to take that seed _out_ of the "VM options" box rather than
> forget it as I have done ;)
>
> 3> Mark Miller's beasting script can be used to run a zillion tests
> over night: https://gist.github.com/markrmiller/dbdb792216dc98b018ad
>
> 4> If Mike's intuition that it's one of the file system randomizations
> that occasionally gets hit _and_ you determine that that's an invalid
> test case (and for Luke requiring that the FS-based tests are all
> that are necessary may be fine) I'm pretty sure you can disable
> that randomization for your specific tests.
>
> Best,
> Erick
>
> On Tue, Aug 21, 2018 at 7:47 AM, Tomoko Uchida
>  wrote:
> > Hi, Mike
> >
> > Thanks for sharing your experiments.
> >
> >> CommitsImplTest.testListCommits
> >> CommitsImplTest.testGetCommit_generation_notfound
> >> CommitsImplTest.testGetSegments
> >> DocumentsImplTest.testGetDocumentFIelds
> >
> > I also found CommitsImplTest and DocumentsImplTest fail frequently,
> > especially CommitsImplTest is unhappy with lucene test framework (I pointed
> > that in my previous post.)
> >
> >> I wonder if this is somehow related to running mvn from command line vs
> >> running in IntelliJ since previously I was doing the latter
> >
> > In my personal experience, when I was running those suspicious tests on
> > IntelliJ IDEA, they were always green - but I am not sure that `mvn test`
> > is the cause.
> >
> > Thanks,
> > Tomoko
> >
> > 2018年8月21日(火) 22:53 Michael Sokolov :
> >
> >> I was running these luke tests a bunch and found the following tests fail
> >> intermittently; pretty frequently. Once I @Ignore them I can get a
> >> consistent pass:
> >>
> >>
> >> CommitsImplTest.testListCommits
> >> CommitsImplTest.testGetCommit_generation_notfound
> >> CommitsImplTest.testGetSegments
> >> DocumentsImplTest.testGetDocumentFIelds
> >>
> >> I did not attempt to figure out why the tests were failing, but to do that,
> >> I would:
> >>
> >> Run repeatedly until you get a failure -- save the test "seed" from this
> >> run that should be printed out in the failure message Then you should be
> >> able to reliably reproduce this failure by re-running with system property
> >> "tests.seed" set to that value. This is used to initialize the
> >> randomization that LuceneTestCase does.
> >>
> >> My best guess is that the failures may have to do with randomly using some
> >> Directory implementation or other Lucene feature that Luke doesn't properly
> >> handle?
> >>
> >> Hmm I was trying this again to see if I could get an example, and strangely
> >> these tests are no longer failing for me after several runs, when
> >> previously they failed quite often. I wonder if this is somehow related to
> >> running mvn from command line vs running in IntelliJ since previously I was
> >> doing the latter
> >>
> >> -Mike
> >>
> >> On Tue, Aug 21, 2018 at 9:01 AM Tomoko Uchida <
> >> tomoko.uchida.1...@gmail.com>
> >> wrote:
> >>
> >> > Hello,
> >> >
> >> > Could you give me some advice or comments about usage of LuceneTestCase.
> >> >
> >> > Some of our unit tests extending LuceneTestCase fail by assertion error --
> >> > sometimes, randomly.
> >> > I suppose we use LuceneTestCase in inappropriate way, but cannot find out
> >> > how to fix it.
> >> >
> >> > Here is some information about failed tests.
> 

Re: Question about usage of LuceneTestCase

2018-08-21 Thread Erick Erickson
Couple of things (and I know you've been around for a while, so pardon
me if it's all old hat to you):

1> if you run the entire "reproduce with" line and can get a
consistent failure, then you are halfway there; nothing is as
frustrating as not getting failures reliably. The critical bit is
often the -Dtests.seed. As Michael mentioned, there are various
randomizations done for _many_ things in Lucene tests using a random
generator. tests.seed, well, seeds that generator so it produces the
same numbers every time it's run with that seed. You'll see lots of
calls to a static random() method. I'll add that if you want to
use randomness in your tests, use that method and do _not_ use a local
instance of Java's Random.

2> Mike: You say IntelliJ succeeds. But that'll use a new random()
seed. Once you run a test, in the upper right (on my version at
least), IntelliJ will show you a little box with the test name and you
can "edit configurations" on it. I often have luck by editing the
configuration and adding the test seed to the "VM options" box for the
test, just the "-Dtests.seed=35AF58F652536895" part. You can add all
of the -D flags in the "reproduce with" line if you want, but often
just the seed works for me. If that works and you track it down, do
remember to take that seed _out_ of the "VM options" box rather than
forget it as I have done ;)

3> Mark Miller's beasting script can be used to run a zillion tests
overnight: https://gist.github.com/markrmiller/dbdb792216dc98b018ad

4> If Mike's intuition is right that it's one of the file system
randomizations that occasionally gets hit _and_ you determine that
that's an invalid test case (and for Luke, requiring only the FS-based
tests may be fine), I'm pretty sure you can disable that randomization
for your specific tests.
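
To illustrate point 1, here is a rough sketch in plain Java of why a single
shared, seeded generator matters. It is not how LuceneTestCase implements
random(); SharedRandomDemo, resolveSeed and pickSettings are hypothetical
names, and only the single-value seed form (as in
-Dtests.seed=35AF58F652536895) is handled:

```java
import java.util.Random;

// Hypothetical sketch, not the Lucene test framework's implementation.
class SharedRandomDemo {
    // Parses a hex seed such as "35AF58F652536895". With no seed given,
    // falls back to a fresh, effectively arbitrary seed.
    static long resolveSeed(String property) {
        if (property == null || property.isEmpty()) {
            return System.nanoTime();
        }
        return Long.parseUnsignedLong(property, 16);
    }

    // All "random" test settings are drawn from the generator passed in,
    // so they are fully determined by the seed behind that generator.
    static int[] pickSettings(Random shared) {
        return new int[] { shared.nextInt(10), shared.nextInt(10) };
    }

    public static void main(String[] args) {
        long seed = resolveSeed(System.getProperty("tests.seed"));
        System.out.println("seed=" + Long.toHexString(seed).toUpperCase());
        int[] settings = pickSettings(new Random(seed));
        System.out.println("settings=" + settings[0] + "," + settings[1]);
    }
}
```

Running this with -Dtests.seed=35AF58F652536895 prints the same settings
every time. A local "new Random()" inside pickSettings would ignore the seed,
and the run could no longer be reproduced from the seed alone.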

Best,
Erick

On Tue, Aug 21, 2018 at 7:47 AM, Tomoko Uchida
 wrote:
> Hi, Mike
>
> Thanks for sharing your experiments.
>
>> CommitsImplTest.testListCommits
>> CommitsImplTest.testGetCommit_generation_notfound
>> CommitsImplTest.testGetSegments
>> DocumentsImplTest.testGetDocumentFIelds
>
> I also found CommitsImplTest and DocumentsImplTest fail frequently,
> especially CommitsImplTest is unhappy with lucene test framework (I pointed
> that in my previous post.)
>
>> I wonder if this is somehow related to running mvn from command line vs
>> running in IntelliJ since previously I was doing the latter
>
> In my personal experience, when I was running those suspicious tests on
> IntelliJ IDEA, they were always green - but I am not sure that `mvn test`
> is the cause.
>
> Thanks,
> Tomoko
>
> 2018年8月21日(火) 22:53 Michael Sokolov :
>
>> I was running these luke tests a bunch and found the following tests fail
>> intermittently; pretty frequently. Once I @Ignore them I can get a
>> consistent pass:
>>
>>
>> CommitsImplTest.testListCommits
>> CommitsImplTest.testGetCommit_generation_notfound
>> CommitsImplTest.testGetSegments
>> DocumentsImplTest.testGetDocumentFIelds
>>
>> I did not attempt to figure out why the tests were failing, but to do that,
>> I would:
>>
>> Run repeatedly until you get a failure -- save the test "seed" from this
>> run that should be printed out in the failure message Then you should be
>> able to reliably reproduce this failure by re-running with system property
>> "tests.seed" set to that value. This is used to initialize the
>> randomization that LuceneTestCase does.
>>
>> My best guess is that the failures may have to do with randomly using some
>> Directory implementation or other Lucene feature that Luke doesn't properly
>> handle?
>>
>> Hmm I was trying this again to see if I could get an example, and strangely
>> these tests are no longer failing for me after several runs, when
>> previously they failed quite often. I wonder if this is somehow related to
>> running mvn from command line vs running in IntelliJ since previously I was
>> doing the latter
>>
>> -Mike
>>
>> On Tue, Aug 21, 2018 at 9:01 AM Tomoko Uchida <
>> tomoko.uchida.1...@gmail.com>
>> wrote:
>>
>> > Hello,
>> >
>> > Could you give me some advice or comments about usage of LuceneTestCase.
>> >
>> > Some of our unit tests extending LuceneTestCase fail by assertion error --
>> > sometimes, randomly.
>> > I suppose we use LuceneTestCase in inappropriate way, but cannot find out
>> > how to fix it.
>> >
>> > Here is some information about failed tests.
>> >
>> >  * The full test code is here:
>> >
>> > https://github.com/DmitryKey/luke/blob/master/src/test/java/org/apache/lucene/luke/models/commits/CommitsImplTest.java
>> >  * We run tests by `mvn test` on Mac PC or Travis CI (oracle jdk8/9/10,
>> > openjdk 8/9/10), assertion errors occur regardless of platform or jdk
>> > version.
>> >  * Stack trace of an assertion error is at the end of this mail.
>> >
>> > Any advice are appreciated. Please tell me if more information is needed.
>> >
>> > Thanks,
>> > Tomoko
>> >
>> >
>> > 

Re: Question about usage of LuceneTestCase

2018-08-21 Thread Tomoko Uchida
Hi, Mike

Thanks for sharing your experiments.

> CommitsImplTest.testListCommits
> CommitsImplTest.testGetCommit_generation_notfound
> CommitsImplTest.testGetSegments
> DocumentsImplTest.testGetDocumentFIelds

I also found that CommitsImplTest and DocumentsImplTest fail frequently;
CommitsImplTest in particular is unhappy with the Lucene test framework (I
pointed that out in my previous post).

> I wonder if this is somehow related to running mvn from command line vs
> running in IntelliJ since previously I was doing the latter

In my personal experience, when I was running those suspicious tests in
IntelliJ IDEA, they were always green, but I am not sure that `mvn test`
is the cause.

Thanks,
Tomoko

2018年8月21日(火) 22:53 Michael Sokolov :

> I was running these luke tests a bunch and found the following tests fail
> intermittently; pretty frequently. Once I @Ignore them I can get a
> consistent pass:
>
>
> CommitsImplTest.testListCommits
> CommitsImplTest.testGetCommit_generation_notfound
> CommitsImplTest.testGetSegments
> DocumentsImplTest.testGetDocumentFIelds
>
> I did not attempt to figure out why the tests were failing, but to do that,
> I would:
>
> Run repeatedly until you get a failure -- save the test "seed" from this
> run that should be printed out in the failure message Then you should be
> able to reliably reproduce this failure by re-running with system property
> "tests.seed" set to that value. This is used to initialize the
> randomization that LuceneTestCase does.
>
> My best guess is that the failures may have to do with randomly using some
> Directory implementation or other Lucene feature that Luke doesn't properly
> handle?
>
> Hmm I was trying this again to see if I could get an example, and strangely
> these tests are no longer failing for me after several runs, when
> previously they failed quite often. I wonder if this is somehow related to
> running mvn from command line vs running in IntelliJ since previously I was
> doing the latter
>
> -Mike
>
> On Tue, Aug 21, 2018 at 9:01 AM Tomoko Uchida <
> tomoko.uchida.1...@gmail.com>
> wrote:
>
> > Hello,
> >
> > Could you give me some advice or comments about usage of LuceneTestCase.
> >
> > Some of our unit tests extending LuceneTestCase fail by assertion error --
> > sometimes, randomly.
> > I suppose we use LuceneTestCase in inappropriate way, but cannot find out
> > how to fix it.
> >
> > Here is some information about failed tests.
> >
> >  * The full test code is here:
> >
> > https://github.com/DmitryKey/luke/blob/master/src/test/java/org/apache/lucene/luke/models/commits/CommitsImplTest.java
> >  * We run tests by `mvn test` on Mac PC or Travis CI (oracle jdk8/9/10,
> > openjdk 8/9/10), assertion errors occur regardless of platform or jdk
> > version.
> >  * Stack trace of an assertion error is at the end of this mail.
> >
> > Any advice are appreciated. Please tell me if more information is needed.
> >
> > Thanks,
> > Tomoko
> >
> >
> > ---
> >  T E S T S
> > ---
> > Running org.apache.lucene.luke.models.commits.CommitsImplTest
> > NOTE: reproduce with: ant test  -Dtestcase=CommitsImplTest -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de -Dtests.timezone=Africa/Kigali -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> > NOTE: leaving temporary files on disk at: /private/var/folders/xr/mrs6w1m15y1f4wkgfhn_x1dmgp/T/lucene.luke.models.commits.CommitsImplTest_35AF58F652536895-001
> > NOTE: test params are: codec=HighCompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, maxDocsPerChunk=7, blockSize=2), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, blockSize=2)), sim=RandomSimilarity(queryNorm=true): {}, locale=de, timezone=Africa/Kigali
> > NOTE: Mac OS X 10.13.6 x86_64/Oracle Corporation 1.8.0_181 (64-bit)/cpus=4,threads=1,free=201929064,total=257425408
> > NOTE: All tests run in this JVM: [CommitsImplTest]
> > Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.44 sec <<< FAILURE!
> >
> > testGetSegmentAttributes(org.apache.lucene.luke.models.commits.CommitsImplTest) Time elapsed: 0.047 sec  <<< FAILURE!
> > java.lang.AssertionError
> > at __randomizedtesting.SeedInfo.seed([35AF58F652536895:AE37E8467BC01918]:0)
> > at org.junit.Assert.fail(Assert.java:92)
> > at org.junit.Assert.assertTrue(Assert.java:43)
> > at org.junit.Assert.assertTrue(Assert.java:54)
> > at org.apache.lucene.luke.models.commits.CommitsImplTest.testGetSegmentAttributes(CommitsImplTest.java:151)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at
> >
> 

Re: Question about usage of LuceneTestCase

2018-08-21 Thread Michael Sokolov
I was running these Luke tests a bunch and found that the following tests fail
intermittently, and pretty frequently. Once I @Ignore them I can get a
consistent pass:


CommitsImplTest.testListCommits
CommitsImplTest.testGetCommit_generation_notfound
CommitsImplTest.testGetSegments
DocumentsImplTest.testGetDocumentFIelds

I did not attempt to figure out why the tests were failing, but to do that,
I would:

Run repeatedly until you get a failure, and save the test "seed" from that
run, which should be printed out in the failure message. Then you should be
able to reliably reproduce this failure by re-running with the system property
"tests.seed" set to that value. This is used to initialize the
randomization that LuceneTestCase does.
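
The procedure above can be sketched in plain Java as follows. This is only an
illustration: the "test" is a made-up stand-in (not one of the Luke tests),
and ReproduceDemo, passes and findFailingSeed are hypothetical names:

```java
import java.util.OptionalLong;
import java.util.Random;

// Hypothetical sketch: the check below stands in for a randomized test.
class ReproduceDemo {
    // A check whose outcome depends only on values drawn from the seed.
    // The deliberate "bug": it fails whenever the two draws collide.
    static boolean passes(long seed) {
        Random r = new Random(seed);
        int a = r.nextInt(10);
        int b = r.nextInt(10);
        return a != b;
    }

    // Tries seeds start, start+1, ... for at most maxRuns runs and
    // returns the first seed that makes the check fail, if any.
    static OptionalLong findFailingSeed(long start, int maxRuns) {
        for (long seed = start; seed < start + maxRuns; seed++) {
            if (!passes(seed)) {
                return OptionalLong.of(seed);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        OptionalLong failing = findFailingSeed(0L, 1000);
        if (failing.isPresent()) {
            long seed = failing.getAsLong();
            System.out.println("failing seed: " + seed);
            // Same seed, same draws, so the failure repeats on every re-run.
            System.out.println("passes on re-run: " + passes(seed));
        } else {
            System.out.println("no failure in 1000 runs");
        }
    }
}
```

Once a failing seed is saved, every re-run with it hits the same failure,
which is what setting "tests.seed" is meant to achieve for a real test.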

My best guess is that the failures may have to do with randomly using some
Directory implementation or other Lucene feature that Luke doesn't properly
handle?

Hmm, I was trying this again to see if I could get an example, and strangely
these tests are no longer failing for me after several runs, when
previously they failed quite often. I wonder if this is somehow related to
running mvn from the command line vs running in IntelliJ, since previously I
was doing the latter.

-Mike

On Tue, Aug 21, 2018 at 9:01 AM Tomoko Uchida 
wrote:

> Hello,
>
> Could you give me some advice or comments about usage of LuceneTestCase.
>
> Some of our unit tests extending LuceneTestCase fail by assertion error --
> sometimes, randomly.
> I suppose we use LuceneTestCase in inappropriate way, but cannot find out
> how to fix it.
>
> Here is some information about failed tests.
>
>  * The full test code is here:
>
> https://github.com/DmitryKey/luke/blob/master/src/test/java/org/apache/lucene/luke/models/commits/CommitsImplTest.java
>  * We run tests by `mvn test` on Mac PC or Travis CI (oracle jdk8/9/10,
> openjdk 8/9/10), assertion errors occur regardless of platform or jdk
> version.
>  * Stack trace of an assertion error is at the end of this mail.
>
> Any advice are appreciated. Please tell me if more information is needed.
>
> Thanks,
> Tomoko
>
>
> ---
>  T E S T S
> ---
> Running org.apache.lucene.luke.models.commits.CommitsImplTest
> NOTE: reproduce with: ant test  -Dtestcase=CommitsImplTest -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de -Dtests.timezone=Africa/Kigali -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> NOTE: leaving temporary files on disk at: /private/var/folders/xr/mrs6w1m15y1f4wkgfhn_x1dmgp/T/lucene.luke.models.commits.CommitsImplTest_35AF58F652536895-001
> NOTE: test params are: codec=HighCompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, maxDocsPerChunk=7, blockSize=2), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, blockSize=2)), sim=RandomSimilarity(queryNorm=true): {}, locale=de, timezone=Africa/Kigali
> NOTE: Mac OS X 10.13.6 x86_64/Oracle Corporation 1.8.0_181 (64-bit)/cpus=4,threads=1,free=201929064,total=257425408
> NOTE: All tests run in this JVM: [CommitsImplTest]
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.44 sec <<< FAILURE!
>
> testGetSegmentAttributes(org.apache.lucene.luke.models.commits.CommitsImplTest) Time elapsed: 0.047 sec  <<< FAILURE!
> java.lang.AssertionError
> at __randomizedtesting.SeedInfo.seed([35AF58F652536895:AE37E8467BC01918]:0)
> at org.junit.Assert.fail(Assert.java:92)
> at org.junit.Assert.assertTrue(Assert.java:43)
> at org.junit.Assert.assertTrue(Assert.java:54)
> at org.apache.lucene.luke.models.commits.CommitsImplTest.testGetSegmentAttributes(CommitsImplTest.java:151)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
> at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
> at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
> at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
> at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
> at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
> at
>
>