Re: Random testing performance (theoretical)

Dawid Weiss Sun, 24 Jun 2012 11:47:12 -0700

Thanks Uwe. I think that guy actually refers to randomizing parameter/
variable space for testing, not generating tests at random in general.
I agree with you that any type of generational/ mutational/
metaheuristic-driven tests generation is not a good way to go in
practice. Lucene tests still have a very broad search space, even with
just randomized parameters/ data and for this that blog post would
apply (?).


Anyway, his is one argument in a broader discussion. I don't know
which theoretical approach would be ideal but I know for sure even
typical randomization of parameters reveals bugs pretty damn
quickly...

Dawid

On Sun, Jun 24, 2012 at 5:53 PM, Uwe Schindler <[email protected]> wrote:
> Hi Dawid,
>
> I think we do not do completely random testing at all! Random testing (with 
> the limits mentioned in the article) would mean that we do very random 
> testing with also randomly generated tests. In contrast our tests all have a 
> non-random background, means our tests are written with a specific use-case 
> in mind (there are some exemptions, I agree, see below). So with any random 
> parameter they basically test what a non-random junit test would also test. 
> In most cases the randomness is only added on top (to extend the space of 
> parameters we test) - so the chances to find bugs get greater. If we would 
> test Lucene only with also randomly generated tests, I agree we would find no 
> bugs at all. On the other hand, classical junit tests have no randomness at 
> all, they just test that a specific "requirement" is implemented correctly 
> and returns the correct result. The problem with those tests is the fact, 
> that by fixing a bug to do the correct thing in test can break something else 
> (this is especially important for IR tests, they rely on statistics!!!). You 
> could write a test that expects a specific output of a lucene query and you 
> can of course fix all the scoring logic to exactly create that result. Of 
> course any other similar query can return bogus results. So by executing 
> random queries, we can also test statistical things like ranking function 
> works also for a larger set of queries or larger number of documents.
>
> There are some exceptions: The famous TestRandomChains falls exactly in the 
> category you mention: They just build a random chain of analyzer components 
> and the number of combinations you could create is huge! With a few 
> Tokenizers and a list of 150 TokenFilters you can create an insane amount of 
> TokenStream combinations (just start to multiply..., it is the formula for 
> the binomial coefficient). In that case, the chance to find a new bug is 
> really small. The number of failures at the beginning really shows that the 
> TokenStreams were in a very bad state! Now we have only few failures (one 
> more today!), but I am sure there are still billions (ore more) of 
> combinations that break the TokenStream contracts!
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of
>> Dawid Weiss
>> Sent: Sunday, June 24, 2012 3:42 PM
>> To: [email protected]
>> Cc: [email protected]
>> Subject: Random testing performance (theoretical)
>>
>> An interesting link was given to me today:
>>
>> "The Central Limit Theorem Makes Random Testing Hard"
>> http://blog.regehr.org/archives/660
>>
>> Just recently I had a talk at a local university
>> (http://idss.cs.put.poznan.pl/site/idss-en.html) and the low theoretical
>> probability of hitting _any_ sensible bug was also raised as a concern. This 
>> is in
>> stark contrast to what we see in practice, isn't it? It is an interesting
>> phenomenon because it means that (among other possible explanations) the
>> parameter/ search space of random tests is just ridden with failures and
>> exceptions and bugs and even with close-to-zero probability we still hit a 
>> lot of
>> them :)
>>
>> Dawid
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected] For additional
>> commands, e-mail: [email protected]
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: Random testing performance (theoretical)

Reply via email to