Torben, This is beautiful, It is an extremely good introduction. I am really looking forward to seeing the error, shrinking stuff.
Eric On Sun, Mar 27, 2011 at 3:31 PM, Torben Hoffmann <[email protected]> wrote: > Hi, > > I have converted the earlier mail into wikicreole markup - see > https://github.com/lehoff/erlware_commons/blob/master/pbt.creole - and my > colleague Raghav has improved the initial > ramblings on top of which I in turn have started going through a concrete > example. > > Still work in progress!!e > > A aside: the github creole renderer does not work as well as my home made > one (should release that to the public one day...) so please bear with it > for now until I figure out where the text will get its own life. > > Cheers, > Torben > > p.s. for ease of commenting for those who do not go in and modify the > description directly here is the updated description: > > Property based testing for unit testers > > Main contributors: Torben Hoffmann, Raghav Karol, Eric Merritt > > The purpose of the short document is to help people who are familiar with > unit testing understand how property based testing (PBT) differs, but also > where the thinking is the same. > > This document focusses on the PBT tool PropEr for Erlang since that is what > I am familiar with, but the general principles applies to all PBT tools > regardless of which language they are written in. > > The approach taken here is that we hear from people who are used to working > with unit testing regarding how they think when designing their tests and > how a concrete test might look. > > These descriptions are then "converted" into the way it works with PBT, with > a clear focus on what stays the same and what is different. > > Testing philosophies > > A quote from Martin Logan: > > For me unit testing is about contracts. I think about the same things > I think about when I write statements like {ok, Resp} = > Mod:Func(Args). Unit testing and writing specs are very close for me. > Hypothetically speaking lets say a function should return return {ok, > string()} | {error, term()} for all given input parameters then my > unit tests should be able to show that for a representative set of > input parameters that those contracts are honored. The art comes in > thinking about what that set is. > > The trap in writing all your own tests can often be that we think about the > set in terms of what we coded for and not what may indeed be asked of our > function. As the code is tried in further exploratory testing and in > production new input parameter sets for which the given function does not > meet the stated contract are discovered and added to the test case once a > fix has been put into place. > > This is a very good description of what the ground rules for unit testing > are: > > Checking that contracts are obeyed. > Creating a representative set of input parameters. > > The former is very much part of PBT - each property you write will check a > contract, so that thinking is the same. > > xUnit vs PBT > > Unit testing has become popular for software testing with the advent of > xUnit tools like jUnit for Java. xUnit like tools typically provide a > testing framework with the following functionality > > test fixture setup > test case execution > test fixture teardown > test suite management > test status reporting and management > > While xUnit tools provide a lot of functionality to execute and manage test > cases and suites, reporting results there is no focus on test case execution > step, while this is the main focus area of property-based testing (PBT). > > Consider the following function specification > > sort(list::integer()) ---> list::integer() | error > > A verbal specification of this function is, > > For all input lists of integers, the sort function returns a sorted list of > integers. > > For any other kind of argument the function returns the atom error. > > The specification above may be a requirement of how the function should > behave or even how the function does behave. This distinction is important; > the former is the requirement for the function, the latter is the actual > API. Both should be the same and that is what our testing should confirm. > Test cases for this function might look like > > assertEqual(sort([5,4,3,2,1]), [1,2,3,4,5]) > assertEqual(sort([1,2,3,4,5]), [1,2,3,4,5]) > assertEqual(sort([] ), [] ) > assertEqual(sort([-1,0, 1] ), [-1, 0, 1] ) > > How many tests cases should we write to be convinced that the actual > behaviour of the function is the same as its specification? Clearly, it is > impossible to write tests cases for all possible input values, here all > lists of integers, the art of testing is finding individual input values > that are representative of a large part of the input space. We hope that the > test cases are exhaustive to cover the specification. xUnit tools offer no > support for this and this is where PBT and PBT Tools like PropEr and > QuickCheck come in. > > PBT introduces testing with a large set of random input values and verifying > that the specification holds for each input value selected. Functions used > to generate input values, generators, are specified using rules and can be > simply composed together to construct complicated values. So, a property > based test for the function above may look like: > > FOREACH({I, J, InputList}, {nat(), nat(), integer_list()}, > SUCHTHAT(I < J andalso J < length(InputList), > SortedList = sort(InputList) > length(SortedList) == length(InputList) > andalso > lists:get(SortedList, I) =< lists:get(SortedList, J)) > > The property above works as follows > > Generate a random list of integers InputList and two natural numbers I, J, > such that I < J < size of InputList > Check that size of sorted and input lists is the same. > Check that element with smaller index I is less than or equal to element > with larger index J in SortedList. > > Notice in the property above, we specify property. Verification of the > property based on random input values will be done by the property based > tool, therefore we can generated a large number of tests cases with random > input values and have a higher level of confidence that the function when > using unit tests alone. > > But it does not stop at generation of input parameters. If you have more > complex tests where you have to generate a series of events and keep track > of some state then your PBT tool will generate random sequences of events > which corresponds to legal sequences of events and test that your system > behaves correctly for all sequences. > > So when you have written a property with associated generators you have in > fact created something that can create numerous test cases - you just have > to tell your PBT tool how many test cases you want to check the property on. > > Shrinking the bar > > At this point you might still have the feeling that introducing the notion > of some sort of generators to your unit testing tool of choice would bring > you on par with PBT tools, but wait there is more to come. > > When a PBT tool creates a test case that fails there is real chance that it > has created a long test case or some big input parameters - trying to debug > that is very much like receiving a humongous log from a system in the field > and try to figure out what cause the system to fail. > > Enter shrinking... > > When a test case fails the PBT tool will try to shrink the failing test case > down to the essentials by stripping out input elements or events that does > not cause the failure. In most cases this results in a very short > counterexample that clearly states which events and inputs are required to > break a property. > > As we go through some concrete examples later the effects of shrinking will > be shown. > > Shrinking makes it a lot easier to debug problems and is as key to the > strength of PBT as the generators. > > Converting a unit test > > We will now take a look at one possible way of translating a unit test into > a PBT setting. > > The example comes from Eric Merritt and is about the add/2 function in the > ec_dictionary instance ec_gb_trees. > > The add function has the following spec: > > -spec add(ec_dictionary:key(), ec_dictionary:value(), Object::dictionary()) > -> > dictionary(). > > and it is supposed to do the obvious: add the key and value pair to the > dictionary and return a new dictionary. > > Eric states his basic expectations as follows: > > I can put arbitrary terms into the dictionary as keys > I can put arbitrary terms into the dictionary as values > When I put a value in the dictionary by a key, I can retrieve that same > value > When I put a different value in the dictionary by key it does not change > other key value pairs. > When I update a value the new value in available by the new key > When a value does not exist a not found exception is created > > The first two expectations regarding being able to use arbritrary terms as > keys and values is a job for generators. > > The latter four are prime candidates for properties and we will create one > for each of them. > > Generators > > key() -> any(). > > value() -> any(). > > For PropEr this approach has the drawback that creation and shrinking > becomes rather time consuming, so it might be better to narrow to something > like this: > > key() -> union([integer(),atom()]). > > value() -> union([integer(),atom(),binary(),boolean(),string()]). > > What is best depends on the situation and intended usage. > > Now, being able to generate keys and values is not enough. You also have to > tell PropEr how to create a dictionary and in this case we will use a > symbolic generator (detail to be explained later). > > sym_dict() -> > ?SIZED(N,sym_dict(N)). > > sym_dict(0) -> > {'$call',ec_dictionary,new,[ec_gb_trees]}; > sym_dict(N) -> > ?LAZY( > frequency([ > {1, {'$call',ec_dictionary,remove,[key(),sym_dict(N-1)]}}, > {2, > {'$call',ec_dictionary,add,[value(),value(),sym_dict(N-1)]}} > ]) > ). > > sym_dict/0 uses the ?SIZED macro to control the size of the generated > dictionary. PropEr will start out with small numbers and gradually raise it. > > sym_dict/1 is building a dictionary by randomly adding key/value pairs and > removing keys. Eventually the base case is reached which will create an > empty dictionary. > > The ?LAZY macro is used to defer the calculation of the sym_dict(N-1)}} > until they are needed and {{{frequency/1 is used to ensure that twice as > many adds compared to removes are done. This should give rather more > interesting dictionaries in the long run, if not one can alter the > frequencies accondingly. > > But does it really work? > > That is a good question and one that should always be asked when looking at > genetors. Fortunately there is a way to see what a generator produces > provided that the generator functions are exported. > > Hint: in most cases it will not hurt to throw in a -compile(export_all). in > the module used to specify the properties. And here we actually have a > sub-hint: specify the properties in a separate file to avoid peeking inside > the implementation! Base the test on the published API as this is what the > users of the code will be restricted to. > > When the test module has been loaded you can test the generators by starting > up an Erlang shell (this example uses the erlware_commons code so get > yourself a clone to play with): > > $ erl -pz ebin -pz test > 1> proper_gen:pick(ec_dictionary_proper:key()). > {ok,4} > 2> proper_gen:pick(ec_dictionary_proper:key()). > {ok,35} > 3> proper_gen:pick(ec_dictionary_proper:key()). > {ok,-5} > 4> proper_gen:pick(ec_dictionary_proper:key()). > {ok,48} > 5> proper_gen:pick(ec_dictionary_proper:key()). > {ok,'\036\207_là´?\nc'} > 6> proper_gen:pick(ec_dictionary_proper:value()). > {ok,2} > 7> proper_gen:pick(ec_dictionary_proper:value()). > {ok,-14} > 8> proper_gen:pick(ec_dictionary_proper:value()). > {ok,-3} > 9> proper_gen:pick(ec_dictionary_proper:value()). > {ok,27} > 10> proper_gen:pick(ec_dictionary_proper:value()). > {ok,-8} > 11> proper_gen:pick(ec_dictionary_proper:value()). > {ok,[472765,17121]} > 12> proper_gen:pick(ec_dictionary_proper:value()). > {ok,true} > 13> proper_gen:pick(ec_dictionary_proper:value()). > {ok,<<>>} > 14> proper_gen:pick(ec_dictionary_proper:value()). > {ok,<<89,69,18,148,32,42,238,101>>} > 15> proper_gen:pick(ec_dictionary_proper:sym_dict()). > {ok,{'$call',ec_dictionary,add, > [[114776,1053475], > 'fª\020\227\215', > {'$call',ec_dictionary,add, > ['',true, > {'$call',ec_dictionary,add, > ['2^Ø¡', > [900408,886056], > > {'$call',ec_dictionary,add,[[48618|...],<<...>>|...]}]}]}]}} > 16> proper_gen:pick(ec_dictionary_proper:sym_dict()). > {ok,{'$call',ec_dictionary,add, > [10,'a¯\214\031fõC', > {'$call',ec_dictionary,add, > [false,-1, > {'$call',ec_dictionary,remove, > ['d·ÉV÷[', > {'$call',ec_dictionary,remove,[12,{'$call',...}]}]}]}]}} > > That does not look too bad, so we will continue with that for now. > > Properties of {{{add/2}}} > > The first expectation Eric had about how the dictionary works was that if a > key had been stored it could be retrieved. > > One way of expressing this could be with this property: > > prop_get_after_add_returns_correct_value() -> > ?FORALL({Dict,K,V}, {sym_dict(),key(),value()}, > begin > try ec_dictionary:get(K,ec_dictionary:add(K,V,Dict)) of > V -> > true; > _ -> > false > catch > _:_ -> > false > end > end). > > This property reads that for all dictionaries get/2 using a key from a > key/value pair just inserted using the add/3 function will return that > value. If that is not the case the property will evaluate to false. > > Running the property is done using proper:quickcheck/1: > > proper:quickcheck(ec_dictionary_proper:prop_get_after_add_returns_correct_value()). > .................................................................................................... > OK: Passed 100 test(s). > true > > This was as expected, but at this point we will take a little detour and > introduce a mistake in the ec_gb_trees implementation and see how that > works. > > -- > http://www.linkedin.com/in/torbenhoffmann > -- You received this message because you are subscribed to the Google Groups "erlware-dev" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/erlware-dev?hl=en.
