Hi Michael.

The work that you are examining seems, from casual inspection, to be 
about assessing the claim that MC/DC (and related adequacy criteria) 
really are adequate. (Urmas: did I get this right?)

    I want to be able to generate inputs that detect the errors, at least in 
the Siemens Benchmarks, without additionally using the already-known faulty 
versions or regression testing.
    So yes, I think MC/DC can be a good start, and later it will be possible 
to reduce the set of generated inputs.

As a result, as Paul somewhat tersely replied: what's the executable 
spec -- i.e., the "true set of requirements" from which MC/DC or 
other test cases might be generated?

    I do not need a spec for test generation; I have one design and want to 
get as many different inputs from it as possible. The MC/DC criterion, as I 
already said, is a good start.
    Using these inputs, all other errors in my experimental designs should be 
detected.
    I see no need for a specification at the test-generation stage; a 
specification is necessary for Error Localization and Error Correction, though.

P.S. - also, have you played with using Klee to check equivalence or 
bisimulation of two programs? 

     This is equivalence checking from formal verification; again, that is a 
slightly different direction from the one I intend to pursue.

The problem is that if I do not make the task concrete -- MC/DC inputs for one 
design, with KLEE or without -- then I can waste a lot of time with no 
experimental result, and that is what I want to avoid.

With regards,
Urmas Repinski

Date: Thu, 7 Mar 2013 07:44:21 -0500
From: [email protected]
To: [email protected]
CC: [email protected]
Subject: [klee-dev]  using klee with Siemens Benchmarks



On Mar 7, 2013 5:53 AM, "Paul Marinescu" <[email protected]> wrote:

>

> 1. How do you define 'design structure'?

> 2. You might be criticising an apple for not being a good orange. KLEE is 
> designed to test the program it's being executed on, and if one also wants 
> to test 41 variants, he/she will run KLEE again on the variants. I'm not sure 
> what you would expect, even in a simple scenario like



>

> if (input == 100) {

>  //something

> }

>

> You would need to generate 2^sizeof(input) inputs to take into account all 
> possible mutations to the literal 100 using regression testing.

Urmas,

To expand a bit on Paul's first question, to try to improve my own 
understanding...

As I understand things, the point of the MC/DC coverage criterion as typically 
applied to the certification of life-critical airborne systems is that, given a 
good spec, one can check the adequacy of a set of test cases for a given 
implementation by checking that the test cases achieve 100% coverage of both 
the requirements and the implementation logic and state vector.
In this world, MC/DC is helpful because it finds many bugs with a number of 
test cases that grows linearly, rather than exponentially, in the number of, 
e.g., conditional branches and bits of program state.
The work that you are examining seems, from casual inspection, to be about 
assessing the claim that MC/DC (and related adequacy criteria) really are 
adequate. (Urmas: did I get this right?)

As a result, as Paul somewhat tersely replied: what's the executable spec -- 
i.e., the "true set of requirements" from which MC/DC or other test cases 
might be generated?


Regards,

Michael

P.S. - also, have you played with using Klee to check equivalence or 
bisimulation of two programs? (Here's an old post that I wrote up to quickly 
illustrate one approach:
http://mstone.info/posts/klee_quickcheck/
)


> Paul

>

>

> On 7 Mar 2013, at 05:06, Urmas Repinski <[email protected]> wrote:

>

>> Hello.

>>

>> Yes.

>>

>> So KLEE takes into account only coverage -- whether all paths are covered 
>> -- but does not take into account the entire design structure.

>>

>> The errors provided by the Siemens benchmarks are in operators (+ -> -, 
>> > -> <, etc.) and in numbers (2 -> 3, 70 -> 80), and with the KLEE inputs, 
>> even if the corresponding nodes in the design's model representation are 
>> activated, the error is not propagated to the output.



>>

>> It is a pity; if it were possible to test the entire structure with KLEE 
>> inputs (activate different bits in variables with the inputs, confirm that 
>> operator values influence the output), then the KLEE-generated inputs would 
>> be more useful for practical implementation.



>>

>> It should be possible to investigate the error types found in the Siemens 
>> benchmarks and improve test generation with KLEE.

>>

>> I am writing an article about the errors in the designs at the moment; it 
>> will be published in September, and then I will send it to the list too.

>>

>> Maybe this will make it possible to improve input generation and make KLEE 
>> more usable in actual industry.

>>

>> Urmas Repinski

>>

>> ________________________________

>> Subject: Re: [klee-dev] using klee with Siemens Benchmarks

>> From: [email protected]

>> Date: Wed, 6 Mar 2013 21:19:54 +0000

>> CC: [email protected]

>> To: [email protected]

>>

>> Hello Urmas,

>> It's not that clear what you're trying to do but I assume you want to use 
>> the inputs generated by KLEE on one program version to test 41 variants. I'm 
>> afraid this might not be that easy.

>>

>> KLEE actually gets nearly 100% statement and branch coverage on the original 
>> program ('nearly' because there's some unreachable code). You should run it 
>> without the POSIX runtime and uclibc because you're not using them anyway. 
>> What you get is:



>>

>> --------------------------------------------------------------------------

>> | Path       | Instrs | Time(s) | ICov(%) | BCov(%) | ICount | Solver(%) |

>> --------------------------------------------------------------------------

>> | klee-out-0 |   5951 |    0.26 |   99.36 |   94.44 |    311 |     91.92 |

>> --------------------------------------------------------------------------

>>

>> From what I can tell, the path coverage is also 100%, counting just feasible 
>> paths, so there are no other inputs to generate as far as KLEE is concerned.

>>

>> Paul

>>

>> On 6 Mar 2013, at 12:31, Urmas Repinski <[email protected]> wrote:

>>

>>> Hello.

>>>

>>> My name is Urmas Repinski, PhD student in Tallinn University of Technology, 
>>> Estonia.

>>>

>>> I am trying to generate inputs for Siemens Benchmarks with KLEE.

>>>

>>> The Siemens Benchmarks are C designs with various erroneous versions, for 
>>> testing error localization and error correction in C designs.

>>> They can be located and downloaded from 
>>> http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/

>>>

>>> I am testing KLEE with the tcas design; it is the smallest and simplest.

>>>

>>> The Siemens benchmarks provide their own inputs: the tcas/script/runall.sh 
>>> script has 1607 lines of execution of the design, and it is easy to 
>>> extract 1607 various inputs for the design.

>>> But I have no idea how the inputs were generated; I am attaching the 
>>> Siemens benchmarks inputs to this letter (file INPUTS_tcas_Siemens_arg).

>>>

>>> I want to compare the Siemens inputs with the inputs generated by KLEE.

>>>

>>> I installed KLEE as described in the documentation -- 
>>> http://klee.llvm.org/GetStarted.html -- with uclibc support.

>>>

>>> I installed llvm-gcc-4.2 and llvm-gcc-4.5; the gcc originally provided by 
>>> Linux Mint 12 is too new a version and generated an error when llvm-gcc 
>>> was used, but OK, that error is solved.

>>>

>>> When I take the tcas/v1/tcas.c design and add the corresponding 
>>> modifications to generate inputs with KLEE (tcas_original.c and 
>>> tcas_modified.c are attached to the letter), I get 45 inputs generated, 
>>> and after converting the KLEE output to a suitable format I get the KLEE 
>>> outputs (file KLEE_OUTPUT_arg).



>>>

>>> urmas-PBL21 src # llvm-gcc --emit-llvm -c -g tcas.c

>>> urmas-PBL21 src # klee --libc=uclibc  --posix-runtime   tcas.o 

>>> KLEE: NOTE: Using model: 
>>> /home/urmas/forensic/thirdparty-install/klee/Release+Asserts/lib/libkleeRuntimePOSIX.bca

>>> KLEE: output directory = "klee-out-143"

>>> KLEE: WARNING: undefined reference to function: fwrite

>>> KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 54549392)

>>> KLEE: WARNING ONCE: calling __user_main with extra arguments.

>>>

>>> KLEE: done: total instructions = 14790

>>> KLEE: done: completed paths = 45

>>> KLEE: done: generated tests = 45

>>>

>>>

>>> These outputs have the same coverage as the Siemens inputs, but most of 
>>> the errors in the erroneous designs are simply not detected, while the 
>>> Siemens inputs detect all errors.

>>>

>>> KLEE inputs: Coverage - 92.5532%. Total inputs - 45. Detected erroneous 
>>> designs - 8/41.
>>> Siemens Benchmarks inputs: Coverage - 93.0851%. Total inputs - 1607. 
>>> Detected erroneous designs - 41/41.

>>>

>>> Maybe I have something wrong with the KLEE arguments when I execute KLEE; 
>>> can somebody help me with the right KLEE invocation?

>>> I have tested KLEE with various options, but I still get 45 generated 
>>> inputs. Is it possible to somehow increase the number of generated inputs 
>>> with KLEE?

>>> Maybe the warning "KLEE: WARNING: undefined reference to function: fwrite" 
>>> aborts execution somewhere, but there is no fwrite call in the tcas design.

>>>

>>> Urmas Repinski

>>> <INPUTS_tcas_Siemens_arg><tcas_original.c><tcas_modified.c><KLEE_OUTPUT_arg>_______________________________________________

>>> klee-dev mailing list

>>> [email protected]

>>> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

>>

>>


>

>

>


>

