Re: [klee-dev] using klee with Siemens Benchmarks

Urmas Repinski Thu, 07 Mar 2013 06:18:06 -0800

Hi, David.

(1)    You appear to be investigating the effectiveness of test case generation 
tools. (ie fault detection effectiveness)Is this a correct conclusion?
The problem is that i were working with Error Localization and Error Corrcetion 
of DIAMOND FP7 project (http://fp7-diamond.eu/).
Papers will be soon ready, Model-Based Error Localization and Mutation-Based 
Error Correction.
But i have much time of my PhD studies left, and the superviser asked me to 
investigate KLEE and i think i can investigate also Error Generation 
possibilities using the FORENSIC tool, that were developed within contents of 
the project.
Tool - http://informatik.uni-bremen.de/agra/eng/forensic.php


(2)    You
 also appear to be attempting to identify a minimal set of regression 
test (when you have multiple instances in the history of the program) 
Is
 comparing programs as you intend even feasible (ie time O(number of 
 differences),  O( polynomial-expression(number of differences) ) or 
O(exponential-expression(number of differences) )?
No, i have set of designs - siemens benchmarks, and planning to make 
experiments only with them, they are all very different (text processing, 
logical-mathematical operations, Compound C data structures, file processing) 
and surely complicated enough and suitable for experiments with Error 
Localization, Error Correction and possible Test generation algorithms.
I do not want to go deeper into code complexity, this is a little different 
direction i think.


(3)    You also appear to be investigation fault isolation strategies (“..but i 
need to make inputs detect various states on the program, various errors.”)
That is a much harder problemYes, this is what i want to investigate and 
implements with Test Generation tool.

So i have now two possibilities - i can install KLEE source on my machine and 
try to extend it, to make possible to generate at least something close to 
MC/DC test complexity, or
I can start implementing test generation from scratch on the FORENSIC tool, to 
re-implement known algorithms and to make experiments.

First approach seems more reasonable, SMT-based systems are widely used for 
Test Generation.

So at the moment i am planning to go deeper in klee and i think it possible to 
add paths, that differ not only by their activated components - that result 
high coverage, but by the entire values of the activated variables also - this 
will make inputs closer to MC/DC.

Please tell me what do you think about this?
Urmas Repinski. 

From: [email protected]
To: [email protected]
CC: [email protected]
Subject: RE: [klee-dev] using klee with Siemens Benchmarks
Date: Thu, 7 Mar 2013 08:35:15 -0500

Urmas Repinski 
As I previously indicated KLEE is not capable of achieving MCDC coverage. That 
should be enough of an valuation for purposes to test effectiveness (2)    You 
also appear to be attempting to identify a minimal set of regression test (when 
you have multiple instances in the history of the program) 
Is comparing programs as you intend even feasible (ie time O(number of  
differences),  O( polynomial-expression(number of differences) ) or 
O(exponential-expression(number of differences) )? (3)    You also appear to be 
investigation fault isolation strategies (“..but i need to make inputs detect 
various states on the program, various errors.”)
That is a much harder problem Dave Lightstone


  From: [email protected] 
[mailto:[email protected]] On Behalf Of Urmas Repinski
Sent: Thursday, March 07, 2013 7:34 AM
To: klee
Subject: Re: [klee-dev] using klee with Siemens Benchmarks Yes, Thank You.

1. How do you define 'design structure'?

      This depends on tool what i am using for test generation. KLEE uses 
symbolic states and branches, i have also design's model - parsed design 
representation using hammock graph structure, any tool what generates inputs 
will have own design representation, and generated tests will depend on the 
design's representation's structure, more detailed model will generate more 
complete inputs.2. You might be criticising an apple for not being a good 
orange. KLEE is designed to tests the program it's being executed on, and if 
one also wants to test 41 variants, he/she will run KLEE again on the variants. 
I'm not sure what you would expect, even in a simple scenario like .... 
    Regression testing requires 2 versions of the design - correct one and 
faulty one, and generates inputs, based on detecting behavioral differences 
between two versions of the design, if i am not mistaken.

    I am, from the other hand, want to use only one version of the design, and 
to generate as complete inputs as possible.

    KLEE is pretty fast and reliable in terms of coverage, but i need to make 
inputs detect various states on the program, various errors.

    According to http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/ in 
Siemens Benchmarks Regression testing were used, all faulty versions of the 
design were processed and complete set of inputs, that detect already known 
errors were generated.

    This works well if i have set of faulty versions, but what if not, if i 
have only one version of the design and want to generate complete inputs for it?

Any help will be appreciated,
Urmas RepinskiSubject: Re: [klee-dev] using klee with Siemens Benchmarks
From: [email protected]
Date: Thu, 7 Mar 2013 10:52:56 +0000
CC: [email protected]
To: [email protected]

1. How do you define 'design structure'?2. You might be criticising an apple 
for not being a good orange. KLEE is designed to tests the program it's being 
executed on, and if one also wants to test 41 variants, he/she will run KLEE 
again on the variants. I'm not sure what you would expect, even in a simple 
scenario like if (input = 100) { //something} You would need to generate 
2^sizeof(input) inputs to take into account all possible mutations to the 
literal 100 using regression testing. Paul  On 7 Mar 2013, at 05:06, Urmas 
Repinski <[email protected]> wrote: Hello.

Yes.

So KLEE takes into account only the coverage, if all paths are covered, but 
does not take into account entire design structure.

The errors provided by siemens benchmarks are in the operators (+ -> -, > -> <, 
etc) and in numbers (2 -> 3, 70 -> 80), and with klee inputs even if 
corresponding nodes in the design's model representation are activated, error 
is not propagated to the output.

Its a pity, if it is possible to test the entire structure with klee inputs 
(activate different bits in variables with inputs, confirm that operators 
values take influence on the output) then klee generated inputs will be more 
useful for practical imlementation.

It is possible to investigate error types, found in siemens benchmarks, and 
improve test generation with klee.

I am writing one article about the errors in the designs at the moment, it will 
be published in the september, and then i will send it to the list too.

Maybe this will make possible to improve input generation and make klee more 
usable in actual industry.

Urmas RepinskiSubject: Re: [klee-dev] using klee with Siemens Benchmarks
From: [email protected]
Date: Wed, 6 Mar 2013 21:19:54 +0000
CC: [email protected]
To: [email protected]

Hello Urmas,It's not that clear what you're trying to do but I assume you want 
to use the inputs generated by KLEE on one program version to test 41 variants. 
I'm afraid this might not be that easy. KLEE actually gets nearly 100% 
statement and branch coverage on the original program ('nearly' because there's 
some unreachable code). You should run it without the POSIX runtime and uclibc 
because you're not using them anyway. What you get is: 
--------------------------------------------------------------------------| 
Path       | Instrs | Time(s) | ICov(%) | BCov(%) | ICount | Solver(%) 
|--------------------------------------------------------------------------| 
klee-out-0 |   5951 |    0.26 |   99.36 |   94.44 |    311 |     91.92 
|-------------------------------------------------------------------------- 
From what I can tell, the path coverage is also 100%, counting just feasible 
paths, so there are no other inputs to generate as far as KLEE is concerned. 
Paul On 6 Mar 2013, at 12:31, Urmas Repinski <[email protected]> wrote: Hello.

My name is Urmas Repinski, PhD student in Tallinn University of Technology, 
Estonia.

I am trying to generate inputs for Siemens Benchmarks with KLEE.

Siemens Benchmarks - C designs with various erroneous versions for testing 
error localization and error correction in C designs.
They can be located and downloaded from 
http://pleuma.cc.gatech.edu/aristotle/Tools/subjects/

I am testing KLEE with tcas design, it is smallest and simplest.

Siemens benchmarks provide their inputs, tcas/script/runall.sh script has 1607 
lines of execution of the design, and it is easy extract 1607 various inputs 
for the design.
But i have no idea how the inputs were generated, attaching siemens benchmarks 
inputs to the letter (file INPUTS_tcas_Siemens_arg).

I want to compare Siemens inputs with inputs, generated by klee.

I had installed klee as it is written in documentation - 
http://klee.llvm.org/GetStarted.html - with uclibc support.

Installed llvm-gcc-4.2, llvm-gcc-4.5, originally provided by Linux Mint 12 uses 
too new version of gcc, and this generated error then used llvm-gcc, but ok, 
this error solved.

When i take tcas/v1/tcas.c design, adding corresponding modification to 
generate inputs with klee (tcas_original.c and tcas_modified.c are attached to 
the letter), i get 45 inputs generated, and after modifying klee output to 
suitable format i get klee outputs (file KLEE_OUTPUT_arg).

urmas-PBL21 src # llvm-gcc --emit-llvm -c -g tcas.c
urmas-PBL21 src # klee --libc=uclibc  --posix-runtime   tcas.o 
KLEE: NOTE: Using model: 
/home/urmas/forensic/thirdparty-install/klee/Release+Asserts/lib/libkleeRuntimePOSIX.bca
KLEE: output directory = "klee-out-143"
KLEE: WARNING: undefined reference to function: fwrite
KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 54549392)
KLEE: WARNING ONCE: calling __user_main with extra arguments.

KLEE: done: total instructions = 14790
KLEE: done: completed paths = 45
KLEE: done: generated tests = 45


This outputs have same coverage than Siemens Inputs, but most of errors in 
erroneous designs are simply not detected, while Simenes inputs detect all 
errors.

Inputs KLEE: Coverage - 92.5532%.  Total Inputs - 45 Detected Errors - 8/41
Inputs Siemens Benchmarks: Coverage - 93.0851%. Total Inputs - 1607, Detected 
Erroneous designs - 41/41

Maybe i have something wrong with KLEE arguments when i execute klee, can 
somebody help me with right klee execution?
I had tested klee with various options, but i still have 45 generated inputs. 
Is it possible to increase somehow number of generated inputs with klee?
Maybe WARNING KLEE: WARNING: undefined reference to function: fwrite aborts 
executions somewhere, but there is no fwrite function in tcas design.

Urmas 
Repinski<INPUTS_tcas_Siemens_arg><tcas_original.c><tcas_modified.c><KLEE_OUTPUT_arg>_______________________________________________
klee-dev mailing list
[email protected]
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev 
_______________________________________________
klee-dev mailing list
[email protected]
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

_______________________________________________
klee-dev mailing list
[email protected]
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

Re: [klee-dev] using klee with Siemens Benchmarks

Reply via email to