Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-06 Thread Nowack, Martin
Hi Tareq,

I can highly recommend you the following documents to get an understanding of 
the internal details of debug information and LLVM that KLEE uses in the first 
place:
* https://llvm.org/docs/SourceLevelDebugging.html
* https://llvm.org/docs/HowToUpdateDebugInfo.html

For the examples that you asked for, a good starting point for me are often the 
transformation passes of LLVM, i.e. (llvm-project/llvm/lib/Transforms/) to get 
an idea how different concepts are implemented and might work.
(Just one example: 
https://github.com/llvm/llvm-project/blob/53dc0f107877acad44824b1426986c7f88f4bc50/llvm/lib/Transforms/IPO/MergeFunctions.cpp#L566)

A good IDE support makes a huge difference in working through the source code 
and makes it slightly less painful.

Best,
Martin

> On 5. Oct 2022, at 20:58, Nazir, Tareq Mohammed  wrote:
> 
> Hi Frank,
> 
> Thanks for the reply,
> 
> Yes I have used the second option and -fno-discard-value-names and I am able 
> to see the variable name in the llvm bitcode .ll file. But what I need is the 
> access to this variables in KLEE engine. Also I am trying to use 
> DILocalVariable. I would like to know if it is possible to share any example 
> where DILocalVariable can be used to extract the variable name. This would be 
> really helpful.
> 
> Thanks and Best Regards,
> Tareq Mohammed Nazir
> From: klee-dev-boun...@imperial.ac.uk  on 
> behalf of Frank Busse 
> Sent: Wednesday, 5 October 2022 13:05
> Cc: klee-dev@imperial.ac.uk
> Subject: Re: [klee-dev] Regarding getting variable names from KLEE
>  
> Hi,
> 
> 
> On Wed, 5 Oct 2022 09:26:35 +
> "Nazir, Tareq Mohammed"  wrote:
> 
> > I would like to know if I can get variable names of the source code
> > in KLEE engine. Would also like to know if KLEE engine is storing the
> > variable name information while executing the llvm bitcode
> > symbolically.
> > 
> > 
> > For example : From the below C code I want to get the buffer variable
> > name is it possible or not.
> > 
> > 
> > int main(int argc, char *argv[])
> > {
> >char *buffer = malloc(5);
> >buffer[6] = 'a';
> >return 0;
> > }
> 
> There are at least two options:
> 
> 1) you compile your code with debug information (-g) and recover the
>information from there (LLVM API):
> 
> !15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: 
> !16)
> 
> 2) or you tell clang to keep variable names (-fno-discard-value-names)
>and it will generate bitcode like:
> 
> %buffer = alloca i8*, align 8
> store i8* %call, i8** %buffer, align 8
> %0 = load i8*, i8** %buffer, align 8
> 
>instead of:
> 
> %2 = alloca i8*, align 8
> store i8* %3, i8** %2, align 8
> %4 = load i8*, i8** %2, align 8
> 
> 
> Kind regards,
> 
> Frank
> 
> ___
> klee-dev mailing list
> klee-dev@imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
> ___
> klee-dev mailing list
> klee-dev@imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-05 Thread Nazir, Tareq Mohammed
Hi Frank,


Thanks for the reply,


Yes I have used the second option and -fno-discard-value-names and I am able to 
see the variable name in the llvm bitcode .ll file. But what I need is the 
access to this variables in KLEE engine. Also I am trying to use 
DILocalVariable. I would like to know if it is possible to share any example 
where DILocalVariable can be used to extract the variable name. This would be 
really helpful.


Thanks and Best Regards,

Tareq Mohammed Nazir


From: klee-dev-boun...@imperial.ac.uk  on 
behalf of Frank Busse 
Sent: Wednesday, 5 October 2022 13:05
Cc: klee-dev@imperial.ac.uk
Subject: Re: [klee-dev] Regarding getting variable names from KLEE

Hi,


On Wed, 5 Oct 2022 09:26:35 +
"Nazir, Tareq Mohammed"  wrote:

> I would like to know if I can get variable names of the source code
> in KLEE engine. Would also like to know if KLEE engine is storing the
> variable name information while executing the llvm bitcode
> symbolically.
>
>
> For example : From the below C code I want to get the buffer variable
> name is it possible or not.
>
>
> int main(int argc, char *argv[])
> {
>char *buffer = malloc(5);
>buffer[6] = 'a';
>return 0;
> }

There are at least two options:

1) you compile your code with debug information (-g) and recover the
   information from there (LLVM API):

!15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: !16)

2) or you tell clang to keep variable names (-fno-discard-value-names)
   and it will generate bitcode like:

%buffer = alloca i8*, align 8
store i8* %call, i8** %buffer, align 8
%0 = load i8*, i8** %buffer, align 8

   instead of:

%2 = alloca i8*, align 8
store i8* %3, i8** %2, align 8
%4 = load i8*, i8** %2, align 8


Kind regards,

Frank

___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-05 Thread J. Ryan Stinnett
On Wed, 5 Oct 2022 at 14:10, Nazir, Tareq Mohammed  wrote:
> Yes I would be needing the source-level variable names and source-level info. 
> We are currently trying to understand the vulnerabilities such as heap 
> overflow that can be present in the source code. We are able to understand 
> the how KLEE engine is able to detect a vulnerability. But we need a map 
> between the information available to us from Memory space of KLEE to the 
> source code. It would be really helpful if I can get this map between the 
> KLEE memory space and source code.

Makes sense, thanks for the additional detail. This sounds like my
in-progress enhancement should indeed help you follow source-level
state.

As Frank mentioned elsewhere in the thread, you may be able to use
`-fno-discard-value-names` as a workaround for the moment, but that
doesn't encode any of the source-level semantics that e.g. a debugger
would follow; for proper source-level info, KLEE needs to make use of
the debug info when present in the IR.

I did not see any existing issues about this in the KLEE repo, so I
have filed one just now (https://github.com/klee/klee/issues/1552). We
can use that issue to gauge interest in this enhancement and discuss
any finer details as needed.

Thanks,
Ryan

___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-05 Thread Nazir, Tareq Mohammed

Hi Ryan,


Thanks for the reply,

Yes I would be needing the source-level variable names and source-level info. 
We are currently trying to understand the vulnerabilities such as heap overflow 
that can be present in the source code. We are able to understand the how KLEE 
engine is able to detect a vulnerability. But we need a map between the 
information available to us from Memory space of KLEE to the source code. It 
would be really helpful if I can get this map between the KLEE memory space and 
source code.


Thanks and Best Regards,

Tareq Mohammed Nazir


From: J. Ryan Stinnett 
Sent: Wednesday, 5 October 2022 14:59:43
To: Nazir, Tareq Mohammed
Cc: klee-dev@imperial.ac.uk
Subject: Re: [klee-dev] Regarding getting variable names from KLEE

On Wed, 5 Oct 2022 at 10:28, Nazir, Tareq Mohammed  wrote:
> I would like to know if I can get variable names of the source code in KLEE 
> engine. Would also like to know if KLEE engine is storing the variable name 
> information while executing the llvm bitcode symbolically.

KLEE currently thinks in terms of names at the LLVM IR level only, so
this means it does not know about source-level local variables such as
`buffer` in your example.

When you compile with debug info enabled (e.g. the `-g` option with
Clang), the IR does include additional DWARF-like metadata that maps
back to source-level concepts. My current research involves using KLEE
to test this debug info, and as part of that work, I've been enhancing
KLEE so that it can think in terms of source-level info using this
metadata. I hope to eventually add this enhancement to upstream KLEE,
but it may be a few months before I can do so.

It sounds like this enhancement would be useful to you as well. Do you
need only the source-level variable names, or other source-level info
as well?

- Ryan
___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-05 Thread J. Ryan Stinnett
On Wed, 5 Oct 2022 at 10:28, Nazir, Tareq Mohammed  wrote:
> I would like to know if I can get variable names of the source code in KLEE 
> engine. Would also like to know if KLEE engine is storing the variable name 
> information while executing the llvm bitcode symbolically.

KLEE currently thinks in terms of names at the LLVM IR level only, so
this means it does not know about source-level local variables such as
`buffer` in your example.

When you compile with debug info enabled (e.g. the `-g` option with
Clang), the IR does include additional DWARF-like metadata that maps
back to source-level concepts. My current research involves using KLEE
to test this debug info, and as part of that work, I've been enhancing
KLEE so that it can think in terms of source-level info using this
metadata. I hope to eventually add this enhancement to upstream KLEE,
but it may be a few months before I can do so.

It sounds like this enhancement would be useful to you as well. Do you
need only the source-level variable names, or other source-level info
as well?

- Ryan

___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev


Re: [klee-dev] Regarding getting variable names from KLEE

2022-10-05 Thread Frank Busse
Hi,


On Wed, 5 Oct 2022 09:26:35 +
"Nazir, Tareq Mohammed"  wrote:

> I would like to know if I can get variable names of the source code
> in KLEE engine. Would also like to know if KLEE engine is storing the
> variable name information while executing the llvm bitcode
> symbolically.
> 
> 
> For example : From the below C code I want to get the buffer variable
> name is it possible or not.
> 
> 
> int main(int argc, char *argv[])
> {
>char *buffer = malloc(5);
>buffer[6] = 'a';
>return 0;
> }

There are at least two options:

1) you compile your code with debug information (-g) and recover the
   information from there (LLVM API):

!15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: !16)

2) or you tell clang to keep variable names (-fno-discard-value-names)
   and it will generate bitcode like:

%buffer = alloca i8*, align 8
store i8* %call, i8** %buffer, align 8
%0 = load i8*, i8** %buffer, align 8

   instead of:

%2 = alloca i8*, align 8
store i8* %3, i8** %2, align 8
%4 = load i8*, i8** %2, align 8


Kind regards,

Frank

___
klee-dev mailing list
klee-dev@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev