Re: [klee-dev] Regarding getting variable names from KLEE
Hi Tareq, I can highly recommend you the following documents to get an understanding of the internal details of debug information and LLVM that KLEE uses in the first place: * https://llvm.org/docs/SourceLevelDebugging.html * https://llvm.org/docs/HowToUpdateDebugInfo.html For the examples that you asked for, a good starting point for me are often the transformation passes of LLVM, i.e. (llvm-project/llvm/lib/Transforms/) to get an idea how different concepts are implemented and might work. (Just one example: https://github.com/llvm/llvm-project/blob/53dc0f107877acad44824b1426986c7f88f4bc50/llvm/lib/Transforms/IPO/MergeFunctions.cpp#L566) A good IDE support makes a huge difference in working through the source code and makes it slightly less painful. Best, Martin > On 5. Oct 2022, at 20:58, Nazir, Tareq Mohammed wrote: > > Hi Frank, > > Thanks for the reply, > > Yes I have used the second option and -fno-discard-value-names and I am able > to see the variable name in the llvm bitcode .ll file. But what I need is the > access to this variables in KLEE engine. Also I am trying to use > DILocalVariable. I would like to know if it is possible to share any example > where DILocalVariable can be used to extract the variable name. This would be > really helpful. > > Thanks and Best Regards, > Tareq Mohammed Nazir > From: klee-dev-boun...@imperial.ac.uk on > behalf of Frank Busse > Sent: Wednesday, 5 October 2022 13:05 > Cc: klee-dev@imperial.ac.uk > Subject: Re: [klee-dev] Regarding getting variable names from KLEE > > Hi, > > > On Wed, 5 Oct 2022 09:26:35 + > "Nazir, Tareq Mohammed" wrote: > > > I would like to know if I can get variable names of the source code > > in KLEE engine. Would also like to know if KLEE engine is storing the > > variable name information while executing the llvm bitcode > > symbolically. > > > > > > For example : From the below C code I want to get the buffer variable > > name is it possible or not. > > > > > > int main(int argc, char *argv[]) > > { > >char *buffer = malloc(5); > >buffer[6] = 'a'; > >return 0; > > } > > There are at least two options: > > 1) you compile your code with debug information (-g) and recover the >information from there (LLVM API): > > !15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: > !16) > > 2) or you tell clang to keep variable names (-fno-discard-value-names) >and it will generate bitcode like: > > %buffer = alloca i8*, align 8 > store i8* %call, i8** %buffer, align 8 > %0 = load i8*, i8** %buffer, align 8 > >instead of: > > %2 = alloca i8*, align 8 > store i8* %3, i8** %2, align 8 > %4 = load i8*, i8** %2, align 8 > > > Kind regards, > > Frank > > ___ > klee-dev mailing list > klee-dev@imperial.ac.uk > https://mailman.ic.ac.uk/mailman/listinfo/klee-dev > ___ > klee-dev mailing list > klee-dev@imperial.ac.uk > https://mailman.ic.ac.uk/mailman/listinfo/klee-dev ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Re: [klee-dev] Regarding getting variable names from KLEE
Hi Frank, Thanks for the reply, Yes I have used the second option and -fno-discard-value-names and I am able to see the variable name in the llvm bitcode .ll file. But what I need is the access to this variables in KLEE engine. Also I am trying to use DILocalVariable. I would like to know if it is possible to share any example where DILocalVariable can be used to extract the variable name. This would be really helpful. Thanks and Best Regards, Tareq Mohammed Nazir From: klee-dev-boun...@imperial.ac.uk on behalf of Frank Busse Sent: Wednesday, 5 October 2022 13:05 Cc: klee-dev@imperial.ac.uk Subject: Re: [klee-dev] Regarding getting variable names from KLEE Hi, On Wed, 5 Oct 2022 09:26:35 + "Nazir, Tareq Mohammed" wrote: > I would like to know if I can get variable names of the source code > in KLEE engine. Would also like to know if KLEE engine is storing the > variable name information while executing the llvm bitcode > symbolically. > > > For example : From the below C code I want to get the buffer variable > name is it possible or not. > > > int main(int argc, char *argv[]) > { >char *buffer = malloc(5); >buffer[6] = 'a'; >return 0; > } There are at least two options: 1) you compile your code with debug information (-g) and recover the information from there (LLVM API): !15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: !16) 2) or you tell clang to keep variable names (-fno-discard-value-names) and it will generate bitcode like: %buffer = alloca i8*, align 8 store i8* %call, i8** %buffer, align 8 %0 = load i8*, i8** %buffer, align 8 instead of: %2 = alloca i8*, align 8 store i8* %3, i8** %2, align 8 %4 = load i8*, i8** %2, align 8 Kind regards, Frank ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Re: [klee-dev] Regarding getting variable names from KLEE
On Wed, 5 Oct 2022 at 14:10, Nazir, Tareq Mohammed wrote: > Yes I would be needing the source-level variable names and source-level info. > We are currently trying to understand the vulnerabilities such as heap > overflow that can be present in the source code. We are able to understand > the how KLEE engine is able to detect a vulnerability. But we need a map > between the information available to us from Memory space of KLEE to the > source code. It would be really helpful if I can get this map between the > KLEE memory space and source code. Makes sense, thanks for the additional detail. This sounds like my in-progress enhancement should indeed help you follow source-level state. As Frank mentioned elsewhere in the thread, you may be able to use `-fno-discard-value-names` as a workaround for the moment, but that doesn't encode any of the source-level semantics that e.g. a debugger would follow; for proper source-level info, KLEE needs to make use of the debug info when present in the IR. I did not see any existing issues about this in the KLEE repo, so I have filed one just now (https://github.com/klee/klee/issues/1552). We can use that issue to gauge interest in this enhancement and discuss any finer details as needed. Thanks, Ryan ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Re: [klee-dev] Regarding getting variable names from KLEE
Hi Ryan, Thanks for the reply, Yes I would be needing the source-level variable names and source-level info. We are currently trying to understand the vulnerabilities such as heap overflow that can be present in the source code. We are able to understand the how KLEE engine is able to detect a vulnerability. But we need a map between the information available to us from Memory space of KLEE to the source code. It would be really helpful if I can get this map between the KLEE memory space and source code. Thanks and Best Regards, Tareq Mohammed Nazir From: J. Ryan Stinnett Sent: Wednesday, 5 October 2022 14:59:43 To: Nazir, Tareq Mohammed Cc: klee-dev@imperial.ac.uk Subject: Re: [klee-dev] Regarding getting variable names from KLEE On Wed, 5 Oct 2022 at 10:28, Nazir, Tareq Mohammed wrote: > I would like to know if I can get variable names of the source code in KLEE > engine. Would also like to know if KLEE engine is storing the variable name > information while executing the llvm bitcode symbolically. KLEE currently thinks in terms of names at the LLVM IR level only, so this means it does not know about source-level local variables such as `buffer` in your example. When you compile with debug info enabled (e.g. the `-g` option with Clang), the IR does include additional DWARF-like metadata that maps back to source-level concepts. My current research involves using KLEE to test this debug info, and as part of that work, I've been enhancing KLEE so that it can think in terms of source-level info using this metadata. I hope to eventually add this enhancement to upstream KLEE, but it may be a few months before I can do so. It sounds like this enhancement would be useful to you as well. Do you need only the source-level variable names, or other source-level info as well? - Ryan ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Re: [klee-dev] Regarding getting variable names from KLEE
On Wed, 5 Oct 2022 at 10:28, Nazir, Tareq Mohammed wrote: > I would like to know if I can get variable names of the source code in KLEE > engine. Would also like to know if KLEE engine is storing the variable name > information while executing the llvm bitcode symbolically. KLEE currently thinks in terms of names at the LLVM IR level only, so this means it does not know about source-level local variables such as `buffer` in your example. When you compile with debug info enabled (e.g. the `-g` option with Clang), the IR does include additional DWARF-like metadata that maps back to source-level concepts. My current research involves using KLEE to test this debug info, and as part of that work, I've been enhancing KLEE so that it can think in terms of source-level info using this metadata. I hope to eventually add this enhancement to upstream KLEE, but it may be a few months before I can do so. It sounds like this enhancement would be useful to you as well. Do you need only the source-level variable names, or other source-level info as well? - Ryan ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
Re: [klee-dev] Regarding getting variable names from KLEE
Hi, On Wed, 5 Oct 2022 09:26:35 + "Nazir, Tareq Mohammed" wrote: > I would like to know if I can get variable names of the source code > in KLEE engine. Would also like to know if KLEE engine is storing the > variable name information while executing the llvm bitcode > symbolically. > > > For example : From the below C code I want to get the buffer variable > name is it possible or not. > > > int main(int argc, char *argv[]) > { >char *buffer = malloc(5); >buffer[6] = 'a'; >return 0; > } There are at least two options: 1) you compile your code with debug information (-g) and recover the information from there (LLVM API): !15 = !DILocalVariable(name: "buffer", scope: !10, file: !1, line: 4, type: !16) 2) or you tell clang to keep variable names (-fno-discard-value-names) and it will generate bitcode like: %buffer = alloca i8*, align 8 store i8* %call, i8** %buffer, align 8 %0 = load i8*, i8** %buffer, align 8 instead of: %2 = alloca i8*, align 8 store i8* %3, i8** %2, align 8 %4 = load i8*, i8** %2, align 8 Kind regards, Frank ___ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev