Thanks Cristian and Martin, The pointer to lib/Core/Executor.cpp was exactly what I needed to finish implementing this and I've submitted a PR.
And, in case anybody else out there is trying to use KLEE with Rust at the moment, here is a list of the PRs that I am currently merging into master in my work. These are sufficient that I can verify properties of programs that use Rust data structures (Vec, BTreeMap, Option, enums, traits, etc.). https://github.com/klee/klee/pull/1295 More robust handling of unknown intrinsics https://github.com/klee/klee/pull/1290 Stub definition of __cxa_thread_atexit_impl https://github.com/klee/klee/pull/1271 [WIP] Handle global variables (The last one is a WIP change by Martin that he mentioned here https://github.com/klee/klee/issues/1263#issuecomment-643254348) -- Alastair On Fri, Aug 14, 2020 at 6:21 PM Cristian Cadar <c.ca...@imperial.ac.uk> wrote: > +1 to this, we should only terminate those paths that hit unhandled > intrinsics. This is what we do with unavailable external functions and > inline assembly for instance, so we should adopt the same behaviour here. > > Best, > Cristian > > On 14/08/2020 18:06, Nowack, Martin wrote: > > Hi Alastair, > > > > I would also rather use option 3. > > Normally, the `LowerIntrinsic` pass should try to replace appropriate > > calls if possible. It should not terminate it. > > > > Beside your attached code, I would suggest to change the following code > > line in `lib/Core/Executor.cpp`: > > > https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Core/Executor.cpp#L1514 > > > > Instead of an `klee_error` which terminates the whole KLEE execution, > > one could terminate the current path. > > This allows to explore other paths that don’t contain such instructions > > or at least explore a path until it reaches such instruction, which I > > guess might be valuable in itself. > > > > > > As you mentioned inline assembly, there is a currently a pass which > > tries to lift mostly memory barriers into LLVM intrinsics that are > > handles as NOPs. More complex things are hard to handle. > > ( > https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Module/RaiseAsm.cpp > ) > > > > Hope that helps. > > > > Best, > > Martin > > > >> On 14. Aug 2020, at 17:33, Alastair Reid <adr...@google.com > >> <mailto:adr...@google.com>> wrote: > >> > >> Hi, > >> <mailto:klee-dev@imperial.ac.uk> > >> > >> I want to check what the best way is to deal with missing intrinsics > >> when adapting KLEE for a new language and new standard libraries. > >> In particular, what kind of changes should I make and submit as PRs? > >> > >> I am trying to use KLEE with a modest sized Rust application (about > >> 20kloc) but that depends on over 100 other libraries ("crates") and > >> Rust's fairly rich standard library. > >> Given the amount of code involved, it is unsurprising that the > >> resulting LLVM bitcode files use some intrinsics that KLEE does not > >> support. > >> I don't know whether I have a complete list but at least the following > >> are needed: > >> > >> maxnum minnum > >> x86_avx2_pshuf_b x86_ssse3_pshuf_b_128 > >> x86_avx2_pslli_d x86_avx2_psrli_d > >> x86_avx_ldu_dq_256 x86_sse3_ldu_dq > >> x86_sse2_pslli_d x86_sse2_psrli_d > >> x86_sse2_pause > >> x86_avx_vzeroupper > >> > >> As far as I can tell, KLEE checks that it accepts all intrinsics > >> before starting symbolic execution. > >> So it can reject a program even if there is no feasible path these > >> intrinsics just because they are in the bitcode file. > >> I have seen this happen in my current trials (on short test programs > >> that only use the standard library). > >> > >> > >> I see three solutions - with various pros/cons > >> > >> 1) Implement all missing intrinsics. > >> This seems like a good option for target independent intrinsics > >> like maxnum (IEEE float min/max) > >> And it might be necessary for x86_sse2_pause because LLVM will > >> emit this instruction even if you disable SSE2 (this instruction is a > >> NOP on old x86 cores - it makes spinlocks faster on modern cores). > >> > >> The disadvantage is that implementing all LLVM intrinsics is a > >> Sisyphean task. > >> (I have not even explored what intrinsics are required on Arm - > >> which is a much more > >> widespread ISA than x86!) > >> > >> 2) Compile the libraries with AVX and SSE features disabled. > >> This should fix most of the intrinsics (except for > >> maxnum/minnum/x86_sse2_pause). > >> > >> The disadvantage of messing with compilation options is that this > >> becomes quite hard > >> to do in large, complex projects. > >> > >> 3) Change KLEE to ignore any missing intrinsics unless it actually > >> executes them. > >> This would make option 1 less painful because I would only need to > >> implement > >> intrinsics that are truly needed. > >> > >> I have a change that would do this - but I don't think it is ready > >> to submit as a PR yet. > >> (diff attached at end). > >> > >> (Using this change, I got a bit further with the application - I > >> was able to start running > >> the code in KLEE but it then got stuck because one of the > >> libraries contains a mix of > >> Rust, C and assembly code. But that's a problem for next week...) > >> > >> I am currently leaning towards option 3 but I want to check which is > >> preferred and there are a few possible variants (sketched at the end) > >> > >> -- > >> Alastair Reid > >> > >> ps Here is the change. It ignores all intrinsics on the above list > >> during the Cleaner pass (but still reports an error if it hits them > >> during execution). > >> > >> The main goal of this code is to avoid calling > >> "IL->LowerIntrinsicCall(ii);" in the default case because that LLVM > >> function fails an assertion if it cannot lower the call. > >> (The function is defined in LLVM in > >> llvm/lib/CodeGen/IntrinsicLowering.cpp) > >> > >> Some things that I am unsure about are: > >> - I am a bit uneasy about hardcoding the list of intrinsics to ignore. > >> - Is there a need for a command line option to enable/disable ignoring > >> unknown intrinsics. > >> > >> diff --git a/lib/Module/IntrinsicCleaner.cpp > >> b/lib/Module/IntrinsicCleaner.cpp > >> index a1d4fdda..e88d5300 100644 > >> --- a/lib/Module/IntrinsicCleaner.cpp > >> +++ b/lib/Module/IntrinsicCleaner.cpp > >> @@ -10,6 +10,7 @@ > >> #include "Passes.h" > >> > >> #include "klee/Config/Version.h" > >> +#include "klee/Support/ErrorHandling.h" > >> #include "llvm/Analysis/MemoryBuiltins.h" > >> #include "llvm/Analysis/ConstantFolding.h" > >> #include "llvm/IR/Constants.h" > >> @@ -20,6 +21,7 @@ > >> #include "llvm/IR/Instruction.h" > >> #include "llvm/IR/Instructions.h" > >> #include "llvm/IR/IntrinsicInst.h" > >> +#include "llvm/IR/IntrinsicsX86.h" > >> #include "llvm/IR/Module.h" > >> #include "llvm/IR/Type.h" > >> #include "llvm/Pass.h" > >> @@ -340,6 +342,33 @@ bool > >> IntrinsicCleanerPass::runOnBasicBlock(BasicBlock &b, Module &M) { > >> break; > >> } > >> #endif > >> + > >> + // The following intrinsics are known to be missing from KLEE > >> + // but can be generated by Rustc and are found in libcore, > >> libstd, > >> + // common Rust crates, etc. > >> + // > >> + // Adding an intrinsic to this list allows KLEE to execute > >> programs that > >> + // use these intrinsics as long as the paths explored by KLEE > >> + // do not invoke these intrinsics. > >> + case Intrinsic::maxnum: > >> + case Intrinsic::minnum: > >> + case Intrinsic::x86_avx2_pshuf_b: > >> + case Intrinsic::x86_avx2_pslli_d: > >> + case Intrinsic::x86_avx2_psrli_d: > >> + case Intrinsic::x86_avx_ldu_dq_256: > >> + case Intrinsic::x86_avx_vzeroupper: > >> + case Intrinsic::x86_sse2_pause: > >> + case Intrinsic::x86_sse2_pslli_d: > >> + case Intrinsic::x86_sse2_psrli_d: > >> + case Intrinsic::x86_sse3_ldu_dq: > >> + case Intrinsic::x86_ssse3_pshuf_b_128: > >> + { > >> + const Function *Callee = ii->getCalledFunction(); > >> + llvm::StringRef name = Callee->getName(); > >> + klee_warning_once((void*)Callee, "unsupported intrinsic > >> %.*s", (int)name.size(), name.data()); > >> + break; > >> + } > >> + > >> default: > >> IL->LowerIntrinsicCall(ii); > >> dirty = true; > >> > >> > >> > >> > >> pps Some alternatives to having a hardcoded list of intrinsics to > >> ignore are > >> > >> 1) to change the last three lines of the above (i.e., the "default > >> case") to only invoke > >> LowerIntrinsicCall if the intrinsic is one of the intrinsics that > >> it will not barf on. > >> (LowerIntrinsicCall accepts 44 intrinsics. They are all target > >> independent.) > >> > >> 2) to change the hardcoded list of intrinsics to ignore to > >> > >> case x86_3dnow_pavgusb ... x86_xtest : > >> > >> which will match against all x86 intrinsics. > >> (However, this also feels fragile - what if a new intrinsic is > >> added before pavgusb or > >> after xtest?) > >> > >> _______________________________________________ > >> klee-dev mailing list > >> klee-dev@imperial.ac.uk <mailto:klee-dev@imperial.ac.uk> > >> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev > > > > > > _______________________________________________ > > klee-dev mailing list > > klee-dev@imperial.ac.uk > > https://mailman.ic.ac.uk/mailman/listinfo/klee-dev > > >
_______________________________________________ klee-dev mailing list klee-dev@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/klee-dev