On 14. Aug 2020, at 17:33, Alastair Reid <[email protected]
<mailto:[email protected]>> wrote:
Hi,
<mailto:[email protected]>
I want to check what the best way is to deal with missing intrinsics
when adapting KLEE for a new language and new standard libraries.
In particular, what kind of changes should I make and submit as PRs?
I am trying to use KLEE with a modest sized Rust application (about
20kloc) but that depends on over 100 other libraries ("crates") and
Rust's fairly rich standard library.
Given the amount of code involved, it is unsurprising that the
resulting LLVM bitcode files use some intrinsics that KLEE does not
support.
I don't know whether I have a complete list but at least the following
are needed:
maxnum minnum
x86_avx2_pshuf_b x86_ssse3_pshuf_b_128
x86_avx2_pslli_d x86_avx2_psrli_d
x86_avx_ldu_dq_256 x86_sse3_ldu_dq
x86_sse2_pslli_d x86_sse2_psrli_d
x86_sse2_pause
x86_avx_vzeroupper
As far as I can tell, KLEE checks that it accepts all intrinsics
before starting symbolic execution.
So it can reject a program even if there is no feasible path these
intrinsics just because they are in the bitcode file.
I have seen this happen in my current trials (on short test programs
that only use the standard library).
I see three solutions - with various pros/cons
1) Implement all missing intrinsics.
This seems like a good option for target independent intrinsics
like maxnum (IEEE float min/max)
And it might be necessary for x86_sse2_pause because LLVM will
emit this instruction even if you disable SSE2 (this instruction is a
NOP on old x86 cores - it makes spinlocks faster on modern cores).
The disadvantage is that implementing all LLVM intrinsics is a
Sisyphean task.
(I have not even explored what intrinsics are required on Arm -
which is a much more
widespread ISA than x86!)
2) Compile the libraries with AVX and SSE features disabled.
This should fix most of the intrinsics (except for
maxnum/minnum/x86_sse2_pause).
The disadvantage of messing with compilation options is that this
becomes quite hard
to do in large, complex projects.
3) Change KLEE to ignore any missing intrinsics unless it actually
executes them.
This would make option 1 less painful because I would only need to
implement
intrinsics that are truly needed.
I have a change that would do this - but I don't think it is ready
to submit as a PR yet.
(diff attached at end).
(Using this change, I got a bit further with the application - I
was able to start running
the code in KLEE but it then got stuck because one of the
libraries contains a mix of
Rust, C and assembly code. But that's a problem for next week...)
I am currently leaning towards option 3 but I want to check which is
preferred and there are a few possible variants (sketched at the end)
--
Alastair Reid
ps Here is the change. It ignores all intrinsics on the above list
during the Cleaner pass (but still reports an error if it hits them
during execution).
The main goal of this code is to avoid calling
"IL->LowerIntrinsicCall(ii);" in the default case because that LLVM
function fails an assertion if it cannot lower the call.
(The function is defined in LLVM in
llvm/lib/CodeGen/IntrinsicLowering.cpp)
Some things that I am unsure about are:
- I am a bit uneasy about hardcoding the list of intrinsics to ignore.
- Is there a need for a command line option to enable/disable ignoring
unknown intrinsics.
diff --git a/lib/Module/IntrinsicCleaner.cpp
b/lib/Module/IntrinsicCleaner.cpp
index a1d4fdda..e88d5300 100644
--- a/lib/Module/IntrinsicCleaner.cpp
+++ b/lib/Module/IntrinsicCleaner.cpp
@@ -10,6 +10,7 @@
#include "Passes.h"
#include "klee/Config/Version.h"
+#include "klee/Support/ErrorHandling.h"
#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/IR/Constants.h"
@@ -20,6 +21,7 @@
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/IntrinsicsX86.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/Pass.h"
@@ -340,6 +342,33 @@ bool
IntrinsicCleanerPass::runOnBasicBlock(BasicBlock &b, Module &M) {
break;
}
#endif
+
+ // The following intrinsics are known to be missing from KLEE
+ // but can be generated by Rustc and are found in libcore,
libstd,
+ // common Rust crates, etc.
+ //
+ // Adding an intrinsic to this list allows KLEE to execute
programs that
+ // use these intrinsics as long as the paths explored by KLEE
+ // do not invoke these intrinsics.
+ case Intrinsic::maxnum:
+ case Intrinsic::minnum:
+ case Intrinsic::x86_avx2_pshuf_b:
+ case Intrinsic::x86_avx2_pslli_d:
+ case Intrinsic::x86_avx2_psrli_d:
+ case Intrinsic::x86_avx_ldu_dq_256:
+ case Intrinsic::x86_avx_vzeroupper:
+ case Intrinsic::x86_sse2_pause:
+ case Intrinsic::x86_sse2_pslli_d:
+ case Intrinsic::x86_sse2_psrli_d:
+ case Intrinsic::x86_sse3_ldu_dq:
+ case Intrinsic::x86_ssse3_pshuf_b_128:
+ {
+ const Function *Callee = ii->getCalledFunction();
+ llvm::StringRef name = Callee->getName();
+ klee_warning_once((void*)Callee, "unsupported intrinsic
%.*s", (int)name.size(), name.data());
+ break;
+ }
+
default:
IL->LowerIntrinsicCall(ii);
dirty = true;
pps Some alternatives to having a hardcoded list of intrinsics to
ignore are
1) to change the last three lines of the above (i.e., the "default
case") to only invoke
LowerIntrinsicCall if the intrinsic is one of the intrinsics that
it will not barf on.
(LowerIntrinsicCall accepts 44 intrinsics. They are all target
independent.)
2) to change the hardcoded list of intrinsics to ignore to
case x86_3dnow_pavgusb ... x86_xtest :
which will match against all x86 intrinsics.
(However, this also feels fragile - what if a new intrinsic is
added before pavgusb or
after xtest?)
_______________________________________________
klee-dev mailing list
[email protected] <mailto:[email protected]>
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev