Hi Vladimir! On Wed, Jul 29, 2009 at 10:33 AM, Vladimir Kuznetsov<ks.vladimir at gmail.com> wrote: > Hello, > > During last week I have successfully ported KLEE to 64bit. Now it passes all > checks and unittests. I have also tested the port on several programs from > coreutils (more detailed testing will be done in the following days).
This is fantastic! > Initially I have tried using KLEE64 to run 32bit code, however it turns out to > be a very difficult problem. When KLEE encounters a call to an external > function, it executes it on behalf of its own 64bit process. However 64bit and > 32bit functions may have different ABIs and mixing them leads to many > problems. > Examples include malloc() function (returns a 64bit address), or scanf() > function (when calling with %ld argument it expects a pointer to 64bit > location). The problem is not even specific to KLEE: it can be reproduced in > LLVM interpreter as well. Right, this case is never really going to work... > So my port is focused on running only 64bit code on 64bit KLEE. The following > is a brief list of changes that I have made: Ok. Is it possible to separate the changes for make KLEE work on 64-bit from the changes to allow KLEE to execute code for 64-bit targets? In theory they are separate concepts, a 32-bit KLEE should be able to run 64-bit programs (assuming they do not calling externals). In practice we don't have build system support for doing this very easily, but at least conceptually it should work. > - Changed "unsigned" and "int" types to 64bit clean equivalents (like size_t, > ptrdiff_t, intptr_t, etc.) in various places. The diff is quite big: it > touches > 412 lines in 57 files. I'm a bit surprised by this, I don't think there are so many places that we use 32-bit types which actually need to be 64-bit. If you send the diff I will look at it. > - Added forced coercion for arguments when calling functions (for example, > 64bit llvm can pass an i32 argument when a function expects i64 one) and for > malloc and alloca opcodes. The former should only happen when calling through a bitcast; this is an "interesting" situation, because unfortunately it is very difficult to accurately model what is really happening (which is usually that the code is working because of the way the underlying code generator translates from LLVM to machine code). This issue is present for 32-bit as well, it just occurs less often, and I'd like to treat this as a separate issue until I am sure of the best approach. The latter conversions are part of LLVM and I have already fixed them in TOT. > - Changed arguments and return types of functions in klee/klee.h to be 64bit > clean, updated implementation of that functions. Ok. I'm not sure what functions really need to change except for perhaps the byte count arguments. I'm not expecting anyone to need >4GB objects any time soon. :) > - Changed types in KLEE runtime libs to be 64bit clean and match types in > system libs. Ok. > - Changed datatypes of __ctype_* arrays in initializeGlobals() function to > match its implementations in libc. Ok. This is a wart in the current system that I would like to move elsewhere; I guess we will need some place to put target dependent "hacks" for making external calls work. > - Fixed one bug in stp: BeevMgr::CreateBVConst function assumes that > sizeof(unsigned long long) > sizeof(unsigned long) Nice! > - Implemented varargs handling on 64bit. To avoiding dealing with separate > register save area and deciding what types of args should go to what > registers, I am simply making a function believe that all varargs are on > stack. Ok. > - Temporary disabled Programs/pcregrep.c test on 64bit: the code of the test > is a decompilation of llvm bytecode compiled for 32bit machine. We have a way to mark that a test only works on certain platforms, I will add this to the test. > - Changed Array::getDomain() function to return Expr::Int64 on 64bit machines. > This fixes many assert failures when accessing arrays. > > - Added a special syntax in KQuery language to denote a type used as array > indexes on current architecture. I have selected "w." syntax for it because it > requires a very minimal change to parser and lexer. When the parser encounters > "w." as a type expression, it understands it either as w32 or as w64 depending > on the architecture. > > The last change is quite controversial, so I would like to discuss it further. > The reason for introducing the new syntax is that currently kleaver can only > understand arrays with index type matching to the pointer type. The "w." > syntax is the fastest workaround for this problem. > > Another workaround would be to make array index type equal w32 on any > architecture and always cast indexes to appropriate type when constructing > Read and Write expressions in KLEE. The solution I would like to go with for now (which happens to already be implemented in TOT) is that KLEE itself always uses 32-bit indices to reads. Bounds checking, etc, always match the width of the target, it is just array offsets that are stored as 32-bits. The only limitation of this approach is that a single object can't be larger than 4GB. Since it is pretty inconceivable that anyone would want to do symbolic execution if they had buffers that size, I don't see any reason to add the complexity so that KLEE itself selects the read size based on target pointer width. Note that the expression and solver libraries, on the other hand, eventually should be able to deal with array indices of arbitrary bit width. Most of this work should already be done (although it isn't tested), and we already have syntax and parser support for it in the KQuery language. The only thing left is to actually add the parameter to the Array type and update the constructors etc. -- Thanks for doing this work! Could you split your work into a few separate patches and send them to klee-commits? They don't necessarily need to apply cleanly to TOT, as long as I can work out the details enough to merge them in. - Daniel > A clean solution is, of course, supporting arrays with arbitrary index type. > Would > you suggest me implementing it instead of my workaround ? > > -- > With the best regards, > Vladimir > _______________________________________________ > klee-dev mailing list > klee-dev at keeda.stanford.edu > http://keeda.Stanford.EDU/mailman/listinfo/klee-dev >
