Hello Daniel, On Thursday 30 July 2009 07:26:58 you wrote: > Hi Vladimir! > > On Wed, Jul 29, 2009 at 10:33 AM, Vladimir > > Kuznetsov<ks.vladimir at gmail.com> wrote: > > Hello, > > > > During last week I have successfully ported KLEE to 64bit. Now it passes > > all checks and unittests. I have also tested the port on several programs > > from coreutils (more detailed testing will be done in the following > > days). > > This is fantastic! Thank you.
> > Initially I have tried using KLEE64 to run 32bit code, however it turns > > out to be a very difficult problem. When KLEE encounters a call to an > > external function, it executes it on behalf of its own 64bit process. > > However 64bit and 32bit functions may have different ABIs and mixing them > > leads to many problems. Examples include malloc() function (returns a > > 64bit address), or scanf() function (when calling with %ld argument it > > expects a pointer to 64bit location). The problem is not even specific to > > KLEE: it can be reproduced in LLVM interpreter as well. > > Right, this case is never really going to work... > > > So my port is focused on running only 64bit code on 64bit KLEE. The > > following is a brief list of changes that I have made: > > Ok. Is it possible to separate the changes for make KLEE work on > 64-bit from the changes to allow KLEE to execute code for 64-bit > targets? In theory they are separate concepts, a 32-bit KLEE should be > able to run 64-bit programs (assuming they do not calling externals). > In practice we don't have build system support for doing this very > easily, but at least conceptually it should work. I think that the main work here is to replace constants in klee/Machine.h by dynamically-detected values. However I have not implemented this so far. > > > - Changed "unsigned" and "int" types to 64bit clean equivalents (like > > size_t, ptrdiff_t, intptr_t, etc.) in various places. The diff is quite > > big: it touches 412 lines in 57 files. > > I'm a bit surprised by this, I don't think there are so many places > that we use 32-bit types which actually need to be 64-bit. If you send > the diff I will look at it. The diff can be logically split into the following changes: - Variables used for array indexes and offsets (as I have told, currently I have made array indexes 64bit) - Places where integer variable holds a value of a pointer (for example in AddressSpace.cpp and in Executor.cpp) - Variables used as indexes for stl containers (STL specifies them as size_t) - not strictly required change, its just to be safe - After more careful review of the patch I see many places where I have changed types by mistake - I should fix it by reverting some of the changes. > > > - Added forced coercion for arguments when calling functions (for > > example, 64bit llvm can pass an i32 argument when a function expects i64 > > one) and for malloc and alloca opcodes. > > The former should only happen when calling through a bitcast; this is > an "interesting" situation, because unfortunately it is very difficult > to accurately model what is really happening (which is usually that > the code is working because of the way the underlying code generator > translates from LLVM to machine code). This issue is present for > 32-bit as well, it just occurs less often, and I'd like to treat this > as a separate issue until I am sure of the best approach. > > The latter conversions are part of LLVM and I have already fixed them in > TOT. > > > - Changed arguments and return types of functions in klee/klee.h to be > > 64bit clean, updated implementation of that functions. > > Ok. I'm not sure what functions really need to change except for > perhaps the byte count arguments. I'm not expecting anyone to need > 4GB objects any time soon. :) I've changes all byte count arguments and also klee_int and klee_range function. > > > > - Changed types in KLEE runtime libs to be 64bit clean and match types in > > system libs. > > Ok. > > > - Changed datatypes of __ctype_* arrays in initializeGlobals() function > > to match its implementations in libc. > > Ok. This is a wart in the current system that I would like to move > elsewhere; I guess we will need some place to put target dependent > "hacks" for making external calls work. > > > - Fixed one bug in stp: BeevMgr::CreateBVConst function assumes that > > sizeof(unsigned long long) > sizeof(unsigned long) > > Nice! > > > - Implemented varargs handling on 64bit. To avoiding dealing with > > separate register save area and deciding what types of args should go to > > what registers, I am simply making a function believe that all varargs > > are on stack. > > Ok. > > > - Temporary disabled Programs/pcregrep.c test on 64bit: the code of the > > test is a decompilation of llvm bytecode compiled for 32bit machine. > > We have a way to mark that a test only works on certain platforms, I > will add this to the test. > > > - Changed Array::getDomain() function to return Expr::Int64 on 64bit > > machines. This fixes many assert failures when accessing arrays. > > > > - Added a special syntax in KQuery language to denote a type used as > > array indexes on current architecture. I have selected "w." syntax for it > > because it requires a very minimal change to parser and lexer. When the > > parser encounters "w." as a type expression, it understands it either as > > w32 or as w64 depending on the architecture. > > > > The last change is quite controversial, so I would like to discuss it > > further. The reason for introducing the new syntax is that currently > > kleaver can only understand arrays with index type matching to the > > pointer type. The "w." syntax is the fastest workaround for this problem. > > > > Another workaround would be to make array index type equal w32 on any > > architecture and always cast indexes to appropriate type when > > constructing Read and Write expressions in KLEE. > > The solution I would like to go with for now (which happens to already > be implemented in TOT) is that KLEE itself always uses 32-bit indices > to reads. Bounds checking, etc, always match the width of the target, > it is just array offsets that are stored as 32-bits. > > The only limitation of this approach is that a single object can't be > larger than 4GB. Since it is pretty inconceivable that anyone would > want to do symbolic execution if they had buffers that size, I don't > see any reason to add the complexity so that KLEE itself selects the > read size based on target pointer width. It seems that at least in one of our project we may want to use such large objects quite soon. Currently the change in KLEE itself is not very complex, it just amounts to replacing Int32 by kMachinePointerType and 4 by kMachinePointerSize in several places. > Note that the expression and solver libraries, on the other hand, > eventually should be able to deal with array indices of arbitrary bit > width. Most of this work should already be done (although it isn't > tested), and we already have syntax and parser support for it in the > KQuery language. The only thing left is to actually add the parameter > to the Array type and update the constructors etc. I could try to do it and check whether it will break any of the unittests. > > -- > > Thanks for doing this work! Could you split your work into a few > separate patches and send them to klee-commits? They don't necessarily > need to apply cleanly to TOT, as long as I can work out the details > enough to merge them in. Done. > > > > A clean solution is, of course, supporting arrays with arbitrary index > > type. Would you suggest me implementing it instead of my workaround ? > > > > > > -- > > With the best regards, > > Vladimir > > _______________________________________________ > > klee-dev mailing list > > klee-dev at keeda.stanford.edu > > http://keeda.Stanford.EDU/mailman/listinfo/klee-dev -- With the best regards, Vladimir
