Greetings, After some extensive debugging, I found the cause of (1) the discrepancies between multiple runs of Klee, and also (2) the source of failed replays.
(1) The contents of the constant arrays were actually pointers to other memory objects in the symbolic memory. By design, Klee generates addresses for symbolic objects using malloc(), which gives pseudo-random addresses from one run to the other. Hence the differences between consecutive runs. (2) Bug #6690 [1] in Klee causes some wrongful 64bit value truncations. In particular, in the case of pointers, if the pointers had their 32bit MSB non-zero, the states which contained them were dropped if they reached a particular execution point in the file system model. Because of (1), this event was non-deterministic. If one state survived in the source Klee because of low value pointers, it might get dropped at destination, if the new pointer addresses have higher values. Cheers, Stefan [1] http://llvm.org/bugs/show_bug.cgi?id=6690 On Tue, Mar 16, 2010 at 1:14 AM, Stefan Bucur <stefan.bucur at epfl.ch> wrote: > Hi, > > I also tried replay experiments for other test cases, and I managed to > get more details about a particular case where the replay is broken. I > found a situation where the symbolic state differs between the > original and the replay, for the same path. However, I don't know > exactly how to correlate the symbolic names shown in the output of > ExprPPrinter::printConstraints() function with actual variables / > constants, so I would be glad if you could give me any piece of > advice. > > So, when testing the pinky.bc utility, Klee gets to a point when it > executes the _getopt_internal_r function, in file > [klee-uclibc]/libc/unistd/getopt.c:547. The constraints in the > original states look like: > > array arr1[4] : w32 -> w8 = symbolic > array arr2[11] : w32 -> w8 = symbolic > array arr6[4] : w32 -> w8 = symbolic > array arr24[144] : w32 -> w8 = symbolic > array arr32[144] : w32 -> w8 = symbolic > array arr26[144] : w32 -> w8 = symbolic > array arr33[4] : w32 -> w8 = symbolic > array const_arr2[24] : w32 -> w8 = [48 18 0 152 21 127 0 0 16 97 10 > 152 21 127 0 0 171 171 171 171 171 171 171 171] > (query [(Ult N0:(ReadLSB w32 0 arr1) > ? ? ? ? ? ? 2) > ? ? ? ?(Slt 0 N0) > ? ? ? ?(Ult N1:(ReadLSB w32 0 arr6) > ? ? ? ? ? ? 3) > ? ? ? ?(Eq false (Slt 0 N1)) > ? ? ? ?(Eq false > [....] > > while the constraints in the replay, for the same position, are: > > array arr1[4] : w32 -> w8 = symbolic > array arr2[11] : w32 -> w8 = symbolic > array arr3[4] : w32 -> w8 = symbolic > array arr13[144] : w32 -> w8 = symbolic > array const_arr2[24] : w32 -> w8 = [112 18 250 2 0 0 0 0 144 4 2 172 > 89 127 0 0 171 171 171 171 171 171 171 171] > array arr15[144] : w32 -> w8 = symbolic > array arr18[4] : w32 -> w8 = symbolic > array arr17[144] : w32 -> w8 = symbolic > (query [(Ult N0:(ReadLSB w32 0 arr1) > ? ? ? ? ? ? 2) > ? ? ? ?(Slt 0 N0) > ? ? ? ?(Ult N1:(ReadLSB w32 0 arr3) > ? ? ? ? ? ? 3) > ? ? ? ?(Eq false (Slt 0 N1)) > ? ? ? ?(Eq false > [....] > > The rest of the formulas in the constraints (not shown), are built > around the array variables presented above, and they slightly differ. > However, the first thing I noticed was that the const_arr2 (which is > used in the rest of the constraints), differs from one state to > another. In order to see why that happens, it would be useful to know > what is the source of this value in the original program. Do you have > any idea about how can I find that information, or what is the source > of this discrepancy, whatsoever? > > Thank you very much, > Stefan Bucur > > > On Thu, Mar 11, 2010 at 1:08 AM, Stefan Bucur <stefan.bucur at epfl.ch> wrote: >> Greetings, >> >> I was experimenting with Klee and tried to set up the following >> scenario: one Klee instance executes a portion of the symbolic tree, >> then another Klee instance picks up paths from the first instance and >> resumes exploration having those paths as seeds. The second instance >> would perform a replay (basically, start from root, and select states >> to explore only along the seed paths), before resuming execution. >> >> This approach seems to work in general. However, there are cases when >> the seeds replay on the second Klee instance is broken, which means >> that the original path could not be totally reproduced on the second >> instance. I tracked down the problem and found that the replay is >> broken (at least) in the runtime/POSIX/fd.c:903 point, in the ioctl() >> function, after executing a syscall. Moreover, the code that calls >> ioctl() is _stdio_init(), in klee-uclibc, when getting information >> about standard file descriptors 0 and 1. >> >> Regarding this issue, I have two questions: >> >> 1) Does the file system model keep track of each symbolic state >> individually? In other words, does one symbolic state depend on the >> previous execution of other symbolic states? [It shouldn't be so, in >> my opinion, but that would be an explanation for getting a different >> behavior depending on state exploration history.] >> >> 2) How does the replay code in Klee deal with internal forks? My >> replay code replays every fork, be it external or internal, but as far >> as I studied Executor.cpp, the replay feature in Klee deals only with >> external forks. If there are multiple paths stemming from an internal >> fork, how does Klee know which one to pick during replay? This is >> something which I didn't find obvious from studying the source code. >> >> Thanks, >> Stefan >> >
