On 27/10/10 20:53, Valery Reznic wrote:
> OK, you were warned :)
Yes, I was. Still....
How can two programs do the same thing on the same system,
and yet get such different results?
> Let's take 'read' syscall.
> read(10, ....)

I am not talking about a single syscall that behaves differently. I'm talking about tracing the entire history of the process from the time it performed "execve".

If you want, I uploaded the two logs to http://fakeroot-ng.lingnu.com/files/clone-traces.tgz. Both are from strace. The first is strace tracing strace tracing vi. The second is strace attached to the fakeroot-ng daemon while the daemon is monitoring a shell, and the shell is then used to run "vi". This second log continues until I kill the daemon to release it from the deadlock.

Both the inner strace and fakeroot-ng were instructed to log what they find, and that output shows up in "write" calls throughout the logs. The two logs are, of course, not identical, but I failed to find any difference that should matter.

Not all of the logs are interesting. The interesting part starts where the tracer writes that it detected an execve of /usr/bin/vi, and ends a couple of lines after the first time the word "clone" appears after that point. In the trace-strace log, you can see that after releasing the process to perform the clone (PTRACE_SYSCALL), strace performs wait4 twice and gets two notifications, one for the parent thread (3299) and one for the child thread (3300).

In the fakeroot-ng log, you can see the clone(RETURN) log message, identifying the child thread 3885 being created, but all of the waits performed only report the parent thread, 3884. This goes on until wait returns with "nothing more to report", and pselect then hangs, waiting in vain for a signal to arrive.
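
The loop described above (collect notifications until wait has nothing more to report, then sleep in pselect until a signal arrives) can be sketched roughly as follows. This is only an illustration, not fakeroot-ng's actual code: the function names are made up, and the timeout is an addition so that the sketch cannot hang forever the way the daemon does.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* The handler does nothing; it only exists so SIGCHLD interrupts pselect. */
static void on_sigchld(int sig) { (void)sig; }

/* Collect every pending notification without blocking.
 * Returns how many notifications were collected. */
static int drain_notifications(int wait_options)
{
    int count = 0, status;
    for (;;) {
        pid_t pid = waitpid(-1, &status, wait_options | WNOHANG);
        if (pid > 0) { count++; continue; }
        if (pid == -1 && errno == EINTR) continue;
        /* 0: children exist but nothing more to report;
         * -1 with ECHILD: no children this call is allowed to see. */
        return count;
    }
}

/* One round of the loop (hypothetical name): drain, and if nothing was
 * pending, sleep until SIGCHLD or the timeout, then drain again. */
static int wait_one_round(int wait_options, long timeout_sec)
{
    struct sigaction sa = {0};
    sigset_t block, orig, during;
    int n;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, NULL);

    /* Keep SIGCHLD blocked outside pselect so it cannot slip in
     * between the drain and the sleep. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &orig);

    n = drain_notifications(wait_options);
    if (n == 0) {
        struct timespec ts = { timeout_sec, 0 };
        during = orig;
        sigdelset(&during, SIGCHLD);   /* unblocked only while sleeping */
        pselect(0, NULL, NULL, NULL, &ts, &during);
        n = drain_notifications(wait_options);
    }
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return n;
}
```

If a child's notification is never reported by the wait call, and no signal arrives for it either, a loop of this shape without the timeout sleeps in pselect indefinitely, which matches what the second log shows.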

Not in this trace, but had I sent a non-lethal signal, you would see the wait repeated, again saying there is nothing to report, and a hang again. In essence, the difference in the waits should not have happened, as far as I can tell, as the system calls were treated the same.
> I suspect there is something like this in your case.
You have everything you need in order to prove your suspicion. In fact, strace is easily installable from your nearest repository, as are vi and bash. Many will also carry fakeroot-ng, but if not, feel free to pull the latest SVN image and compile it yourself.
> Maybe there is something that strace does and fakeroot-ng doesn't?
I'm sure there is. I just can't figure out what it is. The strace code does not appear to handle this case any differently from, say, a clone that creates a new process (a case that works flawlessly in fakeroot-ng).
> Setting some flag(s) to clone? Calling some system call that affects wait
> behaviour?
Same flags to clone in both cases (vi sets the same flags, and both strace and fakeroot-ng change them to the same different flags).
I'm not aware of any settings that globally affect wait's behavior.
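
Separately from any global setting, the options passed to each individual wait call do change which children it reports. According to wait(2), a child created by clone whose exit signal is not SIGCHLD (as is typical for threads) is reported only when __WCLONE or __WALL is given. Nothing in the traces establishes that this is the cause here; the sketch below only demonstrates the documented behavior on an untraced child, and the helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)

/* The child does nothing and exits with status 0. */
static int child_main(void *arg) { (void)arg; return 0; }

/* Create a "clone child": flags == 0 means the exit signal is 0,
 * not SIGCHLD. Returns the child's pid, or -1 on failure. */
static pid_t spawn_clone_child(void)
{
    char *stack = malloc(STACK_SIZE);
    pid_t pid;

    if (stack == NULL)
        return -1;
    /* Pass the top of the buffer; the stack grows down on common
     * architectures. */
    pid = clone(child_main, stack + STACK_SIZE, 0, NULL);
    /* Without CLONE_VM the child runs on its own copy of the memory,
     * so the parent's buffer can be released right away. */
    free(stack);
    return pid;
}

/* waitpid wrapper that retries on EINTR and hands back errno. */
static pid_t wait_for(pid_t pid, int options, int *err)
{
    int status;
    pid_t r;

    do {
        r = waitpid(pid, &status, options);
    } while (r == -1 && errno == EINTR);
    *err = (r == -1) ? errno : 0;
    return r;
}
```

Calling `wait_for(pid, 0, &err)` on such a child fails with ECHILD even though the child exists, while `wait_for(pid, __WALL, &err)` returns its pid. Comparing the options argument of the wait calls in the two logs would be a cheap way to rule this in or out.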

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com


_______________________________________________
Linux-il mailing list
[email protected]
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
