On 04/14/2010 02:33 AM, Pierre-Marc Fournier wrote:
Sylvain Geneves wrote:
I've pulled UST from git yesterday, and noticed some errors, and some
are confusing me...
first of all, i have a problem for compiling my program (when i include
marker.h), it gives the following error :
In file included from /usr/local/include/ust/kcompat/kcompat.h:64,
from /usr/local/include/ust/kernelcompat.h:21,
from /usr/local/include/ust/marker.h:31,
from task.lbc.C:4:
/usr/local/include/ust/kcompat/jhash.h: In function ‘u32 jhash(const
void*, u32, u32)’:
/usr/local/include/ust/kcompat/jhash.h:47: error: invalid conversion
from ‘const void*’ to ‘const u8*’
Should be fixed in the latest git. I expect the error here was due to
the usage of strict aliasing rules. This is now disabled in ust because
the lttng code assumes -fno-strict-aliasing.
Here's what i see in the resulting trace directory:
http://pastebin.com/raw.php?i=TqJWarkA
it seems that all traces aren't recorded (some metadata and ust are
zero
size), i can't understand why, i must be missing something here... ?
-f should fix this.
also, when using lttv, it says "Cannot open trace : maybe you should
enter in the trace directory to select it ?"
i note that lttv can open a subdirectory (like
/root/.usttraces/californium-20100329172748911948191/8494_5454077942969735172
in my example), but obviously all i can see is a subset of what really
happenned...
I just looked at this more closely.
thanks for your help
If your program does not fork (nor clone without the CLONE_VM flag),
it's still possible you are tracing several processes. This would be
caused by the fact that the command you are passing to usttrace is not
an elf executable but rather a shell script that does many things like
run commands, which starts new processes. For example, a shell script
that uses find, sed, ls etc. will start them in their distinct process.
The tracer will trace everything. To find out what these processes are,
you could run the same command with strace -f and see what is going on.
the command i launch is actually an executable, not a script. What it
does is quite simple :
- the application registers some events to be processed
- it creates one thread per core on the machine
- each thread executes a loop to execute events
after all events has been processed, each created thread calls
pthread_exit, and the application exits normally.
the output of strace -f shows that there is no call to fork in there,
but clone is called multiple times.
that leads me to a question about how to correctly enable the CLONE_VM
flag for ust : is enabling it through the shell environment (export
CLONE_VM=1) sufficient ?
is there another way to do it ?
The fact that some of the files are empty means that the daemon was
unable to connect to the processes at some point. Indeed, you are
getting printed errors about this. But it is initially able to because
we see there is at least one file that is not empty. This is unlikely
due to the fact the process stops existing too quickly. This could
happen because of an exec() or an exit(), but there is a keep alive
mechanism (triggered by ustctl -f in the case of the exec) that induces
a delay in these calls when the ustd did not have the time to connect to
all buffers yet.
Either there is a bug in this mechanism, or something crashes. Could be
a bug in your program or in ust. When the traced program crashes, only
the buffers that were already connected can be recovered. Enabling core
dumps (ulimit -c) and checking if there are core files in the directory
after execution could help finding if you have segfaults or similar
crashes.
That's a good point.
nothing crashes explicitely (to be sure i ran my program after "ulimit
-c unlimited" and they were no resulting core dump).
Did you have a look at ustd.log and app.log in the trace directory? They
could give additional information.
i did not looked at it enough, since there is some interesting lines
near the end of the files :
in ustd.log, when grepping with Error i get multiple lines like this :
ustd[17643/17973]: Error: unable to parse response to get_pidunique (in
connect_buffer() at ustd.c:268)
ustd[17643/17973]: Error: failed to connect to buffer (in
consumer_thread() at ustd.c:581)
libustcomm[17643/17995]: Error: connect (path=/tmp/ust-app-socks/17992):
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
when grepping with die there are a lot of :
ustd[17643/17710]: application died while putting subbuffer (in
consumer_loop() at ustd.c:513)
ustd[17643/17687]: app died while being traced (in get_subbuffer() at
ustd.c:75)
ustd[17643/17745]: For buffer metadata_2, the trace was not found. This
llibustcomm[17643/17747]: sent message "put_subbuffer ust_0 0" (in
send_messagustd[17643/17745]: application died while putting subbuffer
(in consumer_loop() at ustd.c:513)
ustd[17643/17995]: Warning: unable to connect to process, it probably
died before we were able to connect (in connect_buffer() at ustd.c:250)
in app.log, i can see some lines like this :
libust[17924/17928]: Cannot find trace. It was likely destroyed by the
user. (in do_cmd_put_subbuffer() at tracectl.c:717)
sorry not to have seen this before...
from what i understand, it seems ust is seeing my application dying
unexpectedly. It seems weird to me because the application appears to be
exiting gracefully (no segfault, nor else). What could i do to check
that more thoroughly ?
Are there some ust calls to add in my application to notify ust we will
stop tracing, or the thread will exit ?
It could help to just try to run the raw executable to usttrace. It
would diminish the complexity of the operation. But it is supposed to
work even with lots of processes like here.
Thanks
pmf
_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev