On 05/24/2010 01:11 PM, [email protected] wrote:
Thanks Pierre-Marc.  That will teach me to post something to a public board without 
double checking the interface first. The "3" below was a cut/paste issue from 
some glibc code (sched_getcpu.c) and I've replaced that coding line with:

   int r = syscall(SYS_getcpu,&cpu);

I've verified the proper operation of the above call in a separate test 
program. I've rebuilt everything after making that change. Unfortunately, that 
does not get rid of the segmentation fault with usttrace:

# usttrace ./ustTest
/usr/local/bin/usttrace: line 156: 20724 Segmentation fault      $CMD 2>&1
Waiting for ustd to shutdown...
Trace was output in:  /root/.usttraces/machineName-20100524100514656225139

Nor does this resolve the issue with the application seg-faulting with ustd:

# export UST_AUTOPROBE=1
# gcc -o ustTest ustTest.c -lust
# mkdir /tmp/trace    <- ust-app-socks already present
# ustd&
# ./ustTest&

# ustctl --create-trace 20798
# ustctl --start-trace 20798

libustcomm[20795/20812]: Error: connect (path=/tmp/ust-app-socks/20798): 
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
ustd[20795/20812]: Warning: unable to connect to process, it probably died 
before we were able to connect (in connect_buffer() at ustd.c:250)
ustd[20795/20812]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
libustcomm[20795/20813]: Error: connect (path=/tmp/ust-app-socks/20798): 
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
ustd[20795/20813]: Warning: unable to connect to process, it probably died 
before we were able to connect (in connect_buffer() at ustd.c:250)
ustd[20795/20813]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
libustcomm[20795/20814]: Error: connect (path=/tmp/ust-app-socks/20798): 
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
ustd[20795/20814]: Warning: unable to connect to process, it probably died 
before we were able to connect (in connect_buffer() at ustd.c:250)
ustd[20795/20814]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
libustcomm[20795/20815]: Error: connect (path=/tmp/ust-app-socks/20798): 
Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
ustd[20795/20815]: Warning: unable to connect to process, it probably died 
before we were able to connect (in connect_buffer() at ustd.c:250)
ustd[20795/20815]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
ustd[20795/20810]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
ustd[20795/20811]: Error: failed to connect to buffer (in consumer_thread() at 
ustd.c:581)
[6]+  Segmentation fault      ./ustTest

# ls /tmp/ust-app-socks/
20798  ustd

I'm guessing that ustd is complaining because my test application dumped core 
and is no longer active. Looking at a core dump of my test app, it appears that 
the seg fault occurred at the following line of _rcu_read_unlock():

   _STORE_SHARED(rcu_reader->ctr, rcu_reader->ctr - RCU_GP_COUNT);

Which was called from ltt_vtrace(). But that only seems to fail when the 
syscall(getcpu) returns with a -1. I actually changed ltt_vtrace() code as 
follows:

{
//      cpu = ust_get_cpu();
   int r = syscall(SYS_getcpu,&cpu);
   if (r == -1)
     cpu = r;
   if (cpu == -1)
     printf(".. invalid cpu %s (%d)\n", strerror(errno), errno);
}

And had the following print out:

.. invalid cpu Bad address (14)

So ... it appears that something isn't working correctly to make that
syscall here. Not really sure why this is failing .. maybe a thread
related issue? It doesn't fail every time. Maybe best to upgrade to the
latest glibc and try again with the inline methods? It is important
to note that the following code comes directly from glibc-2.11 and
sched_getcpu() can return a -1 upon a failed INLINE_SYSCALL. Would
suggest that ltt_vtrace() be changed to properly handle a -1 cpu
value:


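I believe the above code fails some of the time with "Bad address" (EFAULT) because some pointers are missing in the call. You pass only 1 argument, but getcpu takes 3 (cpu, node and tcache). The two you omit are read from whatever happens to be left in the argument registers, and the kernel returns EFAULT when such a leftover value is a non-NULL, invalid pointer. That would explain why it does not fail every time.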
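
As a rough illustration of the three-argument form, here is a minimal sketch. The helper name getcpu_raw() is made up for this example and is not part of UST; it assumes a Linux system whose headers define SYS_getcpu. Passing NULL for the node and tcache pointers tells the kernel not to write anything there.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper (not part of UST): returns the current CPU number,
 * or -1 with errno set if the system call fails. */
static int getcpu_raw(void)
{
    unsigned int cpu = 0;

    /* getcpu takes three pointers: cpu, node and tcache.
     * The two we do not need are passed explicitly as NULL so the
     * kernel never sees leftover register contents. */
    long r = syscall(SYS_getcpu, &cpu, NULL, NULL);

    return (r == -1) ? -1 : (int)cpu;
}
```

Calling getcpu_raw() from the test program in place of the one-argument syscall should show whether the intermittent "Bad address" goes away.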

I am not too enthusiastic about adding error checking to the getcpu call. The call should never fail, and it is in the critical path of the tracer.

I would consider a patch with some preprocessor logic that chooses the right call based on what is available on the system. However, this patch must take into account the latest kernels, which provide getcpu through the vdso.
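
To give an idea of the kind of preprocessor logic meant here, a minimal sketch follows. It is not the actual UST code: the names HAVE_SCHED_GETCPU and example_get_cpu() are invented for the example, and it assumes that sched_getcpu() is available from glibc 2.6 onward. On recent systems glibc's sched_getcpu() can go through the vdso, so preferring it keeps the fast path; the raw system call is only the fallback.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for sched_getcpu() */
#endif
#include <sched.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: sched_getcpu() exists in glibc 2.6 and later. */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 6)
#  define HAVE_SCHED_GETCPU 1  /* hypothetical feature macro */
# endif
#endif

/* Hypothetical wrapper: picks the best available way to get the CPU. */
static inline int example_get_cpu(void)
{
#if defined(HAVE_SCHED_GETCPU)
    /* libc wrapper; may be served by the vdso on recent kernels */
    return sched_getcpu();
#elif defined(SYS_getcpu)
    /* old libc: real system call, with all three arguments */
    unsigned int cpu = 0;
    if (syscall(SYS_getcpu, &cpu, NULL, NULL) == -1)
        return -1;
    return (int)cpu;
#else
# error "no known way to obtain the current CPU on this system"
#endif
}
```

A real patch would probably do this detection at configure time rather than from the glibc version macros, but the selection would have the same shape.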

By the way, you will incur a considerable performance penalty with this old libc. UST tries very hard not to make system calls in the tracing critical path because they are slow. Recent kernels and glibc versions provide getcpu/sched_getcpu through the vdso, which helps a lot. If you are doing a real system call in the tracing path, you pay that penalty.
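
If you want to measure the difference on your machine, something like the following sketch could be used. The helpers avg_ns(), via_libc() and via_syscall() are made up for this example, and it assumes a glibc that provides sched_getcpu(). No numbers are given here because they depend on the kernel, the libc and the hardware.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for sched_getcpu() */
#endif
#include <sched.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: average cost, in nanoseconds, of one call to fn. */
static double avg_ns(int (*fn)(void), long iters)
{
    struct timespec t0, t1;
    volatile int sink = 0;     /* keeps the calls from being optimized away */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        sink += fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    return ((t1.tv_sec - t0.tv_sec) * 1e9 +
            (t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}

/* libc path: may be served by the vdso without entering the kernel */
static int via_libc(void)
{
    return sched_getcpu();
}

/* forced real system call */
static int via_syscall(void)
{
    unsigned int cpu = 0;
    syscall(SYS_getcpu, &cpu, NULL, NULL);
    return (int)cpu;
}
```

Printing avg_ns(via_libc, 1000000) next to avg_ns(via_syscall, 1000000) gives a rough per-call cost for each path.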

pmf

_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
