Just an FYI: here's what RedHat tech support said:

There was a feature request for the sched_getcpu symbol in RHEL 5.x. 
Unfortunately, it was not approved, for the following reason: sched_getcpu is 
a GLIBC_2.6 symbol, so it cannot be added to the RHEL 5.x glibc, which is 
glibc 2.5. Adding it would require backporting all glibc 2.6-specific symbols. 
Since this is equivalent to a rebase, we are unable to do this in this major 
release of RHEL. RHEL 6 will ship with glibc-2.11, which will have all the 
required features.

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Mon 6/7/2010 3:31 PM
To: [email protected]
Cc: Kenneth R. Macfarlane
Subject: Re: [ltt-dev] LTT UserSpace Tracer, broken?
 
Greetings:

Thought I'd provide an update here on efforts to get the user space tracer to 
work. Based on this web site:

http://www.kernel.org/doc/man-pages/online/pages/man2/getcpu.2.html

I do see that getcpu() can accept up to three arguments (this detail is 
somewhat hidden when using syscall), depending on the kernel version. Per my 
modified code below, I've tried both of the following:

  int r = syscall(SYS_getcpu,&cpu,NULL);

or

  int r = syscall(SYS_getcpu,&cpu,NULL,NULL);

Either of the above calls still results in a segmentation fault. As an FYI: I'm 
working with a 2.6.18 RedHat base kernel that has glibc 2.5-49 installed. Also 
installed on this same machine are a couple of 2.6.33 kernels (RT and non-RT), 
which I have also tried. There is no glibc update from RedHat that goes 
beyond what I have listed (per a recent note from tech support). I have put in 
a request for an update that would specifically include sched_getcpu() ... I 
don't expect an update anytime soon.

I spent a good part of a day trying to manually update glibc to v2.11. No 
luck ... my attempts usually ended with having to restore /usr/local due to 
corrupted system calls. I've also tried installing glibc into a different 
location and building the ust library (by either changing arguments to 
configure or hand-editing Makefiles) to look only at that new location ... this 
has not been successful either. There might still be some work that could be 
done to allow this type of build.

The above web site that I listed seems to indicate that getcpu was added in 
glibc 2.6 and appeared in kernels starting at 2.6.19 ... not really sure how we 
are seeing some indications of that working in the versions that I listed above.

Given these details, unless anyone has any suggestions, I'm currently under the 
assumption that there is a hard/fast dependence on glibc 2.6 for the user space 
tracer.

JP

-----Original Message-----
From: Pierre-Marc Fournier [mailto:[email protected]]
Sent: Fri 5/28/2010 12:51 PM
To: John P. Paul
Cc: [email protected]; Kenneth R. Macfarlane
Subject: Re: [ltt-dev] LTT UserSpace Tracer, broken?
 
On 05/24/2010 01:11 PM, [email protected] wrote:
> Thanks Pierre-Marc.  That will teach me to post something to a public board 
> without double checking the interface first. The "3" below was a cut/paste 
> issue from some glibc code (sched_getcpu.c) and I've replaced that coding 
> line with:
>
>    int r = syscall(SYS_getcpu,&cpu);
>
> I've verified the proper operation of the above call in a separate test 
> program. I've rebuilt everything after making that change. Unfortunately, 
> that does not get rid of the segmentation fault with usttrace:
>
> # usttrace ./ustTest
> /usr/local/bin/usttrace: line 156: 20724 Segmentation fault      $CMD 2>&1
> Waiting for ustd to shutdown...
> Trace was output in:  /root/.usttraces/machineName-20100524100514656225139
>
> Nor does this resolve the issue with the application seg-faulting with ustd:
>
> # export UST_AUTOPROBE=1
> # gcc -o ustTest ustTest.c -lust
> # mkdir /tmp/trace    <- ust-app-socks already present
> # ustd&
> # ./ustTest&
>
> # ustctl --create-trace 20798
> # ustctl --start-trace 20798
>
> libustcomm[20795/20812]: Error: connect (path=/tmp/ust-app-socks/20798): 
> Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20812]: Warning: unable to connect to process, it probably died 
> before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20812]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> libustcomm[20795/20813]: Error: connect (path=/tmp/ust-app-socks/20798): 
> Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20813]: Warning: unable to connect to process, it probably died 
> before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20813]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> libustcomm[20795/20814]: Error: connect (path=/tmp/ust-app-socks/20798): 
> Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20814]: Warning: unable to connect to process, it probably died 
> before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20814]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> libustcomm[20795/20815]: Error: connect (path=/tmp/ust-app-socks/20798): 
> Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> ustd[20795/20815]: Warning: unable to connect to process, it probably died 
> before we were able to connect (in connect_buffer() at ustd.c:250)
> ustd[20795/20815]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> ustd[20795/20810]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> ustd[20795/20811]: Error: failed to connect to buffer (in consumer_thread() 
> at ustd.c:581)
> [6]+  Segmentation fault      ./ustTest
>
> # ls /tmp/ust-app-socks/
> 20798  ustd
>
> I'm guessing that ustd is complaining as my test application dumped and is no 
> longer active. Looking at a core dump of my test app, it appears that the seg 
> fault occurred at the following line of _rcu_read_unlock():
>
>    _STORE_SHARED(rcu_reader->ctr, rcu_reader->ctr - RCU_GP_COUNT);
>
> Which was called from ltt_vtrace(). But that only seems to fail when the 
> syscall(getcpu) returns with a -1. I actually changed ltt_vtrace() code as 
> follows:
>
> {
> //    cpu = ust_get_cpu();
>    int r = syscall(SYS_getcpu,&cpu);
>    if (r == -1)
>      cpu = r;
>    if (cpu == -1)
>      printf(".. invalid cpu %s (%d)\n", strerror(errno), errno);
> }
>
> And had the following print out:
>
> .. invalid cpu Bad address (14)
>
> So ... it appears that something isn't working correctly to make that
> syscall here. Not really sure why this is failing .. maybe a thread
> related issue? It doesn't fail every time. Maybe best to upgrade the
> latest glibc and try again with the inline methods? It is important
> to note that the following code comes directly from glibc-2.11 and
> sched_getcpu() can return a -1 upon a failed INLINE_SYSCALL. Would
> suggest that ltt_vtrace() be changed to properly handle a -1 cpu
> value:
>

I believe the above code is failing some of the time with an "invalid 
address" because some pointers are missing in the call. You have only 1 
argument and you need 3.

I am not too enthusiastic about the idea of adding error checking for the 
getcpu call. The call should never fail and this is in the critical path 
of the tracer.

I would consider a patch with some preprocessor logic that chooses the 
right call based on the one available on the system. However, this patch 
must take into account the latest kernels which provide getcpu as a vdso.

By the way, you will incur a considerable performance penalty with this old 
libc. UST tries very hard not to make system calls in the tracing 
critical path because they are slow. The recent kernels/glibc's provide 
getcpu/sched_getcpu as a vdso, which helps a lot. If you are doing a 
real system call in the tracing path, this will result in a penalty.

pmf




--
This is an e-mail from General Dynamics Robotic Systems. It is for the intended 
recipient only and may contain confidential and privileged information. No one 
else may read, print, store, copy, forward or act in reliance on it or its 
attachments. If you are not the intended recipient, please return this message 
to the sender and delete the message and any attachments from your computer. 
Your cooperation is appreciated.


_______________________________________________
ltt-dev mailing list
[email protected]
http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev


