Hi,

Can you prepare a README patch for UST that specifies the dependency on glibc 2.6, along with a note saying which symbol is needed (sched_getcpu)? I'll recommend that Pierre-Marc pull it.

Thanks,

Mathieu

* [email protected] ([email protected]) wrote:
> Just an FYI: here's what RedHat tech support said:
>
> There was a feature request for this symbol sched_getcpu in RHEL 5.x.
> Unfortunately, it was not approved due to the following reason. sched_getcpu
> is a GLIBC_2.6 symbol, so can not be added to RHEL5.x glibc, which is glibc
> 2.5. It would require all glibc 2.6 specific symbols get backported. Since
> this is equivalent to a rebase, we would be unable to do this in this major
> release of RHEL. RHEL 6 will be shipped with glibc-2.11, which will be having
> all the required features.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Mon 6/7/2010 3:31 PM
> To: [email protected]
> Cc: Kenneth R. Macfarlane
> Subject: Re: [ltt-dev] LTT UserSpace Tracer, broken?
>
> Greetings:
>
> Thought I'd provide an update here on efforts to get the user space tracer to
> work. Based on this web site:
>
> http://www.kernel.org/doc/man-pages/online/pages/man2/getcpu.2.html
>
> I do see that getcpu() can accept up to three arguments (this detail is
> somewhat hidden when using syscall), depending on the kernel version. Per my
> modified code below, I've tried both of the following:
>
>     int r = syscall(SYS_getcpu,&cpu,NULL);
>
> or
>
>     int r = syscall(SYS_getcpu,&cpu,NULL,NULL);
>
> Either of the above calls still result in a segmentation fault. As an FYI:
> I'm working with a 2.6.18 RedHat base kernel that has glibc 2.5-49 installed.
> Also installed on this same machine is a couple of 2.6.33 kernels (RT and
> non-RT), of which I have also tried. There is no glibc update from RedHat
> that goes beyond what I have listed (per a recent note from tech support). I
> have put in a request for an update that would specifically include
> sched_getcpu() ... I don't expect an update anytime soon.
>
> I spent a good part of a day trying to manually update glibc to v2.11. No
> luck ..
> my attempts usually resulting in having to restore /usr/local due to
> corrupted system calls. I've also tried installing glibc into a different
> location and building the ust library (by either changing arguments to
> configure or hand editing Makefiles) to just look at that new location ...
> this has not been successful either. Might still be some work that could be
> done to allow this type of build.
>
> The above web site that I listed seems to indicate that getcpu was added in
> glibc 2.6 and appeared in kernels starting at 2.6.19 ... not really sure how
> we are seeing some indications of that working in the versions that I listed
> above.
>
> Given these details, unless anyone has any suggestions, I'm currently under
> the assumption that there is a hard/fast dependence on glibc 2.6 for the user
> space tracer.
>
> JP
>
> -----Original Message-----
> From: Pierre-Marc Fournier [mailto:[email protected]]
> Sent: Fri 5/28/2010 12:51 PM
> To: John P. Paul
> Cc: [email protected]; Kenneth R. Macfarlane
> Subject: Re: [ltt-dev] LTT UserSpace Tracer, broken?
>
> On 05/24/2010 01:11 PM, [email protected] wrote:
> > Thanks Pierre-Marc. That will teach me to post something to a public board
> > without double checking the interface first. The "3" below was a cut/paste
> > issue from some glibc code (sched_getcpu.c) and I've replaced that coding
> > line with:
> >
> >     int r = syscall(SYS_getcpu,&cpu);
> >
> > I've verified the proper operation of the above call in a separate test
> > program. I've rebuilt everything after making that change. Unfortunately,
> > that does not get rid of the segmentation fault with usttrace:
> >
> > # usttrace ./ustTest
> > /usr/local/bin/usttrace: line 156: 20724 Segmentation fault $CMD 2>&1
> > Waiting for ustd to shutdown...
> > Trace was output in: /root/.usttraces/machineName-20100524100514656225139
> >
> > Nor does this resolve the issue with the application seg-faulting with ustd:
> >
> > # export UST_AUTOPROBE=1
> > # gcc -o ustTest ustTest.c -lust
> > # mkdir /tmp/trace <- ust-app-socks already present
> > # ustd&
> > # ./ustTest&
> >
> > # ustctl --create-trace 20798
> > # ustctl --start-trace 20798
> >
> > libustcomm[20795/20812]: Error: connect (path=/tmp/ust-app-socks/20798):
> > Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> > ustd[20795/20812]: Warning: unable to connect to process, it probably died
> > before we were able to connect (in connect_buffer() at ustd.c:250)
> > ustd[20795/20812]: Error: failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > libustcomm[20795/20813]: Error: connect (path=/tmp/ust-app-socks/20798):
> > Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> > ustd[20795/20813]: Warning: unable to connect to process, it probably died
> > before we were able to connect (in connect_buffer() at ustd.c:250)
> > ustd[20795/20813]: Error: failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > libustcomm[20795/20814]: Error: connect (path=/tmp/ust-app-socks/20798):
> > Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> > ustd[20795/20814]: Warning: unable to connect to process, it probably died
> > before we were able to connect (in connect_buffer() at ustd.c:250)
> > ustd[20795/20814]: Error: failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > libustcomm[20795/20815]: Error: connect (path=/tmp/ust-app-socks/20798):
> > Connection refused (in ustcomm_connect_path() at ustcomm.c:581)
> > ustd[20795/20815]: Warning: unable to connect to process, it probably died
> > before we were able to connect (in connect_buffer() at ustd.c:250)
> > ustd[20795/20815]: Error: failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > ustd[20795/20810]: Error:
> > failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > ustd[20795/20811]: Error: failed to connect to buffer (in consumer_thread()
> > at ustd.c:581)
> > [6]+ Segmentation fault ./ustTest
> >
> > # ls /tmp/ust-app-socks/
> > 20798 ustd
> >
> > I'm guessing that ustd is complaining as my test application dumped and is
> > no longer active. Looking at a core dump of my test app, it appears that
> > the seg fault occurred at the following line of _rcu_read_unlock():
> >
> >     _STORE_SHARED(rcu_reader->ctr, rcu_reader->ctr - RCU_GP_COUNT);
> >
> > Which was called from ltt_vtrace(). But that only seems to fail when the
> > syscall(getcpu) returns with a -1. I actually changed ltt_vtrace() code as
> > follows:
> >
> > {
> >     // cpu = ust_get_cpu();
> >     int r = syscall(SYS_getcpu,&cpu);
> >     if (r == -1)
> >         cpu = r;
> >     if (cpu == -1)
> >         printf(".. invalid cpu %s (%d)\n", strerror(errno), errno);
> > }
> >
> > And had the following print out:
> >
> > .. invalid cpu Bad address (14)
> >
> > So ... it appears that something isn't working correctly to make that
> > syscall here. Not really sure why this is failing .. maybe a thread
> > related issue? It doesn't fail every time. Maybe best to upgrade to the
> > latest glibc and try again with the inline methods? It is important
> > to note that the following code comes directly from glibc-2.11 and
> > sched_getcpu() can return a -1 upon a failed INLINE_SYSCALL. Would
> > suggest that ltt_vtrace() be changed to properly handle a -1 cpu
> > value:
> >
>
> I believe the above code is failing some of the time with an "invalid
> address" because some pointers are missing in the call. You have only 1
> argument and you need 3.
>
> I am not too enthusiastic at the idea of adding error checking for the
> getcpu call. The call should never fail and this is in the critical path
> of the tracer.
>
> I would consider a patch with some preprocessor logic that chooses the
> right call based on the one available on the system. However, this patch
> must take into account the latest kernels which provide getcpu as a vdso.
>
> By the way, you will get a considerable performance penalty with this old
> libc. UST tries very hard not to make system calls in the tracing
> critical path because they are slow. The recent kernels/glibc's provide
> getcpu/sched_getcpu as a vdso, which helps a lot. If you are doing a
> real system call in the tracing path, this will result in a penalty.
>
> pmf
>
> --
> This is an e-mail from General Dynamics Robotic Systems. It is for the
> intended recipient only and may contain confidential and privileged
> information. No one else may read, print, store, copy, forward or act in
> reliance on it or its attachments. If you are not the intended recipient,
> please return this message to the sender and delete the message and any
> attachments from your computer. Your cooperation is appreciated.
>
> _______________________________________________
> ltt-dev mailing list
> [email protected]
> http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
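
For reference, the preprocessor approach Pierre-Marc describes in the thread might look roughly like the sketch below. It is only an illustration, not code from UST: the helper name example_get_cpu() and the macro EXAMPLE_HAVE_SCHED_GETCPU are hypothetical. It assumes glibc's __GLIBC_PREREQ macro can be used to detect glibc 2.6 or later at compile time, and otherwise falls back to the raw getcpu system call with all three arguments, passing NULL for the two that are not needed. It does not address the performance concern raised in the thread: on the fallback path every call is a real system call.

```c
#define _GNU_SOURCE
#include <sched.h>        /* sched_getcpu(), glibc >= 2.6 */
#include <stddef.h>       /* NULL */
#include <unistd.h>       /* syscall() */
#include <sys/syscall.h>  /* SYS_getcpu, when the kernel headers define it */

/*
 * Hypothetical feature macro (not from UST). The check is nested because
 * __GLIBC_PREREQ(2, 6) cannot be parsed by the preprocessor when the macro
 * itself is undefined (non-glibc C libraries).
 */
#if defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 6)
#  define EXAMPLE_HAVE_SCHED_GETCPU 1
# endif
#endif

/*
 * Hypothetical helper: returns the current CPU number, or -1 on failure.
 */
static int example_get_cpu(void)
{
#if defined(EXAMPLE_HAVE_SCHED_GETCPU)
    /* glibc 2.6 or later exports sched_getcpu(). */
    return sched_getcpu();
#elif defined(SYS_getcpu)
    /*
     * Older glibc: issue the raw system call. It takes three pointer
     * arguments (cpu, node, cache); the unused ones are passed as NULL
     * so the kernel never reads uninitialized register contents.
     */
    unsigned int cpu;

    if (syscall(SYS_getcpu, &cpu, NULL, NULL) == -1)
        return -1;
    return (int) cpu;
#else
    /* Neither interface is available at build time. */
    return -1;
#endif
}
```

On a glibc 2.5 system such as the RHEL 5 setup discussed here, only the fallback branch would be compiled, and whether it succeeds still depends on the running kernel supporting getcpu (2.6.19 or later, according to the man page cited in the thread).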
