Hey Mathieu: Thanks for looking at this. I'm a bit new to debugging at this level, so you may need to provide me a bit more info on what you need. I attempted to use "pstack" on the lttctl and lttd tasks ... no luck as pstack also locked up.
I put a bit of tracing into liblttctl and discovered that the lockup occurs when "traceName" (whatever traceName happens to be) is written to the "/mnt/debugfs/ltt/destroy_trace" file. I'm guessing that you would like some tracing of the ltt kernel module. Is there something that I can turn on, or another way I could get a stack dump of that module after the lockup? I'll do a little research this weekend on kernel debugging techniques.

I can certainly sprinkle some printk statements into the ltt kernel module source. Doing so provided the following info:

- Control entered _ltt_trace_destroy (single underscore)
- Control entered del_timer_sync(&ltt_async_wakeup_timer) and never returned

Does that help, or should I continue farther down this path?

Thanks
JP

-----Original Message-----
From: Mathieu Desnoyers [mailto:[email protected]]
Sent: Thursday, April 22, 2010 12:06 PM
To: John P. Paul
Cc: [email protected]
Subject: Re: [ltt-dev] lttctl locks up with RT Linux

* [email protected] ([email protected]) wrote:
> Greetings:
>
> I'm using a 2.6.33.2 kernel. I have LTT up and running on the plain vanilla kernel, but "lttctl -D trace1" never returns on the RT version of the same kernel. I've downloaded and integrated the following pieces:
>
> patch-2.6.33.2-lttng-0.211
> ltt-control-0.84-07042010
> lttv-0.12.31.04072010
>
> Note that I've had to manually apply several of the patches from the patch file. I can provide a list if desired.
>
> After the lockup, I can do an ls on the /tmp/trace directory and see that the following files have a non-zero length (the remaining files in the trace directory have a zero length):
>
> fs_0, fs_1, kernel_0, kernel_1
>
> I'm running on an Intel Core2 Duo system. I've built all the LTT components into the kernel, so I do not have to load any modules at runtime. I do execute an ltt-armall prior to issuing any "lttctl -C -w /tmp/trace trace1" commands.
>
> When the above occurs, I usually have to hard power down the machine, as a root-issued "reboot" does not reboot the machine (even after trying to kill the running ltt processes).
>
> Any suggestions on how to get this working under the RT kernel would be appreciated. Does LTT even function properly for RT kernels? If not, it would be of great benefit to have it do so. Please let me know if additional debug info would be helpful.

I bet there is something fishy on RT with __ltt_trace_destroy(). Having an output of where the CPU is stalled in lttng code would help.

> A couple additional notes:
>
> - LTTV docs state that it requires glib 2.4 or greater. I believe this is incorrect due to the following dependency:
>
> $ rpm -qa glib2
> glib2-2.12.3-4.el5_3.1 << my default glib (RHEL5.x base)
>
> state.c: In function 'copy_process_state':
> state.c:1344: error: 'GHashTableIter' undeclared (first use in this function)
>
> I've installed glib-2.22.5 to get around the above issue.

OK, the dependency seems to be glib 2.16 now. Will update the README and LTTng manual accordingly.

Thanks,

Mathieu
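
On the question earlier in this message about getting a stack dump after the lockup: one option that does not require patching the module is to read the per-task files under /proc from a second shell. The following is a minimal Python 3 sketch, not something taken from this thread. The helper names (read_proc_file, find_pids_by_comm, describe_blocked_task) are hypothetical. It assumes the kernel provides /proc/<pid>/comm (listed in proc(5) as available since 2.6.33) and /proc/<pid>/stack (which needs CONFIG_STACKTRACE and usually root), and that the machine still responds well enough to run it. If the task is spinning on a CPU rather than sleeping, the output may not be informative.

```python
import os
import sys


def read_proc_file(pid, name, proc_root="/proc"):
    """Return the stripped text of <proc_root>/<pid>/<name>, or None if unreadable."""
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except OSError:
        # Missing file, missing kernel option, or insufficient privileges.
        return None


def find_pids_by_comm(comm, proc_root="/proc"):
    """Return the sorted PIDs whose 'comm' file matches the given task name."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    pids = []
    for entry in entries:
        # Only numeric directory names are processes.
        if entry.isdigit() and read_proc_file(entry, "comm", proc_root) == comm:
            pids.append(int(entry))
    return sorted(pids)


def describe_blocked_task(pid, proc_root="/proc"):
    """Collect the wait channel and kernel stack (where available) for one task."""
    return {
        "pid": pid,
        "wchan": read_proc_file(pid, "wchan", proc_root),
        "stack": read_proc_file(pid, "stack", proc_root),
    }


if __name__ == "__main__":
    # Task names default to the two processes mentioned in this thread.
    names = sys.argv[1:] or ["lttctl", "lttd"]
    for name in names:
        for pid in find_pids_by_comm(name):
            info = describe_blocked_task(pid)
            print("%s (pid %d)" % (name, info["pid"]))
            print("  wchan: %s" % info["wchan"])
            print("  stack:\n%s" % info["stack"])
```

Run as root from a second terminal once "lttctl -D trace1" has stopped returning. A value of None for the stack only means the file could not be read, not that the task has no kernel stack.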
