On Fri, Apr 2, 2010 at 1:49 PM, Mathieu Desnoyers < [email protected]> wrote:
> * Dany Madden ([email protected]) wrote: > > On Fri, Apr 2, 2010 at 7:34 AM, Mathieu Desnoyers < > > [email protected]> wrote: > > > > > * Dany Madden ([email protected]) wrote: > > > > On Wed, Mar 31, 2010 at 1:56 PM, Mathieu Desnoyers < > > > > [email protected]> wrote: > > > > > > > > > * Mathieu Desnoyers ([email protected]) wrote: > > > > > > * Dany Madden ([email protected]) wrote: > > > > > > > On Wed, Mar 31, 2010 at 10:24 AM, Mathieu Desnoyers < > > > > > > > [email protected]> wrote: > > > > > > > > > > > > > > > * Dany Madden ([email protected]) wrote: > > > > > > > > > Hello, > > > > > > > > > > > > > > > > > > Please take a look at this lttv error with the mainline > kernel > > > > > > > > > 2.6.32.9 on ppc64. > > > > > > > > > > > > > > > > > > lttng: patch-2.6.32.9-lttng-0.198 > > > > > > > > > lttv-0.12.30-02102010 > > > > > > > > > ltt-control-0.79-01022010 > > > > > > > > > > > > > > > > > > LTTV crashed with this error: > > > > > > > > > > > > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion > failed: > > > > > (info != > > > > > > > > NULL) > > > > > > > > > /usr/bin/lttv: line 15: 2696 Aborted > $0.real > > > $* > > > > > > > > > > > > > > > > > > > > > > > > > > > CONFIG_LTT=y > > > > > > > > > CONFIG_LTT_FILTER=m > > > > > > > > > # CONFIG_HAVE_LTT_DUMP_TABLES is not set > > > > > > > > > CONFIG_LTT_RELAY=y > > > > > > > > > CONFIG_LTT_RELAY_LOCKLESS=y > > > > > > > > > # CONFIG_LTT_RELAY_IRQOFF is not set > > > > > > > > > # CONFIG_LTT_RELAY_LOCKED is not set > > > > > > > > > CONFIG_LTT_SERIALIZE=y > > > > > > > > > CONFIG_LTT_FAST_SERIALIZE=y > > > > > > > > > CONFIG_LTT_TRACEPROBES=m > > > > > > > > > CONFIG_LTT_TRACE_CONTROL=m > > > > > > > > > CONFIG_LTT_TRACER=y > > > > > > > > > # CONFIG_LTT_ALIGNMENT is not set > > > > > > > > > CONFIG_LTT_CHECK_ARCH_EFFICIENT_UNALIGNED_ACCESS=y > > > > > > > > > # CONFIG_LTT_DEBUG_EVENT_SIZE is not set > > > > > > > > > CONFIG_LTT_USERSPACE_EVENT=m > > > > > > > > > CONFIG_LTT_VMCORE=y > > > > > > > > > CONFIG_LTT_STATEDUMP=m > > > > > > > > > CONFIG_LTT_KPROBES=m > > > > > > > > > # CONFIG_LTT_ASCII is not set > > > > > > > > > > > > > > > > > > > > > > > > > > > > modprobe ltt-trace-control > > > > > > > > > > modprobe ipc-trace > > > > > > > > > > modprobe kernel-trace > > > > > > > > > > modprobe mm-trace > > > > > > > > > > modprobe fs-trace > > > > > > > > > > modprobe syscall-trace > > > > > > > > > > modprobe trap-trace > > > > > > > > > > > > > > > > > > > ltt-armall > > > > > > > > > > lttctl -C -w /tmp/trace1 trace1 > > > > > > > > > > lttctl -D trace1 > > > > > > > > > > lttv -m textDump -t /tmp/trace1 > > > > > > > > > ** > > > > > > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion > failed: > > > > > (info != > > > > > > > > NULL) > > > > > > > > > /usr/bin/lttv: line 15: 2696 Aborted > $0.real > > > $* > > > > > > > > > > > > > > > > > > This post, > > > > > > > > > > > > > > > > > http://lists.casi.polymtl.ca/pipermail/ltt-dev/2010-February/002189.html, > > > > > > > > > describes the same error on 32 bit system. By not having > the > > > > > > > > > LTT_ALIGNMENT set, shouldn't it just work in my case? > > > > > > > > > > > > > > > > > > Thanks in advance for your help! > > > > > > > > > > > > > > > > A few questions: > > > > > > > > > > > > > > > > - the trace is taken on a ppc64 machine. Is is viewed on a > ppc64 > > > too, > > > > > or > > > > > > > > on a different arch size/endianness ? > > > > > > > > > > > > > > > > > > > > > > Trace collected and viewed on the same ppc64 system. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Can you provide some information about the content of the > trace > > > > > > > > directory ? (files, file size) > > > > > > > > > > > > > > > > > > > > > > The files in the trace directory are all the same size?? > > > > > > > > > > > > > > > /tmp/trace3 # ls -l > > > > > > > total 2448 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 fs_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 input_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 ipc_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 kernel_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 metadata_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 mm_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 module_state_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 softirq_state_3 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_0 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_1 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_2 > > > > > > > -rwxr-xr-x 1 root root 65536 2010-03-31 19:32 swap_state_3 > > > > > > > > > > > > > > > > > > > Oh, that's a pretty good hint. Although I tried to make > everything as > > > > > > generic as possible, I think LTTV assumes that the page size is > 4k. > > > > > > However, recent ppc64 kernels have a 64k page size. So I'm fairly > > > sure > > > > > > that this is where the problem comes from. > > > > > > > > > > > > Can you send me (privately) a tarball of your trace so I can see > what > > > I > > > > > > can do with LTTV to support it ? > > > > > > > > > > OK, got it. Well, your metadata channels are all empty (looked with > > > > > hexedit). So clearly LTTV cannot make sense of the data in the > other > > > > > channels. > > > > > > > > > > So let's try the simple solutions first: try to load all the > > > > > LTTng-related modules you have (I see that a few were missing from > your > > > > > list), and only then run ltt-armall. If that does not work, try to > > > > > compile LTTng as "built in". > > > > > > > > > > > > > Tried building ltt into the kernel and still seeing the assertion > > > failed... > > > > > lttv -m textDump -t /tmp/trace7 > > > > ** > > > > ERROR:tracefile.c:1582:ltt_update_event_size: assertion failed: (info > != > > > > NULL) > > > > /usr/bin/lttv: line 15: 2677 Aborted $0.real $* > > > > > > > > > I just looked at your trace5, and it looks like this one has empty > > > "metadata" channels too. So I am pretty confident that there is a > > > problem with 4k vs 64k alignment, but I cannot test the fix until we > > > find out why the metadata channel is empty. > > > > > > Can you give me the output of: > > > > > > # find /mnt/debugfs/ltt > > > > > I attached the output mnt_debugfs_ltt_content because it is too long to > > list here > > > > > > > ? (or with the appropriate location of debugfs on your system) > > > > > > Normally, we should have the metadata generated by events in: > > > > > > kernel/marker.c: ltt_dump_marker_state() > > > > > > So double-checking that this symbol is present in your kernel might > > > help. Also making sure that CONFIG_MARKERS is set would be good. To > > > confirm that ltt_dump_marker_state() is called, you could add some > > > printk() in that function to see if LTTng is calling it at it should. > > > > > > > CONFIG_MARKERS is set. I added a printk to ltt_dump_marker_state and saw > > that the function is being called. This is the log from > /var/log/messages: > > > > Apr 2 20:06:37 elm17d191 kernel: LTT dump marker state is called ### My > > printk statement ##### > > Apr 2 20:06:37 elm17d191 kernel: > > Apr 2 20:06:37 elm17d191 kernel: LTT state dump begin > > Apr 2 20:06:37 elm17d191 kernel: LTT state dump thread start > > Apr 2 20:06:37 elm17d191 kernel: LTT state dump end > > Apr 2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel > > softirq_state (cpu 0, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 1 events written in channel > > global_state (cpu 0, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 1890 events written in channel > kernel > > (cpu 0, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 525 events written in channel > kernel > > (cpu 1, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel > kernel > > (cpu 2, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 48 events written in channel > kernel > > (cpu 3, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 7 events written in channel fs > (cpu > > 1, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 32 events written in channel block > > (cpu 0, index 0) > > Apr 2 20:07:34 elm17d191 kernel: LTT: 14 events written in channel block > > (cpu 2, index 0) > > > > Hrm, no event written in the metadata channel. Can you check if > CONFIG_IMMEDIATE is active ? If yes, try setting it to 'N'. > > Mathieu, CONFIG_IMMEDIATE is set to y by default. I changed it to 'N' as you suggested and I do get the textDump now. I will need to check to see if disabling this CONFIG is acceptable. Thanks for the help! Dany > Thanks, > > Mathieu > > > > > > > > > > > Thanks, > > > > > > Mathieu > > > > > > > > > > > dany > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Mathieu > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Mathieu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Also, trying to enable "CONFIG_LTT_DEBUG_EVENT_SIZE" might > help > > > by > > > > > > > > providing extra self-check on trace metadata consistency. > > > > > > > > > > > > > > > > Rebuilt the kernel with CONFIG_LTT_DEBUG_EVENT_SIZE=y , > > > recollected > > > > > the > > > > > > > trace and viewed it. I still see the same error and did notice > > > anything > > > > > else > > > > > > > different than it was before. > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Mathieu > > > > > > > > > > > > > > > > > > > > > > > > > > Dany Madden > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > > > > ltt-dev mailing list > > > > > > > > > [email protected] > > > > > > > > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Mathieu Desnoyers > > > > > > > > Operating System Efficiency R&D Consultant > > > > > > > > EfficiOS Inc. > > > > > > > > http://www.efficios.com > > > > > > > > > > > > > > > > > > > > -- > > > > > > Mathieu Desnoyers > > > > > > Operating System Efficiency R&D Consultant > > > > > > EfficiOS Inc. > > > > > > http://www.efficios.com > > > > > > > > > > -- > > > > > Mathieu Desnoyers > > > > > Operating System Efficiency R&D Consultant > > > > > EfficiOS Inc. > > > > > http://www.efficios.com > > > > > > > > > > > > _______________________________________________ > > > > ltt-dev mailing list > > > > [email protected] > > > > http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev > > > > > > > > > -- > > > Mathieu Desnoyers > > > Operating System Efficiency R&D Consultant > > > EfficiOS Inc. > > > http://www.efficios.com > > > > > > > -- > Mathieu Desnoyers > Operating System Efficiency R&D Consultant > EfficiOS Inc. > http://www.efficios.com >
_______________________________________________ ltt-dev mailing list [email protected] http://lists.casi.polymtl.ca/cgi-bin/mailman/listinfo/ltt-dev
