----- On Jun 14, 2017, at 11:55 AM, Thomas McGuire [email protected] wrote:
> Hi,
>
> On 14.06.2017 17:12, Mathieu Desnoyers wrote:
>> Can you provide a copy of the metadata file ? And ideally the data
>> streams too ? This would give us a better idea of what is happening.
>>
>> Do you perform kernel or user-space tracing ? Do you trace huge
>> sequences of bytes within your own tracepoints ?
>
> I perform kernel tracing only, in this case limited to syscalls,
> sched*, block* and irq*. No user-space tracepoints.
>
> I didn't know the metadata file was plain text, I had a quick look into
> it and noticed corruption already, with random garbage data inserted all
> over the place. I'm surprised babeltrace didn't choke on the metadata
> already.

The lttng metadata is "packetized plain-text". What you see is
plain-text in a transport layer which is binary. This explains the
"garbage" you see: those are binary headers for packets.

Use babeltrace -o ctf-metadata to extract the text-only metadata
(which is also valid metadata under CTF). Both packetized and pure
text metadata are allowed.

> I can not provide the data file as it has confidential data. Looking at
> it with a hex editor, I see the same kind of garbage as in the metadata
> file, so both files are affected by the same problem.

The CTF data files are binary, so the garbage you see can be
either headers or padding.

>
> I've uploaded the metadata file to
> http://www.kdab.com/~thomas/stuff/metadata.
>
> To double-check that it isn't file system corruption, I ran
> "yes > test.data" - that file is OK, so it's probably a different problem.
>
> Any idea what can cause the corrupted trace?

Based on your babeltrace backtrace, the possible culprits would
be the events that have a sequence (variable-sized array):

syscalls: select, poll, ppoll, pselect6, epoll_wait, epoll_pwait
block_rq_issue, block_rq_insert, block_rq_complete,
block_rq_requeue, block_rq_abort.

There are a few approaches to cornering the issue.
You can try reproducing on your workload/config by only enabling one of
these events at a time. Just knowing which event(s) is/are the culprit
would be a good start.

Another possibility would be to send us a trace reproducing the issue
with only those events enabled, which should not contain confidential
info about your system.

Thanks,

Mathieu

>
> Regards,
> Thomas
> --
> Thomas McGuire | [email protected] | Senior Software Engineer
> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
> Tel: +49-30-521325470
> KDAB - The Qt Experts
>
>
> _______________________________________________
> lttng-dev mailing list
> [email protected]
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
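
Addendum, for illustration only: the "packetized plain-text" framing
mentioned in the reply can be peeled off by hand to see why the metadata
file looks like text with binary garbage in between. The following is a
minimal sketch, assuming the metadata packet header layout given in the
CTF 1.8 specification (magic 0x75D11D57, 16-byte UUID, checksum, content
size and packet size in bits, then five one-byte fields). The function
name extract_metadata_text is hypothetical and not part of any LTTng or
babeltrace API; babeltrace -o ctf-metadata remains the tool to use in
practice.

```python
import struct

# Assumed layout (CTF 1.8 metadata packet header): magic, uuid[16],
# checksum, content_size (bits), packet_size (bits), compression scheme,
# encryption scheme, checksum scheme, major, minor.
METADATA_MAGIC = 0x75D11D57
_FIELDS = "I16sIIIBBBBB"
HEADER_SIZE = struct.calcsize("<" + _FIELDS)  # 37 bytes, no padding


def extract_metadata_text(raw: bytes) -> str:
    """Return the text of a metadata file, whether packetized or pure text."""
    if len(raw) < HEADER_SIZE:
        return raw.decode("utf-8", errors="replace")

    # The magic number doubles as a byte-order marker for the header.
    if struct.unpack_from("<I", raw)[0] == METADATA_MAGIC:
        order = "<"
    elif struct.unpack_from(">I", raw)[0] == METADATA_MAGIC:
        order = ">"
    else:
        # No magic at offset 0: treat the file as pure-text metadata.
        return raw.decode("utf-8", errors="replace")

    chunks = []
    offset = 0
    while offset + HEADER_SIZE <= len(raw):
        magic, _uuid, _checksum, content_bits, packet_bits, *_rest = (
            struct.unpack_from(order + _FIELDS, raw, offset)
        )
        if magic != METADATA_MAGIC:
            raise ValueError(f"bad packet magic at offset {offset}")
        content_size = content_bits // 8  # header + text, in bytes
        packet_size = packet_bits // 8    # header + text + padding, in bytes
        if not HEADER_SIZE <= content_size <= packet_size:
            raise ValueError(f"inconsistent packet sizes at offset {offset}")
        # Keep only the text payload; skip the header and trailing padding.
        chunks.append(raw[offset + HEADER_SIZE:offset + content_size])
        offset += packet_size
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Called on the bytes of a metadata file (for example
extract_metadata_text(open("metadata", "rb").read())), this returns only
the text between the packet headers. A "bad packet magic" error partway
through the file would suggest the file itself is damaged, as opposed to
merely looking odd in a text editor.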
