On 2017-06-23 11:48, Thomas McGuire wrote:
>>> Any idea what can cause the corrupted trace?
>> Based on your babeltrace backtrace, the possible culprits would be the
>> events that have a sequence (variable-sized array):
>>
>> syscalls: select, poll, ppoll, pselect6, epoll_wait, epoll_pwait
>>
>> block_rq_issue, block_rq_insert, block_rq_complete, block_rq_requeue,
>> block_rq_abort.
>>
>> There are a few approaches to cornering the issue. You can try reproducing
>> on your workload/config by only enabling one of these events at a time.
>> Just knowing which event(s) is/are the culprit would be a good start.
>>
>> Another possibility would be to send us a trace reproducing the issue
>> with only those events enabled, which should not contain confidential
>> info about your system.
>
> I've added some debug statements to babeltrace now. The culprit in this
> particular case is the first block_rq_complete event, the __cmd_length
> field contains a large value (3040877592). __cmd_length is used as the
> length for the _cmd sequence, and then of course allocating space for
> that sequence fails.
>
> Any idea what can cause __cmd_length to be bogus?

Hi Thomas,

I see from the metadata file you provided that your kernel version is
4.9.28-20170428-1. Is it built from vanilla kernel sources? If not, could
you point us to a git repo or source archive? It would help a lot to
figure this out.

Thanks,

Michael
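
To illustrate the failure mode described in the quoted message (a bogus
length field driving the allocation of a sequence), here is a minimal
sketch of a sanity check a trace reader could apply before allocating.
The layout (a little-endian uint32 length followed by the elements) and
the names read_sequence and CorruptSequenceError are assumptions made for
this illustration; this is not babeltrace's actual decoding code.

```python
import struct
from typing import Tuple


class CorruptSequenceError(ValueError):
    """Raised when a declared sequence length cannot fit in the packet."""


def read_sequence(buf: bytes, offset: int, elem_size: int = 1) -> Tuple[bytes, int]:
    """Read a uint32 length field, then that many elements.

    Hypothetical layout for illustration only: a little-endian uint32
    length immediately followed by `length * elem_size` payload bytes.
    Returns the payload and the offset just past it.
    """
    if offset + 4 > len(buf):
        raise CorruptSequenceError("truncated length field")

    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    remaining = len(buf) - start

    # Validate before allocating: the payload can never be larger than
    # the bytes left in the packet, so a larger value must be bogus.
    if length * elem_size > remaining:
        raise CorruptSequenceError(
            f"length {length} exceeds {remaining} remaining bytes"
        )

    end = start + length * elem_size
    return buf[start:end], end
```

With a check like this, a value such as 3040877592 would be reported as a
corrupt field at the offending event instead of surfacing later as a
failed allocation.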
