Re: [lttng-dev] Allocation failures with babeltrace and TraceCompass - corrupt trace?

Michael Jeanson Fri, 23 Jun 2017 13:02:46 -0700

On 2017-06-23 11:48, Thomas McGuire wrote:
>>> Any idea what can cause the corrupted trace?
>> Based on your babeltrace backtrace, the possible culprits would be the
>> events that have a sequence (variable-sized array):
>>
>> syscalls: select, poll, ppoll, pselect6, epoll_wait, epoll_pwait
>>
>> block_rq_issue, block_rq_insert, block_rq_complete, block_rq_requeue, 
>> block_rq_abort.
>>
>> There are a few approaches to cornering the issue. You can try reproducing
>> on your workload/config by only enabling one of these events at a time.
>> Just knowing which event(s) is/are the culprit would be a good start.
>>
>> Another possibility would be to send us a trace reproducing the issue
>> with only those events enabled, which should not contain confidential
>> info about your system.
> 
> I've added some debug statements to babeltrace now. The culprit in this
> particular case is the first block_rq_complete event, the __cmd_length
> field contains a large value (3040877592). __cmd_length is used as the
> length for the _cmd sequence, and then of course allocating space for
> that sequence fails.
> 
> Any idea what can cause __cmd_length to be bogus?


Hi Thomas,

I see from the metadata file you provided that your kernel version is
4.9.28-20170428-1. Is it built from vanilla kernel sources? If not,
could you point us to a git repo or source archive? That would help us
a lot in figuring this out.
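
For anyone following the thread, the failure mode described above (a
length field that is used directly as the allocation size for the
sequence) can be illustrated with a small sketch. This is not
babeltrace code: the function name, the exception type, and the layout
(a little-endian 32-bit length immediately followed by the elements)
are assumptions made for the example only. The idea is simply to reject
a length that cannot fit in the bytes that remain, instead of trying to
allocate it.

```python
import struct
from typing import Tuple


class SequenceLengthError(ValueError):
    """Hypothetical error for a sequence length that cannot be valid."""


def read_sequence(buf: bytes, offset: int, elem_size: int = 1) -> Tuple[bytes, int]:
    """Read a length-prefixed sequence from buf starting at offset.

    Assumed layout (illustration only): an unsigned 32-bit little-endian
    length, immediately followed by `length` elements of `elem_size` bytes.
    Returns the raw element bytes and the offset just past the sequence.
    """
    if offset < 0 or offset + 4 > len(buf):
        raise SequenceLengthError("truncated length field")

    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    needed = length * elem_size
    remaining = len(buf) - start

    # A sequence can never be larger than the data left in the buffer,
    # so a bogus length is caught here before any allocation is attempted.
    if needed > remaining:
        raise SequenceLengthError(
            "sequence needs %d bytes but only %d remain" % (needed, remaining)
        )

    end = start + needed
    return buf[start:end], end
```

With a check like this, a value such as 3040877592 in the length field
would be reported as an invalid event rather than surfacing as an
allocation failure.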

Thanks,

Michael
_______________________________________________
lttng-dev mailing list
[email protected]
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
