Hi,

On 2017-09-21 05:34 AM, liguang li wrote:
> Hi,
> 
> On Wed, Sep 20, 2017 at 10:58 PM, Jonathan Rajotte Julien
> <[email protected]> wrote:
> 
>     Hi,
> 
>     On 2017-09-20 05:12 AM, liguang li wrote:
>     >
>     >
>     > On Tue, Sep 19, 2017 at 10:57 PM, Jonathan Rajotte-Julien
>     > <[email protected]> wrote:
>     >
>     >     On Tue, Sep 19, 2017 at 03:53:27PM +0800, liguang li wrote:
>     >     >    Hi,
>     >     >    On Mon, Sep 18, 2017 at 11:18 PM, Jonathan Rajotte-Julien
>     >     >    <[email protected]> wrote:
>     >     >
>     >     >      Hi,
>     >     >      On Mon, Sep 18, 2017 at 11:32:07AM +0800, liguang li wrote:
>     >     >      >    Hi,
>     >     >      >
>     >     >      >    Create a session in live-reading mode, run an
>     >     >      >    application with a very high event throughput, then
>     >     >      >    print the events with babeltrace. We found the live
>     >     >      >    trace viewer is viewing events from a few seconds
>     >     >      >    ago, and as time
>     >     >
>     >     >      Could you provide us with the versions used for
>     >     >      babeltrace, lttng-tools and lttng-ust?
>     >     >
>     >     >     Babeltrace: 1.5.1
>     >
>     >     Update to babeltrace 1.5.3.
>     >
>     >     >     Lttng-tools: 2.8.6
>     >
>     >     Update to lttng-tools 2.8.8
>     >
>     >     >     Lttng-ust: 2.8.2
>     >     >
>     >     >      >    goes on, the delay gets bigger and bigger.
>     >     >
>     >     >      A similar issue was observed a couple of months ago,
>     >     >      implicating multiple delayed-ack problems during
>     >     >      communication between lttng-relayd and babeltrace.
>     >     >
>     >     >      The following fixes were merged:
>     >     >
>     >     >      [1] https://github.com/lttng/lttng-tools/commit/b6025e9476332b75eb8184345c3eb3e924780088
>     >     >      [2] https://github.com/efficios/babeltrace/commit/de417d04317ca3bc30f59685a9d19de670e4b11d
>     >     >      [3] https://github.com/efficios/babeltrace/commit/4594dbd8f7c2af2446a3e310bee74ba4a2e9d648
>     >     >
>     >     >      In the event that you are already using updated versions
>     >     >      of babeltrace and lttng-tools, it would be pertinent to
>     >     >      provide us with a simple reproducer so we can assess the
>     >     >      issue.
>     >
>     >     The versions you are using do not include the mentioned fixes.
>     >     Please update and redo your experiment.
>     >
>     >
>     > Tested this issue with the versions you listed; the issue still exists.
> 
>     Given that previous versions had a major timing problem there, I would
>     expect some improvement.
> 
>     In that case, we will need a lot more information on your benchmarking
>     strategy: a simple reproducer, your benchmark code (R, gnuplot, etc.)
>     and your overall methodology, so that we can reproduce the issue
>     locally. Otherwise, it will be very hard to come to any conclusion.
> 
> 
> Sorry, we cannot provide detailed information about the benchmark code due
> to company regulations.
>  
> 
>     >  
>     >
>     >     Cheers
>     >
>     >     >
>     >     >     
>     >     >     Steps:
>     >     >     lttng create session --live -U net://*
>     >     >     lttng enable-channel -s session -u ch1
>     >     >     lttng enable-event -s session -c ch1 -u -a
>     >     >     lttng start
>     >     >     
>     >     >     Run a high-event-throughput application, which is a
>     >     >     multithreaded application.
>     >
>     > In the multithreaded application, each tracepoint records the system
>     > wall time, so we can easily reproduce this issue by comparing the
>     > recorded event time with the system wall time.
>     >  
>     >
>     >     >     babeltrace -i lttng-live net://*
>     >     >     
>     >     >     After a while, we found that the timestamp of the event in
>     >     >     babeltrace differs from the time on the host running the
>     >     >     application, and the delay gets bigger and bigger as time
>     >     >     goes on.
>     >     >     
>     >     >
>     >     >      Cheers
>     >     >      >    I checked the source code and found that Babeltrace
>     >     >      >    in live-reading mode reads the recorded events from
>     >     >      >    the CTF files, then parses and prints them in a
>     >     >      >    single thread. The process is a little slow; do you
>     >     >      >    have any ideas to improve it?
>     >
>     >
>     > From my understanding of the source code, the process of parsing and
>     > printing events consumes a lot of time. For example, if the
>     > multithreaded application consumes 3 CPUs, then in a given time
>     > window 3 subbuffers will be filled, sent to the lttng-relayd daemon
>     > and recorded into the CTF files. If, in that same window, babeltrace
>     > only handles 2 subbuffers' worth of events, the issue will happen.
> 
>     Did you perform a bisection of where the delay comes from? Reception
>     of packets? Formatting of events?

?

>     What is the throughput of the application?


? 
This is a key element if anyone wants to test at home. It can be expressed in
events/s or events/ms.
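
As a rough sketch, the rate can be derived from an event count over a bounded
window. The helper below is generic; the babeltrace invocation and trace path
in the usage comment are placeholders to adapt to your setup (babeltrace 1.x
prints one event per line).

```shell
# events_per_sec COUNT SECONDS -> average event rate as an integer.
events_per_sec() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%d\n", n / d }'
}

# Usage against an on-disk trace (path is a placeholder):
#   n=$(babeltrace /path/to/streamed_trace | wc -l)   # one line per event
#   events_per_sec "$n" 60    # rate over a 60 s bounded workload
```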

>     How many tracepoint definitions?
> 
>     Does babeltrace catch up if a quiescent period is given?
> 
> 
> I think so.
>  
> 
>     Could you provide us with statistics, timing data, etc.?
> 
>     What type of delay are we talking about?
> 
> 
> For example, the system wall time is now 5:18 PM, but Babeltrace is still
> printing events timestamped 5:10 PM. After stopping the application,
> Babeltrace still needs some time to print the recorded events.
> 
> I think the root cause is that the parsing and printing process is a bit
> slow, so I want to know if there is any method to improve its performance.

A simple experiment would be to run a bounded workload and perform a quick
timing evaluation of babeltrace with a streamed trace (on disk) and in live
mode. Measure the time it takes babeltrace to read the on-disk trace
(time babeltrace path_to_streamed_trace) and then perform the same experiment
with a live trace. It is important to bound your experiment and to have
mostly the same workload in both cases.

This will give us a rough estimate of the disparity between the two scenarios.
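
A minimal sketch of that comparison, assuming babeltrace is installed; the
trace path, relayd host and session name in the comments are placeholders to
replace with your own:

```shell
# elapsed CMD... -> wall-clock seconds taken by CMD (output discarded).
elapsed() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# On-disk pass (streamed trace already written to disk):
#   disk_s=$(elapsed babeltrace /path/to/streamed_trace)
#
# Live pass over the same bounded workload (URL is a placeholder; stop
# babeltrace once the bounded workload has been fully printed):
#   live_s=$(elapsed babeltrace -i lttng-live net://relayd-host/host/tracing-host/my-session)
#
# Compare disk_s and live_s to estimate the live-mode disparity.
```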

Cheers
_______________________________________________
lttng-dev mailing list
[email protected]
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev