Re: [lttng-dev] LTTng enable-event performance issue during active session
Hi Zach, there's a proposed change to lttng-ust which addresses the issue you are seeing, but it hasn't finished clearing review & testing yet. https://review.lttng.org/c/lttng-ust/+/11006 thanks, kienan On 2023-10-05 18:04, Kienan Stewart via lttng-dev wrote: Hi Zach apologies for the delay in replying to your question. On 2023-09-26 08:27, Kramer, Zach via lttng-dev wrote: Hi, I am observing a performance issue with regards to enabling events while a session is active and was wondering if this is expected. LTTng versions: * lttng-tools: 2.13.9 * lttng-ust: 2.13.6 Steps to reproduce: 1. Ensure many userspace tracepoints are available in `lttng list -u` e.g. 100 2. Create a new session 3. Start session 4. Enable new events on session The time it takes to enable each new event has increasing cost e.g. 1. Event 1: 1ms 2. Event 100: 15ms 3. Event 1000: 150ms 4. àin total about 1.5 minutes to enable 1000 events If either: 1. No userspace tracepoints are available 2. Or session is not started until /after /the events are enabled Then the time it takes to enable new events is constant (e.g. 1ms). Below is a bash script demonstrating this behavior: # Pre-requisite: have many userspace tracepoints available lttng create foo lttng enable-channel -u -s foo bar lttng start foo total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on active session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo lttng create foo lttng enable-channel -u -s foo bar total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on inactive session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo Is this reproducible for you? Any insight is appreciated. I'm able to reproduce what you're seeing, and it's not completely unexpected based on my understanding of the architecture of lttng. When a session is active and an event rule is enabled/disabled, the sessiond will notify each of the registered applications of the event rule state changes. This communication is going through either unix or tcp sockets, and that portion of the protocol seems to struggle with many small changes as the protocol to communicate the changes doesn't support batching. Enabling multiple events in a single call reduces some of the overhead, but the changes will still be communicated serially. This means a single call activating thousands events can still take a long time to process (eg. `lttng enable-event -s foo -c bar -u tp0,tp1,...,tpN`) Using glob patterns or `--all` will be significantly faster as the UST applications can digest those types of event rule changes with a single set of messages. In cases where you want most but not all events, flipping the logic to enable "*" but add events or patterns to omit with "--exclude" is going to be a better strategy. Do you have cases where you need to activate many of events but which couldn't be covered by using glob patterns and/or exclusions? thanks, kienan Many thanks, Zach ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
Re: [lttng-dev] LTTng enable-event performance issue during active session
Hi Zach apologies for the delay in replying to your question. On 2023-09-26 08:27, Kramer, Zach via lttng-dev wrote: Hi, I am observing a performance issue with regards to enabling events while a session is active and was wondering if this is expected. LTTng versions: * lttng-tools: 2.13.9 * lttng-ust: 2.13.6 Steps to reproduce: 1. Ensure many userspace tracepoints are available in `lttng list -u` e.g. 100 2. Create a new session 3. Start session 4. Enable new events on session The time it takes to enable each new event has increasing cost e.g. 1. Event 1: 1ms 2. Event 100: 15ms 3. Event 1000: 150ms 4. àin total about 1.5 minutes to enable 1000 events If either: 1. No userspace tracepoints are available 2. Or session is not started until /after /the events are enabled Then the time it takes to enable new events is constant (e.g. 1ms). Below is a bash script demonstrating this behavior: # Pre-requisite: have many userspace tracepoints available lttng create foo lttng enable-channel -u -s foo bar lttng start foo total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on active session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo lttng create foo lttng enable-channel -u -s foo bar total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on inactive session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo Is this reproducible for you? Any insight is appreciated. I'm able to reproduce what you're seeing, and it's not completely unexpected based on my understanding of the architecture of lttng. When a session is active and an event rule is enabled/disabled, the sessiond will notify each of the registered applications of the event rule state changes. This communication is going through either unix or tcp sockets, and that portion of the protocol seems to struggle with many small changes as the protocol to communicate the changes doesn't support batching. Enabling multiple events in a single call reduces some of the overhead, but the changes will still be communicated serially. This means a single call activating thousands events can still take a long time to process (eg. `lttng enable-event -s foo -c bar -u tp0,tp1,...,tpN`) Using glob patterns or `--all` will be significantly faster as the UST applications can digest those types of event rule changes with a single set of messages. In cases where you want most but not all events, flipping the logic to enable "*" but add events or patterns to omit with "--exclude" is going to be a better strategy. Do you have cases where you need to activate many of events but which couldn't be covered by using glob patterns and/or exclusions? thanks, kienan Many thanks, Zach ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
[lttng-dev] LTTng enable-event performance issue during active session
Hi, I am observing a performance issue with regards to enabling events while a session is active and was wondering if this is expected. LTTng versions: * lttng-tools: 2.13.9 * lttng-ust: 2.13.6 Steps to reproduce: 1. Ensure many userspace tracepoints are available in `lttng list -u` e.g. 100 2. Create a new session 3. Start session 4. Enable new events on session The time it takes to enable each new event has increasing cost e.g. 1. Event 1: 1ms 2. Event 100: 15ms 3. Event 1000: 150ms 4. --> in total about 1.5 minutes to enable 1000 events If either: 1. No userspace tracepoints are available 2. Or session is not started until after the events are enabled Then the time it takes to enable new events is constant (e.g. 1ms). Below is a bash script demonstrating this behavior: # Pre-requisite: have many userspace tracepoints available lttng create foo lttng enable-channel -u -s foo bar lttng start foo total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on active session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo lttng create foo lttng enable-channel -u -s foo bar total_t1=$(date +%s%3N); for i in {1..100} do t1=$(date +%s%3N); lttng enable-event -u lttng_iter_$i -s foo -c bar > /dev/null t2=$(date +%s%3N); echo "Event #$i took $((t2-t1)) ms" done total_t2=$(date +%s%3N); echo "" echo "Enabling events on inactive session took $((total_t2-total_t1)) ms" echo "" lttng destroy foo Is this reproducible for you? Any insight is appreciated. Many thanks, Zach ___ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev