[ 
https://issues.apache.org/jira/browse/KUDU-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mitch Barnett updated KUDU-2762:
--------------------------------
    Status: In Review  (was: In Progress)

> Improve write transaction tracing
> ---------------------------------
>
>                 Key: KUDU-2762
>                 URL: https://issues.apache.org/jira/browse/KUDU-2762
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Will Berkeley
>            Assignee: Mitch Barnett
>            Priority: Major
>
> Here's a write transaction trace:
> {noformat}
> W0331 02:16:27.859648 27337 rpcz_store.cc:244] Call 
> kudu.tserver.TabletServerService.Write from 10.80.34.74:58250 (ReqId={client: 
> 6a904be8604b482989e3d1592f8824f2, seq_no=66108, attempt_no=1}) took 7855ms 
> (client timeout 10000).
> W0331 02:16:27.859728 27337 rpcz_store.cc:248] Trace:
> 0331 02:16:20.003835 (+     0us) service_pool.cc:159] Inserting onto call 
> queue
> 0331 02:16:20.003844 (+     9us) service_pool.cc:218] Handling call
> 0331 02:16:27.859641 (+7855797us) inbound_call.cc:157] Queueing success 
> response
> Related trace 'txn':
> 0331 02:16:20.003988 (+     0us) write_transaction.cc:101] PREPARE: Starting
> 0331 02:16:20.004087 (+    99us) write_transaction.cc:268] Acquiring schema 
> lock in shared mode
> 0331 02:16:20.004088 (+     1us) write_transaction.cc:271] Acquired schema 
> lock
> 0331 02:16:20.004088 (+     0us) tablet.cc:400] PREPARE: Decoding operations
> 0331 02:16:20.004130 (+    42us) tablet.cc:422] PREPARE: Acquiring locks for 
> 1 operations
> 0331 02:16:20.004137 (+     7us) tablet.cc:426] PREPARE: locks acquired
> 0331 02:16:20.004138 (+     1us) write_transaction.cc:126] PREPARE: finished.
> 0331 02:16:20.004154 (+    16us) write_transaction.cc:136] Start()
> 0331 02:16:20.004157 (+     3us) write_transaction.cc:141] Timestamp: P: 
> 1554016580004153 usec, L: 0
> 0331 02:16:20.004192 (+    35us) log.cc:582] Serialized 3741 byte log entry
> 0331 02:16:27.859496 (+7855304us) write_transaction.cc:149] APPLY: Starting
> 0331 02:16:27.859608 (+   112us) tablet_metrics.cc:365] ProbeStats: 
> bloom_lookups=2,key_file_lookups=2,delta_file_lookups=4,mrs_lookups=0
> 0331 02:16:27.859614 (+     6us) log.cc:582] Serialized 28 byte log entry
> 0331 02:16:27.859622 (+     8us) write_transaction.cc:309] Releasing row and 
> schema locks
> 0331 02:16:27.859623 (+     1us) write_transaction.cc:277] Released schema 
> lock
> 0331 02:16:27.859625 (+     2us) write_transaction.cc:196] FINISH: updating 
> metrics
> Metrics: 
> {"tcmalloc_contention_cycles":7552,"child_traces":[["txn",{"apply.queue_time_us":7854429,"cfile_cache_hit":12,"cfile_cache_hit_bytes":72423,"delta_iterators_relevant":2,"num_ops":1,"prepare.queue_time_us":11,"prepare.run_cpu_time_us":226,"prepare.run_wall_time_us":225,"raft.queue_time_us":12,"raft.run_cpu_time_us":92,"raft.run_wall_time_us":91,"replication_time_us":898,"spinlock_wait_cycles":18688}]]}
> {noformat}
> It could use some polish. Here's a few things to fix:
> 1. Nit: the casing and punctuation of {{PREPARE: finished.}} and {{APPLY: 
> Starting}} is inconsistent.
> 2. {{PREPARE}} has a start and end; {{APPLY}} has a start but no end; 
> replication has neither. We should add the missing trace events.
> 3. nit: There's no need for the {{PREPARE:}} preamble to trace events within 
> {{PREPARE: Starting}} and {{PREPARE: finished.}}.
> It's potentially misleading to read this trace. Almost all of the time is 
> spent between the trace event reporting the local WAL was written and the 
> trace event reporting apply started. Naively, this should mean the operation 
> took a long time to replicate, as that's the most obvious thing that could 
> take a long time. However, the metrics show the time was taken waiting in the 
> apply queue, held up by some other slow activity. This operation was fast to 
> perform-- it took a long time because it was held up, idling, by something 
> else.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to