On 01/17/2012 07:59 PM, Liu Yuan wrote:

> This patch set aims to provide a ftrace-like trace utility for performance 
> tuning 
> of online system in production environment. This means virtually no overhead 
> when
> disabled.
> 
> The patch set achieves this by dynamically patching the binary code at 
> runtime, making
> use of GCC -pg instrumentation features, which instrument a 5-bytes opcode in 
> every function.
> 
> When the trace is
>  turned off, the opcode is patched as NOP opcode.
>  turned on, the opcode is a call opcode to our trace functions and we can 
> hook callbacks both for
> function entry and exit. This is how the patch atually calculate the time 
> spent by function.
> 
> This is an early draft set, thought the trace infrastrure is almost done. The 
> work left is most related
> to the trace buffer implementation. Currently in the patch set, a dynamical 
> buffer is drafted but not used
> yet. So the output is rely on the original dprintf, which is slow and doing 
> sync logging. A final trace
> output version should be per-thread buffer and will be manipulated by collie 
> command.
> 
> Further more, the trace infrasture is quite modular, we can hook different 
> callbacks for different purposes.
> 
> So now the output is like:
> ...
> Jan 17 19:49:52 default_trace_graph_return(47) strbuf_grow: 8259 (ns) < how 
> much nanosecond used in the function
> Jan 17 19:49:52 default_trace_graph_return(47) strbuf_addf: 15663 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) strbuf_release: 106 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) farm_open: 61610 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) xpread: 1647242 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) farm_read: 1704604 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) farm_close: 8024 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) store_read_obj: 1802385 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) do_process_work: 1827672 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) do_local_io: 1845099 (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) forward_read_obj_req: 1860139 
> (ns)
> Jan 17 19:49:52 default_trace_graph_return(47) get_sd_op: 141 (ns)
> ...


For your convenience, I use the following script to test the functionality:

pkill -9 sheep
pkill -9 qemu-io
rm store/* -rf
for i in 0 1 2; do sheep/sheep -d /home/tailai.ly/sheepdog/store/$i -z
$i -p 700$i;sleep 1;done
collie/collie cluster format -b farm
qemu-img create -f raw sheepdog:test 1G
qemu-io -c "write -P 0x1 0 10M" sheepdog:test
qemu-io -c "read -P 0x1 0 10M" sheepdog:test

collie/collie debug trace -e #start the trace
qemu-io -c "write -P 0x1 0 20M" sheepdog:test
qemu-io -c "read -P 0x1 0 10M" sheepdog:test
collie/collie debug trace -t #end the trace

qemu-io -c "read -P 0x1 0 10M" sheepdog:test

Thanks,
Yuan
-- 
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog

Reply via email to