On 01/17/2012 07:59 PM, Liu Yuan wrote: > This patch set aims to provide a ftrace-like trace utility for performance > tuning > of online system in production environment. This means virtually no overhead > when > disabled. > > The patch set achieves this by dynamically patching the binary code at > runtime, making > use of GCC -pg instrumentation features, which instrument a 5-bytes opcode in > every function. > > When the trace is > turned off, the opcode is patched as NOP opcode. > turned on, the opcode is a call opcode to our trace functions and we can > hook callbacks both for > function entry and exit. This is how the patch atually calculate the time > spent by function. > > This is an early draft set, thought the trace infrastrure is almost done. The > work left is most related > to the trace buffer implementation. Currently in the patch set, a dynamical > buffer is drafted but not used > yet. So the output is rely on the original dprintf, which is slow and doing > sync logging. A final trace > output version should be per-thread buffer and will be manipulated by collie > command. > > Further more, the trace infrasture is quite modular, we can hook different > callbacks for different purposes. > > So now the output is like: > ... > Jan 17 19:49:52 default_trace_graph_return(47) strbuf_grow: 8259 (ns) < how > much nanosecond used in the function > Jan 17 19:49:52 default_trace_graph_return(47) strbuf_addf: 15663 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) strbuf_release: 106 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) farm_open: 61610 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) xpread: 1647242 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) farm_read: 1704604 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) farm_close: 8024 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) store_read_obj: 1802385 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) do_process_work: 1827672 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) do_local_io: 1845099 (ns) > Jan 17 19:49:52 default_trace_graph_return(47) forward_read_obj_req: 1860139 > (ns) > Jan 17 19:49:52 default_trace_graph_return(47) get_sd_op: 141 (ns) > ...
For your convenience, I use the following script to test the functionality: pkill -9 sheep pkill -9 qemu-io rm store/* -rf for i in 0 1 2; do sheep/sheep -d /home/tailai.ly/sheepdog/store/$i -z $i -p 700$i;sleep 1;done collie/collie cluster format -b farm qemu-img create -f raw sheepdog:test 1G qemu-io -c "write -P 0x1 0 10M" sheepdog:test qemu-io -c "read -P 0x1 0 10M" sheepdog:test collie/collie debug trace -e #start the trace qemu-io -c "write -P 0x1 0 20M" sheepdog:test qemu-io -c "read -P 0x1 0 10M" sheepdog:test collie/collie debug trace -t #end the trace qemu-io -c "read -P 0x1 0 10M" sheepdog:test Thanks, Yuan -- sheepdog mailing list [email protected] http://lists.wpkg.org/mailman/listinfo/sheepdog
