[jira] [Commented] (HADOOP-11758) Add options to filter out too much granular tracing spans
[ https://issues.apache.org/jira/browse/HADOOP-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481317#comment-14481317 ]

Colin Patrick McCabe commented on HADOOP-11758:
-----------------------------------------------

Thanks for attaching this. I've been meaning to put up some more real-world data from my own HDFS cluster. It looks like we still have a problem with overly long span names in a few places... e.g., {{org.apache.hadoop.hdfs.protocol.ClientProtocol.complete}} should really be {{ClientProtocol#complete}}. Let me see if I can find where it's doing this...

> Add options to filter out too much granular tracing spans
> ---------------------------------------------------------
>
>                 Key: HADOOP-11758
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11758
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tracing
>            Reporter: Masatake Iwasaki
>            Assignee: Masatake Iwasaki
>            Priority: Minor
>         Attachments: testWriteTraceHooks-HDFS-8026.html, testWriteTraceHooks.html
>
> in order to avoid queue in span receiver spills

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
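(For illustration only: a minimal sketch of the renaming suggested above. The class and method names here are hypothetical, not part of any Hadoop or HTrace patch.)

```java
// Hypothetical helper: shorten a fully qualified method name such as
// "org.apache.hadoop.hdfs.protocol.ClientProtocol.complete"
// into the "ClientProtocol#complete" form suggested in the comment.
public class SpanNames {
  static String shorten(String fqMethodName) {
    int lastDot = fqMethodName.lastIndexOf('.');
    if (lastDot < 0) {
      return fqMethodName;  // no package/class prefix to strip
    }
    String method = fqMethodName.substring(lastDot + 1);
    String className = fqMethodName.substring(0, lastDot);
    int classDot = className.lastIndexOf('.');
    String simpleClass =
        classDot < 0 ? className : className.substring(classDot + 1);
    return simpleClass + "#" + method;
  }
}
```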
[jira] [Commented] (HADOOP-11758) Add options to filter out too much granular tracing spans
[ https://issues.apache.org/jira/browse/HADOOP-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387823#comment-14387823 ]

Colin Patrick McCabe commented on HADOOP-11758:
-----------------------------------------------

Check out HDFS-8026... does this help reduce the number of spans in the write path?
[jira] [Commented] (HADOOP-11758) Add options to filter out too much granular tracing spans
[ https://issues.apache.org/jira/browse/HADOOP-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384334#comment-14384334 ]

Colin Patrick McCabe commented on HADOOP-11758:
-----------------------------------------------

Hmm. The idea behind HTrace is not to trace every operation. We should be tracing less than 1% of all operations. At that point, we wouldn't really have a problem with too many trace spans. The only time you would turn on tracing for every operation is when doing debugging. In that case it's like turning log4j up to TRACE level -- you know you're going to get swamped. So basically I would argue that we already have an option to filter out too many trace spans -- setting the trace sampler to ProbabilitySampler.
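(A minimal sketch of the probability-sampling idea referenced above: each operation is independently traced with a configured probability, e.g. 1%, so span volume stays bounded. The class and method names here are illustrative, not HTrace's actual ProbabilitySampler API.)

```java
import java.util.Random;

// Sketch: sample a fixed fraction of operations for tracing, so the
// span receiver's queue is never swamped by per-operation spans.
public class ProbabilitySamplerSketch {
  private final double fraction;
  private final Random random;

  public ProbabilitySamplerSketch(double fraction, Random random) {
    this.fraction = fraction;
    this.random = random;
  }

  /** Decide, independently per operation, whether to create a span. */
  public boolean shouldSample() {
    return random.nextDouble() < fraction;
  }
}
```

With `fraction = 0.01`, roughly 1 in 100 operations gets a span; everything else runs untraced, which is the "filter" the comment is pointing at.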
[jira] [Commented] (HADOOP-11758) Add options to filter out too much granular tracing spans
[ https://issues.apache.org/jira/browse/HADOOP-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384371#comment-14384371 ]

Colin Patrick McCabe commented on HADOOP-11758:
-----------------------------------------------

I also wonder if we can do tracing at a level slightly above writeChunk. writeChunk operates at the level of 512-byte chunks, but writes are often larger than that. If you look here:

{code}
private void writeChecksumChunks(byte b[], int off, int len)
    throws IOException {
  sum.calculateChunkedSums(b, off, len, checksum, 0);
  for (int i = 0; i < len; i += sum.getBytesPerChecksum()) {
    int chunkLen = Math.min(sum.getBytesPerChecksum(), len - i);
    int ckOffset = i / sum.getBytesPerChecksum() * getChecksumSize();
    writeChunk(b, off + i, chunkLen, checksum, ckOffset, getChecksumSize());
  }
}
{code}

you can see that if we do a 4 kilobyte write, writeChunk will get called 8 times. But really it would be better just to have one span representing the entire 4k write.
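(A quick check of the arithmetic above, extracted from the loop in writeChecksumChunks: with 512-byte checksum chunks, a 4 KB write drives 8 writeChunk calls, i.e. 8 spans per write if tracing sits at the chunk level versus 1 span if it sits at the whole-write level. The class name here is hypothetical.)

```java
// Count how many writeChunk calls the writeChecksumChunks loop would
// make for a write of `len` bytes with the given checksum chunk size.
public class ChunkSpanCount {
  static int chunkCalls(int len, int bytesPerChecksum) {
    int calls = 0;
    for (int i = 0; i < len; i += bytesPerChecksum) {
      calls++;  // one writeChunk call (and one span) per chunk
    }
    return calls;
  }
}
```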