[
https://issues.apache.org/jira/browse/HDFS-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153706#comment-14153706
]
Colin Patrick McCabe commented on HDFS-7055:
--------------------------------------------
bq. I think SpanReceiverHost#getUniqueLocalTraceFileName is useful, but it
should belong to htrace. Can I port it to htrace later and remove it from
hadoop at the next htrace version bump?
Yeah, absolutely.
bq. I attached a screenshot of the spans for reference. It shows the trace of
getting a 1MB file with FsShell on a pseudo-distributed cluster with the .004
patch. The trace consists of over 500 spans in this case.... Setting
hadoop.trace.sampler=ProbabilitySampler did not reduce the number of spans
above, because Trace#startSpan always starts a span without regard to the
sampler when there is an ongoing trace.
Well, I guess it depends on what you mean by "granular." :) I certainly don't
want all trace spans to be activated randomly. We need to see the parent/child
relationships between the spans. I think the granularity of individual reads
is just about right-- less than that, and we start not being able to see the
big picture. More than that, and we can't effectively do random sampling.
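To illustrate the point about sampling whole traces rather than individual
spans, here is a minimal sketch. SketchTracer and its methods are hypothetical
stand-ins written for this comment, not the HTrace API. It assumes the sampler
is consulted only when no trace is in progress, so a sampled request keeps all
of its parent/child links and an unsampled request records nothing.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a tracer; not the HTrace API.
class SketchTracer {
    private final Deque<String> active = new ArrayDeque<>();
    final List<String> recorded = new ArrayList<>(); // "parent -> child" entries

    // Entry point: the sampler is consulted only when no trace is in progress.
    // Returns true if a span was opened and must be closed with endSpan().
    boolean startSpan(String name, BooleanSupplier sampler) {
        if (active.isEmpty() && !sampler.getAsBoolean()) {
            return false;
        }
        return open(name);
    }

    // Internal operation: joins the ongoing trace, otherwise does nothing.
    boolean startSpan(String name) {
        if (active.isEmpty()) {
            return false;
        }
        return open(name);
    }

    private boolean open(String name) {
        String parent = active.isEmpty() ? "(root)" : active.peek();
        recorded.add(parent + " -> " + name);
        active.push(name);
        return true;
    }

    void endSpan() {
        active.pop();
    }
}
```

With this shape, turning the sampling rate down reduces the number of traces,
but not the number of spans inside any one sampled trace. That has to be
controlled by where the spans are placed.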
But you are right that we have too many trace spans here. I thought about this
a little more, and I don't think we have to create a trace span for each
BlockReader operation. We can just create trace spans for the operations that
actually perform I/O to the datanode.
Concretely, I think we can reduce the span count by not creating trace spans
for every read done via a BlockReader, only for the reads which actually
result in data being read from the DN. Similarly, for BlockReaderLocal we can
trace the times we fill up the buffer, but not every call into
BlockReaderLocal.
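A rough sketch of the "trace the buffer fills, not every call" idea follows.
SketchBufferedReader is a hypothetical class made up for illustration. It is
not BlockReaderLocal, and the byte array merely stands in for data that would
come from the datanode or local disk. Spans are recorded as strings to keep
the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader; the names here are illustrative, not HDFS classes.
class SketchBufferedReader {
    private final byte[] source; // stands in for the data behind the reader
    private final byte[] buf;
    private int srcPos = 0;
    private int bufPos = 0;
    private int bufLen = 0;
    final List<String> spans = new ArrayList<>(); // one entry per real I/O

    SketchBufferedReader(byte[] source, int bufSize) {
        this.source = source;
        this.buf = new byte[bufSize];
    }

    // Returns the next byte, or -1 at end of data.
    // A read served from the buffer records no span.
    int read() {
        if (bufPos == bufLen && !fill()) {
            return -1;
        }
        return buf[bufPos++] & 0xff;
    }

    // Only the refill, which is where the I/O happens, records a span.
    private boolean fill() {
        int n = Math.min(buf.length, source.length - srcPos);
        if (n <= 0) {
            return false;
        }
        spans.add("fill(offset=" + srcPos + ", len=" + n + ")");
        System.arraycopy(source, srcPos, buf, 0, n);
        srcPos += n;
        bufPos = 0;
        bufLen = n;
        return true;
    }
}
```

In this sketch, reading 10 bytes one at a time through a 4-byte buffer makes
10 calls to read() but records only 3 spans, one per fill.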
> Add tracing to DFSInputStream
> -----------------------------
>
> Key: HDFS-7055
> URL: https://issues.apache.org/jira/browse/HDFS-7055
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 2.6.0
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-7055.002.patch, HDFS-7055.003.patch,
> HDFS-7055.004.patch, screenshot-get-1mb.png
>
>
> Add tracing to DFSInputStream.