[ 
https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404334#comment-13404334
 ] 

Todd Lipcon commented on HDFS-3170:
-----------------------------------

Looks pretty good. A few comments:

- Can you use System.nanoTime so that it's (a) a monotonic clock instead of 
time-of-day, and (b) finer-grained than milliseconds? I think a lot of these 
metrics will end up sub-millisecond.
- In {{flushOrSync()}} I think you can avoid having quite so many calls to get 
the time -- e.g. the timestamp taken after flush() is also the start time of 
sync(). The calls aren't super expensive, but they are syscalls, so let's be 
efficient and make as few of them as possible.
- For all your variable/metric names, can you please add the units? e.g. 
"packetAckRoundTripTimeNanos"?
- I'm not sure the math's quite right: in the Ack itself, we should be 
including the total ack time from the whole downstream pipeline, not just the 
immediate next hop. In ASCII form:

{code}
.
          A->         B->        C->        D->
Client          DN1        DN2        DN3        DN4
          <-H         <-G        <-F        <-E
{code}

The downstream RTT carried in G should equal (F.recvTime - C.sendTime), not 
(F.recvTime - C.sendTime - (E.recvTime - D.sendTime)). Does that make sense? 
Otherwise DN1 will calculate the wrong metric.
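
To make the arithmetic concrete, here is a minimal sketch using the diagram's labels. The class, method names, and timestamps are all hypothetical and not taken from the patch. For readability the timestamps sit on a single made-up timeline; in practice each node would only ever subtract two readings of its own clock.

```java
// Hypothetical illustration of the ack RTT arithmetic; not code from the patch.
class PipelineRtt {
    // Total round trip a node sees through everything downstream of it:
    // time the ack was received minus time the packet was sent.
    static long totalDownstreamRttNanos(long sendTimeNanos, long ackRecvTimeNanos) {
        return ackRecvTimeNanos - sendTimeNanos;
    }

    // Latency attributable to the link to the immediate next hop: the node's
    // own total minus the total that the downstream node reported in its ack.
    static long linkRttNanos(long ownTotalRttNanos, long reportedDownstreamRttNanos) {
        return ownTotalRttNanos - reportedDownstreamRttNanos;
    }

    public static void main(String[] args) {
        // Made-up timestamps in nanoseconds.
        long bSend = 0, cSend = 100, dSend = 200;
        long eRecv = 500, fRecv = 650, gRecv = 800;

        long dn3Total = totalDownstreamRttNanos(dSend, eRecv); // carried in F
        long dn2Total = totalDownstreamRttNanos(cSend, fRecv); // carried in G
        long dn1Total = totalDownstreamRttNanos(bSend, gRecv);

        // G carries DN2's *total* downstream RTT, so DN1 isolates its own link.
        long dn1Link = linkRttNanos(dn1Total, dn2Total);

        // If G carried only DN2's link time instead, DN1's result would also
        // absorb the DN3 leg of the pipeline.
        long dn2LinkOnly = linkRttNanos(dn2Total, dn3Total);
        long dn1Wrong = linkRttNanos(dn1Total, dn2LinkOnly);

        System.out.println("DN1 link RTT (G carries total):     " + dn1Link);
        System.out.println("DN1 link RTT (G carries link only): " + dn1Wrong);
    }
}
```

With these numbers DN1's own link comes out as 250 ns when G carries the full 550 ns, and as 550 ns when G carries only DN2's 250 ns link time.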
                
> Add more useful metrics for write latency
> -----------------------------------------
>
>                 Key: HDFS-3170
>                 URL: https://issues.apache.org/jira/browse/HDFS-3170
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Matthew Jacobs
>         Attachments: hdfs-3170.txt
>
>
> Currently, the only write-latency related metric we expose is the total 
> amount of time taken by opWriteBlock. This is practically useless, since (a) 
> different blocks may be wildly different sizes, and (b) if the writer is only 
> generating data slowly, it will make a block write take longer by no fault of 
> the DN. I would like to propose two new metrics:
> 1) *flush-to-disk time*: count how long it takes for each call to flush an 
> incoming packet to disk (including the checksums). In most cases this will be 
> close to 0, as it only flushes to buffer cache, but if the backing block 
> device enters congested writeback, it can take much longer, which provides an 
> interesting metric.
> 2) *round trip to downstream pipeline node*: track the round trip latency for 
> the part of the pipeline between the local node and its downstream neighbors. 
> When we add a new packet to the ack queue, save the current timestamp. When 
> we receive an ack, update the metric based on how long since we sent the 
> original packet. This gives a metric of the total RTT through the pipeline. 
> If we also include this metric in the ack to upstream, we can subtract the 
> amount of time due to the later stages in the pipeline and have an accurate 
> count of this particular link.
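
The flush-to-disk timing in item 1 could be sketched roughly as follows, using a monotonic clock and reusing the post-flush timestamp as the start of the sync. This is only an illustration under assumptions: the class, fields, and method signature are hypothetical and do not reflect the actual {{flushOrSync()}} in the DataNode.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch; names and structure are not taken from the patch.
class FlushSyncTimer {
    long flushTimeNanos; // time spent in flush() on the last call
    long syncTimeNanos;  // time spent forcing data to disk (0 if not synced)

    void flushOrSync(BufferedOutputStream out, FileOutputStream file, boolean isSync)
            throws IOException {
        long flushStartNanos = System.nanoTime();
        out.flush();
        // One reading serves as both the end of flush and the start of sync.
        long flushEndNanos = System.nanoTime();
        flushTimeNanos = flushEndNanos - flushStartNanos;
        syncTimeNanos = 0;
        if (isSync) {
            file.getChannel().force(true);
            syncTimeNanos = System.nanoTime() - flushEndNanos;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("packet", ".bin");
        tmp.deleteOnExit();
        try (FileOutputStream file = new FileOutputStream(tmp);
             BufferedOutputStream out = new BufferedOutputStream(file)) {
            out.write(new byte[64 * 1024]); // stand-in for one incoming packet
            FlushSyncTimer timer = new FlushSyncTimer();
            timer.flushOrSync(out, file, true);
            System.out.println("flushTimeNanos=" + timer.flushTimeNanos
                    + " syncTimeNanos=" + timer.syncTimeNanos);
        }
    }
}
```

A flush followed by a sync takes three clock reads here rather than four, and the field names carry their units.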

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
