[ https://issues.apache.org/jira/browse/HDFS-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734508#comment-15734508 ]

Kai Zheng commented on HDFS-8411:
---------------------------------

Thanks [~andrew.wang] for the ping! It's good for me to have a close look at 
the codes and see the very good work here.

1. In {{StripedReader}}, the new variable {{bytesRead}} needs to be cleared on 
each iteration of the loop. We didn't run into this because our tests use a 
file length smaller than a stripe.
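To illustrate with a minimal sketch (this is not the actual {{StripedReader}} 
code; the class and field names here are only placeholders):
{code}
// Illustrative sketch only: the per-stripe counter must be reset at the
// top of each loop iteration, while the running total for the metric
// accumulates across stripes.
public class StripeLoopSketch {
  private long bytesRead;       // hypothetical per-stripe counter
  private long totalBytesRead;  // hypothetical running total for the metric

  void reconstruct(int stripeCount, long bytesPerStripe) {
    for (int i = 0; i < stripeCount; i++) {
      bytesRead = 0;                // clear here, or stale bytes carry over
      bytesRead += bytesPerStripe;  // stands in for the actual stripe reads
      totalBytesRead += bytesRead;  // publish this stripe's bytes to the metric
    }
  }
}
{code}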

2. The following tests would be good to merge to save some test time:
{noformat}
testEcTasks
testEcCodingTime
testEcBytesFullBlock
{noformat}
By the way, the {{Ec}} prefix (inherited from the existing codes) isn't 
necessary, since the tests are already in the context of 
{{TestDataNodeErasureCodingMetrics}}.
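For example, the merged test could look like this (a rough sketch only: it 
assumes {{doTest}} takes the same arguments as {{doTestForPartialGroup}}, 
{{workerDn}} stands for whichever datanode did the reconstruction, and the 
coding-time counter name is a guess):
{code}
  @Test(timeout = 120000)
  public void testReconstructionMetrics() throws Exception {
    // One cluster setup and one reconstruction run can cover the task
    // counter, the coding time, and the full-block byte counters.
    doTest("/testReconstructionMetrics", blockSize, 0);
    MetricsRecordBuilder rb = getMetrics(workerDn.getMetrics().name());
    assertCounter("EcReconstructionTasks", 1L, rb);
    // "EcDecodingTimeNanos" is a hypothetical name for the coding-time metric
    Assert.assertTrue(getLongCounter("EcDecodingTimeNanos", rb) > 0);
    assertCounter("EcReconstructionBytesRead", (long) blockSize, rb);
    assertCounter("EcReconstructionBytesWritten", (long) blockSize, rb);
  }
{code}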

3. In the following codes, {{blockSize}} could be {{cellSize}} instead; 
otherwise it's very confusing and hard to follow the logic of the asserts.
{code}
  public void testEcBytesPartialGroup2() throws Exception {
    final int fileLen = blockSize + blockSize / 10;
    doTestForPartialGroup("/testEcBytes", fileLen, 0);
    // Add all reconstruction bytes read/write from all data nodes
    long bytesRead = 0;
    long bytesWrite = 0;
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }

    Assert.assertEquals("ecReconstructionBytesRead should be ",
        blockSize + blockSize / 10, bytesRead);
    Assert.assertEquals("ecReconstructionBytesWritten should be ",
        blockSize, bytesWrite);
  }
{code}
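That is, the same test written with {{cellSize}}, so the expected numbers read 
directly off the partial group (just a sketch of the suggested change):
{code}
  public void testEcBytesPartialGroup2() throws Exception {
    final int fileLen = cellSize + cellSize / 10;
    doTestForPartialGroup("/testEcBytes", fileLen, 0);
    // Sum reconstruction bytes read/written over all datanodes
    long bytesRead = 0;
    long bytesWrite = 0;
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }

    // Now the asserts read naturally: a cell plus a tenth of a cell is
    // read, and one cell is written
    Assert.assertEquals("ecReconstructionBytesRead should be ",
        cellSize + cellSize / 10, bytesRead);
    Assert.assertEquals("ecReconstructionBytesWritten should be ",
        cellSize, bytesWrite);
  }
{code}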

4. I'm happy to see the new trick that counts the metrics more reliably by 
looking at all datanodes:
{code}
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }
{code}
With it we could get rid of the trick played in {{doTest}} of using an extra 
datanode. In fact, we could get rid of {{doTest}} entirely by reusing the 
codes of {{doTestForPartialGroup}}: refactor {{doTestForPartialGroup}} so it 
suits tests of all file-length cases, covering the boundaries of cell, block, 
group, and more than one group.
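A possible shape for the generalized helper (purely a sketch; the name, 
signature, and the elided reconstruction step are hypothetical):
{code}
  // Hypothetical helper: write a file of the given length, kill the
  // datanode at deadNodeIndex, wait for reconstruction to finish, then
  // return the byte counters summed over all datanodes so each case can
  // assert its own expected values.
  private long[] doTestReconstruction(String fileName, int fileLen,
      int deadNodeIndex) throws Exception {
    Path file = new Path(fileName);
    DFSTestUtil.createFile(fs, file, fileLen, (short) 1, 0);
    // ... stop the chosen datanode and wait for reconstruction here ...
    long bytesRead = 0;
    long bytesWrite = 0;
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }
    return new long[] {bytesRead, bytesWrite};
  }
{code}
Then every file length of interest (partial cell, full cell, full block, full 
group, more than one group) becomes a small test asserting its own expected 
bytes.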

I think 1, 2 and 3 are good to fix here, and 4 could be done in a follow-on 
issue as it's not trivial. [~Sammi], could you proceed to help with these? 
Thank you!

> Add bytes count metrics to datanode for ECWorker
> ------------------------------------------------
>
>                 Key: HDFS-8411
>                 URL: https://issues.apache.org/jira/browse/HDFS-8411
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: SammiChen
>         Attachments: HDFS-8411-001.patch, HDFS-8411-002.patch, 
> HDFS-8411-003.patch, HDFS-8411-004.patch, HDFS-8411-005.patch, 
> HDFS-8411-006.patch, HDFS-8411-007.patch, HDFS-8411-008.patch, 
> HDFS-8411-009.patch, HDFS-8411.010.patch
>
>
> This is a sub-task of HDFS-7674. It calculates the amount of data that is 
> read from local or remote datanodes to perform the decoding work, and also 
> the amount of data that is written to local or remote datanodes.


