[
https://issues.apache.org/jira/browse/HDFS-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734508#comment-15734508
]
Kai Zheng commented on HDFS-8411:
---------------------------------
Thanks [~andrew.wang] for the ping! It's good for me to have a close look at
the code and see the very good work here.
1. In {{StripedReader}} the new variable {{bytesRead}} needs to be cleared
during the loop. We don't run into this because our tests use a file length
smaller than a stripe.
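The pattern at issue, in a minimal self-contained sketch (names here are illustrative, not the actual {{StripedReader}} code): an accumulator reused across stripe iterations has to be reset at the top of each iteration, or counts from earlier stripes leak into later ones.
{code}
public class StripeReadDemo {
  // Sums bytes read across stripes, resetting the per-stripe counter at the
  // start of each iteration -- the step the review says is missing.
  static long totalBytesRead(int[] stripeSizes) {
    long total = 0;
    for (int size : stripeSizes) {
      long bytesRead = 0;   // cleared for every stripe; without this reset,
      bytesRead += size;    // a single-stripe test would still pass
      total += bytesRead;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(totalBytesRead(new int[] {3, 5, 2}));
  }
}
{code}
A test with a file length smaller than a stripe runs this loop only once, which is why the missing reset goes unnoticed.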
2. The following tests would be good to merge to save some test time:
{noformat}
testEcTasks
testEcCodingTime
testEcBytesFullBlock
{noformat}
By the way, the {{Ec}} prefix (inherited from existing code) isn't necessary,
since these tests are already in the context of {{TestDataNodeErasureCodingMetrics}}.
3. In the following code, {{blockSize}} could be {{cellSize}} instead;
otherwise it's very confusing and the logic (the asserts) is hard to follow.
{code}
public void testEcBytesPartialGroup2() throws Exception {
  final int fileLen = blockSize + blockSize / 10;
  doTestForPartialGroup("/testEcBytes", fileLen, 0);
  // Add all reconstruction bytes read/write from all data nodes
  long bytesRead = 0;
  long bytesWrite = 0;
  for (DataNode dn : cluster.getDataNodes()) {
    MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
    bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
    bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
  }
  Assert.assertEquals("ecReconstructionBytesRead should be ",
      blockSize + blockSize / 10, bytesRead);
  Assert.assertEquals("ecReconstructionBytesWritten should be ",
      blockSize, bytesWrite);
}
{code}
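With the suggested rename, the expectations read naturally in terms of {{cellSize}}. A minimal self-contained sketch of the expected-byte arithmetic (the helper names and the cell size value are illustrative, not the real test code or an HDFS default):
{code}
public class EcBytesExpectation {
  // Expected reconstruction bytes for a file of one full cell plus a tenth
  // of a cell, phrased in terms of cellSize as suggested above.
  static long expectedRead(long cellSize) {
    return cellSize + cellSize / 10;   // all source data read to reconstruct
  }

  static long expectedWritten(long cellSize) {
    return cellSize;                   // one reconstructed cell's worth
  }

  public static void main(String[] args) {
    long cellSize = 64 * 1024;         // illustrative cell size
    System.out.println(expectedRead(cellSize));
    System.out.println(expectedWritten(cellSize));
  }
}
{code}
Written this way, the asserts state directly which quantity is a cell and which is a fraction of one, instead of borrowing {{blockSize}} for both.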
4. I'm happy to see this new trick of counting the metrics more reliably by
looking at all datanodes:
{code}
for (DataNode dn : cluster.getDataNodes()) {
  MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
  bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
  bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
}
{code}
So we could get rid of the trick of using an extra datanode played in
{{doTest}}. In fact we could refactor and get rid of {{doTest}} entirely by
reusing the code in {{doTestForPartialGroup}}. We could refactor
{{doTestForPartialGroup}} to make it suitable for tests of all file-length
cases, considering the boundaries of cell, block, group, and more than one group.
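As a sketch of what the refactor might parameterize over, here are the file-length boundary cases mentioned above (the helper name and layout values are hypothetical, not the real test code):
{code}
public class EcFileLengthCases {
  // Hypothetical enumeration of the file-length boundary cases a unified
  // doTestForPartialGroup could cover: within a cell, exactly one cell,
  // exactly one block, exactly one block group, and more than one group.
  static long[] boundaryLengths(long cellSize, long blockSize, int dataBlocks) {
    long groupSize = blockSize * dataBlocks;  // data bytes in one block group
    return new long[] {
        cellSize / 2,           // partial cell
        cellSize,               // exactly one cell
        blockSize,              // exactly one block
        groupSize,              // exactly one full group
        groupSize + cellSize    // spills into a second group
    };
  }

  public static void main(String[] args) {
    // Illustrative striped layout: 6 data blocks, 64 KB cells, 256 KB blocks.
    for (long len : EcFileLengthCases.boundaryLengths(64 * 1024, 256 * 1024, 6)) {
      System.out.println(len);
    }
  }
}
{code}
A single parameterized helper iterating over such lengths would replace both {{doTest}} and the current per-case test methods.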
I think 1, 2 and 3 are good to fix here, and 4 could be done in a follow-on
issue as it's not trivial. [~Sammi], could you proceed to help with these?
Thank you!
> Add bytes count metrics to datanode for ECWorker
> ------------------------------------------------
>
> Key: HDFS-8411
> URL: https://issues.apache.org/jira/browse/HDFS-8411
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Li Bo
> Assignee: SammiChen
> Attachments: HDFS-8411-001.patch, HDFS-8411-002.patch,
> HDFS-8411-003.patch, HDFS-8411-004.patch, HDFS-8411-005.patch,
> HDFS-8411-006.patch, HDFS-8411-007.patch, HDFS-8411-008.patch,
> HDFS-8411-009.patch, HDFS-8411.010.patch
>
>
> This is a sub task of HDFS-7674. It calculates the amount of data that is
> read from local or remote to attend decoding work, and also the amount of
> data that is written to local or remote datanodes.