[ 
https://issues.apache.org/jira/browse/HDFS-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-6934:
--------------------------------
    Attachment: HDFS-6934.4.patch

Thank you for reviewing, Xiaoyu.  Lots of good suggestions.  Here is patch v4, 
with the following changes.

{{FSOutputSummer}}: Used the common {{getChecksumSize}} method in more places 
like you suggested.
{{DataChecksum}}: Added whitespace between operators.
{{DFSClient}}: Fixed minor Findbugs warning identified in last Jenkins run.
{{DFSInputStream}}: In {{getBestNodeDNAddrPair}}, moved storage types outside 
the loop and added comment explaining indexed lookup of storage type.
{{LazyWriterTestCase}}: There was a lot of code duplication for helper methods 
in {{TestLazyPersistFiles}} and {{TestScrLazyPersistFiles}}.  I took the 
opportunity to refactor into a new abstract base class that they can share.
{{TestScrLazyPersistFiles}}: The new tests are now in here.

bq. If you put Line 582~583 before Line 574, you can use...

I don't think we can make this change.  The two conditions are logically 
slightly different.  On line 574:

{code}
      if (checksumReceivedLen == 0 && !streams.isTransientStorage()) {
{code}

On line 582:

{code}
      final boolean shouldNotWriteChecksum = checksumReceivedLen == 0
          && streams.isTransientStorage();
{code}

Note the negation of {{isTransientStorage()}} in the first one.  The first case 
is for detecting if a write to non-transient storage did not send a checksum, 
and therefore it needs to be calculated here.  The second case is for detecting 
if this is a write to transient storage, and therefore the checksum needs to be 
skipped entirely.

bq. I notice the test timeout attribute is removed. Is there a particular 
reason for that?

The patch is using the {{Timeout}} JUnit rule to specify a timeout applied 
across every test case in one place instead of specifying the timeout in each 
{{Test}} annotation.  We've been using this approach where possible in new 
tests, because it avoids duplication of the timeout values and helps prevent 
introducing new tests that forgot to specify a timeout.  In the current version 
of the patch, the {{Timeout}} rule is in the abstract base class, which covers 
it for all tests in both subclasses.


> Move checksum computation off the hot path when writing to RAM disk
> -------------------------------------------------------------------
>
>                 Key: HDFS-6934
>                 URL: https://issues.apache.org/jira/browse/HDFS-6934
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Arpit Agarwal
>            Assignee: Chris Nauroth
>         Attachments: HDFS-6934.3.patch, HDFS-6934.4.patch, 
> h6934_20141003b.patch, h6934_20141005.patch
>
>
> Since local RAM is considered reliable we can avoid writing checksums on the 
> hot path when replicas are being written to a local RAM disk.
> The checksum can be computed by the lazy writer when moving replicas to disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to