[
https://issues.apache.org/jira/browse/HDFS-6934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HDFS-6934:
--------------------------------
Attachment: HDFS-6934.3.patch
I'm resuming the work Nicholas started on this and uploading patch v3. I
believe it addresses Jing's prior feedback. Here is a summary of the changes
in the current patch.
{{FSOutputSummer}}, {{Options}}, {{DataChecksum}}: Refactoring to help simplify
usage of the checksum classes in HDFS.
{{BlockReaderFactory}}: Pass along storage type for use later by individual
block reader classes.
{{BlockReaderLocal}}: Allow skipping checksums if the replica is on transient
storage. However, do not anchor the replica, because clients should not
prevent the DataNode from evicting a replica from RAM.
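For illustration only, the read-side decision boils down to something like the
following sketch; the names are made up, not the actual {{BlockReaderLocal}}
fields:
{code:java}
// Sketch only: checksums are verified only when verification is enabled and
// the replica is not on transient (RAM disk) storage. No anchor is taken for
// a transient-storage replica, so the DataNode stays free to evict it.
class ReadChecksumDecisionSketch {
  static boolean shouldVerifyChecksum(boolean verifyChecksum,
                                      boolean replicaOnTransientStorage) {
    return verifyChecksum && !replicaOnTransientStorage;
  }
}
{code}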
{{BlockReaderLocalLegacy}}: Skip checksums when reading a replica on transient
storage. Do not cache path information for a replica on transient storage,
because eviction at the DataNode may later invalidate the cached path. Unlike
the newer short-circuit read implementation, the legacy reader has no
communication channel (the shared memory segment) through which the DataNode
can report path invalidation back to the client, so there is no more graceful
way to handle this.
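Roughly, the caching decision looks like this; sketch only, with illustrative
names rather than the real legacy code paths:
{code:java}
// Sketch only: why the legacy reader never caches local paths for
// transient-storage replicas.
import java.util.HashMap;
import java.util.Map;

class LegacyLocalPathCacheSketch {
  private final Map<Long, String> blockIdToLocalPath = new HashMap<>();

  void maybeCachePath(long blockId, String localBlockPath,
                      boolean replicaOnTransientStorage) {
    if (replicaOnTransientStorage) {
      // The DataNode may evict this replica and invalidate the path, and the
      // legacy protocol has no way to tell the client, so do not cache it.
      return;
    }
    blockIdToLocalPath.put(blockId, localBlockPath);
  }
}
{code}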
{{DFSClient}}: Minor changes associated with refactoring of the checksum
classes in Common.
{{DFSInputStream}}: Get the storage type from the {{LocatedBlock}} and pass it
along to {{BlockReaderFactory}}.
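The plumbing amounts to something like this sketch; all names here are
illustrative stand-ins, not the actual builder API:
{code:java}
// Sketch only: the storage type of the chosen replica is read from the
// located block and handed to the factory, which hands it to the block
// reader it builds.
enum StorageTypeSketch { DISK, SSD, RAM_DISK }

class BlockReaderFactorySketch {
  private StorageTypeSketch storageType = StorageTypeSketch.DISK;

  BlockReaderFactorySketch setStorageType(StorageTypeSketch storageType) {
    this.storageType = storageType;
    return this;
  }

  boolean buildsReaderForTransientStorage() {
    return storageType == StorageTypeSketch.RAM_DISK;
  }
}
{code}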
{{DFSOutputStream}}: Do not write checksums if using transient storage. Minor
changes associated with refactoring of the checksum classes in Common.
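For illustration, the write-side idea is roughly the following; this sketch
assumes a NULL checksum is substituted to skip computation, which may differ
from what the patch actually does:
{code:java}
// Sketch only: when the stream writes to transient storage, fall back to a
// NULL checksum so nothing is computed on the hot path; the real checksum is
// produced later by the lazy writer.
import org.apache.hadoop.util.DataChecksum;

class WriteChecksumSelectionSketch {
  static DataChecksum selectChecksum(DataChecksum configured,
                                     boolean writingToTransientStorage) {
    if (writingToTransientStorage) {
      return DataChecksum.newDataChecksum(DataChecksum.Type.NULL,
          configured.getBytesPerChecksum());
    }
    return configured;
  }
}
{code}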
{{LocatedBlock}}: Change {{toString}} to aid debugging.
{{BlockMetadataHeader}}: Similar to the changes in Common, this refactoring
simplifies the checksum-handling code.
{{BlockReceiver}}: Do not require writing checksums on transient storage.
{{BlockSender}}: Do not verify checksum when reading a replica from transient
storage.
{{ReplicaInPipeline}}: Minor cleanups.
{{ReplicaOutputStreams}}: Expose whether or not the replica is on transient
storage.
{{BlockPoolSlice}}: Minor change to go along with the refactoring of checksum
classes.
{{FsDatasetImpl}}: During lazy persist, instead of copying a meta file from
transient storage, we now compute the checksums and write a new meta file.
Also, when a replica is evicted from transient storage, call the
{{ShortCircuitRegistry}} so that clients know they must purge the replica from
their caches.
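For illustration, the core of the lazy-persist change looks roughly like the
sketch below. The real code goes through {{BlockMetadataHeader}} and
{{FsDatasetImpl}} helpers, and the eviction-time call into
{{ShortCircuitRegistry}} is not shown; names here are illustrative.
{code:java}
// Sketch only: compute checksums over the RAM-disk block file and write a
// fresh meta file alongside the persisted copy, off the hot write path.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.util.DataChecksum;

class LazyPersistChecksumSketch {
  static void writeChecksums(File blockFile, File newMetaFile,
                             int bytesPerChecksum) throws IOException {
    DataChecksum checksum = DataChecksum.newDataChecksum(
        DataChecksum.Type.CRC32C, bytesPerChecksum);
    byte[] data = new byte[bytesPerChecksum];
    byte[] crc = new byte[checksum.getChecksumSize()];
    try (InputStream in =
             new BufferedInputStream(new FileInputStream(blockFile));
         DataOutputStream out = new DataOutputStream(
             new BufferedOutputStream(new FileOutputStream(newMetaFile)))) {
      // A real meta file begins with a BlockMetadataHeader; omitted here.
      int len;
      while ((len = in.read(data)) > 0) {
        checksum.update(data, 0, len);
        // Write the checksum for this chunk and reset for the next one.
        checksum.writeValue(crc, 0, true);
        out.write(crc, 0, checksum.getChecksumSize());
      }
    }
  }
}
{code}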
{{RamDiskAsyncLazyPersistService}}: Log stack trace for lazy persist failures
to aid debugging.
{{RamDiskReplicaLruTracker}}: Minor clean-up of raw generic type.
{{TestLazyPersistFiles}}: New tests added covering short-circuit read (new and
legacy) in combination with eviction, including corruption detection.
Refactored to use a global timeout setting.
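The timeout refactoring amounts to a single class-level JUnit rule, roughly
as below; the timeout value shown is arbitrary:
{code:java}
// Sketch only: one class-wide timeout rule replaces per-test timeouts.
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class GlobalTimeoutSketch {
  @Rule
  public Timeout timeout = new Timeout(300000); // milliseconds

  @Test
  public void exampleScenario() throws Exception {
    // Short-circuit read plus eviction scenarios would go here.
  }
}
{code}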
> Move checksum computation off the hot path when writing to RAM disk
> -------------------------------------------------------------------
>
> Key: HDFS-6934
> URL: https://issues.apache.org/jira/browse/HDFS-6934
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Reporter: Arpit Agarwal
> Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-6934.3.patch, h6934_20141003b.patch,
> h6934_20141005.patch
>
>
> Since local RAM is considered reliable, we can avoid writing checksums on the
> hot path when replicas are being written to a local RAM disk.
> The checksum can be computed by the lazy writer when moving replicas to disk.