[
https://issues.apache.org/jira/browse/HDFS-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894069#comment-13894069
]
David Powell commented on HDFS-5868:
------------------------------------
Buddy,
Very nice. A couple comments:
* Given the acknowledgement of non-FileOutputStream OutputStreams, the
LOG.warn() in BlockReceiver() when getDataOut() is not a FileOutputStream is
probably producing unwanted output.
* I'm not a fan of the sync(OutputStream) method signature. The premise is, I
assume, to sync one of the streams associated with the ReplicaOutputStream.
sync(OutputStream) lets one pass in *any* stream, which means an implementation
will end up doing one of two things:
*# Do the right thing if the OutputStream is the correct type or nothing if
not. This works fine if callers only pass in correctly coordinated
OutputStreams, but otherwise is useless. (This is what the attached patch
does).
*# Compare the OutputStream reference against the streams kept internally and
perform the correct action depending on which it is. (This might be done if
you had a simple OutputStream adapter subclass to state managed otherwise by
the ReplicaOutputStream.)
* In either case, the interface lets the caller pass in something wrong, which
means the implementation has to test for that before performing the only action
that should ever be performed for a correctly-written consumer. Instead, I
recommend separate syncDataOut() and syncChecksumOut() methods that do exactly
what they say and require no validity checking (a win for performance,
maintenance, and testing).
> Make hsync implementation pluggable
> -----------------------------------
>
> Key: HDFS-5868
> URL: https://issues.apache.org/jira/browse/HDFS-5868
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode
> Affects Versions: 2.2.0
> Reporter: Buddy
> Attachments: HDFS-5868-branch-2.patch
>
>
> The current implementation of hsync in BlockReceiver only works if the output
> streams are instances of FileOutputStream. Therefore, there is currently no
> way for a FSDatasetSpi plugin to implement hsync if it is not using standard
> OS files.
> One possible solution is to push the implementation of hsync into the
> ReplicaOutputStreams class. This class is constructed by the
> ReplicaInPipeline which is constructed by the FSDatasetSpi plugin, therefore
> it can be extended. Instead of directly calling sync on the output stream,
> BlockReceiver would call ReplicaOutputStream.sync. The default
> implementation of sync in ReplicaOutputStream would be the same as the
> current implementation in BlockReceiver.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)