[jira] [Commented] (HDFS-5868) Make hsync implementation pluggable

David Powell (JIRA) Thu, 06 Feb 2014 17:10:14 -0800

    [ https://issues.apache.org/jira/browse/HDFS-5868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894069#comment-13894069 ]


David Powell commented on HDFS-5868:
------------------------------------

Buddy,

Very nice.  A couple of comments:
* Now that non-FileOutputStream OutputStreams are an acknowledged case, the 
LOG.warn() in BlockReceiver() when getDataOut() is not a FileOutputStream 
probably produces unwanted output.
* I'm not a fan of the sync(OutputStream) method signature.  The premise is, I 
assume, to sync one of the streams associated with the ReplicaOutputStreams.  
sync(OutputStream) lets one pass in *any* stream, which means an implementation 
will end up doing one of two things:
*# Do the right thing if the OutputStream is the correct type or nothing if 
not.  This works fine if callers only pass in correctly coordinated 
OutputStreams, but otherwise is useless.  (This is what the attached patch 
does).
*# Compare the OutputStream reference against the streams kept internally and 
perform the correct action depending on which it is.  (This might be done if 
each stream were a simple OutputStream adapter subclass over state otherwise 
managed by the ReplicaOutputStreams.)
* In either case, the interface lets the caller pass in something wrong, which 
means the implementation has to test for that before performing the only action 
that should ever be performed for a correctly-written consumer.  Instead, I 
recommend separate syncDataOut() and syncChecksumOut() methods that do exactly 
what they say and require no validity checking (a win for performance, 
maintenance, and testing).
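
To make the last point concrete, here is a rough sketch of what I have in mind.  
It is not taken from the attached patch: the ReplicaStreams interface, the two 
implementations, and the sync counters are all hypothetical names used only for 
illustration.  The only behaviour borrowed from the existing code is forcing 
the FileChannel for file-backed streams.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class SyncSketch {

    /** Hypothetical shape of the suggested API; not the actual Hadoop class. */
    interface ReplicaStreams {
        OutputStream getDataOut();
        OutputStream getChecksumOut();
        void syncDataOut() throws IOException;      // syncs the block data stream only
        void syncChecksumOut() throws IOException;  // syncs the checksum stream only
    }

    /** File-backed variant: each method forces its own channel, no instanceof test. */
    static final class FileReplicaStreams implements ReplicaStreams {
        private final FileOutputStream dataOut;
        private final FileOutputStream checksumOut;

        FileReplicaStreams(FileOutputStream dataOut, FileOutputStream checksumOut) {
            this.dataOut = dataOut;
            this.checksumOut = checksumOut;
        }

        public OutputStream getDataOut() { return dataOut; }
        public OutputStream getChecksumOut() { return checksumOut; }

        public void syncDataOut() throws IOException {
            dataOut.getChannel().force(true);
        }

        public void syncChecksumOut() throws IOException {
            checksumOut.getChannel().force(true);
        }
    }

    /** Stand-in for a plugin that does not use OS files; it only counts sync calls. */
    static final class InMemoryReplicaStreams implements ReplicaStreams {
        private final ByteArrayOutputStream dataOut = new ByteArrayOutputStream();
        private final ByteArrayOutputStream checksumOut = new ByteArrayOutputStream();
        int dataSyncs;
        int checksumSyncs;

        public OutputStream getDataOut() { return dataOut; }
        public OutputStream getChecksumOut() { return checksumOut; }

        public void syncDataOut() { dataSyncs++; }
        public void syncChecksumOut() { checksumSyncs++; }
    }

    public static void main(String[] args) throws IOException {
        InMemoryReplicaStreams streams = new InMemoryReplicaStreams();
        streams.getDataOut().write(new byte[] {1, 2, 3});
        // The caller names the stream it wants synced; there is nothing to validate.
        streams.syncDataOut();
        streams.syncChecksumOut();
        System.out.println(streams.dataSyncs + " " + streams.checksumSyncs);
    }
}
```

With this shape the caller cannot hand in an unrelated stream, so neither 
implementation needs a type check or a reference comparison before syncing.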


> Make hsync implementation pluggable
> -----------------------------------
>
>                 Key: HDFS-5868
>                 URL: https://issues.apache.org/jira/browse/HDFS-5868
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.2.0
>            Reporter: Buddy
>         Attachments: HDFS-5868-branch-2.patch
>
>
> The current implementation of hsync in BlockReceiver only works if the output 
> streams are instances of FileOutputStream. Therefore, there is currently no 
> way for a FSDatasetSpi plugin to implement hsync if it is not using standard 
> OS files.
> One possible solution is to push the implementation of hsync into the 
> ReplicaOutputStreams class. This class is constructed by the 
> ReplicaInPipeline which is constructed by the FSDatasetSpi plugin, therefore 
> it can be extended. Instead of directly calling sync on the output stream, 
> BlockReceiver would call ReplicaOutputStreams.sync.  The default 
> implementation of sync in ReplicaOutputStreams would be the same as the 
> current implementation in BlockReceiver. 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
