[ 
https://issues.apache.org/jira/browse/HADOOP-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359106#comment-14359106
 ] 

Colin Patrick McCabe commented on HADOOP-11708:
-----------------------------------------------

I agree with [[email protected]] here... +1 for changing 
{{CryptoOutputStream}} to behave the same as HDFS.  It should be a pretty small 
patch.

bq. Sean wrote: ...we could remove ~10 synchronization blocks in DFSOS (some of 
them are unneeded and just about all of them are questionable, and I can't find 
a rationalization for them).  As a follow-on, we add a FSDataOutputStream that 
isn't threadsafe and says as much. We can do this compatibly by either making 
it an option (in FSDataOutputStream construction or in configs), by making it a 
new API, or making it a documented breaking change.

I agree there is a lot to clean up here.  Let's talk about this in a separate 
JIRA.  We have a bunch of options here and I think the discussion will take a 
while.

> CryptoOutputStream synchronization differences from DFSOutputStream break 
> HBase
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-11708
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11708
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.6.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>            Priority: Critical
>
> For the write-ahead-log, HBase writes to DFS from a single thread and sends 
> sync/flush/hflush from a configurable number of other threads (default 5).
> FSDataOutputStream does not document anything about being thread safe, and it 
> is not thread safe for concurrent writes.
> However, DFSOutputStream is thread safe for concurrent writes + syncs. When 
> it is the stream FSDataOutputStream wraps, the combination is thread safe for 
> one writer and multiple syncers (the exact behavior HBase relies on).
> When HDFS Transparent Encryption is turned on, CryptoOutputStream is inserted 
> between FSDataOutputStream and DFSOutputStream. It is proactively labeled as 
> not thread safe, and this composition is not thread safe for any operations.
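
For illustration only, the access pattern described above (one writer thread, several threads calling flush) can be sketched in plain Java. This is not the Hadoop code or the proposed patch: {{OneWriterManySyncers}}, {{SynchronizedWrapperSketch}} and {{run}} are hypothetical names, and a {{ByteArrayOutputStream}} stands in for the underlying stream. It assumes that "behave the same as HDFS" means serializing write and flush on a single lock, so the wrapper's internal buffer is never touched by two threads at once.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class OneWriterManySyncers {

    /** Hypothetical buffering wrapper; write() and flush() share one lock. */
    static class SynchronizedWrapperSketch extends OutputStream {
        private final OutputStream inner;          // stand-in for the wrapped stream
        private final byte[] buffer = new byte[64];
        private int used = 0;

        SynchronizedWrapperSketch(OutputStream inner) {
            this.inner = inner;
        }

        @Override
        public synchronized void write(int b) throws IOException {
            if (used == buffer.length) {
                drain();
            }
            buffer[used++] = (byte) b;
        }

        @Override
        public synchronized void flush() throws IOException {
            drain();
            inner.flush();
        }

        // Caller must hold this object's lock.
        private void drain() throws IOException {
            inner.write(buffer, 0, used);
            used = 0;
        }
    }

    /** One thread writes totalBytes while `syncers` threads flush; returns bytes that reached the sink. */
    static int run(int totalBytes, int syncers) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SynchronizedWrapperSketch out = new SynchronizedWrapperSketch(sink);
        AtomicBoolean done = new AtomicBoolean(false);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < syncers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (!done.get()) {
                        out.flush();               // concurrent "sync" calls
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            threads.add(t);
            t.start();
        }
        for (int i = 0; i < totalBytes; i++) {
            out.write(i & 0xFF);                   // the single writer
        }
        done.set(true);
        for (Thread t : threads) {
            t.join();
        }
        out.flush();                               // push out any remaining buffered bytes
        return sink.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(100_000, 5));
    }
}
```

Removing the two {{synchronized}} keywords turns this into the unsafe composition from the description: the writer and the flushers would then race on {{buffer}} and {{used}}, and bytes could be lost or duplicated.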



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
