[
https://issues.apache.org/jira/browse/HADOOP-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608009#comment-14608009
]
Steve Loughran commented on HADOOP-11708:
-----------------------------------------
I never did the output stream one; things were taking so long to get the core
FS API and input stream in that I left it alone. Nominally the java.io API
should define it, but here's one of those examples where an HDFS implementation
detail has (unintentionally?) changed behaviour.
As well as concurrency, there's the issue of {{Syncable}} and "what does
flush() do?", especially in the context of object stores.
If someone were to do it, it'd round things out, especially with extra tests. I
promise I will review it.
> CryptoOutputStream synchronization differences from DFSOutputStream break
> HBase
> -------------------------------------------------------------------------------
>
> Key: HADOOP-11708
> URL: https://issues.apache.org/jira/browse/HADOOP-11708
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.6.0
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Critical
>
> For the write-ahead-log, HBase writes to DFS from a single thread and sends
> sync/flush/hflush from a configurable number of other threads (default 5).
> FSDataOutputStream does not document anything about being thread safe, and it
> is not thread safe for concurrent writes.
> However, DFSOutputStream is thread safe for concurrent writes + syncs. When
> it is the stream FSDataOutputStream wraps, the combination is thread safe for
> 1 writer and multiple syncs (the exact behavior HBase relies on).
> When HDFS Transparent Encryption is turned on, CryptoOutputStream is inserted
> between FSDataOutputStream and DFSOutputStream. It is proactively labeled as
> not thread safe, and this composition is not thread safe for any operations.
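
To make the access pattern in the description concrete, here is a minimal sketch using only java.io and java.util.concurrent. It is not Hadoop code: UnsafeBufferingStream is a hypothetical stand-in for a stream that is not thread safe, and LockedStream is a hypothetical wrapper that serializes write() and flush() on one lock. It shows one way a "1 writer + N syncers" composition could be made safe, not how DFSOutputStream or any actual fix does it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class OneWriterManySyncers {

    /** Hypothetical stand-in for a non-thread-safe stream: buffers bytes, drains on flush(). */
    static class UnsafeBufferingStream extends OutputStream {
        private final OutputStream sink;
        private final byte[] buf = new byte[64];
        private int pos = 0;

        UnsafeBufferingStream(OutputStream sink) { this.sink = sink; }

        @Override public void write(int b) throws IOException {
            if (pos == buf.length) { flush(); }
            buf[pos++] = (byte) b;
        }

        @Override public void flush() throws IOException {
            sink.write(buf, 0, pos); // unsynchronized read of pos: racy without an outer lock
            pos = 0;
            sink.flush();
        }
    }

    /** Hypothetical wrapper: write() and flush() take the same monitor. */
    static class LockedStream extends OutputStream {
        private final OutputStream inner;

        LockedStream(OutputStream inner) { this.inner = inner; }

        @Override public synchronized void write(int b) throws IOException { inner.write(b); }
        @Override public synchronized void flush() throws IOException { inner.flush(); }
    }

    /** One writer thread (the caller) plus syncThreads flushers; returns bytes that reached the sink. */
    static int run(int totalBytes, int syncThreads) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = new LockedStream(new UnsafeBufferingStream(sink));
        AtomicBoolean done = new AtomicBoolean(false);

        ExecutorService syncers = Executors.newFixedThreadPool(syncThreads);
        for (int i = 0; i < syncThreads; i++) {
            syncers.submit(() -> {
                while (!done.get()) {
                    out.flush();
                    Thread.sleep(1);
                }
                return null; // returning a value makes this a Callable, so checked exceptions are allowed
            });
        }

        for (int i = 0; i < totalBytes; i++) {
            out.write(i); // single writer, as in the write-ahead-log case
        }
        done.set(true);
        syncers.shutdown();
        syncers.awaitTermination(10, TimeUnit.SECONDS);
        out.flush(); // drain whatever is still buffered
        return sink.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bytes at sink: " + run(10000, 5));
    }
}
```

With the lock in place every byte written should reach the sink. Dropping LockedStream from the chain reintroduces the race between write() and flush() on the shared buffer position, which is the kind of failure the description attributes to the unsynchronized composition.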