[ 
https://issues.apache.org/jira/browse/HDFS-14345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786666#comment-16786666
 ] 

Steve Loughran commented on HDFS-14345:
---------------------------------------

We've had problems with applications' expectations of thread safety compared to 
what the Java APIs say. Essentially, the thread safety of the DFS input/output 
streams is the normative source of threading info.

HDFS-6803, HDFS-6735, HADOOP-15557, and HADOOP-11708 spring to mind. We don't 
dare weaken the thread safety of the default DFS streams.

Now, looking at where BufferedInputStream is being used in the Hadoop code 
(directly and indirectly), which of these are places where switching to a new 
version would deliver speedups without doing anything to HDFS or other 
FSDataInputStream-wrapped connections?

> fs.BufferedFSInputStream::read is synchronized
> ----------------------------------------------
>
>                 Key: HDFS-14345
>                 URL: https://issues.apache.org/jira/browse/HDFS-14345
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.1.2
>            Reporter: Gopal V
>            Priority: Major
>
> BufferedInputStream::read() has performance issues - this can be fixed by 
> wrapping the stream in another non-synchronized buffered inputstream, but 
> that incurs memory copy overheads and is sub-optimal.
> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/io/BufferedInputStream.java#L269
> Hadoop fs streams aren't thread-safe (except for readFully) and are stateful 
> for position, so this synchronization is purely a tax without benefit.
> https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BufferedFSInputStream.java#L35
> The readFully skips the BufferedInputStream super classes.
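
For illustration only, the workaround the issue description mentions (wrapping
the stream in another, non-synchronized buffered stream) could look roughly
like the sketch below. UnsyncBufferedInputStream is a hypothetical name, not a
class in Hadoop or the JDK, and the sketch assumes a single reader thread. It
does nothing about the thread-safety expectations discussed in the comment
above, and when it wraps a stream that is already buffered, small reads are
still copied twice, which is the overhead the description calls sub-optimal.

```java
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical sketch: a buffered stream with no synchronized methods.
 * Not thread-safe; callers must confine each instance to one thread.
 */
final class UnsyncBufferedInputStream extends InputStream {
    private final InputStream in;
    private final byte[] buf;
    private int pos;   // index of the next byte to hand out
    private int limit; // number of valid bytes currently in buf

    UnsyncBufferedInputStream(InputStream in, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be > 0");
        }
        this.in = in;
        this.buf = new byte[size];
    }

    /** Refills the buffer; returns false once the wrapped stream is at EOF. */
    private boolean fill() throws IOException {
        pos = 0;
        limit = 0;
        // InputStream.read never returns 0 for a non-zero length request.
        int n = in.read(buf, 0, buf.length);
        if (n <= 0) {
            return false;
        }
        limit = n;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (pos >= limit && !fill()) {
            return -1;
        }
        return buf[pos++] & 0xff;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        if (pos >= limit) {
            // Large requests bypass the local buffer to avoid one copy.
            if (len >= buf.length) {
                return in.read(b, off, len);
            }
            if (!fill()) {
                return -1;
            }
        }
        int n = Math.min(len, limit - pos);
        System.arraycopy(buf, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

A caller would wrap an existing stream, for example
new UnsyncBufferedInputStream(existingStream, 8192), and read from the wrapper
on one thread only.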



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
