[
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219026#comment-14219026
]
Colin Patrick McCabe commented on HDFS-6735:
--------------------------------------------
This is definitely a tricky change but I think it will have a lot of benefits.
Thanks again for taking this on.
I can see that you didn't update the indent. I guess that makes things easier
to review. But eventually we will need to add indentation for the things that
are now surrounded by a new {{synchronized}} block.
{code}
- private synchronized ByteBuffer tryReadZeroCopy(int maxLength,
+ private ByteBuffer tryReadZeroCopy(int maxLength,
EnumSet<ReadOption> opts) throws IOException {
{code}
I don't think that we can remove the synchronization here, since this function
is using {{pos}} and some other stuff which is positional.
re: CachingStrategy: I think this should just be protected by {{sharedLock}},
right? It shouldn't need to be volatile if we just grab the current value when
holding {{sharedLock}}.
I'm not sure "sharedLock" is the best name. All locks are shared, right? If
we weren't sharing between threads, there would be no need for locks. I can
see that the intention is that sharedLock = shared between read and pread. But
sharedLock is also used by stuff like getFileEncryptionInfo.
How about calling this "infoLock" instead? It's protecting the block info we
got from the NameNode, lastBlockBeingWrittenLength, and fileEncryptionInfo (and
I think cachingstrategy, if my previous comment makes sense?)
> A minor optimization to avoid pread() be blocked by read() inside the same
> DFSInputStream
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in
> read/pread path, and it has became a HBase read latency pain point so far. In
> HDFS-6698, i made a minor patch against the first encourtered lock, around
> getFileLength, in deed, after reading code and testing, it shows still other
> locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case
> to show the issue and the improved result.
> This is important for HBase application, since in current HFile read path, we
> issue all read()/pread() requests in the same DFSInputStream for one HFile.
> (Multi streams solution is another story i had a plan to do, but probably
> will take more time than i expected)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)