[ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Hofhansl updated HDFS-6735:
--------------------------------
Attachment: HDFS-6735-v3.txt
I classified the state in DFSInputStream into state used by read only and state
used by both read and pread.
With that in mind, here is a new proposed patch:
* makes LocatedBlocks immutable (which seems to have been the intent)
* pread no longer affects currentNode (I think that was unintended)
* guards state shared between read and pread with an extra sharedLock
(the state used for read only is still guarded by a lock on <this>, which we
need to take anyway to avoid concurrent stateful reads against the same input
stream)
* removes synchronized from all private methods that are only called from
methods that are already synchronized (good practice anyway)
* makes cachingStrategy volatile (made more sense than locking there)
* should be free of deadlocks (the lock on <this> is never acquired while
sharedLock is held, but the reverse is possible)
* pos, blockEnd, currentLocatedBlock are not updated in getBlockAt unless
called on behalf of read (not for pread, hence locking on <this> not needed
there)
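
To make the locking scheme above easier to review, here is a minimal sketch of
the idea. It is not the patch itself: TwoLockStream, fetch, getPos and
sharedFetches are hypothetical names invented for illustration, and a byte
array stands in for the remote file. Only sharedLock, pos and cachingStrategy
correspond to names mentioned above.

```java
// Illustrative stand-in for DFSInputStream; not the real class.
class TwoLockStream {
    private final byte[] data;               // stands in for the remote file

    // State used by stateful read() only: guarded by the lock on `this`.
    private long pos = 0;

    // State shared by read() and pread(): guarded by sharedLock.
    private final Object sharedLock = new Object();
    private int sharedFetches = 0;           // hypothetical shared state

    // A single reference that is simply replaced: volatile instead of a lock.
    private volatile String cachingStrategy = "default";

    TwoLockStream(byte[] data) {
        this.data = data.clone();
    }

    // Stateful read: takes `this` first, then sharedLock (the allowed order).
    public synchronized int read() {
        if (pos >= data.length) {
            return -1;
        }
        int b = fetch(pos);
        pos++;
        return b;
    }

    // Positional read: never takes the lock on `this`, so a concurrent
    // read() holding that lock cannot block it. It does not touch pos.
    public int pread(long position) {
        if (position < 0 || position >= data.length) {
            return -1;
        }
        return fetch(position);
    }

    // Private helper: not synchronized on `this`. While sharedLock is held
    // we never try to acquire `this`, which rules out a lock-order deadlock.
    private int fetch(long position) {
        synchronized (sharedLock) {
            sharedFetches++;
        }
        return data[(int) position] & 0xff;
    }

    public synchronized long getPos() {
        return pos;
    }

    public int getSharedFetches() {
        synchronized (sharedLock) {
            return sharedFetches;
        }
    }

    public void setCachingStrategy(String strategy) {
        cachingStrategy = strategy;          // volatile write, no lock needed
    }

    public String getCachingStrategy() {
        return cachingStrategy;
    }
}
```

In this sketch a thread can call pread() even while another thread holds the
monitor on the stream, which is the behavior the patch is aiming for.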
I have not tested this yet.
Please have a careful look and let me know what you think.
We might want to further disentangle the mixed state.
(And just maybe the best solution would be for HBase to have an input stream
for each thread doing read and one for all threads doing preads - and not do
any of this...?)
> A minor optimization to avoid pread() be blocked by read() inside the same
> DFSInputStream
> -----------------------------------------------------------------------------------------
>
> Key: HDFS-6735
> URL: https://issues.apache.org/jira/browse/HDFS-6735
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Liang Xie
> Assignee: Liang Xie
> Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In the current DFSInputStream implementation, there are a couple of
> coarse-grained locks in the read/pread path, and they have become an HBase
> read latency pain point. In HDFS-6698, I made a minor patch against the first
> encountered lock, around getFileLength. Indeed, after reading the code and
> testing, it turns out there are still other locks we could improve.
> In this jira, I'll make a patch against the other locks, and a simple test
> case to show the issue and the improved result.
> This is important for HBase applications, since in the current HFile read
> path we issue all read()/pread() requests in the same DFSInputStream for one
> HFile. (A multi-stream solution is another story that I plan to do, but it
> will probably take more time than I expected.)