[jira] [Comment Edited] (HDFS-15000) Improve FsDatasetImpl to avoid IO operation in datasetLock

dmichal (Jira) Tue, 14 Jul 2020 13:24:12 -0700


    [ https://issues.apache.org/jira/browse/HDFS-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157657#comment-17157657 ]


dmichal edited comment on HDFS-15000 at 7/14/20, 8:23 PM:
----------------------------------------------------------

How about distinguishing in the volume map between blocks for which the IO 
operation has finished and the ones for which the IO operation is still in 
progress? Then the {{FsDatasetImpl.createRbw()}} method could work in the 
following way:
{code:java}
lock {
 1. check if block_id exists (either as 'finished' or as 'in progress')
 2. perform other checks
 3. select the volume
 4. add the block to the volume map (as 'in progress')
}
 5. perform the IO
lock {
 6. change the block status to 'finished' or clean up in case of IO error
}
{code}
Maybe even this second lock is not necessary?

Two implementations that come to mind are:
 # Keep the information about the status in the volume map.
 # Create a separate volume map for blocks with IO in progress.
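
A minimal Java sketch of the steps above, assuming the first option (status kept in the volume map). The names {{RbwSketch}}, {{Status}}, {{IoAction}} and {{statusOf()}} are hypothetical and exist only for illustration; they are not part of {{FsDatasetImpl}}. Volume selection and the "other checks" are omitted, and the second lock is kept here for simplicity.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the real FsDatasetImpl code.
class RbwSketch {
  enum Status { IN_PROGRESS, FINISHED }

  /** Stand-in for the disk work done while creating the replica. */
  interface IoAction {
    void run() throws IOException;
  }

  private final ReentrantLock datasetLock = new ReentrantLock();
  // Assumption: the block status is stored directly in the volume map.
  private final Map<Long, Status> volumeMap = new HashMap<>();

  void createRbw(long blockId, IoAction io) throws IOException {
    datasetLock.lock();
    try {
      // Steps 1-4: reject duplicates (finished or in progress), then reserve.
      Status existing = volumeMap.get(blockId);
      if (existing != null) {
        throw new IOException("Block " + blockId + " already exists: " + existing);
      }
      volumeMap.put(blockId, Status.IN_PROGRESS);
    } finally {
      datasetLock.unlock();
    }

    boolean ok = false;
    try {
      io.run(); // Step 5: the IO runs without holding datasetLock.
      ok = true;
    } finally {
      datasetLock.lock();
      try {
        // Step 6: publish the block, or clean up the reservation on failure.
        if (ok) {
          volumeMap.put(blockId, Status.FINISHED);
        } else {
          volumeMap.remove(blockId);
        }
      } finally {
        datasetLock.unlock();
      }
    }
  }

  /** Returns the status of the block, or null if it is unknown. */
  Status statusOf(long blockId) {
    datasetLock.lock();
    try {
      return volumeMap.get(blockId);
    } finally {
      datasetLock.unlock();
    }
  }
}
```

With this shape, a concurrent {{createRbw()}} for the same block fails fast in step 1 because the 'in progress' entry is already visible, while other dataset operations can take the lock during step 5.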


> Improve FsDatasetImpl to avoid IO operation in datasetLock
> ----------------------------------------------------------
>
>                 Key: HDFS-15000
>                 URL: https://issues.apache.org/jira/browse/HDFS-15000
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Xiaoqiao He
>            Assignee: Aiphago
>            Priority: Major
>         Attachments: HDFS-15000.001.patch
>
>
> As HDFS-14997 mentioned, some methods in #FsDatasetImpl such as 
> #finalizeBlock, #finalizeReplica and #createRbw perform IO operations while 
> holding the datasetLock. This blocks other logic when IO load is very high. 
> We should make the lock finer-grained or move IO operations out of datasetLock.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

