[
https://issues.apache.org/jira/browse/HDFS-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157657#comment-17157657
]
dmichal edited comment on HDFS-15000 at 7/14/20, 8:23 PM:
----------------------------------------------------------
How about distinguishing in the volume map between blocks for which the IO
operation has finished and those for which the IO operation is still in
progress? Then the {{FsDatasetImpl.createRbw()}} method could work in the
following way:
{code:java}
lock {
1. check if block_id exists (either as 'finished' or as 'in progress')
2. perform other checks
3. select the volume
4. add the block to the volume map (as 'in progress')
}
5. perform the IO
lock {
6. change the block status to 'finished' or clean up in case of IO error
}
{code}
Maybe even this second lock is not necessary?
Two implementations that come to my mind are:
# Keep the information about the status in the volume map.
# Create a separate volume map for blocks with IO in progress.
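A rough sketch of the first option (status kept in the volume map) might look like the following. This is only an illustration of the steps listed above, not the actual {{FsDatasetImpl}} code: the class name, the {{Status}} enum, the {{BlockIo}} callback, the plain {{HashMap}} standing in for the volume map, and the use of {{IllegalStateException}} for the duplicate check are all assumptions made for the example. Steps 2 and 3 are left out, and the second lock is kept here for simplicity.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the dataset; not the real FsDatasetImpl.
class TwoPhaseCreateSketch {
    // Hypothetical per-block status stored alongside the block in the map.
    enum Status { IN_PROGRESS, FINISHED }

    // Hypothetical callback representing the disk IO done for a block.
    interface BlockIo {
        void run(long blockId) throws IOException;
    }

    private final ReentrantLock datasetLock = new ReentrantLock();
    // Simplified volume map: block id -> status.
    private final Map<Long, Status> volumeMap = new HashMap<>();

    public void createRbw(long blockId, BlockIo io) throws IOException {
        datasetLock.lock();
        try {
            // Step 1: reject the block if it exists in either state.
            if (volumeMap.containsKey(blockId)) {
                throw new IllegalStateException("Block " + blockId + " already exists");
            }
            // Steps 2-3 (other checks, volume selection) are omitted here.
            // Step 4: reserve the block id before releasing the lock.
            volumeMap.put(blockId, Status.IN_PROGRESS);
        } finally {
            datasetLock.unlock();
        }

        boolean succeeded = false;
        try {
            // Step 5: perform the IO without holding the dataset lock.
            io.run(blockId);
            succeeded = true;
        } finally {
            // Step 6: publish the result, or clean up the reservation on failure.
            datasetLock.lock();
            try {
                if (succeeded) {
                    volumeMap.put(blockId, Status.FINISHED);
                } else {
                    volumeMap.remove(blockId);
                }
            } finally {
                datasetLock.unlock();
            }
        }
    }

    // Returns the block's status, or null if the block is unknown.
    public Status statusOf(long blockId) {
        datasetLock.lock();
        try {
            return volumeMap.get(blockId);
        } finally {
            datasetLock.unlock();
        }
    }
}
```

Because the 'in progress' entry is added under the first lock, a concurrent call for the same block id fails at step 1 even though the IO itself runs outside the lock. Readers of the map would need to decide how to treat 'in progress' entries, which this sketch does not address.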
> Improve FsDatasetImpl to avoid IO operation in datasetLock
> ----------------------------------------------------------
>
> Key: HDFS-15000
> URL: https://issues.apache.org/jira/browse/HDFS-15000
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Xiaoqiao He
> Assignee: Aiphago
> Priority: Major
> Attachments: HDFS-15000.001.patch
>
>
> As HDFS-14997 mentioned, some methods in #FsDatasetImpl, such as
> #finalizeBlock, #finalizeReplica, and #createRbw, perform IO operations while
> holding the datasetLock. This blocks other logic when IO load is very high.
> We should make the lock finer-grained or move IO operations out of datasetLock.