[ https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142118#comment-16142118 ]
Kihwal Lee commented on HDFS-12136:
-----------------------------------
I think the performance impact is less severe after HDFS-12157, so we could
target 2.8.3 for the fix.
> BlockSender performance regression due to volume scanner edge case
> ------------------------------------------------------------------
>
> Key: HDFS-12136
> URL: https://issues.apache.org/jira/browse/HDFS-12136
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan
> by reading the last checksum of finalized blocks within the {{BlockSender}}
> ctor. Unfortunately, it holds the exclusive dataset lock while opening and
> reading the metafile multiple times, so block sender instantiation becomes
> serialized. Performance completely collapses under heavy disk I/O utilization
> or high xceiver activity, e.g. lost node replication, balancing, or
> decommissioning. The xceiver threads congest creating block senders and impair
> the heartbeat
> processing that is contending for the same lock. Combined with other lock
> contention issues, pipelines break and nodes sporadically go dead.
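
The contention pattern described above can be illustrated with a minimal sketch. It is not the DataNode code and not necessarily what the attached patches do: `LockScopeSketch`, `Replica`, `datasetLock`, `CHECKSUM_SIZE`, and both method names are hypothetical stand-ins. The first method does disk I/O while holding an exclusive lock, so callers are serialized behind each other's reads; the second only snapshots in-memory state under the lock and reads afterwards.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; names do not come from the HDFS source. */
class LockScopeSketch {
    // Stand-in for the exclusive dataset lock that heartbeat processing also needs.
    static final ReentrantLock datasetLock = new ReentrantLock();
    // Assumed checksum width in bytes, chosen for the example.
    static final int CHECKSUM_SIZE = 4;

    /** In-memory view of a replica; metaLength is guarded by datasetLock. */
    static final class Replica {
        final Path metaFile;
        long metaLength;

        Replica(Path metaFile, long metaLength) {
            this.metaFile = metaFile;
            this.metaLength = metaLength;
        }
    }

    /** Opens the meta file and reads the checksum that ends at the given length. */
    static byte[] readLastChecksum(Path metaFile, long length) throws IOException {
        byte[] last = new byte[CHECKSUM_SIZE];
        try (RandomAccessFile raf = new RandomAccessFile(metaFile.toFile(), "r")) {
            raf.seek(length - CHECKSUM_SIZE);
            raf.readFully(last);
        }
        return last;
    }

    /** Problem pattern: the open and read happen while the exclusive lock is held. */
    static byte[] lastChecksumHoldingLock(Replica replica) throws IOException {
        datasetLock.lock();
        try {
            return readLastChecksum(replica.metaFile, replica.metaLength);
        } finally {
            datasetLock.unlock();
        }
    }

    /** Narrower pattern: copy in-memory state under the lock, do the I/O outside it. */
    static byte[] lastChecksumNarrowLock(Replica replica) throws IOException {
        long length;
        datasetLock.lock();
        try {
            length = replica.metaLength; // cheap in-memory read, no disk access
        } finally {
            datasetLock.unlock();
        }
        return readLastChecksum(replica.metaFile, length);
    }
}
```

Both methods return the same bytes for an unchanging file. The narrower variant trades the lock's protection during the read for shorter hold times, so a real change would still have to handle a file that is appended between the snapshot and the read.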