[ 
https://issues.apache.org/jira/browse/HDFS-12136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086282#comment-16086282
 ] 

Kihwal Lee edited comment on HDFS-12136 at 7/14/17 9:21 PM:
------------------------------------------------------------

Normal serving has been greatly impaired.  Appending to a block while it is 
being scanned is exceedingly rare compared to the normal block sending rate, 
yet the HDFS-11160 fix impacted all serving.  There's a better way, 
accomplished via:
* Entirely remove (revert) fetching of checksums for finalized blocks in the 
{{BlockSender}} ctor.  This reduces lock hold time by eliminating i/o under 
the dataset lock.
* If a checksum exception occurs during the scan and the genstamp has changed, 
mark the block as suspect for rescan.  This is the edge case.
* The recent suspect blocks cache now considers genstamps.  Suspect blocks 
with a newer genstamp than the last recorded one are not skipped.
* Recent suspects expire 10 minutes after being added to the cache.  Prior 
behavior was 10 minutes after last access, which could lead to indefinite 
postponement.
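
The suspect-tracking rules in the last two bullets can be sketched roughly as 
follows.  This is a minimal illustration, not the code from the attached 
patches: the class {{RecentSuspects}}, the method {{markSuspect}}, and the 
plain {{HashMap}} are all hypothetical names and choices made for the example.  
It assumes a block is identified by its id and that the caller passes in the 
current time in milliseconds.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a "recent suspects" cache that considers genstamps
 * and expires entries a fixed time after they were added (not after the
 * last access). Names are illustrative and not taken from the patch.
 */
final class RecentSuspects {
    // 10 minutes, matching the expiry described in the comment.
    private static final long EXPIRY_MS = 10 * 60 * 1000L;

    private static final class Entry {
        final long genStamp;
        final long addedAtMs;

        Entry(long genStamp, long addedAtMs) {
            this.genStamp = genStamp;
            this.addedAtMs = addedAtMs;
        }
    }

    private final Map<Long, Entry> recent = new HashMap<>();

    /**
     * Returns true if the block should be queued for rescan, false if it
     * was already recorded recently with the same or a newer genstamp.
     */
    synchronized boolean markSuspect(long blockId, long genStamp, long nowMs) {
        Entry prev = recent.get(blockId);

        // Expire relative to insertion time only; lookups never extend it.
        if (prev != null && nowMs - prev.addedAtMs >= EXPIRY_MS) {
            recent.remove(blockId);
            prev = null;
        }

        // Skip only when the recorded genstamp is at least as new.
        if (prev != null && genStamp <= prev.genStamp) {
            return false;
        }

        // New block, expired entry, or newer genstamp: record and rescan.
        recent.put(blockId, new Entry(genStamp, nowMs));
        return true;
    }
}
```

Because a skipped lookup does not refresh {{addedAtMs}}, a block that is 
reported repeatedly still becomes eligible for rescan once 10 minutes have 
passed since it was recorded, which is the indefinite-postponement case the 
last bullet addresses.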

No test changes needed.  {{TestBlockScanner#testAppendWhileScanning}} proves 
this approach continues to work.

The only difference between the trunk and branch-2 patches is context and a 
few log lines in code copied into a {{getStoredBlock}} method.


was (Author: daryn):
Normal serving has been greatly impaired.  Appending to a block while it's 
scanned is exceeding rare compared to the normal block sending rate, yet the 
fix impacted all serving.  There's a bettery way accomplished via:
* Entirely remove (revert) fetching of checksums for finalized blocks in the 
BlockSender ctor.  Reduces lock hold time by eliminating i/o in the dataset 
lock.
* If a checksum exception occurs during the scan, and the genstamp changed, 
mark the block as suspect for rescan.  This is the edge case.
* Recent suspect blocks considers genstamps.  Suspect blocks with a newer 
genstamp than last recorded are not skipped.
* Recent suspects expire 10 min after being added to the cache.  Prior behavior 
was 10 mins after last access - which could lead to indefinite postponement.

No test changes needed.  {{TestBlockScanner#testAppendWhileScanning}} proves 
this approach continues to work.

Only difference in trunk/branch-2 is context and a few log lines in code copied 
into a getStoredBlock method.

> BlockSender performance regression due to volume scanner edge case
> ------------------------------------------------------------------
>
>                 Key: HDFS-12136
>                 URL: https://issues.apache.org/jira/browse/HDFS-12136
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.8.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-12136.branch-2.patch, HDFS-12136.trunk.patch
>
>
> HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan 
> by reading the last checksum of finalized blocks within the {{BlockSender}} 
> ctor.  Unfortunately, it holds the exclusive dataset lock while opening and 
> reading the metafile multiple times.  Block sender instantiation becomes 
> serialized.  Performance completely collapses under heavy disk i/o 
> utilization or high xceiver activity, e.g. lost node replication, balancing, 
> or decommissioning.  The xceiver threads congest creating block senders and 
> impair the heartbeat processing that is contending for the same lock.  
> Combined with other lock contention issues, pipelines break and nodes 
> sporadically go dead.
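
To illustrate why i/o under an exclusive lock serializes callers, here is a 
generic sketch.  It is not HDFS code: {{LockScopeSketch}}, both method names, 
and the {{slowRead}} callback are hypothetical, and the second method shows 
only the general pattern of keeping slow work outside the lock (the attached 
patches remove the checksum read from the ctor altogether rather than 
relocating it).

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Hypothetical illustration of lock scope; not taken from the datanode. */
final class LockScopeSketch {
    private final ReentrantLock datasetLock = new ReentrantLock();

    // Stand-in for state guarded by the lock (illustrative file name).
    private String metaPath = "blk_1_1001.meta";

    /** Slow read runs while the lock is held: every caller queues behind it. */
    String openWithIoUnderLock(Function<String, String> slowRead) {
        datasetLock.lock();
        try {
            return slowRead.apply(metaPath);
        } finally {
            datasetLock.unlock();
        }
    }

    /** Only the in-memory lookup is locked; the slow read runs afterwards. */
    String openWithIoOutsideLock(Function<String, String> slowRead) {
        String path;
        datasetLock.lock();
        try {
            path = metaPath;
        } finally {
            datasetLock.unlock();
        }
        return slowRead.apply(path);
    }

    /** Helper so callers can observe whether they currently hold the lock. */
    boolean isDatasetLockHeldByCurrentThread() {
        return datasetLock.isHeldByCurrentThread();
    }
}
```

In the first method, the lock hold time grows with disk latency, so under 
heavy i/o every thread needing the lock waits on the slowest read.  In the 
second, the hold time is independent of disk latency.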



