YUBI LEE created HDFS-17868:
-------------------------------
Summary: introduce BlockPlacementPolicyCrossDC for multi
datacenter stretched hdfs cluster
Key: HDFS-17868
URL: https://issues.apache.org/jira/browse/HDFS-17868
Project: Hadoop HDFS
Issue Type: New Feature
Components: block placement
Reporter: YUBI LEE
I implemented a "BlockPlacementPolicyCrossDC" policy based on ideas from
https://dan.naver.com/25/sessions/692 and
https://www.youtube.com/watch?v=1h4k_Dbt0t8. Thanks to [~acedia28].
It would be even better if [~acedia28] could share a better version of the
block placement policy.
It introduces the following configuration properties
(default values, where set, are in parentheses):
{code}
dfs.block.replicator.cross.dc.async.enabled (false)
dfs.block.replicator.cross.dc.preferred.datacenter
dfs.block.replicator.cross.dc.bandwidth.limit.mb
dfs.block.replicator.cross.dc.bandwidth.refill.period.sec
dfs.block.replicator.cross.dc.sync.paths
dfs.block.replicator.cross.dc.limited.sync.paths
{code}
Following the ideas from the session mentioned above, this policy introduces
three ways to write an HDFS block:
- sync write: the original HDFS behavior.
- limited sync write: uses bucket4j; writes are synchronous while below the
threshold and asynchronous once above it.
- async write: returns to the HDFS client only datanode candidates located in
the same datacenter as the client; under-replicated blocks are replicated
later, asynchronously.
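
The limited sync and async behaviors above can be illustrated with a minimal sketch. It is not taken from the patch: the class, method, and parameter names are hypothetical, the small TokenBucket class is a plain-Java stand-in for a bucket4j bucket, and the "/datacenter/rack/host" network location format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the HDFS-17868 patch. */
class CrossDcWriteSketch {

  enum WriteMode { SYNC, ASYNC }

  /** Minimal token bucket, used here as a stand-in for a bucket4j bucket. */
  static final class TokenBucket {
    private final long capacity;
    private final long refillPeriodNanos;
    private long tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, long refillPeriodNanos, long nowNanos) {
      this.capacity = capacity;
      this.refillPeriodNanos = refillPeriodNanos;
      this.tokens = capacity;
      this.lastRefillNanos = nowNanos;
    }

    /** Refills to full capacity once a period has elapsed, then tries to take n tokens. */
    synchronized boolean tryConsume(long n, long nowNanos) {
      if (nowNanos - lastRefillNanos >= refillPeriodNanos) {
        tokens = capacity;
        lastRefillNanos = nowNanos;
      }
      if (n > tokens) {
        return false;
      }
      tokens -= n;
      return true;
    }
  }

  /** Limited sync write: stay synchronous while the budget lasts, then go async. */
  static WriteMode chooseMode(TokenBucket bucket, long blockSizeMb, long nowNanos) {
    return bucket.tryConsume(blockSizeMb, nowNanos) ? WriteMode.SYNC : WriteMode.ASYNC;
  }

  /** Async write: keep only candidates in the client's datacenter. */
  static List<String> sameDatacenter(List<String> networkLocations, String clientDc) {
    // Assumption: locations look like "/dc1/rack1/host1".
    String prefix = "/" + clientDc + "/";
    List<String> result = new ArrayList<>();
    for (String location : networkLocations) {
      if (location.startsWith(prefix)) {
        result.add(location);
      }
    }
    return result;
  }
}
```

With a 256 MB budget per period, two 128 MB blocks would be written synchronously and a third in the same period would fall back to async until the next refill. The clock is passed in explicitly so the behavior is easy to exercise without waiting for real time to pass.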
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]