[
https://issues.apache.org/jira/browse/HDFS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arpit Agarwal updated HDFS-5153:
--------------------------------
Attachment: HDFS-5153.04.patch
Updated patch with the following changes:
# Introduces new conf key {{DFS_BLOCKREPORT_SPLIT_THRESHOLD_KEY}}. If the
number of blocks on the DN is below this threshold, block reports for all
storages are sent in a single message. Otherwise the block reports are split
across messages. The default value is currently 1 million blocks.
# When splitting, reports for all storages are sent in quick succession. In the
future we can consider spacing them apart.
# The 'staleness' of a DN is determined by whether all storages have reported
since the last restart/failover.
# Added new test class {{TestDnRespectsBlockReportSplitThreshold}}.
# Refactored existing test {{TestBlockReport}} into base class
{{BlockReportTestBase}} and derived classes for readability.
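
The threshold rule in the first item can be sketched roughly as below. This is an illustration only, not code from the patch: the class and method names ({{BlockReportSplitSketch}}, {{planMessages}}) are hypothetical, and it assumes that "split" means one message per storage, as the issue title suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the split-threshold decision; not taken from the patch.
public class BlockReportSplitSketch {
    // Default mentioned in this comment: 1 million blocks.
    static final long DEFAULT_SPLIT_THRESHOLD = 1_000_000L;

    /**
     * Plans how per-storage block reports are grouped into messages.
     * Each inner list holds the storage IDs whose reports share one message.
     */
    static List<List<String>> planMessages(Map<String, Long> blocksPerStorage,
                                           long splitThreshold) {
        long totalBlocks = 0;
        for (long count : blocksPerStorage.values()) {
            totalBlocks += count;
        }

        List<List<String>> messages = new ArrayList<>();
        if (totalBlocks < splitThreshold) {
            // Below the threshold: reports for all storages go in a single message.
            messages.add(new ArrayList<>(blocksPerStorage.keySet()));
        } else {
            // At or above the threshold: assumed here to be one message per storage.
            for (String storageId : blocksPerStorage.keySet()) {
                List<String> single = new ArrayList<>();
                single.add(storageId);
                messages.add(single);
            }
        }
        return messages;
    }
}
```

For example, a DN with two storages holding 600,000 blocks each is above the default threshold, so this sketch would plan two messages; with 400,000 blocks each it would plan one.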
> Datanode should send block reports for each storage in a separate message
> -------------------------------------------------------------------------
>
> Key: HDFS-5153
> URL: https://issues.apache.org/jira/browse/HDFS-5153
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 3.0.0
> Reporter: Arpit Agarwal
> Attachments: HDFS-5153.01.patch, HDFS-5153.03.patch,
> HDFS-5153.03b.patch, HDFS-5153.04.patch
>
>
> When the number of blocks on the DataNode grows large we start running into a
> few issues:
> # Block reports take a long time to process on the NameNode. In testing we
> have seen that a block report with 6 million blocks takes close to one second
> to process on the NameNode. The NameSystem write lock is held during this
> time.
> # We start hitting the default protobuf message limit of 64MB somewhere
> around 10 million blocks. While we can increase the message size limit, it
> already takes over 7 seconds to serialize/deserialize a block report of this
> size.
> HDFS-2832 has introduced the concept of a DataNode as a collection of
> storages, i.e., the NameNode is aware of all the volumes (storage directories)
> attached to a given DataNode. This makes it easy to split block reports from
> the DN by sending one report per storage directory to mitigate the above
> problems.