[ https://issues.apache.org/jira/browse/HDFS-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Virajith Jalaparti updated HDFS-12777:
--------------------------------------
Attachment: HDFS-12777-HDFS-9806.003.patch
Thanks for taking a look, [~elgoiri]. Posting an updated patch with additional
javadocs for {{getSuffix}}, {{ProvidedReplica}}, and {{ReplicaBuilder}}. I also
augmented {{testProvidedReplicaSuffixExtraction}} with more examples.
> [READ] Reduce memory and CPU footprint for PROVIDED volumes.
> ------------------------------------------------------------
>
> Key: HDFS-12777
> URL: https://issues.apache.org/jira/browse/HDFS-12777
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Virajith Jalaparti
> Assignee: Virajith Jalaparti
> Attachments: HDFS-12777-HDFS-9806.001.patch,
> HDFS-12777-HDFS-9806.002.patch, HDFS-12777-HDFS-9806.003.patch
>
>
> Unlike local blocks, all blocks in PROVIDED storage are tracked by every DN.
> This can be millions of blocks for hundreds of TBs of PROVIDED data.
> Storing the data for these blocks can lead to a large memory footprint.
> Further, with so many blocks, running the {{DirectoryScanner}} on a PROVIDED
> volume can increase memory and CPU utilization.
> To reduce these overheads, this JIRA aims to (a) disable the
> {{DirectoryScanner}} on PROVIDED volumes (as HDFS-9806 focuses on only
> read-only data in PROVIDED volumes), (b) reduce the space occupied by
> {{FinalizedProvidedReplicaInfo}} by using a common URI prefix across all
> PROVIDED blocks.
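
To illustrate idea (b), here is a minimal sketch of how a replica might keep only a per-block suffix plus a reference to one shared prefix URI. This is not the code in the attached patch: the class and method names ({{PrefixSuffixSketch}}, {{CompactReplica}}, {{suffix}}, {{fullUri}}) are hypothetical, and the example URIs are made up.

```java
import java.net.URI;

// Hypothetical sketch only; not the HDFS-12777 implementation.
class PrefixSuffixSketch {

    /** A replica that stores a shared prefix reference and a short suffix. */
    static final class CompactReplica {
        // The same URI object is referenced by every replica under this prefix,
        // so the prefix characters are held in memory only once.
        private final URI sharedPrefix;
        // The part of the block URI that follows the prefix, or the whole
        // URI string when the prefix does not match.
        private final String suffix;

        CompactReplica(URI prefix, URI full) {
            String p = prefix.toString();
            String f = full.toString();
            if (f.startsWith(p)) {
                this.sharedPrefix = prefix;
                this.suffix = f.substring(p.length());
            } else {
                // Fall back to keeping the full URI when there is no common prefix.
                this.sharedPrefix = null;
                this.suffix = f;
            }
        }

        String suffix() {
            return suffix;
        }

        /** Rebuilds the full URI on demand instead of storing it. */
        URI fullUri() {
            return URI.create(sharedPrefix == null ? suffix : sharedPrefix.toString() + suffix);
        }
    }

    public static void main(String[] args) {
        URI prefix = URI.create("s3a://bucket/data/");
        CompactReplica r = new CompactReplica(prefix, URI.create("s3a://bucket/data/part-0001"));
        System.out.println(r.suffix());
        System.out.println(r.fullUri());
    }
}
```

With millions of replicas under one prefix, each object in this sketch carries only the short suffix string and a pointer to the shared prefix; the full URI is rebuilt only when a caller asks for it.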