[ 
https://issues.apache.org/jira/browse/HDFS-12777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-12777:
--------------------------------------
    Description: 
As opposed to local blocks, each DN keeps track of all blocks in PROVIDED 
storage. This can be millions of blocks for 100s of TBs of PROVIDED data. 
Storing the data for these blocks can lead to a large memory footprint. 
Further, with so many blocks, {{DirectoryScanner}} running on a PROVIDED volume 
can increase the memory and CPU utilization. 

To reduce these overheads, this JIRA aims to (a) disable the 
{{DirectoryScanner}} on PROVIDED volumes (as HDFS-9806 focuses on only 
read-only data in PROVIDED volumes), (b) reduce the space occupied by 
{{FinalizedProvidedReplicaInfo}} by using a common URI prefix across all PROVIDED 
blocks.
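
A minimal sketch of idea (b), under stated assumptions: instead of each replica 
holding its own full URI, replicas that live under the same remote path share 
one prefix object and store only a short per-block suffix. The class name 
ProvidedReplicaSketch, the intern pool, and the example paths are hypothetical 
and illustrative only; they are not the actual Hadoop classes or the patch 
attached to this JIRA.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a PROVIDED replica record; not Hadoop code. */
final class ProvidedReplicaSketch {
    // Pool of prefixes: every replica under the same prefix string
    // references the same URI object instead of holding its own copy.
    private static final Map<String, URI> PREFIX_POOL = new HashMap<>();

    private final URI prefix;     // shared across many replicas
    private final String suffix;  // per-block remainder of the path

    ProvidedReplicaSketch(String prefix, String suffix) {
        this.prefix = internPrefix(prefix);
        this.suffix = suffix;
    }

    // Returns the pooled URI for this prefix, creating it on first use.
    private static synchronized URI internPrefix(String prefix) {
        return PREFIX_POOL.computeIfAbsent(prefix, URI::create);
    }

    URI getPrefix() {
        return prefix;
    }

    // The full URI is rebuilt on demand rather than stored per block.
    URI getBlockURI() {
        return URI.create(prefix.toString() + suffix);
    }
}
```

With millions of blocks under a handful of remote directories, the per-block 
cost in this sketch is one reference plus a short suffix string, at the price 
of rebuilding the full URI whenever it is requested.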



  was:
As opposed to local blocks, each DN keeps track of all blocks in PROVIDED 
storage. This can be millions of blocks for 100s of TBs of PROVIDED data. This 
JIRA aims to reduce the memory footprint of these blocks by using a common URI 
prefix across all PROVIDED blocks.
Further, with so many blocks the DirectoryScanner can take up a lot of 




> [READ] Reduce memory and CPU footprint for PROVIDED volumes.
> ------------------------------------------------------------
>
>                 Key: HDFS-12777
>                 URL: https://issues.apache.org/jira/browse/HDFS-12777
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Virajith Jalaparti
>            Priority: Major
>
> As opposed to local blocks, each DN keeps track of all blocks in PROVIDED 
> storage. This can be millions of blocks for 100s of TBs of PROVIDED data. 
> Storing the data for these blocks can lead to a large memory footprint. 
> Further, with so many blocks, {{DirectoryScanner}} running on a PROVIDED 
> volume can increase the memory and CPU utilization. 
> To reduce these overheads, this JIRA aims to (a) disable the 
> {{DirectoryScanner}} on PROVIDED volumes (as HDFS-9806 focuses on only 
> read-only data in PROVIDED volumes), (b) reduce the space occupied by 
> {{FinalizedProvidedReplicaInfo}} by using a common URI prefix across all 
> PROVIDED blocks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

