[ 
https://issues.apache.org/jira/browse/HDDS-13608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-13608:
-------------------------------
    Description: 
In an ideal cluster, each container is closed only when it is full (e.g. as it 
nears the 5GB size). In real clusters, however, containers are often closed 
prematurely for one reason or another, which leaves many small containers. 
Container replication treats small and big containers equally, so operations 
such as decommission take longer because they have to replicate many of these 
small containers.

This is a wish to kickstart discussion on container merging. If there are small 
CLOSED containers, we can schedule merge operations to combine them into a 
single container. However, we foresee a lot of complexity, since we might need 
to create a new container ID (or pick an existing one), which will differ from 
what is stored in the key location info in the OM. One option is to add a 
mapping layer between the old container ID and the new (merged) container ID 
that is consulted when getting the block location, but this adds memory and 
lookup overhead.
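
To make the trade-off concrete, the mapping layer described above could look 
roughly like the sketch below. It is only an illustration for discussion: the 
class ContainerIdRemapper and its methods are hypothetical and are not part of 
Apache Ozone, and the sketch assumes container IDs are plain long values held 
in an in-memory map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for discussion; not an existing Apache Ozone class.
final class ContainerIdRemapper {
  // Old container ID -> ID of the container it was merged into.
  // This map is the extra memory cost: one entry per merged-away container.
  private final Map<Long, Long> mergedInto = new HashMap<>();

  /** Records that the data of every container in sourceIds now lives in targetId. */
  void recordMerge(List<Long> sourceIds, long targetId) {
    if (mergedInto.containsKey(targetId)) {
      // A container that was already merged away cannot be a merge target;
      // rejecting it also keeps the mapping free of cycles.
      throw new IllegalArgumentException("Target was already merged: " + targetId);
    }
    for (long sourceId : sourceIds) {
      // In the "pick one" case the target keeps its own ID and needs no entry.
      if (sourceId != targetId) {
        mergedInto.put(sourceId, targetId);
      }
    }
  }

  /**
   * Translates the container ID stored in the key location info into the ID
   * of the container that currently holds the data. Each hop is one extra map
   * lookup, which is the lookup cost of this approach.
   */
  long resolve(long containerId) {
    long current = containerId;
    Long next;
    while ((next = mergedInto.get(current)) != null) {
      current = next;
    }
    return current;
  }

  /** Number of old container IDs that must be kept in the mapping. */
  int size() {
    return mergedInto.size();
  }
}
```

For example, after recordMerge(List.of(101L, 102L), 500L), a block location 
that still refers to container 101 resolves to container 500, while an ID that 
was never merged resolves to itself. If merged containers are merged again, 
resolve follows the chain, so the lookup cost grows with the number of merge 
generations unless the chain is compacted.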

  was:
In an ideal cluster, each container is closed only when it is full (e.g. as it 
nears the 5GB size). In real clusters, however, containers are often closed 
prematurely for one reason or another, which leaves many small containers. 
Container replication treats small and big containers equally, so operations 
such as decommission take longer because they have to replicate many of these 
small containers.

This is a wish to kickstart discussion on container merging. If there are small 
CLOSED containers, we can schedule merge operations to combine them into a 
single container. However, we foresee a lot of complexity, since we might need 
to create a new container ID, which will differ from what is stored in the key 
location info in the OM. One option is to add a mapping layer between the old 
container ID and the new (merged) container ID that is consulted when getting 
the block location, but this adds overhead in memory and during lookup.


> Support Ozone container merge
> -----------------------------
>
>                 Key: HDDS-13608
>                 URL: https://issues.apache.org/jira/browse/HDDS-13608
>             Project: Apache Ozone
>          Issue Type: Wish
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> In an ideal cluster, each container is closed only when it is full (e.g. as 
> it nears the 5GB size). In real clusters, however, containers are often 
> closed prematurely for one reason or another, which leaves many small 
> containers. Container replication treats small and big containers equally, so 
> operations such as decommission take longer because they have to replicate 
> many of these small containers.
>
> This is a wish to kickstart discussion on container merging. If there are 
> small CLOSED containers, we can schedule merge operations to combine them 
> into a single container. However, we foresee a lot of complexity, since we 
> might need to create a new container ID (or pick an existing one), which will 
> differ from what is stored in the key location info in the OM. One option is 
> to add a mapping layer between the old container ID and the new (merged) 
> container ID that is consulted when getting the block location, but this adds 
> memory and lookup overhead.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

