[ 
https://issues.apache.org/jira/browse/HDDS-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-8874:
---------------------------------
    Labels: pull-request-available  (was: )

> Use unique OM ID as SCM container owner
> ---------------------------------------
>
>                 Key: HDDS-8874
>                 URL: https://issues.apache.org/jira/browse/HDDS-8874
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, OM passes OzoneManager#getOmNodeId as the owner argument in 
> OMKeyRequest#allocateBlock. This means that the created container's owner is 
> based on the OM HA leader's Node ID specified in the ozone configuration 
> (i.e. ozone.om.node.id).
> When multiple OMs share the same SCM and the configured Node IDs are not 
> unique across the SCM cluster, a single container may end up containing 
> blocks owned by multiple OMs. This should be fine for now, but if we later 
> want to separate the containers based on the Storage Type, this collision 
> might raise tricky situations.
> For example, consider the following configuration (OM addresses omitted):
> {code:xml}
> <property>
>   <name>ozone.om.service.ids</name>
>   <value>service1,service2</value>
> </property>
> <property>
>   <name>ozone.om.nodes.service1</name>
>   <value>om1,om2,om3</value>
> </property> 
> <property>
>   <name>ozone.om.nodes.service2</name>
>   <value>om1,om2,om3</value>
> </property>{code}
>  
> If the OM leaders of both service1 and service2 have the same node ID (e.g. 
> om1), both OMs will use the configured node ID as the owner when allocating 
> blocks. SCM will then assume that the requests come from the same owner and 
> may allocate the same container for the blocks.
> We should find a way to pick a unique ID to pass to the allocateBlock call. 
> Currently, a valid option is to use omUuid, since the chance of collisions 
> would be smaller. 
> However, a better option for the container owner is an ID that is unique to 
> a single OM HA service (instead of a node ID). This has possible additional 
> advantages:
>  * When the OM leader changes, the SCM does not need to create a new 
> container for the new blocks
>  * Fewer OPEN containers, and fewer containers in SCM overall
> Unfortunately, I have not found such an ID available. The candidates I have 
> considered:
>  * OM's ClusterID is tied to the SCM so it doesn't make sense to use it as 
> container owner.
>  * Service ID cannot be used since there doesn't seem to be any uniqueness 
> guarantee across OM HA services.
>  * OM Service Ratis Group ID is derived from the MD5 hash of the OM HA 
> Service ID, so it has the same uniqueness problem and cannot be used.
> Another consideration is migrating the existing containers' owners to the 
> new naming convention. This could possibly be done with some container 
> task, and can be handled in another ticket.
> Any further suggestions are greatly appreciated.
>  
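The collision described in the issue can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyScm` is a hypothetical stand-in that keeps one open container per owner string, which is far simpler than what SCM really does, and the per-service UUID is an invented placeholder, since the issue states that no such ID has been found yet.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified model; not the actual SCM or OM code.
class OwnerCollisionSketch {

    /** Toy SCM that keeps a single open container per owner string. */
    static class ToyScm {
        private final Map<String, Long> openContainerByOwner = new HashMap<>();
        private long nextContainerId = 1;

        /** Returns the ID of the container the new block would land in. */
        long allocateBlock(String owner) {
            return openContainerByOwner.computeIfAbsent(owner, o -> nextContainerId++);
        }
    }

    public static void main(String[] args) {
        ToyScm scm = new ToyScm();

        // Both services have a leader whose configured node ID is "om1".
        long fromService1 = scm.allocateBlock("om1");
        long fromService2 = scm.allocateBlock("om1");
        System.out.println("node ID as owner, shared container: "
            + (fromService1 == fromService2));

        // Assumption: each OM HA service has its own unique ID (placeholder UUIDs).
        String service1Owner = UUID.randomUUID().toString();
        String service2Owner = UUID.randomUUID().toString();
        long c1 = scm.allocateBlock(service1Owner);
        long c2 = scm.allocateBlock(service2Owner);
        // A leader change inside service1 keeps the same owner string.
        long c1AfterFailover = scm.allocateBlock(service1Owner);
        System.out.println("service ID as owner, shared container: " + (c1 == c2));
        System.out.println("container reused after failover: " + (c1 == c1AfterFailover));
    }
}
```

In this model an owner shared by the whole HA service keeps the two services apart while still letting a new leader reuse the open container, which matches the two advantages listed in the description.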
>  
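The point about the Ratis Group ID can also be sketched. The snippet assumes the derivation behaves like a name-based (MD5) UUID computed from the service ID string; `groupIdFor` is a hypothetical helper written for illustration, not the Ozone method. Under that assumption, two OM HA services configured with the same service ID get the same group ID, so the hash adds no uniqueness beyond the service ID itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper; assumes an MD5 name-based UUID derived from the service ID.
class GroupIdSketch {

    /** Derives a deterministic, MD5-based (version 3) UUID from a service ID. */
    static UUID groupIdFor(String omServiceId) {
        return UUID.nameUUIDFromBytes(omServiceId.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Two independent deployments that happen to pick the same service ID.
        UUID deploymentA = groupIdFor("service1");
        UUID deploymentB = groupIdFor("service1");
        UUID other = groupIdFor("service2");

        System.out.println("same service ID, same group ID: " + deploymentA.equals(deploymentB));
        System.out.println("different service ID, same group ID: " + deploymentA.equals(other));
    }
}
```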



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

