[
https://issues.apache.org/jira/browse/HDDS-8874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-8874:
---------------------------------
Labels: pull-request-available (was: )
> Use unique OM ID as SCM container owner
> ---------------------------------------
>
> Key: HDDS-8874
> URL: https://issues.apache.org/jira/browse/HDDS-8874
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
> Labels: pull-request-available
>
> Currently, OM passes OzoneManager#getOmNodeId as the owner argument in
> OMKeyRequest#allocateBlock. This means that the created container's owner is
> based on the OM HA leader's Node ID specified in the ozone configuration
> (i.e. ozone.om.node.id).
> Consider the case where multiple OMs use the same SCM. If the configured
> node IDs are not unique across the SCM cluster, a single container may end
> up containing blocks owned by multiple OMs. This should be fine for now, but
> if we later want to separate containers by storage type, this collision
> could lead to tricky situations.
> For example, consider the following configuration (OM addresses omitted):
> {code:xml}
> <property>
>   <name>ozone.om.service.ids</name>
>   <value>service1,service2</value>
> </property>
> <property>
>   <name>ozone.om.nodes.service1</name>
>   <value>om1,om2,om3</value>
> </property>
> <property>
>   <name>ozone.om.nodes.service2</name>
>   <value>om1,om2,om3</value>
> </property>{code}
>
> If the OM leaders of service1 and service2 have the same node ID (e.g.
> om1), both OMs will use that configured node ID as the owner when allocating
> blocks. SCM will then assume that the requests come from the same owner and
> allocate the blocks from the same container.
> We should find a way to pick a unique ID to pass to the allocateBlock call.
> Currently, a valid option is to use omUuid, since the chance of collisions
> would be smaller.
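
The collision described above can be sketched with a toy model. This is not Ozone code: `OwnerCollisionSketch` and its `allocateBlock` method are hypothetical stand-ins that assume SCM picks an open container purely by the owner string, and `UUID.randomUUID()` only stands in for omUuid.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for owner-based container selection; not Ozone code.
final class OwnerCollisionSketch {
    // Assumption: one open container per owner string, created on first use.
    private final Map<String, Long> openContainerByOwner = new HashMap<>();
    private long nextContainerId = 1;

    // Returns the container ID that a block for this owner would land in.
    long allocateBlock(String owner) {
        return openContainerByOwner.computeIfAbsent(owner, o -> nextContainerId++);
    }

    public static void main(String[] args) {
        OwnerCollisionSketch scm = new OwnerCollisionSketch();

        // The leaders of service1 and service2 both report node ID "om1".
        long fromService1 = scm.allocateBlock("om1");
        long fromService2 = scm.allocateBlock("om1");
        System.out.println("node ID owners share a container: "
            + (fromService1 == fromService2));

        // A per-OM UUID (standing in for omUuid) keeps the owners apart.
        long fromUuidA = scm.allocateBlock(UUID.randomUUID().toString());
        long fromUuidB = scm.allocateBlock(UUID.randomUUID().toString());
        System.out.println("UUID owners share a container: "
            + (fromUuidA == fromUuidB));
    }
}
```

In this model, the two "om1" requests cannot be told apart, while UUID-based owners end up in separate containers.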
> However, a better option for the container owner is a unique ID for a
> single OM HA service (instead of a node ID). This has possible additional
> advantages:
> * When the OM leader changes, SCM does not need to create a new container
> for the new blocks.
> * Fewer OPEN containers, and fewer containers in SCM overall.
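
As a rough illustration of the two advantages listed above, the sketch below counts how many containers get opened across leader changes. It assumes, hypothetically, that each distinct owner string forces one open container; `FailoverOwnerSketch` and the `"service-owner"` value are made up for the example, and no such service-level ID exists in the description.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model: each distinct owner string forces one open container.
final class FailoverOwnerSketch {
    // Counts containers opened for the owners seen across a sequence of leader terms.
    static int openContainers(String[] ownersOverTime) {
        Set<String> owners = new HashSet<>();
        for (String owner : ownersOverTime) {
            owners.add(owner);
        }
        return owners.size();
    }

    public static void main(String[] args) {
        // Leadership moves om1 -> om2 -> om3 within one OM HA service.
        String[] nodeIdOwners = {"om1", "om2", "om3"};
        // Hypothetical service-level owner that stays the same across failovers.
        String[] serviceOwners = {"service-owner", "service-owner", "service-owner"};

        System.out.println("node ID owners: " + openContainers(nodeIdOwners));
        System.out.println("service-level owner: " + openContainers(serviceOwners));
    }
}
```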
> Unfortunately, I have not found such an ID. The candidates I have
> considered:
> * OM's ClusterID is tied to the SCM, so it doesn't make sense to use it as
> the container owner.
> * The service ID cannot be used, since there doesn't seem to be any
> uniqueness guarantee across OM HA services.
> * The OM service Ratis group ID is derived from the MD5 hash of the OM HA
> service ID, so it inherits the same lack of uniqueness and cannot be used
> either.
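
The last candidate can be checked with a small sketch. It assumes the group ID is computed as a name-based UUID over the service ID bytes; `GroupIdSketch` and `groupIdFor` are hypothetical names. The JDK's `UUID.nameUUIDFromBytes` produces a version 3 (MD5-based) UUID, so equal service IDs always map to equal group IDs.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class GroupIdSketch {
    // Assumption: the group ID is a name-based UUID over the service ID bytes.
    // UUID.nameUUIDFromBytes yields a version 3 (MD5-based) UUID.
    static UUID groupIdFor(String serviceId) {
        return UUID.nameUUIDFromBytes(serviceId.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Two unrelated OM HA services that happen to be configured as "service1".
        UUID first = groupIdFor("service1");
        UUID second = groupIdFor("service1");
        System.out.println("same service ID, same group ID: " + first.equals(second));
        System.out.println("UUID version: " + first.version());
    }
}
```

Because the derivation is deterministic, hashing adds no uniqueness beyond what the service ID already has.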
> Another consideration is migrating the owner of existing containers to the
> new naming convention. This might be done using some container task, and
> can be handled in another ticket.
> Any further suggestions are greatly appreciated.
>
>