[ 
https://issues.apache.org/jira/browse/HDDS-8749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729371#comment-17729371
 ] 

David Ayres commented on HDDS-8749:
-----------------------------------

Attila, I did miss that detail in the documentation; thanks for pointing it 
out. However, even with the service ID set, it still fails and reports that 
the ID is null.

 

WARN ha.OMProxyInfo: OzoneManager address testcluster:9862 for serviceID null 
remains unresolved for node ID null Check your ozone-site.xml file to ensure 
ozone manager addresses are configured properly.

 

The documentation is not quite clear on this: does the service ID also need 
to be a DNS record?

My ozone-site.xml is set up as follows:

 
<!--OM HA Settings-->
<property>
<name>ozone.om.ratis.enable</name>
<value>true</value>
</property>

<property>
<name>ozone.om.service.ids</name>
<value>testcluster</value>
</property>

<property>
<name>ozone.om.nodes.testcluster</name>
<value>om1,om2,om3</value>
</property>

<property>
<name>ozone.om.address.testcluster.om1</name>
<value>ddl07oom01.root.local</value>
</property>

<property>
<name>ozone.om.address.testcluster.om2</name>
<value>ddl07oom02.root.local</value>
</property>

<property>
<name>ozone.om.address.testcluster.om3</name>
<value>ddl07oom03.root.local</value>
</property>
 
<!--SCM HA Settings-->
<property>
<name>ozone.scm.ratis.enable</name>
<value>true</value>
</property>

<property>
<name>ozone.scm.service.ids</name>
<value>testcluster</value>
</property>

<property>
<name>ozone.scm.nodes.testcluster</name>
<value>scm1,scm2,scm3</value>
</property>

<property>
<name>ozone.scm.address.testcluster.scm1</name>
<value>ddl07oscm01.root.local</value>
</property>

<property>
<name>ozone.scm.address.testcluster.scm2</name>
<value>ddl07oscm02.root.local</value>
</property>

<property>
<name>ozone.scm.address.testcluster.scm3</name>
<value>ddl07oscm03.root.local</value>
</property>
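
For reference, the defaultFS entry under discussion might look like the sketch 
below once it points at the service ID instead of a single OM host. This 
assumes the ofs://<serviceID>/ form described in the Ozone OFS documentation 
and reuses the testcluster ID from the settings above; it is not confirmed as 
the fix for this issue. The warning above treats testcluster as a host name on 
port 9862, which may mean the client running the HDFS command is not reading 
the OM HA properties at all, so those properties probably also need to be 
visible in that client's configuration.

```xml
<!-- core-site.xml (sketch): address the OM service ID, not one OM host -->
<property>
<name>fs.defaultFS</name>
<!-- Assumption: testcluster is the value of ozone.om.service.ids -->
<value>ofs://testcluster/</value>
</property>
```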
 
 

> [Hadoop OFS] HDFS commands fail when not set as the leader of OMHA
> ------------------------------------------------------------------
>
>                 Key: HDDS-8749
>                 URL: https://issues.apache.org/jira/browse/HDDS-8749
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OFS, OM HA
>    Affects Versions: 1.3.0
>         Environment: OS: Red Hat 8
>            Reporter: David Ayres
>            Priority: Minor
>
> When setting the defaultFS in Hadoop's core-site.xml it seems you are only 
> allowed to declare one OM node, but if the node declared is not the leader it 
> fails with the following error:
> INFO retry.RetryInvocationHandler: com.google.protobuf.ServiceException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ozone.om.exceptions.OMNotLeaderException):
>  OM:om1 is not the leader. Could not determine the leader node.
>  
> , while invoking $Proxy13.submitRequest over 
> nodeId=null,nodeAddress=ddl07oom01.vuhl.root.mrc.local:9862 after 1 failover 
> attempts. Trying to failover after sleeping for 4000ms. Current retry count: 
> 1.
>  
> HDFS commands only work when the leader is declared, which defeats the 
> purpose of HA: if the OM node were to fail over, HDFS commands would cease 
> to work.
>  
> There also does not seem to be any documentation on how HA works with 
> OFS/O3FS as of yet and I am not sure if this is a feature in the works or not.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

