[
https://issues.apache.org/jira/browse/HDDS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aravindan Vijayan updated HDDS-2848:
------------------------------------
Description:
Recon talks to OM in two ways: through HTTP to get a DB snapshot, and through
RPC to get delta updates.
Since Recon already uses the OzoneManagerClientProtocol to query the
OzoneManager over RPC, the RPC client automatically routes the request to the
leader on an OM HA cluster. Recon only needs the updates from the OM RocksDB
store, and does not need the in-flight updates in the OM DoubleBuffer. Because
Ratis guarantees that the leader’s RocksDB is always up to date, Recon does
not need to worry about going back in time when the current OM leader goes
down. We have to pass the OM service ID to the Ozone Manager client in Recon,
and failover then works internally. Currently we pass in 'null'.
To make the HTTP call work against OM HA, Recon has to find out the current
OM leader and download the snapshot from that OM instance. We can follow the
approach implemented in
org.apache.hadoop.ozone.admin.om.GetServiceRolesSubcommand: get the roles of
the OM instances and then determine the leader from them.
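
As a rough illustration of the leader lookup described above, here is a minimal,
self-contained sketch. It does not use the real Ozone classes: OmInstance, its
fields, the role strings, the host names, the port, and the snapshot path are
all hypothetical stand-ins for whatever the OM service list actually returns,
and the real logic in GetServiceRolesSubcommand may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class OmLeaderLookup {

    /** Hypothetical stand-in for one OM entry in the service list. */
    static final class OmInstance {
        final String hostname;
        final int httpPort;
        final String raftRole; // assumed values: "LEADER" or "FOLLOWER"

        OmInstance(String hostname, int httpPort, String raftRole) {
            this.hostname = hostname;
            this.httpPort = httpPort;
            this.raftRole = raftRole;
        }
    }

    /** Returns the first instance that reports the LEADER role, if any. */
    static Optional<OmInstance> findLeader(List<OmInstance> instances) {
        return instances.stream()
            .filter(om -> "LEADER".equalsIgnoreCase(om.raftRole))
            .findFirst();
    }

    /** Builds the snapshot URL for the given OM; the path is a placeholder. */
    static String snapshotUrl(OmInstance om, String path) {
        return "http://" + om.hostname + ":" + om.httpPort + path;
    }

    public static void main(String[] args) {
        // Illustrative data only; a real caller would fetch this over RPC.
        List<OmInstance> oms = Arrays.asList(
            new OmInstance("om1.example.com", 9874, "FOLLOWER"),
            new OmInstance("om2.example.com", 9874, "LEADER"),
            new OmInstance("om3.example.com", 9874, "FOLLOWER"));

        Optional<OmInstance> leader = findLeader(oms);
        if (leader.isPresent()) {
            System.out.println(snapshotUrl(leader.get(), "/dbCheckpoint"));
        } else {
            // No leader elected yet; a real caller would retry later.
            System.out.println("No OM leader found");
        }
    }
}
```

Returning an Optional forces the caller to handle the case where no instance
reports the leader role, for example during an election.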
was: Since Recon already uses the OzoneManagerClientProtocol to query the
OzoneManager, on an OM HA cluster, the RPC client automatically routes the
request to the leader. Recon only needs the updates from the OM RocksDB store,
and does not need the in-flight updates in the OM DoubleBuffer. Because Ratis
guarantees that the leader’s RocksDB is always up to date, Recon does not need
to worry about going back in time when the current OM leader goes down. In
this version of Recon, the plan is to test out OM HA scenarios and add
unit/integration tests to verify correct behavior.
> Recon should work with OM HA
> ----------------------------
>
> Key: HDDS-2848
> URL: https://issues.apache.org/jira/browse/HDDS-2848
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Components: Ozone Recon
> Reporter: Aravindan Vijayan
> Priority: Major
>
> Recon talks to OM in two ways: through HTTP to get a DB snapshot, and through
> RPC to get delta updates.
> Since Recon already uses the OzoneManagerClientProtocol to query the
> OzoneManager over RPC, the RPC client automatically routes the request to the
> leader on an OM HA cluster. Recon only needs the updates from the OM RocksDB
> store, and does not need the in-flight updates in the OM DoubleBuffer. Because
> Ratis guarantees that the leader’s RocksDB is always up to date, Recon does
> not need to worry about going back in time when the current OM leader goes
> down. We have to pass the OM service ID to the Ozone Manager client in Recon,
> and failover then works internally. Currently we pass in 'null'.
> To make the HTTP call work against OM HA, Recon has to find out the current
> OM leader and download the snapshot from that OM instance. We can follow the
> approach implemented in
> org.apache.hadoop.ozone.admin.om.GetServiceRolesSubcommand: get the roles of
> the OM instances and then determine the leader from them.