[
https://issues.apache.org/jira/browse/HDDS-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sumit Agrawal resolved HDDS-11116.
----------------------------------
Resolution: Won't Do
Read-only mode will likely still be considered an outage. When I’ve dealt with
customers, they talk in terms of “workloads/jobs”, not “operations”. If every
one of their jobs is 99% reads and 1% writes, an engineer would say 99% of ops
still work. The customer would still say 100% of the jobs are failing. This
brings me to the next point.
> OM state machine move to readonly mode on failure
> -------------------------------------------------
>
> Key: HDDS-11116
> URL: https://issues.apache.org/jira/browse/HDDS-11116
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Sumit Agrawal
> Assignee: Sumit Agrawal
> Priority: Major
> Labels: pull-request-available
>
> When the OM state machine receives any unknown failure (anything other than
> an IOException), the OM terminates.
> It is observed that the same failure then hits the other OM nodes during
> applyTransaction, because the raft log is already replicated to the minimum
> number of nodes in the quorum before applyTransaction runs.
>
> Until a recovery method is applied, the OM remains down. To keep providing
> read-only service in the meantime, the OM can move to read-only mode, where
> read operations are allowed on the leader.
>
> A few points to consider:
> # Need to block applyTransaction of the raft log in read-only mode, to avoid
> running transactions on top of the previously failed operation
> # Need to support read operations in read-only mode (leader election will
> happen, but the leader will not be ready till the latest transaction is updated)
>
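
For readers skimming the thread, the behaviour proposed in the description above (and declined in this resolution) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class ReadOnlyStateMachineSketch, its Transaction interface, and the in-memory map are hypothetical and are not the Ozone OM or Ratis API. It shows an unknown (non-IOException) failure switching the state machine to read-only mode instead of terminating, later applyTransaction calls being blocked, and reads still being served from the last good state.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; not the Ozone OM or Ratis state machine API. */
class ReadOnlyStateMachineSketch {

  /** A write operation; it may fail with IOException or an unchecked exception. */
  interface Transaction {
    void applyTo(Map<String, String> store) throws IOException;
  }

  private final Map<String, String> store = new HashMap<>();
  private long lastAppliedIndex = -1;
  private boolean readOnly = false;

  /**
   * Applies one log entry. Returns true if the entry was processed,
   * false if it was blocked or if it triggered read-only mode.
   */
  synchronized boolean applyTransaction(long index, Transaction tx) {
    if (readOnly) {
      // Point 1 of the description: do not apply further entries
      // on top of a previously failed operation.
      return false;
    }
    try {
      tx.applyTo(store);
    } catch (IOException e) {
      // Assumption: IOException is an expected failure that is reported
      // to the client, and the state machine keeps running.
    } catch (RuntimeException e) {
      // Unknown failure: switch to read-only mode instead of terminating.
      readOnly = true;
      return false;
    }
    lastAppliedIndex = index;
    return true;
  }

  /** Point 2 of the description: reads are still served in read-only mode. */
  synchronized String read(String key) {
    return store.get(key);
  }

  synchronized boolean isReadOnly() {
    return readOnly;
  }

  synchronized long getLastAppliedIndex() {
    return lastAppliedIndex;
  }

  public static void main(String[] args) {
    ReadOnlyStateMachineSketch sm = new ReadOnlyStateMachineSketch();
    sm.applyTransaction(0, s -> s.put("vol1", "created"));
    sm.applyTransaction(1, s -> { throw new IllegalStateException("unexpected"); });
    sm.applyTransaction(2, s -> s.put("vol2", "created")); // blocked
    System.out.println("readOnly=" + sm.isReadOnly()
        + " vol1=" + sm.read("vol1") + " vol2=" + sm.read("vol2"));
  }
}
```

Note that the sketch does not address the objection raised in the resolution comment: a job that mixes reads and writes would still fail as a whole once writes are blocked.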
--
This message was sent by Atlassian Jira
(v8.20.10#820010)