[ 
https://issues.apache.org/jira/browse/HDDS-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumit Agrawal resolved HDDS-11116.
----------------------------------
    Resolution: Won't Do

Read-only mode will likely still be considered an outage. When I’ve dealt with 
customers they talk in terms of “workloads/jobs”, not “operations”. If every 
one of their jobs is 99% reads and 1% writes, an engineer would say 99% of 
operations still work. The customer would still say 100% of the jobs are 
failing, because every job contains a write that cannot complete. This 
brings me to the next point.

> OM state machine move to readonly mode on failure
> -------------------------------------------------
>
>                 Key: HDDS-11116
>                 URL: https://issues.apache.org/jira/browse/HDDS-11116
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sumit Agrawal
>            Assignee: Sumit Agrawal
>            Priority: Major
>              Labels: pull-request-available
>
> When the OM state machine hits any unknown failure (anything other than an 
> IOException), the OM terminates.
> It has been observed that the same failure then occurs on the other OM nodes 
> during applyTransaction, because the raft log is already replicated to the 
> minimum number of nodes in the quorum before applyTransaction runs.
>  
> Until a recovery method is applied, the OM remains down. To keep providing 
> read-only service, the OM can move to read-only mode, where read operations 
> are allowed on the leader.
>  
> A few points to consider:
>  # applyTransaction of the raft log needs to be blocked in read-only mode, to 
> avoid running transactions on top of the previously failed operation
>  # Read operations need to be supported in read-only mode (leader election will 
> still happen, but the leader will not be ready until its state is updated to 
> the latest transaction)
>  
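
For illustration only, the two points in the description can be sketched as 
below. This is a minimal, hypothetical model and not Ozone or Ratis code: the 
class ReadOnlyStateMachine, its methods, and the string log entries are all 
assumptions made for the example. An unknown (unchecked) failure during apply 
flips the state machine to read-only instead of terminating the process, later 
applies are skipped, and reads keep being served from the last good state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposal; names are illustrative, not Ozone APIs. */
class ReadOnlyStateMachine {
    private boolean readOnly = false;
    private final List<String> applied = new ArrayList<>();

    /** Applies one committed log entry. Returns false if the entry was not applied. */
    synchronized boolean applyTransaction(String entry) {
        if (readOnly) {
            // Point 1: block further applies so nothing runs on top of the failed entry.
            return false;
        }
        try {
            execute(entry);
            return true;
        } catch (RuntimeException e) {
            // Unknown failure: switch to read-only mode instead of terminating.
            readOnly = true;
            return false;
        }
    }

    /** Stand-in for real request handling; entries starting with "bad" fail. */
    private void execute(String entry) {
        if (entry.startsWith("bad")) {
            throw new IllegalStateException("unknown failure applying: " + entry);
        }
        applied.add(entry);
    }

    /** Point 2: reads are still served, from the state before the failure. */
    synchronized List<String> read() {
        return new ArrayList<>(applied);
    }

    synchronized boolean isReadOnly() {
        return readOnly;
    }
}
```

In this sketch every write submitted after the failure is rejected, which is 
the behaviour the resolution comment above argues customers would still 
experience as an outage.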


