[ 
https://issues.apache.org/jira/browse/HDDS-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-14516:
-------------------------------
    Summary: Investigate high latency OM follower linearizable read request  
(was: Investigate high latency on first OM linearizable read request)

> Investigate high latency OM follower linearizable read request
> --------------------------------------------------------------
>
>                 Key: HDDS-14516
>                 URL: https://issues.apache.org/jira/browse/HDDS-14516
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> From TestOzoneShellHAWithFollowerRead, we observed that when linearizable 
> read is enabled on the OM, the first OM read request from each new client 
> (e.g. getServiceInfo() during RpcClient initialization) has much higher 
> latency (around 500ms) than subsequent OM requests from the same client 
> (which take <10ms). When another client sends a request, the first request 
> of that client is slow again.
> {code:java}
> 2026-01-27 13:41:29,696 [IPC Server handler 14 on default port 15041] INFO  protocolPB.OzoneManagerProtocolServerSideTranslatorPB (OzoneManagerProtocolServerSideTranslatorPB.java:submitReadRequestToOM(302)) - Linearizable read submit request ServiceList on omNode-2 elapsed 492ms
> 2026-01-27 13:41:29,700 [IPC Server handler 12 on default port 15041] INFO  protocolPB.OzoneManagerProtocolServerSideTranslatorPB (OzoneManagerProtocolServerSideTranslatorPB.java:submitReadRequestToOM(302)) - Linearizable read submit request InfoVolume on omNode-2 elapsed 2ms
> 2026-01-27 13:41:29,703 [IPC Server handler 10 on default port 15041] INFO  protocolPB.OzoneManagerProtocolServerSideTranslatorPB (OzoneManagerProtocolServerSideTranslatorPB.java:submitReadRequestToOM(302)) - Linearizable read submit request InfoBucket on omNode-2 elapsed 1ms
> {code}
> It does not seem to be related to getServiceInfo() itself: when I removed 
> the initial getServiceInfo() call, InfoVolume became the slow request instead. 
> It also does not seem to be caused by ReadIndex network slowness, since 
> the high latency happens only in a test.
> We need to investigate the cause.
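
To compare first-request and follow-up latencies across a longer test log, the elapsed times can be pulled out of the messages programmatically. The following is only a minimal sketch: it assumes the message format shown in the log excerpt quoted above, and the class LinearizableReadLogParser, its Entry type, and the threshold parameter are hypothetical names introduced here, not part of Ozone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for log analysis; not part of the Ozone code base.
final class LinearizableReadLogParser {

  // Assumed message format, copied from the quoted log excerpt:
  // "Linearizable read submit request <type> on <omNodeId> elapsed <n>ms"
  private static final Pattern ELAPSED = Pattern.compile(
      "Linearizable read submit request (\\S+) on (\\S+) elapsed (\\d+)ms");

  // One parsed log message: request type, OM node ID, elapsed milliseconds.
  static final class Entry {
    final String request;
    final String node;
    final long elapsedMs;

    Entry(String request, String node, long elapsedMs) {
      this.request = request;
      this.node = node;
      this.elapsedMs = elapsedMs;
    }
  }

  // Returns every matching message in the order it appears in the log text.
  static List<Entry> parse(String log) {
    List<Entry> entries = new ArrayList<>();
    Matcher m = ELAPSED.matcher(log);
    while (m.find()) {
      entries.add(new Entry(m.group(1), m.group(2), Long.parseLong(m.group(3))));
    }
    return entries;
  }

  // Keeps only the entries whose elapsed time exceeds thresholdMs.
  static List<Entry> slowerThan(List<Entry> entries, long thresholdMs) {
    List<Entry> slow = new ArrayList<>();
    for (Entry e : entries) {
      if (e.elapsedMs > thresholdMs) {
        slow.add(e);
      }
    }
    return slow;
  }
}
```

Applied to the three messages quoted above with a threshold of 100ms, slowerThan(parse(log), 100) keeps only the ServiceList request (492ms).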



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
