[jira] [Commented] (HDDS-9709) NO_REPLICA_FOUND should trigger a OM pipeline cache refresh

Duong (Jira) Fri, 17 Nov 2023 17:22:08 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787437#comment-17787437
 ]


Duong commented on HDDS-9709:
-----------------------------

Both approaches would leave the OM cache in the same state: the next request 
will re-fetch the container pipeline from SCM.

The fundamental difference lies in the client-side retry policy. Should the 
Ozone client (BlockInputStream) retry on NO_REPLICA_FOUND with a forced cache 
refresh, or should it fail fast?
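
To make the two client-side options concrete, here is a minimal sketch. It is 
not Ozone code: the class, the maps standing in for SCM and the OM cache, and 
the methods lookupPipeline, readFailFast and readWithRefresh are all 
hypothetical names chosen for illustration. It only contrasts what the client 
does when it receives an empty pipeline; it does not model how the OM cache 
would be invalidated in the fail-fast case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are real Ozone APIs. */
public class RefreshOnNoReplicaSketch {

    /** Stand-in for SCM, the source of truth for container locations. */
    static final Map<Long, List<String>> scm = new HashMap<>();

    /** Stand-in for the OM pipeline cache, which may hold a stale or empty entry. */
    static final Map<Long, List<String>> omCache = new HashMap<>();

    /** Returns the cached pipeline, re-fetching from "SCM" on a miss or a forced refresh. */
    static List<String> lookupPipeline(long containerId, boolean forceRefresh) {
        if (forceRefresh || !omCache.containsKey(containerId)) {
            List<String> fresh = scm.getOrDefault(containerId, List.of());
            omCache.put(containerId, new ArrayList<>(fresh));
        }
        return omCache.get(containerId);
    }

    /** Mirrors the failure in the reported stack trace: an empty pipeline is rejected. */
    static String acquireClient(List<String> pipeline) {
        if (pipeline.isEmpty()) {
            throw new IllegalArgumentException("NO_REPLICA_FOUND");
        }
        return pipeline.get(0);
    }

    /** Option 1: fail fast. The exception propagates to the caller. */
    static String readFailFast(long containerId) {
        return acquireClient(lookupPipeline(containerId, false));
    }

    /** Option 2: on NO_REPLICA_FOUND, retry once with a forced cache refresh. */
    static String readWithRefresh(long containerId) {
        try {
            return acquireClient(lookupPipeline(containerId, false));
        } catch (IllegalArgumentException e) {
            if (!"NO_REPLICA_FOUND".equals(e.getMessage())) {
                throw e;
            }
            // A second failure here propagates: the refresh is attempted only once.
            return acquireClient(lookupPipeline(containerId, true));
        }
    }

    public static void main(String[] args) {
        scm.put(1L, List.of("dn4", "dn5", "dn6")); // replicas now live on new datanodes
        omCache.put(1L, new ArrayList<>());        // but an empty pipeline was cached
        System.out.println(readWithRefresh(1L));
    }
}
```

In this sketch the retry is bounded to a single forced refresh, so a container 
that really has no replicas still fails after one extra round trip.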

> NO_REPLICA_FOUND should trigger a OM pipeline cache refresh
> -----------------------------------------------------------
>
>                 Key: HDDS-9709
>                 URL: https://issues.apache.org/jira/browse/HDDS-9709
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Duong
>            Priority: Major
>
> Today, container pipelines are cached in OM, and the consistency of the 
> cached data is eventually ensured by client behavior. That is, if a 
> container is replicated to another set of datanodes, the client detects the 
> change when it uses the outdated cached pipeline to read data from the 
> datanodes, and then asks OM to refresh the pipeline cache from SCM.
> When the datanodes belonging to a container go offline, an empty pipeline 
> may be cached in OM. However, when the client gets an empty pipeline, it 
> fails to ask OM to refresh the pipeline. 
> {code:java}
> Caused by: java.lang.IllegalArgumentException: NO_REPLICA_FOUND
>         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
>         at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:164)
>         at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:157)
>         at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:285)
>         at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:238)
>         at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:146)
>         at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:308)
> {code}


