[ 
https://issues.apache.org/jira/browse/HDDS-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-15305:
----------------------------------
    Labels: pull-request-available  (was: )

> ozone admin container list --all returns duplicate containers due to 
> sort/pagination key mismatch in SCM
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-15305
>                 URL: https://issues.apache.org/jira/browse/HDDS-15305
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sreeja
>            Assignee: Sreeja
>            Priority: Major
>              Labels: pull-request-available
>
> The *ozone admin container list --all* command produces duplicate container 
> entries in its output. The duplicates are non-deterministic across runs: 
> which containers are duplicated depends on the order in which the SCM returns 
> them, which can vary.
> Root Cause: 
> The bug is a mismatch between the server-side sort key and the client-side 
> pagination key.
> Server side (SCMClientProtocolServer.listContainerInternal): The method 
> filters containers by containerID >= startContainerID, then calls .sorted(), 
> which invokes ContainerInfo.compareTo(). That comparator is defined as:
>  
> {code:java}
> private static final Comparator<ContainerInfo> COMPARATOR =
>     Comparator.comparingLong(info -> info.getLastUsed().toEpochMilli());
> {code}
> So the batch is returned sorted by lastUsed timestamp, not by containerID.
> Client side (ListSubcommand.listAllContainers): The pagination loop advances 
> the cursor to the last returned element's container ID plus one.
> This assumes the last element of the batch has the highest container ID, 
> which holds only if the batch is sorted by container ID. Because the batch is 
> actually sorted by lastUsed, the last element can have a lower container ID 
> than other elements in the same batch.
>  
> Example: With batch size 40, a batch may return containers with IDs [..., 41, 
> 42, 44, 40] (sorted by lastUsed). The cursor is set to 40 + 1 = 41. The next 
> batch fetches containers with containerID >= 41, returning [41, 42, 43, 44, 
> 45, ...], which re-fetches 41, 42, and 44.
>  
> Fix: 
> In SCMClientProtocolServer.listContainerInternal, replace the lastUsed-based 
> sort with a containerID-based sort, aligning the sort key with the pagination 
> key:
> {code:java}
> .sorted(Comparator.comparing(ContainerInfo::containerID))
> {code}
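
The mismatch described above can be reproduced with a small, self-contained sketch. Everything in it is hypothetical and stands in for the real classes: PaginationSketch, Container, listBatch, listAll and sample are illustration-only names, not Ozone APIs, and applying the batch limit after the sort is an assumption of this sketch rather than something the report states. It pages through six containers with a batch size of 4, once with each sort key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the listing path; none of these names are Ozone APIs.
class PaginationSketch {

  // Stand-in for ContainerInfo, reduced to the two fields that matter here.
  static final class Container {
    final long id;
    final long lastUsedMillis;

    Container(long id, long lastUsedMillis) {
      this.id = id;
      this.lastUsedMillis = lastUsedMillis;
    }
  }

  // Sort key described in the report: the lastUsed timestamp.
  static final Comparator<Container> BY_LAST_USED =
      Comparator.comparingLong(c -> c.lastUsedMillis);

  // Sort key proposed in the fix: the container ID, matching the cursor.
  static final Comparator<Container> BY_ID =
      Comparator.comparingLong(c -> c.id);

  // Server stand-in: keep id >= start, sort, return at most `count` entries.
  // Limiting after the sort is an assumption of this sketch.
  static List<Container> listBatch(List<Container> all, long start, int count,
      Comparator<Container> order) {
    return all.stream()
        .filter(c -> c.id >= start)
        .sorted(order)
        .limit(count)
        .collect(Collectors.toList());
  }

  // Client stand-in: the cursor moves to the last returned ID plus one.
  static List<Long> listAll(List<Container> all, int batchSize,
      Comparator<Container> order) {
    List<Long> seen = new ArrayList<>();
    long start = 0;
    while (true) {
      List<Container> batch = listBatch(all, start, batchSize, order);
      if (batch.isEmpty()) {
        break;
      }
      for (Container c : batch) {
        seen.add(c.id);
      }
      start = batch.get(batch.size() - 1).id + 1;
    }
    return seen;
  }

  // Container 1 was used more recently than containers 2, 3 and 4.
  static List<Container> sample() {
    return Arrays.asList(
        new Container(1, 40), new Container(2, 10), new Container(3, 20),
        new Container(4, 30), new Container(5, 50), new Container(6, 60));
  }

  public static void main(String[] args) {
    // lastUsed order: first batch is [2, 3, 4, 1], so the cursor becomes 2.
    System.out.println(listAll(sample(), 4, BY_LAST_USED)); // [2, 3, 4, 1, 2, 3, 4, 5, 6]
    // ID order: the last element of each batch has the highest ID.
    System.out.println(listAll(sample(), 4, BY_ID));        // [1, 2, 3, 4, 5, 6]
  }
}
```

With the lastUsed ordering, the first batch ends on container 1, so the cursor falls back to 2 and containers 2, 3 and 4 are listed twice. With the ID ordering, each container appears exactly once.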



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
