[ 
https://issues.apache.org/jira/browse/HDDS-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Yarovoy reassigned HDDS-15680:
-------------------------------------

    Assignee: Andrey Yarovoy

> ECFileChecksumHelper rebuilds the checksum pipeline on every block instead of 
> caching per placement group
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-15680
>                 URL: https://issues.apache.org/jira/browse/HDDS-15680
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Andrey Yarovoy
>            Assignee: Andrey Yarovoy
>            Priority: Major
>
> {{ECFileChecksumHelper.getChunkInfos}} is called once per block during EC 
> file checksum computation. Each call unconditionally rebuilds, from scratch, 
> the STANDALONE pipeline used to contact datanodes:
>  # Iterates all N EC nodes (9 for EC 6+3), calling 
> {{pipeline.getReplicaIndex(dn)}} for each to select the node holding replica 
> index 1 and the parity nodes
>  # Sorts the selected node UUIDs into a string key and calls 
> {{UUID.nameUUIDFromBytes}} (a hash computation) to derive the deterministic 
> pipeline ID
>  # Allocates a new {{Pipeline}} object via {{Pipeline.newBuilder()}} with 5 
> field assignments
>  # Calls {{pipeline.getReplicaIndexes()}} as an argument to 
> {{ContainerProtocolCalls.getBlock}}. This streams over the already-filtered 
> nodes, calls {{getReplicaIndex}} on each again, and allocates a new 
> {{Map}}, even though the identical map ({{selectedReplicaIndexes}}) 
> was just built in the same method
> For a file whose blocks all reside in one EC placement group (the common 
> case), steps 1–4 produce identical output for every block, so a file with 
> B blocks performs B full reconstructions instead of 1.
>  
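The caching idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Ozone code or the eventual patch: the class PlacementGroupCacheSketch, the ChecksumPipeline holder, and the choice of the node-to-replica-index map as the cache key are all hypothetical. The only behaviour taken from the issue text is the filter (replica index 1 plus parity nodes) and the use of UUID.nameUUIDFromBytes over the sorted node UUIDs.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of per-placement-group caching; not an Ozone class. */
final class PlacementGroupCacheSketch {

  /** Hypothetical stand-in for the rebuilt STANDALONE pipeline. */
  static final class ChecksumPipeline {
    final UUID id;
    final Map<UUID, Integer> selectedReplicaIndexes;

    ChecksumPipeline(UUID id, Map<UUID, Integer> selectedReplicaIndexes) {
      this.id = id;
      this.selectedReplicaIndexes = selectedReplicaIndexes;
    }
  }

  private final int dataCount; // e.g. 6 for EC 6+3
  // Assumption: a placement group is identified by its node -> replica index map.
  private final Map<Map<UUID, Integer>, ChecksumPipeline> cache = new HashMap<>();
  private int buildCount;

  PlacementGroupCacheSketch(int dataCount) {
    this.dataCount = dataCount;
  }

  /** Returns the cached pipeline for this placement group, building it once. */
  ChecksumPipeline pipelineFor(Map<UUID, Integer> ecReplicaIndexes) {
    ChecksumPipeline cached = cache.get(ecReplicaIndexes);
    if (cached == null) {
      cached = build(ecReplicaIndexes);
      // Copy the key so later mutation by the caller cannot corrupt the cache.
      cache.put(new HashMap<>(ecReplicaIndexes), cached);
    }
    return cached;
  }

  int getBuildCount() {
    return buildCount;
  }

  /** Steps 1-3 from the issue, performed only on a cache miss. */
  private ChecksumPipeline build(Map<UUID, Integer> ecReplicaIndexes) {
    buildCount++;
    Map<UUID, Integer> selected = new HashMap<>();
    List<String> ids = new ArrayList<>();
    for (Map.Entry<UUID, Integer> e : ecReplicaIndexes.entrySet()) {
      int index = e.getValue();
      // Keep replica index 1 and the parity nodes (index > dataCount).
      if (index == 1 || index > dataCount) {
        selected.put(e.getKey(), index);
        ids.add(e.getKey().toString());
      }
    }
    Collections.sort(ids);
    // Deterministic ID derived from the sorted node UUIDs.
    UUID id = UUID.nameUUIDFromBytes(
        String.join(",", ids).getBytes(StandardCharsets.UTF_8));
    return new ChecksumPipeline(id, selected);
  }

  public static void main(String[] args) {
    PlacementGroupCacheSketch helper = new PlacementGroupCacheSketch(6);
    Map<UUID, Integer> group = new HashMap<>();
    for (int i = 1; i <= 9; i++) {
      group.put(UUID.randomUUID(), i);
    }
    for (int block = 0; block < 100; block++) {
      helper.pipelineFor(group);
    }
    System.out.println("builds=" + helper.getBuildCount());
  }
}
```

A cache hit still hashes the incoming map, so the lookup is not free, but it avoids the sort, the name-based UUID hash, and the per-block allocations. Reusing the stored selectedReplicaIndexes map for the getBlock call would likewise avoid the second traversal described in step 4. If the source EC pipeline exposes a stable identifier per placement group, keying on that would be cheaper still; whether it does is not something the issue text confirms.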



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
