Re: [PR] HDDS-10963. Implement a headBlocks API on the Datanode [ozone]

via GitHub Thu, 06 Jun 2024 22:20:09 -0700


xichen01 commented on PR #6774:
URL: https://github.com/apache/ozone/pull/6774#issuecomment-2154052199


   @errose28 Thanks for your detailed response.
   
   ### What is the overall goal?
   
   The goal is to eventually implement a command-line tool that can quickly 
find missing keys.
   Here, a "missing key" is a key that exists in OM but whose Block is missing.
   
   ### What is the context
   
   There are two scenarios that require this tool:
   - We are performing an Orphan Block cleanup with an external tool, and 
“find missing keys” can be used to make sure that no keys in the cluster are 
lost as a result of the cleanup.
   - We have found missing keys in the cluster: reading them returns the error 
“NO_REPLICA_FOUND” or “Unable to find the block”. Our cluster has been running 
for several years and has gone through many releases; due to some historical 
bugs, some keys are missing and we need to be able to find them.
   
   We can't guarantee that there won't be other causes of data loss in the 
future, so we need a tool that can quickly scan the entire cluster and make 
sure that no keys are missing.
   
   ### There looks to be overlap with other features
   
   “Find missing keys”, "container reconciliation", and the "Datanode scanner" 
have different purposes.
   - Find missing keys is for finding keys that may be missing.
       - This part may overlap with HDDS-9346, but its development progress is 
uncertain.
   - Container reconciliation is more about solving the Container data 
consistency problem.
   - The Datanode scanner ensures the reliability of Datanode data; it can 
find checksum errors or data lost from disk. But it can't find blocks that are 
supposed to be in the Container but are not actually there.
   
   ### The scanner also accounts for nuances like container state, block 
deletion that modifies disk and DB in multiple steps
   
   I think “find missing keys” doesn't need to take into account things such as 
Container state, because its purpose is to find keys whose Block replicas are 
completely missing. As long as any replica of a Block “exists”, the key is not 
missing from the cluster and can be recovered in some way.
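
A minimal sketch of this rule, not the actual tool: it assumes we already have a 
mapping from each OM key to its block IDs and the set of block IDs each Datanode 
reports. The function name, the data shapes, and the choice to treat a key as 
missing when any one of its blocks has no replica are all assumptions made for 
illustration.

```python
from typing import Dict, List, Set


def find_missing_keys(
    key_to_blocks: Dict[str, List[str]],
    datanode_blocks: Dict[str, Set[str]],
) -> List[str]:
    """Return keys that have at least one block with no replica anywhere."""
    # Union of every block ID reported by any Datanode. Container state is
    # deliberately ignored: one surviving replica is enough.
    existing: Set[str] = set()
    for blocks in datanode_blocks.values():
        existing |= blocks

    missing = []
    for key, blocks in key_to_blocks.items():
        if any(block not in existing for block in blocks):
            missing.append(key)
    return missing


# Hypothetical sample data: block "b3" has no replica on any Datanode.
om_keys = {
    "vol/bucket/a": ["b1", "b2"],
    "vol/bucket/b": ["b3"],
}
datanodes = {
    "dn1": {"b1"},
    "dn2": {"b1", "b2"},
    "dn3": set(),
}
missing_keys = find_missing_keys(om_keys, datanodes)
```

Here only `vol/bucket/b` is reported, since `b1` and `b2` each survive on at 
least one Datanode.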
   
   ### Does “find missing key” require reading and checking of data?
   “find missing keys” is only used to verify existence, not correctness; 
correctness is covered by the Datanode scanner and checksums.
   
   
   ### The "head request" terminology is kind of confusing in this context
   Makes sense. I think we can change the name of the API; maybe we can rename 
it to `verifyBlocksExistence`.
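
To illustrate what an existence-only check could look like, here is a toy 
sketch. `FakeDatanode` and its in-memory block map are hypothetical stand-ins, 
and the request/response shape is an assumption, not the API in this PR.

```python
from typing import Dict, Iterable


class FakeDatanode:
    """Toy stand-in for a Datanode; not the real Ozone implementation."""

    def __init__(self, blocks: Dict[str, bytes]) -> None:
        self._blocks = blocks  # block ID -> block data

    def verify_blocks_existence(self, block_ids: Iterable[str]) -> Dict[str, bool]:
        # Existence only: block data is never read or checksummed here.
        return {block_id: block_id in self._blocks for block_id in block_ids}


dn = FakeDatanode({"b1": b"data", "b2": b""})
result = dn.verify_blocks_existence(["b1", "b2", "b9"])
```

An empty block such as `b2` still counts as existing, which matches the 
"existence, not correctness" scope described above.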
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
