errose28 opened a new pull request, #7127:
URL: https://github.com/apache/ozone/pull/7127

   ## What changes were proposed in this pull request?
   
   ### Motivation
   
   In order to build the merkle tree of all data in the container, the scanner 
should not exit after the first issue it encounters like it does currently. The 
scanner should track and return all errors that it sees, and only stop the scan 
on fatal errors that prevent further scanning of the container, like DB access 
errors.
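   
   As a rough illustration of the intended behavior (not the code in this PR), 
the sketch below accumulates every problem it sees and stops early only on a 
fatal failure. `ScanStep`, `FatalScanException`, and `AccumulatingScanner` are 
hypothetical names made up for this example, and errors are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception for failures that prevent further scanning,
// such as a DB access error.
class FatalScanException extends Exception {
  FatalScanException(String message) {
    super(message);
  }
}

// Hypothetical unit of work that checks one part of a container.
interface ScanStep {
  // Returns a description of the problem found, or null if there is none.
  String run() throws FatalScanException;
}

final class AccumulatingScanner {
  private AccumulatingScanner() { }

  // Runs the steps in order and returns every error seen.
  static List<String> scan(List<ScanStep> steps) {
    List<String> errors = new ArrayList<>();
    for (ScanStep step : steps) {
      try {
        String problem = step.run();
        if (problem != null) {
          // Non-fatal: record the error and keep scanning.
          errors.add(problem);
        }
      } catch (FatalScanException e) {
        // Fatal: record the error and stop, since later steps cannot run.
        errors.add(e.getMessage());
        break;
      }
    }
    return errors;
  }
}
```

   The caller always receives the full list of errors collected up to the point 
where the scan ended, whether it ran to completion or stopped on a fatal error.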
   
   This PR is a prerequisite to HDDS-10374. It does not generate a 
merkle tree during the scan and does not test that functionality. It sets 
up HDDS-10374 to be an easy drop-in to the scanner, so that change can focus 
on testing merkle tree generation.
   
   ### Primary Changes
   
   Previously `ScanResult` was an object that encapsulated a single error. This 
was the first error the scanner saw, which would abort the scan. This change 
decouples the `ScanResult` from the errors, which are now represented by a list 
of `ContainerScanError`s in the `ScanResult`.
   - `ScanResult` is a general interface that can represent a data or metadata 
scan. It can be logged by entities like the `ContainerLogger` which do not care 
where the unhealthy result came from.
   - `MetadataScanResult` is a `ScanResult` implementation produced from a 
container metadata scan.
     - This scan will not produce a merkle tree since it does not check data.
   - `DataScanResult` extends `MetadataScanResult` by adding a merkle tree 
representing the data that was scanned.
     - All data scans begin with a metadata scan, and then proceed to scan the 
data only if the metadata scan succeeds.
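   
   The relationship between these types can be sketched roughly as follows. 
This is a simplified illustration, not the classes from this PR: the method 
names, constructors, the `String` error reason, and the `String` placeholder 
standing in for the merkle tree are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for a single error found during a scan.
final class ContainerScanError {
  private final String reason;

  ContainerScanError(String reason) {
    this.reason = reason;
  }

  String getReason() {
    return reason;
  }
}

// General view of a scan outcome, usable by callers that do not care
// whether a data or metadata scan produced it.
interface ScanResult {
  List<ContainerScanError> getErrors();

  // Assumption for this sketch: a result is healthy when it has no errors.
  default boolean isHealthy() {
    return getErrors().isEmpty();
  }
}

// Result of a metadata-only scan: a list of errors, no merkle tree.
class MetadataScanResult implements ScanResult {
  private final List<ContainerScanError> errors;

  MetadataScanResult(List<ContainerScanError> errors) {
    this.errors = Collections.unmodifiableList(new ArrayList<>(errors));
  }

  @Override
  public List<ContainerScanError> getErrors() {
    return errors;
  }
}

// Result of a data scan: everything a metadata scan reports, plus a
// tree describing the scanned data (a String placeholder here).
class DataScanResult extends MetadataScanResult {
  private final String dataTree;

  DataScanResult(List<ContainerScanError> errors, String dataTree) {
    super(errors);
    this.dataTree = dataTree;
  }

  String getDataTree() {
    return dataTree;
  }
}
```

   Code that only logs unhealthy results can accept a `ScanResult`, while code 
that needs the tree works with `DataScanResult` directly.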
   
   ### Secondary Changes
   
   - General cleanup of `KeyValueContainerCheck` internals was done since this 
PR already required invasive changes in this area.
   - Fixed a bug from #5485/HDDS-9005 where the DB was being used to check for 
container deletion during a scan instead of the container state in memory.
     - This would fail to detect a schema v2 container which is completely lost 
but not actually deleted. The container would remain in the datanode's memory 
as healthy.
     - Since the [memory state is updated 
first](https://github.com/apache/ozone/blob/769a9aad1fe8c936c5dc5d5019ca4ed61644c042/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java#L1512)
 and our container object will still be valid even after being removed from the 
[ContainerSet](https://github.com/apache/ozone/blob/769a9aad1fe8c936c5dc5d5019ca4ed61644c042/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java#L114),
 we can use the `DELETED` state as the deletion check which is both safer and 
simpler.
   - When a container is deleted during a scan, this is indicated with a 
different field in the `ScanResult`. It is no longer considered an error and 
will not cause the `ScanResult` to be unhealthy.
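   
   A minimal sketch of the deletion handling described above, under stated 
assumptions: `SketchState`, `SketchScanResult`, and `DeletionCheck` are 
hypothetical names, the enum lists only a few illustrative states, and dropping 
the errors collected before the deletion was noticed is a choice made for this 
example rather than something confirmed by the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical subset of in-memory container states.
enum SketchState { OPEN, CLOSED, DELETED }

// Hypothetical result type that tracks deletion separately from errors.
final class SketchScanResult {
  private final boolean deleted;
  private final List<String> errors;

  private SketchScanResult(boolean deleted, List<String> errors) {
    this.deleted = deleted;
    this.errors = Collections.unmodifiableList(new ArrayList<>(errors));
  }

  static SketchScanResult deleted() {
    return new SketchScanResult(true, new ArrayList<>());
  }

  static SketchScanResult fromErrors(List<String> errors) {
    return new SketchScanResult(false, errors);
  }

  boolean isDeleted() {
    return deleted;
  }

  // Deletion alone never makes the result unhealthy.
  boolean isHealthy() {
    return errors.isEmpty();
  }
}

final class DeletionCheck {
  private DeletionCheck() { }

  // Builds the final result from the in-memory state, not from the DB.
  static SketchScanResult finish(SketchState inMemoryState,
      List<String> errorsSeen) {
    if (inMemoryState == SketchState.DELETED) {
      // Assumption: errors seen while the container was being removed
      // are dropped, since deletion is not treated as an error.
      return SketchScanResult.deleted();
    }
    return SketchScanResult.fromErrors(errorsSeen);
  }
}
```

   Callers can then check `isDeleted()` first and skip any unhealthy-container 
handling for containers that were removed mid-scan.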
   
   ### Notes to Reviewers
   
   It is probably best to review the scan flow end-to-end in addition 
to just viewing the diff.
   
   ## What is the link to the Apache JIRA
   
   HDDS-11290
   
   ## How was this patch tested?
   
   The scanner has many existing tests that should all pass, ensuring no 
regressions (still WIP):
   - `TestKeyValueContainerCheck`
   - `TestKeyValueHandlerWithUnhealthyContainer`
   - `Test{Background,OnDemand}Container{Data,Metadata}Scanner`
   - `Test{Background,OnDemand}Container{Data,Metadata}ScannerIntegration`
   
   
   New tests added for detecting multiple errors:
   - WIP


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

