steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528894617
 
 
   OK, the latest review is good with etags; modtime is something we can worry 
about in a later iteration. I tested it on a store which is set up for 
authoritative listings and which is clearly inconsistent. It warns me of this, 
maybe in too much detail. 
   
   ```
   ...
   2019-09-06 15:51:25,549 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: 
s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir
   The content of an authoritative directory listing does not match the content 
of the S3 listing. S3: 
[[S3AFileStatus{path=s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir/subDir1;
 isDirectory=true; modification_time=0; access_time=0; owner=stevel; 
group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; 
isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null 
versionId=null, 
S3AFileStatus{path=s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir/subDir2;
 isDirectory=true; modification_time=0; access_time=0; owner=stevel; 
group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; 
isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null 
versionId=null]], MS: []
   
   2019-09-06 15:51:25,549 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 3s
   2019-09-06 15:51:25,549 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 51
   ~/P/R/fsck echo $status
   0
   ```
   
   It's noisy, but that could be tuned a bit by cutting back on the number of 
S3AFileStatus fields printed. What is clear is that it found real problems 
where the DDB was incomplete.
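
One way to cut the noise, sketched here as an assumption rather than as the actual `S3GuardFsckViolationHandler` code: print only the fields that help diagnose a mismatch (path, directory flag, eTag, versionId) and drop owner, group, permissions and the rest. The class and method names are hypothetical.

```java
/** Hypothetical sketch: a compact one-line summary of a file status entry. */
final class CompactStatusSketch {

  private CompactStatusSketch() {
  }

  /**
   * Build a short description from the few fields relevant to a
   * consistency check. The parameters stand in for values which would
   * be read from the real status object.
   */
  static String summarize(String path, boolean isDirectory,
      String eTag, String versionId) {
    return String.format("%s{%s; eTag=%s; versionId=%s}",
        isDirectory ? "dir" : "file", path, eTag, versionId);
  }
}
```

With this, each entry in the listing above would shrink to a single line, which keeps the path visible without the repeated owner and permission fields.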
   
    But: the exit code was still 0. I think we should return a non-zero exit 
code when there is a mismatch of this scale.
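
A minimal sketch of what that could look like, assuming the scan collects its violations into a list before the command returns. The class name, the constants and the choice of `1` as the failure code are all hypothetical, not taken from the patch.

```java
import java.util.List;

/** Hypothetical sketch: map the outcome of a scan to a process exit code. */
final class FsckExitCodeSketch {

  /** Exit code when no violations were found. */
  static final int EXIT_OK = 0;

  /** Assumed exit code for "violations found"; the real value is undecided. */
  static final int EXIT_VIOLATIONS = 1;

  private FsckExitCodeSketch() {
  }

  /**
   * Return a non-zero code if the scan reported any violation, so that
   * shell scripts and CI jobs can detect an inconsistent store.
   */
  static int exitCodeFor(List<String> violations) {
    return violations.isEmpty() ? EXIT_OK : EXIT_VIOLATIONS;
  }
}
```

The command's entry point would then return `exitCodeFor(violations)` instead of an unconditional 0, so `echo $status` after a run like the one above would print a non-zero value.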
   
