[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-12 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-530748126
 
 
   # +1
   
   As well as all the automated tests, did some manual command line operations.
   * empty args
   * command without -check
   * -check without path
   * against store marked as auth but with incomplete MS
   * after doing an import, same store
   * empty store
   * unguarded store
   
   All outcomes were as expected 
   
   I'm happy with this 
   
   ## Followup
   
   One of the changes with the HADOOP-16430 PR is that we now have an S3A FS 
method `boolean allowAuthoritative(final Path path)` that takes a path and 
returns true iff it is authoritative, either by the MS being auth *or* the given 
path being marked as one of the authoritative dirs. I think the validation of 
whether an authoritative directory is consistent between the metastore and S3 
should use this when it wants to highlight that an authoritative path is inconsistent. 
   
   This can be a follow-on patch, because as usual it will need more tests, in 
the code, and someone to try out the command line.
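
As a rough illustration of the rule described in this followup (not the actual S3A implementation), the check amounts to "the whole store is authoritative, or the path sits at or under one of the configured authoritative directories". Only the method name `allowAuthoritative` comes from the comment; the class `AuthoritativePathChecker`, its constructor, and the use of plain string paths instead of Hadoop `Path` objects are assumptions made for this sketch.

```java
import java.util.List;

/** Hypothetical stand-in for the S3A-side check; not the real S3AFileSystem code. */
class AuthoritativePathChecker {
  private final boolean metadataStoreAuthoritative;
  private final List<String> authoritativeDirs;

  AuthoritativePathChecker(boolean metadataStoreAuthoritative,
      List<String> authoritativeDirs) {
    this.metadataStoreAuthoritative = metadataStoreAuthoritative;
    this.authoritativeDirs = authoritativeDirs;
  }

  /** True iff the whole store is auth, or the path is at/under an authoritative dir. */
  boolean allowAuthoritative(String path) {
    if (metadataStoreAuthoritative) {
      return true;
    }
    String candidate = withTrailingSlash(path);
    for (String dir : authoritativeDirs) {
      // Compare with trailing slashes so "/auth" does not match "/authority".
      if (candidate.startsWith(withTrailingSlash(dir))) {
        return true;
      }
    }
    return false;
  }

  private static String withTrailingSlash(String s) {
    return s.endsWith("/") ? s : s + "/";
  }
}
```

The fsck validation could then ask this one question per directory instead of looking only at the store-wide auth flag.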


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-11 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-530564476
 
 
   ok, full test with ddb non-auth happy; repeating with auth
   
   Tomorrow I'll build the CLI and run the various manual operations which were 
failing





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-11 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-530380233
 
 
   my branch with changes to the assertion 
https://github.com/steveloughran/hadoop/tree/incoming/HADOOP-16423-fsck-log-changes





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-11 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-530357583
 
 
   last test run:
   ```
   [ERROR] Failures: 
   [ERROR]   
ITestS3GuardFsck.testIDetectParentTombstoned:194->assertComparePairsSize:452 
[Number of compare pairs] expected:<[1]> but was:<[2]>
   [ERROR] Errors: 
   [ERROR]   
ITestS3GuardFsck.testIAuthoritativeDirectoryContentMismatch:292->checkForViolationInPairs:474
 » NoSuchElement
   [INFO] 
   [INFO] Running org.apache.hadoop.fs.s3a.select.ITestS3SelectLandsat
   [ERROR] Tests run: 12, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 
30.746 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck
   [ERROR] 
testIDetectParentTombstoned(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck)  
Time elapsed: 8.026 s  <<< FAILURE!
   org.junit.ComparisonFailure: [Number of compare pairs] expected:<[1]> but 
was:<[2]>
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.assertComparePairsSize(ITestS3GuardFsck.java:452)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.testIDetectParentTombstoned(ITestS3GuardFsck.java:194)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
   
   [ERROR] 
testIAuthoritativeDirectoryContentMismatch(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck)
  Time elapsed: 4.626 s  <<< ERROR!
   java.util.NoSuchElementException: No value present
at java.util.Optional.get(Optional.java:135)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.checkForViolationInPairs(ITestS3GuardFsck.java:474)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.testIAuthoritativeDirectoryContentMismatch(ITestS3GuardFsck.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
   
   
   ```




[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-10 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-52599
 
 
   test runs before your last commit. First fine; second with -Dauth failed
   
   ```
   [ERROR] Tests run: 11, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
25.371 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck
   [ERROR] 
testIAuthoritativeDirectoryContentMismatch(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck)
  Time elapsed: 3.461 s  <<< ERROR!
   java.util.NoSuchElementException: No value present
at java.util.Optional.get(Optional.java:135)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.testIAuthoritativeDirectoryContentMismatch(ITestS3GuardFsck.java:403)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
   
   ```





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-10 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-529980416
 
 
   OK, done a quick scan. Changes all LGTM





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528905689
 
 
   Overall then, last iteration has a working CLI
   * One of the tests is brittle in parallel runs
   * fsck must return an error code on a failure for scripts and tests
   * modtime handling needs to be tuned (followup?)
   + no docs that I can see





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528894617
 
 
   OK, latest review is good with etags; modtime is something we can worry 
about as an extra iteration. Tested on a store which is set up for auth 
listings and is clearly considered inconsistent. It warns me of this, maybe in 
too much detail. 
   
   ```
   ...
   2019-09-06 15:51:25,549 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: 
s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir
   The content of an authoritative directory listing does not match the content 
of the S3 listing. S3: 
[[S3AFileStatus{path=s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir/subDir1;
 isDirectory=true; modification_time=0; access_time=0; owner=stevel; 
group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; 
isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null 
versionId=null, 
S3AFileStatus{path=s3a://hwdev-steve-london/fork-0006/test/ITestS3AContractDistCp/testTrackDeepDirectoryStructureToRemote/remote/DELAY_LISTING_ME/outputDir/inputDir/subDir2;
 isDirectory=true; modification_time=0; access_time=0; owner=stevel; 
group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; 
isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null 
versionId=null]], MS: []
   
   2019-09-06 15:51:25,549 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 3s
   2019-09-06 15:51:25,549 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 51
   ~/P/R/fsck echo $status
   0
   ```
   
   It's noisy, but that could be tuned by cutting back on the number of 
S3AFileStatus fields printed. What is clear is: it found real 
problems where the DDB was incomplete.
   
   But: exit code was still 0. I think we should return a non-zero exit code in 
the case where there is a mismatch of this scale
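
One possible way to surface this to scripts, sketched under assumptions: count the violations found during the scan and map any non-zero count to a non-zero exit code. The class name `FsckExitCode`, the constant values, and the idea of passing in a violation count are all hypothetical; the PR may well structure this differently.

```java
/** Hypothetical helper: maps the outcome of a scan to a process exit code. */
class FsckExitCode {
  static final int EXIT_SUCCESS = 0;
  // Assumed value for illustration; not taken from the PR or from S3GuardTool.
  static final int EXIT_VIOLATIONS_FOUND = 1;

  /** Returns 0 when the scan found nothing, a non-zero code otherwise. */
  static int fromViolationCount(int violations) {
    if (violations < 0) {
      throw new IllegalArgumentException("negative violation count: " + violations);
    }
    return violations == 0 ? EXIT_SUCCESS : EXIT_VIOLATIONS_FOUND;
  }
}
```

With something like this in place, the `echo $status` after the run would show a failure for the inconsistent store, and tests could assert on the returned code.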
   





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528883425
 
 
   The testCLIFsckWithParam test works standalone. Looks to me like a race 
condition: the fsck is being performed on an active bucket and files have been 
deleted between being listed and queued for scanning and the actual scan. 
   1. Test should only scan the test directory, or we handle FNFEs as something 
to ignore
   2. Could the fsck code itself change here? Because on a stable bucket the 
FNFE could be a sign of a mismatch from DDB to store; you don't want them 
ignored. 
   
   But: the failure could be delayed and reported as a missing file, rather 
than triggering a fast failure.
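
A minimal sketch of the "delayed" option: rather than letting a `FileNotFoundException` abort the scan, record the path as missing, keep going, and report everything at the end. `PathProbe`, `DeferredMissingScan` and `scan` are names invented for this sketch; the real fsck code works against the S3A filesystem and the metadata store rather than a list of strings.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical probe: throws FileNotFoundException if the path has gone away. */
interface PathProbe {
  void check(String path) throws IOException;
}

class DeferredMissingScan {
  /** Scans all queued paths; returns those that were missing at scan time. */
  static List<String> scan(List<String> queued, PathProbe probe) {
    List<String> missing = new ArrayList<>();
    for (String path : queued) {
      try {
        probe.check(path);
      } catch (FileNotFoundException e) {
        // Either deleted between listing and scanning, or a genuine mismatch
        // between DDB and the store: record it instead of failing fast.
        missing.add(path);
      } catch (IOException e) {
        // Any other I/O failure still aborts the scan.
        throw new UncheckedIOException(e);
      }
    }
    return missing;
  }
}
```

The caller can then decide whether the missing paths are an error (stable bucket) or noise (active test bucket).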





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528873116
 
 
   and
   ```
   [ERROR] Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
25.342 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck
   [ERROR] 
testIVersionIdMismatch(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck)  Time 
elapsed: 1.591 s  <<< FAILURE!
   java.lang.AssertionError: 
   [Violations in the childPair] 
   Expecting:
<[ETAG_MISMATCH, LENGTH_MISMATCH, MOD_TIME_MISMATCH]>
   to contain:
<[VERSIONID_MISMATCH]>
   but could not find:
<[VERSIONID_MISMATCH]>
   
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardFsck.testIVersionIdMismatch(ITestS3GuardFsck.java:589)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
   
   ```





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528871326
 
 
   got a test run failure in a new test
   ```
   
testCLIFsckWithParam(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB) 
 Time elapsed: 8.028 s  <<< ERROR!
   java.io.FileNotFoundException: No such file or directory: 
s3a://hwdev-steve-ireland-new/fork-0004/test
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2788)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2677)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2571)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:2360)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$10(S3AFileSystem.java:2339)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:2339)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardFsck.compareS3ToMs(S3GuardFsck.java:115)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Fsck.run(S3GuardTool.java:1560)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:402)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1763)
at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:137)
at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB.testCLIFsckWithParam(ITestS3GuardToolDynamoDB.java:301)
   ```





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528867021
 
 
   Did another test run on an unversioned bucket where the DDB table was 
built up with an ls -R, so filled up straight from S3. All checks happy (e.g. 
modtime) but still warning of null etags on all directories, including the root 
one.
   ```
   2019-09-06 14:59:23,893 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-london/
   No etag.
   
   ```
   





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528866219
 
 
   e.g
   ```
   bin/hadoop s3guard fsck -check 
s3a://hwdev-steve-ireland-new/etc/hadoop/shellprofile.d
   2019-09-06 14:51:46,377 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} 
is initialized.
   2019-09-06 14:51:46,705 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/shellprofile.d - Length S3: 0, MS: 0 - 
Etag S3: null, MS: null
   2019-09-06 14:51:46,764 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/shellprofile.d/example.sh - Length S3: 
3880, MS: 3880 - Etag S3: c7dbe1b877a287175df9dfc32c226765, MS: 
c7dbe1b877a287175df9dfc32c226765
   2019-09-06 14:51:46,792 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop/shellprofile.d
   No etag.
   
   2019-09-06 14:51:46,792 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop/shellprofile.d/example.sh
   getModificationTime mismatch - s3: 1567773091000, ms: 1567773090841
   
   2019-09-06 14:51:46,792 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 0s
   2019-09-06 14:51:46,792 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 2
   ~/P/R/fsck echo $status
   0
   ```
   
   The good news: the return code is 0; it passed the scan. So these are just 
info rather than warn





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528864203
 
 
   OK, latest patch is better. 
   * still warns of no etag on a dir;
   * when you pass a path to a file it is scanned twice
   * I think we need to be able to disable the modtime checks, because you tend 
to get mismatches whenever you create an entry after writing a file (system clock 
is used); they get updated on the first read. Or: we allow a range of accuracy?
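
The "range of accuracy" idea could be sketched as a tolerance-based comparison. The class and method names and the 1000 ms default are assumptions for illustration, not something taken from the PR.

```java
/** Hypothetical modtime comparison that tolerates small clock differences. */
class ModTimeCheck {
  // Assumed default; S3 timestamps in the fsck output appear rounded to seconds.
  static final long DEFAULT_TOLERANCE_MS = 1000L;

  /** True if the two timestamps (epoch millis) are within toleranceMs of each other. */
  static boolean matches(long s3ModTime, long msModTime, long toleranceMs) {
    return Math.abs(s3ModTime - msModTime) <= toleranceMs;
  }
}
```

With a 1000 ms tolerance, the pair `1567773091000` / `1567773090841` reported elsewhere in this thread (159 ms apart) would no longer be flagged; a tolerance of 0 keeps today's strict behaviour.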





[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-09-06 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-528858428
 
 
   working on the CLI, but overreporting errors, especially on versioning
   
   Warns of no etag on directories; there's no need to check here
   
   ```
   2019-09-06 13:46:33,571 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop
   No etag.
   
   ```
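
Since directories have no etag to compare, the check could simply be limited to files. A tiny sketch of that rule follows; `EtagCheck` and `missingEtagViolation` are hypothetical names, and the real code would take the S3A file status rather than a boolean and a string.

```java
/** Hypothetical rule: a missing etag only counts as a violation for files. */
class EtagCheck {
  static boolean missingEtagViolation(boolean isDirectory, String etag) {
    // Directories are skipped entirely; only files are expected to carry an etag.
    return !isDirectory && etag == null;
  }
}
```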
   Reports mismatch on a directory scan, where the listing doesn't include the 
versions. Maybe this is just the time mismatch triggering the reporting, in 
which case it is misleading
   ```
   
   2019-09-06 13:46:33,572 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
   getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204
   getVersionId mismatch - s3: null, ms: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
   ```
   The ddb table has the version ID, but I'm assuming that the scan doesn't 
get them from S3 because we'd need to use HEAD over LIST.
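
If the listing cannot supply version IDs, one option is to compare them only when both sides actually have a value, and to compare by value. This is a sketch of that idea, not the PR's code; `VersionIdCheck` and `mismatch` are hypothetical names, and treating a missing ID on either side as "nothing to compare" is an assumption that may not suit every case.

```java
/** Hypothetical version ID comparison for fsck-style checks. */
class VersionIdCheck {
  /** Reports a mismatch only when both sides have a version ID and the values differ. */
  static boolean mismatch(String s3VersionId, String msVersionId) {
    if (s3VersionId == null || msVersionId == null) {
      // Nothing to compare against, e.g. the S3 status came from a LIST.
      return false;
    }
    // Value comparison: identical strings are never reported as different.
    return !s3VersionId.equals(msVersionId);
  }
}
```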
   
   When I give the full path, it says there's a mismatch but now prints the 
same value on both sides. This is not a mismatch and should not appear.
   
   ```
   ~/P/R/fsck bin/hadoop s3guard fsck -check 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
   2019-09-06 13:59:52,857 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} 
is initialized.
   2019-09-06 13:59:53,057 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 
8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 
2887e7740b821abd405e6a5c70d2081e
   2019-09-06 13:59:53,115 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 
8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 
2887e7740b821abd405e6a5c70d2081e
   2019-09-06 13:59:53,142 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
   getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204
   getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: 
AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
   
   2019-09-06 13:59:53,142 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
   getModificationTime mismatch - s3: 1567773093000, ms: 1567773092204
   getVersionId mismatch - s3: AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t, ms: 
AfxJ3agigvhWyYhkCVXikPCpgx1C5z1t
   
   2019-09-06 13:59:53,142 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(144)) - Total scan time: 0s
   2019-09-06 13:59:53,142 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareS3ToMs(145)) - Scanned entries: 2
   ```
   
   Note also that the file gets scanned twice. This hints at the scanning 
playing up when the supplied path is a file, not a dir.
   
   Now I open the file with `hadoop fs -cat 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml`; there's a PUT 
to the DDB table as the modtime is updated; the next scan doesn't report 
modtime issues, but it does still mistakenly report the version IDs are 
different. 
   
   ```
   bin/hadoop s3guard fsck -check 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml
   2019-09-06 14:33:59,582 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} 
is initialized.
   2019-09-06 14:33:59,773 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 
8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 
2887e7740b821abd405e6a5c70d2081e
   2019-09-06 14:33:59,828 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(217)) - Path: 
s3a://hwdev-steve-ireland-new/etc/hadoop/capacity-scheduler.xml - Length S3: 
8260, MS: 8260 - Etag S3: 2887e7740b821abd405e6a5c70d2081e, MS: 
2887e7740b821abd405e6a5c70d2081e
   2019-09-06 14:33:59,856 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(79)) - 
   On path: 

[GitHub] [hadoop] steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)

2019-08-22 Thread GitBox
steveloughran commented on issue #1208: HADOOP-16423. S3Guard fsck: Check 
metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#issuecomment-523927611
 
 
   ## Overall
   
   There are checks, but as it doesn't recurse for me it's hard to validate them.
   
   The UX can be improved. I propose:
   * for successful entries, print their details as they are processed, such as 
length and etag.
   * on failure to initialize the FS, include the error in the output.
   * print the total duration of the check, number of entries scanned. 
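
   The three proposals above could be sketched roughly as follows. This is an 
illustration only: `ScanReport` and its methods are hypothetical names, and the 
output format is an assumption rather than what the tool prints.

```java
import java.util.Locale;

public class ScanReport {
  private int scanned;

  /** Record one successfully checked entry and return the line to print. */
  public String entryOk(String path, long length, String etag) {
    scanned++;
    return String.format(Locale.ROOT,
        "Path: %s - length: %d, etag: %s", path, length, etag);
  }

  /** Message for a filesystem initialization failure, including the cause. */
  public String initFailure(Exception e) {
    return "Failed to initialize filesystem: " + e;
  }

  /** Final summary: total duration in whole seconds and entries scanned. */
  public String summary(long elapsedMillis) {
    return String.format(Locale.ROOT,
        "Total scan time: %ds; scanned entries: %d",
        elapsedMillis / 1000, scanned);
  }
}
```

   Keeping the counters in one small object makes the summary easy to assert 
on in a test without parsing log output.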
   
   ## Operations
   
   Failed on the root entry:
   
   ```
   bin/hadoop s3guard fsck -check s3a://guarded-table/
   2019-08-22 14:17:37,973 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-2, tableName=guarded-table, 
tableArn=arn:aws:dynamodb:eu-west-2:980678866538:table/guarded-table} is 
initialized.
   == Path: s3a://guarded-table/
   2019-08-22 14:17:38,160 [main] INFO  s3guard.S3GuardFsck 
(S3GuardFsck.java:compareFileStatusToPathMetadata(220)) - Entry is in the root, 
so there's no parent
   == Path: s3a://guarded-table/example
   2019-08-22 14:17:38,189 [main] ERROR s3guard.S3GuardFsckViolationHandler 
(S3GuardFsckViolationHandler.java:handle(76)) - 
   On path: s3a://guarded-table/
   No etag.
   ```
   
   With a path, the same `== Path:` line is printed twice:
   
   ```
   bin/hadoop s3guard fsck -check s3a://guarded-table/example
   2019-08-22 14:19:26,674 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-2, tableName=guarded-table, 
tableArn=arn:aws:dynamodb:eu-west-2:980678866538:table/guarded-table} is 
initialized.
   == Path: s3a://guarded-table/example
   == Path: s3a://guarded-table/example
   
   ```
   
   Missing file:
   ```
   bin/hadoop s3guard fsck -check s3a://guarded-table/example/missing
   2019-08-22 14:21:23,252 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-2, tableName=guarded-table, 
tableArn=arn:aws:dynamodb:eu-west-2:980678866538:table/guarded-table} is 
initialized.
   java.io.FileNotFoundException: No such file or directory: 
s3a://guarded-table/example/missing
   2019-08-22 14:21:23,404 [main] INFO  util.ExitUtil 
(ExitUtil.java:terminate(210)) - Exiting with status 44: 
java.io.FileNotFoundException: No such file or directory: 
s3a://guarded-table/example/missing
   ```
   
   This is good. Add a test for it.
   
   `s3a://bucket/..`: this is bad. Add a test and then fix.
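
   One possible shape for the fix, sketched with the JDK only, is to reject 
relative segments before any request reaches S3. `PathCheck`, `hasDotSegment` 
and `validate` are hypothetical names, and the fix that lands in the PR may 
well differ:

```java
import java.net.URI;

public class PathCheck {

  /** Returns true if any path segment is "." or "..". */
  public static boolean hasDotSegment(URI uri) {
    String path = uri.getPath();
    if (path == null) {
      return false;
    }
    for (String segment : path.split("/")) {
      if (segment.equals("..") || segment.equals(".")) {
        return true;
      }
    }
    return false;
  }

  /** Parses the argument and fails fast on relative segments. */
  public static URI validate(String raw) {
    URI uri = URI.create(raw);
    if (hasDotSegment(uri)) {
      throw new IllegalArgumentException(
          "Path must not contain relative segments: " + raw);
    }
    return uri;
  }
}
```

   Failing fast here would turn the 400 response and stack trace into a short 
usage error.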
   
   ```
   bin/hadoop s3guard fsck -check s3a://guarded-table/..
   2019-08-22 14:23:14,640 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(322)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-2, tableName=guarded-table, 
tableArn=arn:aws:dynamodb:eu-west-2:980678866538:table/guarded-table} is 
initialized.
   org.apache.hadoop.fs.s3a.AWSBadRequestException: getFileStatus on 
s3a://guarded-table/..: com.amazonaws.services.s3.model.AmazonS3Exception: 
Invalid URI (Service: Amazon S3; Status Code: 400; Error Code: 400 Invalid URI; 
Request ID: null; S3 Extended Request ID: null), S3 Extended Request ID: 
null:400 Invalid URI: Invalid URI (Service: Amazon S3; Status Code: 400; Error 
Code: 400 Invalid URI; Request ID: null; S3 Extended Request ID: null)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:237)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:164)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2732)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2694)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2587)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardFsck.compareS3RootToMs(S3GuardFsck.java:94)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Fsck.run(S3GuardTool.java:1560)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:402)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1759)
at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.main(S3GuardTool.java:1768)
   Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Invalid URI 
(Service: Amazon S3; Status Code: 400; Error Code: 400 Invalid URI; Request ID: 
null; S3 Extended Request ID: null), S3 Extended Request ID: null
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
at