steveloughran commented on issue #1576: Hadoop 16520 dynamodb ms version race 
refactor. 
URL: https://github.com/apache/hadoop/pull/1576#issuecomment-537610443
 
 
   I tested out on the command line while doing things to the DDB tables
   
   " Version marker in the dynamo table was null"
   
   How about "No version marker found in the DynamoDB table " + table name?
    
   Issue: we always patch the table, even on fsck, destroy and bucket-info 
calls. Should we?  I'm happy with it as is, as trying to be clever complicates 
life.
    
   Issue: What do we think makes a good retry interval? I think it is too big, 
as a bucket info is now sleeping for 3 minutes before giving up.
   
   (that's on a table with no version Tag BTW) .
   
   
   ```
   > bin/hadoop s3guard prune -tombstone -seconds 0  
s3a://hwdev-steve-ireland-new
   
   ...
   2019-10-02 17:37:49,476 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 629 ms before next retry
   2019-10-02 17:37:50,127 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 1048 ms before next retry
   2019-10-02 17:37:51,200 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 1149 ms before next retry
   2019-10-02 17:37:52,375 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 3756 ms before next retry
   2019-10-02 17:37:56,155 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 8933 ms before next retry
   2019-10-02 17:38:05,115 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 10687 ms before next retry
   2019-10-02 17:38:15,829 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 17912 ms before next retry
   2019-10-02 17:38:33,766 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 55424 ms before next retry
   2019-10-02 17:39:29,212 [main] WARN  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:getVersionMarkerItem(564)) - Version 
marker in the dynamo table was null. Sleeping 175876 ms before next retry
   2019-10-02 17:42:25,409 [main] DEBUG 
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:verifyVersionCompatibility(368)) - 
versionMarkerItem: null;  versionMarkerFromTag: { Item: {parent=../VERSION, 
child=../VERSION, table_version=100, table_created=0} }
   2019-10-02 17:42:25,409 [main] INFO  
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:verifyVersionCompatibility(392)) - 
Table hwdev-steve-ireland-new contains no version marker ITEM but contains 
compatible version marker TAG. Restoring the version marker item from tag.
   2019-10-02 17:42:25,409 [main] DEBUG 
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:putItem(599)) - Putting item { Item: 
{parent=../VERSION, child=../VERSION, table_version=100, 
table_created=1570034545409} }
   2019-10-02 17:42:25,470 [main] DEBUG 
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:initTable(187)) - Using existing 
DynamoDB table hwdev-steve-ireland-new in region eu-west-1 created Wed Oct 02 
17:42:25 BST 2019
   ```
   
   I verified that incompatible item or tag values raised errors
   
   * tag version = 150
   * tag valid, item = 150
   * item lacks a numeric version field at all
   
   In those situations, I think the error could include the full details which 
are logged at debug. I can see that being useful.
   
   ```
   2019-10-02 18:08:57,286 [main] DEBUG 
s3guard.DynamoDBMetadataStoreTableHandler 
(DynamoDBMetadataStoreTableHandler.java:verifyVersionCompatibility(368)) 
    versionMarkerItem: { Item: {parent=../VERSION, table_created=1570034545409, 
table_version=120, child=../VERSION} };
      versionMarkerFromTag: { Item: {parent=../VERSION, child=../VERSION, 
table_version=100, table_created=0} }
   ```
   
   CLI run on an inconsistent version marker (even if there's a valid file 
entry) is noisy: it logs at error with stack and then throws. I'd prefer just 
the log without the stack here,
   
   ```
   2019-10-02 18:06:00,690 [main] ERROR s3guard.S3Guard 
(S3Guard.java:getMetadataStore(120)) - Failed to instantiate metadata store 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore defined in 
fs.s3a.metadatastore.impl: java.io.IOException: Database table is from an 
incompatible S3Guard version based on table TAG. Table hwdev-steve-ireland-new 
Expected version: 100 actual tag version: 150
   java.io.IOException: Database table is from an incompatible S3Guard version 
based on table TAG. Table hwdev-steve-ireland-new Expected version: 100 actual 
tag version: 150
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.throwExceptionOnVersionMismatch(DynamoDBMetadataStoreTableHandler.java:440)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.verifyVersionCompatibility(DynamoDBMetadataStoreTableHandler.java:418)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.initTable(DynamoDBMetadataStoreTableHandler.java:184)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:426)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3Guard.getMetadataStore(S3Guard.java:109)
        at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:413)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3391)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:137)
        at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3440)
        at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3414)
        at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:556)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:1208)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:427)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1797)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.main(S3GuardTool.java:1806)
   java.io.IOException: Database table is from an incompatible S3Guard version 
based on table TAG. Table hwdev-steve-ireland-new Expected version: 100 actual 
tag version: 150
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.throwExceptionOnVersionMismatch(DynamoDBMetadataStoreTableHandler.java:440)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.verifyVersionCompatibility(DynamoDBMetadataStoreTableHandler.java:418)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableHandler.initTable(DynamoDBMetadataStoreTableHandler.java:184)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:426)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3Guard.getMetadataStore(S3Guard.java:109)
        at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:413)
        at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3391)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:137)
        at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3440)
        at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3414)
        at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:556)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:1208)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:427)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:1797)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.main(S3GuardTool.java:1806)
   2019-10-02 18:06:00,693 [main] INFO  util.ExitUtil 
(ExitUtil.java:terminate(210)) - Exiting with status -1: java.io.IOException: 
Database table is from an incompatible S3Guard version based on table TAG. 
Table hwdev-steve-ireland-new Exp
   ```
   
   
   Side issue a prune with seconds 0 didn't delete tombstones from the DDB 
table.
   
   ```
   > bin/hadoop s3guard prune -tombstone -seconds 0  
s3a://hwdev-steve-ireland-new/
   ...
   2019-10-02 17:56:25,011 [main] INFO  s3guard.S3GuardTool 
(S3GuardTool.java:initMetadataStore(324)) - Metadata store 
DynamoDBMetadataStore{region=eu-west-1, tableName=hwdev-steve-ireland-new, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/hwdev-steve-ireland-new} 
is initialized.
   2019-10-02 17:56:25,027 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:prune(1519)) - Prune tombstones under 
/hwdev-steve-ireland-new with age 1569948985027
   2019-10-02 17:56:25,033 [main] INFO  s3guard.DynamoDBMetadataStore 
(DurationInfo.java:<init>(72)) - Starting: Pruning DynamoDB Store
   2019-10-02 17:56:25,066 [main] INFO  s3guard.DynamoDBMetadataStore 
(DurationInfo.java:close(87)) - Pruning DynamoDB Store: duration 0:00.032s
   2019-10-02 17:56:25,066 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:close(2267)) - Auditing #(Prune-0001)
   2019-10-02 17:56:25,066 [main] INFO  s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1596)) - Finished pruning 0 items in 
batches of 25
   2019-10-02 17:56:25,068 [shutdown-hook-0] DEBUG s3a.S3AFileSystem 
(S3AFileSystem.java:close(3048)) - Filesystem s3a://hwdev-steve-ireland-new is 
closed
   2019-10-02 17:56:25,070 [shutdown-hook-0] DEBUG s3a.S3AFileSystem 
(S3AUtils.java:closeAll(1628)) - Closing NullMetadataStore
   2019-10-02 17:56:25,070 [shutdown-hook-0] DEBUG s3a.S3AFileSystem 
(S3AUtils.java:closeAll(1628)) - Closing 
org.apache.hadoop.fs.s3a.S3AInstrumentation@27968269
   2019-10-02 17:56:25,071 [shutdown-hook-0] DEBUG s3a.S3AFileSystem 
(S3AUtils.java:closeAutocloseables(1651)) - Closing 
AWSCredentialProviderList[refcount= 2: [TemporaryAWSCredentialsProvider, 
SimpleAWSCredentialsProvider, EnvironmentVariableCredentialsProvider, 
org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider@63a12c68] last 
provider: SimpleAWSCredentialsProvider
   2019-10-02 17:56:25,071 [shutdown-hook-0] DEBUG 
s3a.AWSCredentialProviderList (AWSCredentialProviderList.java:close(311)) - Not 
closing AWSCredentialProviderList[refcount= 1: 
[TemporaryAWSCredentialsProvider, SimpleAWSCredentialsProvider, 
EnvironmentVariableCredentialsProvider, 
org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider@63a12c68] last 
provider: SimpleAWSCredentialsProvider
   ```
   
   But when I said -seconds 1, it did
   
   ```
   > bin/hadoop s3guard prune -tombstone -seconds 1  
s3a://hwdev-steve-ireland-new/
   
   ...
   2019-10-02 17:58:55,815 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:prune(1519)) - Prune tombstones under 
/hwdev-steve-ireland-new with age 1570035534815
   2019-10-02 17:58:55,821 [main] INFO  s3guard.DynamoDBMetadataStore 
(DurationInfo.java:<init>(72)) - Starting: Pruning DynamoDB Store
   2019-10-02 17:58:55,882 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/terasort-directory/validate/_SUCCESS
   2019-10-02 17:58:55,883 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/terasort-directory/validate/part-r-00000
   2019-10-02 17:58:55,883 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/terasort-magic/validate/__magic/app-attempt-0001/tasks/attempt_1570028934823_0006_m_000000_0/__base
   2019-10-02 17:58:55,883 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/user/stevel/target/test/data/Nhh2EnrZqv
   2019-10-02 17:58:55,884 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/user/stevel/target/test/data/PEzrLmEa92
   2019-10-02 17:58:55,884 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/user/stevel/target/test/data/rogKpvLQQU
   2019-10-02 17:58:55,884 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1548)) - Prune entry 
s3a://hwdev-steve-ireland-new/terasort-magic/sortin/__magic/app-attempt-0001/tasks/attempt_1570019599148_0002_
   ...
   2019-10-02 17:58:57,212 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:get(711)) - result of get 
s3a://hwdev-steve-ireland-new/test/prune-cli-keep is: null
   2019-10-02 17:58:57,212 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:lambda$removeAuthoritativeDirFlag$10(1640)) - No 
parent s3a://hwdev-steve-ireland-new/test/prune-cli-keep; skipping
   2019-10-02 17:58:57,212 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:removeAuthoritativeDirFlag(1666)) - innerPut on 
metas: []
   2019-10-02 17:58:57,213 [main] INFO  s3guard.DynamoDBMetadataStore 
(DurationInfo.java:close(87)) - Pruning DynamoDB Store: duration 0:01.391s
   2019-10-02 17:58:57,213 [main] DEBUG s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:close(2267)) - Auditing #(Prune-0001)
   2019-10-02 17:58:57,213 [main] INFO  s3guard.DynamoDBMetadataStore 
(DynamoDBMetadataStore.java:innerPrune(1596)) - Finished pruning 104 items in 
batches of 25
   ```
   
   I don't think it is at all related -but have a try on your system and see 
what happens.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to