[jira] [Commented] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-11-08 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16970505#comment-16970505
 ] 

Gabor Bota commented on HADOOP-16484:
-

Sure, I'm happy to do this. 

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Fix For: 3.3.0
>
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option
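
A minimal sketch of how such a level switch could behave, in Java. Only the level names come from the proposal above; the enum name, the `parse` and `react` methods, and the message wording are hypothetical and are not taken from the actual patch. Returning the message instead of logging it keeps the sketch free of any logging dependency.

```java
import java.util.Locale;

/** Hypothetical sketch: reaction levels when an S3A FS has no S3Guard. */
enum DisabledS3GuardLevel {
  SILENT, STATUS, INFORM, WARN, FAIL;

  /** Parse a configuration value such as "inform"; case-insensitive. */
  static DisabledS3GuardLevel parse(String value) {
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  /**
   * Decide what to do for a bucket that has S3Guard switched off.
   * Returns the message a caller would log (empty for SILENT),
   * or throws for FAIL.
   */
  String react(String bucket) {
    switch (this) {
      case SILENT:
        return "";
      case STATUS:
        return "S3Guard is off for " + bucket;
      case INFORM:
        return "FS for " + bucket + " instantiated without S3Guard";
      case WARN:
        return "No S3Guard for " + bucket + ": data may be at risk in workflows";
      case FAIL:
      default:
        throw new IllegalStateException("S3Guard is disabled for " + bucket);
    }
  }
}
```

A caller would parse the configured value once at filesystem initialization and then log the returned message at the matching log level.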



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-16473) S3Guard prune to only remove auth dir marker if files (not tombstones) are removed

2019-11-07 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16473:
---

Assignee: Gabor Bota

> S3Guard prune to only remove auth dir marker if files (not tombstones) are 
> removed
> --
>
> Key: HADOOP-16473
> URL: https://issues.apache.org/jira/browse/HADOOP-16473
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> the {{s3guard prune}} command marks all dirs as non-auth if an entry was 
> deleted. This makes sense from a performance perspective. But if only 
> tombstones are being purged, it doesn't: all it does is hurt the performance 
> of future scans
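
The decision described above can be sketched in Java as follows. This is an illustration under assumptions: `PruneSketch`, `Entry`, and `mustClearAuthoritative` are hypothetical names, not the S3Guard classes, and the real prune code works against the metadata store rather than an in-memory list.

```java
import java.util.List;

/** Hypothetical sketch of the prune decision described above. */
final class PruneSketch {

  /** Minimal stand-in for a metadata entry; not the S3Guard class. */
  static final class Entry {
    final String path;
    final boolean tombstone;

    Entry(String path, boolean tombstone) {
      this.path = path;
      this.tombstone = tombstone;
    }
  }

  /**
   * The parent directory needs its authoritative flag cleared only if
   * at least one pruned entry was a real file rather than a tombstone.
   */
  static boolean mustClearAuthoritative(List<Entry> pruned) {
    for (Entry e : pruned) {
      if (!e.tombstone) {
        return true;
      }
    }
    return false;
  }
}
```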






[jira] [Resolved] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-11-04 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16484.
-
   Fix Version/s: 3.3.0
Target Version/s: 3.3.0  (was: 3.2.2)
  Resolution: Fixed

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
> Fix For: 3.3.0
>
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option






[jira] [Commented] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-11-04 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16966594#comment-16966594
 ] 

Gabor Bota commented on HADOOP-16484:
-

+1 from [~ste...@apache.org] on PR #1661, committed to trunk.

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option






[jira] [Work started] (HADOOP-16424) S3Guard fsck: Check internal consistency of the MetadataStore

2019-10-28 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16424 started by Gabor Bota.
---
> S3Guard fsck: Check internal consistency of the MetadataStore
> -
>
> Key: HADOOP-16424
> URL: https://issues.apache.org/jira/browse/HADOOP-16424
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The internal consistency should be checked, e.g. for orphaned entries, which can 
> cause trouble at runtime and in testing.






[jira] [Work started] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-10-28 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16484 started by Gabor Bota.
---
> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option






[jira] [Resolved] (HADOOP-16653) S3Guard DDB overreacts to no tag access

2019-10-28 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16653.
-
Resolution: Fixed

> S3Guard DDB overreacts to no tag access
> ---
>
> Key: HADOOP-16653
> URL: https://issues.apache.org/jira/browse/HADOOP-16653
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> if you don't have permissions to read or write DDB tags it logs a lot every 
> time you bring up a guarded FS
> # we shouldn't worry so much about no tag access if version is there
> # if you can't read the tag, no point trying to write






[jira] [Commented] (HADOOP-16653) S3Guard DDB overreacts to no tag access

2019-10-28 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960909#comment-16960909
 ] 

Gabor Bota commented on HADOOP-16653:
-

+1 on PR#1660 from [~ste...@apache.org]. Committing.

> S3Guard DDB overreacts to no tag access
> ---
>
> Key: HADOOP-16653
> URL: https://issues.apache.org/jira/browse/HADOOP-16653
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> if you don't have permissions to read or write DDB tags it logs a lot every 
> time you bring up a guarded FS
> # we shouldn't worry so much about no tag access if version is there
> # if you can't read the tag, no point trying to write






[jira] [Commented] (HADOOP-16424) S3Guard fsck: Check internal consistency of the MetadataStore

2019-10-17 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953649#comment-16953649
 ] 

Gabor Bota commented on HADOOP-16424:
-

Our code should not be creating orphan entries. If we have an orphan entry, then 
it's a bug in the production code.


> S3Guard fsck: Check internal consistency of the MetadataStore
> -
>
> Key: HADOOP-16424
> URL: https://issues.apache.org/jira/browse/HADOOP-16424
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The internal consistency should be checked, e.g. for orphaned entries, which can 
> cause trouble at runtime and in testing.






[jira] [Comment Edited] (HADOOP-16424) S3Guard fsck: Check internal consistency of the MetadataStore

2019-10-17 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898870#comment-16898870
 ] 

Gabor Bota edited comment on HADOOP-16424 at 10/17/19 11:35 AM:


Tasks to do here: 
* find orphan entries (entries without a parent)
* find if a file's parent is not a directory (so the parent is a file)
* warn: no lastUpdated field
* entries where the parent is a tombstone
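
The four checks listed above can be sketched over an in-memory map from path to entry. This is only an illustration: `ConsistencySketch`, `Meta`, `parentOf`, and `check` are hypothetical names, a `lastUpdated` of 0 is assumed to mean the field is missing, and the root is treated as implicitly present. The actual fsck runs against the MetadataStore, not a `Map`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the internal consistency checks listed above. */
final class ConsistencySketch {

  /** Minimal stand-in for a metadata entry; not the S3Guard class. */
  static final class Meta {
    final boolean directory;
    final boolean tombstone;
    final long lastUpdated; // assumption: 0 means the field is missing

    Meta(boolean directory, boolean tombstone, long lastUpdated) {
      this.directory = directory;
      this.tombstone = tombstone;
      this.lastUpdated = lastUpdated;
    }
  }

  /** Parent of an absolute path; the parent of a top-level entry is "/". */
  static String parentOf(String path) {
    int i = path.lastIndexOf('/');
    return i <= 0 ? "/" : path.substring(0, i);
  }

  /** Returns one line per problem found in the store. */
  static List<String> check(Map<String, Meta> store) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, Meta> e : store.entrySet()) {
      String path = e.getKey();
      if (e.getValue().lastUpdated == 0) {
        problems.add("WARN no lastUpdated: " + path);
      }
      String parentPath = parentOf(path);
      if (parentPath.equals("/")) {
        continue; // root is implicit in this sketch
      }
      Meta parent = store.get(parentPath);
      if (parent == null) {
        problems.add("orphan: " + path);
      } else if (parent.tombstone) {
        problems.add("parent is a tombstone: " + path);
      } else if (!parent.directory) {
        problems.add("parent is a file: " + path);
      }
    }
    return problems;
  }
}
```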



was (Author: gabor.bota):
Tasks to do here: 
* find orphan entries (entries without a parent)
* find if a file's parent is not a directory (so the parent is a file)
* warn: no lastUpdated field

> S3Guard fsck: Check internal consistency of the MetadataStore
> -
>
> Key: HADOOP-16424
> URL: https://issues.apache.org/jira/browse/HADOOP-16424
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The internal consistency should be checked, e.g. for orphaned entries, which can 
> cause trouble at runtime and in testing.






[jira] [Comment Edited] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-10-16 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952938#comment-16952938
 ] 

Gabor Bota edited comment on HADOOP-16484 at 10/16/19 3:29 PM:
---

I removed status (merged with inform) so at inform level it will log with 
LOG.info


was (Author: gabor.bota):
I removed status (merged with inform) so at inform level it will log with 

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option






[jira] [Commented] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-10-16 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952938#comment-16952938
 ] 

Gabor Bota commented on HADOOP-16484:
-

I removed status (merged with inform) so at inform level it will log with 

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing.
> * status: give s3guard on/off status
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option






[jira] [Commented] (HADOOP-16653) S3Guard DDB overreacts to no tag access

2019-10-14 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951043#comment-16951043
 ] 

Gabor Bota commented on HADOOP-16653:
-

It was clearly in the docs, so I'll update that as well: 
{{s3guard.md}}:

{noformat}
*Note*: If the user does not have sufficient rights to tag the table, 
but it can read the tags the initialization of S3Guard will not fail, 
but there will be no version marker tag on the dynamo table and the following 
message will be logged on WARN level:
```
Exception during tagging table: {AmazonDynamoDBException exception message}
```
{noformat}

> S3Guard DDB overreacts to no tag access
> ---
>
> Key: HADOOP-16653
> URL: https://issues.apache.org/jira/browse/HADOOP-16653
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> if you don't have permissions to read or write DDB tags it logs a lot every 
> time you bring up a guarded FS
> # we shouldn't worry so much about no tag access if version is there
> # if you can't read the tag, no point trying to write






[jira] [Comment Edited] (HADOOP-16653) S3Guard DDB overreacts to no tag access

2019-10-14 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951043#comment-16951043
 ] 

Gabor Bota edited comment on HADOOP-16653 at 10/14/19 2:47 PM:
---

It was clearly in the docs, so I'll update that as well: 
{{s3guard.md}}:

{noformat}
*Note*: If the user does not have sufficient rights to tag the table the 
initialization of S3Guard will not fail, but there will be no version marker tag
on the dynamo table and the following message will be logged on WARN level:
```
Exception during tagging table: {AmazonDynamoDBException exception message}
```
{noformat}


was (Author: gabor.bota):
It was clearly in the docs, so I'll update that as well: 
{{s3guard.md}}:

{noformat}
*Note*: If the user does not have sufficient rights to tag the table, 
but it can read the tags the initialization of S3Guard will not fail, 
but there will be no version marker tag on the dynamo table and the following 
message will be logged on WARN level:
```
Exception during tagging table: {AmazonDynamoDBException exception message}
```
{noformat}

> S3Guard DDB overreacts to no tag access
> ---
>
> Key: HADOOP-16653
> URL: https://issues.apache.org/jira/browse/HADOOP-16653
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> if you don't have permissions to read or write DDB tags it logs a lot every 
> time you bring up a guarded FS
> # we shouldn't worry so much about no tag access if version is there
> # if you can't read the tag, no point trying to write






[jira] [Commented] (HADOOP-16349) DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

2019-10-11 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949341#comment-16949341
 ] 

Gabor Bota commented on HADOOP-16349:
-

https://github.com/apache/hadoop/pull/1576
Fixed in HADOOP-16540.
+1 on #1576 from [~ste...@apache.org]. Committing. Thanks.

> DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry
> -
>
> Key: HADOOP-16349
> URL: https://issues.apache.org/jira/browse/HADOOP-16349
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> If you delete the version marker from a S3Guard table, it appears to hang for 
> 5 minutes.
> Only if you restart and turn logging to debug do you see that 
> {{DynamoDBMetadataStore.getVersionMarkerItem()}} is sleeping and retrying.
> # log at warn
> # add entry to troubleshooting doc on the topic
> The cause of the failure can be any of
> * table being inited elsewhere: expectation, fast recovery
> * it's not a S3Guard table: it won't recover
> * it's a S3Guard table without a version marker: it won't recover.
> + consider having a shorter retry lifespan, though if it adds a new config 
> point I'm a bit reluctant. For s3guard bucket-info it would make sense to 
> change the policy to be aggressively short lived
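
A rough sketch of the two requests above (log at WARN on every retry, and bound the retry lifespan), in Java. It does not reproduce `DynamoDBMetadataStore.getVersionMarkerItem()`: the class name, `getWithRetries`, and its parameters are hypothetical, and `java.util.logging` stands in for the project's own logger so the example is self-contained.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

/** Hypothetical sketch: bounded retries that are visible in the log. */
final class VersionMarkerRetrySketch {
  private static final Logger LOG =
      Logger.getLogger(VersionMarkerRetrySketch.class.getName());

  /**
   * Calls fetch up to maxAttempts times, sleeping between attempts.
   * Each miss is logged at WARN so the wait does not look like a hang.
   * Returns null if the marker never appears or the thread is interrupted.
   */
  static <T> T getWithRetries(Supplier<T> fetch, int maxAttempts, long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T item = fetch.get();
      if (item != null) {
        return item;
      }
      LOG.warning("Version marker not found, attempt "
          + attempt + " of " + maxAttempts);
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve interrupt status
          return null;
        }
      }
    }
    return null;
  }
}
```

A short-lived command such as bucket-info could pass a small `maxAttempts`, while filesystem initialization could keep a longer one.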






[jira] [Resolved] (HADOOP-16349) DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

2019-10-11 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16349.
-
Resolution: Fixed

> DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry
> -
>
> Key: HADOOP-16349
> URL: https://issues.apache.org/jira/browse/HADOOP-16349
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> If you delete the version marker from a S3Guard table, it appears to hang for 
> 5 minutes.
> Only if you restart and turn logging to debug do you see that 
> {{DynamoDBMetadataStore.getVersionMarkerItem()}} is sleeping and retrying.
> # log at warn
> # add entry to troubleshooting doc on the topic
> The cause of the failure can be any of
> * table being inited elsewhere: expectation, fast recovery
> * it's not a S3Guard table: it won't recover
> * it's a S3Guard table without a version marker: it won't recover.
> + consider having a shorter retry lifespan, though if it adds a new config 
> point I'm a bit reluctant. For s3guard bucket-info it would make sense to 
> change the policy to be aggressively short lived






[jira] [Resolved] (HADOOP-16520) Race condition in DDB table init and waiting threads

2019-10-11 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16520.
-
Resolution: Fixed

> Race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> s3guard threads waiting for table creation completion can be scheduled before 
> the creating thread, look for the version marker and then fail.
> window will be sleep times in AWS SDK Table.waitForActive();






[jira] [Commented] (HADOOP-16520) Race condition in DDB table init and waiting threads

2019-10-11 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949339#comment-16949339
 ] 

Gabor Bota commented on HADOOP-16520:
-

+1 on #1576 from [~ste...@apache.org]. Committing. Thanks.

> Race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> s3guard threads waiting for table creation completion can be scheduled before 
> the creating thread, look for the version marker and then fail.
> window will be sleep times in AWS SDK Table.waitForActive();






[jira] [Updated] (HADOOP-16520) Race condition in DDB table init and waiting threads

2019-10-11 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16520:

Summary: Race condition in DDB table init and waiting threads  (was: race 
condition in DDB table init and waiting threads)

> Race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> s3guard threads waiting for table creation completion can be scheduled before 
> the creating thread, look for the version marker and then fail.
> window will be sleep times in AWS SDK Table.waitForActive();






[jira] [Commented] (HADOOP-16349) DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

2019-10-01 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942161#comment-16942161
 ] 

Gabor Bota commented on HADOOP-16349:
-

I'm going to fix this with HADOOP-16520.

> DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry
> -
>
> Key: HADOOP-16349
> URL: https://issues.apache.org/jira/browse/HADOOP-16349
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> If you delete the version marker from a S3Guard table, it appears to hang for 
> 5 minutes.
> Only if you restart and turn logging to debug do you see that 
> {{DynamoDBMetadataStore.getVersionMarkerItem()}} is sleeping and retrying.
> # log at warn
> # add entry to troubleshooting doc on the topic
> The cause of the failure can be any of
> * table being inited elsewhere: expectation, fast recovery
> * it's not a S3Guard table: it won't recover
> * it's a S3Guard table without a version marker: it won't recover.
> + consider having a shorter retry lifespan, though if it adds a new config 
> point I'm a bit reluctant. For s3guard bucket-info it would make sense to 
> change the policy to be aggressively short lived






[jira] [Work started] (HADOOP-16349) DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

2019-10-01 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16349 started by Gabor Bota.
---
> DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry
> -
>
> Key: HADOOP-16349
> URL: https://issues.apache.org/jira/browse/HADOOP-16349
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> If you delete the version marker from a S3Guard table, it appears to hang for 
> 5 minutes.
> Only if you restart and turn logging to debug do you see that 
> {{DynamoDBMetadataStore.getVersionMarkerItem()}} is sleeping and retrying.
> # log at warn
> # add entry to troubleshooting doc on the topic
> The cause of the failure can be any of
> * table being inited elsewhere: expectation, fast recovery
> * it's not a S3Guard table: it won't recover
> * it's a S3Guard table without a version marker: it won't recover.
> + consider having a shorter retry lifespan, though if it adds a new config 
> point I'm a bit reluctant. For s3guard bucket-info it would make sense to 
> change the policy to be aggressively short lived






[jira] [Assigned] (HADOOP-16349) DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry

2019-10-01 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16349:
---

Assignee: Gabor Bota

> DynamoDBMetadataStore.getVersionMarkerItem() to log at info/warn on retry
> -
>
> Key: HADOOP-16349
> URL: https://issues.apache.org/jira/browse/HADOOP-16349
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> If you delete the version marker from a S3Guard table, it appears to hang for 
> 5 minutes.
> Only if you restart and turn logging to debug do you see that 
> {{DynamoDBMetadataStore.getVersionMarkerItem()}} is sleeping and retrying.
> # log at warn
> # add entry to troubleshooting doc on the topic
> The cause of the failure can be any of
> * table being inited elsewhere: expectation, fast recovery
> * it's not a S3Guard table: it won't recover
> * it's a S3Guard table without a version marker: it won't recover.
> + consider having a shorter retry lifespan, though if it adds a new config 
> point I'm a bit reluctant. For s3guard bucket-info it would make sense to 
> change the policy to be aggressively short lived






[jira] [Work started] (HADOOP-16520) race condition in DDB table init and waiting threads

2019-09-28 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16520 started by Gabor Bota.
---
> race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> S3Guard threads waiting for table creation to complete can be scheduled before 
> the creating thread, look for the version marker and then fail.
> The window is the sleep time in the AWS SDK's Table.waitForActive().






[jira] [Commented] (HADOOP-16520) race condition in DDB table init and waiting threads

2019-09-26 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938690#comment-16938690
 ] 

Gabor Bota commented on HADOOP-16520:
-

I'll try to solve this the other way: add the version marker in 
{{verifyVersionCompatibility}} if and only if the table lacks a version marker 
AND is empty.
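
The rule in this comment can be sketched as follows, using an in-memory map in place of the DynamoDB table. Everything here is an assumption made for illustration: the class name, the `MARKER_KEY` value, and the boolean return are hypothetical, and only the method name `verifyVersionCompatibility` comes from the comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the table check; not the Hadoop implementation. */
final class VersionMarkerCheck {
  /** Hypothetical key under which the version marker is stored. */
  static final String MARKER_KEY = "VERSION_MARKER";

  /**
   * Verifies the marker, writing it only if the table has no marker AND is empty.
   *
   * @return true if this call wrote the marker, false if it already existed
   */
  static boolean verifyVersionCompatibility(Map<String, String> table, String expected) {
    String found = table.get(MARKER_KEY);
    if (found != null) {
      if (!found.equals(expected)) {
        throw new IllegalStateException(
            "Version mismatch: expected " + expected + " but found " + found);
      }
      return false;
    }
    if (table.isEmpty()) {
      // No marker and no entries: treat as a freshly created table.
      table.put(MARKER_KEY, expected);
      return true;
    }
    // Entries exist but no marker: do not guess, refuse to use the table.
    throw new IllegalStateException("Table has entries but no version marker");
  }

  public static void main(String[] args) {
    Map<String, String> table = new HashMap<>();
    System.out.println(verifyVersionCompatibility(table, "100"));
    System.out.println(verifyVersionCompatibility(table, "100"));
  }
}
```

The map-based check is not atomic; a real table would presumably need a conditional write so that two threads cannot both decide the table is empty.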

> race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> S3Guard threads waiting for table creation to complete can be scheduled before 
> the creating thread, look for the version marker and then fail.
> The window is the sleep time in the AWS SDK's Table.waitForActive().






[jira] [Assigned] (HADOOP-16579) Upgrade to Apache Curator 4.2.0 in Hadoop

2019-09-25 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16579:
---

Assignee: Norbert Kalmar

> Upgrade to Apache Curator 4.2.0 in Hadoop
> -
>
> Key: HADOOP-16579
> URL: https://issues.apache.org/jira/browse/HADOOP-16579
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Mate Szalay-Beko
>Assignee: Norbert Kalmar
>Priority: Major
>
> Currently in Hadoop we are using [ZooKeeper version 
> 3.4.13|https://github.com/apache/hadoop/blob/7f9073132dcc9db157a6792635d2ed099f2ef0d2/hadoop-project/pom.xml#L90].
>  ZooKeeper 3.5.5 is the latest stable Apache ZooKeeper release. It contains 
> many new features (including SSL related improvements which can be very 
> important for production use; see [the release 
> notes|https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html]).
> Apache Curator is a high-level ZooKeeper client library that makes it easier 
> to use the low-level ZooKeeper API. Currently [in Hadoop we are using Curator 
> 2.13.0|https://github.com/apache/hadoop/blob/7f9073132dcc9db157a6792635d2ed099f2ef0d2/hadoop-project/pom.xml#L91]
>  and [in Ozone we use Curator 
> 2.12.0|https://github.com/apache/hadoop/blob/7f9073132dcc9db157a6792635d2ed099f2ef0d2/pom.ozone.xml#L146].
> Curator 2.x supports only the ZooKeeper 3.4.x releases, while Curator 
> 3.x is compatible only with the new ZooKeeper 3.5.x releases. Fortunately, 
> the latest Curator 4.x versions are compatible with both ZooKeeper 3.4.x and 
> 3.5.x (see [the relevant Curator 
> page|https://curator.apache.org/zk-compatibility.html]). Many Apache projects 
> have already migrated to Curator 4 (like HBase, Phoenix, Druid, etc.), and 
> other components are doing it right now (e.g. Hive).
> *The aims of this task are* to:
>  - change Curator version in Hadoop to the latest stable 4.x version 
> (currently 4.2.0)
>  - also make sure we don't have multiple ZooKeeper versions on the classpath 
> to avoid runtime problems (it is 
> [recommended|https://curator.apache.org/zk-compatibility.html] to exclude the 
> ZooKeeper which comes with Curator, so that only a single 
> ZooKeeper version is used at runtime in Hadoop)
> In this ticket we still don't want to change the default ZooKeeper version in 
> Hadoop; we only want to make it possible for the community to 
> build / use Hadoop with the new ZooKeeper (e.g. if they need to secure the 
> ZooKeeper communication with SSL, which is only supported in the new ZooKeeper 
> version). Upgrading to Curator 4.x should keep Hadoop compatible with 
> both ZooKeeper 3.4 and 3.5.






[jira] [Resolved] (HADOOP-16529) Allow AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from abfs.xml property

2019-09-25 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16529.
-
Resolution: Workaround

> Allow AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from 
> abfs.xml property
> ---
>
> Key: HADOOP-16529
> URL: https://issues.apache.org/jira/browse/HADOOP-16529
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> In 
> org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest#AbstractAbfsIntegrationTest
>  we do a
> {code:java}
> abfsConfig.setBoolean(AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION, true);
> {code}
> which is not good for some test cases (e.g. HADOOP-16138) where we want to test 
> against a container that does not exist.
> A property should be added to be able to override this.
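
One way to picture the requested override is a lookup that defaults to the current hard-coded `true` but lets a test property switch it off. The sketch below uses `java.util.Properties` in place of the Hadoop configuration classes so it stays self-contained; the class name, the method name, and the property key string are assumptions for illustration, not confirmed names from the ABFS code.

```java
import java.util.Properties;

/** Hypothetical sketch of an overridable test setting; not the ABFS test code. */
final class AbfsTestSettings {
  /** Assumed property key; the real key name may differ. */
  static final String CREATE_FS_KEY =
      "fs.azure.createRemoteFileSystemDuringInitialization";

  /**
   * Returns whether the remote filesystem should be created during
   * initialization. Defaults to true (today's behaviour) unless the
   * property is explicitly set to something other than "true".
   */
  static boolean createRemoteFileSystem(Properties testProps) {
    return Boolean.parseBoolean(testProps.getProperty(CREATE_FS_KEY, "true"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(createRemoteFileSystem(props));
    props.setProperty(CREATE_FS_KEY, "false");
    System.out.println(createRemoteFileSystem(props));
  }
}
```

A test that needs a nonexistent container would set the property to `false` before the filesystem is initialized.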






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-09-25 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937551#comment-16937551
 ] 

Gabor Bota commented on HADOOP-16138:
-

That is true. Sorry [~ayushtkn]. 
The way I plan to fix it: create an HDFS jira where I `revert` this change in 
the sense that I will create a PR with the original log. There is no need for 
the additional logging that we added.

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Resolved] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-09-23 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16138.
-
Target Version/s: 3.3.0
  Resolution: Fixed

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-09-23 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935762#comment-16935762
 ] 

Gabor Bota commented on HADOOP-16138:
-

+1 by [~ste...@apache.org] on PR #1302. Committing.

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Updated] (HADOOP-16565) Region must be provided when requesting session credentials or SdkClientException will be thrown

2019-09-23 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16565:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Region must be provided when requesting session credentials or 
> SdkClientException will be thrown
> 
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Resolved] (HADOOP-16547) s3guard prune command doesn't get AWS auth chain from FS

2019-09-18 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16547.
-
Resolution: Fixed

> s3guard prune command doesn't get AWS auth chain from FS
> 
>
> Key: HADOOP-16547
> URL: https://issues.apache.org/jira/browse/HADOOP-16547
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> s3guard prune command doesn't get AWS auth chain from any FS, so it just 
> drives the DDB store from the conf settings. If S3A is set up to use 
> Delegation tokens then the DTs/custom AWS auth sequence is not picked up, so 
> you get an auth failure.
> Fix:
> # instantiate the FS before calling initMetadataStore
> # review other commands to make sure problem isn't replicated






[jira] [Commented] (HADOOP-16547) s3guard prune command doesn't get AWS auth chain from FS

2019-09-18 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932692#comment-16932692
 ] 

Gabor Bota commented on HADOOP-16547:
-

+1 on PR 1402. Committed.

> s3guard prune command doesn't get AWS auth chain from FS
> 
>
> Key: HADOOP-16547
> URL: https://issues.apache.org/jira/browse/HADOOP-16547
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> s3guard prune command doesn't get AWS auth chain from any FS, so it just 
> drives the DDB store from the conf settings. If S3A is set up to use 
> Delegation tokens then the DTs/custom AWS auth sequence is not picked up, so 
> you get an auth failure.
> Fix:
> # instantiate the FS before calling initMetadataStore
> # review other commands to make sure problem isn't replicated






[jira] [Assigned] (HADOOP-16484) S3A to warn or fail if S3Guard is disabled

2019-09-18 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16484:
---

Assignee: Gabor Bota

> S3A to warn or fail if S3Guard is disabled
> --
>
> Key: HADOOP-16484
> URL: https://issues.apache.org/jira/browse/HADOOP-16484
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> A seemingly recurrent problem with s3guard is "people who think S3Guard is 
> turned on but really it isn't"
> It's not immediately obvious this is the case, and the fact S3Guard is off 
> tends to surface after some intermittent failure has actually been detected.
> Propose: add a configuration parameter which chooses what to do when an S3A 
> FS is instantiated without S3Guard
> * silent : today; do nothing. 
> * inform: log FS is instantiated without s3guard
> * warn: Warn that data may be at risk in workflows
> * fail
> deployments could then choose which level of reaction they want. I'd make the 
> default "inform" for now; any on-prem object store deployment should switch 
> to silent, and if you really want strictness, fail is the ultimate option
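
The proposed reaction levels could be modelled as an enum that is parsed from a configuration value and applied when the filesystem comes up without S3Guard. This is only a sketch of the idea in the description: the class name, the `parse` and `onInstantiatedWithoutS3Guard` methods, and the use of `java.util.logging` are hypothetical, and the default of "inform" follows the suggestion in the text.

```java
import java.util.Locale;
import java.util.logging.Logger;

/** Hypothetical sketch of the proposed setting; not the S3A implementation. */
final class S3GuardDisabledPolicy {
  private static final Logger LOG = Logger.getLogger("S3GuardDisabledPolicy");

  /** Reaction levels listed in the issue description. */
  enum Level { SILENT, INFORM, WARN, FAIL }

  /** Parses a config value, falling back to INFORM when unset or blank. */
  static Level parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return Level.INFORM;
    }
    return Level.valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  /** Applies the chosen reaction for a filesystem created without S3Guard. */
  static void onInstantiatedWithoutS3Guard(Level level, String fsUri) {
    switch (level) {
      case SILENT:
        break;
      case INFORM:
        LOG.info("Filesystem " + fsUri + " is instantiated without S3Guard");
        break;
      case WARN:
        LOG.warning("Filesystem " + fsUri
            + " has no S3Guard; data may be at risk in workflows");
        break;
      case FAIL:
        throw new IllegalStateException(
            "S3Guard is required but disabled for " + fsUri);
    }
  }

  public static void main(String[] args) {
    onInstantiatedWithoutS3Guard(parse(null), "s3a://example-bucket/");
    onInstantiatedWithoutS3Guard(parse("warn"), "s3a://example-bucket/");
  }
}
```

An unrecognised value makes `Level.valueOf` throw `IllegalArgumentException`; whether a real implementation should fail or fall back in that case is a separate design choice.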






[jira] [Work stopped] (HADOOP-16520) race condition in DDB table init and waiting threads

2019-09-18 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16520 stopped by Gabor Bota.
---
> race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> S3Guard threads waiting for table creation to complete can be scheduled before 
> the creating thread, look for the version marker and then fail.
> The window is the sleep time in the AWS SDK's Table.waitForActive().






[jira] [Commented] (HADOOP-16565) Region must be provided when requesting session credentials or SdkClientException will be thrown

2019-09-18 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932216#comment-16932216
 ] 

Gabor Bota commented on HADOOP-16565:
-

+1 from [~ste...@apache.org] on PR 1454. Committing.

> Region must be provided when requesting session credentials or 
> SdkClientException will be thrown
> 
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Updated] (HADOOP-16565) Region must be provided when requesting session credentials or SdkClientException will be thrown

2019-09-16 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16565:

Status: Patch Available  (was: In Progress)

> Region must be provided when requesting session credentials or 
> SdkClientException will be thrown
> 
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work started] (HADOOP-16565) Region must be provided when requesting session credentials or SdkClientException will be thrown

2019-09-16 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16565 started by Gabor Bota.
---
> Region must be provided when requesting session credentials or 
> SdkClientException will be thrown
> 
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Updated] (HADOOP-16565) Region must be provided when requesting session credentials or SdkClientException will be thrown

2019-09-16 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16565:

Summary: Region must be provided when requesting session credentials or 
SdkClientException will be thrown  (was: Fix "com.amazonaws.SdkClientException: 
Unable to find a region via the region provider chain.")

> Region must be provided when requesting session credentials or 
> SdkClientException will be thrown
> 
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Reopened] (HADOOP-16565) Fix "com.amazonaws.SdkClientException: Unable to find a region via the region provider chain."

2019-09-13 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reopened HADOOP-16565:
-

Reopened to add a message.

> Fix "com.amazonaws.SdkClientException: Unable to find a region via the region 
> provider chain."
> --
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Work started] (HADOOP-16520) race condition in DDB table init and waiting threads

2019-09-13 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16520 started by Gabor Bota.
---
> race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> S3Guard threads waiting for table creation to complete can be scheduled before 
> the creating thread, look for the version marker, and then fail.
> The race window will be the sleep times in AWS SDK Table.waitForActive().
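
One way to picture the ordering problem described above is a gate that releases waiting threads only after the creating thread has written the version marker. The following is a minimal sketch under that assumption; TableInitGate and its methods are hypothetical names and this is not the S3Guard or AWS SDK code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for DDB table initialization; not the S3Guard code. */
final class TableInitGate {
    private final CountDownLatch ready = new CountDownLatch(1);
    private volatile boolean versionMarkerWritten;

    /** Called by the creating thread once the table is active. */
    void finishInit() {
        versionMarkerWritten = true; // write the marker first...
        ready.countDown();           // ...and only then release the waiters
    }

    /** Called by waiting threads; true if the marker became visible in time. */
    boolean awaitVersionMarker(long timeoutMillis) {
        try {
            if (!ready.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // creator has not finished yet
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return versionMarkerWritten;
    }
}
```

With this shape a waiter can never look for the marker before the creator has written it, whatever order the threads are scheduled in.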






[jira] [Assigned] (HADOOP-16520) race condition in DDB table init and waiting threads

2019-09-13 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16520:
---

Assignee: Gabor Bota

> race condition in DDB table init and waiting threads
> 
>
> Key: HADOOP-16520
> URL: https://issues.apache.org/jira/browse/HADOOP-16520
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Major
>
> S3Guard threads waiting for table creation to complete can be scheduled before 
> the creating thread, look for the version marker, and then fail.
> The race window will be the sleep times in AWS SDK Table.waitForActive().






[jira] [Resolved] (HADOOP-16565) Fix "com.amazonaws.SdkClientException: Unable to find a region via the region provider chain."

2019-09-13 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16565.
-
Resolution: Workaround

> Fix "com.amazonaws.SdkClientException: Unable to find a region via the region 
> provider chain."
> --
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Commented] (HADOOP-16565) Fix "com.amazonaws.SdkClientException: Unable to find a region via the region provider chain."

2019-09-13 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929284#comment-16929284
 ] 

Gabor Bota commented on HADOOP-16565:
-

It worked with the following STS config:

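{noformat}
<property>
  <name>test.sts.endpoint</name>
  <description>Specific endpoint to use for STS requests.</description>
  <value>sts.amazonaws.com</value>
</property>

<property>
  <name>sts.london.endpoint</name>
  <value>sts.eu-west-2.amazonaws.com</value>
</property>
<property>
  <name>sts.london.region</name>
  <value>eu-west-2</value>
</property>

<property>
  <name>sts.ireland.endpoint</name>
  <value>sts.eu-west-1.amazonaws.com</value>
</property>

<property>
  <name>fs.s3a.assumed.role.sts.endpoint</name>
  <value>${sts.london.endpoint}</value>
</property>
<property>
  <name>fs.s3a.assumed.role.sts.endpoint.region</name>
  <value>${sts.london.region}</value>
</property>
{noformat}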
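
The config above depends on ${...} references to other properties, which Hadoop configuration files allow. As a rough illustration of how those references resolve to a concrete endpoint and region, here is a plain-Java sketch that emulates the substitution. StsConfigSketch and its methods are hypothetical names and this is not Hadoop's Configuration class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative emulation of ${name} substitution; not Hadoop's Configuration. */
final class StsConfigSketch {
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    /** The London properties from the comment, as plain key/value pairs. */
    static Map<String, String> stsProperties() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("sts.london.endpoint", "sts.eu-west-2.amazonaws.com");
        p.put("sts.london.region", "eu-west-2");
        p.put("fs.s3a.assumed.role.sts.endpoint", "${sts.london.endpoint}");
        p.put("fs.s3a.assumed.role.sts.endpoint.region", "${sts.london.region}");
        return p;
    }

    /** Resolves ${name} references; unknown names are left untouched. */
    static String resolve(Map<String, String> props, String key) {
        String value = props.get(key);
        if (value == null) {
            return null;
        }
        // Bounded number of passes so a reference cycle cannot loop forever.
        for (int depth = 0; depth < 20; depth++) {
            Matcher m = REF.matcher(value);
            StringBuffer out = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String target = props.get(m.group(1));
                if (target != null) {
                    changed = true;
                }
                String replacement = target != null ? target : m.group();
                m.appendReplacement(out, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(out);
            value = out.toString();
            if (!changed) {
                break;
            }
        }
        return value;
    }
}
```

Resolving fs.s3a.assumed.role.sts.endpoint.region this way yields an explicit region, which is the piece the region provider chain could not find on its own.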

> Fix "com.amazonaws.SdkClientException: Unable to find a region via the region 
> provider chain."
> --
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Commented] (HADOOP-16565) Fix "com.amazonaws.SdkClientException: Unable to find a region via the region provider chain."

2019-09-13 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929242#comment-16929242
 ] 

Gabor Bota commented on HADOOP-16565:
-

I don't have any STS endpoint configured besides
{noformat}
<property>
  <name>test.sts.endpoint</name>
  <description>Specific endpoint to use for STS requests.</description>
  <value>sts.amazonaws.com</value>
</property>
{noformat}
so I haven't changed a single setting. I was not getting this error before, but 
now I am. 


> Fix "com.amazonaws.SdkClientException: Unable to find a region via the region 
> provider chain."
> --
>
> Key: HADOOP-16565
> URL: https://issues.apache.org/jira/browse/HADOOP-16565
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The error was found during testing in the following tests:
> {noformat}
> [ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
> Unable to f...
> [ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find 
> a region v...
> [ERROR]   
> ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
>  ? SdkClient
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
> Unable to ...
> [ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
> SdkClient Unabl...
> [ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
> Unable to ...
> [ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
> SdkClient ...
> [ERROR]   
> ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   
> ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 
> ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
> SdkClient Unab...
> [ERROR]   
> ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
>  ? SdkClient
> [ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
> SdkClient Una...
> [ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
> Unable to find...
> [ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
> Unable to find...
> {noformat}






[jira] [Resolved] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16566.
-
Resolution: Fixed

> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Commented] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928734#comment-16928734
 ] 

Gabor Bota commented on HADOOP-16566:
-

+1 by [~ste...@apache.org] on PR #1433.
Committing.

> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Resolved] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-09-12 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16423.
-
Resolution: Fixed

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part only logs the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all metadata 
> is compared to the MS.
> The reverse direction, walking the MS and comparing it to S3, is not covered.
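
The one-directional check described above can be sketched as follows, assuming each store is reduced to a map from path to file length. OneWayFsckSketch and the finding strings are hypothetical and only illustrate the direction of the walk, not the actual S3GuardFsck implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative S3-to-metadata-store comparison; not the S3GuardFsck code. */
final class OneWayFsckSketch {
    /**
     * Walks the S3 listing only and reports entries that are missing from,
     * or differ in, the metadata store (MS). Entries that exist only in the
     * MS are never visited, mirroring the scope of this issue.
     */
    static List<String> compareS3ToMetadataStore(Map<String, Long> s3,
                                                 Map<String, Long> ms) {
        List<String> findings = new ArrayList<>();
        for (Map.Entry<String, Long> entry : s3.entrySet()) {
            Long msLength = ms.get(entry.getKey());
            if (msLength == null) {
                findings.add("missing in MS: " + entry.getKey());
            } else if (!msLength.equals(entry.getValue())) {
                findings.add("length mismatch: " + entry.getKey());
            }
        }
        return findings;
    }
}
```

The findings are only collected for logging; nothing in the sketch attempts a repair.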






[jira] [Commented] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928693#comment-16928693
 ] 

Gabor Bota commented on HADOOP-16566:
-

maybe, but it would be better to update guava everywhere.

> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Work started] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-16566 started by Gabor Bota.
---
> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Created] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-16566:
---

 Summary: S3Guard fsck: Use org.apache.hadoop.util.StopWatch 
instead of com.google.common.base.Stopwatch
 Key: HADOOP-16566
 URL: https://issues.apache.org/jira/browse/HADOOP-16566
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota
Assignee: Gabor Bota


Some distributions won't have the updated guava, and the 
{{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
the newer ones. Fix this issue by using Hadoop's own 
{{org.apache.hadoop.util.StopWatch}} instead.
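
As a dependency-free illustration of the same idea, timing a scan without relying on a particular Guava version, here is a small sketch built on System.nanoTime. ScanTimer is a hypothetical name; it is neither Hadoop's StopWatch nor the change made for this issue.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Minimal elapsed-time helper with no third-party dependency (illustrative). */
final class ScanTimer {
    private final LongSupplier nanoClock;
    private long startNanos;
    private long elapsedNanos;
    private boolean running;

    ScanTimer() {
        this(System::nanoTime);
    }

    /** The clock is injectable so the timer can be exercised deterministically. */
    ScanTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    ScanTimer start() {
        if (running) {
            throw new IllegalStateException("already running");
        }
        running = true;
        startNanos = nanoClock.getAsLong();
        return this;
    }

    ScanTimer stop() {
        if (!running) {
            throw new IllegalStateException("not running");
        }
        elapsedNanos += nanoClock.getAsLong() - startNanos;
        running = false;
        return this;
    }

    /** Total time measured so far, including the current run if still active. */
    long elapsedMillis() {
        long total = running
            ? elapsedNanos + (nanoClock.getAsLong() - startNanos)
            : elapsedNanos;
        return TimeUnit.NANOSECONDS.toMillis(total);
    }
}
```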






[jira] [Updated] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16566:

Component/s: fs/s3

> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Updated] (HADOOP-16566) S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of com.google.common.base.Stopwatch

2019-09-12 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16566:

Affects Version/s: 3.3.0

> S3Guard fsck: Use org.apache.hadoop.util.StopWatch instead of 
> com.google.common.base.Stopwatch
> --
>
> Key: HADOOP-16566
> URL: https://issues.apache.org/jira/browse/HADOOP-16566
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> Some distributions won't have the updated guava, and the 
> {{com.google.common.base.Stopwatch}} API that fsck uses is only available in 
> the newer ones. Fix this issue by using Hadoop's own 
> {{org.apache.hadoop.util.StopWatch}} instead.






[jira] [Created] (HADOOP-16565) Fix "com.amazonaws.SdkClientException: Unable to find a region via the region provider chain."

2019-09-12 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-16565:
---

 Summary: Fix "com.amazonaws.SdkClientException: Unable to find a 
region via the region provider chain."
 Key: HADOOP-16565
 URL: https://issues.apache.org/jira/browse/HADOOP-16565
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota
Assignee: Gabor Bota


The error was found during testing in the following tests:
{noformat}
[ERROR]   ITestS3ATemporaryCredentials.testInvalidSTSBinding:257 ? SdkClient 
Unable to f...
[ERROR]   ITestS3ATemporaryCredentials.testSTS:130 ? SdkClient Unable to find a 
region v...
[ERROR]   
ITestS3ATemporaryCredentials.testSessionRequestExceptionTranslation:441->lambda$testSessionRequestExceptionTranslation$5:442
 ? SdkClient
[ERROR]   ITestS3ATemporaryCredentials.testSessionTokenExpiry:222 ? SdkClient 
Unable to ...
[ERROR]   ITestS3ATemporaryCredentials.testSessionTokenPropagation:193 ? 
SdkClient Unabl...
[ERROR]   ITestDelegatedMRJob.testJobSubmissionCollectsTokens:286 ? SdkClient 
Unable to ...
[ERROR]   ITestSessionDelegationInFileystem.testAddTokensFromFileSystem:235 ? 
SdkClient ...
[ERROR]   
ITestSessionDelegationInFileystem.testCanRetrieveTokenFromCurrentUserCreds:260->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
 ? SdkClient
[ERROR]   
ITestSessionDelegationInFileystem.testDTCredentialProviderFromCurrentUserCreds:278->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
 ? SdkClient
[ERROR]   
ITestSessionDelegationInFileystem.testDelegatedFileSystem:308->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
 ? SdkClient
[ERROR]   
ITestSessionDelegationInFileystem.testDelegationBindingMismatch1:432->createDelegationTokens:292->AbstractDelegationIT.mkTokens:88
 ? SdkClient
[ERROR]   ITestSessionDelegationInFileystem.testFileSystemBoundToCreator:681 ? 
SdkClient
[ERROR]   ITestSessionDelegationInFileystem.testGetDTfromFileSystem:212 ? 
SdkClient Unab...
[ERROR]   
ITestSessionDelegationInFileystem.testHDFSFetchDTCommand:606->lambda$testHDFSFetchDTCommand$3:607
 ? SdkClient
[ERROR]   ITestSessionDelegationInFileystem.testYarnCredentialPickup:576 ? 
SdkClient Una...
[ERROR]   ITestSessionDelegationTokens.testCreateAndUseDT:176 ? SdkClient 
Unable to find...
[ERROR]   ITestSessionDelegationTokens.testSaveLoadTokens:121 ? SdkClient 
Unable to find...
{noformat}






[jira] [Commented] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-09-12 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928449#comment-16928449
 ] 

Gabor Bota commented on HADOOP-16423:
-

+1 from [~ste...@apache.org] on PR #1208. Committing.
Created followup jiras:
HADOOP-16564 - docs
HADOOP-16563 - authoritative paths


> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part only logs the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all metadata 
> is compared to the MS.
> The reverse direction, walking the MS and comparing it to S3, is not covered.






[jira] [Created] (HADOOP-16564) S3Guard fsck: Add docs to the first iteration (S3->ddbMS, -verify)

2019-09-12 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-16564:
---

 Summary: S3Guard fsck: Add docs to the first iteration 
(S3->ddbMS, -verify)
 Key: HADOOP-16564
 URL: https://issues.apache.org/jira/browse/HADOOP-16564
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota


Follow-up for HADOOP-16423.
Add Markdown documentation, including how to extend fsck with new violations.






[jira] [Created] (HADOOP-16563) S3Guard fsck: Detect if a directory is authoritative and highlight errors if detected in it

2019-09-12 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-16563:
---

 Summary: S3Guard fsck: Detect if a directory is authoritative and 
highlight errors if detected in it
 Key: HADOOP-16563
 URL: https://issues.apache.org/jira/browse/HADOOP-16563
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota


Followup from HADOOP-16423.

One of the changes in the HADOOP-16430 PR is that we now have an S3A FS 
method boolean allowAuthoritative(final Path path) that takes a path and 
returns true iff it is authoritative, either because the MS is authoritative or 
because the given path is marked as one of the authoritative dirs. I think the 
check that an authoritative directory is consistent between the metastore and 
S3 should use this method when it wants to highlight that an authoritative 
path is inconsistent.

This can be a follow-on patch because, as usual, it will need more tests in the 
code and someone to try out the command line.
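
The rule described for allowAuthoritative can be sketched in plain Java as below. This is a simplified illustration using strings instead of Hadoop Path objects; AuthoritativePathSketch is a hypothetical name, and treating everything underneath a marked directory as authoritative is an assumption of the sketch, not something stated in this issue.

```java
import java.util.List;

/** Illustrative version of the "is this path authoritative?" rule. */
final class AuthoritativePathSketch {
    /**
     * True if the whole metadata store is authoritative, or if the path is
     * one of the authoritative directories or (assumption) lies beneath one.
     */
    static boolean allowAuthoritative(String path,
                                      boolean metadataStoreAuthoritative,
                                      List<String> authoritativeDirs) {
        if (metadataStoreAuthoritative) {
            return true;
        }
        String candidate = withTrailingSlash(path);
        for (String dir : authoritativeDirs) {
            if (candidate.startsWith(withTrailingSlash(dir))) {
                return true;
            }
        }
        return false;
    }

    /** Normalizes to a trailing slash so "/tables" does not match "/tablesX". */
    private static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }
}
```

A fsck check could call such a predicate for each inconsistent path and raise the reported severity when it returns true.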






[jira] [Commented] (HADOOP-16507) S3Guard fsck: Add option to configure severity (level) for the scan

2019-09-10 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926405#comment-16926405
 ] 

Gabor Bota commented on HADOOP-16507:
-

++ additional task: 
Add `-verbose` arg to the fsck with more output.
Use 
https://github.com/steveloughran/cloudstore/blob/master/src/main/java/org/apache/hadoop/fs/store/StoreEntryPoint.java#L317
 to dump FS stats.
[~ste...@apache.org]

> S3Guard fsck: Add option to configure severity (level) for the scan
> ---
>
> Key: HADOOP-16507
> URL: https://issues.apache.org/jira/browse/HADOOP-16507
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Priority: Major
>
> Each Violation (inconsistency) has a severity defined in 
> {{org.apache.hadoop.fs.s3a.s3guard.S3GuardFsck.Violation}}. 
> Currently this value only records the severity of the Violation; it is not 
> used to filter the scan by issue severity.
> The task: use the severity level to decide which issues should be logged 
> and/or fixed during the scan. 
> Note: the best way to avoid possible code duplication would be to not even 
> add the consistency violation pair to the list of violations during the scan.
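
The note above suggests filtering at the point where a violation would be recorded, rather than afterwards. A minimal sketch of that approach follows. The violation names, the numeric severities, and the convention that a higher number means more severe are all assumptions made for illustration; they are not taken from S3GuardFsck.Violation.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative severity filter applied while scanning, not after. */
final class SeverityFilterSketch {
    /** Hypothetical violation kinds; higher number is assumed more severe. */
    enum Violation {
        MISSING_ENTRY(2),
        SIZE_MISMATCH(1),
        TIMESTAMP_DRIFT(0);

        final int severity;

        Violation(int severity) {
            this.severity = severity;
        }
    }

    private final int minSeverity;
    private final List<Violation> recorded = new ArrayList<>();

    SeverityFilterSketch(int minSeverity) {
        this.minSeverity = minSeverity;
    }

    /** Records the violation only if it meets the configured level. */
    void report(Violation violation) {
        if (violation.severity >= minSeverity) {
            recorded.add(violation);
        }
    }

    List<Violation> recorded() {
        return recorded;
    }
}
```

Because low-severity findings never enter the list, the logging and fixing code paths need no filtering logic of their own.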






[jira] [Commented] (HADOOP-16550) Spark config name error on the Launching Applications Using Docker Containers page

2019-09-06 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924058#comment-16924058
 ] 

Gabor Bota commented on HADOOP-16550:
-

Merged PR #9

> Spark config name error on the Launching Applications Using Docker Containers 
> page
> --
>
> Key: HADOOP-16550
> URL: https://issues.apache.org/jira/browse/HADOOP-16550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.9.0, 2.8.2, 2.8.3, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 2.8.4, 
> 3.0.2, 3.1.1, 2.9.2, 3.0.3, 2.8.5, 3.1.2
>Reporter: Attila Zsolt Piros
>Priority: Major
>
> On the "Launching Applications Using Docker Containers" page at the "Example: 
> Spark" section the Spark config for configuring the environment variables for 
> the application master the config prefix are wrong:
> - 
> spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE 
>  
> The correct ones:
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
> See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:
> {quote}
> spark.yarn.appMasterEnv.[EnvironmentVariableName]
> {quote}






[jira] [Resolved] (HADOOP-16550) Spark config name error on the Launching Applications Using Docker Containers page

2019-09-06 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16550.
-
Resolution: Fixed

> Spark config name error on the Launching Applications Using Docker Containers 
> page
> --
>
> Key: HADOOP-16550
> URL: https://issues.apache.org/jira/browse/HADOOP-16550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.9.0, 2.8.2, 2.8.3, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 2.8.4, 
> 3.0.2, 3.1.1, 2.9.2, 3.0.3, 2.8.5, 3.1.2
>Reporter: Attila Zsolt Piros
>Priority: Major
>
> On the "Launching Applications Using Docker Containers" page at the "Example: 
> Spark" section the Spark config for configuring the environment variables for 
> the application master the config prefix are wrong:
> - 
> spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE 
>  
> The correct ones:
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
> See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:
> {quote}
> spark.yarn.appMasterEnv.[EnvironmentVariableName]
> {quote}






[jira] [Updated] (HADOOP-16550) Spark config name error on the Launching Applications Using Docker Containers page

2019-09-05 Thread Gabor Bota (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16550:

Description: 
On the "Launching Applications Using Docker Containers" page, in the "Example: 
Spark" section, the config prefixes used to set environment variables for the 
application master are wrong:
- spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
- spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE

The correct ones:
- spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
- spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE

See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:

{quote}
spark.yarn.appMasterEnv.[EnvironmentVariableName]
{quote}


  was:
On the "Launching Applications Using Docker Containers" page, in the "Example: 
Spark" section, the config prefixes used to set environment variables for the 
application master are wrong:
- spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
- park.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE

The correct ones:
- spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
- spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE

See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:

{quote}
spark.yarn.appMasterEnv.[EnvironmentVariableName]
{quote}



> Spark config name error on the Launching Applications Using Docker Containers 
> page
> --
>
> Key: HADOOP-16550
> URL: https://issues.apache.org/jira/browse/HADOOP-16550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.9.0, 2.8.2, 2.8.3, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 2.8.4, 
> 3.0.2, 3.1.1, 2.9.2, 3.0.3, 2.8.5, 3.1.2
>Reporter: Attila Zsolt Piros
>Priority: Major
>
> On the "Launching Applications Using Docker Containers" page, in the "Example: 
> Spark" section, the config prefixes used to set environment variables for the 
> application master are wrong:
> - spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
>  
> The correct ones:
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
> See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:
> {quote}
> spark.yarn.appMasterEnv.[EnvironmentVariableName]
> {quote}






[jira] [Commented] (HADOOP-16550) Spark config name error on the Launching Applications Using Docker Containers page

2019-09-05 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923501#comment-16923501
 ] 

Gabor Bota commented on HADOOP-16550:
-

Sure, thanks for the contribution [~attilapiros].
LGTM, +1 on the PR.

> Spark config name error on the Launching Applications Using Docker Containers 
> page
> --
>
> Key: HADOOP-16550
> URL: https://issues.apache.org/jira/browse/HADOOP-16550
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.9.0, 2.8.2, 2.8.3, 3.0.0, 3.1.0, 2.9.1, 3.0.1, 2.8.4, 
> 3.0.2, 3.1.1, 2.9.2, 3.0.3, 2.8.5, 3.1.2
>Reporter: Attila Zsolt Piros
>Priority: Major
>
> On the "Launching Applications Using Docker Containers" page, in the "Example: 
> Spark" section, the config prefixes used to set environment variables for the 
> application master are wrong:
> - spark.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - park.yarn.{color:#DE350B}*A*{color}ppMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
> The correct ones:
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE
> - spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE
> See https://spark.apache.org/docs/2.4.0/running-on-yarn.html:
> {quote}
> spark.yarn.appMasterEnv.[EnvironmentVariableName]
> {quote}






[jira] [Commented] (HADOOP-16529) Allow AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from abfs.xml property

2019-08-22 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913420#comment-16913420
 ] 

Gabor Bota commented on HADOOP-16529:
-

Note: a possible fix for this could be to add 
{code:java}
conf.setBoolean(AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION, 
false);
{code}
to the constructor of the test class which extends 
{{AbstractAbfsIntegrationTest}}

> Allow AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from 
> abfs.xml property
> ---
>
> Key: HADOOP-16529
> URL: https://issues.apache.org/jira/browse/HADOOP-16529
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> In 
> org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest#AbstractAbfsIntegrationTest
>  we do a
> {code:java}
> 
> abfsConfig.setBoolean(AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION, 
> true);
> {code}
> which is not good for some test cases (e.g. HADOOP-16138) where we want to 
> test against a container that does not exist.
> A property should be added so that this can be overridden.






[jira] [Created] (HADOOP-16529) Allow AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from abfs.xml property

2019-08-22 Thread Gabor Bota (Jira)
Gabor Bota created HADOOP-16529:
---

 Summary: Allow 
AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION to be set from abfs.xml 
property
 Key: HADOOP-16529
 URL: https://issues.apache.org/jira/browse/HADOOP-16529
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.3.0
Reporter: Gabor Bota
Assignee: Gabor Bota


In 
org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest#AbstractAbfsIntegrationTest
 we do a
{code:java}
abfsConfig.setBoolean(AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION, 
true);
{code}
which is not good for some test cases (e.g. HADOOP-16138) where we want to test 
against a container that does not exist.


A property should be added to be able to override this.






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-22 Thread Gabor Bota (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913415#comment-16913415
 ] 

Gabor Bota commented on HADOOP-16138:
-

I found out what's happening. We auto-create all containers during the test 
run, so I should have turned that feature off for this test. I forgot to do 
that, and so I ended up with a lot of random containers in the testing account.


In 
{{org.apache.hadoop.fs.azurebfs.AbstractAbfsIntegrationTest#AbstractAbfsIntegrationTest}}
 we do a 
{code:java}
abfsConfig.setBoolean(AZURE_CREATE_REMOTE_FILESYSTEM_DURING_INITIALIZATION, 
true);
{code}

which is not good, so I'll create a patch where this can be passed as a 
parameter.
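
A rough sketch of what passing this as a parameter could look like, written with stand-in classes so that it compiles on its own. Everything here is hypothetical: {{SketchConf}} stands in for the Hadoop configuration object, {{BaseTest}} for {{AbstractAbfsIntegrationTest}}, and the key string is a placeholder rather than the real value of the constant. It is not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch; none of these classes are the real ABFS test classes.
class AbfsTestFlagSketch {
  // Placeholder key; the real constant's value is not shown in this thread.
  static final String CREATE_FS_KEY = "sketch.create.remote.filesystem";

  // Minimal stand-in for a configuration object with boolean flags.
  static class SketchConf {
    private final Map<String, Boolean> flags = new HashMap<>();

    void setBoolean(String key, boolean value) {
      flags.put(key, value);
    }

    boolean getBoolean(String key, boolean defaultValue) {
      return flags.getOrDefault(key, defaultValue);
    }
  }

  // Stand-in base class: the auto-create flag becomes a constructor parameter.
  static abstract class BaseTest {
    protected final SketchConf abfsConfig = new SketchConf();

    // Keeps today's behaviour for tests that do not care.
    protected BaseTest() {
      this(true);
    }

    protected BaseTest(boolean createRemoteFileSystem) {
      abfsConfig.setBoolean(CREATE_FS_KEY, createRemoteFileSystem);
    }

    boolean createsContainerOnInit() {
      return abfsConfig.getBoolean(CREATE_FS_KEY, false);
    }
  }

  // A test that relies on the container being created automatically.
  static class ExistingContainerTest extends BaseTest {
  }

  // A test against a nonexistent container turns auto-creation off.
  static class MissingContainerTest extends BaseTest {
    MissingContainerTest() {
      super(false);
    }
  }

  public static void main(String[] args) {
    System.out.println(new ExistingContainerTest().createsContainerOnInit());
    System.out.println(new MissingContainerTest().createsContainerOnInit());
  }
}
```

With a parameter on the base constructor, only the tests that need a missing container have to opt out, and all other tests keep the current default.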

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-15 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908308#comment-16908308
 ] 

Gabor Bota commented on HADOOP-16138:
-

Based on an offline discussion with [~mackrorysd], we agreed that this output 
is not what we would like to see.
Something like "{{The container does not exist: [nameofthecontainer].}}" is 
much better than what we currently have.
I'll update my PR accordingly.
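
As a sketch of the message proposed above: the class and method names below are invented for illustration, the {{containerExists}} flag stands in for whatever probe the real client performs, and the use of {{FileNotFoundException}} is my assumption, not something decided in this thread.

```java
import java.io.FileNotFoundException;

// Hypothetical helper, for illustration only.
class ContainerCheck {
  // Formats the message in the shape proposed in the comment above.
  static String missingContainerMessage(String container) {
    return "The container does not exist: " + container + ".";
  }

  // Fails with a readable message instead of letting a later NPE surface.
  // containerExists is a stand-in for the real existence probe.
  static void requireContainer(String container, boolean containerExists)
      throws FileNotFoundException {
    if (!containerExists) {
      throw new FileNotFoundException(missingContainerMessage(container));
    }
  }

  public static void main(String[] args) {
    try {
      requireContainer("nonexistent-container", false);
    } catch (FileNotFoundException e) {
      System.out.println("mkdir: " + e.getMessage());
    }
  }
}
```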

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-15 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16138:

Status: In Progress  (was: Patch Available)

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Updated] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-15 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16138:

Status: Patch Available  (was: In Progress)

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-15 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908062#comment-16908062
 ] 

Gabor Bota commented on HADOOP-16138:
-

So I've created a test for it (new test class for testing CLI with ABFS).

The test was not failing. The output is:
{{mkdir: 
`abfs://nonexistent-3ab66e98-66dc-4e7b-879d-16e323bd2...@mycontainer.dfs.core.windows.net/':
 File exists}}

Next, I tried to run from a {{dist}}, where the output was the same. So I guess 
this got fixed, but we could still add the test I've created for this, so it's 
ready for review!



> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Comment Edited] (HADOOP-16416) mark DynamoDBMetadataStore.deleteTrackingValueMap as final

2019-08-15 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908006#comment-16908006
 ] 

Gabor Bota edited comment on HADOOP-16416 at 8/15/19 11:28 AM:
---

You forgot to add {{final}} in the v003 patch.


was (Author: gabor.bota):
+1 on patch v003, even if checkstyle has its own issues with this - I think 
{{final static}} field names should be uppercase.

> mark DynamoDBMetadataStore.deleteTrackingValueMap as final
> --
>
> Key: HADOOP-16416
> URL: https://issues.apache.org/jira/browse/HADOOP-16416
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: kevin su
>Priority: Trivial
> Attachments: HADOOP-16416.001.patch, HADOOP-16416.002.patch, 
> HADOOP-16416.003.patch
>
>
> S3Guard's {{DynamoDBMetadataStore.deleteTrackingValueMap}} field is static 
> and can/should be marked as final; its name changed to upper case to match 
> the coding conventions.






[jira] [Commented] (HADOOP-16416) mark DynamoDBMetadataStore.deleteTrackingValueMap as final

2019-08-15 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908006#comment-16908006
 ] 

Gabor Bota commented on HADOOP-16416:
-

+1 on patch v003, even if checkstyle has its own issues with this - I think 
{{final static}} field names should be uppercase.

> mark DynamoDBMetadataStore.deleteTrackingValueMap as final
> --
>
> Key: HADOOP-16416
> URL: https://issues.apache.org/jira/browse/HADOOP-16416
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: kevin su
>Priority: Trivial
> Attachments: HADOOP-16416.001.patch, HADOOP-16416.002.patch, 
> HADOOP-16416.003.patch
>
>
> S3Guard's {{DynamoDBMetadataStore.deleteTrackingValueMap}} field is static 
> and can/should be marked as final; its name changed to upper case to match 
> the coding conventions.






[jira] [Commented] (HADOOP-16505) Add ability to register custom signer with AWS SignerFactory

2019-08-14 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907281#comment-16907281
 ] 

Gabor Bota commented on HADOOP-16505:
-

Thanks for working on this [~viczsaurav], and thanks [~jojochuang] for 
notifying us!

I think the PR must be the current version, because it includes at least one 
test for this.

[~viczsaurav], could you include an integration test for this change? Also, 
please run all integration tests against an AWS endpoint with at least these 
parameters: {{mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Ds3guard 
-Ddynamo}}, and tell us whether the run was successful. It would be nice to 
show that the tests won't fail with the signer changed.


> Add ability to register custom signer with AWS SignerFactory
> 
>
> Key: HADOOP-16505
> URL: https://issues.apache.org/jira/browse/HADOOP-16505
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, hadoop-aws
>Affects Versions: 3.3.0
>Reporter: Saurav Verma
>Assignee: Saurav Verma
>Priority: Major
> Attachments: HADOOP-16505.patch, hadoop-16505-1.patch
>
>
> Currently, the AWS SignerFactory restricts the class of Signer algorithms 
> that can be used. 
> We require an ability to register a custom Signer. The SignerFactory supports 
> this functionality through its {{registerSigner}} method. 
> By providing a fully qualified classname to the existing parameter 
> {{fs.s3a.signing-algorithm}}, the custom signer can be registered.






[jira] [Commented] (HADOOP-16416) mark DynamoDBMetadataStore.deleteTrackingValueMap as final

2019-08-14 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907171#comment-16907171
 ] 

Gabor Bota commented on HADOOP-16416:
-

Thanks for working on this [~pingsutw]!

Please note that we write constant names in upper case, and use underscores 
instead of camelCase to separate the words from each other.
For reference, see e.g. {{org.apache.hadoop.fs.s3a.Constants}}.
In this case, deleteTrackingValueMap would be DELETE_TRACKING_VALUE_MAP.
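
To illustrate the convention, here is a small sketch. It is not the real {{DynamoDBMetadataStore}} code: the map type, its single entry, and the helper {{toConstantName}} are all made up for the example. Only the field name and its upper-case form come from this issue.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustration of the naming convention only; not the real metadata store.
class ConstantNamingSketch {
  // A static final field gets an UPPER_CASE name with underscores.
  private static final Map<String, Object> DELETE_TRACKING_VALUE_MAP =
      createValueMap();

  private static Map<String, Object> createValueMap() {
    Map<String, Object> values = new HashMap<>();
    values.put(":false", Boolean.FALSE); // hypothetical entry
    return Collections.unmodifiableMap(values);
  }

  static Map<String, Object> deleteTrackingValues() {
    return DELETE_TRACKING_VALUE_MAP;
  }

  // Hypothetical helper: converts a camelCase name to the constant style by
  // inserting an underscore before each upper-case letter that follows a
  // lower-case letter or digit, then upper-casing the result.
  static String toConstantName(String camelCase) {
    return camelCase
        .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
        .toUpperCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(toConstantName("deleteTrackingValueMap"));
  }
}
```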

> mark DynamoDBMetadataStore.deleteTrackingValueMap as final
> --
>
> Key: HADOOP-16416
> URL: https://issues.apache.org/jira/browse/HADOOP-16416
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: kevin su
>Priority: Trivial
> Attachments: HADOOP-16416.001.patch, HADOOP-16416.002.patch
>
>
> S3Guard's {{DynamoDBMetadataStore.deleteTrackingValueMap}} field is static 
> and can/should be marked as final; its name changed to upper case to match 
> the coding conventions.






[jira] [Resolved] (HADOOP-16500) S3ADelegationTokens to only log at debug on startup

2019-08-14 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16500.
-
   Resolution: Fixed
Fix Version/s: 3.3.0

Committed to trunk.

> S3ADelegationTokens to only log at debug on startup
> ---
>
> Key: HADOOP-16500
> URL: https://issues.apache.org/jira/browse/HADOOP-16500
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.3.0
>
>
> downgrade the log at info to log at debug when S3A comes up with DT support. 
> Otherwise it's too noisy.
> Things still get printed when tokens are created.






[jira] [Commented] (HADOOP-16500) S3ADelegationTokens to only log at debug on startup

2019-08-14 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907034#comment-16907034
 ] 

Gabor Bota commented on HADOOP-16500:
-

+1 for GitHub Pull Request #1269. Committing this.

> S3ADelegationTokens to only log at debug on startup
> ---
>
> Key: HADOOP-16500
> URL: https://issues.apache.org/jira/browse/HADOOP-16500
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> downgrade the log at info to log at debug when S3A comes up with DT support. 
> Otherwise it's too noisy.
> Things still get printed when tokens are created.






[jira] [Created] (HADOOP-16507) S3Guard fsck: Add option to configure severity (level) for the scan

2019-08-12 Thread Gabor Bota (JIRA)
Gabor Bota created HADOOP-16507:
---

 Summary: S3Guard fsck: Add option to configure severity (level) 
for the scan
 Key: HADOOP-16507
 URL: https://issues.apache.org/jira/browse/HADOOP-16507
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.0
Reporter: Gabor Bota


Each Violation (inconsistency) has a severity, defined in 
{{org.apache.hadoop.fs.s3a.s3guard.S3GuardFsck.Violation}}. 

This value currently only describes the severity of the Violation; it is not 
used to filter the scan by issue severity.

The task: use the severity level to decide which issues should be logged 
and/or fixed during the scan. 
Note: the best way to avoid possible code duplication would be to not even add 
the consistency violation pair to the list of violations during the scan.
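
A minimal sketch of that idea, with invented names: the enum constants, the integer severity values, and {{recordIfSevereEnough}} are assumptions for illustration and do not mirror the real {{S3GuardFsck.Violation}} enum. The point is only that the filter is applied when a violation is found, so nothing below the threshold ever reaches the list.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only; names and severity values are hypothetical.
class SeverityFilterSketch {
  enum Violation {
    NO_ETAG(0),
    LENGTH_MISMATCH(1),
    ETAG_MISMATCH(2);

    final int severity;

    Violation(int severity) {
      this.severity = severity;
    }
  }

  // Adds the violation only if it meets the configured minimum severity,
  // so later logging/fixing code never sees filtered-out entries.
  static void recordIfSevereEnough(List<Violation> violations,
      Violation found, int minSeverity) {
    if (found.severity >= minSeverity) {
      violations.add(found);
    }
  }

  // Simulates a scan that encounters every violation type once.
  static List<Violation> scan(int minSeverity) {
    List<Violation> violations = new ArrayList<>();
    for (Violation found : Violation.values()) {
      recordIfSevereEnough(violations, found, minSeverity);
    }
    return violations;
  }

  public static void main(String[] args) {
    System.out.println(scan(0));
    System.out.println(scan(2));
  }
}
```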






[jira] [Commented] (HADOOP-16138) hadoop fs mkdir / of nonexistent abfs container raises NPE

2019-08-12 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905158#comment-16905158
 ] 

Gabor Bota commented on HADOOP-16138:
-

Not recently, but I plan to fix this this week.

> hadoop fs mkdir / of nonexistent abfs container raises NPE
> --
>
> Key: HADOOP-16138
> URL: https://issues.apache.org/jira/browse/HADOOP-16138
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Gabor Bota
>Priority: Minor
>
> If you try to do a mkdir on the root of a nonexistent container, you get an 
> NPE
> {code}
> hadoop fs -mkdir  abfs://contain...@abfswales1.dfs.core.windows.net/  
> {code}






[jira] [Commented] (HADOOP-16481) ITestS3GuardDDBRootOperations.test_300_MetastorePrune needs to set region

2019-08-09 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903982#comment-16903982
 ] 

Gabor Bota commented on HADOOP-16481:
-

Thanks for working on this [~ste...@apache.org]; +1; committed {{GitHub Pull 
Request #1209}} to trunk.

> ITestS3GuardDDBRootOperations.test_300_MetastorePrune needs to set region
> -
>
> Key: HADOOP-16481
> URL: https://issues.apache.org/jira/browse/HADOOP-16481
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> The new  test {{ITestS3GuardDDBRootOperations.test_300_MetastorePrune}} fails 
> if you don't explicitly set the region
> {code}
> [ERROR] 
> test_300_MetastorePrune(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardDDBRootOperations)
>   Time elapsed: 0.845 s  <<< ERROR!
> org.apache.hadoop.util.ExitUtil$ExitException: No region found from -region 
> flag, config, or S3 bucket
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardDDBRootOperations.test_300_MetastorePrune(ITestS3GuardDDBRootOperations.java:186)
> {code}
> it should be picked up from the test filesystem.






[jira] [Resolved] (HADOOP-16481) ITestS3GuardDDBRootOperations.test_300_MetastorePrune needs to set region

2019-08-09 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16481.
-
  Resolution: Fixed
   Fix Version/s: 3.3.0
Target Version/s: 3.3.0

> ITestS3GuardDDBRootOperations.test_300_MetastorePrune needs to set region
> -
>
> Key: HADOOP-16481
> URL: https://issues.apache.org/jira/browse/HADOOP-16481
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.3.0
>
>
> The new  test {{ITestS3GuardDDBRootOperations.test_300_MetastorePrune}} fails 
> if you don't explicitly set the region
> {code}
> [ERROR] 
> test_300_MetastorePrune(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardDDBRootOperations)
>   Time elapsed: 0.845 s  <<< ERROR!
> org.apache.hadoop.util.ExitUtil$ExitException: No region found from -region 
> flag, config, or S3 bucket
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardDDBRootOperations.test_300_MetastorePrune(ITestS3GuardDDBRootOperations.java:186)
> {code}
> it should be picked up from the test filesystem.






[jira] [Resolved] (HADOOP-16499) S3A retry policy to be exponential

2019-08-09 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16499.
-
  Resolution: Fixed
   Fix Version/s: 3.3.0
Target Version/s: 3.3.0

> S3A retry policy to be exponential
> --
>
> Key: HADOOP-16499
> URL: https://issues.apache.org/jira/browse/HADOOP-16499
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0, 3.1.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
> Fix For: 3.3.0
>
>
> the fixed s3a retry policy doesn't leave big enough gaps for cached 404s to 
> expire; we can't recover from this
> HADOOP-16490 is a full fix for this, but one we can backport is moving from 
> fixed to exponential retries
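
To make the difference concrete, here is a small sketch comparing the total wait of a fixed policy with a doubling one. The base interval, attempt count, and method names are made-up example values, not the S3A defaults, and the sketch leaves out any jitter or cap a real policy might apply.

```java
// Sketch of fixed vs. exponential retry delays; all numbers are examples.
class RetryDelaySketch {
  // Fixed policy: every attempt waits the same interval.
  static long fixedDelayMs(long baseMs, int attempt) {
    return baseMs;
  }

  // Exponential policy: the wait doubles on each attempt (attempt is 0-based).
  static long exponentialDelayMs(long baseMs, int attempt) {
    return baseMs * (1L << attempt);
  }

  // Sums the waits over all attempts for the chosen policy.
  static long totalWaitMs(long baseMs, int attempts, boolean exponential) {
    long total = 0;
    for (int attempt = 0; attempt < attempts; attempt++) {
      total += exponential
          ? exponentialDelayMs(baseMs, attempt)
          : fixedDelayMs(baseMs, attempt);
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println("fixed total ms: " + totalWaitMs(500, 5, false));
    System.out.println("exponential total ms: " + totalWaitMs(500, 5, true));
  }
}
```

With the same base interval and number of attempts, the doubling policy spreads the retries over a much longer window, which is what gives a cached 404 a chance to expire.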






[jira] [Commented] (HADOOP-16499) S3A retry policy to be exponential

2019-08-09 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903905#comment-16903905
 ] 

Gabor Bota commented on HADOOP-16499:
-

+1 on GitHub Pull Request #1246; committing.

> S3A retry policy to be exponential
> --
>
> Key: HADOOP-16499
> URL: https://issues.apache.org/jira/browse/HADOOP-16499
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.2.0, 3.1.2
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Critical
>
> the fixed s3a retry policy doesn't leave big enough gaps for cached 404s to 
> expire; we can't recover from this
> HADOOP-16490 is a full fix for this, but one we can backport is moving from 
> fixed to exponential retries






[jira] [Created] (HADOOP-16502) Add fsck to S3A tests where additional diagnosis is needed

2019-08-09 Thread Gabor Bota (JIRA)
Gabor Bota created HADOOP-16502:
---

 Summary: Add fsck to S3A tests where additional diagnosis is needed
 Key: HADOOP-16502
 URL: https://issues.apache.org/jira/browse/HADOOP-16502
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Gabor Bota


Extend 
{{org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore#testPruneTombstoneUnderTombstone}}

{code:java}
// the child2 entry is still there, though it's now an orphan (the store isn't
// meeting the rule "all entries must have a parent which exists")
getFile(child2);

+ // todo create a raw fs
+ S3GuardFsck fsck = new S3GuardFsck(rawFs, ms);

// a full prune will still find and delete it, as this
// doesn't walk the tree
getDynamoMetadataStore().prune(PruneMode.ALL_BY_MODTIME,
now + MINUTE);
{code}

Extend 
{{org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore#testPutFileDeepUnderTombstone}}:

{code:java}
// now put the tombstone
putTombstone(base, now, null);
assertIsTombstone(base);

+ // todo create a raw fs for checking
+ S3GuardFsck fsck = new S3GuardFsck(rawFs, ms);

/*- */
/* Begin S3FileSystem.finishedWrite() sequence. */
/* -*/
AncestorState ancestorState = getDynamoMetadataStore()
.initiateBulkWrite(BulkOperationState.OperationType.Put,
childPath);
{code}



Add new test: 
{{org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardDDBRootOperations#test_070_run_fsck_on_store}}
{code:java}

  @Test
  public void test_070_run_fsck_on_store() throws Throwable {
    // todo create a raw fs
    S3AFileSystem rawFs = null; // placeholder until the raw fs is created
    S3GuardFsck s3GuardFsck = new S3GuardFsck(rawFs, metastore);
  }
{code}








[jira] [Comment Edited] (HADOOP-16423) S3Guarld fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:10 AM:
--

Based on an offline discussion, here is what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** e.g. an etag mismatch is defined as serious, but a missing etag is defined 
as a warning
 * The version id check is removed: to get the version id of an object we 
would need to do a HEAD request for that object. We walk the tree using S3 
directory listings, which is far more efficient in terms of the number of 
requests - we do only one request per directory, and only for directories. 
For version ids we would need a request for every single object, so the check 
will be removed altogether for now.
 * Error messages will stay in the handlers instead of being added to the 
enums: when writing the log message we need access to the pair (the FileStatus 
from both the MS and S3) to log where the error is, and we need to log parts 
of the FileStatus to show e.g. a mismatch.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs
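
The request-count argument for dropping the version id check can be put as simple arithmetic. The sketch below is only an illustration: the class and method names are invented, the directory and object counts are example numbers, and it assumes each directory listing takes a single request (it ignores paging of large listings).

```java
// Back-of-the-envelope request counts; names and numbers are examples only.
class RequestCountSketch {
  // Listing walk: one listing request per directory
  // (assumes a listing is never split across several pages).
  static long listingWalkRequests(long directories) {
    return directories;
  }

  // With the version id check: the listing walk plus one HEAD per object.
  static long walkWithVersionIdRequests(long directories, long objects) {
    return listingWalkRequests(directories) + objects;
  }

  public static void main(String[] args) {
    long directories = 100;
    long objects = 10_000;
    System.out.println("listing only: " + listingWalkRequests(directories));
    System.out.println("with version id: "
        + walkWithVersionIdRequests(directories, objects));
  }
}
```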


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:06 AM:
--

Based on an offline discussion, here is what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** e.g. an etag mismatch is defined as serious, but a missing etag is defined 
as a warning
 * The version ID check is removed: to get the version ID of an object we would 
need a HEAD request for that object. We walk the tree using S3 directory 
listings, which is far more efficient in terms of the number of requests: one 
request per directory, issued only for directories. Checking version IDs would 
require a request for every single object, so the check is removed altogether 
for now.
 * Error messages added to the enums instead of the violation handler.
 * Added AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * version id check is removed because if we want to have version id for an 
object then we need to do a HEAD req for that object. We walk the tree on S3 
directory listings which is far more efficient in terms of the number of 
requests - we do only one request per-directory and only do this for 
directories. For version id we should do a request for every single object, so 
it will be removed altogether for now.
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:06 AM:
--

Based on an offline discussion, here is what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** e.g. an etag mismatch is defined as serious, but a missing etag is defined 
as a warning
 * The version ID check is removed: to get the version ID of an object we would 
need a HEAD request for that object. We walk the tree using S3 directory 
listings, which is far more efficient in terms of the number of requests: one 
request per directory, issued only for directories. Checking version IDs would 
require a request for every single object, so the check is removed altogether 
for now.
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900897#comment-16900897
 ] 

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:03 AM:
--

Based on an offline discussion, here is what's coming up:
 * Severity added to the violation type enums (3 levels, defined in the enum)
 ** e.g. an etag mismatch is defined as serious, but a missing etag is defined 
as a warning
 * Error messages added to the enums instead of the violation handler.
 * Teardown in itests: close rawfs


was (Author: gabor.bota):
Based on an offline discussion what's coming up:
* Severity added to the violation type enums (3 levels, defined in the enum)
* * eg. Etag mismatch is defined as serious but etag missing is defined as a 
warn
* Error messages added to the enums instead of the violation handler. 
* Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Commented] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-06 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900897#comment-16900897
 ] 

Gabor Bota commented on HADOOP-16423:
-

Based on an offline discussion, here is what's coming up:
* Severity added to the violation type enums (3 levels, defined in the enum)
** e.g. an etag mismatch is defined as serious, but a missing etag is defined 
as a warning
* Error messages added to the enums instead of the violation handler. 
* Teardown in itests: close rawfs

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Comment Edited] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.

2019-08-05 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1697#comment-1697
 ] 

Gabor Bota edited comment on HADOOP-15565 at 8/5/19 11:21 AM:
--

Thanks for working on this [~LiJinglun]!

{quote}I think we should let FileSystem.CACHE cache ViewFileSystem only, and 
let ViewFileSystem cache all it's child filesystems.{quote}
You mean that the {{FileSystem.CACHE}} will only cache *one instance* of 
{{ViewFileSystem}}, and the {{ViewFileSystem}} will handle the other instances, 
right? If that is what you mean (based on the code, it is), please correct the 
description.

It would also be good to see more tests for this change: 
* Unit tests for the {{ViewFileSystem#SimpleCache}}.
* Based on the description, you had a problem with {quote}re-login my kerberos 
and renew ViewFileSystem periodically{quote}. Could you write a test case that 
reproduces this, to show that this change solves the issue?
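
To make the ownership idea concrete, here is a minimal sketch of a parent object that keeps its own cache of children and closes them when it is closed. Plain {{Closeable}} objects stand in for child filesystems, and every name in it ({{OwningCacheSketch}}, {{get}}, {{size}}) is hypothetical; it is not the patch's {{ViewFileSystem#SimpleCache}} and uses no Hadoop API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: plain Closeable objects stand in for child FileSystem
// instances; none of these names come from Hadoop.
class OwningCacheSketch implements Closeable {

  /** Children created on behalf of this parent, keyed by a URI-like string. */
  private final Map<String, Closeable> children = new HashMap<>();
  private final Function<String, Closeable> factory;

  OwningCacheSketch(Function<String, Closeable> factory) {
    this.factory = factory;
  }

  /** Return the cached child for the key, creating it on first use. */
  synchronized Closeable get(String key) {
    return children.computeIfAbsent(key, factory);
  }

  synchronized int size() {
    return children.size();
  }

  /** Close every child this parent created, then forget them. */
  @Override
  public synchronized void close() throws IOException {
    IOException first = null;
    for (Closeable child : children.values()) {
      try {
        child.close();
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    children.clear();
    if (first != null) {
      throw first;
    }
  }

  public static void main(String[] args) throws IOException {
    int[] closed = {0};
    OwningCacheSketch parent = new OwningCacheSketch(key -> () -> closed[0]++);
    parent.get("hdfs://ns1/");
    parent.get("hdfs://ns2/");
    parent.get("hdfs://ns1/"); // cache hit, no new child
    parent.close();
    System.out.println("children closed: " + closed[0]);
  }
}
```

A unit test for the real cache could follow the same shape as {{main}}: create children through the parent, close the parent, and assert that every child was closed and none is left behind.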




was (Author: gabor.bota):
Thanks for working on this [~LiJinglun]!

> I think we should let FileSystem.CACHE cache ViewFileSystem only, and let 
> ViewFileSystem cache all it's child filesystems.
You mean that the {{FileSystem.CACHE}} will only cache *one instance* of  
{{ViewFileSystem}}, and the {{ViewFileSystem}} will handle other instances 
right? If you mean that (based on the code you do) please correct the 
description.

It would be also good to see more test fore this change: 
* Unit tests for the {{ViewFileSystem#SimpleCache}}.
* Based on the description you had a problem with "re-login my kerberos and 
renew ViewFileSystem periodically". Could you write a test case where this is 
reproduced to show that this change will solve that issue?



> ViewFileSystem.close doesn't close child filesystems and causes FileSystem 
> objects leak.
> 
>
> Key: HADOOP-15565
> URL: https://issues.apache.org/jira/browse/HADOOP-15565
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch
>
>
> When we create a ViewFileSystem, all its child filesystems will be cached by 
> FileSystem.CACHE. Unless we close these child filesystems, they will stay in 
> FileSystem.CACHE forever.
> I think we should let FileSystem.CACHE cache ViewFileSystem only, and let 
> ViewFileSystem cache all its child filesystems. So we can close 
> ViewFileSystem without a leak and without affecting other ViewFileSystems.
> I found this problem because I need to re-login my kerberos and renew 
> ViewFileSystem periodically. Because FileSystem.CACHE.Key is based on 
> UserGroupInformation, which changes every time I re-login, I can't use the 
> cached child filesystems when I create a new ViewFileSystem. And because 
> ViewFileSystem.close does nothing but remove itself from the cache, I leak 
> all its child filesystems in the cache.






[jira] [Commented] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.

2019-08-05 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1697#comment-1697
 ] 

Gabor Bota commented on HADOOP-15565:
-

Thanks for working on this [~LiJinglun]!

> I think we should let FileSystem.CACHE cache ViewFileSystem only, and let 
> ViewFileSystem cache all it's child filesystems.
You mean that the {{FileSystem.CACHE}} will only cache *one instance* of 
{{ViewFileSystem}}, and the {{ViewFileSystem}} will handle the other instances, 
right? If that is what you mean (based on the code, it is), please correct the 
description.

It would also be good to see more tests for this change: 
* Unit tests for the {{ViewFileSystem#SimpleCache}}.
* Based on the description, you had a problem with "re-login my kerberos and 
renew ViewFileSystem periodically". Could you write a test case that 
reproduces this, to show that this change solves the issue?



> ViewFileSystem.close doesn't close child filesystems and causes FileSystem 
> objects leak.
> 
>
> Key: HADOOP-15565
> URL: https://issues.apache.org/jira/browse/HADOOP-15565
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch
>
>
> When we create a ViewFileSystem, all its child filesystems will be cached by 
> FileSystem.CACHE. Unless we close these child filesystems, they will stay in 
> FileSystem.CACHE forever.
> I think we should let FileSystem.CACHE cache ViewFileSystem only, and let 
> ViewFileSystem cache all its child filesystems. So we can close 
> ViewFileSystem without a leak and without affecting other ViewFileSystems.
> I found this problem because I need to re-login my kerberos and renew 
> ViewFileSystem periodically. Because FileSystem.CACHE.Key is based on 
> UserGroupInformation, which changes every time I re-login, I can't use the 
> cached child filesystems when I create a new ViewFileSystem. And because 
> ViewFileSystem.close does nothing but remove itself from the cache, I leak 
> all its child filesystems in the cache.






[jira] [Updated] (HADOOP-16483) S3Guard fsck: Check metadata consistency from metadatastore to S3 (log)

2019-08-02 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16483:

Component/s: fs/s3

> S3Guard fsck: Check metadata consistency from metadatastore to S3 (log)
> 
>
> Key: HADOOP-16483
> URL: https://issues.apache.org/jira/browse/HADOOP-16483
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Priority: Major
>
> This issue covers walking the MS structure and comparing it with S3.
> Things to be checked:
> * Record is in the MS, but not in S3
> * ...
> This only makes sense if:
> * it is used as a diagnostic tool during testing: after a test failure it 
> can be wired in *before* the cleanup to debug the inconsistent state.
> * the MS is running in auth mode, and there is no short-term metadata expiry. 






[jira] [Updated] (HADOOP-16483) S3Guard fsck: Check metadata consistency from metadatastore to S3 (log)

2019-08-02 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16483:

Affects Version/s: 3.3.0

> S3Guard fsck: Check metadata consistency from metadatastore to S3 (log)
> 
>
> Key: HADOOP-16483
> URL: https://issues.apache.org/jira/browse/HADOOP-16483
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Priority: Major
>
> This issue covers walking the MS structure and comparing it with S3.
> Things to be checked:
> * Record is in the MS, but not in S3
> * ...
> This only makes sense if:
> * it is used as a diagnostic tool during testing: after a test failure it 
> can be wired in *before* the cleanup to debug the inconsistent state.
> * the MS is running in auth mode, and there is no short-term metadata expiry. 






[jira] [Created] (HADOOP-16483) S3Guard fsck: Check metadata consistency from metadatastore to S3 (log)

2019-08-02 Thread Gabor Bota (JIRA)
Gabor Bota created HADOOP-16483:
---

 Summary: S3Guard fsck: Check metadata consistency from 
metadatastore to S3 (log)
 Key: HADOOP-16483
 URL: https://issues.apache.org/jira/browse/HADOOP-16483
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota


This issue covers walking the MS structure and comparing it with S3.
Things to be checked:
* Record is in the MS, but not in S3
* ...

This only makes sense if:
* it is used as a diagnostic tool during testing: after a test failure it can 
be wired in *before* the cleanup to debug the inconsistent state.
* the MS is running in auth mode, and there is no short-term metadata expiry. 
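
As an illustration of the one check listed above ("Record is in the MS, but not in S3"), here is a minimal sketch in which both stores are reduced to sets of paths. The class and method names are hypothetical; a real implementation would walk the MetadataStore and query S3 rather than compare in-memory sets.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: both stores are reduced to sets of paths.
class MsToS3WalkSketch {

  /** Paths recorded in the metadata store that have no counterpart in S3. */
  static Set<String> inMsButNotInS3(Set<String> msPaths, Set<String> s3Paths) {
    Set<String> missing = new TreeSet<>(msPaths);
    missing.removeAll(s3Paths);
    return missing;
  }

  public static void main(String[] args) {
    Set<String> ms = new TreeSet<>(Arrays.asList("/a", "/a/b", "/c"));
    Set<String> s3 = new TreeSet<>(Arrays.asList("/a", "/c"));
    System.out.println(inMsButNotInS3(ms, s3));
  }
}
```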






[jira] [Updated] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-02 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16423:

Description: 
This part is only for logging the inconsistencies.

This issue only covers the case where the walk is done in S3 and all metadata 
is compared to the MS.
The case where the walk is done in the MS and compared to S3 is not part of 
this issue. 

  was:This part is only for logging the inconsistencies.


> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all 
> metadata is compared to the MS.
> The case where the walk is done in the MS and compared to S3 is not part 
> of this issue. 






[jira] [Updated] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)

2019-08-02 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16423:

Summary: S3Guard fsck: Check metadata consistency from S3 to metadatastore 
(log)  (was: S3Guard fsck: Check metadata consistency between S3 and 
metadatastore (log))

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> 
>
> Key: HADOOP-16423
> URL: https://issues.apache.org/jira/browse/HADOOP-16423
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> This part is only for logging the inconsistencies.






[jira] [Commented] (HADOOP-16424) S3Guard fsck: Check internal consistency of the MetadataStore

2019-08-02 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16898870#comment-16898870
 ] 

Gabor Bota commented on HADOOP-16424:
-

Tasks to do here: 
* find orphan entries (entries without a parent)
* find if a file's parent is not a directory (so the parent is a file)
* warn: no lastUpdated field
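
The first two tasks can be sketched as follows. This is an illustration only: the metadata store is modelled as a map from path to an "is directory" flag, and all names are hypothetical rather than taken from the MetadataStore API. The lastUpdated warning is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: the metadata store is modelled as a map from path to
// "is directory"; this is not the MetadataStore API.
class InternalConsistencySketch {

  /** Parent of "/a/b" is "/a"; parent of "/a" is "/"; root has no parent. */
  static String parentOf(String path) {
    if (path.equals("/")) {
      return null;
    }
    int idx = path.lastIndexOf('/');
    return idx == 0 ? "/" : path.substring(0, idx);
  }

  /** Check each entry's parent and report orphans and file-as-parent entries. */
  static List<String> check(Map<String, Boolean> entries) {
    List<String> problems = new ArrayList<>();
    // Iterate in sorted path order so the report is deterministic.
    for (String path : new TreeMap<>(entries).keySet()) {
      String parent = parentOf(path);
      if (parent == null) {
        continue; // the root has no parent to check
      }
      Boolean parentIsDir = entries.get(parent);
      if (parentIsDir == null) {
        problems.add("ORPHAN " + path);
      } else if (!parentIsDir) {
        problems.add("PARENT_IS_FILE " + path);
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    Map<String, Boolean> entries = new TreeMap<>();
    entries.put("/", true);
    entries.put("/dir", true);
    entries.put("/dir/file", false);
    entries.put("/dir/file/child", false); // parent is a file
    entries.put("/missing/child", false);  // parent entry is absent
    System.out.println(check(entries));
  }
}
```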

> S3Guard fsck: Check internal consistency of the MetadataStore
> -
>
> Key: HADOOP-16424
> URL: https://issues.apache.org/jira/browse/HADOOP-16424
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Assignee: Gabor Bota
>Priority: Major
>
> The internal consistency should be checked, e.g. for orphaned entries, which 
> can cause trouble at runtime and in testing.






[jira] [Commented] (HADOOP-16476) Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations

2019-07-31 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897270#comment-16897270
 ] 

Gabor Bota commented on HADOOP-16476:
-

(y)

> Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations
> --
>
> Key: HADOOP-16476
> URL: https://issues.apache.org/jira/browse/HADOOP-16476
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Priority: Minor
>
> The test is failing intermittently. One possible solution would be to wait 
> (retry) longer, because the table is deleted eventually: it is no longer 
> there after the whole test run.
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 142.471 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps
> [ERROR] 
> testConcurrentTableCreations(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps)
>   Time elapsed: 142.286 s  <<< ERROR!
> java.lang.IllegalArgumentException: Table 
> s3guard.test.testConcurrentTableCreations-1265635747 is not deleted.
>   at 
> com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.deleteTable(ITestS3GuardConcurrentOps.java:87)
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.testConcurrentTableCreations(ITestS3GuardConcurrentOps.java:178)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: com.amazonaws.waiters.WaiterTimedOutException: Reached maximum 
> attempts without transitioning to the desired state
>   at 
> com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:86)
>   at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
>   at 
> com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:502)
>   ... 16 more
> {noformat}






[jira] [Updated] (HADOOP-16476) Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations

2019-07-31 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16476:

Affects Version/s: 3.3.0

> Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations
> --
>
> Key: HADOOP-16476
> URL: https://issues.apache.org/jira/browse/HADOOP-16476
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Gabor Bota
>Priority: Minor
>
> The test is failing intermittently. One possible solution would be to wait 
> (retry) longer, because the table is deleted eventually: it is no longer 
> there after the whole test run.
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 142.471 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps
> [ERROR] 
> testConcurrentTableCreations(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps)
>   Time elapsed: 142.286 s  <<< ERROR!
> java.lang.IllegalArgumentException: Table 
> s3guard.test.testConcurrentTableCreations-1265635747 is not deleted.
>   at 
> com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.deleteTable(ITestS3GuardConcurrentOps.java:87)
>   at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.testConcurrentTableCreations(ITestS3GuardConcurrentOps.java:178)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: com.amazonaws.waiters.WaiterTimedOutException: Reached maximum 
> attempts without transitioning to the desired state
>   at 
> com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:86)
>   at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
>   at 
> com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:502)
>   ... 16 more
> {noformat}






[jira] [Updated] (HADOOP-16476) Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations

2019-07-31 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HADOOP-16476:

Component/s: fs/s3

> Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations
> --
>
> Key: HADOOP-16476
> URL: https://issues.apache.org/jira/browse/HADOOP-16476
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Gabor Bota
>Priority: Minor
>
> The test is failing intermittently. One possible solution would be to wait 
> (retry) longer, because the table is deleted eventually: it is no longer 
> there after the whole test run.
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 142.471 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps
> [ERROR] testConcurrentTableCreations(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps)  Time elapsed: 142.286 s  <<< ERROR!
> java.lang.IllegalArgumentException: Table s3guard.test.testConcurrentTableCreations-1265635747 is not deleted.
>     at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
>     at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.deleteTable(ITestS3GuardConcurrentOps.java:87)
>     at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.testConcurrentTableCreations(ITestS3GuardConcurrentOps.java:178)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: com.amazonaws.waiters.WaiterTimedOutException: Reached maximum attempts without transitioning to the desired state
>     at com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:86)
>     at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
>     at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:502)
>     ... 16 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16476) Intermittent failure of ITestS3GuardConcurrentOps#testConcurrentTableCreations

2019-07-31 Thread Gabor Bota (JIRA)
Gabor Bota created HADOOP-16476:
---

 Summary: Intermittent failure of 
ITestS3GuardConcurrentOps#testConcurrentTableCreations
 Key: HADOOP-16476
 URL: https://issues.apache.org/jira/browse/HADOOP-16476
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Gabor Bota


The test fails intermittently. One possible fix would be to wait (retry) 
longer for the deletion, because the table is eventually deleted: it is no 
longer there after the whole test run.

{noformat}
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 142.471 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps
[ERROR] testConcurrentTableCreations(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps)  Time elapsed: 142.286 s  <<< ERROR!
java.lang.IllegalArgumentException: Table s3guard.test.testConcurrentTableCreations-1265635747 is not deleted.
    at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
    at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.deleteTable(ITestS3GuardConcurrentOps.java:87)
    at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.testConcurrentTableCreations(ITestS3GuardConcurrentOps.java:178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.waiters.WaiterTimedOutException: Reached maximum attempts without transitioning to the desired state
    at com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:86)
    at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
    at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:502)
    ... 16 more
{noformat}
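
The "wait (retry) more" idea from the description could look roughly like the sketch below. It is only an illustration: the class RetryUntilDeleted, the method waitUntil and all of its parameters are hypothetical names, not code from ITestS3GuardConcurrentOps and not part of the AWS SDK waiter API. The table state is simulated with a counter; in the real test the condition would presumably be a check that the DynamoDB table no longer exists.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a condition with capped exponential backoff.
final class RetryUntilDeleted {

    /**
     * Polls {@code condition} up to {@code maxAttempts} times, sleeping
     * between attempts. The sleep starts at {@code initialDelayMs} and
     * doubles after each failed attempt, capped at {@code maxDelayMs}.
     *
     * @return true if the condition became true, false if attempts ran out
     *         or the thread was interrupted while sleeping.
     */
    static boolean waitUntil(BooleanSupplier condition, int maxAttempts,
                             long initialDelayMs, long maxDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // No point sleeping after the final failed attempt.
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated table (assumption): reported as gone on the 4th poll.
        int[] polls = {0};
        BooleanSupplier tableGone = () -> ++polls[0] >= 4;

        boolean deleted = waitUntil(tableGone, 10, 10L, 200L);
        System.out.println("deleted=" + deleted + " after " + polls[0] + " polls");
    }
}
```

Raising the attempt count or the delay cap lengthens the total wait without changing the test logic, which is the trade-off suggested above: a slower failing run in exchange for fewer spurious failures.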






[jira] [Commented] (HADOOP-16472) findbugs warning on LocalMetadataStore.ttlTimeProvider sync

2019-07-30 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896261#comment-16896261
 ] 

Gabor Bota commented on HADOOP-16472:
-

I see that you already have a PR for this so I've unassigned myself.

> findbugs warning on LocalMetadataStore.ttlTimeProvider sync
> ---
>
> Key: HADOOP-16472
> URL: https://issues.apache.org/jira/browse/HADOOP-16472
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> This is a minor issue codewise, but it's interfering with all PR test runs, so 
> I need it fixed.






[jira] [Assigned] (HADOOP-16472) findbugs warning on LocalMetadataStore.ttlTimeProvider sync

2019-07-30 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HADOOP-16472:
---

Assignee: Steve Loughran  (was: Gabor Bota)

> findbugs warning on LocalMetadataStore.ttlTimeProvider sync
> ---
>
> Key: HADOOP-16472
> URL: https://issues.apache.org/jira/browse/HADOOP-16472
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> This is a minor issue codewise, but it's interfering with all PR test runs, so 
> I need it fixed.





