[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-05-16 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841079#comment-16841079
 ] 

Gabor Bota commented on HADOOP-15999:
-

[~fabbri] that case is covered by HADOOP-16279 and HADOOP-16184.
You can find the test cases in the HADOOP-16184 PR.
HADOOP-16184 can be fixed once HADOOP-16279 adds metadata expiry for all MS 
entries.
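
To illustrate why entry expiry matters for the tombstone case, here is a minimal sketch. It is not code from HADOOP-16279 or HADOOP-16184. It assumes that expiry is time-based (an entry older than a configured TTL is no longer trusted), and every class, field, and method name in it is hypothetical.

```java
/** Hypothetical sketch of time-based expiry for metadata store (MS) entries. */
final class TombstoneExpirySketch {

    /** Illustrative MS entry: a tombstone flag plus the time it was last written. */
    static final class Entry {
        final boolean tombstone;
        final long lastUpdatedMillis;

        Entry(boolean tombstone, long lastUpdatedMillis) {
            this.tombstone = tombstone;
            this.lastUpdatedMillis = lastUpdatedMillis;
        }
    }

    /** An entry counts as expired once it is at least ttlMillis old (assumed rule). */
    static boolean isExpired(Entry entry, long ttlMillis, long nowMillis) {
        return nowMillis - entry.lastUpdatedMillis >= ttlMillis;
    }

    /**
     * Decide whether a path exists. A missing or expired entry defers to S3;
     * a fresh entry is trusted, so a fresh tombstone hides the object.
     */
    static boolean exists(Entry entry, boolean presentInS3, long ttlMillis, long nowMillis) {
        if (entry == null || isExpired(entry, ttlMillis, nowMillis)) {
            return presentInS3;
        }
        return !entry.tombstone;
    }
}
```

With a TTL of 500 ms, a tombstone written at t=1000 still hides an out-of-band object at t=1200, while at t=1500 the lookup falls through to S3 and the object becomes visible.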

> S3Guard: Better support for out-of-band operations
> --
>
> Key: HADOOP-15999
> URL: https://issues.apache.org/jira/browse/HADOOP-15999
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Sean Mackrory
> Assignee: Gabor Bota
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: HADOOP-15999-007.patch, HADOOP-15999.001.patch, 
> HADOOP-15999.002.patch, HADOOP-15999.003.patch, HADOOP-15999.004.patch, 
> HADOOP-15999.005.patch, HADOOP-15999.006.patch, HADOOP-15999.008.patch, 
> HADOOP-15999.009.patch, out-of-band-operations.patch
>
>
> S3Guard was initially done on the premise that a new MetadataStore would be 
> the source of truth, and that it wouldn't provide guarantees if updates were 
> done without using S3Guard.
> I've been seeing increased demand for better support for scenarios where 
> operations are done on the data that can't reasonably be done with S3Guard 
> involved. For example:
> * A file is deleted using S3Guard, and replaced by some other tool. S3Guard 
> can't tell the difference between the new file and delete / list 
> inconsistency and continues to treat the file as deleted.
> * An S3Guard-ed file is overwritten by a longer file by some other tool. When 
> reading the file, only the length of the original file is read.
> We could possibly have smarter behavior here by querying both S3 and the 
> MetadataStore (even in cases where we may currently only query the 
> MetadataStore in getFileStatus) and use whichever one has the higher modified 
> time.
> This kills the performance boost we currently get in some workloads with the 
> short-circuited getFileStatus, but we could keep it with authoritative mode 
> which should give a larger performance boost. At least we'd get more 
> correctness without authoritative mode and a clear declaration of when we can 
> make the assumptions required to short-circuit the process. If we can't 
> consider S3Guard the source of truth, we need to defer to S3 more.
> We'd need to be extra sure of any locality / time zone issues if we start 
> relying on mod_time more directly, but currently we're tracking the 
> modification time as returned by S3 anyway.
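
The "use whichever one has the higher modified time" idea in the description can be sketched as follows. This is an illustration under stated assumptions, not the logic of any attached patch: the `Status` type, the `resolve` method, and the `authoritative` flag are hypothetical stand-ins, and the sketch ignores tombstones and directories.

```java
import java.util.Optional;

/** Hypothetical sketch: pick the newer of the MetadataStore and S3 views of a file. */
final class NewestStatusSketch {

    /** Illustrative file status: length plus modification time in milliseconds. */
    static final class Status {
        final long length;
        final long modTimeMillis;

        Status(long length, long modTimeMillis) {
            this.length = length;
            this.modTimeMillis = modTimeMillis;
        }
    }

    /**
     * In authoritative mode a MetadataStore hit is returned as-is (the
     * short-circuit). Otherwise both views are compared and the one with
     * the higher modification time wins; ties keep the MetadataStore entry.
     */
    static Optional<Status> resolve(Optional<Status> fromMetadataStore,
                                    Optional<Status> fromS3,
                                    boolean authoritative) {
        if (authoritative && fromMetadataStore.isPresent()) {
            return fromMetadataStore;
        }
        if (!fromMetadataStore.isPresent()) {
            return fromS3;
        }
        if (!fromS3.isPresent()) {
            return fromMetadataStore;
        }
        return fromS3.get().modTimeMillis > fromMetadataStore.get().modTimeMillis
                ? fromS3
                : fromMetadataStore;
    }
}
```

In the "overwritten by a longer file" example, the S3 status carries the later modification time, so the non-authoritative path returns the new length instead of the stale one.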



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-05-15 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840846#comment-16840846
 ] 

Sean Mackrory commented on HADOOP-15999:


{quote} Seems like a common case{quote}

Yeah that's actually the exact case that motivated this ticket, IIRC...




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-05-15 Thread Aaron Fabbri (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840844#comment-16840844
 ] 

Aaron Fabbri commented on HADOOP-15999:
---

What about in-band delete (create tombstone) and then OOB create? Didn't see 
this covered in the cases here but maybe I missed it.  Seems like a common case 
(OOB process dropping data in bucket). Might want to add a test case and 
document this in a new JIRA if it is not already covered here.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-28 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804150#comment-16804150
 ] 

Hudson commented on HADOOP-15999:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16299 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16299/])
HADOOP-15999. S3Guard: Better support for out-of-band operations. (stevel: rev 
b5db2383832881034d57d836a8135a07a2bd1cf4)
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
* (add) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3GuardOutOfBandOperations.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* (edit) hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3Guard.java
* (edit) hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractGetFileStatusTest.java





[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-19 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796148#comment-16796148
 ] 

Gabor Bota commented on HADOOP-15999:
-

Updated the pull request with the directory skipping: 
[https://github.com/apache/hadoop/pull/624]

Integration tests ran successfully against Ireland.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793031#comment-16793031
 ] 

Steve Loughran commented on HADOOP-15999:
-

Really close to getting in; I ran lots of tests and am happy. I tried adding a 
new test but failed and gave up; HADOOP-16193 is the outcome there.

One more change to request: skip going to S3 if the entry being checked is a 
directory, because if the destination is also a directory, there's no 
difference.

Pro: avoids the two failing HEAD calls and an expensive LIST whose output 
is discarded.

Con: doesn't catch one special failure case: someone has taken a 
directory path /a/b/ and overwritten it with a file /a/b.

If we did want to worry about that, then rather than doing the whole 
s3GetFileStatus call, we would only need to execute a single getObjectMetadata 
for the key "a/b" and, if something is actually there, do an update.

That would still be hitting the store, but it would only be making 1/3 as many 
requests.
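
The request counts in the trade-off above can be sketched as follows. This is only an illustration: the three-request breakdown (two HEADs and a LIST) is taken from the comment, while the `Probe` and `Mode` names and the `probesFor` method are hypothetical and do not correspond to S3AFileSystem code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of which S3 requests each strategy would issue. */
final class DirectoryProbeSketch {

    /** The three requests assumed to make up a full status check. */
    enum Probe { HEAD_OBJECT, HEAD_DIRECTORY_MARKER, LIST_PREFIX }

    /** Strategies discussed in the comment (names are illustrative). */
    enum Mode { FULL_CHECK, SKIP_DIRECTORIES, SINGLE_HEAD }

    /**
     * Return the probes to run for a metadata store entry.
     * File entries always get the full check; directory entries depend on the mode.
     */
    static List<Probe> probesFor(boolean entryIsDirectory, Mode mode) {
        if (entryIsDirectory && mode == Mode.SKIP_DIRECTORIES) {
            // Cheapest option, but misses a file written over the directory path.
            return Collections.emptyList();
        }
        if (entryIsDirectory && mode == Mode.SINGLE_HEAD) {
            // One HEAD on the plain key (e.g. "a/b") catches the overwrite case.
            return Collections.singletonList(Probe.HEAD_OBJECT);
        }
        return Arrays.asList(Probe.HEAD_OBJECT, Probe.HEAD_DIRECTORY_MARKER, Probe.LIST_PREFIX);
    }
}
```

For a directory entry, `SKIP_DIRECTORIES` issues no requests, `SINGLE_HEAD` issues one, and `FULL_CHECK` issues three, which matches the "1/3 as many requests" estimate.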






[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-14 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792794#comment-16792794
 ] 

Steve Loughran commented on HADOOP-15999:
-

bq.  (note to myself: if I use low ddb.table.capacity.read and forget to 
modify it on the dashboard tests will timeout and fail)

Try switching to PAYG capacity. I have; there are a few tests we need to fix, 
but otherwise all seems well. Regarding the tests timing out *rather than 
failing*, I consider that a success of HADOOP-15426: no matter how overloaded 
things are, your client shouldn't fail.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-14 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792645#comment-16792645
 ] 

Gabor Bota commented on HADOOP-15999:
-

verify ran successfully against Ireland.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-13 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791892#comment-16791892
 ] 

Gabor Bota commented on HADOOP-15999:
-

I get random failures/timeouts with the integration tests, but they seem 
unrelated. I'll run the tests again tomorrow and describe the failures if they 
still persist. I'm going to create the proposed issues now.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790839#comment-16790839
 ] 

Hadoop QA commented on HADOOP-15999:


| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 29s | trunk passed |
| +1 | compile | 0m 31s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | shadedclient | 12m 21s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 47s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 36s | the patch passed |
| +1 | compile | 0m 33s | the patch passed |
| +1 | javac | 0m 33s | the patch passed |
| +1 | checkstyle | 0m 21s | the patch passed |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 50s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 47s | the patch passed |
| +1 | javadoc | 0m 22s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 4m 29s | hadoop-aws in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 56m 21s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HADOOP-15999 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12962164/HADOOP-15999.009.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7dbe88a9baf5 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 34b1406 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/16045/testReport/ |
| Max. process+thread count | 323 (vs. ulimit of 1) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/16045/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-12 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790770#comment-16790770
 ] 

Gabor Bota commented on HADOOP-15999:
-

Fixed the ordering in patch 009. I will create the issues you've proposed 
tomorrow.

> S3Guard: Better support for out-of-band operations
> --
>
> Key: HADOOP-15999
> URL: https://issues.apache.org/jira/browse/HADOOP-15999
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.1.0
>Reporter: Sean Mackrory
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HADOOP-15999-007.patch, HADOOP-15999.001.patch, 
> HADOOP-15999.002.patch, HADOOP-15999.003.patch, HADOOP-15999.004.patch, 
> HADOOP-15999.005.patch, HADOOP-15999.006.patch, HADOOP-15999.008.patch, 
> HADOOP-15999.009.patch, out-of-band-operations.patch
>
>
> S3Guard was initially done on the premise that a new MetadataStore would be 
> the source of truth, and that it wouldn't provide guarantees if updates were 
> done without using S3Guard.
> I've been seeing increased demand for better support for scenarios where 
> operations are done on the data that can't reasonably be done with S3Guard 
> involved. For example:
> * A file is deleted using S3Guard, and replaced by some other tool. S3Guard 
> can't tell the difference between the new file and delete / list 
> inconsistency and continues to treat the file as deleted.
> * An S3Guard-ed file is overwritten by a longer file by some other tool. When 
> reading the file, only the length of the original file is read.
> We could possibly have smarter behavior here by querying both S3 and the 
> MetadataStore (even in cases where we may currently only query the 
> MetadataStore in getFileStatus) and use whichever one has the higher modified 
> time.
> This kills the performance boost we currently get in some workloads with the 
> short-circuited getFileStatus, but we could keep it with authoritative mode 
> which should give a larger performance boost. At least we'd get more 
> correctness without authoritative mode and a clear declaration of when we can 
> make the assumptions required to short-circuit the process. If we can't 
> consider S3Guard the source of truth, we need to defer to S3 more.
> We'd need to be extra sure of any locality / time zone issues if we start 
> relying on mod_time more directly, but currently we're tracking the 
> modification time as returned by S3 anyway.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-08 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787977#comment-16787977
 ] 

Steve Loughran commented on HADOOP-15999:
-

This is ~ready to go in; I've only got one change to the code (see the bottom).

I do think we need to be sure that we've covered every opportunity for 
inconsistencies to arise, and I'm now considering performance too.

h3. Deletions

I think the whole of S3Guard is potentially brittle to:
* OOB deletions: you skip it here, so it's no worse than before, but because 
the S3AInputStream retries on FNFE in order to "debounce" cached 404s, it could 
potentially retry forever.
* OOB creation of a file which has a deletion tombstone marker.

You are already documenting this, so the next step to think about is code: 

*Proposed*: write a test to simulate that deletion problem, to see what 
happens. I'm actually curious now. We ought to have the S3AInputStream retry 
briefly when that initial GET fails, but only on that initial one. The test 
should first set "fs.s3a.retry.limit" to something low and the retry interval 
down to 10ms or so, to fail fast.

Sequences:

{code}
1. create; delete; open; read -> fail after retry
2. create; open; read; delete; read -> fail fast on second read
{code}

The IGNORED_ERRORS stat in the filesystem's storage statistics is incremented 
on each ignored error, so it will have increased after sequence 1 but not after 
sequence 2. If either of these tests doesn't fail quite as expected, we can 
disable the tests and continue; at least we'd then have some tests which 
simulate a condition we don't have a fix for.
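
The two sequences can be modelled without S3 at all. What follows is a toy 
sketch, not the S3AInputStream code: {{FakeStore}}, {{FakeStream}} and the 
{{ignoredErrors}} counter are hypothetical stand-ins (the counter plays the 
role of the IGNORED_ERRORS stat), and the retry interval is left out so it runs 
instantly. It only illustrates the intended behaviour: retry the initial GET, 
fail fast on any later one.

```java
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Set;

/** Toy model only: retries the first GET of a stream, fails fast afterwards. */
class DebounceReadSketch {

  /** Hypothetical stand-in for the object store: a set of existing keys. */
  static final class FakeStore {
    final Set<String> keys = new HashSet<>();
  }

  /** Hypothetical stand-in for an input stream opened on one key. */
  static final class FakeStream {
    private final FakeStore store;
    private final String key;
    private final int retryLimit;
    private boolean firstGetDone;
    /** Counts swallowed 404s; assumed analogue of IGNORED_ERRORS. */
    int ignoredErrors;

    FakeStream(FakeStore store, String key, int retryLimit) {
      this.store = store;
      this.key = key;
      this.retryLimit = retryLimit;
    }

    /** Returns 0 on success; throws once the allowed attempts are used up. */
    int read() throws FileNotFoundException {
      // Only the initial GET is debounced; later reads get a single attempt.
      int attempts = firstGetDone ? 1 : retryLimit + 1;
      for (int i = 1; i <= attempts; i++) {
        if (store.keys.contains(key)) {
          firstGetDone = true;
          return 0;
        }
        if (i < attempts) {
          ignoredErrors++; // this 404 is ignored and the GET is retried
        }
      }
      throw new FileNotFoundException("No such file: " + key);
    }
  }
}
```

With a retry limit of 2, sequence 1 ends in an FNFE with two ignored errors 
recorded; sequence 2 ends in an FNFE on the second read with none recorded.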



*Proposed*: add a JIRA on this for us all to worry about. For both, we just 
need some model of how long it takes for debouncing to stabilise. Then in this 
new check, if an FNFE is raised *and* the check is happening later than 
(modtime + debounce-delay), then it's a real FNFE.
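
As a rough illustration of that check: the sketch below assumes the debounce 
delay is a single known duration, which is exactly the model we don't have yet. 
The class and method names are made up for the example.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch of the proposed "is this FNFE real?" test; names are hypothetical. */
class DebounceWindowSketch {

  /**
   * @param modTime modification time recorded for the entry
   * @param now time at which the FNFE was raised
   * @param debounceDelay assumed upper bound on how long a cached 404 lingers
   * @return true if the FNFE was raised after the debounce window closed
   */
  static boolean isRealFileNotFound(Instant modTime, Instant now,
      Duration debounceDelay) {
    return now.isAfter(modTime.plus(debounceDelay));
  }
}
```

Inside the window the FNFE could still be a cached 404 and is worth a retry; 
outside it, the file is treated as genuinely gone.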

h3. Timestamp ordering 


I'm going to add a new complication here. When you initiate a PUT, AFAIK (and 
[~Thomas Demoor] should be able to confirm), the modified time is that of the 
time the PUT began, not when the PUT completed. Which means I can have a 
workflow of 

{code}
write1 = fs.create(path, true)
write2 = fs.create(path, true)
write2.close()
status1 = fs.getFileStatus(path)   // sees the object written by write2
write1.write(128MB of data)
write1.close()
status2 = fs.getFileStatus(path)   // sees the object written by write1

assertTrue(status2.getModificationTime() < status1.getModificationTime())
{code}

There's no way we are going to be able to defend against that except by 
tracking versions in the DDB tables, and having the S3A file status include the 
version when known. What we'll have to do then is make sure that this issue is 
documented today, and that the extension which adds tag tracking to S3Guard 
keeps an eye on versions.

*Proposed*: mention this problem in the docs. Once version tracking goes in to 
S3Guard, we'll need to move the ("is newer than") operator out of this modtime 
check into somewhere else (proposed: do it in the S3AFileStatus, which will 
look at version info if set, falling back to etags). (Actually, if version 
checking is on in the GET, we'd never see the updated file, would we?)
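
A minimal sketch of what such a comparison could look like. This is not 
S3AFileStatus: the {{Status}} class and {{shouldPreferS3}} are invented for the 
example. Note one assumption in particular: version IDs and etags can only show 
that two copies differ, not which one is newer, so the sketch treats any 
difference as a reason to go with S3.

```java
/** Sketch of a status comparison that prefers version info over modtime. */
class StatusComparisonSketch {

  /** Hypothetical status record; versionId and eTag are null when unknown. */
  static final class Status {
    final String versionId;
    final String eTag;
    final long modTime;

    Status(String versionId, String eTag, long modTime) {
      this.versionId = versionId;
      this.eTag = eTag;
      this.modTime = modTime;
    }
  }

  /** Decide whether the status from S3 should replace the metastore entry. */
  static boolean shouldPreferS3(Status s3, Status metastore) {
    // 1. Version IDs, when both sides know them.
    if (s3.versionId != null && metastore.versionId != null) {
      return !s3.versionId.equals(metastore.versionId);
    }
    // 2. Fall back to etags.
    if (s3.eTag != null && metastore.eTag != null) {
      return !s3.eTag.equals(metastore.eTag);
    }
    // 3. Last resort: the modtime check, with the ordering problem above.
    return s3.modTime > metastore.modTime;
  }
}
```

In the overlapping-PUT workflow the S3 copy has the older timestamp, so the 
modtime branch alone would keep the stale entry; the version branch would not.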


h3. Performance impact

This is going to reinstate the HEAD on every read, making non-auth S3Guard a 
bit slower. We could think about addressing that by moving the checks into the 
input stream itself. That is: the first GET which returns data will also act as 
the metadata check. That'd mean the read context will need updating with some 
"metastoreProcessHeader" callback to invoke on the first GET.

*Proposed*: Add a JIRA for this to become an optimization
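
To make the idea concrete, here is a small sketch of a stream that hands the 
headers of its first GET to a callback and never calls it again. Everything 
here is hypothetical ({{Headers}}, {{Stream}}, {{onGetResponse}}); only the 
callback name is taken from the suggestion above, and the real read context 
would look different.

```java
import java.util.function.Consumer;

/** Sketch: the first GET response doubles as the metadata check. */
class FirstGetCallbackSketch {

  /** Hypothetical subset of the headers returned with a GET. */
  static final class Headers {
    final long length;
    final long modTime;

    Headers(long length, long modTime) {
      this.length = length;
      this.modTime = modTime;
    }
  }

  /** Hypothetical stream which invokes the callback once, on its first GET. */
  static final class Stream {
    private final Consumer<Headers> metastoreProcessHeader;
    private boolean headerProcessed;

    Stream(Consumer<Headers> metastoreProcessHeader) {
      this.metastoreProcessHeader = metastoreProcessHeader;
    }

    /** Called for every GET response; only the first reaches the callback. */
    void onGetResponse(Headers headers) {
      if (!headerProcessed) {
        headerProcessed = true;
        metastoreProcessHeader.accept(headers);
      }
    }
  }
}
```

The callback is where the metastore comparison would go, so no separate HEAD 
is needed before the read.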

The good news is that because it's reading a file, it's only one HTTP HEAD 
request: no need to do either of the other two directory probes except in the 
case that the file isn't there.

h2. code review


h3. ITestS3GuardOutOfBandOperations

Check your import ordering: new files are where we should start off with 
getting things "correct" according to our style rules.


[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784464#comment-16784464
 ] 

Hadoop QA commented on HADOOP-15999:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
27s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HADOOP-15999 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12961163/HADOOP-15999.008.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0bd51f8b96a3 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0aefe28 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/16016/testReport/ |
| Max. process+thread count | 341 (vs. ulimit of 1) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | 
https://builds.apache.org/job/PreCommit-HADOOP-Build/16016/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-05 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784393#comment-16784393
 ] 

Gabor Bota commented on HADOOP-15999:
-

Uploaded patch 8. Integration tests run against Ireland without issues.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-05 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784299#comment-16784299
 ] 

Gabor Bota commented on HADOOP-15999:
-

I'm thinking about using a {{static}} cache in LocalMS. Basically it would 
emulate the same behaviour as in dynamo: whenever we create a new metadatastore 
instance, the content of the underlying cache would be the same for all MS 
instances. It's likely that I will create an issue for that if I run into this 
"reference to the LocalMS is lost" issue again, and go through all tests to fix 
this.
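
A minimal sketch of the idea, assuming a plain map is enough to stand in for 
the cache. This is not the LocalMS code; the class and its methods are made up 
for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: every store instance in the JVM sees the same entries. */
class SharedLocalStoreSketch {

  // Static, so it is shared by all instances, much as one DynamoDB table is
  // shared by every client that connects to it.
  private static final Map<String, Long> CACHE = new ConcurrentHashMap<>();

  /** Record the modification time for a path. */
  void put(String path, long modTime) {
    CACHE.put(path, modTime);
  }

  /** Returns the recorded modification time, or null if there is no entry. */
  Long get(String path) {
    return CACHE.get(path);
  }

  /** Tests would need to reset the shared state between runs. */
  static void clear() {
    CACHE.clear();
  }
}
```

With this, rebuilding the fs and getting a fresh store instance no longer loses 
the entries written through the old one. The price is that tests have to clear 
the shared state themselves.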




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-05 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784294#comment-16784294
 ] 

Gabor Bota commented on HADOOP-15999:
-

I figured it out: the timeout in the LocalMS for file metadata was set to 10s 
and the test was running longer than this. Because of the timeout the metadata 
was no longer available from the local metadata store, so we hit the exception 
on {{getFileStatus}}. The {{DEFAULT_S3GUARD_METASTORE_LOCAL_ENTRY_TTL}} will be 
increased to 60s, so there will be no more issues with this kind of test.
We can increase this value safely as the LocalMS is a testing-only 
implementation.
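
For anyone following along, the failure mode can be reproduced with a toy 
TTL map. The sketch below is not the LocalMS implementation; the class is 
hypothetical and takes the current time as a parameter so that no sleeping is 
needed.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a metadata cache whose entries expire after a fixed TTL. */
class TtlEntrySketch {

  private static final class Entry {
    final long modTime;
    final long insertedAt;

    Entry(long modTime, long insertedAt) {
      this.modTime = modTime;
      this.insertedAt = insertedAt;
    }
  }

  private final long ttlMillis;
  private final Map<String, Entry> entries = new HashMap<>();

  TtlEntrySketch(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  void put(String path, long modTime, long nowMillis) {
    entries.put(path, new Entry(modTime, nowMillis));
  }

  /** Returns the stored modtime, or null if absent or older than the TTL. */
  Long get(String path, long nowMillis) {
    Entry e = entries.get(path);
    if (e == null) {
      return null;
    }
    if (nowMillis - e.insertedAt > ttlMillis) {
      entries.remove(path); // expired: behaves as if it was never stored
      return null;
    }
    return e.modTime;
  }
}
```

A lookup 15s after the put comes back empty with a 10s TTL and succeeds with a 
60s one, which matches what the test was seeing.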

 I'll run all other tests with dynamo && local && null. If everything's ok I'll 
upload a new patch.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-04 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783633#comment-16783633
 ] 

Gabor Bota commented on HADOOP-15999:
-

It really was me: I was running the tests in my IDE with the setting:
{noformat}
  <property>
    <name>fs.s3a.s3guard.test.implementation</name>
    <value>local</value>
  </property>
{noformat}
Running the same test with *dynamo*, everything passes. 
It turned out the reason for the *NPE*s when using local was that we had the 
issue with the reference to the LocalMS again. When we rebuild the fs or build 
a new fs instance we have to set the same cache; with that, the NPEs are gone.

After fixing the NPEs the next issue is 
{{java.util.concurrent.ExecutionException: java.io.FileNotFoundException:}} - 
only for *local* again.
In {{expectExceptionWhenReadingOpenFileAPI}} when the following is called:
{code:java}
try (FSDataInputStream in = guardedFs.openFile(testFilePath).build().get()) {
  intercept(FileNotFoundException.class, () -> {
    byte[] bytes = new byte[text.length()];
    return in.read(bytes, 0, bytes.length);
  });
}
{code}
The *{{FSDataInputStream in = guardedFs.openFile(testFilePath).build().get()}}* 
throws *FNFE*, and that's even before it's expected. That means there's 
something going wrong when the openFile API is used. I don't have a clue right 
now why this would happen only when using local and not when using dynamo, but 
I need to figure it out.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-04 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783325#comment-16783325
 ] 

Steve Loughran commented on HADOOP-15999:
-

{code}
java.lang.NullPointerException
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.overwriteFileInListing(ITestS3GuardOutOfBandOperations.java:319)
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.testListingSameLengthOverwrite(ITestS3GuardOutOfBandOperations.java:211)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
You get to work out what is null at that line; maybe add an extra assert above 
it to help debug.





[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-04 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783322#comment-16783322
 ] 

Steve Loughran commented on HADOOP-15999:
-

{code}
java.util.concurrent.ExecutionException: java.io.FileNotFoundException: No such file or directory: s3a://cloudera-dev-gabor-ireland/OutOfBandDelete-abc0f4f4-741e-4e25-b6bf-3e60b180e6b4
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.expectExceptionWhenReadingOpenFileAPI(ITestS3GuardOutOfBandOperations.java:439)
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.expectExceptionWhenReading(ITestS3GuardOutOfBandOperations.java:427)
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.outOfBandDeletes(ITestS3GuardOutOfBandOperations.java:240)
    at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.testOutOfBandDeletes(ITestS3GuardOutOfBandOperations.java:206)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://cloudera-dev-gabor-ireland/OutOfBandDelete-abc0f4f4-741e-4e25-b6bf-3e60b180e6b4
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2526)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2420)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2331)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:863)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$18(S3AFileSystem.java:3764)
    at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$openFileWithOptions$19(S3AFileSystem.java:3763)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
{code}

Maybe it's an observed inconsistency. If the status check for a file fails the 
second time round, you'll have to think about using the {{s3guardInvoker}} we 
use for file reading to check this. It uses {{S3GuardExistsRetryPolicy}} to 
retry on FNFE. 

Until now we've assumed that if the entry is in DDB then we can open the file, 
and it's {{S3AInputStream.reopen()}} which gets to handle OOB deletion. Now 
it'll need to be done earlier on. Or not: if the file isn't there, carry on 
with the open and expect the read to handle it, at the specific place where it 
is needed.
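
A minimal sketch of the retry-on-FNFE idea, in plain Java with no Hadoop dependencies. {{FnfeRetrySketch}}, {{IoCall}} and {{retryOnFnfe}} are hypothetical names for illustration; this is not the {{s3guardInvoker}} or {{S3GuardExistsRetryPolicy}} code, only the general shape: retry a status probe a bounded number of times when it raises {{FileNotFoundException}}, let any other {{IOException}} through immediately, and rethrow the last FNFE if the file never shows up. The fixed sleep between attempts is also an assumption.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not the Hadoop Invoker or S3GuardExistsRetryPolicy. */
final class FnfeRetrySketch {

  /** An operation which may throw IOException, e.g. a status probe. */
  @FunctionalInterface
  interface IoCall<T> {
    T call() throws IOException;
  }

  /**
   * Run the operation, retrying only on FileNotFoundException.
   * Any other IOException propagates on the first failure.
   * After maxAttempts failures the last FNFE is rethrown.
   */
  static <T> T retryOnFnfe(int maxAttempts, long sleepMillis, IoCall<T> op)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    FileNotFoundException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (FileNotFoundException e) {
        last = e;
        if (attempt < maxAttempts) {
          // Fixed delay between attempts; a real policy might back off.
          Thread.sleep(sleepMillis);
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    // Simulated probe: not visible on the first call, visible on the second.
    AtomicInteger calls = new AtomicInteger();
    String status = retryOnFnfe(3, 10L, () -> {
      if (calls.incrementAndGet() < 2) {
        throw new FileNotFoundException(
            "No such file or directory: s3a://example-bucket/file");
      }
      return "found";
    });
    System.out.println(status + " after " + calls.get() + " attempts");
  }
}
```

Whether such a retry belongs in {{open()}} or is left to the read path is exactly the choice raised above; the sketch only shows what the earlier check would cost in extra probes.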






[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-01 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781817#comment-16781817
 ] 

Gabor Bota commented on HADOOP-15999:
-

Sure, here are the stacktraces:  
https://gist.github.com/bgaborg/4378fd13cf9ee8dab9475274a6dd251d




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-03-01 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781712#comment-16781712
 ] 

Steve Loughran commented on HADOOP-15999:
-

Other tests run parameterized fine, and this worked for me in IntelliJ. So if 
there is a problem here, it's not related to parameterization; it's more likely 
that JUnit runs under the IDE differ from those of the external runner. Can you 
put the stack traces up?




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-02-28 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780436#comment-16780436
 ] 

Gabor Bota commented on HADOOP-15999:
-

-1 on the latest (007) patch, because 8 of the 12 new tests fail in my IDE. 
CLI runs are clean, but I've had no problems running tests in my IDE like this 
before. I'll debug further and may provide another approach that doesn't use 
{{Parameterized.class}}.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-02-27 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779400#comment-16779400
 ] 

Gabor Bota commented on HADOOP-15999:
-

Downloaded the patch and looking into the tombstone problem.




[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

2019-02-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778758#comment-16778758
 ] 

Hadoop QA commented on HADOOP-15999:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 18s{color} | {color:orange} hadoop-tools/hadoop-aws: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 55s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 31s{color} | {color:green} hadoop-aws in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HADOOP-15999 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12960256/HADOOP-15999-007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 411e690ab4b1 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9192f71 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/15981/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/15981/testReport/ |
| Max. process+thread count | 341 (vs. ulimit of 1) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15981/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This