[jira] [Commented] (HADOOP-15999) S3Guard: Better support for out-of-band operations

Steve Loughran (JIRA) Mon, 04 Mar 2019 04:14:35 -0800


    [ 
https://issues.apache.org/jira/browse/HADOOP-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783322#comment-16783322
 ]


Steve Loughran commented on HADOOP-15999:
-----------------------------------------

{code}
java.util.concurrent.ExecutionException: java.io.FileNotFoundException: No such file or directory: s3a://cloudera-dev-gabor-ireland/OutOfBandDelete-abc0f4f4-741e-4e25-b6bf-3e60b180e6b4
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.expectExceptionWhenReadingOpenFileAPI(ITestS3GuardOutOfBandOperations.java:439)
        at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.expectExceptionWhenReading(ITestS3GuardOutOfBandOperations.java:427)
        at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.outOfBandDeletes(ITestS3GuardOutOfBandOperations.java:240)
        at org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.testOutOfBandDeletes(ITestS3GuardOutOfBandOperations.java:206)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: No such file or directory: s3a://cloudera-dev-gabor-ireland/OutOfBandDelete-abc0f4f4-741e-4e25-b6bf-3e60b180e6b4
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2526)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2420)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2331)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:863)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$18(S3AFileSystem.java:3764)
        at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$openFileWithOptions$19(S3AFileSystem.java:3763)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        ... 1 more
{code}

This may be an observed inconsistency. If the file status check fails the
second time round, consider doing the check through the {{s3guardInvoker}} we
use for file reading. It uses {{S3GuardExistsRetryPolicy}} to retry on FNFE.

Until now we've assumed that if the entry is in DDB then we can open the file,
and it's {{S3AInputStream.reopen()}} which gets to handle OOB deletion. Now
it'll need to be done earlier on. Or: not. If the file isn't there, carry on
with the open and expect the read to handle it, at the specific place where it
is needed.
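
As a rough illustration of the retry-on-FNFE idea, here is a minimal, self-contained sketch. It deliberately does not use Hadoop's invoker or {{S3GuardExistsRetryPolicy}}; the names {{FnfeRetry}}, {{IOCall}} and {{retryOnFnfe}}, the fixed sleep interval and the attempt limit are all assumptions made for the example, not the S3A implementation.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical helper: retries an operation only when it raises FNFE. */
final class FnfeRetry {

    /** An operation that may fail with an IOException, e.g. a status probe. */
    @FunctionalInterface
    interface IOCall<T> {
        T call() throws IOException;
    }

    private FnfeRetry() {
    }

    /**
     * Invoke the operation up to maxAttempts times.
     * A FileNotFoundException triggers a sleep and another attempt;
     * any other IOException propagates immediately.
     * If every attempt raises FNFE, the last one is rethrown.
     */
    static <T> T retryOnFnfe(IOCall<T> op, int maxAttempts, long sleepMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (FileNotFoundException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Fixed interval for simplicity (an assumption);
                        // a real policy might back off between attempts.
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new InterruptedIOException("interrupted while retrying");
                    }
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap the status probe in the lambda. If the file really has been deleted out of band, the final FNFE still surfaces, only later.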





> S3Guard: Better support for out-of-band operations
> --------------------------------------------------
>
>                 Key: HADOOP-15999
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15999
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Sean Mackrory
>            Assignee: Gabor Bota
>            Priority: Major
>         Attachments: HADOOP-15999-007.patch, HADOOP-15999.001.patch, 
> HADOOP-15999.002.patch, HADOOP-15999.003.patch, HADOOP-15999.004.patch, 
> HADOOP-15999.005.patch, HADOOP-15999.006.patch, out-of-band-operations.patch
>
>
> S3Guard was initially done on the premise that a new MetadataStore would be 
> the source of truth, and that it wouldn't provide guarantees if updates were 
> done without using S3Guard.
> I've been seeing increased demand for better support for scenarios where 
> operations are done on the data that can't reasonably be done with S3Guard 
> involved. For example:
> * A file is deleted using S3Guard, and replaced by some other tool. S3Guard 
> can't tell the difference between the new file and delete / list 
> inconsistency and continues to treat the file as deleted.
> * An S3Guard-ed file is overwritten by a longer file by some other tool. When 
> reading the file, only the length of the original file is read.
> We could possibly have smarter behavior here by querying both S3 and the 
> MetadataStore (even in cases where we may currently only query the 
> MetadataStore in getFileStatus) and use whichever one has the higher modified 
> time.
> This kills the performance boost we currently get in some workloads with the 
> short-circuited getFileStatus, but we could keep it with authoritative mode 
> which should give a larger performance boost. At least we'd get more 
> correctness without authoritative mode and a clear declaration of when we can 
> make the assumptions required to short-circuit the process. If we can't 
> consider S3Guard the source of truth, we need to defer to S3 more.
> We'd need to be extra sure of any locality / time zone issues if we start 
> relying on mod_time more directly, but currently we're tracking the 
> modification time as returned by S3 anyway.
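
The timestamp comparison proposed in the description above (query both S3 and the MetadataStore, then use whichever has the higher modified time) could be sketched roughly as follows. This is only an illustration: {{NewestStatus}}, {{Entry}} and {{reconcile}} are hypothetical names, and both the tie-break in favour of S3 and the handling of an entry missing on one side are assumptions the issue does not specify. It also ignores delete markers in the MetadataStore.

```java
import java.util.Optional;

/** Hypothetical sketch: choose between two views of the same path. */
final class NewestStatus {

    /** Minimal stand-in for a file status: path, length, modification time. */
    static final class Entry {
        final String path;
        final long length;
        final long modTime;

        Entry(String path, long length, long modTime) {
            this.path = path;
            this.length = length;
            this.modTime = modTime;
        }
    }

    private NewestStatus() {
    }

    /**
     * Return the entry with the higher modification time.
     * Assumptions: if only one side has an entry, that entry is used;
     * on equal timestamps the S3 view wins.
     */
    static Optional<Entry> reconcile(Optional<Entry> fromStore,
                                     Optional<Entry> fromS3) {
        if (!fromStore.isPresent()) {
            return fromS3;
        }
        if (!fromS3.isPresent()) {
            return fromStore;
        }
        return fromS3.get().modTime >= fromStore.get().modTime
                ? fromS3
                : fromStore;
    }
}
```

With this rule, a file overwritten out of band by a longer file would be reported with the S3 length, as long as the modification time S3 returns is later than the one the MetadataStore recorded.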



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

