[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076839#comment-16076839
 ] 

Steve Loughran commented on HADOOP-14467:
-----------------------------------------

Without S3Guard, this can arise if:

# a file is deleted between open() returning a reference and the first 
read() call. 
# a file is deleted during a read sequence and a new partial GET of the file 
is made (new seek, or new block in the fadvise=random mode).
# a file is deleted while a sequential GET is already in progress and a 
subsequent read() causes this to surface (open question: how does it 
surface?). If it surfaces as a read error, we'll try to re-open the 
connection, which should escalate it to condition (2).

We can certainly write tests for the first two of these; the third one 
probably depends on buffer settings in the infrastructure (or indeed, could be 
used to determine what those buffer sizes are).
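
As a rough illustration of how the first two conditions could be tested, here 
is a minimal sketch against an in-memory stand-in for the object store. 
{{FakeStore}} and {{LazyStream}} are hypothetical names invented for this 
example; they are not S3A classes, and the real stream is far more involved. 
The only assumption carried over from above is that open() issues no GET and 
that each new block triggers a fresh ranged GET.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory object store, for illustration only. */
class FakeStore {
  private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

  void put(String key, byte[] data) { blobs.put(key, data); }

  void delete(String key) { blobs.remove(key); }

  /** Ranged GET of bytes [start, end); fails if the key no longer exists. */
  byte[] get(String key, int start, int end) throws FileNotFoundException {
    byte[] data = blobs.get(key);
    if (data == null) {
      throw new FileNotFoundException("No such key: " + key);
    }
    return Arrays.copyOfRange(data, start, Math.min(end, data.length));
  }
}

/** Hypothetical stream that fetches one block per ranged GET. */
class LazyStream {
  private final FakeStore store;
  private final String key;
  private final int blockSize;
  private byte[] block = new byte[0];
  private int blockStart = 0;
  private int pos = 0;

  /** "open()": records the reference only, no GET is issued yet. */
  LazyStream(FakeStore store, String key, int blockSize) {
    this.store = store;
    this.key = key;
    this.blockSize = blockSize;
  }

  int read() throws IOException {
    if (pos >= blockStart + block.length) {
      // Current block exhausted: issue a new partial GET.
      block = store.get(key, pos, pos + blockSize);
      blockStart = pos;
      if (block.length == 0) {
        return -1; // end of file
      }
    }
    return block[pos++ - blockStart] & 0xff;
  }
}
```

With a block size of 2, deleting the key right after constructing the stream 
makes the first read() throw FileNotFoundException (condition 1). Deleting it 
after one byte has been read lets the next read() succeed from the buffered 
block, and the failure only appears when a new block is requested (condition 
2), which also hints at why the third case depends on buffer sizes.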

S3Guard adds a new failure mode: the file is in DynamoDB (DDB), but not in the 
FS. This will surface in a similar way to condition (1) above.

Maybe that's something the FS itself should be made aware of, via a metric or 
a callback. {{S3AInputStream}} already increments some statistics, but it 
could also invoke a callback on the S3A FS to say "we've got a failure on 
read #0 of blob s3a://bucket/file1", which could then trigger other actions 
if the FS is using S3Guard. A callback could also be considered for the case 
where the first read triggers an EOF, as that could be a sign that the file 
length is not what DDB thinks it is.
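
To make the callback idea concrete, here is a minimal sketch of one possible 
shape for it. Everything in it is an assumption made for illustration: 
{{ReadFailureCallback}}, {{BlobSource}} and {{CallbackInputStream}} are 
invented names, not existing S3A interfaces, and the stream is a simplified 
stand-in rather than {{S3AInputStream}} itself.

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical callback the owning FS would implement. */
interface ReadFailureCallback {
  /**
   * @param path      blob being read, e.g. s3a://bucket/file1
   * @param readCount number of reads completed before the failure (0 = first)
   * @param cause     the FileNotFoundException or EOFException observed
   */
  void onReadFailure(String path, long readCount, IOException cause);
}

/** Hypothetical byte source standing in for the S3 GET. */
interface BlobSource {
  int read() throws IOException;
}

/** Simplified stream that reports suspicious failures to its owner. */
class CallbackInputStream {
  private final String path;
  private final BlobSource source;
  private final ReadFailureCallback callback;
  private long readCount = 0;

  CallbackInputStream(String path, BlobSource source,
      ReadFailureCallback callback) {
    this.path = path;
    this.source = source;
    this.callback = callback;
  }

  int read() throws IOException {
    try {
      int b = source.read();
      if (b < 0 && readCount == 0) {
        // EOF on the very first read: the stored length may be stale.
        callback.onReadFailure(path, readCount,
            new EOFException("EOF on first read of " + path));
      }
      readCount++;
      return b;
    } catch (FileNotFoundException e) {
      // Tell the owner which read failed, then rethrow to the caller.
      callback.onReadFailure(path, readCount, e);
      throw e;
    }
  }
}
```

Passing the read count lets the FS decide for itself whether a failure on 
read #0 deserves different handling (for example, when S3Guard is enabled) 
from a failure later in the stream.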

> S3Guard: Improve FNFE message when opening a stream
> ---------------------------------------------------
>
>                 Key: HADOOP-14467
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14467
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>            Priority: Minor
>
> Following up on the [discussion on 
> HADOOP-13345|https://issues.apache.org/jira/browse/HADOOP-13345?focusedCommentId=16030050&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16030050],
>  because S3Guard can serve getFileStatus() from the MetadataStore without 
> doing a HEAD on S3, a FileNotFound error on a file due to S3 GET 
> inconsistency does not happen on open(), but on the first read of the stream. 
>  We may add retries to the S3 client in the future, but for now we should 
> have an exception message that indicates this may be due to inconsistency 
> (assuming it isn't a more straightforward case like someone deleting the 
> object out from under you).
> This is expected to be a rare case, since the S3 service is now mostly 
> consistent for GET.


