[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-08-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120327#comment-16120327
 ] 

Steve Loughran commented on HADOOP-14467:
-

Let's not worry about it for now.

> S3Guard: Improve FNFE message when opening a stream
> ---
>
> Key: HADOOP-14467
> URL: https://issues.apache.org/jira/browse/HADOOP-14467
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Aaron Fabbri
> Priority: Minor
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14467-HADOOP-13345.001.patch
>
>
> Following up on the [discussion on 
> HADOOP-13345|https://issues.apache.org/jira/browse/HADOOP-13345?focusedCommentId=16030050&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16030050],
>  because S3Guard can serve getFileStatus() from the MetadataStore without 
> doing a HEAD on S3, a FileNotFound error on a file due to S3 GET 
> inconsistency does not happen on open(), but on the first read of the stream. 
>  We may add retries to the S3 client in the future, but for now we should 
> have an exception message that indicates this may be due to inconsistency 
> (assuming it isn't a more straightforward case like someone deleting the 
> object out from under you).
> This is expected to be a rare case, since the S3 service is now mostly 
> consistent for GET.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-08-09 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120168#comment-16120168
 ] 

Aaron Fabbri commented on HADOOP-14467:
---

I didn't find a nice clean way to add a new exception message that I liked 
here.  At least now folks can google it.  I feel like we could make more 
improvements as part of HADOOP-14735:  we could report existence in S3 (if we 
checked), Metadata Store, etc.




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-08-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120153#comment-16120153
 ] 

Steve Loughran commented on HADOOP-14467:
-

I should add: do we actually want a follow-up task here?




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-07-15 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088605#comment-16088605
 ] 

Steve Loughran commented on HADOOP-14467:
-

Ask on common-dev.




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-07-14 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088268#comment-16088268
 ] 

Aaron Fabbri commented on HADOOP-14467:
---

I've created a simple integration test that exercises this.

[~ste...@apache.org] do you know who can grant me write access to the Hadoop 
Confluence wiki?




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-07-06 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076839#comment-16076839
 ] 

Steve Loughran commented on HADOOP-14467:
-

Without S3Guard, this can arise if:

# a file is deleted between open() returning a reference and the first 
read() call;
# a file is deleted during a read sequence and a new partial GET of the file is 
made (new seek, or new block in the fadvise=random mode);
# a file is deleted while a sequential GET is already in progress and 
a subsequent read() causes this to surface (issue: how does it surface?). If 
it's a read error we'll try to re-open the connection, which should escalate 
it to condition (2).

We can certainly write tests for the first two of these; the final one is 
probably driven by buffer settings in the infrastructure (or indeed, could be 
used to determine what those buffer sizes are).

S3Guard adds a new failure: the file is in the DDB, but not in the FS. This will 
surface as a similar situation to #1 above.

Maybe that's something which the FS itself should be made aware of, in a metric 
or callback. There's some incrementing of statistics in the {{S3AInputStream}}, 
but it could actually invoke some callback on the S3A FS to say "we've got a 
failure on read #0 of blob s3a://bucket/file1", which can then trigger other 
actions if the FS is using S3Guard. We could also think about a callback if the 
first read triggered an EOF as well, as that could be a sign of the file length 
not being what DDB thinks it is.
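
A rough sketch of the callback idea, to make the discussion concrete. Everything here is hypothetical: {{ReadFailureListener}}, {{ObjectReader}} and {{CallbackStream}} are illustration-only names, not existing S3A classes, and the real {{S3AInputStream}} is considerably more involved. The stream counts completed reads and tells its owner which read number failed, so the owner can treat a failure on read #0 differently. The first-read EOF case is not covered.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical callback the filesystem would implement; not an S3A API. */
interface ReadFailureListener {
  void readFailed(String path, long readIndex, IOException cause);
}

/** Hypothetical stand-in for the code that actually issues the GET. */
interface ObjectReader {
  int read(String path) throws IOException;
}

/** Minimal stream-like wrapper that reports read failures to its owner. */
class CallbackStream {
  private final String path;
  private final ObjectReader reader;
  private final ReadFailureListener listener;
  private long readsCompleted = 0;

  CallbackStream(String path, ObjectReader reader,
      ReadFailureListener listener) {
    this.path = path;
    this.reader = reader;
    this.listener = listener;
  }

  int read() throws IOException {
    try {
      int b = reader.read(path);
      readsCompleted++;
      return b;
    } catch (IOException e) {
      // readsCompleted == 0 means the very first read failed, which is
      // the case that may point at metadata store / S3 inconsistency.
      listener.readFailed(path, readsCompleted, e);
      throw e;
    }
  }
}

class FirstReadCallbackDemo {
  public static void main(String[] args) {
    CallbackStream in = new CallbackStream("s3a://bucket/file1",
        p -> { throw new FileNotFoundException(p); },
        (p, idx, e) -> System.out.println(
            "failure on read #" + idx + " of blob " + p));
    try {
      in.read();
    } catch (IOException expected) {
      // The listener has already been told; the caller still sees the error.
    }
  }
}
```

The exception is rethrown unchanged, so callers see the same behaviour as before; the callback only gives the FS a hook for metrics or follow-up checks.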




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-06-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032925#comment-16032925
 ] 

Steve Loughran commented on HADOOP-14467:
-

bq. Should we have a top-level S3A cwiki page that the site docs link to? 

Can do; the nice thing there is that we can update the wiki as we go along.

bq. When should things live in the site docs versus wiki?

The wiki has proven useful for a consistent URL across versions, whereas a URL 
in the code to a docs ref would be a maintenance mess. And we get to update it 
on a whim. The big limitation is the need to be cross-version in what you 
explain; for things like network errors this is trivial: "ConnectionRefused" 
never changes. Here we may need to say "if using s3guard". 

I've proposed doing a wiki link on BadAuth, incidentally; we should probably do 
one for 301 Moved, as that usually means "wrong endpoint".






[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-06-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032926#comment-16032926
 ] 

Steve Loughran commented on HADOOP-14467:
-

Regarding state: it would be easy to add an "opened count" field to the stream 
(if we aren't already tracking that), and treat an open failure when count==0 
as different from the rest.
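
Something like the following, as a minimal sketch only. {{ObjectOpener}} and {{CountingStream}} are made-up names for illustration and the message wording is a placeholder; none of this is the actual S3A code or the text of the attached patch. The counter is bumped on every successful (re)open, and a FileNotFoundException is rewrapped with a message that depends on whether the stream was ever opened.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical stand-in for whatever issues the GET; not a real S3A type. */
interface ObjectOpener {
  void open(String key) throws IOException;
}

/** Stream-like class that tracks how many times it has opened the object. */
class CountingStream {
  private final String key;
  private final ObjectOpener opener;
  private int openedCount = 0;

  CountingStream(String key, ObjectOpener opener) {
    this.key = key;
    this.opener = opener;
  }

  /** Open or re-open the object, classifying a not-found failure. */
  void reopen() throws IOException {
    try {
      opener.open(key);
      openedCount++;
    } catch (FileNotFoundException e) {
      // count == 0: the object was never readable through this stream.
      // count > 0: it was there earlier, so it was probably deleted mid-read.
      String detail = openedCount == 0
          ? "never opened: the object may have been deleted, or the"
              + " metadata store may be inconsistent with S3"
          : "opened " + openedCount + " time(s) before: the object was"
              + " probably deleted during the read";
      FileNotFoundException fnfe =
          new FileNotFoundException(key + ": " + detail);
      fnfe.initCause(e);
      throw fnfe;
    }
  }

  int getOpenedCount() {
    return openedCount;
  }
}
```

The original exception is kept as the cause, so nothing is lost from the stack trace.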




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-05-31 Thread Aaron Fabbri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032075#comment-16032075
 ] 

Aaron Fabbri commented on HADOOP-14467:
---

{quote}I think we should add a wiki entry pointing to a (new) hadoop cwiki 
entry.
{quote}
I'm happy to do that.  Couple of noob questions: Should we have a top-level S3A 
cwiki page that the site docs link to?  When should things live in the site 
docs versus wiki?

{quote}
maybe also the input stream could differentiate "never opened the file" 
from "opened, it was there, now it isn't", which can happen if someone deletes 
it during the read process
{quote}
Makes sense.  I'll look at the code and think about this.  Usually I'd handle 
this by passing around a request context object that keeps bits of state 
associated with an operation.  




[jira] [Commented] (HADOOP-14467) S3Guard: Improve FNFE message when opening a stream

2017-05-31 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031008#comment-16031008
 ] 

Steve Loughran commented on HADOOP-14467:
-

The likely root cause would be that DDB is inconsistent with the FS.

# I think we should add a wiki entry pointing to a (new) hadoop cwiki entry. I 
say cwiki as the older wiki may be phased out at some point.
# Maybe also the input stream could differentiate "never opened the file" 
from "opened, it was there, now it isn't", which can happen if someone deletes 
it during the read process.
