[
https://issues.apache.org/jira/browse/HADOOP-16184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827033#comment-16827033
]
Gabor Bota commented on HADOOP-16184:
-------------------------------------
Created a PR with 2 initial tests for the sequences defined in the issue description.
The first sequence fails as expected: 404 after the read.
The second one is different: it reads back a byte array of 0's!
{noformat}
(ITestS3GuardOutOfBandOperations.java:testTombstoneMarkerCORDR(338))
- first read: some test - read after delete:
{noformat}
If I log the array instead of the string, I get the following:
{noformat}
(ITestS3GuardOutOfBandOperations.java:testTombstoneMarkerCORDR(338)) -
first read: [115, 111, 109, 101, 32, 116, 101, 115, 116] - read after delete:
[0, 0, 0, 0, 0, 0, 0, 0, 0]
{noformat}
But we should have failed fast, right? This should be handled instead of
silently reading back garbage.
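The zero-filled buffer is consistent with standard java.io.InputStream semantics: read(byte[]) on an exhausted stream returns -1 and writes nothing, so a freshly allocated (zero-initialized) array stays all zeros if the return value is ignored. A minimal, self-contained sketch of that mechanism (illustrative only; this is not the actual test or S3AInputStream code):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;

public class ZeroReadDemo {
    public static void main(String[] args) throws Exception {
        // An already-exhausted stream stands in for the deleted
        // object, where the GET yields no data.
        InputStream emptyStream = new ByteArrayInputStream(new byte[0]);

        // Freshly allocated Java arrays are zero-initialized.
        byte[] buffer = new byte[9];

        // read() returns -1 (EOF) and leaves the buffer untouched;
        // a caller that ignores the return value sees all zeros --
        // matching the [0, 0, ...] array in the log above.
        int bytesRead = emptyStream.read(buffer);

        System.out.println("bytesRead = " + bytesRead);        // -1
        System.out.println("buffer    = " + Arrays.toString(buffer));
    }
}
```

This is why failing fast on the FNFE matters: without it, the caller cannot distinguish "9 bytes of zeros" from "nothing was read".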
> S3Guard: Handle OOB deletions and creation of a file which has a tombstone
> marker
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-16184
> URL: https://issues.apache.org/jira/browse/HADOOP-16184
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.0
> Reporter: Gabor Bota
> Assignee: Gabor Bota
> Priority: Major
>
> When a file is deleted in S3 using S3Guard, a tombstone marker is added
> for that file in the MetadataStore. If another process then creates the file
> without using S3Guard (as an out-of-band operation, OOB), the file will still
> not be visible to the client using S3Guard because of the deletion tombstone.
> ----
> The whole of S3Guard is potentially brittle to:
> * OOB deletions: we skip them in HADOOP-15999, so this is no worse than
> before; but because the S3AInputStream retries on FNFE to "debounce" cached
> 404s, it is potentially going to retry forever.
> * OOB creation of a file which has a deletion tombstone marker.
> The things this issue covers:
> * Write a test to simulate that deletion problem and see what happens. We
> ought to have the S3AInputStream retry briefly on that initial GET failing,
> but only on that initial one (after setting "fs.s3a.retry.limit" to
> something low and the retry interval down to ~10ms so the test fails fast).
> * Sequences
> {noformat}
> 1. create; delete; open; read -> fail after retry
> 2. create; open, read, delete, read -> fail fast on the second read
> {noformat}
> The filesystem's IGNORED_ERRORS statistic (in its StoreStatistics) is
> incremented on each ignored error, so it will have increased after sequence 1
> but not after sequence 2. If either of these tests doesn't quite fail as
> expected, we can disable it and continue; at least we then have tests that
> simulate a condition we don't yet have a fix for.
> * For both, we just need some model of how long it takes for debouncing to
> stabilize. Then, in this new check, if an FNFE is raised and the check
> happens at a time > (modtime + debounce-delay), it's a real FNFE.
> This issue is created based on [[email protected]] remarks and comments on
> HADOOP-15999.
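The modtime-based check described above could be sketched as follows. This is a hypothetical helper with illustrative names, not the actual S3AInputStream retry logic; it only shows the time comparison the description proposes:

```java
public class DebounceCheck {

    /**
     * Decide whether a 404/FNFE should be treated as a possibly-cached 404
     * (retry) or as a real missing file (fail fast).
     * Hypothetical helper; parameter names are illustrative only.
     *
     * @param modTimeMillis   last-modified time recorded for the entry
     * @param debounceDelayMs how long cached 404s are assumed to persist
     * @param nowMillis       current clock time
     * @return true if the FNFE should be considered real
     */
    public static boolean isRealFileNotFound(long modTimeMillis,
                                             long debounceDelayMs,
                                             long nowMillis) {
        // Past (modtime + debounce-delay), the 404 can no longer be
        // explained by eventual consistency: treat it as a real FNFE.
        return nowMillis > modTimeMillis + debounceDelayMs;
    }

    public static void main(String[] args) {
        long modTime = 1_000L;
        long delay = 500L;

        // Within the debounce window -> retry.
        System.out.println(isRealFileNotFound(modTime, delay, 1_200L)); // false
        // Past the window -> real FNFE, fail fast.
        System.out.println(isRealFileNotFound(modTime, delay, 2_000L)); // true
    }
}
```

The open question the description raises is exactly how to pick the debounce-delay value, i.e. the "model of how long it takes for debouncing to stabilize".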
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)