[
https://issues.apache.org/jira/browse/HADOOP-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-16725.
-------------------------------------
Resolution: Invalid
OK, my test is showing that this is not as bad as I thought.
Prior to HADOOP-16697, PUT operations on an S3Guard table update all parent
dir entries, at least for the active operation, so single file writes
always have a complete tree of parents, all of the same age.
* BulkOperationState reduces the parent writes so that on a bulk rename/commit
a parent will only ever be written once. Thus, for any big directory rename,
the parent dirs will be older than the child.
* the resetting of the auth mode of a dir in a prune will recreate the
parent entry (with a timestamp == 0!), which will then rebuild all parent dirs.
Therefore: directories with pruned children are recreated; only dirs without
children would be pruned.
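To make the parent-write behaviour concrete, here is a minimal toy model of it.
This is a sketch only, not the S3Guard code: ToyStore, put_file, written_at and
the set used as the bulk state are all hypothetical names. It assumes a store
keyed by path where every file write also writes its ancestor dirs, and where a
shared bulk state suppresses repeat writes of the same ancestor.

```python
import posixpath


def ancestors(path):
    """Yield each parent directory of path, nearest first, excluding the root."""
    parent = posixpath.dirname(path)
    while parent not in ("", "/"):
        yield parent
        parent = posixpath.dirname(parent)


class ToyStore:
    """Hypothetical in-memory stand-in for a metadata table."""

    def __init__(self):
        # path -> {"is_dir": bool, "written_at": int}
        self.entries = {}

    def put_file(self, path, now, bulk_state=None):
        """Write a file entry and its ancestor dirs.

        bulk_state, if given, is a set of ancestors already written during
        the current bulk operation; those ancestors are not rewritten.
        """
        self.entries[path] = {"is_dir": False, "written_at": now}
        for parent in ancestors(path):
            if bulk_state is not None:
                if parent in bulk_state:
                    continue
                bulk_state.add(parent)
            self.entries[parent] = {"is_dir": True, "written_at": now}


# Single writes: every ancestor gets the same age as the file.
single = ToyStore()
single.put_file("/a/b/f1", now=100)

# Bulk operation: ancestors are written once, so later children are newer.
bulk = ToyStore()
state = set()
bulk.put_file("/a/b/f1", now=100, bulk_state=state)
bulk.put_file("/a/b/f2", now=200, bulk_state=state)
```

In the bulk case the entry for /a/b keeps the age of the first write while f2
is newer, which is the "parent older than child" situation described above.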
Even given that, I'm unable to recreate the failure. Why not?
* We're filtering on modtime, but directories don't have a modtime field, so
there was an implicit "is_dir=false" filter.
Therefore: nothing to worry about after all. The logic was there, just hidden.
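The hidden filter can be illustrated with a small sketch. It is not the actual
prune query: the entries dict, the mod_time key and the expired() helper are
hypothetical, and it assumes that a comparison against a missing field never
matches, which is the behaviour described above.

```python
def expired(entries, cutoff):
    """Return the paths whose mod_time is older than cutoff.

    Entries with no mod_time field can never satisfy the comparison,
    so they are skipped without any explicit is_dir check.
    """
    return sorted(
        path
        for path, item in entries.items()
        if "mod_time" in item and item["mod_time"] < cutoff
    )


entries = {
    "/a": {"is_dir": True},                       # dir: no mod_time field
    "/a/old": {"is_dir": False, "mod_time": 10},  # file older than cutoff
    "/a/new": {"is_dir": False, "mod_time": 500}, # file newer than cutoff
}

to_prune = expired(entries, cutoff=100)
```

Here to_prune contains only the old file; the directory is never a candidate
even though nothing in the filter mentions is_dir.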
I'm going to merge the work done here (new test, extended SELECT) into the
HADOOP-16697 work; just putting it up as a separate PR for completeness.
> s3guard prune can delete directories -leaving orphan children.
> --------------------------------------------------------------
>
> Key: HADOOP-16725
> URL: https://issues.apache.org/jira/browse/HADOOP-16725
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.0.3, 3.2.1, 3.1.3
> Reporter: Steve Loughran
> Priority: Critical
>
> When s3guard prune is invoked to delete entries not updated since a specific
> time, it doesn't check to see if an expired directory entry has any children.
> As a result, if a child is newer than the cut-off date, the dir entry can be
> removed but not the child. This can leave S3Guard in an inconsistent state.