[jira] [Updated] (HADOOP-15621) s3guard: implement time-based (TTL) expiry for DynamoDB Metadata Store

Gabor Bota (JIRA) Sun, 02 Sep 2018 02:03:46 -0700


     [ https://issues.apache.org/jira/browse/HADOOP-15621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Gabor Bota updated HADOOP-15621:
--------------------------------
    Attachment: HADOOP-15621.001.patch

> s3guard: implement time-based (TTL) expiry for DynamoDB Metadata Store
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-15621
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15621
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Aaron Fabbri
>            Assignee: Gabor Bota
>            Priority: Minor
>         Attachments: HADOOP-15621.001.patch
>
>
> Similar to HADOOP-13649, I think we should add a TTL (time to live) feature 
> to the Dynamo metadata store (MS) for S3Guard.
> Think of this as the "online algorithm" version of the CLI prune() function, 
> which is the "offline algorithm".
> Why: 
> 1. Self healing (soft state): since we do not implement transactions around 
> modification of the two systems (s3 and metadata store), certain failures can 
> lead to inconsistency between S3 and the metadata store (MS) state.  Having a 
> time to live (TTL) on each entry in S3Guard means that any inconsistencies 
> will be time bound.  Thus "wait and restart your job" becomes a valid, if 
> ugly, way to get around any issues with FS client failure leaving things in a 
> bad state.
> 2. We could make manual invocation of `hadoop s3guard prune ...` unnecessary, 
> depending on the implementation.
> 3. Makes it possible to fix the problem that dynamo MS prune() doesn't prune 
> directories due to the lack of true modification time.
> How:
> I think we need a new column in the dynamo table "entry last written time".  
> This is updated each time the entry is written to dynamo.
> After that we can either
> 1. Have the client simply ignore / elide any entries that are older than the 
> configured TTL.
> 2. Have the client delete entries older than the TTL.
> The issue with #2 is that it will increase latency if done inline in the 
> context of an FS operation. We could mitigate this somewhat by using an async 
> helper thread, or by doing it probabilistically (only on some operations) to 
> amortize the expense of deleting stale entries (which also allows some 
> batching).
> Caveats:
> - Clock synchronization as usual is a concern. Many clusters already keep 
> clocks close enough via NTP. We should at least document the requirement 
> along with the configuration knob that enables the feature.
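
The two options in the "How" section can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not the attached patch and not S3Guard or DynamoDB code: the class `TtlMetadataStore`, its field names, and the `delete_probability` knob are all hypothetical, and a plain dict stands in for the Dynamo table. It assumes every write stamps an "entry last written time", that reads elide entries older than the TTL (option 1), and that only a fraction of reads pay for a batched delete of stale entries (option 2, amortized).

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    """One metadata-store row; last_written models the proposed new column."""
    path: str
    last_written: float  # seconds since epoch, refreshed on every write


@dataclass
class TtlMetadataStore:
    """Hypothetical in-memory stand-in for the Dynamo table (illustration only)."""
    ttl_seconds: float
    delete_probability: float = 0.1          # fraction of reads that also clean up
    clock: Callable[[], float] = time.time   # injectable so expiry can be tested
    rng: Callable[[], float] = random.random
    rows: Dict[str, Entry] = field(default_factory=dict)

    def put(self, path: str) -> None:
        # Each write updates the "entry last written time".
        self.rows[path] = Entry(path, self.clock())

    def is_expired(self, entry: Entry) -> bool:
        return self.clock() - entry.last_written > self.ttl_seconds

    def get(self, path: str) -> Optional[Entry]:
        # Option 1: treat stale entries as absent, so the caller falls back to S3.
        entry = self.rows.get(path)
        result = None if entry is None or self.is_expired(entry) else entry
        # Option 2, amortized: only some reads trigger a batched delete.
        if self.rng() < self.delete_probability:
            self.delete_expired()
        return result

    def delete_expired(self) -> List[str]:
        # Collect stale paths first, then delete them as one batch.
        stale = [p for p, e in self.rows.items() if self.is_expired(e)]
        for p in stale:
            del self.rows[p]
        return stale
```

Because expiry compares the writer's timestamp with the reader's clock, this sketch inherits the clock-synchronization caveat above: skew between clients effectively shortens or lengthens the TTL by the amount of the skew.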



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

