[ https://issues.apache.org/jira/browse/HADOOP-13449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mingliang Liu updated HADOOP-13449:
-----------------------------------
Attachment: HADOOP-13449-HADOOP-13345.005.patch
Thanks for the discussion, [~fabbri]. That's very helpful.
{quote}
for v1, you could always return authoritative = false.
{quote}
Yes, that is what the current patch does. Let's address this in a follow-up
JIRA after both [HADOOP-13651] and this one are committed.
{quote}
The interface allows any of these behaviors.... The filesystem is responsible
for ensuring that the delete to /a must be recursive since it is not empty.
MetadataStore explicitly does not do that.
{quote}
Agreed. For example, {{delete(path)}} does not check whether the directory at
that path is empty.
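To make the split of responsibilities concrete, here is a minimal sketch. It is
not code from the patch: the class and method names are hypothetical, an
in-memory sorted map stands in for the store, and an IllegalStateException
stands in for whatever exception the real filesystem would throw.

```java
import java.util.TreeMap;

/** Hypothetical in-memory stand-in; not the MetadataStore API from the patch. */
class DeleteContractSketch {
    // Maps an absolute path (no trailing slash) to "is this a directory?".
    private final TreeMap<String, Boolean> entries = new TreeMap<>();

    void put(String path, boolean isDirectory) {
        entries.put(path, isDirectory);
    }

    boolean contains(String path) {
        return entries.containsKey(path);
    }

    /** Store-level delete: removes exactly one entry and never looks at children. */
    void storeDelete(String path) {
        entries.remove(path);
    }

    /** True if any entry lives strictly below the given directory. */
    boolean hasChildren(String dir) {
        String prefix = dir + "/";
        String next = entries.ceilingKey(prefix);
        return next != null && next.startsWith(prefix);
    }

    /** Filesystem-level delete: this layer enforces the "recursive" contract. */
    boolean fsDelete(String path, boolean recursive) {
        if (!entries.containsKey(path)) {
            return false;
        }
        if (hasChildren(path)) {
            if (!recursive) {
                throw new IllegalStateException("Directory not empty: " + path);
            }
            // Remove the whole subtree before removing the directory itself.
            String prefix = path + "/";
            entries.subMap(prefix, prefix + Character.MAX_VALUE).clear();
        }
        storeDelete(path);
        return true;
    }
}
```

In this sketch, calling {{storeDelete("/a")}} while {{/a/b}} exists leaves
{{/a/b}} behind, which is why the caller has to go through the filesystem-level
check.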
{quote}
You either have to (A) pay money to store an extra copy of your metadata
forever, or (B) spend money and time hydrating the MetadataStore each time you
start a cluster.
{quote}
The metadata is expected to be small, and DDB storage is cheap compared with
the pricing of read/write operations. If I have to choose, (A) makes more
sense.
{quote}
and we don't assume everything is always in DynamoDB, it makes recovery much
easier
{quote}
That's a very valid point. Updating S3 and the MetadataStore is not atomic.
{quote}
The other concern is that I just don't understand why you would want to do the
preloading.
{quote}
You mean import? I suppose not. For reading and writing existing S3 buckets,
importing the structure first seems to be a prerequisite, unless we assume the
store discovers/converges fast or we accept little consistency.
I guess you mean the constraints on pre-creating parent directories. I re-read
the design doc and the [HADOOP-13651] patch, and I think you made a good point
about this. Let's have S3AFileSystem enforce the contract.
Moreover, I now think storing the is_empty bit in DynamoDB is not ideal.
Maintaining it takes non-trivial effort and it's easy to get wrong. Perhaps we
can query using the parent directory as the HASH key when we need this
information. This is not trivial either; I'll think about it as my next task.
We can either fix this in the next patch, or I'll work on a follow-up JIRA.
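A rough sketch of the idea, with an in-memory map standing in for the table.
The class, method names, and key layout are my assumptions for illustration,
not the schema in the patch: the parent directory plays the role of the HASH
key and the child name the role of the range key, so emptiness is derived from
a lookup rather than stored as a flag.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical in-memory model of a table keyed by (parent, child). */
class ParentKeyedIndexSketch {
    // Hash key: parent directory path. Range key: child name.
    private final Map<String, TreeSet<String>> table = new HashMap<>();

    // Assumes absolute paths without a trailing slash, and never "/" itself.
    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    static String nameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    void put(String path) {
        table.computeIfAbsent(parentOf(path), k -> new TreeSet<>()).add(nameOf(path));
    }

    void delete(String path) {
        String parent = parentOf(path);
        TreeSet<String> children = table.get(parent);
        if (children != null) {
            children.remove(nameOf(path));
            if (children.isEmpty()) {
                table.remove(parent);
            }
        }
    }

    /** No stored flag: a directory is empty if nothing is filed under its key. */
    boolean isEmptyDirectory(String dir) {
        TreeSet<String> children = table.get(dir);
        return children == null || children.isEmpty();
    }
}
```

Against a real table the analogue would be a query on the parent key that
stops at the first item, which costs a read each time but leaves no bit to keep
in sync on every put and delete.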
If this patch is still in question, a conference call would be very helpful.
Let's schedule one for next week. [[email protected]] is traveling this week.
[~eddyxu], do you have more comments now that I have revised the patch?
Thank you,
> S3Guard: Implement DynamoDBMetadataStore.
> -----------------------------------------
>
> Key: HADOOP-13449
> URL: https://issues.apache.org/jira/browse/HADOOP-13449
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Chris Nauroth
> Assignee: Mingliang Liu
> Attachments: HADOOP-13449-HADOOP-13345.000.patch,
> HADOOP-13449-HADOOP-13345.001.patch, HADOOP-13449-HADOOP-13345.002.patch,
> HADOOP-13449-HADOOP-13345.003.patch, HADOOP-13449-HADOOP-13345.004.patch,
> HADOOP-13449-HADOOP-13345.005.patch
>
>
> Provide an implementation of the metadata store backed by DynamoDB.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)