[ 
https://issues.apache.org/jira/browse/HDFS-12000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16077462#comment-16077462
 ] 

Weiwei Yang commented on HDFS-12000:
------------------------------------

Hi [~linyiqun]

Thank you for reviewing this. 

bq.  A better way is that only when we detect the change between current key 
data and uploaded file, then we put this file.

Do you mean that if a user puts the same file twice to the same key, we do not 
store the file twice and just point each key version at the same file? That 
sounds like a good idea, but how can we tell whether two files are the same? 
Even if we have a neat way to do that, it would have to run on every PUT 
operation and compare against every previous version, which is a lot of 
overhead. I think we can live with an assumption for versioning: users 
generally do not upload the same object multiple times to a single key; they 
upload different versions. Similarly, here is what the S3 documentation says: 
_Each version of an object is the entire object; it is not just a diff from 
the previous version. Thus, if you have three versions of an object stored, 
you are charged for three objects._
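
For illustration only, here is a minimal sketch of what such a check could 
look like, assuming a content digest (SHA-256 here) is kept for each stored 
object. {{VersionedStore}}, {{put}} and the in-memory dicts are hypothetical 
names, not Ozone or KSM APIs. The comparison against earlier versions becomes 
a dictionary lookup, but every PUT still has to hash the whole upload, which 
is the overhead mentioned above.

```python
import hashlib


class VersionedStore:
    """Toy in-memory key store; hypothetical, not the Ozone/KSM API."""

    def __init__(self):
        self.blobs = {}      # digest -> object bytes, stored once
        self.versions = {}   # key -> list of digests, oldest first

    def put(self, key, data):
        # Hashing reads the entire upload on every PUT.
        digest = hashlib.sha256(data).hexdigest()
        is_new_blob = digest not in self.blobs
        if is_new_blob:
            self.blobs[digest] = data
        # A new version is always recorded; identical content only
        # adds a reference to the blob that is already stored.
        self.versions.setdefault(key, []).append(digest)
        return is_new_blob


store = VersionedStore()
first = store.put("bucket/key1", b"hello")
second = store.put("bucket/key1", b"hello")     # same content again
third = store.put("bucket/key1", b"hello v2")   # different content
```

In this sketch the second PUT adds a version entry without storing a second 
copy of the bytes.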

bq. I think it will still be a hard work for KSM to delete all the version 
keys/files even if we use a async way.

True, this is the missing piece right now in Ozone. I have added some comments 
in HDFS-11922, please take a look. I think we need to support async deletion 
of keys; once that is implemented, it will be easy to support deleting any 
version of an object from KSM.
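
A rough sketch of the async delete idea, under assumptions: the delete call 
only hides the entry and queues it, and a separate background step reclaims 
queued entries in bounded batches. {{KeyManager}}, {{delete}} and 
{{purge_batch}} are hypothetical names for illustration and do not reflect 
the actual KSM design in HDFS-11922.

```python
from collections import deque


class KeyManager:
    """Toy stand-in for a key manager; hypothetical, not the KSM API."""

    def __init__(self):
        self.keys = {}            # (key, version) -> data
        self.pending = deque()    # entries waiting to be purged

    def put(self, key, version, data):
        self.keys[(key, version)] = data

    def delete(self, key, version):
        # Fast path: hide the entry from readers and queue it for purge.
        data = self.keys.pop((key, version), None)
        if data is None:
            return False
        self.pending.append((key, version, data))
        return True

    def purge_batch(self, limit):
        # Background step: reclaim at most `limit` entries per call.
        purged = 0
        while self.pending and purged < limit:
            self.pending.popleft()
            purged += 1
        return purged


ksm = KeyManager()
for version in (1, 2, 3):
    ksm.put("bucket/key1", version, b"data")
    ksm.delete("bucket/key1", version)

purged_now = ksm.purge_batch(limit=2)
```

With a queue like this, deleting one version or all versions of a key is the 
same operation repeated, and the expensive cleanup stays off the request path.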

bq. How about using an expire policy for old key versions as that have 
mentioned in the doc? This should be a good way for this case.

An expiry policy is one part of object life cycle management, which would be 
another new feature. I don't think that will be a near-term target.
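
Just to make the idea concrete, an age-based rule might look like the sketch 
below. {{expire_versions}}, its parameters, and the rule of always keeping 
the newest version are assumptions for illustration, not something taken 
from the design doc.

```python
def expire_versions(versions, now, max_age_seconds):
    """Split versions into (kept, expired).

    versions: list of (version_id, created_at_seconds), oldest first.
    The newest version is always kept, whatever its age.
    """
    if not versions:
        return [], []
    *older, newest = versions
    kept, expired = [], []
    for version in older:
        age = now - version[1]
        if age > max_age_seconds:
            expired.append(version)
        else:
            kept.append(version)
    kept.append(newest)
    return kept, expired


versions = [("v1", 0), ("v2", 50), ("v3", 90)]
kept, expired = expire_versions(versions, now=100, max_age_seconds=60)
```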

bq. If we need to complete this work in the first phase work of Ozone?

I agree. I think this is not a MUST for the 1st phase. Let's listen to the 
opinions of other folks. Anyway, it would be good to study this and see how 
much effort it would take if we need to implement it. 

Thanks for sharing your idea!

> Ozone: Container : Add key versioning support
> ---------------------------------------------
>
>                 Key: HDFS-12000
>                 URL: https://issues.apache.org/jira/browse/HDFS-12000
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Chen Liang
>
> The rest interface of ozone supports versioning of keys. This support comes 
> from the containers and how chunks are managed to support this feature. This 
> JIRA tracks that feature. Will post a detailed design doc so that we can talk 
> about this feature.


