[ https://issues.apache.org/jira/browse/HADOOP-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621214#action_12621214 ]
dhruba borthakur commented on HADOOP-3929:
------------------------------------------
+1. This will be a useful feature to have.
> I would like to improve the archive tool [see issue 3307].
> ----------------------------------------------------------
>
> Key: HADOOP-3929
> URL: https://issues.apache.org/jira/browse/HADOOP-3929
> Project: Hadoop Core
> Issue Type: Improvement
> Reporter: Dick King
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> I have a tool, written atop the libhdfs library, that implements an archive
> system. It is working [in C++].
> JIRA #3307 documents a native DFS archive system, first available in 18.0.
> I would like to port my code, and thereby extend that system in 3 directions:
> 1: Archives will be immutable in 18.0. I would like to provide an API to
> let you add, delete, and modify files.
> 1a: You would want to be able to batch such operations and perform them
> all at once when a batch is complete.
> 2: The tree to be archived must be in DFS in 18.0. I would like it to be
> possible for the tree to contain some local filesystem files as well [think
> org.apache.hadoop.fs.Path].
> 2a: I realize that this would preclude parallel modification when a local
> filesystem is used.
> 2b: I don't have a convincing story regarding two processes simultaneously
> modifying the same archive, even for a disjoint set of files, but I'm willing
> to discuss this.
> 3: I would like it to be possible to batch the changes and make them all in
> one operation, to reduce DFS activity.
> I had in-person discussions on this with user mahadev. He is encouraging me
> to file this bug report so we can broaden this discussion.
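
To make items 1, 1a, and 3 of the quoted description concrete, here is a minimal sketch of what a batched add/delete/modify interface could look like. It is only an illustration under stated assumptions: the class name ArchiveBatchSketch and every method on it are hypothetical, the "archive" is an in-memory map rather than anything stored in DFS, and none of this is the HADOOP-3307 API or the reporter's C++ tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical sketch of batched archive edits; not a real Hadoop API. */
class ArchiveBatchSketch {

    /** The three kinds of change named in the proposal. */
    private enum Kind { ADD, DELETE, MODIFY }

    /** One queued change; data is null for DELETE. */
    private static final class Op {
        final Kind kind;
        final String path;
        final byte[] data;

        Op(Kind kind, String path, byte[] data) {
            this.kind = kind;
            this.path = path;
            this.data = data;
        }
    }

    /** In-memory stand-in for the archive contents: path -> bytes. */
    private final Map<String, byte[]> entries = new TreeMap<>();
    /** Changes queued since the last successful commit. */
    private final List<Op> pending = new ArrayList<>();
    /** Number of commits; stands in for "rounds of DFS activity". */
    private int commits = 0;

    ArchiveBatchSketch add(String path, byte[] data) {
        pending.add(new Op(Kind.ADD, path, Objects.requireNonNull(data)));
        return this;
    }

    ArchiveBatchSketch modify(String path, byte[] data) {
        pending.add(new Op(Kind.MODIFY, path, Objects.requireNonNull(data)));
        return this;
    }

    ArchiveBatchSketch delete(String path) {
        pending.add(new Op(Kind.DELETE, path, null));
        return this;
    }

    /**
     * Applies all queued changes in one step. The batch is staged on a copy,
     * so if any operation is invalid the archive is left unchanged and the
     * batch stays queued.
     */
    void commit() {
        Map<String, byte[]> staged = new TreeMap<>(entries);
        for (Op op : pending) {
            switch (op.kind) {
                case ADD:
                    if (staged.containsKey(op.path)) {
                        throw new IllegalStateException("already exists: " + op.path);
                    }
                    staged.put(op.path, op.data);
                    break;
                case MODIFY:
                    if (!staged.containsKey(op.path)) {
                        throw new IllegalStateException("no such entry: " + op.path);
                    }
                    staged.put(op.path, op.data);
                    break;
                case DELETE:
                    if (staged.remove(op.path) == null) {
                        throw new IllegalStateException("no such entry: " + op.path);
                    }
                    break;
            }
        }
        entries.clear();
        entries.putAll(staged);
        pending.clear();
        commits++;
    }

    boolean contains(String path) { return entries.containsKey(path); }
    int size() { return entries.size(); }
    int pendingCount() { return pending.size(); }
    int commitCount() { return commits; }
}
```

A caller would queue several changes, for example `archive.add("/a", bytes).delete("/b")`, and then call `commit()` once, so the archive is rewritten once per batch instead of once per file. The sketch says nothing about item 2b (two processes modifying the same archive); that question is left open, as in the description.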