[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943218#comment-15943218
 ] 

Steve Loughran commented on MAPREDUCE-6840:
-------------------------------------------

Millis is a pretty human-unfriendly unit; the mechanism Configuration uses to 
support ms, s, m, h, d suffixes is better. I think it should be possible to use 
{{configuration.getTimeDuration()}} to parse the duration arg simply by 
creating a no-default Configuration, setting the property, then having it parse 
the string. Ugly but effective.
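
To make the suffix idea concrete, here is a minimal, dependency-free sketch of 
that kind of parsing. It is not Hadoop's implementation and not what the 
attached patch does: the class DurationArg and method parseMillis are 
hypothetical names, and treating a bare number as milliseconds is an assumption 
of the sketch. With Hadoop on the classpath, the route suggested above would 
instead be roughly: new Configuration(false), set(key, arg), then 
getTimeDuration(key, defaultValue, TimeUnit.MILLISECONDS).

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not part of Hadoop or the patch. */
final class DurationArg {
    private DurationArg() {}

    /**
     * Parses strings such as "500ms", "30s", "15m", "2h" or "1d" into
     * milliseconds. A bare number is treated as milliseconds (an assumption
     * of this sketch).
     */
    static long parseMillis(String arg) {
        String s = arg.trim().toLowerCase(Locale.ROOT);
        // "ms" must be checked before "s", because "500ms" also ends with "s".
        String[] suffixes = {"ms", "s", "m", "h", "d"};
        TimeUnit[] units = {
            TimeUnit.MILLISECONDS, TimeUnit.SECONDS, TimeUnit.MINUTES,
            TimeUnit.HOURS, TimeUnit.DAYS
        };
        for (int i = 0; i < suffixes.length; i++) {
            if (s.endsWith(suffixes[i])) {
                String number =
                    s.substring(0, s.length() - suffixes[i].length()).trim();
                // Long.parseLong throws NumberFormatException on bad input.
                return units[i].toMillis(Long.parseLong(number));
            }
        }
        // No recognised suffix: fall back to plain milliseconds.
        return Long.parseLong(s);
    }

    public static void main(String[] args) {
        System.out.println(parseMillis("30s"));
        System.out.println(parseMillis("2h"));
    }
}
```

A cutoff option could then accept values like "2h" or "1d" on the command line 
and convert them once at argument-parsing time, so the rest of the code keeps 
working in milliseconds.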

> Distcp to support cutoff time
> -----------------------------
>
>                 Key: MAPREDUCE-6840
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6840
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 2.6.0
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>            Priority: Minor
>         Attachments: MAPREDUCE-6840.1.patch
>
>
> To ensure consistency in the datasets on HDFS,  some projects like file 
> formats on Hive do HDFS operations in a particular order.  For example, if a 
> file format uses an index file, a new version of the index file will only be 
> written to HDFS after all files mentioned by the index are written to HDFS.
> When we do distcp, it's important to preserve that consistency, so that we 
> don't break those file formats.
> A typical solution for that is to create an HDFS Snapshot beforehand, and only 
> distcp the Snapshot.  That could work well if the user has superuser 
> privilege to make the directory snapshottable.
> If not, then it will be beneficial to have a cutoff time for distcp, so that 
> distcp only copies files modified on or before that cutoff time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
