[
https://issues.apache.org/jira/browse/HADOOP-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721899#action_12721899
]
Doug Cutting commented on HADOOP-6087:
--------------------------------------
Some comments:
- 'sleeptime' should be read via 'getSleeptime()' to be thread safe, no? or
maybe use an int for the sleep time, since updates to an int are atomic.
- getNumRunningMaps() is expensive to call from each node at each interval,
since reports for all tasks must be retrieved from the JT. better would be to
just fetch the job's counters each time, since they're constant-sized, not
proportional to the number of tasks. You'd need to add a maps_completed
counter, then use the difference between that and TOTAL_LAUNCHED_MAPS to
calculate the number running.
- the interval to contact the JT might be randomized a bit, so that not all
tasks hit it at the same time, e.g., by adding a random value of up to 10% of
the specified value.
- when InterruptedException is caught a thread should generally exit, not
simply log a warning. if things will no longer work correctly without the
thread, then it should somehow cause other dependent threads to fail too.
- getNumRunningMaps() should either return a correct value or throw an
exception. if it cannot contact the JT or if the task does not know its Id it
should fail, no?
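
To make the points above concrete, here is a rough sketch, not the code from
d_bw.patch. All class and method names (ThrottleSketch, Poller,
jitteredIntervalMs, runningMaps) are hypothetical, and the counter values are
passed in as plain longs rather than fetched from the JT, so the MAPS_COMPLETED
counter is assumed to exist already. It shows a sleep time read through a
getter, an interval with up to 10% random jitter, a running-maps count derived
from two counters that throws rather than returning a wrong value, and a thread
that exits when interrupted.

```java
import java.util.Random;

/** Illustrative only: hypothetical names, not taken from d_bw.patch. */
final class ThrottleSketch {

    /** Base interval plus a random extra of 0..10% of it, so tasks do not poll in lockstep. */
    static long jitteredIntervalMs(long baseMs, Random rng) {
        long maxExtra = baseMs / 10;
        if (maxExtra <= 0) {
            return baseMs; // interval too small to jitter
        }
        // nextDouble() is in [0, 1), so the extra lands in 0..maxExtra inclusive.
        return baseMs + (long) (rng.nextDouble() * (maxExtra + 1));
    }

    /**
     * Running maps as the difference of two job-level counter values.
     * Throws instead of returning a misleading number when the inputs are inconsistent.
     */
    static long runningMaps(long totalLaunchedMaps, long mapsCompleted) {
        if (mapsCompleted < 0 || totalLaunchedMaps < mapsCompleted) {
            throw new IllegalStateException(
                "inconsistent counters: launched=" + totalLaunchedMaps
                + " completed=" + mapsCompleted);
        }
        return totalLaunchedMaps - mapsCompleted;
    }

    /** Polling loop that exits on interrupt instead of logging and carrying on. */
    static final class Poller implements Runnable {
        // volatile: a write to a plain long is not guaranteed to be atomic in Java.
        private volatile long sleepTimeMs;
        private volatile boolean exited = false;
        private final Random rng = new Random();

        Poller(long sleepTimeMs) { this.sleepTimeMs = sleepTimeMs; }

        long getSleepTimeMs() { return sleepTimeMs; }
        void setSleepTimeMs(long ms) { sleepTimeMs = ms; }
        boolean hasExited() { return exited; }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // A real implementation would fetch the job's counters here.
                    Thread.sleep(jitteredIntervalMs(getSleepTimeMs(), rng));
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and fall through so the thread ends.
                Thread.currentThread().interrupt();
            } finally {
                exited = true; // dependents can check this and fail as well
            }
        }
    }
}
```

For example, ThrottleSketch.runningMaps(10, 7) gives 3, and
ThrottleSketch.jitteredIntervalMs(1000, rng) gives a value between 1000 and
1100 inclusive. How dependent threads should react once hasExited() is true is
left open here, as it is in the comment above.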
> distcp can support bandwidth limiting
> -------------------------------------
>
> Key: HADOOP-6087
> URL: https://issues.apache.org/jira/browse/HADOOP-6087
> Project: Hadoop Core
> Issue Type: New Feature
> Components: tools/distcp
> Affects Versions: 0.21.0
> Reporter: Ravi Gummadi
> Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: d_bw.patch
>
>
> distcp should support an option for user to specify the bandwidth limit for
> the distcp job.