[
https://issues.apache.org/jira/browse/HADOOP-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623815#action_12623815
]
amirhyoussefi edited comment on HADOOP-3973 at 8/19/08 3:15 PM:
----------------------------------------------------------------
Here is the relevant part for Map:
    public void run(RecordReader<K1, V1> input, OutputCollector<K2, V2> output,
                    Reporter reporter)
        throws IOException {
      try {
        // allocate key & value instances that are re-used for all entries
        K1 key = input.createKey();
        V1 value = input.createValue();
        while (input.next(key, value)) {
          // map pair to output
          mapper.map(key, value, output, reporter);
        }
      } finally {
        mapper.close();
      }
    }
I talked to Arun, and we may be able to use the status of the reporter. The map
can set a break status in the reporter, which we then check in the while loop
to break out of it.
Checking it on every iteration has a performance cost, so we can add a flag and
short-circuit the check in the while loop, consulting the reporter only when
that flag is set.
    while (input.next(key, value)) {
      if (breakEnabled && check_status_of_reporter) break;
      ...
    }
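A minimal, self-contained sketch of the loop above, not the actual Hadoop code. BreakReporter, SimpleMapper, requestBreak(), isBreakRequested() and stopOnMarker() are hypothetical stand-ins made up for illustration, since the break status on the reporter is only a proposal at this point. Unlike the fragment above, the sketch checks the flag before reading the next record rather than after.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class BreakableMapRunner {

    /** Hypothetical stand-in for Reporter: the map sets a break status, the runner reads it. */
    public static class BreakReporter {
        private volatile boolean breakRequested = false;

        public void requestBreak() { breakRequested = true; }

        public boolean isBreakRequested() { return breakRequested; }
    }

    /** Simplified, hypothetical stand-in for Mapper.map(key, value, output, reporter). */
    public interface SimpleMapper {
        void map(int key, String value, List<String> output, BreakReporter reporter);
    }

    /** Example mapper: emits every value until it sees the marker, then asks to break. */
    public static SimpleMapper stopOnMarker(final String marker) {
        return (key, value, output, reporter) -> {
            if (marker.equals(value)) {
                reporter.requestBreak();
            } else {
                output.add(value);
            }
        };
    }

    /** Runs the map loop and returns how many input records were consumed. */
    public static int run(Iterator<String> input, SimpleMapper mapper, List<String> output,
                          BreakReporter reporter, boolean breakEnabled) {
        int consumed = 0;
        while (input.hasNext()) {
            // breakEnabled is tested first, so && short-circuits and the
            // reporter is consulted only when the job opted in to breaking.
            if (breakEnabled && reporter.isBreakRequested()) {
                break;
            }
            mapper.map(consumed, input.next(), output, reporter);
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "stop", "c", "d");

        List<String> out = new ArrayList<>();
        int consumed = run(records.iterator(), stopOnMarker("stop"), out,
                           new BreakReporter(), true);
        System.out.println("break enabled:  consumed=" + consumed + " output=" + out);

        out = new ArrayList<>();
        consumed = run(records.iterator(), stopOnMarker("stop"), out,
                       new BreakReporter(), false);
        System.out.println("break disabled: consumed=" + consumed + " output=" + out);
    }
}
```

With the flag off, the break request is ignored and all input is read, so existing jobs keep their current behavior and pay only for one boolean test per record.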
was (Author: amirhyoussefi):
Here is the relevant part for Map:
    public void run(RecordReader<K1, V1> input, OutputCollector<K2, V2> output,
                    Reporter reporter)
        throws IOException {
      try {
        // allocate key & value instances that are re-used for all entries
        K1 key = input.createKey();
        V1 value = input.createValue();
        while (input.next(key, value)) {
          // map pair to output
          mapper.map(key, value, output, reporter);
        }
      } finally {
        mapper.close();
      }
    }
I talked to Arun, and we may be able to use the status of the reporter. The map
can set a break status in the reporter, which we then check in the while loop
to break out of it.
Checking it on every iteration has a performance cost, so we can add a flag and
short-circuit the check in the while loop, consulting the reporter only when
that flag is set.
    while (input.next(key, value) && breakEnabled &&
           check_status_of_reporter) {
      ...
    }
> Breaking out of a task
> ----------------------
>
> Key: HADOOP-3973
> URL: https://issues.apache.org/jira/browse/HADOOP-3973
> Project: Hadoop Core
> Issue Type: New Feature
> Reporter: Amir Youssefi
>
> Occasionally a mapper/reducer is done without needing to see more input data.
> If we provide a way to break out and skip reading the rest of the input, then
> the job finishes earlier and resources are spared.