Re: Dynamically terminate a job once Reporter hits a threshold
Out of curiosity, how reliable are the counters from the perspective of the JobClient while the job is in progress? While hitting 'refresh' on the status web page for a job, I notice that my counters bounce all over the place, showing wildly different figures second-to-second. Is that using a different (less well-synchronized?) mechanism to access the counters than the user has available in the JobClient? (If so, is this something we can easily patch to make more consistent?)

- Aaron

On Fri, Nov 7, 2008 at 12:21 PM, Arun C Murthy [EMAIL PROTECTED] wrote:

> On Nov 7, 2008, at 12:12 PM, Brian MacKay wrote:
>
>> Looking for a way to dynamically terminate a job once Reporter in a Map job hits a threshold. Example:
>>
>>     public void map(WritableComparable key, Text values,
>>                     OutputCollector<Text, Text> output,
>>                     Reporter reporter) throws IOException {
>>         if (reporter.getCount() > SomeConfigValue) {
>>             return;
>>         }
>>         // map job code
>>     }
>>
>> Obviously, reporter.getCount() doesn't exist. Open to other ideas, and any advice would be appreciated.
>
> If you _really_ need this, you could do this from your JobClient... use JobClient.submitJob (rather than runJob: http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring), manually fetch the Counters you need and terminate the Job via JobClient.getJob(jobId).killJob().
>
> Arun
RE: Dynamically terminate a job once Reporter hits a threshold
Thanks Arun for your tip. This morning I changed to submitJob and polled. It worked very well, and you saved me some trial and error.

-----Original Message-----
From: Aaron Kimball [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2008 4:35 AM
To: core-user@hadoop.apache.org
Subject: Re: Dynamically terminate a job once Reporter hits a threshold

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material.
Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this message in error, please contact the sender and delete the material from any computer.
Dynamically terminate a job once Reporter hits a threshold
Looking for a way to dynamically terminate a job once Reporter in a Map job hits a threshold. Example:

    public void map(WritableComparable key, Text values,
                    OutputCollector<Text, Text> output,
                    Reporter reporter) throws IOException {
        if (reporter.getCount() > SomeConfigValue) {
            return;
        }
        // map job code
    }

Obviously, reporter.getCount() doesn't exist. Open to other ideas, and any advice would be appreciated.

Thanks,
Brian
Re: Dynamically terminate a job once Reporter hits a threshold
On Nov 7, 2008, at 12:12 PM, Brian MacKay wrote:

> Looking for a way to dynamically terminate a job once Reporter in a Map job hits a threshold. Example:
>
>     public void map(WritableComparable key, Text values,
>                     OutputCollector<Text, Text> output,
>                     Reporter reporter) throws IOException {
>         if (reporter.getCount() > SomeConfigValue) {
>             return;
>         }
>         // map job code
>     }
>
> Obviously, reporter.getCount() doesn't exist. Open to other ideas, and any advice would be appreciated.

If you _really_ need this, you could do this from your JobClient... use JobClient.submitJob (rather than runJob: http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring), manually fetch the Counters you need and terminate the Job via JobClient.getJob(jobId).killJob().

Arun
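
For anyone finding this thread later, the submit-then-poll approach Arun describes could look roughly like the sketch below. It is only an illustration of the polling loop, not code from this thread: JobHandle, ThresholdKiller and FakeJob are hypothetical names, and JobHandle stands in for the few operations one would need from the job handle that JobClient.submitJob returns, so the example runs without a cluster. The threshold and poll interval are likewise made-up values.

```java
import java.io.IOException;

/**
 * Hypothetical stand-in for the job handle returned by JobClient.submitJob,
 * reduced to the three operations the polling loop needs.
 */
interface JobHandle {
    boolean isComplete() throws IOException;
    long counterValue() throws IOException; // the one counter being watched
    void killJob() throws IOException;
}

final class ThresholdKiller {
    /**
     * Polls until the job completes or the counter reaches the threshold.
     * Returns true if the job was killed because of the threshold.
     */
    static boolean pollAndKill(JobHandle job, long threshold, long pollMillis)
            throws IOException, InterruptedException {
        while (!job.isComplete()) {
            if (job.counterValue() >= threshold) {
                job.killJob();
                return true;
            }
            Thread.sleep(pollMillis); // avoid hammering the job tracker
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        FakeJob job = new FakeJob(100);
        System.out.println("killed=" + pollAndKill(job, 35, 1));
    }
}

/** Fake job for the demo: the counter grows by 10 each time it is read. */
final class FakeJob implements JobHandle {
    private final long finishAt; // counter value at which the job "finishes"
    private long counter = 0;
    private boolean killed = false;

    FakeJob(long finishAt) { this.finishAt = finishAt; }

    public boolean isComplete() { return killed || counter >= finishAt; }
    public long counterValue() { counter += 10; return counter; }
    public void killJob() { killed = true; }
    boolean wasKilled() { return killed; }
}
```

Against a real cluster, the handle would presumably wrap the RunningJob from JobClient.submitJob and read the value through its Counters. Given Aaron's observation that in-flight counters fluctuate, the threshold is probably best treated as approximate: the job may overshoot it before the client sees the value and issues the kill.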