Re: Dynamically terminate a job once Reporter hits a threshold

2008-11-10 Thread Aaron Kimball
Out of curiosity, how reliable are the counters from the perspective of the
JobClient while the job is in progress? While hitting 'refresh' on the
status web page for a job, I notice that my counters bounce all over the
place, showing wildly different figures second-to-second. Is that using a
different (less well-synchronized?) mechanism to access the counters than
the user has available in the JobClient? (If so, is this something we can
easily patch to make more consistent?)

- Aaron

On Fri, Nov 7, 2008 at 12:21 PM, Arun C Murthy [EMAIL PROTECTED] wrote:


> On Nov 7, 2008, at 12:12 PM, Brian MacKay wrote:
>
>> Looking for a way to dynamically terminate a job once Reporter in a Map
>> job hits a threshold,
>>
>> Example:
>>
>> public void map(WritableComparable key, Text values,
>>     OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>>
>>     if (reporter.getCount() > SomeConfigValue) {
>>         return;
>>     }
>>     // map job code
>> }
>>
>> Obviously, reporter.getCount() doesn't exist. Open to other ideas, and
>> any advice would be appreciated.
>
> If you _really_ need this, you could do this from your JobClient... use
> JobClient.submitJob (rather than runJob:
> http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring),
> manually fetch the Counters you need and terminate the Job via
> JobClient.getJob(jobId).killJob().
>
> Arun




RE: Dynamically terminate a job once Reporter hits a threshold

2008-11-10 Thread Brian MacKay



Thanks Arun for your tip.

This morning I changed to submitJob and polled.  It worked very well,
and you saved me some trial and error.








Dynamically terminate a job once Reporter hits a threshold

2008-11-07 Thread Brian MacKay

Looking for a way to dynamically terminate a job once Reporter in a Map
job hits a threshold,

Example:

public void map(WritableComparable key, Text values,
    OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

    if (reporter.getCount() > SomeConfigValue) {
        return;
    }
    // map job code
}

Obviously,  reporter.getCount() doesn't exist.  Open to other ideas, and
any advice would be appreciated.
Thanks, Brian





Re: Dynamically terminate a job once Reporter hits a threshold

2008-11-07 Thread Arun C Murthy


On Nov 7, 2008, at 12:12 PM, Brian MacKay wrote:



> Looking for a way to dynamically terminate a job once Reporter in a Map
> job hits a threshold,
>
> Example:
>
> public void map(WritableComparable key, Text values,
>     OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
>
>     if (reporter.getCount() > SomeConfigValue) {
>         return;
>     }
>     // map job code
> }
>
> Obviously, reporter.getCount() doesn't exist. Open to other ideas, and
> any advice would be appreciated.

If you _really_ need this, you could do this from your JobClient... use
JobClient.submitJob (rather than runJob:
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html#Job+Submission+and+Monitoring),
manually fetch the Counters you need and terminate the Job via
JobClient.getJob(jobId).killJob().


Arun
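
Editor's sketch of the client-side polling approach suggested above. This is not code from the thread. To stay runnable without Hadoop on the classpath, it uses a hypothetical JobHandle interface as a stand-in for the handle that JobClient.submitJob returns (RunningJob in the old mapred API). The names JobHandle, counterValue, ThresholdWatcher, and FakeJob are assumptions made for illustration; in real code counterValue() would presumably read one counter out of the job's Counters.

```java
import java.io.IOException;

/** Hypothetical stand-in for the parts of a running-job handle used here. */
interface JobHandle {
    boolean isComplete() throws IOException;
    /** Assumed helper: current aggregated value of the counter being watched. */
    long counterValue() throws IOException;
    void killJob() throws IOException;
}

final class ThresholdWatcher {
    /**
     * Polls the job until it completes or the watched counter exceeds
     * the threshold. Returns true if this method killed the job.
     */
    static boolean watch(JobHandle job, long threshold, long pollMillis)
            throws IOException, InterruptedException {
        while (!job.isComplete()) {
            if (job.counterValue() > threshold) {
                job.killJob();
                return true;
            }
            Thread.sleep(pollMillis);
        }
        return false;
    }
}

/** In-memory fake job: the counter grows by `step` on every read, up to `total`. */
final class FakeJob implements JobHandle {
    private final long step;
    private final long total;
    private long count = 0;
    private boolean killed = false;

    FakeJob(long step, long total) {
        this.step = step;
        this.total = total;
    }

    public boolean isComplete() { return killed || count >= total; }

    public long counterValue() {
        count = Math.min(total, count + step);
        return count;
    }

    public void killJob() { killed = true; }

    boolean wasKilled() { return killed; }
}
```

With a real cluster, the same loop would wrap the handle returned by JobClient.submitJob instead of FakeJob. Given the fluctuating in-progress counter values described earlier in the thread, a single reading above the threshold may be worth re-checking before killing the job.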