On 18.06.2014 at 02:56, Jesse Becker wrote:

> On Wed, Jun 18, 2014 at 07:56:36AM +1000, Ian Mortimer wrote:
>> I've had to disable outgoing mail from our cluster after an incident
>> when an array job sent thousands of notifications to an external
>> address.
>> 
>> I assume if I set mailer to /bin/true that will stop any notifications
>> being sent, but is there a way to prevent notifications just for array
>> jobs while allowing notifications for regular jobs?
> 
> There isn't a "built in" way to do that, but I can think of two
> different ways of handling the spamming problem.
> 
> The mailer setting can be any program or script.  Three arguments are
> passed:
> $1 - A literal "-s" (to set the subject for when mail(1) is called)
> $2 - The "subject" line (examples below)
> $3 - The recipient(s) of the message
> 
> The body of the email is fed to the script on STDIN.
> 
> The subject is useful for this.  Here are two examples, one a normal
> job, the other an array task:
> 
>   Job 369 (date) Complete
>   Job-array task 210356.5 (worker.sh) Complete
> 
> The script can pretty easily figure out if it's an array job, and handle
> it appropriately.
> 
> However, it's pretty easy for a user to queue up 10,000 jobs with email
> notification enabled that run /bin/false to cause similar problems, so
> you might want to consider option #2:
> 
> Configure your mail server to not send email from SGE to "outside"
> addresses.  This shouldn't be overly difficult, and you can "whitelist"
> designated addresses if you so choose.
> 
> As an aside, you can do lots of "interesting" things in the mailer,
> including pulling certain information about the just-completed job.  We
> use it occasionally to trap errors that the prologue can miss.
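
The first option quoted above, a mailer that looks at the subject line, could
be sketched roughly like this. It is only an illustration, not a tested
production mailer: the regular expression is derived from the single example
subject in the quote, and `MAIL_CMD`, `is_array_task` and `main` are names
made up for this sketch. It assumes a mail(1)-compatible `mail` binary on the
PATH of the host that sends the notifications.

```python
#!/usr/bin/env python3
"""Hypothetical mailer wrapper: drop array-task mails, forward the rest."""
import re
import subprocess
import sys

# Assumption: a mail(1)-compatible command reachable via PATH.
MAIL_CMD = "mail"

# Assumption: array-task subjects look like the quoted example,
# "Job-array task <job_id>.<task_id> (<name>) <state>".
ARRAY_SUBJECT = re.compile(r"^Job-array task (\d+)\.(\d+)\b")


def is_array_task(subject):
    """Return True if the subject looks like an array-task notification."""
    return ARRAY_SUBJECT.match(subject) is not None


def main(argv, stdin=sys.stdin):
    """Handle one notification; argv is: <prog> -s <subject> <recipient>..."""
    if len(argv) < 4 or argv[1] != "-s":
        return 2  # not called the way the mailer setting calls us
    subject, recipients = argv[2], argv[3:]
    body = stdin.read()  # the mail body arrives on STDIN
    if is_array_task(subject):
        return 0  # silently drop notifications for array tasks
    # Regular job: hand the mail on unchanged.
    result = subprocess.run(
        [MAIL_CMD, "-s", subject, *recipients],
        input=body, text=True, check=False,
    )
    return result.returncode

# When installed as the mailer, the script would end with:
#     sys.exit(main(sys.argv))
```

With the two example subjects from the quote, `is_array_task` is true for
"Job-array task 210356.5 (worker.sh) Complete" and false for
"Job 369 (date) Complete".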

This could also be used to send an array digest. In the mail wrapper, create a
directory for each job in a shared location and write the content of each
task's email to its own file there. At submission time, submit a follow-up job
with -hold_jid to a dummy queue that is always available (give it a CPU time
limit of 60 seconds or so and make it accessible only by requesting a BOOLEAN
complex). The follow-up job runs only after the array job has finished: it
`cat`s all the files found for the $JOB_ID it waited for, sends the result as
a single email, and then deletes the directory holding the collected emails.
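
A minimal sketch of the two halves of such a digest, under some assumptions:
the spool path `/shared/sge-mail-spool` and the function names are invented
for illustration, the subject pattern is taken from the example subject
quoted earlier, and the follow-up job is assumed to be told the job id of the
array job it waited for (for instance as a command-line argument). Sending
the digest itself is left out.

```python
import re
import shutil
from pathlib import Path

# Hypothetical shared directory, visible to the mailer and the follow-up job.
SPOOL = Path("/shared/sge-mail-spool")

# Assumption: subjects of array-task mails start like the quoted example.
ARRAY_SUBJECT = re.compile(r"^Job-array task (\d+)\.(\d+)\b")


def store_task_mail(subject, body, spool=SPOOL):
    """Mailer side: append one array-task mail to <spool>/<job_id>/<task_id>.txt.

    Returns the file path, or None if the subject is not an array-task mail.
    """
    match = ARRAY_SUBJECT.match(subject)
    if match is None:
        return None
    job_id, task_id = match.groups()
    job_dir = Path(spool) / job_id
    job_dir.mkdir(parents=True, exist_ok=True)
    path = job_dir / f"{task_id}.txt"
    # Append, since one task may produce more than one notification.
    with path.open("a") as handle:
        handle.write(f"Subject: {subject}\n\n{body}\n")
    return path


def build_digest(job_id, spool=SPOOL):
    """Follow-up job side: join all stored mails in task order, then clean up."""
    job_dir = Path(spool) / str(job_id)
    files = sorted(job_dir.glob("*.txt"), key=lambda p: int(p.stem))
    digest = "\n".join(p.read_text() for p in files)
    shutil.rmtree(job_dir)  # remove the collected mails once they are merged
    return digest
```

The mailer would call `store_task_mail` instead of sending anything for array
tasks, and the follow-up job would mail whatever `build_digest` returns to the
job owner.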

-- Reuti


> -- 
> Jesse Becker (Contractor)
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

