7. November 2013 18:26
Hmm... you could use a job context variable (see the qsub/qalter man pages on job context variables and the -ac, -dc and -sc options to set them), for instance as a counter, to ascertain in an epilog that a particular task really is the last one from an array job.

A JSV (job submission verifier) could insert the option that sets the job context variable to the number of tasks (to save the user from having to do it). The epilog of each task could then decrease that counter by 1, unless the current count is 1, in which case the task is the last one and it can send the e-mail.

This approach has a number of pitfalls which would need to be addressed, however. If more than one task were to finish at about the same time, there would be a race condition when decreasing the counter, because that step is not an atomic operation: you have to read the content of the context variable first, then decrease it, and then set it back in the job context. You could probably address this with another context variable serving as a lock, though. (That makes several client communications with the qmaster per task, however, which on a throughput cluster is certainly not a good idea for performance reasons.)
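
To make the read/decrease/write-back sequence concrete, here is a rough sketch of it as shell functions that an epilog could call. It is only an illustration under assumptions: the counter name tasks_left is made up, I assume that qstat -j prints the job context on a "context:" line as comma-separated name=value pairs, and that the execution host is allowed to run qalter. It does nothing about the race described above.

```shell
#!/bin/sh
# Epilog sketch: decrement a per-job counter kept in the job context.
# Assumption: the counter is called tasks_left (hypothetical name) and was
# set at submission time, e.g. "qsub -ac tasks_left=10 -t 1-10 job.sh".
# NOT race-free: two tasks finishing together can read the same value.

# Print the value of context variable $2 of job $1 (empty if not set).
# Assumes a "context:" line with comma-separated name=value pairs.
get_context_var() {
    qstat -j "$1" |
        sed -n 's/^context:[[:space:]]*//p' |
        tr ',' '\n' |
        sed -n "s/^$2=//p"
}

# Returns 0 if this task is the last one, 1 after decrementing the
# counter, 2 if the counter is missing.
decrement_and_check_last() {
    left=$(get_context_var "$1" tasks_left)
    [ -n "$left" ] || return 2
    if [ "$left" -le 1 ]; then
        return 0
    fi
    qalter -sc "tasks_left=$((left - 1))" "$1" >/dev/null
    return 1
}

# In the epilog itself one might then call, for example:
# decrement_and_check_last "$JOB_ID" && echo done | mail -s "array $JOB_ID finished" "$MAIL_TO"
```

Each call costs one qstat and usually one qalter against the qmaster, which is exactly the per-task overhead mentioned above.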

Another issue would arise if the epilog script of a task did not get executed properly, e.g. because the node goes down while the epilog is running. That should happen very rarely, though.

It's still somewhat cumbersome to do all this, and hence I'd go for Reuti's solution if it were me. A wrapper script for job submission could ensure that the dependent dummy job gets submitted without the end user having to think about it.
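
Such a wrapper could look roughly like the sketch below. It is a sketch under assumptions, not a tested site setup: it assumes your qsub supports -terse, MAIL_TO is a made-up variable for the recipient, and the dummy job simply runs /bin/true as a binary job.

```shell
#!/bin/sh
# Wrapper sketch: submit an array job plus a dependent dummy job whose
# end-of-job mail signals that the whole array has finished.
# Assumptions: qsub supports -terse; MAIL_TO is a hypothetical variable
# holding the recipient; /bin/true exists on the execution hosts.

submit_with_notify() {
    # -terse prints only the job ID; array jobs yield "ID.first-last:step".
    out=$(qsub -terse "$@") || return 1
    jid=${out%%.*}
    # The dummy job is held until every task of job $jid has finished.
    qsub -hold_jid "$jid" -N "notify_$jid" -m e -M "${MAIL_TO:-$USER}" \
         -b y -o /dev/null -e /dev/null /bin/true >/dev/null || return 1
    echo "$jid"
}

# Example: submit_with_notify -t 1-10 job.sh
```

Users would call the wrapper with the same arguments they would otherwise pass to qsub. Depending on the cluster, the dummy job may also need a request for the kind of always-available queue Reuti describes.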

Cheers,

Fritz


7. November 2013 17:42
On 07/11/13 16:32, Reuti wrote:
Hi,

Instead of getting an email from a particular task, you could submit a follow up job with -hold_jid which depends on this one. For this followup job you will then just get one email. Depending on your cluster setup, it might be necessary to have some kind of dummy-queue with a cpu time limit of 10 seconds or so, which will always accept jobs (i.e. maybe a forced "mail_only" boolean complex, it could even reside on the master node).

-- Reuti

That is a good solution, but it relies on the users, i.e. it is not automated.

The best "automatic" solution would be an epilog script that checks whether qstat | grep $JOBID | wc -l equals 1 and acts accordingly. But then again you have the problem of two tasks finishing at once (rare), and of users who request "-m e" in their task jobs.
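
As a rough sketch of that check (assumptions: the finishing task is still listed by qstat while its epilog runs, the job ID is the first column of the qstat output, and the function names are made up), matching on the first column with awk avoids grep hitting the job ID in some other field:

```shell
#!/bin/sh
# Epilog sketch: decide from qstat whether this is the only task left.
# Assumptions: the task running this epilog is still listed by qstat, and
# the job ID is the first column of each qstat line.

# Count the qstat lines whose first column equals job ID $1.
count_job_lines() {
    qstat | awk -v id="$1" '$1 == id { n++ } END { print n + 0 }'
}

# Succeeds when exactly one line (presumably this task) is left.
is_last_task() {
    [ "$(count_job_lines "$1")" -eq 1 ]
}

# In the epilog one might then call, for example:
# is_last_task "$JOB_ID" && echo done | mail -s "array $JOB_ID finished" "$MAIL_TO"
```

Two tasks running their epilogs at the same moment would both count 2, so neither would send the mail. That is the rare case mentioned above.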


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

7. November 2013 16:32
Hi,

Instead of getting an email from a particular task, you could submit a follow up job with -hold_jid which depends on this one. For this followup job you will then just get one email. Depending on your cluster setup, it might be necessary to have some kind of dummy-queue with a cpu time limit of 10 seconds or so, which will always accept jobs (i.e. maybe a forced "mail_only" boolean complex, it could even reside on the master node).

-- Reuti


7. November 2013 15:28
Hi all,

I'd like to get an e-mail when a job array finishes.

I was looking at
http://comments.gmane.org/gmane.comp.clustering.gridengine.users/19962
and wrote a simple condition: when SGE_TASK_ID = SGE_TASK_LAST, send the
e-mail. But someone told me that the last task may not be the last one to
finish. E.g., in an array of 10 tasks, the 3rd could be the one that
finishes last, so I would get the e-mail when the last task finishes,
not when the whole array finishes.


So, I'm thinking about how to manage this, and I'm wondering if there's
another solution than running/parsing a qstat every time a task finishes
to guess whether it's the last one.

* I'd like to keep this on the job side, nothing like "daemons"
  running on the server or even a prolog... (if possible).

Anyone with something more elegant?

TIA,
Arnau

--

Fritz Ferstl | CTO and Business Development, EMEA
Univa Corporation | The Data Center Optimization Company
E-Mail: [email protected] | Phone: +49.9471.200.195 | Mobile: +49.170.819.7390

Where Grid Engine lives

Visit us at SC13 at booth #4101!
