On 05/20/2011 02:56 PM, Matthieu Pérotin wrote:
> Hi,
>
> we recently experienced an annoying problem with processes that, in some
> circumstances, would get stuck and never return. The fault here is
> clearly on the processes side, but one can never be sure that a process
> will return nicely... The consequence on SEC's side is that child
> processes remain attached to the SEC process, cluttering its %children
> hash table and adding to the complexity of the check_children sub.
>
> A solution to the problem would be to have the possibility to give a
> timeout option to the shellcmd action: on expiration a sigterm (or
> sigkill, I'm still not sure) would be issued to the process that was
> launched.
>
> I could not find in the mailing list archives any message about a
> similar issue, and as we really needed this feature I implemented it as
> a new action (to retain backward compatibility, it could not bear the
> same name), which takes two parameters: a timeout in seconds and the
> command to launch.
>
> I'm not quite sure this mailing list is the right place for proposing
> patches. If not, could someone give me the right place for that?
>
> Regards,
> Matthieu.

hi Matthieu,

indeed, the mailing list is the proper place for proposing patches.
However, in this case it looks to me that the issue can quite easily be
tackled with the means provided by the standard UNIX shell, for example:

action=shellcmd (/bin/yourprog & PROID=$! ; sleep 10; kill -9 $PROID)

Since the shellcmd action allows for shell interpretation of the
command line (provided that shell metacharacters are present), this
action will run /bin/yourprog in the background and assign its PID to
the variable PROID. The shell that started /bin/yourprog will then
sleep for 10 seconds and kill /bin/yourprog (provided the process is
still running).
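
The one-liner above keeps the shell around for the full 10 seconds even
when /bin/yourprog exits right away. A minimal sketch of a variant that
returns as soon as the program finishes is given below. The function
name run_with_timeout and its argument order are made up for
illustration and are not part of SEC, and a POSIX-style /bin/sh is
assumed.

```shell
#!/bin/sh
# run_with_timeout is a hypothetical helper, not a SEC feature.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift

    # Start the command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleep for the time limit, then SIGKILL the command.
    # Output is discarded so the watchdog never holds the caller's pipes.
    ( sleep "$limit"; kill -9 "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # Block until the command exits on its own or is killed.
    status=0
    wait "$cmd_pid" || status=$?

    # Cancel the watchdog if it is still sleeping, then reap it.
    kill "$watchdog_pid" 2>/dev/null || true
    wait "$watchdog_pid" 2>/dev/null || true

    # Normal exit: the command's own status. Killed: 128 + signal number.
    return "$status"
}

# Example (not executed here):
#   run_with_timeout 10 /bin/yourprog
```

When the command finishes early, the leftover sleep started by the
watchdog lingers until its time is up, but it no longer kills anything.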

There are a number of other ways of tackling the issue, such as using
Perl:

action=shellcmd ( perl -e 'alarm(10); exec("/bin/yourprog")' )

In this case, since the command line does not contain shell
metacharacters, an interpreting shell is not started, but SEC rather
runs perl directly. In the newly started process, we invoke the alarm(2)
system call to have the ALRM signal delivered to the process itself
after 10 seconds. Then we simply run /bin/yourprog within the current
process, and since the alarm timer is inherited by /bin/yourprog, the
process will receive the signal after 10 seconds and terminate
(provided /bin/yourprog does not install a handler for ALRM).

In the past, users have taken advantage of similar shell/Perl
features for advanced job control; e.g., see
http://simple-evcorr.sourceforge.net/FAQ.html#21. So instead of patching
SEC, I'd rely on the features of the shell or Perl, since they are
considerably more advanced.

hope this helps,
risto
