On Wed, Nov 22, 2017 at 09:53:17AM -0800, Mun Johl wrote:
> Hi,
>
> Periodically I am seeing the following error:
>
> Unable to initialize environment because of error: cannot register event
> client. Only 100 event clients are allowed in the system
>
> The error first showed up a few days ago but stated "950 event clients are
> allowed". Because MAX_DYN_EC was not set in my config, I equated it to
> 100.

I am not sure what you mean by "I equated it to 100". Did you set it to 100 after getting the error? IIRC the default is 1000.

> However, our sim ring is fairly small at this point and we shouldn't be
> getting anywhere near 100 outstanding qsub's (let alone 950). Therefore,
> I'm wondering what other factors could result in this error?
>
> For example, could a slow network or slow grid master result in this
> error?
>
> Any suggestions on how I can get to root cause would be most appreciated.
>
> Thanks,

Are you actually using qsub? IIRC, when using DRMAA it is possible to leak event clients if you launch multiple jobs from the same process (i.e. the event client is created when a job is qsub'd but isn't freed automatically when the job terminates, only when the client program does).

If you are using qsub -sync y, check that the qsub processes are actually being reaped (i.e. there aren't a bunch of zombie qsubs hanging around).

Also check that you aren't short of filehandles (i.e. ulimit), either where the submit program runs or where the qmaster lives.

William
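
The two local checks suggested above (zombie qsub processes and the open-file limit) could be scripted roughly as follows. This is only a sketch: it assumes a Linux-style `ps` that accepts `-A -o pid,stat,comm` and reports defunct processes with a state beginning with `Z`. The helper names `find_zombies` and `nofile_limits` are made up for illustration and are not part of Grid Engine.

```python
import resource
import subprocess


def find_zombies(ps_output, name="qsub"):
    """Return PIDs of processes called `name` whose state starts with 'Z'.

    `ps_output` is expected to be the text of `ps -A -o pid,stat,comm`
    (an assumption about the local ps; adjust for your platform).
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header and any malformed lines
        pid, stat, comm = fields
        # A zombie may be listed as e.g. "qsub <defunct>", so compare
        # only the first word of the command column.
        if stat.startswith("Z") and comm.split()[0] == name:
            pids.append(int(pid))
    return pids


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    out = subprocess.run(
        ["ps", "-A", "-o", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    print("zombie qsub PIDs:", find_zombies(out))
    print("open-file limits (soft, hard):", nofile_limits())
```

The limits reported are those of the process running the script, so it would need to be run as the relevant user on both the submit host and the qmaster host to be meaningful.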
