Hi, yes, you found the root cause, thank you :)
The workaround, however, will cause a huge memory leak, because the process tree needs to be freed with delprocesstree. Putting the call into Util_isProcessRunning() would also reload the process tree unnecessarily many times during the testing cycle. The fix is attached: the process tree is refreshed only in wait_start() and wait_stop(), where it needs to be refreshed. Could you please try the patch?

Thanks,
Best regards,
Martin
process.patch
Description: Binary data
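For readers without the attachment, here is a minimal sketch of the idea. It is not the attached patch and not Monit's actual code: the names Proc, live, snapshot, refresh_snapshot(), is_running() and spawn() are hypothetical stand-ins, and only wait_start() borrows its name from the function mentioned above. It shows a cached process table whose old copy is freed on every refresh, a lookup that only reads the cache, and a wait loop that refreshes once per poll.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the process tree. */
typedef struct { int pid; char cmdline[32]; } Proc;

/* Simulated "live" system state (assumption: at most 8 processes). */
static Proc live[8];
static int live_size = 0;

/* Cached snapshot that the matching code reads. */
static Proc *snapshot = NULL;
static int snapshot_size = 0;
static int refresh_count = 0;

/* Take a new snapshot; the old one is freed first so nothing leaks. */
static void refresh_snapshot(void) {
    free(snapshot);
    snapshot = NULL;
    snapshot_size = live_size;
    if (live_size > 0) {
        snapshot = malloc((size_t)live_size * sizeof *snapshot);
        assert(snapshot);
        memcpy(snapshot, live, (size_t)live_size * sizeof *snapshot);
    }
    refresh_count++;
}

/* Match against the cached snapshot only; never refreshes it. */
static int is_running(const char *pattern) {
    for (int i = 0; i < snapshot_size; i++)
        if (strstr(snapshot[i].cmdline, pattern))
            return snapshot[i].pid;
    return 0;
}

/* Simulate a start action adding a process to the live state. */
static void spawn(int pid, const char *cmdline) {
    assert(live_size < 8);
    strncpy(live[live_size].cmdline, cmdline,
            sizeof live[live_size].cmdline - 1);
    live[live_size].cmdline[sizeof live[live_size].cmdline - 1] = '\0';
    live[live_size].pid = pid;
    live_size++;
}

/* Poll for the process, refreshing the snapshot once per attempt. */
static int wait_start(const char *pattern, int max_tries) {
    for (int i = 0; i < max_tries; i++) {
        refresh_snapshot();
        int pid = is_running(pattern);
        if (pid)
            return pid;
    }
    return 0;
}
```

In this sketch a lookup right after spawn() still misses the new process, because the snapshot is stale. The same lookup succeeds once wait_start() has refreshed it, which mirrors the "failed to start" symptom and its fix.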
On Mar 18, 2011, at 2:55 AM, <[email protected]> <[email protected]> wrote:

> Hi.
>
> I checked those points.
> ...
> a) Execute and Permission is all ok.
> b) No problem.
>
> And start script "jajs_spmd" is normal end status.
>
> But I found a workaround(?).
> Modifying the source "util.c", I added the following code, which calls
> "initprocesstree" in the function "Util_isProcessRunning".
>
> ------
> int Util_isProcessRunning(Service_T s) {
>   int i;
>   pid_t pid = -1;
>
>   ASSERT(s);
>
>   errno = 0;
>
>   if (s->matchlist) {
>     /* The process table read may sporadically fail during read, because we're using glob on some platforms which may fail if the proc filesystem
>      * which it traverses is changed during glob (process stopped). Note that the glob failure is rare and temporary - it will be OK on next cycle.
>      * We skip the process matching that cycle however because we don't have process informations - will retry next cycle */
>
>     /* added by futa */
>     initprocesstree(&ptree, &ptreesize, &oldptree, &oldptreesize);   <------ Added
>     /****/
>     if (Run.doprocess) {
>       for (i = 0; i < ptreesize; i++) {
> ------------------------
>
> With this modification, the start action ends normally, not "failed to start".
>
> It seems to me that:
>
> Function "initprocesstree" isn't called after the start action.
>
> Because the matching function compares against a process tree which is from before the start action,
> the matching function does not match ---- "failed to start"
>
> Am I right in this guess?
>
>
> Thanks, Kenichi Futatsumori in Japan.
>
>> -----Original Message-----
>> From: monit-general-bounces+kenichi.futatsumori=unisys.co.jp@nongnu.org
>> [mailto:monit-general-bounces+kenichi.futatsumori=unisys.co.jp@nongnu.org] On Behalf Of Jan-Henrik Haukeland
>> Sent: Thursday, March 10, 2011 8:09 PM
>> To: This is the general mailing list for monit
>> Subject: Re: Problems : start action status is always "failed to start" when check process with matching
>>
>>
>> On Mar 10, 2011, at 10:25 AM, <[email protected]> wrote:
>>
>>> But start action or restart action is always "failed to start".
>>
>> There may only be a few reasons that Monit cannot start the program
>> a) the user that started Monit does not have permission to start the
>> process or there are other permission problems such as if the program
>> write to a file or b) that the program need special environment
>> variables, such as PATH set. As you may or may not know, Monit strips
>> the environment and leave only a spartan PATH. Please check this and
>> also any log files such as /var/messages etc for clues. If all else
>> fails; strace Monit and see what actually fails.
>>
>> --
>> To unsubscribe:
>> http://lists.nongnu.org/mailman/listinfo/monit-general
