This looks like data from the Job Details page showing job-processes. For that job, what does the Jobs page look like? Is the state Completed? If so, the job is not running and the information on the Job Details page is spurious.
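One way to confirm this independently of the web pages is to check on the host itself whether the listed PIDs still exist. Below is a minimal sketch in Python, assuming it is run directly on the host in question (for example S144) on a Unix-like system. The helper pid_is_alive is hypothetical and not part of DUCC, and a PID may since have been reused by an unrelated process, so a positive result is only a hint.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists on the local host."""
    try:
        # Signal 0 delivers nothing; it only performs the existence/permission check.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but is owned by another user
    return True


# PIDs shown for host S144 in the quoted Job Details listing (illustrative only).
for pid in (8408, 8962, 8503):
    print(pid, "alive" if pid_is_alive(pid) else "not running")
```

If every PID comes back as "not running", the active states on the Job Details page are stale display data rather than live processes.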
Said another way, if check_ducc -k is working then all your DUCC daemons were stopped and you had to re-start DUCC. Upon re-start (presuming you used the default "warm" start), all previously running jobs are marked as Completed. If the job itself is Completed yet its job-processes continue to show an active state, then this is erroneous information, and I assert that the job-processes are not really running. The fact that the Job Details page reports otherwise is a bug that needs to be fixed (if not already fixed in the next release).

Lou.

On Wed, Jul 9, 2014 at 1:41 AM, reshu.agarwal <[email protected]> wrote:

> On 07/08/2014 03:44 PM, Jim Challenger wrote:
>
>> I like to stop ducc by issuing check_ducc -k a few times after stop_ducc.
>> This sends kill -9 to any ducc components that couldn't stop for some
>> reason. Unfortunately it can't kill zombies but once you have done
>> check_ducc -k it should not matter. As Lou mentioned, the 1.1.0 release
>> will make some of this situation better but I've seen intense analytics
>> leave hardware and software on hosts in states that only kill -9 can
>> effectively handle.
>
> Dear Jim and Lou,
>
> I have tried all check_ducc -k and ./stop_ducc but the Job is showing
> incremented status till now as given below:
>
> Columns: Id | Log | Size | Host Name | PID | State Scheduler |
> Reason Scheduler or extraordinary status | State Agent | Reason Agent |
> Exit | Time Init | Time Run | Time GC | PgIn | Swap | %CPU | RSS |
> Time Avg | Time Max | Time Min | Done | Error | Dispatch | Retry |
> Preempt | JConsole URL
>
> 0  jd.out.log
>    <http://192.168.10.144:42133/ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/jd.out.log>
>    0.14  S144  8408  Deallocated  Voluntary  Stopped  ExitCode=0
>    00  2:15:59:40  00  57  0.0  5.0  0.2  6  16  1  14  0  0  0  0
>
> 0  jd.out.log
>    <http://192.168.10.144:42133/ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/jd.out.log>
>    0.14  S144  8408  Deallocated  Voluntary  Stopped  ExitCode=0
>    00  2:15:59:40  00  57  0.0  5.0  0.2  6  16  1  14  0  0  0  0
>
> 10849  696-UIMA-S1-8962.log
>    <http://192.168.10.144:42133/ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S144-8962.log>
>    0.02  S144  8962  Deallocated  Starting
>    2:15:57:46 <http://192.168.10.144:42133/uima-initialization-report.html?idJob=30696&idPro=10849>
>    00  00  0  0.0  0.0  0.0
>
> 10848  696-UIMA-S1-8503.log
>    <http://192.168.10.144:42133/ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S144-8503.log>
>    0.02  S144  8503  Deallocated  Purged  Stopped  ExitCode=0
>    50 <http://192.168.10.144:42133/uima-initialization-report.html?idJob=30696&idPro=10848>
>    2:15:58:15  02  1102  0.0  46.0  2.3  6  16  1  14  0  0  0  0
>
> 10852  696-UIMA-S2-11649.log
>    <http://192.168.10.144:42133/ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S143-11649.log>
>    0.04  S143  11649  Deallocated  Voluntary  Stopped  Discontinued  ExitCode=0
>    31 <http://192.168.10.144:42133/uima-initialization-report.html?idJob=30696&idPro=10852>
>    00  02  0  0.0  1.0  2.2
>
> --
> Thanks,
> Reshu Agarwal
