Hi Eddie,

Thanks so much for taking the time to look at my issue and for your reply.
The reason I increased the heap size for the JD is that I'm running cTAKES
(http://ctakes.apache.org/) with DUCC, and the larger heap is needed to load
all of the cTAKES models into memory. Before I increased it, DUCC would
cancel the driver and the job would end, with cTAKES reporting
"java.lang.OutOfMemoryError: Java heap space".

Would you say this problem is mainly a limitation of the physical memory and
other processes running on my computer, or can it be addressed in DUCC, for
example by adjusting parameters so I can use a larger heap, or by
pre-allocating enough memory for DUCC to use? (I've put a sketch of what I
have in mind below the quoted thread.)

Thanks again,
Selina

On Wed, Mar 9, 2016 at 7:35 PM, Eddie Epstein <eaepst...@gmail.com> wrote:

> Hi Selina,
>
> I suspect that the problem is due to the following job parameter:
>
>     driver_jvm_args -Xmx4g
>
> This would certainly be true if cgroups have been turned on for DUCC. The
> default cgroup size for a JD is 450MB, so specifying an Xmx of 4GB can
> cause the JVM to spill into swap space and cause erratic behavior.
>
> Comparing a "fast" job (96) vs. a "slow" job (97), the time to process the
> single work item was 8 sec vs. 9 sec:
>
> 09 Mar 2016 08:46:08,556 INFO JobDriverHelper - T[20] summarize workitem statistics [sec] avg=8.14 min=8.14 max=8.14 stddev=.00
>
> vs.
>
> 09 Mar 2016 08:56:46,583 INFO JobDriverHelper - T[19] summarize workitem statistics [sec] avg=9.41 min=9.41 max=9.41 stddev=.00
>
> The extra delay between the two jobs appears to be associated with the
> Job Driver.
>
> Was there some reason you specified a heap size for the JD? The default
> JD heap size is Xmx400m.
>
> Regards,
> Eddie
>
> On Wed, Mar 9, 2016 at 2:41 PM, Selina Chu <selina....@gmail.com> wrote:
>
> > Hi,
> >
> > I'm kind of new to DUCC and this forum. I was hoping someone could give
> > me some insight into why DUCC is behaving strangely and a bit unstably.
> >
> > What I'm trying to do: I'm using DUCC to process a cTAKES job, currently
> > on a single node. DUCC seems to process the jobs erratically, taking
> > anywhere from 4.5 minutes to 23 minutes, and I wasn't running anything
> > else CPU-intensive. When I run cTAKES alone, without DUCC, the
> > processing times are quite consistent.
> >
> > To demonstrate this strange behavior, I submitted the exact same job 10
> > times in a row (job IDs 95-104) without changing any settings. The
> > durations were: 4:41, 4:43, 12:48, 8:41, 5:24, 4:38, 7:07, 23:08, 8:08,
> > and 20:37 (canceled by the system). The first 9 jobs completed and the
> > last one was canceled, but even the 9 completed jobs varied widely in
> > duration. After restarting and resetting DUCC a couple of times, I
> > submitted the same job again (job ID 110); it completed without a
> > problem, though with a long processing time.
> >
> > I noticed that when a job takes a long time to finish (past 5 minutes),
> > it seems to be stuck in the "initializing" and "completing" states the
> > longest.
> >
> > It seems like DUCC is doing something random. I tried examining the log
> > files, but they are all similar except for the time spent in each state.
> > (I've also placed the related logs and the job file in a repo,
> > https://github.com/selinachu/Templogs, in case anyone is interested in
> > examining them.)
> >
> > I'm baffled by DUCC's random behavior.
> > I was hoping someone could clarify this for me.
> >
> > After completing a job, what does DUCC do? Does it save something in
> > memory that carries over to the next job, perhaps related to the
> > initialization process? Are there parameter settings that might
> > alleviate this type of behavior?
> >
> > I would appreciate any insight. Thanks in advance for your help.
> >
> > Cheers,
> > Selina Chu
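
P.S. To make my question concrete, here is a sketch of the kind of job
specification I have in mind. It's only a guess at the right shape: the
class names are placeholders for my actual cTAKES descriptors, and I'm
assuming from the documentation that process_jvm_args (the job processes,
where the models actually load), not driver_jvm_args (the JD), is where the
big heap belongs.

    # sketch of a specification file for ducc_submit; placeholder names, not my real job
    description           cTAKES pipeline via DUCC
    # placeholder collection reader
    driver_descriptor_CR  my.package.MyCollectionReader
    # leave the JD at its default heap
    driver_jvm_args       -Xmx400m
    # placeholder analysis engine that loads the cTAKES models
    process_descriptor_AE my.package.MyCtakesPipeline
    # the large heap goes to the job processes instead
    process_jvm_args      -Xmx4g
    # in GB; must cover the 4g heap plus JVM overhead
    process_memory_size   6
    scheduling_class      normal

If I've understood your point about cgroups, the memory request has to be
large enough to cover the -Xmx plus overhead, so that the cgroup limit (when
cgroups are enabled) doesn't push the JVM into swap.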
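P.P.S. To confirm which JVM actually receives the larger heap, I'm planning
to log the granted maximum from inside the pipeline. A minimal standalone
sketch (the class name is just illustrative; in practice I'd put the same
printout in an annotator's initialize() method):

    // Prints the maximum heap this JVM was granted, to confirm where the
    // -Xmx setting actually lands (the JD vs. the job processes).
    public class HeapCheck {
        public static void main(String[] args) {
            long maxBytes = Runtime.getRuntime().maxMemory();
            System.out.printf("max heap available: %d MB%n", maxBytes / (1024 * 1024));
        }
    }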