@Ian: I have a separate CC on one node that doesn't have an NC. YourKit might be a good way to find the reason. Thanks.
@Till: I think so. I am sending the same query now to see what happens this time.

Best,
Taewoo

On Thu, Dec 1, 2016 at 10:41 PM, Till Westmann <[email protected]> wrote:

> Hi Taewoo,
>
> is this behavior reproducible?
>
> Cheers,
> Till
>
> On 1 Dec 2016, at 22:14, Taewoo Kim wrote:
>
>> PS: It took 2 more hours to finish the job on one NC. I wonder why this
>> happens.
>>
>> Dec 01, 2016 7:19:35 PM org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
>> INFO: Executing: NotifyTaskComplete
>> Dec 01, 2016 9:11:23 PM org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
>> INFO: Executing: CleanupJoblet
>> Dec 01, 2016 9:11:23 PM org.apache.hyracks.control.nc.work.CleanupJobletWork run
>> INFO: Cleaning up after job: JID:4
>> Dec 01, 2016 9:11:23 PM org.apache.hyracks.control.nc.Joblet close
>> WARNING: Freeing leaked 54919521 bytes
>>
>> Best,
>> Taewoo
>>
>> On Thu, Dec 1, 2016 at 8:39 PM, Taewoo Kim <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> Have you experienced this case?
>>>
>>> I have 9 NCs and the CPU utilization of one NC shows 100% for 1 hour and
>>> 30 minutes while other NCs have finished their job about 1 hour ago. Even
>>> the problematic NC shows the following log at the end. So, looks like it's
>>> done but I'm not sure why this job never finishes. It's a simple hash join
>>> for 9M records on 9 nodes.
>>>
>>> Dec 01, 2016 7:18:02 PM org.apache.hyracks.control.common.work.WorkQueue$WorkerThread run
>>> INFO: Executing: NotifyTaskComplete
>>>
>>> Best,
>>> Taewoo
