Can you give some insight into the machine specs and JVM options used? Also, is it 8000 jobs or 8000 tasks? The two terms are often mixed up, but the distinction makes a big difference here.
On Wednesday, June 8, 2016, Shyam Patel <sham.pate...@gmail.com> wrote:

> Hi,
>
> While running LnP testing, I’m spinning of 8K docker jobs. During the run,
> I ran into issue where TaskStatUpdate and TaskReconciler queries taking
> real long times. During the time, Aurora is pretty much freezing and at a
> point dying. Also, tried the same run w/o the docker jobs and faced the
> same issue.
>
> Is there a way to keep the Aurora performance intact during the query runs
> ?
>
> Here is snipped from log :
>
> I0602 00:53:37.527 [TaskStatUpdaterService RUNNING, DbTaskStore:104] Query
> took 1243517 ms: TaskQuery(owner:null, role:null, environment:null,
> jobName:null, taskIds:null, statuses:[STARTING, THROTTLED, RUNNING,
> DRAINING, ASSIGNED, KILLING, RESTARTING, PENDING, PREEMPTING],
> instanceIds:null, slaveHosts:null, jobKeys:null, offset:0, limit:0)
>
> I0602 00:56:54.180 [TaskReconciler-0, DbTaskStore:104] Query took 1380169
> ms: TaskQuery(owner:null, role:null, environment:null, jobName:null,
> taskIds:null, statuses:[STARTING, RUNNING, DRAINING, ASSIGNED, KILLING,
> RESTARTING, PREEMPTING], instanceIds:null, slaveHosts:null, jobKeys:null,
> offset:0, limit:0)
>
> Appreciate any insights..
>
> Thanks,
> Sham