It looks good now. Thanks. You can close the ticket.

----------------
Kashif Khan, PMI-ACP
Sr. Solution Architect
Publishing Technology
Houghton Mifflin Harcourt
9400 South Park Center Loop
Orlando, FL 32819
Office: 407.345.3420
Mobile: 407.949.4697
hmhco.com

From: indar verma <[email protected]>
Reply-To: indar verma <[email protected]>, MarkLogic Developer Discussion <[email protected]>
Date: Monday, September 30, 2013 8:35 AM
To: Michael Blakeley <[email protected]>, MarkLogic Developer Discussion <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [MarkLogic Dev General] One node CPU utilization maxed out but others not in a 5 node cluster once load increases

Hi Michael,

Thanks a lot for your suggestions and for explaining the problem in detail.

There are 4 forests on each node: 2 masters and 2 replicas.
Total: 20 forests (10 masters + 10 replicas).

I am attaching some screenshots of the DB.

I have started looking into the xqy and am trying to reduce response time. I will follow your other instructions too, to look at the other factors.

The actual problem is that I have to justify why CPU usage reaches maximum on the ML4 node only, even though it is a data node and not all of the data resides on that node. I am struggling to find a concrete reason. Every time, my customer asks why only the ML4 node is reaching maximum.

Thanks & Regards,
JJ

________________________________
From: Michael Blakeley <[email protected]>
To: indar verma <[email protected]>; MarkLogic Developer Discussion <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Monday, 30 September 2013 1:44 AM
Subject: Re: [MarkLogic Dev General] One node CPU utilization maxed out but others not in a 5 node cluster once load increases

Does "zenoss" mean Xen virtualization? PVM or HVM?
How many forests are on each host?

You could simply try upgrading from 6.0-3.2 to the latest release, 6.0-4, and see if that helps. But if it were me I would want to know which query or queries caused the problem. Even though you aren't sending queries directly to that busy host, it's resolving index lookups as requested by the eval hosts. So it's still important to look at long-running queries, as these are the ones likely driving the load on your busy host. You also want to have a reproducible test case, and the best way to build that is to isolate a query that recreates the high load.

At the same time, dig into how utilization is measured and exactly what the numbers are. It's not enough to say that a host is "maxed out": you need to understand which subsystem is the bottleneck. It's quite difficult to drive a 16-core or 32-core host to 0% idle, especially if the workload is mixed between network, disk, and CPU activity. You really want to know how much of each is involved, to better understand what "maxed out" really means.

For example 'iostat -mxz 15' is a good way to monitor current activity, or if sysstat is collecting data then sar can display it. Just to illustrate the point, here are some low-utilization numbers from a system I happen to have handy.

12:00:01 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
12:05:01 AM  all   6.24   0.30     0.50     0.24    0.09  92.64
12:15:01 AM  all   2.79   0.00     0.12     0.09    0.05  96.95
12:25:01 AM  all   3.39   0.00     0.16     0.10    0.07  96.27
12:35:01 AM  all   2.80   0.00     0.13     0.06    0.06  96.96

If this host were "maxed out", that could appear as high %user, or %nice, or %system, or %iowait, or %steal - or any mix of those. That, in turn, would tell you something about why the host is busy.

If it turns out to be high %system or %iowait, take a look at the :8001/host-status.xqy page for the host in question. At the bottom you'll see a table of rates and loads, which will tell you something about where the host is spending its time.
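As a rough illustration of reading those sar columns, here is a minimal Python sketch that picks out the busiest non-idle category in each sample. It is only a sketch under assumptions: the helper names (parse_sar_row, dominant_category, summarize) are hypothetical, and it assumes rows in the 12-hour "HH:MM:SS AM all ..." layout shown above with the six columns in that order. Other sar locales or versions may print a different layout.

```python
# Hypothetical helpers: parse `sar -u`-style rows and report which CPU
# category accounts for the largest share of non-idle time per sample.
# Assumes the 12-hour timestamp layout and column order shown in the thread.
FIELDS = ["%user", "%nice", "%system", "%iowait", "%steal", "%idle"]

# The four data rows quoted in the message (header row omitted).
SAMPLE = """\
12:05:01 AM all 6.24 0.30 0.50 0.24 0.09 92.64
12:15:01 AM all 2.79 0.00 0.12 0.09 0.05 96.95
12:25:01 AM all 3.39 0.00 0.16 0.10 0.07 96.27
12:35:01 AM all 2.80 0.00 0.13 0.06 0.06 96.96
"""


def parse_sar_row(line):
    """Split one 'HH:MM:SS AM all ...' row into (timestamp, {field: value})."""
    parts = line.split()
    timestamp = " ".join(parts[:2])          # e.g. "12:05:01 AM"
    values = [float(x) for x in parts[3:]]   # skip the CPU column ("all")
    return timestamp, dict(zip(FIELDS, values))


def dominant_category(sample):
    """Return (name, percent) of the largest non-idle category."""
    busy = {k: v for k, v in sample.items() if k != "%idle"}
    name = max(busy, key=busy.get)
    return name, busy[name]


def summarize(text):
    """Return a list of (timestamp, busiest category, percent, %idle)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        timestamp, sample = parse_sar_row(line)
        name, pct = dominant_category(sample)
        rows.append((timestamp, name, pct, sample["%idle"]))
    return rows


if __name__ == "__main__":
    for timestamp, name, pct, idle in summarize(SAMPLE):
        print(f"{timestamp}: busiest={name} ({pct:.2f}%), idle={idle:.2f}%")
```

On a genuinely busy host the interesting part is which category wins: a large %iowait or %system share points toward disk or kernel activity rather than query evaluation in user space.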
-- Mike

On 29 Sep 2013, at 11:57, indar verma <[email protected]> wrote:

> One more thing to add:
>
> We are sending requests to ML1 to ML3 in round-robin fashion from the
> application end, so ML4 & ML5 are not accepting any direct requests from
> the front-end app.
>
> We are ingesting data through these two newly added nodes, ML4 & ML5.
>
> Thanks,
> JJ
>
> From: indar verma <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Monday, 30 September 2013 12:18 AM
> Subject: Re: One node CPU utilization maxed out but others not in a 5 node
> cluster once load increases
>
> Thanks Michael!!!
>
> Which version of MarkLogic is this? - MarkLogic 6.0-3.2
>
> How many CPU cores are on each host? - ML1 to ML4: 2 CPUs & 16 cores;
> ML5: 2 CPUs & 32 cores
>
> Local disk failover is enabled between ML1 to ML3 and between ML4 and ML5
> (as ML4 & ML5 have been added recently).
>
> How much RAM is on each host? - 64GB
>
> How are you measuring CPU utilization? - Linux commands, run once response
> time increases and the infra team tells us. It seems they are using zenoss.
>
> My concern is why the ML4 CPU is maxed out even though it does not hold
> much data and no XQuery is being executed on this node.
>
> Could the reason be that ML5 has a higher configuration than ML4 and
> failover is enabled between the two?
>
> Of course XQuery execution can be one of the causes, as these queries will
> be taking time; however, the heavy load & data are on ML1 to ML3, and even
> so their CPU usage is around 30%.
>
> Thanks,
> JJ
>
> From: indar verma <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Sunday, 29 September 2013 11:18 PM
> Subject: One node CPU utilization maxed out but others not in a 5 node
> cluster once load increases
>
> Hi All,
>
> We have a 5-node clustered environment, and the data is in TBs. ML1 to ML3
> are evaluator nodes & data nodes, and ML4 & ML5 are data nodes only for now.
>
> Once load increases, ML4's CPU utilization reaches maximum, while the other
> 4 nodes stay normal at around 30% utilization.
>
> Because the ML4 CPU is maxed out, pages are not rendering and some queries
> are timing out.
>
> One more thing: ML4 is not doing more data processing than the other
> nodes.
>
> Could somebody please help me resolve this? What can be the root cause
> of this problem? What should we improve?
>
> Thanks in advance!!!
>
> JJ
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
