Jonathan Hurley created AMBARI-12657:
----------------------------------------

             Summary: Cluster creates fail on larger deployments with SQL Azure DB
                 Key: AMBARI-12657
                 URL: https://issues.apache.org/jira/browse/AMBARI-12657
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.0.0
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Critical
             Fix For: 2.1.1


We started creating larger clusters (48 worker nodes) with SQL Azure DB as 
the Ambari DB, and we are seeing HTTP GET requests time out on the client 
side (even after retries), causing 15% of cluster creates to fail. This is 
a tracking Jira to resolve the CRUD failures.

In some of my experiments with 48-node clusters, DB CPU usage goes above 50%, 
which might explain why SQL is running slowly.

end_time                 avg_cpu_percent  avg_data_io_percent  avg_log_write_percent  avg_memory_usage_percent
2015-08-05 18:51:24.153  40.89            0.00                 0.62                   0.67
2015-08-05 18:51:09.107  41.86            0.00                 1.49                   0.67
2015-08-05 18:50:54.090  24.36            0.00                 0.08                   0.67
2015-08-05 18:50:38.763  43.16            0.00                 0.57                   0.67
2015-08-05 18:50:23.700  65.03            0.00                 0.51                   0.67
2015-08-05 18:50:07.840  28.57            0.00                 0.45                   0.67
2015-08-05 18:49:49.480  39.78            0.00                 0.42                   0.67

The most expensive query in terms of CPU time is below; this one query 
consumes most of the CPU. The query plan is also attached.
{code}
SELECT DISTINCT t0.request_id FROM host_role_command t0 WHERE NOT EXISTS 
(SELECT @P0 FROM host_role_command t1 WHERE (t1.status IN 
(@P1,@P2,@P3,@P4,@P5,@P6,@P7,@P8,@P9)))  ORDER BY t0.request_id ASC
{code}
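
As written, the subquery in the statement above never references t0, so the NOT EXISTS test has the same result for every outer row: the statement returns either every request_id or none. The sketch below illustrates that behavior on an in-memory SQLite table, not on SQL Azure. The two-column table, the sample rows, and the two status values are placeholders for illustration (the real statement binds nine statuses), and the correlated variant is only an assumption about what the query may have been meant to express, not the fix adopted for this issue.

```python
import sqlite3

# Hypothetical miniature of host_role_command: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE host_role_command ("
    "task_id INTEGER PRIMARY KEY, request_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO host_role_command (request_id, status) VALUES (?, ?)",
    [(1, "COMPLETED"), (1, "COMPLETED"), (2, "COMPLETED"), (2, "IN_PROGRESS")],
)

# Same shape as the reported statement: the subquery does not mention t0.
UNCORRELATED = """
SELECT DISTINCT t0.request_id FROM host_role_command t0
WHERE NOT EXISTS (SELECT 1 FROM host_role_command t1
                  WHERE t1.status IN (?, ?))
ORDER BY t0.request_id ASC
"""

# Assumed alternative (not taken from the issue): correlate on request_id.
CORRELATED = """
SELECT DISTINCT t0.request_id FROM host_role_command t0
WHERE NOT EXISTS (SELECT 1 FROM host_role_command t1
                  WHERE t1.request_id = t0.request_id
                    AND t1.status IN (?, ?))
ORDER BY t0.request_id ASC
"""


def request_ids(sql, statuses):
    """Run one of the statements above and return the request_id column."""
    return [row[0] for row in conn.execute(sql, statuses)]


statuses = ("IN_PROGRESS", "QUEUED")  # placeholder status values
uncorrelated_ids = request_ids(UNCORRELATED, statuses)
correlated_ids = request_ids(CORRELATED, statuses)
print("uncorrelated:", uncorrelated_ids)
print("correlated:  ", correlated_ids)
```

With these sample rows, one command in a listed status is enough for the uncorrelated form to return nothing, while the correlated form still returns request 1, whose commands are all outside the listed statuses.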



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
