Sailaja Mada created CLOUDSTACK-3729:
----------------------------------------
Summary: Management Server stopped responding in a Multinode Management Setup
Key: CLOUDSTACK-3729
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3729
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Management Server
Affects Versions: 4.2.0
Reporter: Sailaja Mada
Priority: Critical
Setup: Advanced configuration with VMware (Standard vSwitch with 2 clusters) and
zone-wide primary storage
Steps:
1. Configure a multi-node Management Server setup (MS1 & MS2).
2. Set agent.lb.enabled to true.
3. The DB is hosted locally on MS1.
4. Create new accounts and deploy VMs.
Observation:
1. Tried to access the Management Servers (MS1/MS2). Both were very slow to
respond and eventually stopped responding.
2. Restarting the Management Servers did not help.
DB Statistics:
mysql> SHOW STATUS WHERE `variable_name` = 'Threads_connected';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Threads_connected | 38 |
+-------------------+-------+
1 row in set (0.00 sec)
mysql> show status like '%onn%';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| Aborted_connects | 6 |
| Connections | 1461 |
| Max_used_connections | 41 |
| Ssl_client_connects | 0 |
| Ssl_connect_renegotiates | 0 |
| Ssl_finished_connects | 0 |
| Threads_connected | 38 |
+--------------------------+-------+
7 rows in set (0.00 sec)
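For comparing snapshots like the ones above, a small parser can turn the mysql client's table output into numbers. This is only a sketch: the function name `parse_status_table` is hypothetical, and the sample rows are copied from the output above. Note that `Connections` is a cumulative count of connection attempts since server start, while `Threads_connected` is the number currently open. The report does not show `max_connections`, so these figures alone do not establish whether the connection limit was reached.

```python
def parse_status_table(text):
    """Parse the ASCII table printed by the mysql client into {name: value}."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Data rows look like "| Name | 123 |"; skip borders and the header.
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) == 2 and cells[1].isdigit():
            values[cells[0]] = int(cells[1])
    return values


# Sample rows copied from the SHOW STATUS output in this report.
SAMPLE = """
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| Aborted_connects | 6 |
| Connections | 1461 |
| Max_used_connections | 41 |
| Threads_connected | 38 |
+--------------------------+-------+
"""

status = parse_status_table(SAMPLE)
# Attempts since server start that are no longer open connections.
closed_or_failed = status["Connections"] - status["Threads_connected"]
print(status["Threads_connected"], status["Max_used_connections"], closed_or_failed)
```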
[root@ec2 management]# mysqladmin -u root -ppassword processlist
+------+-------+----------------------+-------------+---------+------+-------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-------+----------------------+-------------+---------+------+-------+------------------+
| 35 | cloud | localhost:32999 | cloud | Sleep | 708 | | |
| 40 | cloud | localhost:33004 | cloud | Sleep | 708 | | |
| 43 | cloud | localhost:33007 | cloudbridge | Sleep | 715 | | |
| 81 | cloud | localhost:33310 | cloud_usage | Sleep | 872 | | |
| 959 | cloud | 10.102.192.117:48041 | cloud | Sleep | 11 | | |
| 965 | cloud | 10.102.192.117:48049 | cloud | Query | 0 | | commit |
| 1015 | cloud | 10.102.192.117:48294 | cloud_usage | Sleep | 186 | | |
| 1406 | cloud | localhost:60220 | cloud | Sleep | 725 | | |
| 1410 | cloud | 10.102.192.117:54646 | cloud | Sleep | 93 | | |
| 1412 | cloud | 10.102.192.117:54648 | cloud | Sleep | 173 | | |
| 1413 | cloud | localhost:60352 | cloud | Sleep | 725 | | |
| 1415 | cloud | 10.102.192.117:54704 | cloud | Sleep | 133 | | |
| 1418 | cloud | localhost:60445 | cloud | Sleep | 709 | | |
| 1419 | cloud | localhost:60463 | cloud | Sleep | 853 | | |
| 1420 | cloud | localhost:60467 | cloud | Sleep | 851 | | |
| 1421 | cloud | localhost:60468 | cloud | Sleep | 820 | | |
| 1422 | cloud | 10.102.192.117:54799 | cloud | Sleep | 35 | | |
| 1423 | cloud | localhost:60472 | cloud | Sleep | 846 | | |
| 1424 | cloud | localhost:60473 | cloud | Sleep | 725 | | |
| 1425 | cloud | localhost:60474 | cloud | Sleep | 733 | | |
| 1426 | cloud | localhost:60475 | cloud | Sleep | 852 | | |
| 1427 | cloud | localhost:60476 | cloud | Sleep | 734 | | |
| 1429 | cloud | localhost:60478 | cloud | Sleep | 725 | | |
| 1430 | cloud | localhost:60479 | cloud | Sleep | 780 | | |
| 1431 | cloud | localhost:60480 | cloud | Sleep | 733 | | |
| 1432 | cloud | localhost:60481 | cloud | Sleep | 851 | | |
| 1433 | cloud | localhost:60482 | cloud | Sleep | 851 | | |
| 1434 | cloud | localhost:60483 | cloud | Sleep | 725 | | |
| 1435 | cloud | localhost:60484 | cloud | Sleep | 729 | | |
| 1436 | cloud | localhost:60485 | cloud | Sleep | 845 | | |
| 1437 | cloud | 10.102.192.117:54800 | cloud | Sleep | 2 | | |
| 1438 | cloud | localhost:60486 | cloud | Sleep | 708 | | |
| 1439 | cloud | localhost:60487 | cloud | Sleep | 733 | | |
| 1440 | cloud | localhost:60488 | cloud | Sleep | 733 | | |
| 1441 | cloud | 10.102.192.117:54850 | cloud | Sleep | 1 | | |
| 1442 | cloud | 10.102.192.117:54851 | cloud | Sleep | 13 | | |
| 1443 | cloud | 10.102.192.117:54852 | cloud | Sleep | 183 | | |
| 1445 | cloud | 10.102.192.117:55057 | cloud | Sleep | 53 | | |
| 1446 | cloud | 10.102.192.117:55058 | cloud | Sleep | 183 | | |
| 1449 | root | localhost | | Query | 0 | | show processlist |
+------+-------+----------------------+-------------+---------+------+-------+------------------+
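To see at a glance which Management Server holds which connections and how long they have been idle, the processlist can be summarized per client host. The following is a minimal sketch, assuming each processlist row sits on a single line (rows wrapped by a mail client would need to be rejoined first). The function name `summarize_processlist` and the 600-second idle threshold are illustrative assumptions, not CloudStack or MySQL settings.

```python
from collections import Counter

# Hypothetical threshold: treat a Sleep thread idle this long as "long idle".
IDLE_THRESHOLD_SECONDS = 600


def summarize_processlist(text):
    """Return (connections per client host, ids of long-idle Sleep threads)."""
    per_host = Counter()
    long_idle = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Expected columns: Id, User, Host, db, Command, Time, State, Info
        if len(cells) < 6 or not cells[0].isdigit():
            continue  # border, header, or unrelated line
        thread_id = int(cells[0])
        host = cells[2].split(":")[0]  # drop the client port
        command, seconds = cells[4], int(cells[5])
        per_host[host] += 1
        if command == "Sleep" and seconds >= IDLE_THRESHOLD_SECONDS:
            long_idle.append(thread_id)
    return per_host, long_idle


# A few rows copied from the processlist in this report.
SAMPLE = """
| Id | User | Host | db | Command | Time | State | Info |
| 35 | cloud | localhost:32999 | cloud | Sleep | 708 | | |
| 959 | cloud | 10.102.192.117:48041 | cloud | Sleep | 11 | | |
| 965 | cloud | 10.102.192.117:48049 | cloud | Query | 0 | | commit |
| 1449 | root | localhost | | Query | 0 | | show processlist |
"""

per_host, long_idle = summarize_processlist(SAMPLE)
print(dict(per_host), long_idle)
```

In the full listing, the `localhost` entries belong to the Management Server that shares a host with the database, and the `10.102.192.117` entries come from the other node.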
[root@ec2 management]# mysqladmin -u root -ppassword processlist | grep cloud | wc
39 585 3822
[root@ec2 management]# mysqladmin -u root -ppassword processlist | grep cloud | wc -l
39
[root@ec2 management]# netstat -nat | grep 3306 | grep -i ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:48049 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60476 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60478 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:54850 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60352 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60220 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:55867 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60467 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:55663 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60486 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60445 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60479 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60474 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:48041 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60481 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:33007 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:33310 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60488 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:33004 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60483 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60485 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:56272 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:56271 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:54800 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60480 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:32999 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60487 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:48294 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60473 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60472 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:54852 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60468 ESTABLISHED
tcp 0 0 10.102.192.207:3306 10.102.192.117:56068 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60484 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:33004 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60488 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60467 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60468 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60445 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60481 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60352 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60487 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60479 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:33007 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:33310 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60484 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60472 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60474 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60485 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60476 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:32999 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60220 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60483 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60473 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60486 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60478 ::ffff:127.0.0.1:3306 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60480 ::ffff:127.0.0.1:3306 ESTABLISHED
[root@ec2 management]# netstat -nat | grep 3306 | grep -i ESTABLISHED | wc -l
57
[root@ec2 management]#
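The raw count of 57 overstates the number of database connections. Because MS1 and MySQL share a host, netstat lists every loopback connection twice, once from the server side (local port 3306) and once from the client side (remote port 3306, shown with the `::ffff:` IPv4-mapped prefix). In the listing above, 34 entries are server-side and 23 are the client-side mirrors of loopback connections. A rough sketch of a de-duplicating count follows. The function name `count_mysql_connections` is hypothetical, and the parser assumes each netstat entry is on one line with the state in the sixth column.

```python
def count_mysql_connections(netstat_text, port=3306):
    """Count unique ESTABLISHED connections to `port`, folding loopback duplicates."""
    suffix = ":%d" % port
    seen = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        # Strip the IPv4-mapped IPv6 prefix so both views of a loopback
        # connection normalize to the same address form.
        local = fields[3].replace("::ffff:", "")
        remote = fields[4].replace("::ffff:", "")
        if local.endswith(suffix):
            server, client = local, remote
        elif remote.endswith(suffix):
            server, client = remote, local
        else:
            continue  # not a MySQL connection
        seen.add((server, client))
    return len(seen)


# Three entries copied from the netstat output in this report; the last two
# are the server-side and client-side views of the same loopback connection.
SAMPLE = """
tcp 0 0 10.102.192.207:3306 10.102.192.117:48049 ESTABLISHED
tcp 0 0 127.0.0.1:3306 127.0.0.1:60476 ESTABLISHED
tcp 0 0 ::ffff:127.0.0.1:60476 ::ffff:127.0.0.1:3306 ESTABLISHED
"""

unique_connections = count_mysql_connections(SAMPLE)
print(unique_connections)
```

The processlist and netstat snapshots were taken at slightly different moments, so small differences between the two counts are expected.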
Cloud process statistics:
[root@ec2 management]# top -p 3323
top - 13:50:58 up 4 days, 16 min, 5 users, load average: 26.92, 28.20, 15.50
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.1%us, 3.0%sy, 0.0%ni, 0.0%id, 85.9%wa, 2.3%hi, 2.7%si, 0.0%st
Mem: 785484k total, 727832k used, 57652k free, 1968k buffers
Swap: 1572856k total, 1335760k used, 237096k free, 15088k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3323 cloud 20 0 5038m 375m 2500 S 11.7 48.9 105:08.63 java
Mysqld :
top - 13:52:17 up 4 days, 17 min, 5 users, load average: 15.43, 24.26, 15.16
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.4%us, 0.8%sy, 0.0%ni, 92.4%id, 4.0%wa, 0.9%hi, 0.3%si, 0.0%st
Mem: 785484k total, 741076k used, 44408k free, 2972k buffers
Swap: 1572856k total, 1333312k used, 239544k free, 15716k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2357 mysql 20 0 692m 32m 3188 S 1.8 4.2 98:57.49 mysqld
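The two top snapshots can be reduced to memory and swap utilization. The first snapshot also shows 85.9% I/O wait and a load average above 26 on a host with roughly 767 MB of RAM, while the java process has about 5 GB of virtual memory mapped. That pattern would be consistent with heavy swapping, although the report itself does not confirm a root cause. A small sketch of the arithmetic, where the helper name `usage_percent` is hypothetical and the input lines are copied from the first snapshot:

```python
import re


def usage_percent(top_line):
    """Return used/total as a percentage from a top 'Mem:' or 'Swap:' line."""
    match = re.search(r"(\d+)k total,\s*(\d+)k used", top_line)
    if match is None:
        raise ValueError("unrecognized top line: %r" % top_line)
    total, used = int(match.group(1)), int(match.group(2))
    return 100.0 * used / total


mem_pct = usage_percent("Mem: 785484k total, 727832k used, 57652k free, 1968k buffers")
swap_pct = usage_percent("Swap: 1572856k total, 1335760k used, 237096k free, 15088k cached")
print("memory used: %.1f%%, swap used: %.1f%%" % (mem_pct, swap_pct))
```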
mysql> SHOW ENGINE INNODB STATUS\G ;
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
130723 13:57:58 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 7 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 39333, signal count 38645
Mutex spin waits 0, rounds 81686, OS waits 3537
RW-shared spins 68861, OS waits 34420; RW-excl spins 1443, OS waits 1376
------------------------
LATEST FOREIGN KEY ERROR
------------------------
130719 13:50:13 Error in dropping of a foreign key constraint of table "cloud"."baremetal_pxe_devices",
in SQL command
ALTER TABLE baremetal_pxe_devices DROP FOREIGN KEY fk_external_pxe_devices_physical_network_id
Cannot find a constraint with the given id "fk_external_pxe_devices_physical_network_id".
------------
TRANSACTIONS
------------
Trx id counter 0 3618750
Purge done for trx's n:o < 0 3618480 undo n:o < 0 0
History list length 39
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0 3618620, not started, process no 2357, OS thread id 140486500669184
MySQL thread id 1472, query id 18528732 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618579, not started, process no 2357, OS thread id 140486496405248
MySQL thread id 1470, query id 18528479 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618577, not started, process no 2357, OS thread id 140486495872768
MySQL thread id 1473, query id 18528477 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618562, not started, process no 2357, OS thread id 140486499071744
MySQL thread id 1469, query id 18528390 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618538, not started, process no 2357, OS thread id 140486837835520
MySQL thread id 1471, query id 18528255 localhost 127.0.0.1 cloud
---TRANSACTION 0 0, not started, process no 2357, OS thread id 140486495606528
MySQL thread id 1474, query id 18529490 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 0 3618625, not started, process no 2357, OS thread id 140486498539264
MySQL thread id 1468, query id 18528753 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618703, not started, process no 2357, OS thread id 140486496671488
MySQL thread id 1466, query id 18529253 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618610, not started, process no 2357, OS thread id 140486704576256
MySQL thread id 1467, query id 18528674 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618745, not started, process no 2357, OS thread id 140486498006784
MySQL thread id 1465, query id 18529466 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618714, not started, process no 2357, OS thread id 140486500402944
MySQL thread id 1463, query id 18529300 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618532, not started, process no 2357, OS thread id 140486499870464
MySQL thread id 1462, query id 18528224 10.102.192.117 cloud
---TRANSACTION 0 3618410, not started, process no 2357, OS thread id 140486498805504
MySQL thread id 1461, query id 18527587 10.102.192.117 cloud
---TRANSACTION 0 3618526, not started, process no 2357, OS thread id 140486839965440
MySQL thread id 1459, query id 18528175 10.102.192.117 cloud
---TRANSACTION 0 3618568, not started, process no 2357, OS thread id 140486704043776
MySQL thread id 1458, query id 18528404 10.102.192.117 cloud
---TRANSACTION 0 3618411, not started, process no 2357, OS thread id 140486838101760
MySQL thread id 1457, query id 18527578 10.102.192.117 cloud
---TRANSACTION 0 3618158, not started, process no 2357, OS thread id 140486496941824
MySQL thread id 1439, query id 18527968 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618735, not started, process no 2357, OS thread id 140486839432960
MySQL thread id 1440, query id 18529427 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618632, not started, process no 2357, OS thread id 140486499604224
MySQL thread id 1436, query id 18528788 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618739, not started, process no 2357, OS thread id 140486500935424
MySQL thread id 1427, query id 18529442 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618570, not started, process no 2357, OS thread id 140486499337984
MySQL thread id 1431, query id 18528424 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618336, not started, process no 2357, OS thread id 140486497474304
MySQL thread id 1434, query id 18529306 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618695, not started, process no 2357, OS thread id 140486839699200
MySQL thread id 1429, query id 18529332 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618700, not started, process no 2357, OS thread id 140486497740544
MySQL thread id 1435, query id 18529247 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618618, not started, process no 2357, OS thread id 140486498273024
MySQL thread id 1432, query id 18528728 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618635, not started, process no 2357, OS thread id 140486501201664
MySQL thread id 1430, query id 18528805 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618749, not started, process no 2357, OS thread id 140486838634240
MySQL thread id 1437, query id 18529489 10.102.192.117 cloud
---TRANSACTION 0 3618697, not started, process no 2357, OS thread id 140486837569280
MySQL thread id 1424, query id 18529234 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618740, not started, process no 2357, OS thread id 140486497208064
MySQL thread id 1425, query id 18529446 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618629, not started, process no 2357, OS thread id 140486703245056
MySQL thread id 1423, query id 18528782 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618539, not started, process no 2357, OS thread id 140486496139008
MySQL thread id 1421, query id 18528251 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618699, not started, process no 2357, OS thread id 140486838368000
MySQL thread id 1420, query id 18529246 localhost 127.0.0.1 cloud
---TRANSACTION 0 3602499, not started, process no 2357, OS thread id 140486840497920
MySQL thread id 1418, query id 18508239 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618607, not started, process no 2357, OS thread id 140486702978816
MySQL thread id 1413, query id 18528671 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618715, not started, process no 2357, OS thread id 140486838900480
MySQL thread id 1406, query id 18529311 localhost 127.0.0.1 cloud
---TRANSACTION 0 3616468, not started, process no 2357, OS thread id 140486703777536
MySQL thread id 1015, query id 18516644 10.102.192.117 cloud
---TRANSACTION 0 2505627, not started, process no 2357, OS thread id 140486837036800
MySQL thread id 959, query id 18528771 10.102.192.117 cloud
---TRANSACTION 0 3615045, not started, process no 2357, OS thread id 140486837303040
MySQL thread id 81, query id 18508421 localhost 127.0.0.1 cloud
---TRANSACTION 0 3613913, not started, process no 2357, OS thread id 140486704310016
MySQL thread id 43, query id 18502143 localhost 127.0.0.1 cloud
---TRANSACTION 0 7638, not started, process no 2357, OS thread id 140486840231680
MySQL thread id 35, query id 18527805 localhost 127.0.0.1 cloud
---TRANSACTION 0 3618746, COMMITTED IN MEMORY, process no 2357, OS thread id 140486500136704 committing
, undo log entries 1
MySQL thread id 965, query id 18529471 10.102.192.117 cloud
commit
---TRANSACTION 0 3618741, COMMITTED IN MEMORY, process no 2357, OS thread id 140486839166720 committing
, undo log entries 1
MySQL thread id 40, query id 18529451 localhost 127.0.0.1 cloud
commit
---TRANSACTION 0 3618565, ACTIVE 5 sec, process no 2357, OS thread id 140486703511296
MySQL thread id 1438, query id 18529295 localhost 127.0.0.1 cloud
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
I/O thread 1 state: waiting for i/o request (log thread)
I/O thread 2 state: waiting for i/o request (read thread)
I/O thread 3 state: waiting for i/o request (write thread)
Pending normal aio reads: 0, aio writes: 0,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 1; buffer pool: 0
3945 OS file reads, 1095705 OS file writes, 759430 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.71 writes/s, 0.71 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2,
18 inserts, 18 merged recs, 8 merges
Hash table size 17393, node heap has 4 buffer(s)
46.99 hash searches/s, 88.27 non-hash searches/s
---
LOG
---
Log sequence number 0 189051814
Log flushed up to 0 189051218
Last checkpoint at 0 189043028
1 pending log writes, 0 pending chkp writes
622265 log i/o's done, 0.71 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 25657658; in additional pool allocated 1048320
Dictionary memory allocated 2388320
Buffer pool size 512
Free buffers 1
Database pages 507
Modified db pages 9
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 4232, created 1244, written 442699
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 2357, id 140486717187840, state: flushing log
Number of rows inserted 5365, updated 545505, deleted 464, read 11099941
0.00 inserts/s, 3.29 updates/s, 0.00 deletes/s, 80.13 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
1 row in set (0.26 sec)
ERROR:
No query specified
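The long transaction list in the monitor output is easier to read as a tally by state. In this snapshot all but three sessions are "not started", two are "COMMITTED IN MEMORY" (threads 965 and 40, both in `commit`), and one is "ACTIVE" (thread 1438). Together with "Pending flushes (fsync) log: 1" and the main thread state "flushing log", this may indicate that commits were waiting on slow log writes, but that is an interpretation rather than something the output proves. A minimal sketch, assuming the older two-number `---TRANSACTION 0 <id>, <state>` header format shown in this report; the function name `tally_transaction_states` is hypothetical.

```python
import re
from collections import Counter

# Matches headers such as "---TRANSACTION 0 3618565, ACTIVE 5 sec, process no ..."
# and captures the state without the trailing duration.
HEADER = re.compile(r"---TRANSACTION \d+ \d+, ([A-Za-z ]+?)(?: \d+ sec)?,")


def tally_transaction_states(innodb_status):
    """Count InnoDB transactions per state from SHOW ENGINE INNODB STATUS text."""
    states = Counter()
    for line in innodb_status.splitlines():
        match = HEADER.match(line)
        if match:
            states[match.group(1)] += 1
    return states


# One header of each kind, copied from the monitor output in this report.
SAMPLE = """
---TRANSACTION 0 3618620, not started, process no 2357, OS thread id
---TRANSACTION 0 3618746, COMMITTED IN MEMORY, process no 2357, OS thread id
---TRANSACTION 0 3618565, ACTIVE 5 sec, process no 2357, OS thread id
"""

states = tally_transaction_states(SAMPLE)
print(dict(states))
```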