On 03/22/2013 09:58 PM, Jonathan Horne wrote:
Is it OK to restart the engine at any time, or should I be prepared for a
maintenance window?


The engine can be restarted at any time, assuming your users don't need it (via the user portal) during the few seconds it will be down.
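
On EL6 with the 3.1 packages that should be something like the following (a sketch, assuming the standard init script name installed by the packages):

   # service ovirt-engine restart  # assumes the default "ovirt-engine" init script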

This manager has 12 hosts and about 75 VMs. We are running 3.1, dreyou's
EL6 packages.


Version 3.1 had the problem with the memory limit. To fix it, open the /usr/share/ovirt-engine/service/engine-service.py file, go to line 203 and replace -Xms with -Xmx; the resulting lines 202 and 203 should be the following:

        "-Xms%s" % engineHeapMin,
        "-Xmx%s" % engineHeapMax,

Then restart the engine; it should never consume more than 1 GiB of heap, which means a maximum of approximately 2 GiB of virtual space and a much smaller resident set size.
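
To verify after the restart, you can check the flags and size of the java process, for example (assuming the engine runs as the ovirt user, as in your top output):

   # ps -u ovirt -o rss,vsz,args | grep java

The command line should show an -Xmx value matching engineHeapMax (1 GiB by default), and the resident set size should stay well below what you saw before.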

Let us know if this makes it faster.

[jhorne@d0lppc021 ~]$ rpm -qa|grep ovirt
ovirt-engine-restapi-3.1.0-3.19.el6.noarch
ovirt-engine-sdk-3.1.0.5-1.el6.noarch
ovirt-engine-backend-3.1.0-3.19.el6.noarch
ovirt-engine-tools-common-3.1.0-3.19.el6.noarch
ovirt-log-collector-3.1.0-16.el6.noarch
ovirt-image-uploader-3.1.0-16.el6.noarch
ovirt-engine-setup-3.1.0-3.19.el6.noarch
ovirt-engine-config-3.1.0-3.19.el6.noarch
ovirt-iso-uploader-3.1.0-16.el6.noarch
ovirt-engine-webadmin-portal-3.1.0-3.19.el6.noarch
ovirt-engine-genericapi-3.1.0-3.19.el6.noarch
ovirt-engine-3.1.0-3.19.el6.noarch
ovirt-engine-cli-3.1.0.7-1.el6.noarch
ovirt-engine-userportal-3.1.0-3.19.el6.noarch
ovirt-engine-notification-service-3.1.0-3.19.el6.noarch
ovirt-engine-jbossas711-1-0.x86_64
ovirt-engine-dbscripts-3.1.0-3.19.el6.noarch

thanks,
jonathan





On 3/22/13 10:05 AM, "Juan Hernandez" <[email protected]> wrote:

On 03/22/2013 02:54 PM, Jonathan Horne wrote:
top - 08:53:38 up 70 days, 16:31,  1 user,  load average: 0.40, 0.34, 0.32
Tasks: 432 total,   1 running, 431 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.3%us,  0.1%sy,  0.0%ni, 98.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32876240k total, 18653508k used, 14222732k free,   522432k buffers
Swap:  2097144k total,     4528k used,  2092616k free,  6270908k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   2121 ovirt     20   0 12.9g 7.7g  18m S  9.0 24.6  16539:08 java


This is not normal at all. The first thing that is strange is that your
engine is taking 7.7 GiB of RAM, which it should never take, as it is by
default limited to 1 GiB. Did you assign more memory to the engine on
purpose? How much? If you assign a lot of memory it can start to consume
a lot of CPU just for garbage collection. You may want to enable verbose
garbage collection by adding this to /etc/sysconfig/ovirt-engine (or
/etc/ovirt-engine/engine.conf if you are using the latest source code):

   ENGINE_VERBOSE_GC=true

Then restart the engine and it will start to dump garbage collection
statistics to /var/log/ovirt-engine/console.log. The garbage collection
should be quite silent in a low-activity system.
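
For example, you can watch that output while the system is otherwise idle:

   # tail -f /var/log/ovirt-engine/console.log

If GC lines keep scrolling by with nothing going on, the engine is probably spending its CPU time in garbage collection.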

We used to have a bug that caused the maximum amount of memory not to be
correctly limited, but it was fixed long ago:

   http://gerrit.ovirt.org/7952

The other thing that seems strange is the amount of CPU that it is
consuming. Do you have many hosts managed by that engine? In an
otherwise idle environment the CPU consumption is caused by the periodic
polls of the hosts, one every two seconds by default. If you continually
see the engine using a significant amount of CPU (in your top output
above it is 9%), it could be useful to get a snapshot of the stacks of
the threads, to see which threads in particular are consuming the CPU.
Send the QUIT signal to the engine process and it will dump the stacks
of the threads to /var/log/ovirt-engine/console.log:

   # kill -3 $(cat /var/run/ovirt-engine.pid)
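
If that pid file is not there for some reason, something like this should work as well (an assumption: the engine is the only java process running as the ovirt user):

   # kill -3 $(pgrep -u ovirt java)  # assumes a single java process owned by ovirt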

Once you have that dump you can check which thread is consuming the CPU
as follows:

1. Get the PIDs of the threads of the engine together with their use of
CPU:

   # ps -L -u ovirt -o tid,pcpu

2. If you see one of them consuming a high amount of CPU time, then try
to find it in the stack dump generated in
/var/log/ovirt-engine/console.log. Let's assume that the PID is 13397,
for example; translate it to hex:

   # printf "%04x\n" 13397
   3455

3. Then look in /var/log/ovirt-engine/console.log for a line containing
"nid=0x3455". There you will find the stack trace of that thread,
something like this:

   "ajp-/127.0.0.1:8702-Acceptor-0" daemon prio=10
tid=0x00007f41e0220800 nid=0x3493 runnable [0x00007f41dbdf2000]
    java.lang.Thread.State: RUNNABLE
         ...
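
A couple of convenience one-liners for the steps above (sketches, same assumptions as before): to spot the busiest threads in step 1 you can sort that output, and steps 2 and 3 can be combined into a single grep:

   # ps -L -u ovirt -o tid,pcpu | sort -k2 -n | tail  # highest %CPU threads last
   # grep "nid=0x$(printf '%x' 13397)" /var/log/ovirt-engine/console.log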

Most threads will be waiting, but if you find one thread that is
consistently RUNNABLE then there is probably an issue. The dump of the
stack of that thread can help to find out what it is doing and why it is
consuming the CPU.


I don't have a lot of experience with JBoss, so I'm not sure if that's
good or bad. I did the JBoss restart, and that helped a little, but it's
a little sluggish again now, a few days later.

Thanks,

-----Original Message-----
From: Itamar Heim [mailto:[email protected]]
Sent: Friday, March 15, 2013 6:32 AM
To: Jonathan Horne
Cc: [email protected]
Subject: Re: [Users] management server very slow lately

On 03/13/2013 08:51 PM, Jonathan Horne wrote:
Hello, lately my management server's web interface is extremely sluggish.
Perhaps the server is ready for a reboot?

My management server is also the host of my NFS export and ISO mounts.
Is there a prescribed method for rebooting when I am also providing
NFS services from the management server? My assumption is that, aside
from NFS, I should be able to reboot the management server and the
nodes and virtual machines will be fine in the meantime?

What's the CPU consumption of your ovirt-engine service (the java process)?
What's the CPU load on the engine? The memory/swap state of the engine, etc.?
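
For example, something along these lines would show that (a sketch, assuming the default pid file location used by the engine service script):

   # top -b -n 1 -p $(cat /var/run/ovirt-engine.pid)  # CPU and memory of the engine process
   # free -m                                          # overall memory/swap state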










