Re: [ovirt-users] vdsm using 100% CPU, rapidly filling logs with _handle_event messages

2015-11-27 Thread Robert Story
On Wed, 18 Nov 2015 07:28:23 -0500 Robert wrote:
RS> On Wed, 18 Nov 2015 12:35:17 +0100 Vinzenz wrote:
RS> VF> On 11/12/2015 03:16 PM, Robert Story wrote:
RS> VF> > On Thu, 12 Nov 2015 16:02:59 +0200 Dan wrote:
RS> VF> > DK> On Thu, Nov 12, 2015 at 08:45:43AM -0500, Robert Story wrote:
RS> VF> > DK> > I'm running oVirt 3.5.x with a hosted engine. This morning I
RS> VF> > DK> > noticed that 2 of my 5 hosts were showing 99-100% cpu usage.
RS> VF> > DK> > Logging in to them, vdsmd seemed to be the culprit, and it
RS> VF> > DK> > was filling the log file with these messages:
RS> VF> > DK>
RS> VF> > DK> You're probably seeing
RS> VF> > DK> Bug 1226911 - vmchannel thread consumes 100% of CPU
RS> VF> > DK>
RS> VF> > DK> which was closed due to missing information. Do you have any
RS> VF> > DK> information on when this pops up? Is it reproducible? Would
RS> VF> > DK> you be able to test a suggested patch
RS> VF> > DK>
RS> VF> > DK> https://gerrit.ovirt.org/#/c/42570/
RS> VF> >
RS> VF> > Hi Dan,
RS> VF> >
RS> VF> > Thanks for the pointers. If it comes up again, I'll try this
RS> VF> > patch and report back on the bug...
RS> VF> >
RS> VF> Out of curiosity, did you happen to see that again?
RS> 
RS> No, I have not.

So naturally it shows up again during a holiday. I came in today to find 1
of my 5 nodes (the SPM and host where hosted engine is running) with two
vdsmd threads chewing up 90-100% of the CPU. I applied the patch from above
and restarted vdsmd. This resulted in another node being selected as the
SPM, and within about 15 minutes, that node had the same issue. Applied the
patch to the new node, and restarted vdsmd, and the SPM went back to the
previous (now patched) node. Hopefully things will stay stable.

I've attached snippets of the logs from the SPM when the problem started,
along with the server/engine log snippets from the hosted engine around the
same time.
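
For anyone triaging a similar runaway, a quick tally of which descriptors
appear in the _handle_event lines can show whether one or two channels
account for the flood. Below is a minimal sketch in Python. The regular
expression is based only on the INFO lines quoted elsewhere in this thread,
and count_events_by_fileno is a hypothetical helper name, not part of vdsm.

```python
import re
from collections import Counter

# Matches the INFO form seen in this thread, e.g.
# "...::(_handle_event) Received 0011 on fileno 119".
# The DEBUG form ("Received 0011. On fd removed by epoll.") carries no
# fileno and is deliberately not matched.
FILENO_RE = re.compile(r"\(_handle_event\) Received \w+ on\s+fileno (\d+)")


def count_events_by_fileno(lines):
    """Count _handle_event INFO lines per file descriptor number."""
    counts = Counter()
    for line in lines:
        match = FILENO_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample entries copied from the thread, each joined onto a single line.
SAMPLE_LINES = [
    "VM Channels Listener::INFO::2015-11-12 08:09:26,293::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 119",
    "VM Channels Listener::DEBUG::2015-11-12 08:09:26,293::vmchannels::59::vds::(_handle_event) Received 0011. On fd removed by epoll.",
    "VM Channels Listener::INFO::2015-11-12 08:09:26,293::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 75",
    "VM Channels Listener::INFO::2015-11-12 08:09:26,294::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 119",
]

print(count_events_by_fileno(SAMPLE_LINES).most_common())
# prints [(119, 2), (75, 1)]
```

In practice the lines would be read from the vdsm log file itself, one entry
per line, rather than from an inline list.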




Robert

-- 
Senior Software Engineer @ Parsons


vdsm-runaway.tgz
Description: application/compressed-tar


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] vdsm using 100% CPU, rapidly filling logs with _handle_event messages

2015-11-18 Thread Robert Story
On Wed, 18 Nov 2015 12:35:17 +0100 Vinzenz wrote:
VF> On 11/12/2015 03:16 PM, Robert Story wrote:
VF> > On Thu, 12 Nov 2015 16:02:59 +0200 Dan wrote:
VF> > DK> On Thu, Nov 12, 2015 at 08:45:43AM -0500, Robert Story wrote:
VF> > DK> > I'm running oVirt 3.5.x with a hosted engine. This morning I
VF> > DK> > noticed that 2 of my 5 hosts were showing 99-100% cpu usage.
VF> > DK> > Logging in to them, vdsmd seemed to be the culprit, and it was
VF> > DK> > filling the log file with these messages:
VF> > DK>
VF> > DK> You're probably seeing
VF> > DK> Bug 1226911 - vmchannel thread consumes 100% of CPU
VF> > DK>
VF> > DK> which was closed due to missing information. Do you have any
VF> > DK> information on when this pops up? Is it reproducible? Would you
VF> > DK> be able to test a suggested patch
VF> > DK>
VF> > DK> https://gerrit.ovirt.org/#/c/42570/
VF> >
VF> > Hi Dan,
VF> >
VF> > Thanks for the pointers. If it comes up again, I'll try this patch and
VF> > report back on the bug...
VF> >
VF> Out of curiosity, did you happen to see that again?

No, I have not.

Robert

-- 
Senior Software Engineer @ Parsons




Re: [ovirt-users] vdsm using 100% CPU, rapidly filling logs with _handle_event messages

2015-11-12 Thread Robert Story
On Thu, 12 Nov 2015 16:02:59 +0200 Dan wrote:
DK> On Thu, Nov 12, 2015 at 08:45:43AM -0500, Robert Story wrote:
DK> > I'm running oVirt 3.5.x with a hosted engine. This morning I noticed
DK> > that 2 of my 5 hosts were showing 99-100% cpu usage. Logging in to
DK> > them, vdsmd seemed to be the culprit, and it was filling the log file
DK> > with these messages:
DK> 
DK> You're probably seeing
DK> Bug 1226911 - vmchannel thread consumes 100% of CPU
DK> 
DK> which was closed due to missing information. Do you have any information
DK> on when this pops up? Is it reproducible? Would you be able to test a
DK> suggested patch
DK> 
DK> https://gerrit.ovirt.org/#/c/42570/

Hi Dan,

Thanks for the pointers. If it comes up again, I'll try this patch and
report back on the bug...

Robert

-- 
Senior Software Engineer @ Parsons




Re: [ovirt-users] vdsm using 100% CPU, rapidly filling logs with _handle_event messages

2015-11-12 Thread Dan Kenigsberg
On Thu, Nov 12, 2015 at 08:45:43AM -0500, Robert Story wrote:
> I'm running oVirt 3.5.x with a hosted engine. This morning I noticed that 2
> of my 5 hosts were showing 99-100% cpu usage. Logging in to them, vdsmd
> seemed to be the culprit, and it was filling the log file with these
> messages:

You're probably seeing

Bug 1226911 - vmchannel thread consumes 100% of CPU

which was closed due to missing information. Do you have any information
on when this pops up? Is it reproducible? Would you be able to test a
suggested patch

https://gerrit.ovirt.org/#/c/42570/

Regards,
Dan.

> 
> VM Channels Listener::DEBUG::2015-11-12 08:09:26,292::vmchannels::59::vds::(_handle_event) Received 0011. On fd removed by epoll.
> VM Channels Listener::INFO::2015-11-12 08:09:26,293::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 119
> VM Channels Listener::DEBUG::2015-11-12 08:09:26,293::vmchannels::59::vds::(_handle_event) Received 0011. On fd removed by epoll.
> VM Channels Listener::INFO::2015-11-12 08:09:26,293::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 75
> VM Channels Listener::DEBUG::2015-11-12 08:09:26,293::vmchannels::59::vds::(_handle_event) Received 0011. On fd removed by epoll.
> VM Channels Listener::INFO::2015-11-12 08:09:26,294::vmchannels::54::vds::(_handle_event) Received 0011 on fileno 119
> VM Channels Listener::DEBUG::2015-11-12 08:09:26,294::vmchannels::59::vds::(_handle_event) Received 0011. On fd removed by epoll.
> 
> I googled to see how to change the debug level to turn off DEBUG messages
> for vdsm, which referred me to libvirtd.conf, but the debug level there was
> not set, which should have meant a log level of 3 (warnings and errors), so
> I'm not sure why the log was filling up with DEBUG/INFO messages.
> 
> I restarted vdsmd, which resulted in those nodes being marked as
> 'disconnected', but they did eventually recover and loads went back to
> normal.
> 
> This may or may not be related to the fact that the 3 hosts where this did
> not happen can't seem to keep their ha brokers up. I'll be starting a new
> thread on that shortly.
> 
> 
> Robert
> 
> -- 
> Senior Software Engineer @ Parsons
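
A note on reading the "Received 0011" lines quoted above. Assuming the value
is the epoll event mask printed in hexadecimal (the thread does not say so
explicitly), 0x11 would decode to EPOLLIN | EPOLLHUP, meaning the peer end of
the channel hung up. A level-triggered epoll keeps reporting a hang-up until
the descriptor is unregistered or closed, which would be consistent with the
tight loop seen here. The sketch below hard-codes the flag values from the
Linux <sys/epoll.h> header so it runs on any platform; decode_epoll_mask is a
hypothetical helper name.

```python
# Linux epoll event bits, hard-coded from <sys/epoll.h>.
EPOLL_FLAGS = {
    0x001: "EPOLLIN",
    0x002: "EPOLLPRI",
    0x004: "EPOLLOUT",
    0x008: "EPOLLERR",
    0x010: "EPOLLHUP",
}


def decode_epoll_mask(hex_text):
    """Return the names of the flags set in a hex mask string such as '0011'."""
    mask = int(hex_text, 16)
    return [name for bit, name in sorted(EPOLL_FLAGS.items()) if mask & bit]


print(decode_epoll_mask("0011"))
# prints ['EPOLLIN', 'EPOLLHUP']
```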


