On Wed, Oct 02, 2013 at 03:59:58PM +1000, David Gibson wrote:
> On Tue, 01 Oct 2013 20:25:21 +0100
> Lee Yarwood <lyarw...@redhat.com> wrote:
> 
> > On 10/01/2013 06:35 PM, Dan Kenigsberg wrote:
> > > On Tue, Oct 01, 2013 at 02:33:00PM +0100, Lee Yarwood wrote:
> > >> On 10/01/2013 09:00 AM, Dan Kenigsberg wrote:
> > >>> It is preferred to post patches to gerrit.ovirt.org.
> > >>
> > >> Apologies for jumping in, David, but I've pushed this here for now:
> > >>
> > >> http://gerrit.ovirt.org/19741
> > > 
> > > Thanks!
> 
> Thanks, I'm new to ovirt development and gerrit, so I'm going to need
> to work that out.
> 
> > >>> On Tue, Oct 01, 2013 at 01:18:25PM +1000, David Gibson wrote:
> > >>>> At present, if the supervdsm server dies with an exception inside
> > >>>> Python's multiprocessing module, then it will not usually produce any
> > >>>> useful debugging output.
> > >>>
> > >>> For our context - when do you notice such supervdsm deaths?
> > >>> Is it frequent? What is the cause?
> > >>
> > >> BZ#1011661 & BZ#1010030 downstream.
> > > 
> > > Ok, I can see them, dig into them and find an answer to my question. But
> > > it's not fair to the wider community of users and partners to cite
> > > private bugs.
> > > 
> > > https://www.berrange.com/posts/2012/06/27/thoughts-on-improving-openstack-git-commit-practicehistory/
> > 
> > Apologies Dan,
> > 
> > I believe David was referring to the public BZ#1011661. I believe that
> > has been attributed to the following change merged upstream in May :
> > 
> > http://gerrit.ovirt.org/#/c/14998
> 
> Uh, the problem's not attributed to that; rather, that patch fixes it.
> The problem was that the process ctimes were changing, leading vdsm to
> erroneously think that supervdsm had died and to restart it.  That led
> to several complications, including the supervdsm servers failing
> silently due to lack of logging from multiprocessing.

As much as I hate restarting the supervdsmd service, I'm so glad that these
issues are solved in ovirt-3.3.

> 
> We don't yet know why the ctimes were changing in this particular
> customer environment.

A long shot: maybe a system date change managed to confuse vdsm?

Dan.
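
For readers following the "lack of logging from multiprocessing" point
above: Python's multiprocessing module keeps its own logger, which emits
nothing until a handler is attached. The following is a minimal sketch of
turning that output on, not the patch under review; the helper name
enable_multiprocessing_logging and the format string are assumptions made
for illustration only.

```python
import logging
import multiprocessing


def enable_multiprocessing_logging(stream=None, level=logging.DEBUG):
    """Attach a handler to the multiprocessing module's own logger.

    Hypothetical helper for illustration; it is not part of vdsm.
    With stream=None the handler writes to sys.stderr.
    """
    logger = multiprocessing.get_logger()  # the logger named "multiprocessing"
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("[%(levelname)s/%(processName)s] %(message)s")
    )
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    enable_multiprocessing_logging()
    multiprocessing.get_logger().info("multiprocessing logging is enabled")
```

The standard library also offers multiprocessing.log_to_stderr(), which
attaches a stderr handler in one call; the explicit version above only
makes the destination stream configurable.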
_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel