Hi,

On Thu, Aug 12, 2010 at 12:09:40PM -0700, David Lang wrote:
> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
> 
> > On Wed, Aug 11, 2010 at 05:22:56PM -0700, David Lang wrote:
> >> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
> >>
> >>> On Wed, Aug 11, 2010 at 03:59:34PM -0700, David Lang wrote:
> >>>> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
> >>>>
> >>>>> On Wed, Aug 11, 2010 at 02:44:36PM -0700, David Lang wrote:
> >>
> >>>>>> I've been watching things get more and more complicated over time, and I
> >>>>>> recognise that to solve complex problems you sometimes need that complexity, but
> >>>>>> there are a LOT of problems that aren't that complex. Heartbeat has been making
> >>>>>> it harder and harder to do simple things, and with the difficulty in figuring
> >>>>>> out what version 3.0.2 is doing that Igor is experiencing, and the inability to
> >>>>>> take a simple config and convert it to the new format, it is sounding like it
> >>>>>> may be time to fork.
> >>>>>
> >>>>> I completely agree that increased complexity is a problem,
> >>>>> particularly in HA solutions. And it is possible to create very
> >>>>> complex configurations with Pacemaker, and at the same time make
> >>>>> it hard (or impossible) for humans to understand what the
> >>>>> cluster does.
> >>>>
> >>>> and sometimes such complexity is needed, but sometimes it's not.
> >>>
> >>> I'd say that running something one can't understand is at least
> >>> unmaintainable.
> >>
> >> but if all I'm doing is the simple stuff, I don't need to understand all the
> >> complex stuff, I just need to learn the part that I'm using.
> >
> > Well, you said it. I'm not sure what exactly "complex stuff"
> > refers to.
> 
> more than two machines, active-active to start with.
> 
> the simple haresources config (when you start, have box X default to running the
> following resources) covers a LOT of ground, especially if one of those
> resources can be control of a shared drive (either physically shared or logical
> via drbd)

If you show me a haresources file, I can give you a comparable v2
configuration. Perhaps then you can better judge whether it looks too
complex.

> >>>> the fact that we are on day 2 or 3 of Igor's problem and can't even figure out
> >>>> what's happening because the logs aren't showing anything is a very bad sign.
> >>>
> >>> Those logs have always been the same.
> >>
> >> Could you please take a look at what Igor has been posting and see if you can
> >> figure out why the logs stop within a minute or so of heartbeat starting (before
> >> it starts/stops any resources) and doesn't log _anything_ for a long time (at
> >> least 40 min)
> >>
> >> the logs are not showing stuff that I (and others who have responded) are used
> >> to seeing in the 2.x versions that we have deployed, so I assumed that this was
> >> due to logging changes (I have never used logd, so I didn't know what changes it
> >> had for example)
> >
> > Unfortunately, I forgot almost everything about v1 and can't
> > provide any useful input. Don't know what kind of logging is
> > missing.
> 
> he's running 3.0.x
> 
> he has one sample in e-mail where he started heartbeat manually and it did
> 
> on the box that auto-failback pointed to
> 
> initialization
> stop all services
> notice that it needed to be active
> start all services
> received an external kill signal
> stop all services
> exit
> 
> on the other box
> initialization
> stop all services
> received an external kill signal
> stop all services
> exit
> 
> 
> what he's getting normally is
> 
> initialization
> 
> with nothing else unless one of the boxes shuts down (at which point the other
> takes over, but he hasn't posted logs from that scenario)
> 
> so what _should_ be happening after the first few seconds of startup? when 
> initdead expires something _should_ happen, but we don't see anything in the 
> logs.

That certainly sounds odd, but I really can't offer any advice.
It's been ages since I last looked at v1.

Thanks,

Dejan

> David Lang
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems