Hi,

On Thu, Aug 12, 2010 at 12:09:40PM -0700, David Lang wrote:
> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
>
> > On Wed, Aug 11, 2010 at 05:22:56PM -0700, David Lang wrote:
> >> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
> >>
> >>> On Wed, Aug 11, 2010 at 03:59:34PM -0700, David Lang wrote:
> >>>> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
> >>>>
> >>>>> On Wed, Aug 11, 2010 at 02:44:36PM -0700, David Lang wrote:
> >>
> >>>>>> I've been watching things get more and more complicated over time, and I
> >>>>>> recognise that to solve complex problems you sometimes need that complexity, but
> >>>>>> there are a LOT of problems that aren't that complex. Heartbeat has been making
> >>>>>> it harder and harder to do simple things, and with the difficulty in figuring
> >>>>>> out what version 3.0.2 is doing that Igor is experiencing, and the inability to
> >>>>>> take a simple config and convert it to the new format, it is sounding like it
> >>>>>> may be time to fork.
> >>>>>
> >>>>> I completely agree that increased complexity is a problem and
> >>>>> particularly in HA solutions. And it is possible to create very
> >>>>> complex configurations with Pacemaker, and at the same time make
> >>>>> it hard (or impossible) for humans to understand what the
> >>>>> cluster does.
> >>>>
> >>>> and sometimes such complexity is needed, but sometimes it's not.
> >>>
> >>> I'd say that running something one can't understand is at least
> >>> unmaintainable.
> >>
> >> but if all I'm doing is the simple stuff, I don't need to understand all the
> >> complex stuff, I just need to learn the part that I'm using.
> >
> > Well, you said it. I'm not sure what "complex stuff" exactly
> > refers to.
>
> more than two machines, active-active to start with.
>
> the simple haresources config (when you start have box X default to running the
> following resources) covers a LOT of ground, especially if one of those
> resources can be control of a shared drive (either physically shared or logical
> via drbd)

If you show me a haresources I can give you a comparable v2
configuration. Perhaps then you can judge better if it looks too
complex.

> >>>> the fact that we are on day 2 or 3 of Igor's problem and can't even figure out
> >>>> what's happening because the logs aren't showing anything is a very bad sign.
> >>>
> >>> Those logs have always been the same.
> >>
> >> Could you please take a look at what Igor has been posting and see if you can
> >> figure out why the logs stop within a minute or so of heartbeat starting (before
> >> it starts/stops any resources) and doesn't log _anything_ for a long time (at
> >> least 40 min)
> >>
> >> the logs are not showing stuff that I (and others who have responded) are used
> >> to seeing in the 2.x versions that we have deployed, so I assumed that this was
> >> due to logging changes (I have never used logd, so I didn't know what changes it
> >> had for example)
> >
> > Unfortunately, I forgot almost everything about v1 and can't
> > provide any useful input. Don't know what kind of logging is
> > missing.
>
> he's running 3.0.x
>
> he has one sample in e-mail where he started heartbeat manually and it did
>
> on the box that auto-failback pointed to
>
> initialization
> stop all services
> notice that it needed to be active
> start all services
> received an external kill signal
> stop all services
> exit
>
> on the other box
> initialization
> stop all services
> received an external kill signal
> stop all services
> exit
>
>
> what he's getting normally is
>
> initialization
>
> with nothing else unless one of the boxes shuts down (at which point the other
> takes over, but he hasn't posted logs from that scenario)
>
> so what _should_ be happening after the first few seconds of startup? when
> initdead expires something _should_ happen, but we don't see anything in the
> logs.

That certainly sounds odd, but I really can't offer any advice.
It's been ages since I last looked at v1.

Thanks,

Dejan

> David Lang
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
