Re: [Linux-HA] New release ahead ?

2008-07-11 Thread Lars Marowsky-Bree
On 2008-07-09T12:44:16, Andrew Beekhof [EMAIL PROTECTED] wrote: Assuming it comes out, 2.1.4 won't include all the fixes/enhancements from Pacemaker 0.6. But 2.1.4 should still have all relevant bugfixes to Pacemaker, though indeed not all of the enhancements. Regards, Lars

Re: [Linux-HA] Dependency graph

2008-07-09 Thread Lars Marowsky-Bree
On 2008-07-08T13:11:53, Ciro Iriarte [EMAIL PROTECTED] wrote: ptest -D Seems you thought about everything :D Andrew did ;-) I got a blank graph: asusis-xen1:~ # ptest -L -D sprod asusis-xen1:~ # cat sprod digraph g { } asusis-xen1:~ # rpm -q heartbeat heartbeat-2.1.3-22.1 That
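A blank result like the one quoted above is easy to miss when the graph is generated from a script. The following is a minimal sketch, not part of ptest or Heartbeat, that checks whether a Graphviz file holds only the empty `digraph g { }` skeleton. The function name `is_empty_graph` and the temporary demo file are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a Graphviz file has no nodes or edges,
# i.e. it contains nothing but "digraph g { }" plus whitespace.
is_empty_graph() {
    # Strip all whitespace, then compare against the bare skeleton.
    body=$(tr -d '[:space:]' < "$1")
    [ "$body" = "digraphg{}" ]
}

# Demo with a temporary file standing in for real ptest output.
demo=$(mktemp)
printf 'digraph g {\n}\n' > "$demo"
if is_empty_graph "$demo"; then
    echo "empty graph"
else
    echo "graph has content"
fi
rm -f "$demo"
```

If the file does have content, it could then be rendered with Graphviz, for example `dot -Tpng sprod -o sprod.png`.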

Re: [Linux-HA] Dependency on dm_multipath

2008-07-09 Thread Lars Marowsky-Bree
On 2008-07-08T18:49:53, Ciro Iriarte [EMAIL PROTECTED] wrote: For group2 I need to make sure that the LUNs are available, the issue is how do I do that? Well, two answers - You could indeed write a pseudo-RA which monitors the DM-MPIO devs, or feed their status directly into the CIB from

Re: [Linux-HA] Dependency graph

2008-07-07 Thread Lars Marowsky-Bree
On 2008-07-07T01:00:39, Ciro Iriarte [EMAIL PROTECTED] wrote: I know that graphviz is used, but this utility harvests the info, something that graphviz can't do on its own.. I was just asking if someone already worked on a utility that talks to the crm to make this kind of graph. ptest -D

Re: [Linux-HA] stonith_action poweroff

2008-07-02 Thread Lars Marowsky-Bree
On 2008-07-01T23:28:33, Hannes Dorbath [EMAIL PROTECTED] wrote: Is there a way with legacy V1 config to power off the other node permanently on STONITH calls (no reboot)? Short of modifying the code, no.

Re: [Linux-HA] behavior of lrmd/crmd when lrmd process is killed

2008-06-30 Thread Lars Marowsky-Bree
On 2008-06-30T18:48:42, Junko IKEDA [EMAIL PROTECTED] wrote: Mori-san fixed this :) See attached. It seems that the process spawned by Heartbeat keeps holding the crmd-lrmd channel. Thanks for the patch, merged! Regards, Lars -- Teamlead Kernel, SuSE Labs, Research and Development

[Linux-HA] Re: Lost log messages with a centralized syslog-ng setup

2008-06-30 Thread Lars Marowsky-Bree
On 2008-06-27T15:47:01, Andrew Beekhof [EMAIL PROTECTED] wrote: A non-trivial amount of lost messages :-( Now keep in mind that CTS keeps the cluster in a constant state of upheaval and that this was an 8-node cluster. I go one step beyond what you recommend and keep the logfile on the CTS

Re: [Linux-ha-dev] ha_log() as defined in /etc/ha.d/shellfuncs

2008-06-26 Thread Lars Marowsky-Bree
On 2008-06-25T11:30:54, Joe Bill [EMAIL PROTECTED] wrote: Ok. What I failed to realize was that any cross-network dialog between ha_logger and syslog via ha_logd was anyways asynchronous, and whatever dialog between ha_logger and ha_logd is local inter-process (independent from the network),

Re: [Linux-ha-dev] ha_log() as defined in /etc/ha.d/shellfuncs

2008-06-25 Thread Lars Marowsky-Bree
On 2008-06-24T04:12:48, Joe Bill [EMAIL PROTECTED] wrote: ha_logger -t ${HA_LOGTAG} $@ if [ $? -eq 0 ] ; then return 0!!! if successful fi where, if ha_logger fails then ha_log() attempts to write directly to a logfile and to a debugfile. If ha_logger succeeds, ha_log()
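The behaviour described in this message (try the logging helper first, and write to a file only if that fails) can be sketched as follows. This is an illustration under stated assumptions, not the actual shellfuncs code: `log_with_fallback`, `LOGGER_CMD` and `FALLBACK_LOG` are hypothetical names, and the sketch writes to a single file rather than a separate logfile and debugfile.

```shell
#!/bin/bash
# Sketch only: not the real ha_log() from shellfuncs.
# LOGGER_CMD and FALLBACK_LOG are hypothetical names for illustration.
LOGGER_CMD=${LOGGER_CMD:-ha_logger}
FALLBACK_LOG=${FALLBACK_LOG:-/tmp/ha-fallback.log}

log_with_fallback() {
    # First choice: hand the message to the logging helper.
    if "$LOGGER_CMD" -t "${HA_LOGTAG:-demo}" "$@" 2>/dev/null; then
        return 0
    fi
    # Helper missing or failed: append a timestamped line to a plain file.
    echo "$(date '+%Y/%m/%d_%T') $*" >> "$FALLBACK_LOG"
}

log_with_fallback "resource started"
```

The early `return 0` mirrors the snippet quoted in the message: once the helper reports success, nothing is written to the file.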

[Linux-ha-dev] Re: [Linux-ha-announce] Linux-HA Leadership Announcement

2008-06-25 Thread Lars Marowsky-Bree
On 2008-06-24T07:26:47, Alan Robertson [EMAIL PROTECTED] wrote: Dear Alan, thanks for the time and effort you have put into Linux-HA in the past. Without you, it would not be what it is today. Mori-san, Dave, and myself will discuss and announce how to proceed soon, I hope. We need to have some

Re: [Linux-HA] Heartbeat o2cb OCF resource script fails to start the resources

2008-06-25 Thread Lars Marowsky-Bree
On 2008-06-22T20:00:00, Ivan [EMAIL PROTECTED] wrote: ... that it was in the official guide. It is in there, that's where I picked it up from. /usr/share/doc/manual/sles-heartbeat_en/SLES-heartbeat_en.pdf (page 143) Ah, duh. That was auto-generated from all meta-data. Our bad.

[Linux-HA] Re: [Linux-ha-announce] Linux-HA Leadership Announcement

2008-06-25 Thread Lars Marowsky-Bree
On 2008-06-24T07:26:47, Alan Robertson [EMAIL PROTECTED] wrote: Dear Alan, thanks for the time and effort you have put into Linux-HA in the past. Without you, it would not be what it is today. Mori-san, Dave, and myself will discuss and announce how to proceed soon, I hope. We need to have some

Re: [Linux-HA] Heartbeat o2cb OCF resource script fails to start the resources

2008-06-21 Thread Lars Marowsky-Bree
On 2008-06-21T03:00:11, Ciro Iriarte [EMAIL PROTECTED] wrote: Link: http://developerbugs.linux-foundation.org/show_bug.cgi?id=1897 Probably the patch is not applied yet as there're no comments to the report I'm using heartbeat-resources-2.1.3-22.1 from the build service (also on

Re: [Linux-HA] Heartbeat o2cb OCF resource script fails to start the resources

2008-06-21 Thread Lars Marowsky-Bree
On 2008-06-22T11:14:15, Ivan [EMAIL PROTECTED] wrote: I don't think it's good enough for a product like SLES10SP2. At least I think that's what we pay for (not to get into a situation like this). It should have been marked or mentioned in the script's header that it's experimental only.

Re: [Linux-ha-dev] sfex

2008-06-20 Thread Lars Marowsky-Bree
On 2008-06-20T12:52:09, Xinwei Hu [EMAIL PROTECTED] wrote: sfex relies on timing, yes, but with such considerable safety margins Do we have any systematic method to analyze the safety margin already? If not, I'll not go with the considerable claim. It depends; but I would think that 60

Re: [Linux-ha-dev] What best name for the Heartbeat v2 log daemon ?

2008-06-20 Thread Lars Marowsky-Bree
On 2008-06-19T08:50:24, Joe Bill [EMAIL PROTECTED] wrote: In my mind, anybody who brings up Linux-HA or HA to talk about a cluster manager means Heartbeat. There is no confusion with Xen, or OSCAR, or Rocks. I disagree, actually. Linux-HA and heartbeat are two different things, and this

Re: [Linux-ha-dev] sfex

2008-06-19 Thread Lars Marowsky-Bree
On 2008-06-19T22:52:55, Xinwei Hu [EMAIL PROTECTED] wrote: True. It is possible to break sfex, but the probability that that is going to happen is extremely low and could be due only to a very pathological timing. One way to make this probability still From my previous experience, I

Re: [Linux-HA] Filesystem agent not OK when in a clone group

2008-06-10 Thread Lars Marowsky-Bree
On 2008-06-09T16:25:25, Florent DUTHEIL [EMAIL PROTECTED] wrote: One can see clearly in the logs (when it comes to mounting the first bind/ro resource): Filesystem[2688][2718]: 2008/06/09_16:10:08 INFO: Running start for /mnt/filer1/drivers on /var/ftp/labtech/labtech_drivers

Re: [Linux-ha-dev] Upgrade from 2.0.8 to 2.1.3 on SLES 10 SP1

2008-06-07 Thread Lars Marowsky-Bree
On 2008-06-07T13:14:06, Michael Bristow [EMAIL PROTECTED] wrote: SLES 10 SP1 two node cluster. I just updated Heartbeat from 2.0.8 to 2.1.3 on two SLES 10 SP1 servers. After the update, I am no longer able to log into hb_gui. I get the message: Can't connect to server 127.0.0.1. Did

Re: [Linux-HA] Ha and ocfs2

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-04T10:50:04, Karel Brenkman [EMAIL PROTECTED] wrote: Hi, In my 5 node ha2 cluster I'm using an ocfs2 volume. Everything goes well but only 4 nodes actually mount the ocfs2 volume. It turned out that I did not set the correct node slot count in ocfs2console. So I thought (stupid

Re: [Linux-HA] Ha and ocfs2

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-04T13:11:27, Karel Brenkman [EMAIL PROTECTED] wrote: I KNOW that I must change the number of node slots, but how do you do that? I've tried via ocfs2console and via tunefs.ocfs2 -N 5 then I get the error as posted. When you say offline can you be more specific, stop the HA2

Re: Re: Re: [Linux-HA] HA 2.1.3--How to get rid of warning logs about Using Deprecated name... for cluster options

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-04T22:31:37, zhang july [EMAIL PROTECTED] wrote: Don't worry. I found out that the command should be /usr/lib64/heartbeat/pengine metadata. I can see no-quorum-policy in the returned results, which is supposed to mean it is supported by this version. But the logs still show warnings about

Re: Re: Re: Re: [Linux-HA] HA 2.1.3--How to get rid of warning logs about Using Deprecated name... for cluster options

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-05T01:37:13, zhang july [EMAIL PROTECTED] wrote: Lars, sorry if I didn't make my questions clear to you. The name I use in the configuration is the same one in the pengine metadata and the same in crm.dtd. But the log tells me that I use the deprecated names. I prefer not to use

Re: [Linux-HA] Announcement: New book about Linux-HA Version 2

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-04T11:05:38, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Some virtual environments are notoriously bad at keeping time (vmware for example). I think that sometimes ntp can't prevent clock from jigger. Clock drifting is not the same as the clock jumping backwards; virtualization

Re: [Linux-HA] Starting non-idle resources

2008-06-04 Thread Lars Marowsky-Bree
On 2008-06-03T16:14:15, Nuno Covas [EMAIL PROTECTED] wrote: Are you sure? My idea was that HB calls ResourceManager verifyallidle on init time, which just checks for non-idle resources and logs the result with CRITICAL tag, but HB doesn't stop the resources once started.. You're right, my

Re: [Linux-HA] Announcement: New book about Linux-HA Version 2

2008-05-30 Thread Lars Marowsky-Bree
On 2008-05-30T16:07:47, Michael Schwartzkopff [EMAIL PROTECTED] wrote: (translated from German) Hi Michael, great work, my compliments! I had a look at the sample chapter: - I would not subscribe to the statement that one does not need to pay much attention to quality when making the selection - only through

Re: [Linux-HA] heartbeat from SLES10 SP2

2008-05-30 Thread Lars Marowsky-Bree
On 2008-05-28T15:59:38, Andrew Beekhof [EMAIL PROTECTED] wrote: I'm going with crm. In fact didn't find a pacemaker package in the installation media (still an integrated bundle for SLES?) right - changing the packaging structure in a service pack wasn't considered acceptable for customers

Re: [Linux-HA] Announcement: New book about Linux-HA Version 2

2008-05-30 Thread Lars Marowsky-Bree
On 2008-05-30T20:07:24, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: Oops. I apologize. I _meant_ to send that to Michael personally, hence the German. That was not intended. Regards, Lars

Re: [Linux-ha-dev] [RFC] heartbeat-2.1.4

2008-04-16 Thread Lars Marowsky-Bree
On 2008-04-15T18:36:12, Junko IKEDA [EMAIL PROTECTED] wrote: Hi again, Another request; Would it be possible to include the following patch in release 2.1.4? http://hg.linux-ha.org/dev/rev/6307bb091d02 That was already in, yes. Thanks! Regards, Lars

Re: [Linux-ha-dev] Re: [RFC] heartbeat-2.1.4

2008-04-16 Thread Lars Marowsky-Bree
On 2008-04-16T20:34:42, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi Dejan, It's strange that hb_report fails to produce good backtraces. How did you get them from the command line? I used a hb_report -f 09:00 -u root /root/mast_slave_emg2 command-line. Hi Hideo, I think Dejan was

Re: [Linux-ha-dev] [RFC] heartbeat-2.1.4 --- Master resource's demote operation goes into an infinite loop

2008-04-16 Thread Lars Marowsky-Bree
On 2008-04-16T17:53:20, Junko IKEDA [EMAIL PROTECTED] wrote: This test relates to these issues. http://developerbugs.linux-foundation.org/show_bug.cgi?id=1822 http://hg.clusterlabs.org/pacemaker/dev/rev/7edc6bc1557b It seems that the fix is included in pacemaker/dev, not stable...

Re: [Linux-ha-dev] [RFC] heartbeat-2.1.4 --- build on RHEL5.1

2008-04-16 Thread Lars Marowsky-Bree
On 2008-04-16T20:57:08, Junko IKEDA [EMAIL PROTECTED] wrote: Sorry for the noise. The last attachment was wrong, so please check this one. Hi Junko, any ideas as to why the current code doesn't work for you? Does the patched version work to actually _build_ the plugins? For example, you add

Re: [Linux-ha-dev] [RFC] heartbeat-2.1.4

2008-04-16 Thread Lars Marowsky-Bree
On 2008-04-16T14:10:05, Dejan Muhamedagic [EMAIL PROTECTED] wrote: It will help with the problems reported in Bugzilla 1814, for all platforms, not only ppc. http://developerbugs.linux-foundation.org/show_bug.cgi?id=1814 I suppose that you mean the patch which was attached by Mori-san

Re: [Linux-ha-dev] RFC: Proposed project restructuring

2008-04-11 Thread Lars Marowsky-Bree
On 2008-04-11T13:55:04, Andrew Beekhof [EMAIL PROTECTED] wrote: As Lars mentioned yesterday, the reason for 2.1.4 is to allow time for some additional project changes (both managerial and technical). Part of these changes is to split the currently monolithic Heartbeat project into

Re: [Linux-ha-dev] RFC: Proposed project restructuring

2008-04-11 Thread Lars Marowsky-Bree
On 2008-04-11T16:05:35, Andrew Beekhof [EMAIL PROTECTED] wrote: * heartbeat-core (alternate name: cluster-core, lha-core) This project would contain all the pieces relevant to the operation of a single node. Conceptually, the project would include: - clplumbing (including

Re: [Linux-ha-dev] RFC: Proposed project restructuring

2008-04-11 Thread Lars Marowsky-Bree
On 2008-04-11T18:10:13, Dejan Muhamedagic [EMAIL PROTECTED] wrote: True. It's there only because of CRM. It still however depends on ha.cf and logd.cf. It could be useful with v1 too, but right now it probably depends on both. There'll have to be some code changes to support the new layout.

[Linux-ha-dev] [RFC] heartbeat-2.1.4

2008-04-10 Thread Lars Marowsky-Bree
Hi all, the Linux-HA project is undergoing some changes, as you've noticed. Not all of them have gone as well as expected, and it hasn't stabilized yet. With guidance from Alan, the project members have met and decided to change the governance of the project in the future. This will be

Re: [Linux-ha-dev] [PATCH] Process monitor daemon (revised)

2008-04-09 Thread Lars Marowsky-Bree
On 2008-04-08T06:56:25, Serge Dubrouski [EMAIL PROTECTED] wrote: - RAs should sign in with it for the processes they want monitored, instead of listing the processes in the procd configuration section (means it gets decoupled from the CIB further). The RAs could write a record to

Re: [Linux-ha-dev] [PATCH] Process monitor daemon (revised)

2008-04-09 Thread Lars Marowsky-Bree
On 2008-04-09T06:43:13, Serge Dubrouski [EMAIL PROTECTED] wrote: Whatever principles make sense for the specific RA - according to instance attributes specified, the current role etc. The RA really knows best. Right, but that doesn't mean that they know the user best. As I see it a user

Re: [Linux-HA] Initial dead time is smaller than deadtime

2008-04-09 Thread Lars Marowsky-Bree
On 2008-04-08T19:32:58, Bernd Schubert [EMAIL PROTECTED] wrote: Hello, I need to set a rather huge dead time of 1200s, but the initial dead time is supposed to be of 120s or less. However, heartbeat tries to be schoolmasterly and doesn't want to accept my settings: deadtime 1200 # time
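The complaint named in the subject line can be checked before Heartbeat is started. The following is a rough sketch under stated assumptions: it reads the `deadtime` and `initdead` directives from an ha.cf-style file and compares them, assuming both are plain integer seconds (unit suffixes are not handled). The helper names `get_directive` and `check_dead_times` are hypothetical and not part of Heartbeat.

```shell
#!/bin/bash
# Sketch: compare "deadtime" and "initdead" in an ha.cf-style file.
# Assumes plain integer seconds; unit suffixes such as "ms" are not handled.
get_directive() {
    # Print the first value of directive $1 found in file $2.
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

check_dead_times() {
    dead=$(get_directive deadtime "$1")
    init=$(get_directive initdead "$1")
    if [ -z "$dead" ] || [ -z "$init" ]; then
        echo "missing directive"
        return 2
    fi
    if [ "$init" -lt "$dead" ]; then
        echo "initdead ($init) is smaller than deadtime ($dead)"
        return 1
    fi
    echo "ok"
}
```

Calling `check_dead_times /etc/ha.d/ha.cf` would print the mismatch for a file containing `deadtime 1200` and `initdead 120`, which is the combination described in the message.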

Re: [Linux-HA] Initial dead time is smaller than deadtime

2008-04-09 Thread Lars Marowsky-Bree
On 2008-04-09T20:26:02, Bernd Schubert [EMAIL PROTECTED] wrote: I still think there is another bug in heartbeat, though. There is simply no reason for heartbeat to wait $deadtime on initial startup of the heartbeat services, when it knows all heartbeat nodes are up. If I at least could

Re: [Linux-HA] OCF resource not moving when put into standby

2008-04-08 Thread Lars Marowsky-Bree
On 2008-04-07T16:26:25, William Francis [EMAIL PROTECTED] wrote: heartbeat 2.1.3 on ubuntu 8 beta I have two nodes and an OCF ip resource (plus DRBD and a mailserver). When I put the node with all my resources into standby, all resources except the ip resource (and the mail server dependent on

Re: [Linux-HA] heartbeat-2.1.3 spec file attempts to re-add hacluster UID under CentOS 5

2008-04-08 Thread Lars Marowsky-Bree
On 2008-04-05T12:22:39, Brian Reichert [EMAIL PROTECTED] wrote: That is sound advice, but CentOS's spec file is the same as the heartbeat project's spec file. Were I to report to CentOS, any patch developed would be pushed upstream to the HA project, or so I would hope. The RPMs shown to

Re: [Linux-HA] about the timing of the old node takes part in a membership again

2008-04-04 Thread Lars Marowsky-Bree
On 2008-04-04T17:34:04, Junko IKEDA [EMAIL PROTECTED] wrote: Hi, I am running one test for a split brain like this; (1) start Heartbeat (node-a/node-b) (2) run Dummy resource on node-a (3) down the interconnect LAN - a split brain (4) stop Heartbeat (only node-b) It might be just a

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-03 Thread Lars Marowsky-Bree
On 2008-04-03T13:59:36, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Any crm* program is significantly slower on a non-DC node regardless of whether something's happening in the cluster. It's always been like that. Hm, I've not personally observed that in my test cluster, or at least not

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-02 Thread Lars Marowsky-Bree
On 2008-04-02T13:00:59, Abraham Iglesias [EMAIL PROTECTED] wrote: Hi all, I'm trying to implement my SNMP MIB module to get every resource failcount in the cluster. I'm surprised that the crm_failcount query to get the failcount for a resource takes 2-3 seconds. Then, for 8 resources in

Re: [Linux-ha-dev] Ahem, another feature request ...

2008-03-22 Thread Lars Marowsky-Bree
On 2008-03-22T02:37:13, Joe Bill [EMAIL PROTECTED] wrote: First, I believe a loud Happy Birthday is in order for Alan's brainchild and all the fine people he gathered to give it a proper ... development ? :-) ... and to help Heartbeat lead its little HA revolution in the world of open

Re: [Linux-HA] ERROR in ha log

2008-03-21 Thread Lars Marowsky-Bree
On 2008-03-21T13:46:23, Gary [EMAIL PROTECTED] wrote: I am trying to set up HA in a VE of OpenVZ. After I started heartbeat, the log file shows some errors: Mar 21 13:41:35 cluster-dev1 heartbeat: [16236]: ERROR: Unable to set scheduler parameters.: Operation not permitted How can I fix these

Re: [Linux-HA] How to synchronise my cib.xml file?

2008-03-21 Thread Lars Marowsky-Bree
On 2008-03-21T13:24:36, Szasz Tamas [EMAIL PROTECTED] wrote: I use the cibadmin command to update, replace, delete resources in my cib file. When I add a new resource, and view the result in crm_mon, the new resource is successfully added to the group, but when I view the cib file, there isn't

Re: [Linux-HA] When to expect working Quorum server?

2008-03-19 Thread Lars Marowsky-Bree
On 2008-03-19T21:06:57, Atanas Dyulgerov [EMAIL PROTECTED] wrote: When to expect working quorumd? Are you planning to fix it in the next release? Probably not. The next release will focus on restructuring the packages plus bugfixes only - though I may be wrong, I'm reasonably sure. Is the

Re: [Linux-HA] Bugfix release for Heartbeat in old packaging

2008-03-18 Thread Lars Marowsky-Bree
On 2008-03-18T14:05:45, Serge Dubrouski [EMAIL PROTECTED] wrote: Then my other question is: who supports the current Heartbeat packages suitable for Pacemaker, and where can I get sources for them with appropriate .spec files? We'll provide that soon again. And in fact, the build service

Re: [Linux-HA] ha_msg_addraw_ll: illegal field

2008-03-08 Thread Lars Marowsky-Bree
On 2008-03-07T17:15:10, Tom Brown [EMAIL PROTECTED] wrote: OS: Debian Etch 4.0r3 Kernel: vanilla kernel 2.6.24.3 DRBD: 8.2.5 Heartbeat: 2.1.3 I was testing fail-overs between two nodes: fs01 and fs02. I've alternated rebooting the nodes several times. I saw the errors below show up in

Re: [Linux-HA] crmd error cl_compress_field: loading compression module failed - what's going on?

2008-03-08 Thread Lars Marowsky-Bree
On 2008-03-06T11:03:05, Luis Motta Campos [EMAIL PROTECTED] wrote: All four machines are the same CentOS 5 (final), all four machines are using the same package set and were installed from the same DVD copy. They literally share the same hardware, under VMware, and everything else is virtual

Re: [Linux-HA] Force switch with DRBD

2008-03-05 Thread Lars Marowsky-Bree
On 2008-03-05T00:36:17, [EMAIL PROTECTED] wrote: Yes I know, but this was actually the question, right? How can I force all resources to move to the other node? And that's the purpose of this command. I have to admit that I don't use drbd and I am not familiar with master/slave devices. I

Re: [Linux-ha-dev] portability

2008-03-04 Thread Lars Marowsky-Bree
On 2008-03-03T17:10:10, David Lee [EMAIL PROTECTED] wrote: But if any patch causes any trouble, feel free to back it out; they should all be independent of each other. But preferably do so in an amended form that includes a comment about why the original code was the way it was. Hi David,

Re: [Linux-ha-dev] tools/ccdv

2008-03-03 Thread Lars Marowsky-Bree
On 2008-03-01T15:35:10, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: Try it with --enable-pretty - it wraps gcc to make the build less noisy and, well, more pretty. (The longish gcc lines are only shown when an error actually occurred.) BTW, David, I'm pretty sure your latest change has broken

Re: [Linux-HA] ERROR: crm_abort: ha_set_tm_time: Triggered assert at iso8601.c:887

2008-03-03 Thread Lars Marowsky-Bree
On 2008-03-03T16:50:52, Luis Motta Campos [EMAIL PROTECTED] wrote: For an obviously fairly embarrassing bug, it's pretty complicated to understand... :( Explanations and pointers to reading material are welcome. I'm not quite sure what you need; this is a bug in the date calculation code being

Re: [Linux-HA] Heartbeat reboots machine after generic plugin load failed error - please help

2008-03-03 Thread Lars Marowsky-Bree
On 2008-03-03T15:59:30, Luis Motta Campos [EMAIL PROTECTED] wrote: Finally, I've investigated my connectivity between the two nodes, and everything seems fine on the network layer: I can see the other machine (both sides) and there are no packet-filtering firewalls running on them (and there

Re: [Linux-HA] Question of Service Monitoring in HAv2

2008-03-02 Thread Lars Marowsky-Bree
On 2008-02-29T20:36:27, Dejan Muhamedagic [EMAIL PROTECTED] wrote: http://lists.linux-ha.org/pipermail/linux-ha/2008-February/030537.html Many thanks for sharing this. I was not aware of it and it seems to be important. Fixed in pacemaker's latest release, I think. Regards, Lars --

Re: [Linux-ha-dev] tools/ccdv

2008-02-29 Thread Lars Marowsky-Bree
On 2008-02-29T11:19:50, David Lee [EMAIL PROTECTED] wrote: So while I'm visiting this, is there any reason not to simplify this right down to the usual 'automake' behaviour? That obscure comment said ...auto-built but not auto-cleaned. Does anyone know why it is necessary to avoid this

Re: [Linux-ha-dev] tools/ccdv

2008-02-29 Thread Lars Marowsky-Bree
On 2008-02-29T15:39:07, David Lee [EMAIL PROTECTED] wrote: But if you can pretty it up and --enable-pretty still works, sure. No, I don't particularly want to get sidetracked into its insides. I was simply wondering why, from the configure/make perspective, it seems to be a special case. I

Re: [Linux-HA] NFS HA cluster

2008-02-28 Thread Lars Marowsky-Bree
On 2008-02-27T11:45:45, Wayne Carty [EMAIL PROTECTED] wrote: Yea after testing again today the problem re-surfaced. The services start up in the following order: ip, nfsboot, nfsserver. You also want the IP address up last. Rationale: if the IP is up but the NFS server isn't yet, the clients

Re: [Linux-HA] NFS HA cluster

2008-02-27 Thread Lars Marowsky-Bree
On 2008-02-26T12:46:46, Wayne Carty [EMAIL PROTECTED] wrote: Solved the problem by reducing the resource timeout from 20s to 5s That is very unlikely to solve any problem, rather it will introduce spurious errors. Regards, Lars

Re: [Linux-HA] meta_attributes twice for some resources

2008-02-27 Thread Lars Marowsky-Bree
On 2008-02-27T17:18:32, Sebastian Reitenbach [EMAIL PROTECTED] wrote: Your conclusions are more or less the same as the ones I had. However, I'll create a bug report later. Unfortunately, still no idea how it happened. We removed the duplicate entries, replacing the CIB (cibadmin -R -o resources),

Re: [Linux-ha-dev] Do we need ccdv?

2008-02-26 Thread Lars Marowsky-Bree
On 2008-02-26T11:19:36, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Don't know. ccdv is apparently a tool to help with reading compiler output. I've never used it. It pretties the compiler output. Try it with --enable-pretty=yes ;-) The error looks like a compiler issue, explicitly casting

[Linux-HA] quorumd - is anyone using it?

2008-02-21 Thread Lars Marowsky-Bree
Hi all, is anyone actually using the quorum daemon? My assessment seems to suggest that it is not workable in any scenario; but maybe I have missed something? If not and I'm right, I am afraid that users might actually deploy it, thinking it solves something and then be very upset when it fails

Re: [Linux-HA] quorumd - is anyone using it?

2008-02-21 Thread Lars Marowsky-Bree
On 2008-02-21T18:28:45, Michael Schwartzkopff [EMAIL PROTECTED] wrote: I would like to give it a try if somebody could explain to me how it works. That's the problem, I don't see how you can use it to build a working and reliable configuration ;-)

Re: [Linux-HA] Some problems with monitoring

2008-02-21 Thread Lars Marowsky-Bree
On 2008-02-21T18:28:00, Adrian Chapela [EMAIL PROTECTED] wrote: Hello, I am having trouble with resource monitoring. It only runs well for some seconds, then monitoring stops and the log says: tengine[23994]: 2008/02/21_18:22:10 info: match_graph_event: Action mysqld-child:0_monitor_2

Re: [Linux-HA] Heartbeat and DRBD with harmonious configuration

2008-02-21 Thread Lars Marowsky-Bree
On 2008-02-21T14:35:30, Doug Lochart [EMAIL PROTECTED] wrote: So now I am walking through my ha.cf with crm off (yes I want to get this working in version 1 then convert my haresources to cib format afterwards). I don't think this approach is going to make you very happy. v2 is quite

Re: [Linux-HA] Heartbeat and DRBD with harmonious configuration

2008-02-21 Thread Lars Marowsky-Bree
On 2008-02-21T15:08:52, Doug Lochart [EMAIL PROTECTED] wrote: after the negotiating phase. Alert someone about this incident. pri-lost echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' [EMAIL PROTECTED]; This just tells me that this node was primary and

Re: [Linux-HA] OCFS2 on HB 2.1.3 v2

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T12:33:04, Raoul Bhatia [IPAX] [EMAIL PROTECTED] wrote: to my knowledge, the folks from suse made some heavy modification to ocfs2 to remove this behavior. i once tried to incorporate their patches into the then current vanilla kernel, but failed mainly because of my lack of

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T12:11:26, Sebastian Reitenbach [EMAIL PROTECTED] wrote: there ordered is set to false. I have the group running, and when I then e.g. want to stop the resource D2, then D3 stops too. Only when I change collocated to false, then D3 keeps running when I stop D2. Seems to be

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T15:49:28, Sebastian Reitenbach [EMAIL PROTECTED] wrote: Make rsc 'from' run on the same machine as rsc 'to' If rsc 'to' cannot run anywhere and 'score' is INFINITY, then rsc 'from' won't be allowed to run anywhere either If rsc 'from' cannot run anywhere, then 'to' won't

Re: [Linux-ha-dev] Pacemaker-Python-GUI(hb_gui) not working on latest Heartbeat and Pacemaker

2008-02-18 Thread Lars Marowsky-Bree
On 2008-02-18T14:13:27, DAIKI MATSUDA [EMAIL PROTECTED] wrote: I have recently been testing the development tree of Heartbeat and Pacemaker, and I found they are almost working well, but hb_gui on mgmtd, provided by Pacemaker-Python-GUI, does not work. Because the part for mgmt was erased from Heartbeat

Re: [Linux-ha-dev] Pacemaker-Python-GUI(hb_gui) not working on latest Heartbeat and Pacemaker

2008-02-18 Thread Lars Marowsky-Bree
On 2008-02-18T15:57:15, Dejan Muhamedagic [EMAIL PROTECTED] wrote: But there is a problem. In an environment where mgmtd is not installed, heartbeat fails at first. What fails? Compilation? Heartbeat should be aware of the mgmtd only insofar as to manage the process (the respawn

[Linux-ha-dev] BUGZILLA: new Pacemaker product

2008-02-14 Thread Lars Marowsky-Bree
Hi all, at the http://developerbugs.linux-foundation.org/ bugzilla, which is also used by the Linux HA and openAIS project, we have just created a Pacemaker product. Please use this for filing all CRM/Pacemaker related bugs going forward, as Pacemaker is the vehicle for future bugfixes and

[Linux-ha-dev] Bugzilla clarification issue

2008-02-14 Thread Lars Marowsky-Bree
Hi Alan, you have set yourself as the new default owner for all Linux HA related CRM issues. I assume this means you'll take care of shepherding them over to the Pacemaker product, or take care of their resolution? Seriously, why was that done, we have made clear that we'd continue to support

Re: [Linux-ha-dev] Bugzilla clarification issue

2008-02-14 Thread Lars Marowsky-Bree
On 2008-02-14T22:39:13, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Andrew said in reply to a bugzilla which I posted around the time he created the pacemaker project that the bugs should be filed there. I guess that what Alan did was a reaction to that. Yeah, well, that was Andrew's intent,

Re: [Linux-HA] Heartbeat 2.1.3 error

2008-02-14 Thread Lars Marowsky-Bree
On 2008-02-14T14:17:05, Nikita Michalko [EMAIL PROTECTED] wrote: heartbeat[5612]: 2008/02/14_09:47:59 WARN: Managed /usr/lib/heartbeat/cib process 5630 exited with return code 1. heartbeat[5612]: 2008/02/14_09:47:59 EMERG: Rebooting system. Reason: /usr/lib/heartbeat/cib Someone

[Linux-HA] BUGZILLA: new Pacemaker product

2008-02-14 Thread Lars Marowsky-Bree
Hi all, at the http://developerbugs.linux-foundation.org/ bugzilla, which is also used by the Linux HA and openAIS project, we have just created a Pacemaker product. Please use this for filing all CRM/Pacemaker related bugs going forward, as Pacemaker is the vehicle for future bugfixes and

Re: [Linux-HA] Windows port of haclient

2008-02-10 Thread Lars Marowsky-Bree
On 2008-02-09T10:15:47, Xinwei Hu [EMAIL PROTECTED] wrote: GUI needs to do more things than just talking to CIB. For example, when the admin wants to know why a resource fails on node A, someone has to do log analysis on node A then. So I'd prefer to have an indirect layer in-between here

Re: [Linux-HA] Windows port of haclient

2008-02-10 Thread Lars Marowsky-Bree
On 2008-02-09T10:55:57, Xinwei Hu [EMAIL PROTECTED] wrote: Making it web or not will not change the fact that CIB is harder to understand than RHCS. I vote for a web based gui also, but won't expect it to be more friendly than haclient. The key trick is to only show the complexity the user

Re: [Linux-HA] Windows port of haclient

2008-02-10 Thread Lars Marowsky-Bree
On 2008-02-09T10:34:39, Xinwei Hu [EMAIL PROTECTED] wrote: When watching the status of HA, we normally don't care about which resources are good. Instead, we only care about which resource has failed, which should have started but is stopped, and so on. A new monitor view of GUI is needed here ;)

Re: [Linux-HA] Pen a very cool reverse proxy with no configuration file great for HA

2008-02-08 Thread Lars Marowsky-Bree
On 2008-02-08T10:50:31, Eddie C [EMAIL PROTECTED] wrote: pen 8080 host1:80 host2:80 This makes it ideal to work as an RA, because the load balancer (OK, it's a proxy, but close enough) can fail over to any node without having to worry about synchronizing configuration files. You could do

Re: [Linux-ha-dev] RFC: Roadmap for 2.2.0

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T10:45:00, Andrew Beekhof [EMAIL PROTECTED] wrote: Saying Requires: pacemaker doesn't seem like a good idea though True, people who wish to run v1 only don't need Pacemaker installed. Regards, Lars -- Teamlead Kernel, SuSE Labs, Research and Development SUSE LINUX Products

Re: [Linux-HA] propagate value similliar to pingd

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T19:00:55, Thomas Glanzmann [EMAIL PROTECTED] wrote: Hello, I would like to write a script similar to pingd that is spawned and populates a value in the cib that I can build a rule on. What do I have to do to obtain the above? Concrete questions are: - What do I have

Re: [Linux-HA] ClusterIP

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T19:11:26, Thomas Glanzmann [EMAIL PROTECTED] wrote: Hello, I would like to do a Cluster-IP Setup with SLES 10. A few things are unclear to me. With ClusterIP you have one IP address that is shared on two or more nodes. It usually uses a multicast MAC address. Both nodes see

Re: [Linux-HA] Resources are stopped for a few seconds when bringing secondary node out of standby

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T16:33:48, Dejan Muhamedagic [EMAIL PROTECTED] wrote: This is very strange. Are you sure that you're running bash and not dash? It's such a mishmash with all those xxshes. I'm off to check that dash thing. Why not simply specify #!/bin/bash explicitly? Other scripts require it
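
To see which interpreter a script actually requests, it is enough to read its first line. A small Python sketch, assuming the script is a local file; the helper name `script_interpreter` is made up for illustration.

```python
def script_interpreter(path):
    """Return the interpreter named on the shebang line, or None if absent."""
    with open(path, "rb") as fh:
        first = fh.readline().decode("utf-8", "replace").strip()
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    # parts[0] is the interpreter path, e.g. /bin/bash or /bin/sh
    return parts[0] if parts else None
```

A script that reports `/bin/sh` runs under whatever shell that name points to on the node, which is where the bash/dash question comes from.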

Re: [Linux-HA] Windows port of haclient

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T16:26:10, Dejan Muhamedagic [EMAIL PROTECTED] wrote: The documentation is rather scarce, I'm afraid. The current GUI is based on python and, if you speak python, you could take a look there. Another option would be to just invoke external programs such as crm_mon,
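
The "invoke external programs" route can be sketched in a few lines of Python. This is only an illustration: `run_status_command` is a hypothetical helper, it returns the raw text without parsing it, and which command to pass is up to the caller.

```python
import subprocess


def run_status_command(argv, timeout=10):
    """Run an external status tool and return (exit_code, combined_output)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the captured text
        universal_newlines=True,    # decode output to str
        timeout=timeout,            # raises subprocess.TimeoutExpired if exceeded
    )
    return proc.returncode, proc.stdout
```

On a cluster node one might call `run_status_command(["crm_mon", "-1"])`, assuming crm_mon is on the PATH and its one-shot mode is wanted; a GUI on another machine would still need some transport to reach the node.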

Re: [Linux-HA] ClusterIP

2008-02-07 Thread Lars Marowsky-Bree
On 2008-02-07T22:43:50, Thomas Glanzmann [EMAIL PROTECTED] wrote: Hello again, here comes my cib.xml for a clusterip. But the resource stickiness is not working for me. When I shut down ha-2, the two clone instances stay on ha-1. Any ideas? Before sending this e-mail I used the following

Re: [Linux-ha-dev] RFC: Roadmap for 2.2.0

2008-02-06 Thread Lars Marowsky-Bree
On 2008-02-06T16:28:41, Andrew Beekhof [EMAIL PROTECTED] wrote: There don't seem to be any (I think we exhausted the testing discussion)... are you waiting for me to do it? Sure, feel free to; at least I have no objections. I wonder if we could add a Recommends/Requires: pacemaker or

Re: [Linux-HA] Cluster Fail-Over

2008-02-06 Thread Lars Marowsky-Bree
On 2008-02-06T13:45:47, Andrew Beekhof [EMAIL PROTECTED] wrote: Most things seem to be running, but the 3s timeout for rsc_winbind_2 seems very low and rsc_clamd_2 seems to have trouble stopping. There's also this: Resource Group: filter_group_1 rsc_winbind_1 (lsb:winbind):

Re: [Linux-HA] crm crash on centos 4.5

2008-02-06 Thread Lars Marowsky-Bree
On 2008-02-06T15:11:28, Tao Yu [EMAIL PROTECTED] wrote: Running the heartbeat 2.1.2 Core: #0 0x003b9602e21d in raise () from /lib64/tls/libc.so.6 #1 0x003b9602fa1e in abort () from /lib64/tls/libc.so.6 #2 0x003efe80bc90 in crm_abort () from /usr/lib64/libcrmcommon.so.1 #3

Re: [Linux-HA] 32 / 64 Bit

2008-02-06 Thread Lars Marowsky-Bree
On 2008-02-06T08:58:46, Andrew Beekhof [EMAIL PROTECTED] wrote: Right. If there is enough demand, then we can just make the equivalent of ocf-shellfuncs for that language. autoconf makes this relatively straightforward, you just supply the template and it'll fill in the values. Actually

Re: [Linux-HA] DRBD and Linux HA, crm_verify fails

2008-02-05 Thread Lars Marowsky-Bree
On 2008-02-04T17:39:22, Mike Toler [EMAIL PROTECTED] wrote: Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: rsc:drbd0:0: start Feb 4 15:46:11 nfs_server1 drbd[4850]: INFO: r0: Using hostname node_0 Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: RA output: (drbd0:0:start:stdout)

Re: [Linux-HA] 32 / 64 Bit

2008-02-05 Thread Lars Marowsky-Bree
On 2008-02-05T14:55:35, Andreas Mock [EMAIL PROTECTED] wrote: 1) Does this mean I can rely on $OCF_ROOT being set and on $OCF_ROOT/resource.d/heartbeat/.ocf-shellfuncs being a shell snippet to execute that sets the listed variables? Yes, OCF_ROOT is an OCF-mandated setting. The .ocf-shellfuncs
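
For an agent written in something other than shell, the same lookup might be sketched like this in Python. The function name `shellfuncs_path` is hypothetical; the path layout is the one quoted in the question, and the sketch deliberately refuses to guess a default when OCF_ROOT is missing.

```python
import os


def shellfuncs_path(environ=None):
    """Build the .ocf-shellfuncs path from OCF_ROOT, which must be set."""
    env = os.environ if environ is None else environ
    root = env.get("OCF_ROOT")
    if not root:
        # No fallback on purpose: the agent is expected to be given OCF_ROOT.
        raise RuntimeError("OCF_ROOT is not set")
    return os.path.join(root, "resource.d", "heartbeat", ".ocf-shellfuncs")
```

Passing a dict instead of reading `os.environ` directly keeps the helper easy to exercise outside a cluster.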

Re: [Linux-ha-dev] Re: RFC: Roadmap for 2.2.0

2008-02-04 Thread Lars Marowsky-Bree
On 2008-02-04T21:59:44, Tadashiro Yoshida [EMAIL PROTECTED] wrote: I understand Andrew will release Pacemaker, and Alan will release Heartbeat, individually. Now we are standing on a starting point of this discussion ;-) Someone should testify the combination of these packages in the

Re: [Linux-ha-dev] Re: RFC: Roadmap for 2.2.0

2008-02-04 Thread Lars Marowsky-Bree
On 2008-02-04T14:23:14, Tadashiro Yoshida [EMAIL PROTECTED] wrote: I am confused completely. I have thought integrated product of Heartbeat and PaceMaker will be tested and released in the SUSE's Build Service. We will build packages of both heartbeat and PaceMaker, yes. And, of course, we

Re: [Linux-HA] The suicide stonith plugin doesn't work with 2.1.3 (?)

2008-02-04 Thread Lars Marowsky-Bree
On 2008-02-04T12:58:47, Dejan Muhamedagic [EMAIL PROTECTED] wrote: This is not quite true. The cluster cannot get direct confirmation from the device which pulled the plug, but we're talking probabilities here. stonith is an all-or-nothing proposition. Nothing ever is ;-) We're just willing

Re: [Linux-ha-dev] RFC: Roadmap for 2.2.0

2008-02-03 Thread Lars Marowsky-Bree
On 2008-01-29T10:31:35, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: Hi all, I'd like to propose the following changes to happen in the next heartbeat release, which I'd name 2.2.0 because of them. ... If no one has any remaining objections, I suppose we could go ahead
