Re: [Linux-HA] always have to cleanup LSB script on failover

2008-04-25 Thread Dominik Klein
[EMAIL PROTECTED] wrote: Hello list, I have an ordered and collocated group that consists of the following elements that startup in order: Resource Group: GROUPS_KNWORKS_mail drbddisk_knworks_mail (heartbeat:drbddisk): Started asknmapr01 drbddisk_axigenbin (heartbeat:drbddisk):

Re: [Linux-HA] Resource reached -INIFINITY

2008-04-24 Thread Dominik Klein
Due to a temporary initialization problem, a resource reached a -INFINITY score on one node. Is there a way to instruct heartbeat to recalculate the score of the resource on the node without restarting heartbeat? Clean the resource: crm_resource -C -r $res -H $node And maybe you have
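
The cleanup command quoted above takes one resource and one node. A minimal sketch, assuming you want to review the commands before running them on several nodes: the function name `cleanup_cmds` and the resource and node names are hypothetical, and only the `-C -r ... -H ...` form comes from the reply itself.

```shell
# Print (do not run) one cleanup command per node for a given resource.
# cleanup_cmds, my_resource, node1 and node2 are hypothetical names.
cleanup_cmds() {
  res=$1
  shift
  for node in "$@"; do
    printf 'crm_resource -C -r %s -H %s\n' "$res" "$node"
  done
}

cleanup_cmds my_resource node1 node2
```

Printing the commands first keeps the sketch side-effect free; piping the output to `sh` on a cluster node would execute them.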

Re: [Linux-HA] Almost done with my HA setup, but somethign not working

2008-04-22 Thread Dominik Klein
Nick Duda wrote: (sorry for the long email, but all my configs are here to view) I posted before about HA with 2 squid servers. It's just about done, but I'm stumbling on something. Every time I manually cause something to happen in hopes of seeing it fail over, it doesn't. For example, I get crm_mon

Re: [Linux-HA] Almost done with my HA setup, but somethign not working

2008-04-22 Thread Dominik Klein
This will return 1. So the stop operation failed. With stonith, your node would be rebooted now. I don't see a stonith device, so the resource goes unmanaged. I think what you see is intended. Regards Dominik and squid crashed it should fail over to the other box. It's not. Dominik Klein wrote

Re: [Linux-HA] Constraint: Two drdb master on the same node?

2008-04-11 Thread Dominik Klein
I have the following problem with a two-node cluster: I have two DRBD resources. On the node where drbd0 is master, a certain resource group with different resources will be activated. On the node where drbd1 is master, this will happen with another resource group. You can get the necessary

Re: [Linux-HA] New questions relating to: Methods of dealing with network fail(ure/over)

2008-04-11 Thread Dominik Klein
Stallmann, Andreas wrote: Hi there! I have set up a two-node heartbeat cluster running apache and drbd. Everything went fine until we tested a split brain scenario. In this case, when we detach both network cables from one host, we get a two-primary situation. I read in the thread methods of

Re: [Linux-HA] Three questions on failcount attribute

2008-04-10 Thread Dominik Klein
[EMAIL PROTECTED] sbin # ./crm_failcount -G -U isdl601 -r caebench.proc name=fail-count-caebench.proc value=(null) Error performing operation: The object/attribute does not exist Is this intentional? At least the normal behaviour. in that version Ah right. crm_failcount gives a
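
The output line quoted above reports `value=(null)` for a resource that has not failed yet. A minimal sketch of how a wrapper script might read that line, assuming (my assumption, not something the thread states) that a missing attribute should be treated as a failcount of 0. The function name `parse_failcount` is hypothetical.

```shell
# Extract the failcount from a "name=... value=..." line.
# Treating "(null)" or an empty value as 0 is an assumption of this sketch.
parse_failcount() {
  value=${1##*value=}   # strip everything up to and including "value="
  case $value in
    ''|'(null)') echo 0 ;;
    *) echo "$value" ;;
  esac
}

parse_failcount 'name=fail-count-caebench.proc value=(null)'   # prints 0
parse_failcount 'name=fail-count-caebench.proc value=3'        # prints 3
```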

Re: [Linux-HA] how to config to the cluster to perform a simple two node (master-slave) cluster?

2008-04-10 Thread Dominik Klein
Hi to run all resources on the same node, you could put them in a group. Read http://wiki.linux-ha.org/ClusterInformationBase/ResourceGroups If you want to decide which node the group is usually located at, you need a rsc_location constraint. An example is also on that page. To move the group

Re: [Linux-HA] Ordering question

2008-04-09 Thread Dominik Klein
William Francis wrote: http://linux-ha.org/DRBD/HowTov2 it has the example <rsc_order id="drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/> This reads: start fs0 after ms-drbd0 promote which seems to mean promote ms-drbd0 (the to) THEN start fs0 (the from) But if you

Re: [Linux-HA] Three questions on failcount attribute

2008-04-09 Thread Dominik Klein
Martin Knoblauch wrote: Hi, three questions on the failcount attribute. I am running 2.0.8, and yes I know I should upgrade ... :-( Good to know you know :) a) Is it possible that the failcount for a ressource/node is only available after a failure? On a not-yet-failed ressource I see:

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-04 Thread Dominik Klein
Lars Marowsky-Bree wrote: On 2008-04-03T13:59:36, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Any crm* program is significantly slower on a non-DC node regardless of whether something's happening in the cluster. It's always been like that. I can confirm that. It's been for me ever since I

Re: [Linux-HA] lsb resource problem

2008-04-03 Thread Dominik Klein
Hi William Francis wrote: Ubuntu 7.10 with DRBD 8.0.3 and Heartbeat 2.1.2 with an updated Filesystem file kernel 2.6.22-14 (updated from stock) I have possibly two problems, a heartbeat and a DRBD issue. My goal is to get a pair of machines working with a large /opt partition for zimbra (my

Re: [Linux-HA] Re: pingd problem

2008-04-03 Thread Dominik Klein
Achim Stumpf wrote: Hi, Now it works. I have changed in cib.xml: rsc_location id=group_1:connected rsc=group_1 rule id=group_1:connected:rule score_attribute=pingd expression

Re: [Linux-HA] Re: pingd problem

2008-04-03 Thread Dominik Klein
see in the logs. There won't be a failover to the other node. Any hints, how I could get the setup working with scores as above? Thanks, Achim Dominik Klein wrote: Achim Stumpf wrote: Hi, Now it works. I have changed in cib.xml: rsc_location id=group_1:connected

Re: [Linux-HA] Re: pingd problem

2008-04-03 Thread Dominik Klein
Achim Stumpf wrote: This will give you a pingd score of 500. A ping_group is treated as one ping_host score wise. If you want to take each ping hosts connectivity into play, you should have ping 10.14.0.10 ping 10.14.0.11 ping 10.14.0.12 ping 10.14.0.13 instead. This would give a pingd score

Re: [Linux-HA] cibadmin question

2008-04-02 Thread Dominik Klein
Jason Erickson wrote: I am trying to add a resource with this command. cibadmin -C -o resources -x meatware_stonithcloneset.xml It is telling me could not parse input file here is the xml file as well. clone id=meat_stonith_cloneset − This - is not actually in there, is it? If it is,

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-02 Thread Dominik Klein
Abraham Iglesias wrote: Hi all, I'm trying to implement my SNMP MIB module to get every resource failcount in the cluster. I'm surprised that the crm_failcount query to get the failcount for a resource takes 2-3 seconds. Then, for 8 resources in the cluster it takes 16-24s and it's quite low

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-02 Thread Dominik Klein
Abraham Iglesias wrote: thank you for the answer, dominik. As you said, in the DC is faster, but i need to run these queries on every node, as every node can be asked for that information. :S crm_failcount -U ;) Is there any cached data about these values? Or a static file where the

Re: [Linux-HA] crm_failcount queries quite slow?

2008-04-02 Thread Dominik Klein
Abraham Iglesias wrote: Hi, thanks for the failcount tip ;) By the way, I'm using 2.0.8 heartbeat. There is no information in cib.xml about failcounts...is it possible? or am I missing anything? No. I was wrong. The failcount is in the status section, which is never written to disc. Sorry

Re: [Linux-HA] HA maintenance mode

2008-03-31 Thread Dominik Klein
I don't see an option to specify the behavior for the stand-by mode in the manual. I just wanna prevent HA from moving resources to other nodes for maintenance purpose. So basically, you want to stop all resources at once, don't you? Here's a feature request:

Re: [Linux-HA] adding DRBD group problem

2008-03-28 Thread Dominik Klein
B) put everything in a group in the master_slave resource? ... never tried this by myself I don't think this would work without changing the all the groups resource agents to be master/slave resource agents. Every resource within the master_slave would be promoted/demoted etc. and that

Re: [Linux-HA] Re: Re: showscores.sh weirdness and Not failing over after, repeated kills of IPaddr2?

2008-03-27 Thread Dominik Klein
Roland G. McIntosh wrote: Dominik Klein wrote: With a failure stickiness of -30, you allow your groups resources to fail (400/30)=14 times. Is that what you want? Although the default failure stickiness is -30, the group has a failure stickiness of -100. I would like to failover after 3
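
The arithmetic quoted above (a score of 400 against a failure stickiness of -30 or -100) can be checked with a small sketch. It counts the smallest number of failures whose accumulated penalty reaches the score, which is my reading of the quoted "(400/30)=14". The function name is hypothetical and the penalty is passed as a positive magnitude.

```shell
# Smallest n such that n * penalty >= score (integer ceiling division).
# failures_until_exhausted is a hypothetical helper, not a cluster tool.
failures_until_exhausted() {
  score=$1
  penalty=$2
  echo $(( (score + penalty - 1) / penalty ))
}

failures_until_exhausted 400 30    # prints 14
failures_until_exhausted 400 100   # prints 4
```

With the group's own failure stickiness of -100, the same score of 400 is used up after 4 failures rather than 14.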

Re: [Linux-HA] slave's drbd resource doesn't get promote when master dies

2008-03-27 Thread Dominik Klein
I think I have found out my problem though: I didn't put the resource location stuff for pingd. I added this snippet to the CIB to constrain the master-slave drbd resource to not run on a node with lost connectivity and so far in my tests it seems to work: rsc_location id=drbd_id:connected

Re: [Linux-HA] Virtual IP

2008-03-27 Thread Dominik Klein
Jason Erickson wrote: The only part that i am confused about is where do you set the resource score for a node? Within the constraints section of the cib. Something like this: constraints rsc_location id=rscloc-webserver rsc=webserver rule id=rscloc-webserver-rule-1

Re: [Linux-HA] showscores.sh weirdness and Not failing over after repeated kills of IPaddr2?

2008-03-26 Thread Dominik Klein
Hi Roland G. McIntosh wrote: No matter how many times I kill IPaddr2 I can't seem to cause a failover in my simple 2 node cluster. OT, but why do people keep calling 2 node clusters simple clusters? Clusters are not simple. Maybe it's a rather small cluster. I'm trying to get it working

Re: [Linux-HA] DRBD problems

2008-03-26 Thread Dominik Klein
Néstor wrote: I am getting this errors when running the command drbdadm adjust mysql on WAHOO: *Failure: (114) Lower device is already claimed. This usually means it is mounted. Well, is it? Command 'drbdsetup /dev/drbd0 disk /dev/sda2 /dev/sda2 internal --set-defaults --create-device

Re: [Linux-HA] DRBD problems

2008-03-26 Thread Dominik Klein
Néstor wrote: Version 8.2.5 I think is telling me that the device is already mounted. Right. Is it? What I do not understand them is how to pick a directoy or device. Do I need to re-partition my device to create a separate device for drbd or can I pick just a directory within the device

Re: [Linux-HA] Help with failure-stickiness

2008-03-22 Thread Dominik Klein
Roland G. McIntosh wrote: I've got 3 resources in a group, and I'd like to configure stickiness values such that if there are more than 3 failures in the group all resources go to the failover node. I've read http://www.linux-ha.org/v2/faq/forced_failover many times, but do not quite

Re: [Linux-HA] slave's drbd resource doesn't get promote when master dies

2008-03-20 Thread Dominik Klein
Jean-Francois Malouin wrote: I thought I had it nailed but still no go. I'm running a simple two-nodes Active/Passive, Debian/Etch cluster with apache, mysql, heartbeat-2.1.3 and drbd-8.2.5 using mcast on the primary NIC and bcast on secondary GigE interfaces which is also the replication link

Re: [Linux-HA] Demote primary when connectivity lost.

2008-03-20 Thread Dominik Klein
Guy wrote: Hi guys, After much fiddling and learning (still loads to do though) I've got my 2 node primary/secondary secondary/primary set more or less working. One failure of node 1, node 2 takes both drbd partitions as primary and mounts the partitions and nfs etc etc. When node 1 is

Re: [Linux-HA] (Bug?) regarding resource_stickiness, master_slave and master-colocated groups

2008-03-14 Thread Dominik Klein
Adrian Chapela wrote: Dominik Klein wrote: Hi during the writeup of ScoreCalculation on the wiki, I noticed something strange. It'd be nice to know whether this is on purpose or a bug. Test setup is: 2 nodes, 1 drbd device, a group of 3 resource which are to run on top of the drbd

Re: [Linux-HA] (Bug?) regarding resource_stickiness, master_slave and master-colocated groups

2008-03-14 Thread Dominik Klein
Solved that problem for me. At least with a colocated resource I have to add. Regards Dominik ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] (Bug?) regarding resource_stickiness, master_slave and master-colocated groups

2008-03-14 Thread Dominik Klein
Dominik Klein wrote: Solved that problem for me. At least with a colocated resource I have to add. Urghs. Friday afternoon ... I just wanted to verify that and it turns out my method does not restart the whole thing on a slave failure. Thats true. But it does still restart the whole

Re: [Linux-HA] DRBD+Hearbeat not working as intended

2008-03-13 Thread Dominik Klein
Hi I have a drbd+heartbeat setup and I am having a problem. If the machine which drbd is master shuts down the passive machine does not change its status to active one and because of that it cant mount the drbd file system. Can anyone give me some feedback in this ?? here is

Re: [Linux-HA] Enhanced version of showscores and a major updateonthe score calculation documentation

2008-03-13 Thread Dominik Klein
Hi Dominik, this looks good now. Thank you for fixing. You're welcome. One question: Are you able to cache the default stickiness values? If you determine that every loop it costs time. Good idea. Thanks. The script runs here 5.7 seconds for 18 resources on the DC. That's long.

[Linux-HA] Enhanced version of showscores and a major update on the score calculation documentation

2008-03-12 Thread Dominik Klein
Hi just yesterday I found a way better way to get the scores from ptest. This new version can not only display normal scores, it can also display master scores, which seems quite important these days. It still produces a heck of a lot of logs when executed, but thats just the nature of the

Re: [Linux-HA] Which STONITH devices is everybody using?

2008-03-12 Thread Dominik Klein
Thanks for the replies so far. No one else? Regards Dominik

Re: [Linux-HA] Enhanced version of showscores and a major update on the score calculation documentation

2008-03-12 Thread Dominik Klein
Andreas Mock wrote: Hi Dominik, you know I like your script, but the newest version broke something: When a resource name has '.' (dots) in it, That might just be. Use dashes ;) the way you split the $line in the while loop to get the score, node and resource name doesn't work any more.

Re: [Linux-HA] Enhanced version of showscores and a major update on the score calculation documentation

2008-03-12 Thread Dominik Klein
Use dashes ;) Well - turns out to be hyphen. At least as of dict.leo.org :) Regards Dominik

Re: [Linux-HA] Strange behavior of the resource group on 2 nodes cluster

2008-03-11 Thread Dominik Klein
In a colocated group (which is the default), all subsequent resources are tied to the group's first resource with a score of INFINITY. To not allow them to run on another node but the node the first resource is run on, they also get -INFINITY for any other node. Thank you very much, Dominik,

Re: [Linux-HA] Strange behavior of the resource group on 2 nodes cluster

2008-03-11 Thread Dominik Klein
Hi Dominik, I tried to let the resource in the group fail a couple of times, but after the 2-nd try will the failcount for both resources NOT increased. Did you wait for the cluster to restart the resource after you produced the failure before causing another failure? It shows after each

Re: AW: AW: AW: [Linux-HA] Switchover problem with DRBD

2008-03-10 Thread Dominik Klein
Schmidt, Florian wrote: Hello again, you're right, i do have DRBD 8.2.1 installed. Well, you mean downgrading on 0.7.x would be better? This is only a test-cluster so this shouldn't be a problem. But I'll try re-installing my current DRBD-version first and then (if this doesn't help)

Re: [Linux-HA] Strange behavior of the resource group on 2 nodes cluster

2008-03-10 Thread Dominik Klein
Nikita Michalko wrote: Hi all! I have some troubles with HA V2.1.3 on SLES10 SP1, two-node cluster with 1 resource group=2 resources. Intended is forced failover of the group on the third failure of any resource in the group; one node is preferred over the other (see attached

[Linux-HA] Which STONITH devices is everybody using?

2008-03-10 Thread Dominik Klein
Hi I read the list of stonith plugins and had a look at which devices they support. The list of devices you can buy new today was rather small. So I'd like to know which STONITH devices heartbeat users use. Which device do you use? What kind of device is it? How much is it? How well does it

Re: [Linux-HA] Force switch with DRBD

2008-03-04 Thread Dominik Klein
DucaConte Balabam wrote: Hello, I've a cluster using heartbeat v2 and drbd in master/slave configuration. It's: Last updated: Tue Mar 4 09:49:30 2008 Current DC: rman1c (875afc12-b88e-4940-9816-218d2a5911c3) 2 Nodes configured. 2 Resources configured. Node: rman1a

Re: [Linux-HA] How to allow resources to ping-pong forever?

2008-03-03 Thread Dominik Klein
Alex Spengler wrote: Hi, I'm stuck in setting up my cluster. What I want to achieve is - run apache on whatever node together with cluster IP which is 172.23.100.200. - if apache fails - switch over to other node - if gateway 172.23.100.1 is not reachable - switch over to other node AND allow

Re: [Linux-HA] (no subject)

2008-02-29 Thread Dominik Klein
Dominik Klein wrote: Schmidt, Florian wrote: Hello list, I still have problem with my heartbeat-config I want heartbeat to start AFD. I checked the RA for LSB-compatibility and think that it's right now. The log file says, that the bash does not find the command afd. crmd[2725]: 2008/02

Re: [Linux-HA] (no subject)

2008-02-29 Thread Dominik Klein
Schmidt, Florian wrote: Hello list, I still have problem with my heartbeat-config I want heartbeat to start AFD. I checked the RA for LSB-compatibility and think that it's right now. The log file says, that the bash does not find the command afd. crmd[2725]: 2008/02/29_11:32:00 info:

Re: AW: [Linux-HA] (no subject)

2008-02-29 Thread Dominik Klein
Hello, I wrote the following lines on top of the script code: export PATH=$PATH:/home/afdha export AFD_WORK_DIR=/usr/afd AFD needs one environment-variable named AFD_WORK_DIR to know, where to work I also did chown afdha:501 /usr/afd and chown /home/afdha How far does this work, because

Re: [Linux-HA] Script to calculate scores to allow a defined number of resource failures before failover

2008-02-29 Thread Dominik Klein
Some cosmetic changes. Thanks to wschlich. Regards Dominik calc_linux_ha_scores.sh Description: Bourne shell script

Re: [Linux-HA] Question of Service Monitoring in HAv2

2008-02-29 Thread Dominik Klein
I had a query about the service monitoring in HA v2, I was wondering if i can configure it in such a way that if a service fails , heartbeat should try to restart it say n number of times before it fences the system It will (by default) only fence the system, if stop fails. If

Re: [Linux-HA] Node Priority

2008-02-29 Thread Dominik Klein
When heartbeat is started(on both nodes) my node called pgslave gets promoted to DC and that cannot happen, my node pgmaster should always be the active part of the service and also this node needs to always try to get promoted to DC when he have the chance(pgslave gotta spend minimal time being

Re: AW: AW: [Linux-HA] (no subject)

2008-02-29 Thread Dominik Klein
Schmidt, Florian wrote: Kind regards Hello, I wrote the following lines on top of the script code: export PATH=$PATH:/home/afdha export AFD_WORK_DIR=/usr/afd AFD needs one environment-variable named AFD_WORK_DIR to know, where to work I also did chown afdha:501 /usr/afd

Re: [Linux-HA] Awesome explanation of stickiness scores :)

2008-02-28 Thread Dominik Klein
Fajar Priyanto wrote: Hi all, This afternoon I would have been asked a question about resource| failure_stickiness, because I was a bit confused the practical use of those stickiness in relation with score in location constrains. But, this page has explained it all very clearly:

Re: [Linux-HA] Deleting Master/Slave-Resources

2008-02-28 Thread Dominik Klein
Schmidt, Florian wrote: Hi list, I'm not able to delete my DRBD-Master/Slave-Set. I tried with crm_resource -D -r drbd_master_slave -t clone and crm_resource -D -r drbd_master_slave -t master-slave drbd_master_slave is the name of my resource. Anyone a short advice? cibadmin -Q -o

Re: [Linux-HA] Newest version of 'showscores'

2008-02-28 Thread Dominik Klein
, only one was looked for. Also fixed a problem with the headings being mixed up. I know this thing produces a lot of logs, but at least it does display the scores, hu? :) #!/bin/bash # Feb 2008, Dominik Klein # Display scores of Linux-HA resources # Known issues: # * cannot get resource

Re: [Linux-HA] linux-ha with drbd -- nothing working

2008-02-28 Thread Dominik Klein
Adam Kaufman wrote: hi all, I've been trying for the last few days to get heartbeat working with a basic drbd configuration. I was initially using heartbeat 2.0.8, but eventually upgraded to 2.1.3. the symptoms exhibited by each version of heartbeat were completely different, so here I'll

Re: [Linux-HA] Newest version of 'showscores'

2008-02-28 Thread Dominik Klein
Serge Dubrouski wrote: On Thu, Feb 28, 2008 at 9:53 AM, Dejan Muhamedagic [EMAIL PROTECTED] wrote: Hi, On Thu, Feb 28, 2008 at 03:47:04PM +0100, Dominik Klein wrote: thank you for the script and for pointing to the right direction. May I change the format of the output? (Yes, I also saw

Re: [Linux-HA] Clonesets and Resource Groups

2008-02-27 Thread Dominik Klein
Michael is right. Wörz that is :) Didn't see you both had the same first name. Sorry. Regards Dominik

Re: [Linux-HA] Clonesets and Resource Groups

2008-02-27 Thread Dominik Klein
The biggest issue so far is to migrate constraint resources from one node to another with a single command. I cannot use grouped resources because one of the resources must be a cloneset (ocfs) and thus cannot be a member of a group. Why not? You just cannot create this in the GUI. Use CLI.

Re: [Linux-HA] how to create meta-data?

2008-02-27 Thread Dominik Klein
1. I am using DRBD-0.7.21, how to create meta-data for this system and how to upgrade it to DRBD-8 meta-data? http://blogs.linbit.com/florian/2007/10/03/step-by-step-upgrade-from-drbd-07-to-drbd-8/ 2. My meta-disk option in /etc/drbd.conf file has /dev/hda6 as entry. Is this same as

Re: [Linux-HA] How to trigger stonith of node

2008-02-27 Thread Dominik Klein
is there a way to trigger the stonith of a node for testing? pkill -9 heartbeat

Re: [Linux-HA] Newest version of 'showscores'

2008-02-27 Thread Dominik Klein
Andreas Mock wrote: Hi all, can someone point me to the newest version of 'showscores'. This should be the newest version. http://hg.clusterlabs.org/pacemaker/dev/rev/86e1f081dc7f Apparently it has a mix-up in the headings, but that shouldn't hurt. Someone posted it here once. Yup, that

Re: [Linux-HA] how to create meta-data?

2008-02-26 Thread Dominik Klein
What does meta-data in DRBD mean http://www.drbd.org/users-guide/ch-internals.html#s-metadata and how to create meta-data. http://www.drbd.org/users-guide/s-first-time-up.html Regards Dominik

Re: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Dominik Klein
Damon Estep wrote: On this page: http://www.linux-ha.org/DRBD/HowTov2 is this comment: drbd must not be started by init Well, you do not have to start drbd by init. But it shouldn't harm if you do. This statement is false if you want to use the heartbeat Resource Agent drbddisk, but that's

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Dominik Klein
The stonith daemons start successfully now, but with a monitor interval of 15s one of the two fails fairly quickly. The apc (9211 masterswitch) only allows a single login, and I wonder if the two daemons aren't colliding, and one is timing out and giving up. Did you try apcmastersnmp?

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Dominik Klein
Marco Leone wrote: Hi, I'm using drbd 8.2.4 and heartbeat v.2 too on two ubuntu 7.04 server nodes. I guess you did not completely do that. I followed this link http://linux-ha.org/DRBD/HowTov2 constraints rsc_location id=rsc_location_group_1 rsc=group_1 rule

Re: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Dominik Klein
crm_verify[19814]: 2008/02/19_08:46:57 WARN: unpack_rsc_op: Processing failed op drbd0:1_start_0 on cn2-inverness-co: Error crm_verify[19814]: 2008/02/19_08:46:57 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:1_start_0 on cn2-inverness-co crm_verify[19814]: 2008/02/19_08:46:57

Re: [Linux-HA] DRBD 8.0 under Debian Etch?

2008-02-07 Thread Dominik Klein
Short question: Does anyone here have DRBD8 running with heartbeat under Etch? Short answer: Yes. Version 8.0.8, upgrading to 8.0.9 within the next days. I use the OCF RA to manage drbd as a Master/Slave Resource. Regards Dominik

Re: [Linux-HA] send_arp cisco Pix v7.2 = arp table not update

2008-01-22 Thread Dominik Klein
Thanks for your advice, unfortunately, that command is not included in the PIX :-(( ... I'm still on that point and I must confess that I have no clue at all... You could also modify the RA and have it set a virtual mac address on the interface. Be sure to set the original mac address

[Linux-HA] Re: DRBD with monitor Operations won't start - as soon as I delete the operations, it starts immediately

2008-01-04 Thread Dominik Klein
<operations> <op id="op-ms-drbd2-1" name="monitor" interval="60s" timeout="60s" start_delay="30s" role="Master"/> <op id="op-ms-drbd2-2" name="monitor" interval="60s" timeout="60s" start_delay="30s" role="Slave"/>

Re: [Linux-HA] external/ipmi example configuration

2008-01-04 Thread Dominik Klein
How can I test the stonith plugin eg. tell heartbeat to shoot someone? iptables -I INPUT -j DROP

Re: [Linux-HA] DBRD - split brain - and HA is happily migrating

2008-01-02 Thread Dominik Klein
Thanks for your help. It looks like everything works as desired: (postgres-02) [~] ifconfig eth1 down (postgres-02) [~] cat /proc/drbd version: 8.2.1 (api:86/proto:86-87) GIT-hash: 318925802fc2638479ad090b73d7af45503dd184 build by [EMAIL PROTECTED], 2007-12-29 17:37:25 0: cs:WFConnection

Re: [Linux-HA] DBRD - split brain - and HA is happily migrating

2008-01-01 Thread Dominik Klein
Thomas Glanzmann wrote: Hello, I have drbd (newest version; same goes for heartbeat) running as a master/slave ressource on the latest heart beat ressource and had the following problem. I had a split brain situation In DRBD or in the entire cluster? and heartbeat made it possible to

Re: [Linux-HA] DRBD - ext3 - IP - PostgreSQL Setup

2007-12-30 Thread Dominik Klein
Thomas Glanzmann wrote: Hello, I have the following setup: DRBD = ext3 = IPaddr2 = pgsql. I have the following configured: 00_README:# Ressourcen hinzufügen: 00_README: 00_README:cibadmin -o resources -C -x 01_drbd 00_README:cibadmin -o resources -C -x 02_filesystem 00_README:cibadmin -o

Re: [Linux-HA] DRBD Config

2007-12-21 Thread Dominik Klein
we are using a two node cluster master/slave with an openSuSE 10.3, heartbeat 2.0.7 and drbd 8.0.6. I tried the configuration from this webpage: http://www.linux-ha.org/DRBD/HowTov2 This should only be done with a recent version of heartbeat/crm. There have been major improvements on

Re: [Linux-HA] DRBD Config

2007-12-21 Thread Dominik Klein
2) DRBD8 is NOT supported from heartbeat. Please use DRBD0.7 I know the howto states so, but did you try it? Works for me ... Imho, the docs are outdated about that. Regards Dominik

Re: [Linux-HA] DRBD Config

2007-12-21 Thread Dominik Klein
Dec 20 12:57:49 mylogin1 drbd[7119]: [7131]: DEBUG: : Calling /sbin/drbdadm -c /etc/drbd.conf state Dec 20 12:57:49 mylogin1 drbd[7119]: [7134]: DEBUG: : Exit code 0 can you c/p what you get when you issue /sbin/drbdadm -c /etc/drbd.conf state by hand? That's a syntax error. The resource

Re: [Linux-HA] DRBD Config

2007-12-21 Thread Dominik Klein
Please see my thread from Oct 18th on this list and esp the answer from Andrew from Oct 22nd. I read that thread. There also was someone else who stated it's working for him. Can you confirm that It works? For LARGE partitions? Would be good news! The largest partition I manage with DRBD

Re: [Linux-HA] Relevance of STONITH with Xen over DRBD setup

2007-12-14 Thread Dominik Klein
I just want to confirm this. From what i've learned so far, STONITH is relevant only to avoid data corruption when using shared storage. So, is STONITH relevant when i'm using a non-shared setup with Heartbeat and XEN VM on top of DRBD? Xen is using file images created on top of a ext3 FS on top

Re: [Linux-HA] Ordering Resource Groups V2

2007-12-13 Thread Dominik Klein
Damon Estep wrote: I have created an order constraint that requires a DRBD/iSCSI target resource group to be up before an application resource groups comes up. At startup the order is honored, and the resource groups come up in the desired order. In the event of a failover in the

Re: [Linux-HA] Re: A question about the combined score

2007-12-12 Thread Dominik Klein
If the error occurs on the resource, then resource-failure-stickiness will come into play and make your scores: Node1: 10 - 10 = 0 Node2: 9 As 9 > 0, the resource will be started on Node2, and 22 stickiness will be added. So you have 31 > 0. In your comments, you remarked a failover should
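
The score walk-through above can be replayed as plain integer arithmetic. This is only a sketch of the numbers in the message (base scores 10 and 9, a failure penalty of 10, a stickiness of 22). The helper names are hypothetical, and keeping Node1 on a tie is an assumption of the sketch, not something the thread states.

```shell
# Score left on a node after one failure: base score minus the penalty.
score_after_failure() {
  echo $(( $1 - $2 ))
}

# Print the node with the higher score; a tie keeps Node1 (assumption).
preferred_node() {
  if [ "$2" -gt "$1" ]; then echo Node2; else echo Node1; fi
}

node1=$(score_after_failure 10 10)   # 0
node2=9
preferred_node "$node1" "$node2"     # prints Node2
echo $(( node2 + 22 ))               # prints 31, Node2's score once stickiness is added
```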

Re: [Linux-HA] default_resource_stickiness=INFINITY and default_resource_failure_stickiness=-INFINITY

2007-12-10 Thread Dominik Klein
I have created simple 2-node cluster with 4 drbd multi-state resources and xen DomU on it with enabled stonith and setting: Why don't you use drbd natively in xen? In your drbd installation, you should find a script named block-drbd. Copy that to /etc/xen/scripts and config your domU like:

Re: [Linux-HA] default_resource_stickiness=INFINITY and default_resource_failure_stickiness=-INFINITY

2007-12-10 Thread Dominik Klein
And to answer to your question: I have created simple 2-node cluster with 4 drbd multi-state resources and xen DomU on it with enabled stonith and setting: default_resource_stickiness=INFINITY and default_resource_failure_stickiness=-INFINITY. The idea for cluster working is: 1. promote all

Re: [Linux-HA] Pingd

2007-12-06 Thread Dominik Klein
China wrote: Ok, now it works, but when the PC_A returns up the resource doesn't remains on PC_B and failback to PC_A. How can I configure to switch the first time to PC_B on PC_A failover, but not return back if PC_A returns UP? Set resource stickiness to a reasonable value. Here's roughly

[Linux-HA] Possible bug in Score calculation?

2007-12-06 Thread Dominik Klein
Hi sorry I have to bother again about score calculation but I came across something I don't understand and that might be a bug. I have a master-slave drbd resource called ms-drbd (primitive is called drbd2) and a group named testdb (4 primitives, mount being the first primitive). pingd

Re: [Linux-HA] Pingd

2007-12-06 Thread Dominik Klein
With this configuration the resources doesn't failover to test, but remains on test-ppc. Why? I can't say. The configuration looks good to me. But again: What are you doing to force the failure? Do you really have just one connection between the nodes and unplug that connection to force the

Re: [Linux-HA] Pingd

2007-12-06 Thread Dominik Klein
China wrote: Sorry, I forgot it! I have two connections for the PCs: one with a crossover cable, where heartbeat sends packets directly to the other PC; one through the network, where the services listen and where pingd tests connectivity. When I force the failure I disconnect the network cable that give

Re: [Linux-HA] Pingd

2007-12-06 Thread Dominik Klein
But, It's good to use a interface both for heartbeat and for services? It's pretty common I think.

Re: [Linux-HA] Pingd

2007-12-06 Thread Dominik Klein
China wrote: Last question: how can I see what is the node's score during cluster execution? You can grep it out of the ptest output. Or use my script: http://lists.community.tummy.com/pipermail/linux-ha/2007-September/027488.html which has been updated by Robert Lindgren:

Re: [Linux-HA] Possible bug in Score calculation?

2007-12-06 Thread Dominik Klein
Good morning Andrew sorry I have to bother again about score calculation but I came across something I don't understand and that might be a bug. I have a master-slave drbd resource called ms-drbd (primitive is called drbd2) and a group named testdb (4 primitives, mount being the first

Re: [Linux-HA] Possible bug in Score calculation?

2007-12-06 Thread Dominik Klein
Just curious: I suppose its my first constraint that does this job? second - the colocation one Okay thanks - so sure I even got the 50/50 wrong :p Then I must ask another question: Why does this not apply to colocated primitives? I just tested with a single primitive (testdb) colocated

Re: [Linux-HA] 1000 extra score for a group?

2007-11-15 Thread Dominik Klein
I defined a rsc_location constraint with a rule with score 100 for a particular node for my HA-group. Resource stickiness is 200. Furthermore I use a rule with score_attribute=pingd (multiplier=100) for the group. With 5 available ping nodes this should make 100+200+500=800. But the score
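The expectation in this excerpt is a simple sum, sketched below in shell. The numbers (location score 100, stickiness 200, pingd multiplier 100, 5 ping nodes) come from the message; the function name `expected_group_score` is hypothetical. The sketch only reproduces what the poster expected, not what the cluster actually calculated.

```shell
#!/bin/sh
# Hypothetical helper: the score the poster expected for the group.
# Arguments: location_score stickiness pingd_multiplier reachable_ping_nodes
expected_group_score() {
    location=$1
    stickiness=$2
    multiplier=$3
    ping_nodes=$4
    # pingd contributes multiplier * number of reachable ping nodes.
    echo $(( location + stickiness + multiplier * ping_nodes ))
}

expected_group_score 100 200 100 5
```

This prints 800, the figure given in the message.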

Re: [Linux-HA] RFC: Change on OCF RA Filesystem's monitor action

2007-11-09 Thread Dominik Klein
Grepping by device doesn't work for mount-by-label at least, and requires a lot of escaping for networked mounts; so we thought grepping for the mountpoint was exactly the approach we needed to take. Never used mount-by-label. Good point though. I've got to admit I've never had someone use a

Re: [Linux-HA] stdout and stderr redirection in a resource agent

2007-11-08 Thread Dominik Klein
nohup $binfile $cmdline_options >> $logfile 2>> $errorlogfile ^^ ^^^ That binary is invoked from the RA, right? Sure. So neither stdout nor stderr should be said so Linux-HA should not log it. I'm just curious why it does. Well, despite your

[Linux-HA] RFC: Change on OCF RA Filesystem's monitor action

2007-11-07 Thread Dominik Klein
Hi I would like to suggest a change at the Filesystem RA. The monitor action actually does something like grep $MOUNTPOINT /etc/mtab. This does not work if you use a symbolic link as a mountpoint. If instead it would grep for $DEVICE (maybe with -w to avoid problems with +10 partitions on
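A minimal sketch of the suggested check, matching on the device rather than the mountpoint. The function name `device_in_mtab` and the optional second argument (an alternative mtab path, handy for testing) are hypothetical and not part of the Filesystem RA. `-w` restricts the match to whole words so that `/dev/sda1` does not match `/dev/sda10`, and `-F` treats the device name as a literal string; both are assumed to be available as in GNU grep.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the device appears as a whole word
# in the mtab file. Arguments: device [mtab_file]
device_in_mtab() {
    device=$1
    mtab=${2:-/etc/mtab}
    # -q: no output, exit status only
    # -w: whole-word match, so /dev/sda1 does not match /dev/sda10
    # -F: treat the device name as a fixed string, not a regex
    grep -qwF -- "$device" "$mtab"
}

# Example use inside a monitor action (illustrative only):
# if device_in_mtab "$DEVICE"; then echo "mounted"; fi
```

As the follow-up in the thread points out, matching on the device has limits of its own, for example with mount-by-label or networked mounts.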

[Linux-HA] stdout and stderr redirection in a resource agent

2007-11-07 Thread Dominik Klein
Hi ... again ... I wrote my own little RA to start a custom binary. Very basic RA up to now. I start my binary with nohup $binfile $cmdline_options >> $logfile 2>> $errorlogfile Works ok actually, the logfiles are filled as expected, but I also see some of the output in the Linux-HA log
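For reference, a sketch of the start command with both streams redirected explicitly. The archive appears to have dropped the redirection operators from the command; the sketch assumes they were the append forms `>>` and `2>>`. The function name `start_detached`, the backgrounding with `&` and the stdin redirect from `/dev/null` are additions for illustration and are not taken from the message. It does not explain why output still reached the Linux-HA log.

```shell
#!/bin/sh
# Hypothetical helper: start a binary detached, appending stdout and
# stderr to separate log files.
# Arguments: binfile logfile errorlogfile [cmdline options...]
start_detached() {
    binfile=$1
    logfile=$2
    errorlogfile=$3
    shift 3
    # The redirections apply to nohup and the command it runs:
    #   </dev/null  detach stdin (an assumption, not in the message)
    #   >>          append stdout to the log file
    #   2>>         append stderr to the error log file
    nohup "$binfile" "$@" </dev/null >>"$logfile" 2>>"$errorlogfile" &
}

# Example use (illustrative only):
# start_detached "$binfile" "$logfile" "$errorlogfile" $cmdline_options
```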

Re: [Linux-HA] stdout and stderr redirection in a resource agent

2007-11-07 Thread Dominik Klein
Dejan Muhamedagic wrote: Hi, On Wed, Nov 07, 2007 at 04:17:45PM +0100, Dominik Klein wrote: Hi ... again ... I wrote my own little RA to start a custom binary. Very basic RA up to now. I start my binary with nohup $binfile $cmdline_options >> $logfile 2>> $errorlogfile Works ok actually

[Linux-HA] Feedback: Master/Slave RA for Postgres / Slony Cluster?

2007-11-06 Thread Dominik Klein
Hi a week earlier I asked whether there was a resource agent that implements Master/Slave for a Postgres Cluster using slony-1 replication. There was not, so I tried to implement it myself. I want to report back to give an explanation and reference on why I think it is not possible (at the

Re: [Linux-HA] Feedback: Master/Slave RA for Postgres / Slony Cluster?

2007-11-06 Thread Dominik Klein
Hi Andrew thanks for your reply. So I thought I could implement demote as return 0, as promote on the other machine will do the job anyway. Well, not the best idea as a monitor action on the apparently demoted machine will still return Master Status until promote on the second machine
