[Linux-HA] Re: cat < /dev/ttyS0

2011-05-24 Thread Ulrich Windl
>>> Hai Tao wrote on 23.05.2011 at 22:59:

> This might not be too close to HA, but I am not sure if someone has seen 
> this before:
>  
> I use a serial cable between two nodes, and I am testing the heartbeat 
> link with:
>  
> server2$ cat < /dev/ttyS0
> server1$ echo hello > /dev/ttyS0
>  
> Instead of receiving "hello" on server2, I see garbled characters there. 
>  
> Does someone have an idea why I do not receive the "hello" in clear text?

Hi!

Definitely not the right forum here (for discussing serial line problems), but 
try "stty -a < /dev/ttyS0" [...]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
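
A mismatched line speed is one of the usual causes of garbage on a serial link, so it helps to compare the settings of both ends side by side. The sketch below assumes the output of "stty -a < /dev/ttyS0" has been saved to a file on each node; the function names are hypothetical, and it relies on GNU stty printing the speed first, as in "speed 9600 baud; ...".

```shell
#!/bin/sh
# Hypothetical helper, not part of stty: pull the line speed out of a saved
# "stty -a" dump. Assumes the GNU format, e.g. "speed 9600 baud; rows 0; ...".
stty_speed() {
    sed -n 's/^speed \([0-9][0-9]*\) baud.*/\1/p' "$1"
}

# Succeeds only if both dumps (one per node) record the same speed.
same_speed() {
    [ "$(stty_speed "$1")" = "$(stty_speed "$2")" ]
}

# Possible use, after running on each node:
#   stty -a < /dev/ttyS0 > /tmp/$(hostname).stty
# and copying both files to one machine:
#   same_speed server1.stty server2.stty || echo "baud rates differ"
```

The speed is only one of several settings that must match; parity, character size and flow control appear further down in the same dump and can be compared with a plain diff.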


Re: [Linux-HA] Colocation of VIP and httpd

2011-05-24 Thread Nikita Michalko
Hi,

any chance to update to version 3?
2.1.3 is really very old & buggy!


HTH


Nikita Michalko


On Thursday, 19 May 2011 19:25:54, 吴鸿宇 wrote:
> Hi All,
> 
> I have a 2 node cluster. My intention is ensuring the VIP is always on the
> node that has httpd running, i.e. if service httpd on the VIP node is
> stopped and fails to start, the VIP should switch to the other node.
> 
> With the configuration below, I observed that when httpd stops and fails to
> start, the VIP is stopped also but is not switched to the other node that
> has healthy httpd. I appreciate any ideas.
> 
> [CIB XML lost in the list archive. Surviving fragments: cib_feature_revision="2.0",
> epoch="28", version 2.1.3, cluster-delay="60s",
> default-resource-stickiness="INFINITY", two nodes of type "normal",
> and a constraint with score="INFINITY".]
> 
> Thanks a lot,
> Henry
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
> 
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] cat < /dev/ttyS0

2011-05-24 Thread Bart Coninckx
On 05/23/11 23:34, Lars Ellenberg wrote:
> On Mon, May 23, 2011 at 01:59:23PM -0700, Hai Tao wrote:
>> this might not be too close to HA, but I am not sure if someone has seem 
>> this before:
>>
>> I use a serial cable between two nodes, and I am testing the heartbeat with :
>>
>>
>> server2$ cat < /dev/ttyS0
>> server1$ echo hello > /dev/ttyS0
>>
>> instead of receiving "hello" on server2, I see some hashed code there.
>>
>> Does someone have an idea why I do not receive the "hello" in clear text?
> Mismatch of settings, especially baud, flow control, or similar.
> If you really want to do this manually, learn about stty.
>
> for starters, try, on both nodes,
> stty -F /dev/ttyS0
> and compare the output.
>
> Then try stty 115200 cs8 -F /dev/ttyS0
> and add whatever else you need to get useful settings.
>
> You also need a "null modem cable", usually,
> not just any serial cable.
>
> Note that, in contrast to the "haresources" mode, with Pacemaker,
> not only small heartbeats are exchanged, but larger stringified
> XML, occasionally even the whole CIB, inclusive configuration and
> status sections.
>
> Which, with a few resources, even when compressed, can reach an
> "unexpected" volume.
>
> Consider the transfer time of even only 10kByte on a serial port
> connection.  (Yes, that's ~one second, on a fast port!).
>
> You want the highest possible stable baud rate, the smallest
> possible pacemaker configuration, and timeouts that take this into
> account.
>
> In my experience, boxes that have high volume serial port activity
> can feel very sluggish in all aspects.
>
> For non-haresources clusters, we recommend against serial
> communication paths. We also recommend against haresources
> clusters, unless that really is all you want and need.
>
> So probably just forget about serial communication paths,
> but use all available physically independent network links,
> then add another two ;-)
>
>
Just out of curiosity: why the recommendation against serial 
communication (as backup medium for instance) in non-haresources clusters?

thx,

B.




Re: [Linux-HA] Colocation of VIP and httpd

2011-05-24 Thread RaSca
On Thu 19 May 2011 19:25:54 CET, 吴鸿宇 wrote:
> Hi All,
> I have a 2 node cluster. My intention is ensuring the VIP is always on the
> node that has httpd running, i.e. if service httpd on the VIP node is
> stopped and fails to start, the VIP should switch to the other node.
> With the configuration below, I observed that when httpd stops and fails to
> start, the VIP is stopped also but is not switched to the other node that
> has healthy httpd. I appreciate any ideas.
[...]

Some questions:
Why is httpd cloned? Are you sure you want an INFINITY stickiness? Do the 
logs say anything helpful?

Anyway, as Nikita said, consider upgrading Heartbeat to version 3.

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org


Re: [Linux-HA] Status about the four stack options

2011-05-24 Thread alain . moulle
Hi
Many thanks for this status. I suppose the status is the same on RHEL6, as 
SUSE is likely to be ahead of RHEL6 with regard to Pacemaker & corosync 
developments?

Alain



From:    Lars Marowsky-Bree 
To:      General Linux-HA mailing list 
Date:    23/05/2011 13:10
Subject: Re: [Linux-HA] Status about the four stack options
Sent by: linux-ha-boun...@lists.linux-ha.org



On 2011-05-23T10:49:16, alain.mou...@bull.net wrote:

> Hi
> 
> I just wonder the status of the 4 stack options :
> from which releases of Pacemaker & corosync are the 3 and 4 options 
> available ? 
> and on which Distribution ? RHEL6 ? 
> 
> 1.  corosync + pacemaker plugin (v0) 

This is what SUSE Linux Enterprise High-Availability Extension uses, and
is fully supported.

> 2.  corosync + pacemaker plugin (v1) + mcp

We may switch to this at a later time during the SLE HA cycle.

> 3.  corosync + cpg + cman + mcp
> 4.  corosync + cpg + quorumd + mcp

On SLE, we're bound to skip 3), but 4) is probably somewhere in the far
future, once it is fully stabilized and integrated.


Regards,
Lars

-- 
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix 
Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



[Linux-HA] Hetzner server stonith agent

2011-05-24 Thread RaSca

Hi all,
as some of you saw over the last two weeks, I ran into some problems 
configuring a Corosync/Pacemaker cluster on two Hetzner servers.


The main problem with those cheap and very powerful servers is their 
network management. For example, if you want a failover IP you have to 
manage it through the web interface or via a webservice; there's no 
other way.
Luckily, the guys from Kumina 
(http://blog.kumina.nl/2011/02/hetzner-failover-ip-ocf-script/) wrote an 
OCF resource agent that automates the management of the IP, so the last 
(but not least) problem was STONITH.


By Hetzner's design, the only way to force a reset of the machine 
is... via the same webservice. I know, it's odd, but in this case too 
it is the only way. So, following these directions: 
http://wiki.hetzner.de/index.php/Robot_Webservice_en I wrote the stonith 
agent that is attached to this email.

It uses the same configuration file as Kumina's OCF agent:

# cat /etc/hetzner.cfg
[dummy]
user = 
pass = 
local_ip = 

It needs two parameters, the "hostname" and its related "remote_ip", 
for example:


primitive stonith_hserver-1 stonith:external/hetzner \
params hostname="hserver-1" remote_ip="X.Y.Z.G" \
op start interval="0" timeout="60s"

First of all, it works. The system is able to fence nodes both on 
split brain and manually, so I can say it is OK. But it is the first 
stonith agent I have written, so it may need some corrections.


Hope this can help someone. Thanks to andreask, who helped me on IRC to 
understand how stonith agents work.


--
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org
#!/bin/sh
#
# External STONITH module for Hetzner.
#
# Copyright (c) 2011 MMUL S.a.S. - Raoul Scarazzini 
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#

# Read parameters
conf_file="/etc/hetzner.cfg"
user=`cat /etc/hetzner.cfg | egrep "^user.*=" | sed 's/^user.*=\ *//g'`
pass=`cat /etc/hetzner.cfg | egrep "^pass.*=" | sed 's/^pass.*=\ *//g'`
hetzner_server="https://robot-ws.your-server.de"

# Succeeds (returns 0) only if the webservice reports server $1 as "ready"
is_host_up() {
  if [ "$1" != "" ]
   then
    status=`curl -s -u $user:$pass $hetzner_server/server/$1 | sed 's/.*status\":"\([A-Za-z]*\)",.*/\1/g'`
    if [ "$status" = "ready" ]
     then
      return 0
     else
      return 1
    fi
   else
    return 1
  fi
}

case $1 in
gethosts)
    echo $hostname
    exit 0
    ;;
on)
    # Can't really be implemented because Hetzner webservice cannot power on a system
    exit 1
    ;;
off)
    # Can't really be implemented because Hetzner webservice cannot power off a system
    exit 1
    ;;
reset)
    status=`curl -s -u $user:$pass $hetzner_server/reset/$remote_ip -d type=hw`
    if [ "$status" = "" ]
     then
      exit 1
     else
      # The reset counts as done once the host is no longer reported "ready"
      if is_host_up "$remote_ip"
       then
        exit 1
       else
        exit 0
      fi
    fi
    exit 1
    ;;
status)
    if [ "$remote_ip" != "" ]
     then
      if is_host_up "$remote_ip"
       then
        exit 0
       else
        exit 1
      fi
     else
      # Check if we can contact the server
      status=`curl -s -u $user:$pass $hetzner_server/server/`
      if [ "$status" = "" ]
       then
        exit 1
       else
        exit 0
      fi
    fi
    ;;
getconfignames)
    # Both parameters used above must be listed here
    echo "hostname remote_ip"
    exit 0
    ;;
getinfo-devid)
    echo "Hetzner STONITH device"
    exit 0
    ;;
getinfo-devname)
    echo "Hetzner STONITH external device"
    exit 0
    ;;
getinfo-devdescr)
    echo "Hetzner host reset"
    echo "Manages the remote webservice for resetting a remote server."
    exit 0
    ;;
getinfo-devurl)
    echo "http://wiki.hetzner.de/index.php/Robot_Webservice_en"
    exit 0
    ;;
getinfo-xml)
cat <<

Re: [Linux-HA] cat < /dev/ttyS0

2011-05-24 Thread Dejan Muhamedagic
Hi,

On Tue, May 24, 2011 at 09:43:50AM +0200, Bart Coninckx wrote:
> On 05/23/11 23:34, Lars Ellenberg wrote:
> > On Mon, May 23, 2011 at 01:59:23PM -0700, Hai Tao wrote:
> >> this might not be too close to HA, but I am not sure if someone has seem 
> >> this before:
> >>
> >> I use a serial cable between two nodes, and I am testing the heartbeat 
> >> with :
> >>
> >>
> >> server2$ cat < /dev/ttyS0
> >> server1$ echo hello > /dev/ttyS0
> >>
> >> instead of receiving "hello" on server2, I see some hashed code there.
> >>
> >> Does someone have an idea why I do not receive the "hello" in clear text?
> > Mismatch of settings, especially baud, flow control, or similar.
> > If you really want to do this manually, learn about stty.
> >
> > for starters, try, on both nodes,
> > stty -F /dev/ttyS0
> > and compare the output.
> >
> > Then try stty 115200 cs8 -F /dev/ttyS0
> > and add whatever else you need to get useful settings.
> >
> > You also need a "null modem cable", usually,
> > not just any serial cable.
> >
> > Note that, in contrast to the "haresources" mode, with Pacemaker,
> > not only small heartbeats are exchanged, but larger stringified
> > XML, occasionally even the whole CIB, inclusive configuration and
> > status sections.
> >
> > Which, with a few resources, even when compressed, can reach an
> > "unexpected" volume.
> >
> > Consider the transfer time of even only 10kByte on a serial port
> > connection.  (Yes, that's ~one second, on a fast port!).
> >
> > You want the highest possible stable baud rate, the smallest
> > possible pacemaker configuration, and timeouts that take this into
> > account.
> >
> > In my experience, boxes that have high volume serial port activity
> > can feel very sluggish in all aspects.
> >
> > For non-haresources clusters, we recommend against serial
> > communication paths. We also recommend against haresources
> > clusters, unless that really is all you want and need.
> >
> > So probably just forget about serial communication paths,
> > but use all available physically independent network links,
> > then add another two ;-)
> >
> >
> Just out of curiosity: why the recommendation against serial 
> communication (as backup medium for instance) in non-haresources clusters?

Quoting Lars again:

> > Note that, in contrast to the "haresources" mode, with Pacemaker,
> > not only small heartbeats are exchanged, but larger stringified
> > XML, occasionally even the whole CIB, inclusive configuration and
> > status sections.
> >
> > Which, with a few resources, even when compressed, can reach an
> > "unexpected" volume.

Heartbeat always uses all media to transmit messages, i.e. you
cannot have serial as passive interconnect. But even if that were
possible it wouldn't help much with big configurations.

Thanks,

Dejan
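
To put a number on the "~one second" quoted above: assuming the common 8N1 framing, every byte costs 10 bits on the wire (1 start, 8 data, 1 stop), so a line moves baud/10 bytes per second at best. The helper below is a hypothetical sketch of that arithmetic, nothing heartbeat itself provides.

```shell
#!/bin/sh
# Hypothetical helper: milliseconds needed to push a payload through a serial
# line, assuming 8N1 framing (10 bits on the wire per byte) and no overhead.
serial_tx_ms() {
    bytes=$1
    baud=$2
    echo $(( bytes * 10 * 1000 / baud ))
}

serial_tx_ms 10240 115200   # 10 kByte on a fast port: prints 888
serial_tx_ms 10240 19200    # same payload on a slower line: prints 5333
```

So one 10 kByte message occupies even a 115200 baud link for roughly 0.9 seconds, and a slower line for several seconds, which is the reason timeouts have to be sized with the configuration volume in mind.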

> thx,
> 
> B.
> 
> 


Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dejan Muhamedagic
Hi,

On Mon, May 23, 2011 at 03:18:37PM -0700, Nulgor Wankevitch wrote:
> hi,
> 
> heartbeat seems to be send udp on port 694 to the whole network segment, 

Do you use ucast or bcast? With the latter, which is broadcast,
that is of course expected. If it happens with the former, then you
must have gremlins in your network.

Thanks,

Dejan

> not just the link host, and
> getting blocked by firewall, how to limit?
> 
> Firewall: *UDP_IN Blocked* IN=eth0 OUT= 
> MAC=ff:ff:ff:ff:ff:ff:00:22:19:21:f1:75:08:00 SRC=192.168.1.190 
> DST=192.168.1.255 LEN=246 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP 
> SPT=42414 DPT=694 LEN=226
> 
> any help thnk you,
> nulgor
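
For reference, the difference between the two modes comes down to which directive ha.cf uses for the link: "bcast eth0" sends to the broadcast address seen in the firewall log, while "ucast" names one peer explicitly. The sketch below only prints candidate ha.cf lines; the function name is hypothetical, 192.168.1.190 is the sender from the log, and 192.168.1.191 is a made-up address for the second node.

```shell
#!/bin/sh
# Hypothetical helper: print ha.cf communication directives that use unicast
# instead of broadcast. "udpport" and "ucast" are heartbeat ha.cf directives;
# the interface and peer addresses are whatever the caller passes in.
ha_ucast_lines() {
    dev=$1
    shift
    echo "udpport 694"
    for peer in "$@"; do
        echo "ucast $dev $peer"
    done
}

# 192.168.1.190 appears in the firewall log; 192.168.1.191 is an assumption.
ha_ucast_lines eth0 192.168.1.190 192.168.1.191
```

Listing both nodes lets the same file be used on each of them; as far as the ha.cf documentation goes, a ucast entry that points at the local machine is ignored, but that is worth verifying against the installed version.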


Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
Hi,

Thanks for the reply. When I use ucast things do not seem to work: the nodes are
able to bring up the VIP but not any services. When using bcast things seem to
work correctly, but there is that broadcast problem. I would like to firewall the
broadcast and isolate it to the local machine and the 2nd node; however, I do not
want to cause additional problems. Please advise, thanks.

nulgor

On 5/24/2011 1:52 AM, Dejan Muhamedagic wrote:
> Hi,
>
> On Mon, May 23, 2011 at 03:18:37PM -0700, Nulgor Wankevitch wrote:
>> hi,
>>
>> heartbeat seems to be send udp on port 694 to the whole network segment,
> Do you use ucast or bcast? With the latter, which is broadcast
> it's of course expected. If it happens with the former, then you
> must have gremlins in your network.
>
> Thanks,
>
> Dejan
>
>> not just the link host, and
>> getting blocked by firewall, how to limit?
>>
>> Firewall: *UDP_IN Blocked* IN=eth0 OUT=
>> MAC=ff:ff:ff:ff:ff:ff:00:22:19:21:f1:75:08:00 SRC=192.168.1.190
>> DST=192.168.1.255 LEN=246 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP
>> SPT=42414 DPT=694 LEN=226
>>
>> any help thnk you,
>> nulgor


Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dejan Muhamedagic
Hi,

On Tue, May 24, 2011 at 02:12:12AM -0700, Nulgor Wankevitch wrote:
> Hi,
> 
> thnk for reply, when use ucast things do not seem to work, the nodes are 
> able
> to bring up the VIP but not any services. When using bcast things seem 
> to work correctly

Wow! You really do have gremlins somewhere. ucast cannot fail
in the way you described. Either the nodes can communicate or
they can't. Did you set the right IP address of the peer? Otherwise
there must be some kind of network setup issue.

Thanks,

Dejan

> but there is that broadcast problem, I would like to firewall the 
> broadcast and isolate
> it to the local machine and 2nd node however I do not want to cause 
> additional problems,
> please advise, thks.
> 
> nulgor
> 
> On 5/24/2011 1:52 AM, Dejan Muhamedagic wrote:
> > Hi,
> >
> > On Mon, May 23, 2011 at 03:18:37PM -0700, Nulgor Wankevitch wrote:
> >> hi,
> >>
> >> heartbeat seems to be send udp on port 694 to the whole network segment,
> > Do you use ucast or bcast? With the latter, which is broadcast
> > it's of course expected. If it happens with the former, then you
> > must have gremlins in your network.
> >
> > Thanks,
> >
> > Dejan
> >
> >> not just the link host, and
> >> getting blocked by firewall, how to limit?
> >>
> >> Firewall: *UDP_IN Blocked* IN=eth0 OUT=
> >> MAC=ff:ff:ff:ff:ff:ff:00:22:19:21:f1:75:08:00 SRC=192.168.1.190
> >> DST=192.168.1.255 LEN=246 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP
> >> SPT=42414 DPT=694 LEN=226
> >>
> >> any help thnk you,
> >> nulgor


Re: [Linux-HA] Hetzner server stonith agent

2011-05-24 Thread Dejan Muhamedagic
Hi,

On Tue, May 24, 2011 at 10:08:38AM +0200, RaSca wrote:
> Hi all,
> as some of you saw in the last two weeks I've faced some problems in
> configuring a Corosync/Pacemaker cluster on two Hetzner server.
> 
> The main problem about those cheap and very powerful servers is
> their network management. For example, if you want to have a
> failover IP you need to manage it by the web interface or via a
> webservice, there's no other way.
> Luckily, the guys from Kumina
> (http://blog.kumina.nl/2011/02/hetzner-failover-ip-ocf-script/)
> wrote an ocf resource agent that automates the management of the IP
> so the last (but not least) problem was the Stonith.
> 
> In the intention of Hetzner the only way you have to force a reset
> of the machine is... via the same webserver. I know, it's odd, but
> also in this case it is the only way. So, following those
> directives: http://wiki.hetzner.de/index.php/Robot_Webservice_en I
> wrote the stonith agent that is attached to this email.
> It is based upon the same configuration file of the Kumina's ocf:
> 
> # cat /etc/hetzner.cfg
> [dummy]
> user = 
> pass = 
> local_ip = 
> 
> And it needs two parameters: the "hostname" and it's related
> "remote_ip", for example:
> 
> primitive stonith_hserver-1 stonith:external/hetzner \
>   params hostname="hserver-1" remote_ip="X.Y.Z.G" \
>   op start interval="0" timeout="60s
> 
> First of all, it works. The system is able to fence nodes in case of
> split brain and manually, so I can say it is ok. But it is the first
> stonith agent that I wrote, so it may need some corrections.
> 
> Hope this can help someone. Thanks to andreask who helped me on irc
> in understanding how stonith agents works.
> 
> -- 
> RaSca
> Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
> ra...@miamammausalinux.org
> http://www.miamammausalinux.org

> #!/bin/sh
> #
> # External STONITH module for Hetzner.
> #
> # Copyright (c) 2011 MMUL S.a.S. - Raoul Scarazzini 
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like.  Any license provided herein, whether implied or
> # otherwise, applies only to this software file.  Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
> 
> # Read parameters
> conf_file="/etc/hetzner.cfg"
> user=`cat /etc/hetzner.cfg | egrep "^user.*=" | sed 's/^user.*=\ *//g'`

Better:

user=`sed -n 's/^user.*=\ *//p' /etc/hetzner.cfg`

> pass=`cat /etc/hetzner.cfg | egrep "^pass.*=" | sed 's/^pass.*=\ *//g'`
> hetzner_server="https://robot-ws.your-server.de";

I assume that this is a well-known URL which doesn't need to be
passed as a parameter.

> is_host_up() {
>   if [ "$1" != "" ]
>then
> status=`curl -s -u $user:$pass $hetzner_server/server/$1 | sed 
> 's/.*status\":"\([A-Za-z]*\)",.*/\1/g'`
> if [ "$status" = "ready" ]
>  then
>   return 0
>  else
>   return 1
> fi

This if statement can be reduced to (you save 5 lines):

 [ "$status" = "ready" ]

>else
> return 1
>   fi
> }
> 
> case $1 in
> gethosts)
> echo $hostname
>   exit 0
>   ;;
> on)
>   # Can't really be implemented because Hetzner webservice cannot power 
> on a system
>   exit 1
>   ;;
> off)
>   # Can't really be implemented because Hetzner webservice cannot power 
> on a system
>   exit 1
>   ;;
> reset)
> status=`curl -s -u $user:$pass $hetzner_server/reset/$remote_ip -d 
> type=hw`
> if [ "$status" = "" ]
>  then
>   exit 1
>  else
>   if is_host_up "$hostaddress"
>then
> exit 1
>else
> exit 0
>   fi

Again, better (is return code of is_host_up inverted?):
 is_host_up "$hostaddress"
 exit # this is actually also superfluous, but perhaps better left in


> fi
>   exit 1
>   ;;
> status)
> if [ "$remote_ip" != "" ]
>  then
>   if is_host_up "$remote_ip"
>then
>   exit 0
>else
> exit 1
>   fi

Ditto.

>  else
>   # Check if we can contact the server
>   status=`curl -

[Linux-HA] Incomplete check in start method of ocf:heartbeat:LVM

2011-05-24 Thread Ulrich Windl
Hello boys (and girls?)!

Building my resources incrementally made me debug the LVM RA: syslog said 
nothing, but the resource wouldn't start.
As it turned out, the check whether a VG was activated is wrong: "vgchange -a" 
seems to report an error if a VG without LVs was activated (it's completely OK 
to have a VG without LVs, but you cannot have a VG without PVs). Unfortunately 
"0 LVs" seems to be treated as "VG not active". It seems more correct to inspect 
the output of "vgs". Here's my output:
---snip---
# OCF_ROOT=/usr/lib/ocf OCF_RESKEY_volgrpname=T11_DB_FATA 
/usr/lib/ocf/resource.d/heartbeat/LVM start
LVM[31647]: INFO: Activating volume group T11_DB_FATA
LVM[31647]: INFO: Reading all physical volumes. This may take a while... Found 
volume group "T11_BD_BTD" using metadata type lvm2 Found volume group "sys" 
using metadata type lvm2 Found volume group "T11_CI" using metadata type lvm2 
Found volume group "T11_ERS" using metadata type lvm2 Found volume group 
"T11_ASCS" using metadata type lvm2 Found volume group "T11_DB_FATA" using 
metadata type lvm2 Found volume group "T11_DB_10k" using metadata type lvm2
LVM[31647]: INFO: 0 logical volume(s) in volume group "T11_DB_FATA" now active
LVM[31647]: ERROR: LVM: T11_DB_FATA did not activate correctly
---snip end---

On the comment "  # TODO: This MUST run vgimport as well": I doubt that; it 
seems dangerous!

Regards,
Ulrich
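
To illustrate the point about "0 LVs": a check that only asks whether vgchange reported on the volume group, and accepts any count including zero, would not trip over an empty VG. The sketch below is hypothetical and is not the RA's code; it parses the vgchange message instead of calling "vgs", and it assumes the message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical check, not taken from the LVM RA: succeed if the given vgchange
# output reports on volume group $1, whatever the number of LVs (0 included).
# Assumes the message format seen in the log above.
vg_activation_reported() {
    vg=$1
    out=$2
    printf '%s\n' "$out" |
        grep -q "[0-9][0-9]* logical volume(s) in volume group \"$vg\" now active"
}

# Possible use (the vgchange invocation is an assumption, not the RA's):
#   out=$(vgchange -a y "$OCF_RESKEY_volgrpname" 2>&1)
#   vg_activation_reported "$OCF_RESKEY_volgrpname" "$out" || echo "not activated"
```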




Re: [Linux-HA] Hetzner server stonith agent

2011-05-24 Thread RaSca
On Tue 24 May 2011 12:27:04 CET, Dejan Muhamedagic wrote:
> Hi,

Hi Dejan,

[...]
>> # Read parameters
>> conf_file="/etc/hetzner.cfg"
>> user=`cat /etc/hetzner.cfg | egrep "^user.*=" | sed 's/^user.*=\ *//g'`
> Better:
> user=`sed -n 's/^user.*=\ *//p' /etc/hetzner.cfg`

Absolutely agree.
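
A small sketch of the single-sed approach, wrapped in a function so it can be tried against a throwaway file. The name cfg_get is hypothetical, and the pattern is slightly tighter than the one quoted above (an assumption on my side): it only allows whitespace between the key and "=", so a value that itself contains "=" survives intact.

```shell
#!/bin/sh
# Hypothetical helper: print the value of key $1 from an INI-style file $2,
# e.g. /etc/hetzner.cfg. Lines that do not match print nothing (sed -n ... p).
cfg_get() {
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2"
}

# Possible use in the agent:
#   user=$(cfg_get user /etc/hetzner.cfg)
#   pass=$(cfg_get pass /etc/hetzner.cfg)
```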

>> pass=`cat /etc/hetzner.cfg | egrep "^pass.*=" | sed 's/^pass.*=\ *//g'`
>> hetzner_server="https://robot-ws.your-server.de";
> I assume that this is a well-known URL which doesn't need to be
> passed as a parameter.

As far as I know it is the only address, I hard-coded it for this 
reason, but maybe should be a parameter...

>> is_host_up() {
>>if [ "$1" != "" ]
>> then
>>  status=`curl -s -u $user:$pass $hetzner_server/server/$1 | sed 
>> 's/.*status\":"\([A-Za-z]*\)",.*/\1/g'`
>>  if [ "$status" = "ready" ]
>>   then
>>return 0
>>   else
>>return 1
>>  fi
> This if statement can be reduced to (you save 5 lines):
>   [ "$status" = "ready" ]
>> else
>>  return 1
>>fi
>> }

You mean the statement should be:

[ "$status" = "ready" ] && return 0
return 1

?
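
On that question: a shell function returns the exit status of the last command it ran, so ending the function with the bare test is enough and no explicit return is needed. A minimal sketch with hypothetical function names, showing that both spellings behave the same:

```shell
#!/bin/sh
# Compact form: the function's return code is the exit status of the test.
is_ready() {
    [ "$1" = "ready" ]
}

# Long form with explicit returns; same result, more lines.
is_ready_long() {
    [ "$1" = "ready" ] && return 0
    return 1
}
```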

[...]
> Again, better (is return code of is_host_up inverted?):
>   is_host_up "$hostaddress"
>exit # this is actually also superfluous, but perhaps better left in

The action is reset, so if I had success then is_host_up must be NOT 
ready. Or not?
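
If that reading is right, the compact form for reset needs a negation, because success of the reset is the opposite of "host is ready". A sketch under that assumption; host_status is a stub standing in for the curl call, and all names other than is_host_up are hypothetical.

```shell
#!/bin/sh
# Stub standing in for the webservice query; a real agent would call curl here.
host_status() {
    echo "$FAKE_STATUS"
}

is_host_up() {
    [ "$(host_status "$1")" = "ready" ]
}

# After a reset request, success means the host is NOT reported as "ready".
reset_ok() {
    ! is_host_up "$1"
}
```

In the agent the reset branch could then end with "reset_ok "$remote_ip"; exit", keeping the compact style without inverting the meaning of is_host_up itself.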

[...]
> Ditto.
> Good work!
> Cheers,
> Dejan
> P.S. Moving discussion to linux-ha-dev.

If the compact way is correct, I can modify the script and post it again.

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org



Re: [Linux-HA] Hetzner server stonith agent

2011-05-24 Thread RaSca
On Tue 24 May 2011 12:44:42 CET, RaSca wrote:
> On Tue 24 May 2011 12:27:04 CET, Dejan Muhamedagic wrote:
[...]
>> P.S. Moving discussion to linux-ha-dev.
[...]

Sorry... I removed the wrong address and posted again on linux-ha :-(

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
ra...@miamammausalinux.org
http://www.miamammausalinux.org



Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
ya, gremlins, very reassuring, thanks.

On 5/24/2011 2:42 AM, Dejan Muhamedagic wrote:
> Hi,
>
> On Tue, May 24, 2011 at 02:12:12AM -0700, Nulgor Wankevitch wrote:
>> Hi,
>>
>> thnk for reply, when use ucast things do not seem to work, the nodes are
>> able
>> to bring up the VIP but not any services. When using bcast things seem
>> to work correctly
> Wow! You really do have gremlins somewhere. ucast cannot not work
> in the way you described. Either the nodes can communicate or
> they can't. Did you set the right IP address of the peer? Or
> there must be some kind of network setup issue.
>
> Thanks,
>
> Dejan
>
>> but there is that broadcast problem, I would like to firewall the
>> broadcast and isolate
>> it to the local machine and 2nd node however I do not want to cause
>> additional problems,
>> please advise, thks.
>>
>> nulgor
>>
>> On 5/24/2011 1:52 AM, Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Mon, May 23, 2011 at 03:18:37PM -0700, Nulgor Wankevitch wrote:
>>>> hi,
>>>>
>>>> heartbeat seems to be send udp on port 694 to the whole network segment,
>>> Do you use ucast or bcast? With the latter, which is broadcast
>>> it's of course expected. If it happens with the former, then you
>>> must have gremlins in your network.
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>> not just the link host, and
>>>> getting blocked by firewall, how to limit?
>>>>
>>>> Firewall: *UDP_IN Blocked* IN=eth0 OUT=
>>>> MAC=ff:ff:ff:ff:ff:ff:00:22:19:21:f1:75:08:00 SRC=192.168.1.190
>>>> DST=192.168.1.255 LEN=246 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP
>>>> SPT=42414 DPT=694 LEN=226
>>>>
>>>> any help thnk you,
>>>> nulgor


Re: [Linux-HA] ocfs2

2011-05-24 Thread Eric Warnke

Fedora 14 lacks dlm-pcmk since it has been deprecated.  Really
frustrating, as whatprovides shows a file but yum install says "nothing to
do" without installing.  Most of the existing "quick start" docs are
therefore inapplicable, as they presume that you have dlm_controld.pcmk.

https://www.redhat.com/archives/linux-cluster/2011-March/msg00084.html

Overall I went back over a number of steps and found some errors and was
able to get it up and running.


1) Went back over and reconfigured cman + pacemaker without corosync
2) Since I had presumed that I would be integrating with pacemaker I had
failed to install the ocfs2-tools-cman package
3) Somewhere along the way I set up the cluster.conf with the full hostname
leading to all sorts of fun with pacemaker listing 6 nodes rather than
three.  "crm configure erase nodes" was able to clear that up once the
cluster.conf files were stable.
4) Once those two things were stable I was able to bring up o2cb and ocfs2
clones under pacemaker ( my understanding is dlm is already up thanks to
cman ).
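
As a rough illustration of the symptom in step 3, presumably each host showed up once under its short name and once under its full hostname. A minimal sketch for spotting such pairs in a list of node names follows. dup_short_names is a hypothetical helper and the node names are made up; how you obtain the list of names from your cluster is left open.

```shell
# Print short host names that occur more than once after stripping the domain.
# Reads one node name per line on stdin. dup_short_names is a hypothetical helper.
dup_short_names() {
    sed 's/\..*$//' | sort | uniq -d
}

# Hypothetical node list mixing short and fully qualified names.
printf '%s\n' node1 node1.example.com node2 node3 | dup_short_names
```

Any name this prints is a candidate for a node that is registered twice.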

At this point I'll probably have to take a step back and try rebuilding
this cluster to make sure I have the flow right.

Am I correct in presuming that, short of membership and quorum in cman,
pacemaker is where I configure STONITH and obviously all services?

Cheers,
Eric


On 5/23/11 4:18 PM, "asimonell...@gmail.com" 
wrote:

>I found the following link extremely useful for setting up an OCFS with
>OpenAIS/Corosync:
>
>http://www.novell.com/documentation/sle_ha
>
>-Anthony
>--Original Message--
>From: Eric Warnke
>Sender: linux-ha-boun...@lists.linux-ha.org
>To: Linux-HA mailing list
>ReplyTo: General Linux-HA mailing list
>Subject: [Linux-HA] ocfs2
>Sent: May 23, 2011 3:13 PM
>
>
>I have been chasing my tail all day trying to get a simple 3 node cluster
>to mount an ocfs2 filesystem over iscsi on Fedora 14.  Up until this morning
>it was working wonderfully for testing HA NFSv4 where the filesystems were
>non-clustered xfs volumes.
>
>Is there any useful documentation for converting a simple corosync +
>pacemaker installation to being able to mount an ocfs2 filesystem?
>
>-Eric


Re: [Linux-HA] Antw: Re: ocf:heartbeat:exportfs

2011-05-24 Thread Warnke, Eric E


On 5/23/11 10:45 AM, "Ulrich Windl" 
wrote:

>Eventually netgroups would be a solution once we had an LDAP server up
>and running to provide the netgroups consistently. We are changing about
>everything, so I have a hard time to find out where to start ;-)
>
>Thanks for the feedback.
>
>Regards,
>Ulrich


On our existing non-HA fileserver we just have a /etc/netgroup file
outside of our LDAP authentication system ( since we don't run the LDAP
directly ).  So we will probably end up with 1+3 unless 2 gets
implemented.  You may want to start there and then try getting it working
under LDAP.

It's dirt simple to set up...

groupname (host1,,) (host2,,) (host3,,) 

And then your exportfs client would be @groupname
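
A minimal sketch of the netgroup format described above, assuming placeholder host names and a made-up group name. netgroup_line is a hypothetical helper that only prints the line; it does not write to /etc/netgroup.

```shell
# Build one /etc/netgroup line: "group (host1,,) (host2,,) ..."
# netgroup_line is a hypothetical helper; group and host names are placeholders.
netgroup_line() {
    group=$1
    shift
    entry=$group
    for host in "$@"; do
        # each member is a (host,user,domain) triple; user and domain stay empty
        entry="$entry ($host,,)"
    done
    printf '%s\n' "$entry"
}

netgroup_line nfsclients host1 host2 host3
# An /etc/exports entry would then refer to the group with a leading "@", e.g.
#   /srv/export  @nfsclients(rw,sync)
```

The export path and options in the comment are examples only; the same "@nfsclients" string is what would go into the client specification of the exportfs resource.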


Cheers,
Eric



Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dimitri Maziuk
On 05/24/2011 05:48 AM, Nulgor Wankevitch wrote:
> ya, gremlins, very reassuring, thanks.

If the broadcast packets from host A are seen by host B, and unicast
packets from host A to host B are not seen by host B, then your universe
is governed by laws of physics we here are completely unfamiliar with.
Sometimes we call them "gremlins".

HTH
Dima
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu




Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
I think you guys might have jumped the gun on me, why would you
assume it is "not seen"? I reported it will bring up the VIP but not
the services.

nulgor

On 5/24/2011 9:37 AM, Dimitri Maziuk wrote:
> On 05/24/2011 05:48 AM, Nulgor Wankevitch wrote:
>> ya, gremlins, very reassuring, thanks.
> If the broadcast packets from host A are seen by host B, and unicast
> packets from host A to host B are not seen by host B, then your universe
> is governed by laws of physics we here are completely unfamiliar with.
> Sometimes we call them "gremlins".
>
> HTH
> Dima


Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Dimitri Maziuk
On 05/24/2011 02:56 PM, Nulgor Wankevitch wrote:
> I think you guys might have jumped the gun on me, why would you
> assume it is "not seen"? I reported it will bring up the VIP but not
> the services.

The only way I can vaguely imagine that possibly happening is if cib
isn't propagated to the other node(s) due to, indeed, a problem with
comms channel. However, I can think of only one way to make that happen
over unicast but not broadcast: unicasting to a wrong host.

Dima
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu




Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Nulgor Wankevitch
it seems like cib is on both nodes as I am able to view both from crm_mon
and crm configure show shows the same info, am I correct?

On 5/24/2011 2:02 PM, Dimitri Maziuk wrote:
> On 05/24/2011 02:56 PM, Nulgor Wankevitch wrote:
>> I think you guys might have jumped the gun on me, why would you
>> assume it is "not seen"? I reported it will bring up the VIP but not
>> the services.
> The only way I can vaguely imagine that possibly happening is if cib
> isn't propagated to the other node(s) due to, indeed, a problem with
> comms channel. However, I can think of only one way to make that happen
> over unicast but not broadcast: unicasting to a wrong host.
>
> Dima


Re: [Linux-HA] heartbeat sends udp to whole network

2011-05-24 Thread Lars Ellenberg
On Tue, May 24, 2011 at 02:10:25PM -0700, Nulgor Wankevitch wrote:
> it seems like cib is on both nodes as I am able to view both from crm_mon
> and crm configure show shows the same info, am I correct?

This does not lead anywhere.

You complained that broadcast broadcasts.
Well, that's the nature of it.

"Then use unicast."
"But unicast does not work for me."

Some talk about gremlins...
Let's skip that.

So. Why does unicast seem to not work for you?

Maybe provide logs? E.g. a hb_report from starting up nodes configured
with unicast to them bringing up some, but not all, stuff?

And then we go from there.

BTW, you can directly ask heartbeat
what it thinks about its comm channels:

    # one line per node and heartbeat link: node, link, link status
    for node in $(cl_status listnodes); do
        for link in $(cl_status listhblinks "$node"); do
            linkstatus=$(cl_status hblinkstatus "$node" "$link")
            printf "%s\t%s\t%s\n" "$node" "$link" "$linkstatus"
        done
    done

We should add a pretty-print-all-known-link-states to cl_status...
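
Until something like that exists, the loop above could be wrapped in a small shell function. This is only a sketch: print_link_states is a hypothetical name, not an existing cl_status feature, and it uses nothing beyond the three cl_status subcommands already shown.

```shell
# Print one "node<TAB>link<TAB>status" row for every heartbeat link.
# print_link_states is a hypothetical wrapper; it only calls the cl_status
# subcommands listnodes, listhblinks and hblinkstatus.
print_link_states() {
    for node in $(cl_status listnodes); do
        for link in $(cl_status listhblinks "$node"); do
            state=$(cl_status hblinkstatus "$node" "$link")
            printf '%s\t%s\t%s\n' "$node" "$link" "$state"
        done
    done
}
```

Source the function and run print_link_states on a node where heartbeat is running.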

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.