any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Collinson.Shannon
We're running RHEL6.2 and RHEL6.4 on our zlinux servers under z/VM 6.2 and so 
far have been handling HA by just clustering servers--yeah, any transaction in 
process to a server that's gone down would be whacked in mid-air, but anything 
new would route to the cluster-buddy that was still up.  That's not true 
high-availability, though, and won't work for all our potential applications.  
Of course, we're looking at SSI which will help out for planned outages, but 
we'd also like to be able to do something in the case of a server-crash (i.e. 
have some sort of heartbeat-monitor that could pop up/activate an application 
on a different server if it noticed something was down).  We'll be 
investigating Tivoli Systems Automation for Multiplatform, and the Sine Nomine 
HAO (High Availability Option), plus I intend to see if the RHEL HA add-on is 
compatible with zSeries, but I'm wondering if there are any other good products 
to explore (as well as anyone's experiences with the above products).  We did a 
cursory look at LinuxHA, but unfortunately our management is not keen on using 
freeware, even though price will definitely be a consideration in whatever we 
decide on.
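
The heartbeat-monitor idea described above can be sketched roughly as follows. This is only an illustration of the concept, not how any of the products named in this thread work: the class and function names, the server names (`lnx01`, `lnx02`), and the 10-second timeout are all made up for the example. It also leaves out fencing, which real cluster managers add so that a half-dead server cannot keep writing to shared resources.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Remember the last heartbeat per server (hypothetical helper)."""
    timeout_s: float                     # silence longer than this counts as "down"
    last_seen: dict = field(default_factory=dict)

    def beat(self, server, now):
        """Record that `server` was heard from at time `now` (seconds)."""
        self.last_seen[server] = now

    def down_servers(self, now):
        """Servers whose last heartbeat is older than the timeout."""
        return sorted(s for s, t in self.last_seen.items()
                      if now - t > self.timeout_s)


def plan_failover(monitor, placement, now):
    """Return {app: standby} for every app whose primary is down.

    `placement` maps app name -> (primary server, standby server).
    An app is only moved if its standby is still sending heartbeats.
    """
    down = set(monitor.down_servers(now))
    moves = {}
    for app, (primary, standby) in placement.items():
        if primary in down and standby not in down:
            moves[app] = standby
    return moves


mon = HeartbeatMonitor(timeout_s=10.0)
mon.beat("lnx01", now=0.0)
mon.beat("lnx02", now=0.0)
mon.beat("lnx02", now=12.0)              # lnx01 has gone quiet
print(plan_failover(mon, {"mq": ("lnx01", "lnx02")}, now=15.0))
# prints {'mq': 'lnx02'}
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the timeout logic easy to test.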

Any comments from those in the field actually exploiting HA for zSeries at 
their shops?

Thanks!

Shannon Collinson, SunTrust Bank, Atlanta, GA
LEGAL DISCLAIMER
The information transmitted is intended solely for the individual or entity to 
which it is addressed and may contain confidential and/or privileged material. 
Any review, retransmission, dissemination or other use of or taking action in 
reliance upon this information by persons or entities other than the intended 
recipient is prohibited. If you have received this email in error please 
contact the sender and delete the material from any computer.
By replying to this e-mail, you consent to SunTrust's monitoring activities of 
all communication that occurs on SunTrust's systems.
SunTrust is a federally registered service mark of SunTrust Banks, Inc.
[ST:XCL]

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread David Boyes
 down).  We'll be investigating Tivoli Systems Automation for Multiplatform,
 and the Sine Nomine HAO (High Availability Option), plus I intend to see if
 the RHEL HA add-on is compatible with zSeries, 

It is not. That's why we created HAO. Red Hat does not offer their HA kit for Z 
or Power.



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Filipe Miranda
Hi Collinson,

Sine Nomine's SNA HAO for RHEL on z is basically Red Hat's source code 
recompiled for the s390x arch, with the addition of a fencing mechanism to 
interface with z/VM, fully supported by Sine Nomine (an ISV Red Hat Partner). 
The SNA HAO offers features like fail-over HA, a clustered file system (GFS2), 
and load balancing (ipvs).
AFAIK SNA already has customers running the SNA HAO for RHEL on z in 
production environments, right David?
At the Enterprise2013 event 
(http://www-03.ibm.com/systems/enterprise/systemz.html) in Orlando, there will 
be a live demo of the SNA HAO for RHEL on IBM System z by Neale during Red 
Hat's session.


Kind Regards,
Filipe Miranda

On Oct 4, 2013, at 8:08 AM, Collinson.Shannon 
shannon.collin...@suntrust.com wrote:

 Ah!  thanks for that prompt response--I'd been 90% sure I remembered that 
 fact, but couldn't back it up from RedHat's datasheets on the web...
 


Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread David Boyes
 Sine Nomine's SNA HAO for RHEL on z is basically the Red Hat's source code
 recompiled for s390x arch, with the addition of the fencing mechanism to
 interface with z/VM, fully supported by Sine Nomine (an ISV Red Hat
 Partner). The SNA HAO offers features like: Fail-over HA, Clustered File
 System (GFS2) and Load Balancing (ipvs).

With a few improvements (waiting to be upstreamed...) and a LOT of testing. 8-) 
HAO was designed to be plug-compatible with RHCS to take maximum advantage of 
existing knowledge.

 AFAIK SNA already have customers running the SNA HAO for RHEL on z in
 production environments, right David?

Correct.  It's a general purpose HA tool. We've been paying particular 
attention to supporting HA for MQSeries, as that seems to be a hot topic at the 
moment, but there are other customers doing different things as well, in a 
number of industries.

 During the event Enterprise2013(http://www-03.ibm.com/systems/enterprise/systemz.html)
 in Orlando, during the Red Hat's session there will be a live demo of the SNA
 HAO for RHEL on IBM System z by Neale.

Drop me a note off-list if you'd like more information. 



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Mark Post
 On 10/4/2013 at 11:03 AM, Collinson.Shannon shannon.collin...@suntrust.com wrote:
 We're running RHEL6.2 and RHEL6.4 on our zlinux servers under z/VM 6.2 and so 
 far have been handling HA by just clustering servers--yeah, any transaction 
 in process to a server that's gone down would be whacked in mid-air, but 
 anything new would route to the cluster-buddy that was still up.  That's 
 not true high-availability, though, and won't work for all our potential 
 applications.

Shannon,

I could have sworn you were one of our System z customers, in which case the 
SUSE High Availability packages are included in the cost of your subscription.


Mark Post



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Marcy Cortes
Hi Shannon,

We have all kinds of HA going on.   

What are your distributed folks doing?   Sometimes it is easiest just to use 
what they use.   Appliances work great (one example is F5's GTM and LTM).
The software products often have their own solutions (e.g. DB2 HADR, WebSphere 
ND, Oracle RAC) and those are good choices for those products.
We do use some LinuxHA (on SLES for z it is included and supported) for 
clustered file systems.  It's complicated, but it can do things like provide 
r/w file systems to multiple servers and move IP addresses around for you.
When you think about just activating an app on another server, does it have 
access to the same files?
From a systems management standpoint, we prefer that our applications run 
active-active (that is, sending traffic to multiple app servers where possible) 
to use capacity on more than one CEC.   IHS in WebSphere ND has plugins too, 
where you can adjust the percentage of traffic sent to the various app servers. 
This also allows us to take whole LPARs or CECs out of service for planned 
maintenance without much human intervention.
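
The percentage-of-traffic idea above can be illustrated with a small weighted-routing sketch. It is not the IHS plugin or an F5 configuration, just the general technique; the server names and the `cec1-`/`cec2-` naming scheme are assumptions made up for the example. Setting a server's weight to zero drains it, which is the planned-maintenance case.

```python
import random


def pick_server(weights, rng=random):
    """Pick an app server with probability proportional to its weight.

    `weights` maps server name -> relative share of traffic; a weight of 0
    means the server is drained and receives no new work.
    """
    live = {s: w for s, w in weights.items() if w > 0}
    if not live:
        raise RuntimeError("no app server is accepting traffic")
    servers = list(live)
    return rng.choices(servers, weights=[live[s] for s in servers], k=1)[0]


def drain(weights, prefix):
    """Copy of `weights` with every server whose name starts with `prefix` set to 0."""
    return {s: (0 if s.startswith(prefix) else w) for s, w in weights.items()}


# Hypothetical layout: two app servers on each of two CECs, equal shares.
weights = {"cec1-app1": 25, "cec1-app2": 25, "cec2-app1": 25, "cec2-app2": 25}

# Take all of CEC 1 out of service; new work only lands on CEC 2.
during_maintenance = drain(weights, "cec1-")
print(pick_server(during_maintenance))
```

Because every server is normally active, draining one CEC only changes the shares; nothing has to be started up on the surviving side.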

Hope that is helpful.

Marcy



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Alan Altmark
On Friday, 10/04/2013 at 11:04 EDT, Collinson.Shannon
shannon.collin...@suntrust.com wrote:
 We're running RHEL6.2 and RHEL6.4 on our zlinux servers under z/VM 6.2 and so
 far have been handling HA by just clustering servers--yeah, any transaction in
 process to a server that's gone down would be whacked in mid-air, but anything
 new would route to the cluster-buddy that was still up.  That's not true
 high-availability, though, and won't work for all our potential applications.
 Of course, we're looking at SSI which will help out for planned outages, but
 we'd also like to be able to do something in the case of a server-crash (i.e.
 have some sort of heartbeat-monitor that could pop up/activate an application
 on a different server if it noticed something was down).  We'll be
 investigating Tivoli Systems Automation for Multiplatform, and the Sine Nomine
 HAO (High Availability Option), plus I intend to see if the RHEL HA add-on is
 compatible with zSeries, but I'm wondering if there's any other good products
 to explore (as well as anyone's experiences with the above products). We did a
 cursory look at LinuxHA, but unfortunately our management is not keen on using
 freeware, even though price will definitely be a consideration in whatever we
 decide on.

 Any comments from those in the field actually exploiting HA for zSeries at
 their shops?

You are asking the right questions, but recognize that there is no single
HA management solution.  Real HA is more than just workload distribution.
You have to protect yourself from outages of networks, servers, storage,
and the components that connect them together and make them go (adapters,
cables, power supplies, etc.) within a single site/campus.   DR is a twist
on HA that drives it to the next level, achieving the same purpose, but
across longer distances and with a higher tolerance for a service outage.
As others have noted, a good HA solution can be leveraged for planned
outages, too.

Networks and servers are fairly straightforward and well understood
(bonding solutions, app clusters, IP moves, another LPAR, another
CPC).  I find clients who get all that done and then I discover that they
have a single storage controller.  They might be replicating to the DR
site, but that doesn't help them if they lose the local storage frame.  If
you have z/OS, then you need to look at GDPS, even if only for I/O
hyperswap capability.

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Collinson.Shannon
We do in fact have GDPS, but only GDPS/XRC for our Disaster Recovery--and that, 
of course, doesn't cut it for HA.  GDPS HyperSwap is a little out of the budget 
for now (man, I do not even want to contemplate our DS8700s going down on the 
storage side).  And to Marcy's point, where possible we would want to use the 
same tools across the enterprise (so we are indeed looking at Oracle RAC for 
the Oracle databases we're planning to migrate), but management wanted us to 
offer some sort of integrated generic Linux solution for applications that 
didn't have any specific (or supported-on-z) tools.  zLinux is a 
small-but-growing segment of our relatively small number of Linux 
servers--we're predominantly running the bigger non-mainframe applications on 
AIX servers with HACMP, so that's what we're competing with in trying to entice 
other applications to join our middleware MQ/Broker servers on zLinux.  If we 
pick up Sine Nomine HAO or Tivoli Systems Automation for Multiplatform, we'd be 
looking at using them on our Intel Red Hat Linux servers as well as the zLinux 
ones to try to cut down on the tool proliferation.

But you're right, Alan--if our storage controllers go dead, we'd be looking at 
activating a DR right now, for some subset of both zlinux and our mainframe 
applications.  I'm going to go knock on something wooden...  For now, we're 
concerning ourselves with the server-side of HA--network redundancy is already 
built (with multiple OSAs as well), we have multiple lpars and mainframes for 
each environment, and we're trusting to IBM's never-gonna-fail on the storage 
side.


Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Marcy Cortes
 and we're trusting to IBM's never-gonna-fail on the storage side.

If only! :)
Or you could get a leak in your data center which melts a piece of equipment :)

You may get by without HyperSwap and use just PPRC if you can tolerate being 
down for a little while after the primary storage fails.
Of course, that means you purchase 2x the disk.
Or, if you can get your storage team to give you space on two different DS8700s, 
you can implement good HA without HyperSwap and PPRC.


Marcy



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Collinson.Shannon
We do have some F5 appliances, but I think they're used just for network 
routing--if there's any kind of F5 that could help manage an HA solution (making 
a passive server active or something like that), we don't have them.  And we are 
looking at Oracle RAC as a possibility for the Oracle databases we're hoping to 
migrate to zLinux, but the Oracle platform owner wanted to explore something 
cheaper (keeping the critical Oracle stuff that requires RAC on the midrange 
servers it currently uses, but looking for some poor-man's HA to at least 
provide active-passive support on zLinux).  And we tried out the MQ 
Multi-Instance setup for our WebSphere MQ and Broker servers but could never get 
it working as advertised, so that'll be another application we'll need to 
support.  Right now, as I mentioned in my original post, we're just using an 
active-active cluster (with routing through an F5 load-balancer) for MQ with 
each server running off its own storage--not really what the application owner 
wants in the long term.  We're also playing with scripted HA for that, which 
would use shared disks across the servers and would be managed by testing to 
see if the logical volume was in use--a really homegrown solution which I think 
would be more problematic than LinuxHA.
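
For illustration, here is one way the "is the logical volume in use" test might look. This is a guess at the general shape, not the script described above. It assumes the attribute string comes from something like `lvs --noheadings -o lv_attr <vg>/<lv>`, where the sixth character is `o` when the device is open, and the takeover rule itself is hypothetical. Note that plain LVM only reports the open state on the local host, so this check alone cannot tell whether the peer server still has the volume mounted, which is part of why a homegrown approach gets awkward.

```python
def lv_is_open(lv_attr):
    """True if the sixth lv_attr character is 'o' (device open on THIS host)."""
    attr = lv_attr.strip()
    return len(attr) >= 6 and attr[5] == "o"


def may_take_over(lv_attr, peer_heartbeat_ok):
    """Hypothetical takeover rule for an active-passive pair.

    Only start the application here if the volume is not already open
    locally and the peer has stopped answering its heartbeat.
    """
    return (not lv_is_open(lv_attr)) and (not peer_heartbeat_ok)


# Example attribute strings in the usual lvs layout.
print(lv_is_open("-wi-ao----"))                               # active and open
print(lv_is_open("-wi-a-----"))                               # active, not open
print(may_take_over("-wi-a-----", peer_heartbeat_ok=False))
```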

When you say the setup for LinuxHA is complicated, how bad is it?  Did you have 
to resort to bugging SUSE for configuration help, or were you able to work it 
all through with the documentation on the org site/maybe polling the 
interested-users list for it?  Not that I think we're anywhere near as 
knowledgeable as you and your team with zLinux, but if you guys had to go to 
the vendor for assistance, we shouldn't even contemplate it!  And I just got 
word that okay, yeah, we can add LinuxHA to the running for the generic HA 
solution we're looking to find.  (I guess reorgs are good on rare occasions, 
such as moving folks who are resistant to what seem like good ideas...)

Whatever we come up with would be something we hope could be exploited on all 
Linux servers (on any platform) at SunTrust--chances are, it'd only be 
cost-effective and training-effective if it was common, and right now, we 
actually don't have any standard HA product on our Intel Linux side either, so 
this'd be a good time to find one.  Of course we'd want to support any 
application-specific HA solution that the applications wanted to pay for (if 
it could run on zseries), but we'd like to have some kind of generic option for 
those other applications/products that still wanted some way to stay up while 
we were IPLing their z/VM lpars.

Thanks for your consideration/responses!
Shannon 


Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Marcy Cortes
You lost me :)

You said
 if there's any kind of F5 that could help manage an HA solution (making a
 passive server active or something like that), we don't have them
Then
 we're just using an active-active cluster (with routing through an F5
 load-balancer)

I'm not the F5 person here, but as far as I know we use the same devices for 
either scenario.

I'll have to check with our Broker guys to see what they are doing, but it is 
nicely spread over 4 lpars here with no intervention for failover in either 
planned or unplanned events and no shared disk.

We have done the scripted HA thing with volumes off here, on there.  It works 
well enough (usually) - depends on your scripter person!

On the LinuxHA, you may pull out a few hairs (in my case I just aimed for the 
gray ones).  We are attempting to use it for one application that needs a r/w 
file system accessible to 4 servers.  We are working with SUSE.   They found 
one config error I had, but we still have an open problem with 
something we are trying to get into production soon.   But that may be unique 
to our prod environment, which is a stretched cluster or, as SUSE calls it, a 
Metro Area Cluster.  I have it up, but shutting it down nicely is 
problematic.  SUSE has some yast tools to help with the configuration and a 
pretty comprehensive document (although not a cookbook), some of which I 
couldn't use because of security rules (no direct root login allowed).   So 
that complicated things some for us.  It is not difficult to get your servers 
into a reboot death match with a bit of misconfig going too, which is really 
fun to try to stop if you've got it automatically coming back up ... (PS, I've 
never had an F5 shoot a server :).  Oh, I have been following the Linux-HA 
list too.   When I reported my problem there, I got an "open an SR" response, so 
that wasn't all that helpful :( (unlike this place :).

I do plan to write up all the steps needed in a VM environment to get 
OCFS2/CLVM/Storage Based fencing with minidisks.   Maybe in the form of a SHARE 
presentation.
It would be SUSE specific though, since that's what I have.

With what I've learned thus far, I wouldn't make it my generic solution.  Then 
again, we already have one.   
I do think one of the requirements for any generic solution would be to have 
the ability for operations / applications staff to route work to/from different 
things.  I would sure not want to be called to take a node out of service.  It 
would need to integrate with existing authentication systems too.   SUSE 
does provide Hawk, which provides a web interface into 1 cluster, but I don't 
think that would work for the number of people that we have with various roles 
in app cluster management and monitoring.  Alerting is a pretty important thing 
to have in your solution as well.

Hope that is useful.


Marcy


Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Alan Altmark
On Friday, 10/04/2013 at 05:12 EDT, Marcy Cortes
marcy.d.cor...@wellsfargo.com wrote:

 I do plan to write up all the steps needed in a VM environment to get
 OCFS2/CLVM/Storage Based fencing with minidisks.   Maybe in the form of a
 SHARE presentation.
 It would be SUSE specific though, since that's what I have.

In a home-built, clustered, single-writer shared-storage HA environment
(MDISKs or LUNs), each guest needs a minidisk on an XLINK-managed volume.
If a guest comes up and doesn't have R/W access to, for example, the 666
disk, it doesn't bother booting Linux, but simply screams and logs off.
Please note that this applies only within a data center, as it relies on
shared DASD among the z/VM systems.

 With what I've learned this far, I wouldn't make it my generic solution.  Then
 again, we already have one.
 I do think one of the requirements for any generic solution would be to have
 the ability for operations / applications staff to route work to/from different
 things.  I would sure not want to be called to take a node out of service.  It
 would need to integrate in with existing authentication systems too.   SUSE
 does provide Hawk, which provides a web interface into 1 cluster, but I don't
 think that would work the number of people that we have with various roles in
 app cluster management and monitoring.  Alerting is a pretty important thing to
 have in your solution as well.

If you can automate suspension of host monitors, you will be well served.
That way, when you take down part of a cluster (e.g. z/VM IPL), the
central host monitors don't start beeping.  I say suspend rather than
disable because if they aren't back online by the end of the service
window, you WANT the alarms to kick in IF they see that the VM system is
up.  (I say.  No point in whinging about Linux guests not being up if the
VM system is still down, wot?)
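
The suspend-rather-than-disable rule above could be expressed along these lines. This is a sketch of the logic only, not any particular monitoring product; `ServiceWindow` and `should_alarm` are hypothetical names, times are plain seconds, and it assumes an outage of the z/VM system itself is alarmed by a separate check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceWindow:
    """A planned maintenance window (hypothetical type), in epoch seconds."""
    start: float
    end: float


def should_alarm(now, guest_up, vm_host_up, window=None):
    """Decide whether a 'Linux guest is down' alarm should fire.

    - A guest that is up never alarms.
    - If the z/VM system hosting it is down, stay quiet about the guest;
      the host outage is assumed to be reported elsewhere.
    - Inside the service window the monitor is suspended.
    - Once the window has passed, a guest that is still down alarms again.
    """
    if guest_up:
        return False
    if not vm_host_up:
        return False
    if window is not None and window.start <= now < window.end:
        return False
    return True


window = ServiceWindow(start=1000.0, end=2000.0)
print(should_alarm(1500.0, guest_up=False, vm_host_up=True, window=window))
print(should_alarm(2500.0, guest_up=False, vm_host_up=True, window=window))
```

Because the window has an end time, nobody has to remember to switch the monitor back on.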

Alan Altmark

Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott



Re: any good recommendations for an HA tool on zSeries?

2013-10-04 Thread Marcy Cortes
 If you can automate suspension of host monitors, you will be well served.
 That way, when you take down part of a cluster (e.g. z/VM IPL), the central
 host monitors don't start beeping.
 I say suspend rather than disable because if they aren't back online by
 the end of the service window, you WANT the alarms to kick in IF they see
 that the VM system is up.  (I say.  No point in whinging about Linux guests
 not being up if the VM system is still down, wot?)


Right, you need that too.
You also want to know if you failed over to your backup device (oh, like say 
the backup device on a vswitch :) so you can go fix the primary.   So you must 
know those messages and alarm on them (or alarm on anything that you don't know 
about :).   And backup devices need their health monitored too so that you 
don't end up failing over to something that isn't really there. 
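
A rough sketch of the alarming rules described above: alarm on a known failover message, alarm on anything unrecognized, and keep an eye on the health of the backups themselves. The message patterns and device names are invented for the example; real z/VM or Linux message texts will differ and would need to be filled in.

```python
import re

# Hypothetical patterns, for illustration only.
FAILOVER_PATTERNS = [re.compile(r"backup device", re.IGNORECASE)]
KNOWN_BENIGN_PATTERNS = [re.compile(r"heartbeat ok", re.IGNORECASE)]


def classify(message):
    """Sort a monitor message into failover / ignore / unknown."""
    if any(p.search(message) for p in FAILOVER_PATTERNS):
        return "alarm: failed over, go fix the primary"
    if any(p.search(message) for p in KNOWN_BENIGN_PATTERNS):
        return "ignore"
    # Anything we do not recognize is alarmed rather than silently dropped.
    return "alarm: unknown message"


def unhealthy_backups(backup_health):
    """Names of backup devices whose last health check failed.

    `backup_health` maps device name -> True (healthy) / False (not healthy).
    """
    return sorted(name for name, ok in backup_health.items() if not ok)


print(classify("VSWITCH now using backup device"))
print(classify("heartbeat ok from lnx02"))
print(classify("something nobody has seen before"))
print(unhealthy_backups({"osa-backup-1": True, "osa-backup-2": False}))
```

Checking the failover patterns first means a message that matches both lists still raises the alarm.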

It's all a work in progress forever I think...   Never a shortage of things to 
do!

Marcy 
