any good recommendations for an HA tool on zSeries?
We're running RHEL6.2 and RHEL6.4 on our zlinux servers under z/VM 6.2 and so far have been handling HA by just clustering servers--yeah, any transaction in process to a server that's gone down would be whacked in mid-air, but anything new would route to the cluster-buddy that was still up. That's not true high-availability, though, and won't work for all our potential applications.

Of course, we're looking at SSI, which will help out for planned outages, but we'd also like to be able to do something in the case of a server-crash (i.e. have some sort of heartbeat-monitor that could pop up/activate an application on a different server if it noticed something was down). We'll be investigating Tivoli System Automation for Multiplatforms, and the Sine Nomine HAO (High Availability Option), plus I intend to see if the RHEL HA add-on is compatible with zSeries, but I'm wondering if there are any other good products to explore (as well as anyone's experiences with the above products). We did a cursory look at LinuxHA, but unfortunately our management is not keen on using freeware, even though price will definitely be a consideration in whatever we decide on.

Any comments from those in the field actually exploiting HA for zSeries at their shops?

Thanks!
Shannon Collinson, SunTrust Bank, Atlanta, GA

LEGAL DISCLAIMER The information transmitted is intended solely for the individual or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of or taking action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you have received this email in error please contact the sender and delete the material from any computer. By replying to this e-mail, you consent to SunTrust's monitoring activities of all communication that occurs on SunTrust's systems. SunTrust is a federally registered service mark of SunTrust Banks, Inc.
[ST:XCL] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- For more information on Linux on System z, visit http://wiki.linuxvm.org/
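The heartbeat-monitor idea in the question above (notice that a server is down, then activate its application somewhere else) can be sketched roughly as follows. This is only an illustration of the decision logic, not how any of the products named in this thread work: the class, function, node, and application names are all hypothetical, and a real tool would also fence the failed node before starting anything on the standby.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Remembers when each node was last heard from (hypothetical helper)."""
    timeout_s: float = 10.0
    last_seen: dict = field(default_factory=dict)

    def beat(self, node: str, now: float) -> None:
        # Record a heartbeat received from `node` at time `now` (seconds).
        self.last_seen[node] = now

    def down_nodes(self, now: float) -> list:
        # A node counts as down once its last heartbeat is older than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)


def plan_failover(monitor: HeartbeatMonitor, placement: dict,
                  standby: str, now: float) -> dict:
    """Return {application: new_node} for applications whose node is down.

    `placement` maps application name -> node it currently runs on.
    Nothing is moved if the standby itself has missed its heartbeats.
    """
    down = set(monitor.down_nodes(now))
    if standby in down:
        return {}
    return {app: standby for app, node in placement.items() if node in down}
```

The time is passed in explicitly so the logic can be exercised without waiting on a real clock; a production monitor would also want more than one heartbeat path so that a network glitch is not mistaken for a dead server.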
Re: any good recommendations for an HA tool on zSeries?
> We'll be investigating Tivoli System Automation for Multiplatforms, and the Sine Nomine HAO (High Availability Option), plus I intend to see if the RHEL HA add-on is compatible with zSeries,

It is not. That's why we created HAO. Red Hat does not offer their HA kit for Z or Power.
Re: any good recommendations for an HA tool on zSeries?
Hi Collinson,

Sine Nomine's SNA HAO for RHEL on z is basically Red Hat's source code recompiled for the s390x architecture, with the addition of a fencing mechanism to interface with z/VM, fully supported by Sine Nomine (an ISV Red Hat Partner). The SNA HAO offers features like fail-over HA, a clustered file system (GFS2), and load balancing (ipvs).

AFAIK SNA already has customers running the SNA HAO for RHEL on z in production environments, right David?

At the Enterprise2013 event (http://www-03.ibm.com/systems/enterprise/systemz.html) in Orlando, there will be a live demo of the SNA HAO for RHEL on IBM System z by Neale during Red Hat's session.

Kind Regards,
Filipe Miranda

On Oct 4, 2013, at 8:08 AM, Collinson.Shannon shannon.collin...@suntrust.com wrote:

> Ah! thanks for that prompt response--I'd been 90% sure I remembered that fact, but couldn't back it up from RedHat's datasheets on the web...
>
> -Original Message-
> From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of David Boyes
> Sent: Friday, October 04, 2013 11:06 AM
> To: LINUX-390@VM.MARIST.EDU
> Subject: Re: any good recommendations for an HA tool on zSeries?
>
> > We'll be investigating Tivoli System Automation for Multiplatforms, and the Sine Nomine HAO (High Availability Option), plus I intend to see if the RHEL HA add-on is compatible with zSeries,
>
> It is not. That's why we created HAO. Red Hat does not offer their HA kit for Z or Power.
Re: any good recommendations for an HA tool on zSeries?
> Sine Nomine's SNA HAO for RHEL on z is basically Red Hat's source code recompiled for the s390x architecture, with the addition of a fencing mechanism to interface with z/VM, fully supported by Sine Nomine (an ISV Red Hat Partner). The SNA HAO offers features like fail-over HA, a clustered file system (GFS2), and load balancing (ipvs).

With a few improvements (waiting to be upstreamed...) and a LOT of testing. 8-) HAO was designed to be plug-compatible with RHCS to take maximum advantage of existing knowledge.

> AFAIK SNA already has customers running the SNA HAO for RHEL on z in production environments, right David?

Correct. It's a general-purpose HA tool. We've been paying particular attention to supporting HA for MQSeries, as that seems to be a hot topic at the moment, but there are other customers doing different things as well, in a number of industries.

> At the Enterprise2013 event (http://www-03.ibm.com/systems/enterprise/systemz.html) in Orlando, there will be a live demo of the SNA HAO for RHEL on IBM System z by Neale during Red Hat's session.

Drop me a note off-list if you'd like more information.
Re: any good recommendations for an HA tool on zSeries?
On 10/4/2013 at 11:03 AM, Collinson.Shannon shannon.collin...@suntrust.com wrote:

> We're running RHEL6.2 and RHEL6.4 on our zlinux servers under z/VM 6.2 and so far have been handling HA by just clustering servers--yeah, any transaction in process to a server that's gone down would be whacked in mid-air, but anything new would route to the cluster-buddy that was still up. That's not true high-availability, though, and won't work for all our potential applications.

Shannon, I could have sworn you were one of our System z customers, in which case the SUSE High Availability packages are included in the cost of your subscription.

Mark Post
Re: any good recommendations for an HA tool on zSeries?
Hi Shannon,

We have all kinds of HA going on. What are your distributed folks doing? Sometimes it is easiest just to use what they use. Appliances work great (one example is F5's GTM and LTM). The software products often have their own solution (e.g. DB2 HADR, WebSphere ND, Oracle RAC), and those are good choices for those products.

We do use some LinuxHA (on SLES for z it is included and supported) for clustered file systems. It's complicated, but it can do things like provide r/w file systems to multiple servers and move IP addresses around for you. When you think about just activating an app on another server, does it have access to the same files?

From a systems management standpoint, we prefer that our applications run active-active (that is, send traffic to multiple app servers if possible) to use capacity on more than one CEC. IHS in WebSphere ND has plugins too, where you can adjust the percentage of traffic to the various app servers. This allows us to take whole LPARs or CECs out of service for planned maintenance as well, without much human intervention.

Hope that is helpful.

Marcy

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Collinson.Shannon
Sent: Friday, October 04, 2013 8:03 AM
To: LINUX-390@VM.MARIST.EDU
Subject: [LINUX-390] any good recommendations for an HA tool on zSeries?

> Any comments from those in the field actually exploiting HA for zSeries at their shops?
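The percentage-of-traffic adjustment Marcy describes for active-active app servers can be pictured with a small weighted-selection sketch. This is not the IHS/WebSphere plugin's actual algorithm, just a generic illustration; the function name and server names are made up. Setting a server's weight to 0 is how the sketch models draining a server (or a whole CEC) for planned maintenance.

```python
def pick_server(weights: dict, r: float) -> str:
    """Pick a server in proportion to its weight.

    `weights` maps server name -> relative share of traffic (hypothetical names).
    `r` is a number in [0, 1), e.g. from random.random(); it is passed in so the
    selection can be tested deterministically.
    """
    # Servers with weight 0 are treated as drained and never chosen.
    live = {server: w for server, w in weights.items() if w > 0}
    total = sum(live.values())
    if total == 0:
        raise RuntimeError("no servers in service")

    threshold = r * total
    running = 0.0
    for server, weight in live.items():
        running += weight
        if threshold < running:
            return server
    return server  # only reached through float rounding at the top edge
```

With weights of 50/50 across two CECs, half of the incoming requests land on each; dropping one side to 0 sends everything to the other without touching the application itself.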
Re: any good recommendations for an HA tool on zSeries?
On Friday, 10/04/2013 at 11:04 EDT, Collinson.Shannon shannon.collin...@suntrust.com wrote:

> Any comments from those in the field actually exploiting HA for zSeries at their shops?

You are asking the right questions, but recognize that there is no single HA management solution. Real HA is more than just workload distribution. You have to protect yourself from outages of networks, servers, storage, and the components that connect them together and make them go (adapters, cables, power supplies, etc.) within a single site/campus. DR is a twist on HA that drives it to the next level, achieving the same purpose, but across longer distances and with a higher tolerance for a service outage. As others have noted, a good HA solution can be leveraged for planned outages, too.

Networks and servers are fairly straightforward and well understood (bonding solutions, app clusters, IP moves, another LPAR, another CPC). I find clients who get all that done and then I discover that they have a single storage controller. They might be replicating to the DR site, but that doesn't help them if they lose the local storage frame. If you have z/OS, then you need to look at GDPS, even if only for I/O hyperswap capability.

Alan Altmark
Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
Re: any good recommendations for an HA tool on zSeries?
We do in fact have GDPS, but only GDPS/XRC for our Disaster Recovery--and that, of course, doesn't cut it for HA. GDPS Hyperswap is a little out of the budget for now (man, I do not even want to contemplate our DS8700s going down on the storage side).

And to Marcy's point, where possible we would want to use the same tools across the enterprise (so we are indeed looking at Oracle RAC for the Oracle databases we're planning to migrate), but management wanted us to offer some sort of integrated generic Linux solution for applications that didn't have any specific (or supported-on-z) tools. zLinux is a small-but-growing segment of our relatively small number of Linux servers--we're predominantly running the bigger non-mainframe applications on AIX servers with HACMP, so that's what we're competing with in trying to entice other applications to join our middleware MQ/Broker servers on zlinux. If we pick up Sine Nomine HAO or Tivoli System Automation for Multiplatforms, we'd be looking at using them on our Intel Red Hat Linux servers as well as the zlinux ones to try to cut down on the tool proliferation.

But you're right, Alan--if our storage controllers go dead, we'd be looking at activating a DR right now, for some subset of both zlinux and our mainframe applications. I'm going to go knock on something wooden... For now, we're concerning ourselves with the server side of HA--network redundancy is already built (with multiple OSAs as well), we have multiple LPARs and mainframes for each environment, and we're trusting to IBM's never-gonna-fail on the storage side.

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Alan Altmark
Sent: Friday, October 04, 2013 1:45 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: any good recommendations for an HA tool on zSeries?

> I find clients who get all that done and then I discover that they have a single storage controller. They might be replicating to the DR site, but that doesn't help them if they lose the local storage frame. If you have z/OS, then you need to look at GDPS, even if only for I/O hyperswap capability.
Re: any good recommendations for an HA tool on zSeries?
> and we're trusting to IBM's never-gonna-fail on the storage side.

If only! :) Or you could get a leak in your data center which melts a piece of equipment :)

You may get by without hyperswap and use just PPRC if you can be down for a little bit after the primary storage fails. Of course that means you purchase 2x the disk. Or if you can get your storage team to give you space on 2 different DS8700s, you can implement a good HA without Hyperswap and PPRC.

Marcy
Re: any good recommendations for an HA tool on zSeries?
We do have some F5 appliances, but I think that's just for network routing--if there's any kind of F5 that could help manage an HA solution (making a passive server active or something like that), we don't have them. And we are looking at Oracle RAC as a possibility for the Oracle databases we're hoping to migrate to zlinux, but the Oracle platform owner wanted to explore something cheaper (keeping the critical Oracle stuff that requires RAC on the midrange servers it currently uses, but looking for some poor-man's HA to at least provide active-passive support on zlinux). And we tried out the MQ Multi-Instance setup for our WebSphere MQ and Broker servers but could never get it working as advertised, so that'll be another application we'll need to support.

Right now, as I alluded to below, we're just using an active-active cluster (with routing through an F5 load-balancer) for MQ, with each server running off its own storage--not really what the application owner wants in the long term. We're also playing with scripted HA for that, which would use shared disks across the servers that would be managed by testing to see if the logical volume was in use--a really homegrown solution which I think would be more problematic than LinuxHA.

When you say the setup for LinuxHA is complicated, how bad is it? Did you have to resort to bugging SuSE for configuration help, or were you able to work it all through with the documentation on the org site/maybe polling the interested-users list for it? Not that I think we're anywhere near as knowledgeable as you and your team with zlinux, but if you guys had to go to the vendor for assistance, we shouldn't even contemplate it! And I just got word that okay, yeah, we can add LinuxHA to the running for the generic HA solution we're looking to find. (I guess reorgs are good on rare occasions, such as moving folks obstinate to what seem like good ideas...)

Whatever we come up with would be something we hope could be exploited on all Linux servers (on any platform) at SunTrust--chances are, it'd only be cost-effective and training-effective if it was common, and right now, we actually don't have any standard HA product on our Intel Linux side either, so this'd be a good time to find one. Of course we'd want to support any application-specific HA solution that the applications wanted to pay for (if it could run on zSeries), but we'd like to have some kind of generic option for those other applications/products that still wanted some way to stay up while we were IPLing their z/VM LPARs.

Thanks for your consideration/responses!
Shannon

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Marcy Cortes
Sent: Friday, October 04, 2013 12:28 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: any good recommendations for an HA tool on zSeries?

> We do use some LinuxHA (on SLES for z it is included and supported) for clustered file systems. It's complicated, but it can do things like provide r/w file systems to multiple servers and move IP addresses around for you.
Re: any good recommendations for an HA tool on zSeries?
You lost me :) You said

> if there's any kind of F5 that could help manage an HA solution (making a passive server active or something like that), we don't have them

Then

> we're just using an active-active cluster (with routing through an F5 load-balancer)

I'm not the F5 person here, but as far as I know we use the same devices for either scenario. I'll have to check with our Broker guys to see what they are doing, but it is nicely spread over 4 LPARs here with no intervention for failover in either planned or unplanned events and no shared disk.

We have done the scripted HA thing with volumes off here, on there. It works well enough (usually) - depends on your scripter person!

On the LinuxHA, you may pull out a few hairs (in my case I just aimed for the gray ones). We are attempting to use it for one application that needs a r/w file system accessible to 4 servers. We are working with SUSE. They found one config error I had, but we still have an open problem with it now with something we are trying to get into production soon. But that may be unique to our prod environment, which is a stretched cluster or, as SUSE calls it, a Metro Area Cluster. I have it up, but shutting it down nicely is problematic. SUSE has some yast tools to help with the configuration and a pretty comprehensive document (although not a cookbook), some of which I couldn't use because of security rules (no direct root login allowed). So that complicated things some for us. It is not difficult to get your servers into a reboot death match with a bit of misconfig going too, which is really fun to try to stop if you've got it automatically coming back up ... (PS, I've never had an F5 shoot a server :) .

Oh, I have been following the Linux-HA list too. When I reported my problem there, I got an "open an SR" response, so that wasn't all that helpful :( (unlike this place :) .

I do plan to write up all the steps needed in a VM environment to get OCFS2/CLVM/Storage Based fencing with minidisks. Maybe in the form of a SHARE presentation. It would be SUSE specific though, since that's what I have.

With what I've learned thus far, I wouldn't make it my generic solution. Then again, we already have one. I do think one of the requirements for any generic solution would be to have the ability for operations / applications staff to route work to/from different things. I would sure not want to be called to take a node out of service. It would need to integrate in with existing authentication systems too. SUSE does provide Hawk, which provides a web interface into 1 cluster, but I don't think that would work for the number of people that we have with various roles in app cluster management and monitoring. Alerting is a pretty important thing to have in your solution as well.

Hope that is useful.

Marcy

-Original Message-
From: Linux on 390 Port [mailto:LINUX-390@VM.MARIST.EDU] On Behalf Of Collinson.Shannon
Sent: Friday, October 04, 2013 12:48 PM
To: LINUX-390@VM.MARIST.EDU
Subject: Re: [LINUX-390] any good recommendations for an HA tool on zSeries?

> When you say the setup for LinuxHA is complicated, how bad is it? Did you have to resort to bugging SuSE for configuration help, or were you able to work it all through with the documentation on the org site/maybe polling the interested-users list for it?
Re: any good recommendations for an HA tool on zSeries?
On Friday, 10/04/2013 at 05:12 EDT, Marcy Cortes marcy.d.cor...@wellsfargo.com wrote:

> I do plan to write up all the steps needed in a VM environment to get OCFS2/CLVM/Storage Based fencing with minidisks. Maybe in the form of a SHARE presentation. It would be SUSE specific though, since that's what I have.

In a home-built clustered single-writer shared-storage HA environment (MDISKs or LUNs), each guest needs a minidisk on an XLINK-managed volume. If a guest comes up and doesn't have R/W access to, for example, the 666 disk, it doesn't bother booting Linux, but simply screams and logs off. Please note that this applies only within a data center, as it relies on shared DASD among the z/VM systems.

> With what I've learned thus far, I wouldn't make it my generic solution. Then again, we already have one. I do think one of the requirements for any generic solution would be to have the ability for operations / applications staff to route work to/from different things. I would sure not want to be called to take a node out of service. It would need to integrate in with existing authentication systems too. SUSE does provide Hawk, which provides a web interface into 1 cluster, but I don't think that would work for the number of people that we have with various roles in app cluster management and monitoring. Alerting is a pretty important thing to have in your solution as well.

If you can automate suspension of host monitors, you will be well served. That way, when you take down part of a cluster (e.g. z/VM IPL), the central host monitors don't start beeping. I say suspend rather than disable because if they aren't back online by the end of the service window, you WANT the alarms to kick in IF they see that the VM system is up. (I say. No point in whinging about Linux guests not being up if the VM system is still down, wot?)

Alan Altmark
Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
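Alan's suspend-rather-than-disable rule for host monitors boils down to a small decision function. The sketch below is only one reading of that rule, with hypothetical names throughout (`ServiceWindow`, `guest_alarm`); how a given monitoring product expresses a maintenance window will differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceWindow:
    """A planned maintenance window, in epoch seconds (hypothetical type)."""
    start: float
    end: float

    def covers(self, now: float) -> bool:
        return self.start <= now < self.end


def guest_alarm(guest_up: bool, vm_system_up: bool, now: float,
                window: Optional[ServiceWindow] = None) -> bool:
    """Decide whether to raise an alarm for one Linux guest."""
    if guest_up:
        return False              # nothing wrong
    if window is not None and window.covers(now):
        return False              # suspended: planned outage in progress
    # Outside the window the monitor is live again, but a guest alarm is only
    # useful when the z/VM system hosting it is up; if z/VM itself is still
    # down, that is a separate (host-level) alarm.
    return vm_system_up
```

Because the window only suspends the check, a guest that is still down after the window closes alarms on its own, with nobody needing to remember to re-enable anything.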
Re: any good recommendations for an HA tool on zSeries?
> If you can automate suspension of host monitors, you will be well served. That way, when you take down part of a cluster (e.g. z/VM IPL), the central host monitors don't start beeping. I say suspend rather than disable because if they aren't back online by the end of the service window, you WANT the alarms to kick in IF they see that the VM system is up. (I say. No point in whinging about Linux guests not being up if the VM system is still down, wot?)

Right, you need that too. You also want to know if you failed over to your backup device (oh, like say the backup device on a vswitch :) so you can go fix the primary. So you must know those messages and alarm on them (or alarm on anything that you don't know about :). And backup devices need their health monitored too, so that you don't end up failing over to something that isn't really there.

It's all a work in progress forever I think... Never a shortage of things to do!

Marcy
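Marcy's rule of thumb (alarm on the failover messages you know, and on anything you don't recognize) can be sketched as a small message classifier. The patterns here are invented for illustration; real message IDs and texts depend entirely on the products being monitored.

```python
import re

# Hypothetical patterns, for illustration only.
KNOWN_ALARMS = [
    re.compile(r"failed over to backup"),
    re.compile(r"backup device .* unreachable"),
]
KNOWN_BENIGN = [
    re.compile(r"heartbeat ok"),
]


def classify(message: str) -> str:
    """Return 'alarm', 'ignore', or 'alarm-unknown' for one log message."""
    text = message.lower()
    if any(p.search(text) for p in KNOWN_ALARMS):
        return "alarm"            # e.g. now running on the backup: go fix the primary
    if any(p.search(text) for p in KNOWN_BENIGN):
        return "ignore"
    # Anything unrecognized is alarmed on rather than silently dropped.
    return "alarm-unknown"
```

Defaulting unknown messages to an alarm is noisier at first, but each one either gets fixed or gets added to a known list, which is the "work in progress forever" part.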