You lost me :)

You said
" if there's any kind of F5 that could help manage an HA solution (making a 
passive server active or something like that), we don't have them"
Then
"we're just using an active-active cluster (with routing through an F5 
load-balancer)"

I'm not the F5 person here, but as far as I know we use the same devices for 
either scenario.

I'll have to check with our Broker guys to see what they are doing, but it is 
nicely spread over 4 LPARs here, with no intervention for failover in either 
planned or unplanned events, and no shared disk.

We have done the scripted HA thing with volumes varied off here, on there.  It 
works well enough (usually) - depends on your scripter!

On Linux-HA, you may pull out a few hairs (in my case I just aimed for the 
gray ones).  We are attempting to use it for one application that needs a r/w 
file system accessible to 4 servers.  We are working with SUSE.  They found 
one config error I had, but we still have an open problem with something we 
are trying to get into production soon.  That may be unique to our prod 
environment, which is a "stretched cluster" or, as SUSE calls it, a "Metro 
Area Cluster".  I have it up, but shutting it down nicely is problematic.

SUSE has some YaST tools to help with the configuration and a pretty 
comprehensive document (although not a cookbook), some of which I couldn't use 
because of security rules (no direct root login allowed), so that complicated 
things some for us.  It is also not difficult to get your servers into a 
reboot death match with a bit of misconfiguration, which is really fun to try 
to stop if you've got them automatically coming back up ... (PS, I've never 
had an F5 shoot a server :) .

Oh, I have been following the Linux-HA list too.  When I reported my problem 
there, I got an "open an SR" response, so that wasn't all that helpful :( 
(unlike this place :) .

I do plan to write up all the steps needed in a VM environment to get 
OCFS2/CLVM/storage-based fencing working with minidisks.  Maybe in the form of 
a SHARE presentation.
It would be SUSE-specific though, since that's what I have.
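
Until that writeup exists, the cluster-side skeleton of the shared-filesystem 
stack looks roughly like this with crmsh - a sketch only, with placeholder 
resource names and device paths; the exact agents vary a bit by SUSE release:

```shell
# Sketch of the clustered-filesystem resource stack (crmsh, SUSE HAE).
# Resource names, VG/LV, and mount point are placeholders.

# Distributed lock manager, required by both cLVM and OCFS2
crm configure primitive p-dlm ocf:pacemaker:controld op monitor interval=60

# Clustered LVM daemon so all nodes agree on LVM metadata
crm configure primitive p-clvm ocf:lvm2:clvmd op monitor interval=60

# The OCFS2 filesystem itself, mounted on every node
crm configure primitive p-fs ocf:heartbeat:Filesystem \
    params device=/dev/vg_shared/lv_shared directory=/shared fstype=ocfs2

# Start them in order on every node: dlm -> clvmd -> filesystem
# (SLES 11 also wants an ocf:ocfs2:o2cb primitive in this group)
crm configure group g-storage p-dlm p-clvm p-fs
crm configure clone cl-storage g-storage meta interleave=true
```

The SBD device itself would live on a small shared minidisk visible to all 
four guests, initialized once with `sbd -d <device> create`.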

With what I've learned thus far, I wouldn't make it my generic solution.  Then 
again, we already have one.
I do think one of the requirements for any generic solution would be the 
ability for operations/applications staff to route work to/from different 
things.  I would sure not want to be called to take a node out of service.  It 
would need to integrate with existing authentication systems too.  SUSE does 
provide Hawk, which gives a web interface into one cluster, but I don't think 
that would work for the number of people we have with various roles in 
app-cluster management and monitoring.  Alerting is a pretty important thing 
to have in your solution as well.

Hope that is useful.


Marcy

-----Original Message-----
From: Linux on 390 Port [mailto:[email protected]] On Behalf Of 
Collinson.Shannon
Sent: Friday, October 04, 2013 12:48 PM
To: [email protected]
Subject: Re: [LINUX-390] any good recommendations for an HA tool on zSeries?

We do have some F5 appliances, but I think that's just for network routing--if 
there's any kind of F5 that could help manage an HA solution (making a passive 
server active or something like that), we don't have them.  And we are looking 
at Oracle RAC as a possibility for the Oracle databases we're hoping to 
migrate to zlinux, but the Oracle "platform owner" wanted to explore something 
cheaper (keeping the critical Oracle stuff that requires RAC on the midrange 
servers it currently uses, but looking for some poor-man's HA to at least 
provide active-passive support on zlinux).

And we tried out the MQ Multi-Instance setup for our WebSphere MQ and Broker 
servers but could never get it working as advertised, so that'll be another 
application we'll need to support.  Right now, as I alluded to below, we're 
just using an active-active cluster (with routing through an F5 load-balancer) 
for MQ, with each server running off its own storage--not really what the 
application owner wants in the long term.  We're also playing with scripted HA 
for that, which would use shared disks across the servers, managed by testing 
to see whether the logical volume was in use--a really homegrown solution 
which I think would be more problematic than LinuxHA.
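
The "is the logical volume in use" test could be sketched like this - a guess 
at one approach, not the actual homegrown script; the helper functions and 
VG/LV names are made up for illustration:

```shell
#!/bin/sh
# Sketch of an "is this LV in use" check for scripted-HA failover.
# Function names and volume names are hypothetical.

# The 6th character of LVM's lv_attr string is 'o' when the LV's
# device is open (mounted or otherwise held by a process).
lv_attr_open() {
    # $1 = an lv_attr string such as "-wi-ao----"
    case "$(printf '%s' "$1" | cut -c6)" in
        o) return 0 ;;   # open: something is using it
        *) return 1 ;;   # not open
    esac
}

lv_in_use() {
    # $1 = vg/lv name; query the live attr string from LVM
    attr=$(lvs --noheadings -o lv_attr "$1" | tr -d ' ')
    lv_attr_open "$attr"
}

# Failover sketch: only take over the volume if it is not open here:
# if ! lv_in_use vg_app/lv_data; then
#     vgchange -ay vg_app && mount /dev/vg_app/lv_data /app
# fi
```

One caveat that may explain the fragility: the open flag is per-node state, so 
without something cluster-aware a mount held by the *other* server won't show 
up in a local `lvs` - which fits the suspicion that this would be more 
problematic than LinuxHA.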

When you say the setup for LinuxHA is complicated, how bad is it?  Did you 
have to resort to bugging SuSE for configuration help, or were you able to 
work it all through with the documentation on the org site, maybe polling the 
interested-users list for it?  Not that I think we're anywhere near as 
knowledgeable as you and your team with zlinux, but if you guys had to go to 
the vendor for assistance, we shouldn't even contemplate it!  And I just got 
word that okay, yeah, we can add LinuxHA to the running for the "generic HA" 
solution we're looking to find.  (I guess reorgs are good on rare occasions, 
such as moving folks obstinate about what seem like good ideas...)

Whatever we come up with would be something we hope could be exploited on all 
Linux servers (on any platform) at SunTrust--chances are, it'd only be 
cost-effective and training-effective if it were common, and right now we 
actually don't have any standard HA product on our Intel Linux side either, so 
this'd be a good time to find one.  Of course we'd want to support any 
"application-specific" HA solution that the applications wanted to pay for (if 
it could run on zSeries), but we'd like to have some kind of generic option 
for those other applications/products that still wanted some way to stay up 
while we were IPLing their z/VM LPARs.

Thanks for your consideration/responses!
    Shannon 

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/