On 05/08/17 00:10 +0200, Jan Pokorný wrote:
> [addendum inline]

And some more...
> On 04/08/17 18:35 +0200, Jan Pokorný wrote:
>> On 03/08/17 20:37 +0530, sharafraz khan wrote:
>>> I am new to clustering, so please ignore if my question sounds
>>> silly.  I have a requirement wherein I need to create a cluster
>>> for an ERP application with apache and a VIP component.  Below is
>>> the scenario:
>>>
>>> We have 5 sites:
>>> 1. DC
>>> 2. Site A
>>> 3. Site B
>>> 4. Site C
>>> 5. Site D
>>>
>>> We need to configure HA such that DC would be the primary node
>>> hosting the application, accessed by all the users at each site.
>>> In case of failure of the DC node, each site's users should
>>> automatically be switched to their local ERP server, and not to
>>> the nodes at other sites, so communication would be as below:
>>>
>>> DC < -- > Site A
>>> DC < -- > Site B
>>> DC < -- > Site C
>>> DC < -- > Site D

Note that if you wanted to imply you generally rely on/are limited to
a star-like network topology with a central machine doubling as
a relay, that distorts our implicit notion (perhaps we should make it
explicit) of a cluster forming a complete graph (directly or
indirectly through multicast) amongst the nodes of the healthy
partition (corosync is not advanced enough to support grid/mesh/star
topologies, but that's a non-goal for a direct peer messaging layer
to start with).  Sure, you can work around this with tunnelling, at
the cost of compromising reliability (and efficiency) and hence high
availability :)

With the site <-> DC communication _after failure_, do you mean
checking whether the DC is OK again, or something else?

>>> Now the challenge is:
>>>
>>> 1. If I create a cluster between, say, DC < -- > Site A, it won't
>>>    allow me to create another cluster on DC with other sites.
>>>
>>> 2. If I set up all the nodes in a single cluster, how can I ensure
>>>    that in case of node failure or loss of connectivity to the DC
>>>    node from any site, users from that site are switched to the
>>>    local ERP node and not to nodes at other sites?
>>>
>>> An urgent response and help would be quite helpful.
>>
>> From your description, I suppose you are limited to just a single
>> machine per site/DC (making the overall picture prone to a double
>> fault: first the DC goes down, then any of the sites goes down, and
>> then at least the clients of that very site encounter the
>> downtime).  Otherwise I'd suggest looking at the booth project,
>> which facilitates inter-cluster (back to your "multi cluster")
>> decisions, extending upon pacemaker performing the intra-cluster
>> ones.
>>
>> Using a single-cluster approach, you should certainly be able to
>> model your fallback scenario, something like:
>>
>> - define a group A (VIP, apache, app), infinity-located with DC
>> - define a different group B with the same content, set up as clone
>>   B_clone being (-infinity)-located with DC
>> - set up ordering "B_clone starts when A stops", of "Mandatory" kind
>>
>> Further tweaks may be needed.
>
> Hmm, actually the VIP would not help much here, even if its "ip"
> parameter were adapted per host ("#uname"), as there are two
> conflicting principles ("globality" of the network for serving from
> the DC vs. locality when serving from particular sites _in
> parallel_).  Something more sophisticated would likely be needed.

Thinking about that more, the easiest solution _configuration-wise_
might be:

* a single-real-node cluster at each of the sites, each having:

  - fencing disabled (but configure and enable it when/if you add
    more nodes)

  - a unique "prolonged arm" in the form of an ocf:pacemaker:remote
    instance to be running at DC (e.g. different ports can be used so
    as to avoid communication clashes at DC)

  - a custom resource agent (e.g. a customization of
    ocf:heartbeat:anything) that will just be monitoring the liveness
    of the ERP application, preferring to run at the DC through that
    very pacemaker-remote connection

  - a unique IP (hostname mapping) which the users of that very site
    will use to access the application

  - a VIP configured to represent that very IP, collocated with the
    said monitoring agent, meaning that the VIP will follow it, i.e.
    run only where the ERP application is running: primarily it will
    stay at DC, with a fallback to the home site

There are many possible adjustments on top of this sketch, which is
furthermore totally untested.  An advantage is that you may add new
nodes at each site so as to achieve per-site HA.  (But when the DC
also gets clustered with additional nodes, then booth will likely be
the way forward.)

Note that to avoid race conditions, the pacemaker-remote instances at
DC should not try to control the ERP application directly; it should
rather be set up for autonomous recovery (the most simple
restart-after-failure case can be achieved, for instance, through
service file directives of systemd, if that's how it gets launched).

This is definitely quite an atypical workload compared to what I've
seen so far, and it is not easy to wrap one's head around.

-- 
Poki
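P.S. For illustration only, the earlier single-cluster sketch (group
A at DC, clone B_clone everywhere else, "B_clone starts when A
stops") might translate into crm shell syntax roughly as below.  All
resource names, the node name dc-node, addresses and agent parameters
are invented, and this is untested:

```
# group A: the real service, pinned to the DC
primitive vip ocf:heartbeat:IPaddr2 \
    params ip=192.0.2.10 cidr_netmask=24
primitive web ocf:heartbeat:apache \
    params configfile=/etc/httpd/conf/httpd.conf
primitive app ocf:heartbeat:anything \
    params binfile=/opt/erp/bin/erp-server
group grp-A vip web app
location loc-A grp-A inf: dc-node

# group B: same content, cloned, kept away from the DC
primitive vip-b ocf:heartbeat:IPaddr2 \
    params ip=192.0.2.11 cidr_netmask=24
primitive web-b ocf:heartbeat:apache \
    params configfile=/etc/httpd/conf/httpd.conf
primitive app-b ocf:heartbeat:anything \
    params binfile=/opt/erp/bin/erp-server
group grp-B vip-b web-b app-b
clone grp-B-clone grp-B
location loc-B grp-B-clone -inf: dc-node

# "B_clone starts when A stops", of Mandatory kind
order ord-fallback Mandatory: grp-A:stop grp-B-clone:start
```

Mind the VIP caveat discussed above: a single IPaddr2 value cannot
serve the "global DC" and "local per-site" roles at once.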
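P.P.S. Likewise, the per-site single-node variant could start from
something like the following on, say, site A's cluster (again
untested; hostnames, the port, paths and scores are all made up):

```
# acceptable only while single-node; configure fencing when adding nodes
property stonith-enabled=false

# "prolonged arm": pacemaker-remote connection into the DC machine,
# using a site-unique port to avoid clashes with the other sites' arms
primitive dc-arm ocf:pacemaker:remote \
    params server=dc.example.com port=3121

# customized ocf:heartbeat:anything that merely monitors ERP liveness
primitive erp-mon ocf:heartbeat:anything \
    params binfile=/usr/local/bin/erp-liveness-check \
    op monitor interval=10s

# prefer running the monitor at the DC (via the remote node named
# after the dc-arm resource), falling back to the home site
location mon-prefers-dc erp-mon 1000: dc-arm

# the site-unique IP that this site's users resolve the ERP name to;
# collocated with the monitor, so it follows the live application
primitive site-a-ip ocf:heartbeat:IPaddr2 \
    params ip=192.0.2.21 cidr_netmask=24
colocation ip-with-mon inf: site-a-ip erp-mon
```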
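As for the autonomous recovery of the ERP application at the DC
itself: assuming it is launched via systemd (unit name and paths
invented here), the simple restart-after-failure case is just a pair
of [Service] directives:

```
# /etc/systemd/system/erp.service
[Unit]
Description=ERP application (recovers on its own, not driven by pacemaker)

[Service]
ExecStart=/opt/erp/bin/erp-server
# restart automatically when the process dies abnormally
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```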
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org