Hi,

Vladimir Legeza wrote:
Hello,

On Fri, Oct 29, 2010 at 12:35 PM, Dan Frincu <dfri...@streamwide.ro> wrote:

    Hi,


    Vladimir Legeza wrote:
    Hello folks.

    I'm trying to set up four IP-balanced nodes, but I haven't found the
    right way to balance the load between the nodes when some of them
    have failed.

    Here is what I've done:

    [r...@node1 ~]# crm configure show
    node node1
    node node2
    node node3
    node node4
    primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="10.138.10.252" cidr_netmask="32" clusterip_hash="sourceip-sourceport" \
        op monitor interval="30s"
    clone StreamIP ClusterIP \
        meta globally-unique="true" clone-max="8" clone-node-max="2" \
        target-role="Started" notify="true" ordered="true" interleave="true"
    property $id="cib-bootstrap-options" \
        dc-version="1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="4" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"

    When all the nodes are up and running:

     [r...@node1 ~]# crm status
    ============
    Last updated: Thu Oct 28 17:26:13 2010
    Stack: openais
    Current DC: node2 - partition with quorum
    Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
    4 Nodes configured, 4 expected votes
    2 Resources configured.
    ============

    Online: [ node1 node2 node3 node4 ]

     Clone Set: StreamIP (unique)
         ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node1
         ClusterIP:1    (ocf::heartbeat:IPaddr2):    Started node1
         ClusterIP:2    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:3    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:4    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:5    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:6    (ocf::heartbeat:IPaddr2):    Started node4
         ClusterIP:7    (ocf::heartbeat:IPaddr2):    Started node4
    Everything is OK and each node takes 1/4 of all traffic - wonderful.
    But we suffer a 25% traffic loss if one of them goes down:
    Isn't this supposed to be normal behavior in a load-balancing
    situation: 4 nodes receive 25% of the traffic each, one node goes
    down, the load balancer notices the failure and directs 33.33% of
    the traffic to each of the remaining nodes?

The only way I see to achieve ~33% is to decrease the clone-max parameter value (it should be a multiple of the number of online nodes); clone-max would also have to be changed on the fly (automatically).

Hmm... the idea is very interesting. =8- )
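
For example (completely untested, just an assumption of how it could be wired up outside Pacemaker, e.g. from cron or a monitoring hook): a script that counts the online nodes in the crm_mon output and resets clone-max accordingly:

#!/bin/sh
# adjust_clone_max.sh - untested sketch: keep clone-max at two clone
# instances per online node (matches clone-node-max="2" above).
ONLINE=$(crm_mon -1 | awk '/^Online:/ { print NF - 3 }')
if [ -n "$ONLINE" ] && [ "$ONLINE" -gt 0 ]; then
    # A real script would compare against the current value first,
    # to avoid rewriting the CIB on every run.
    crm resource meta StreamIP set clone-max $((ONLINE * 2))
fi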

    Just out of curiosity.

    [r...@node1 ~]# crm node standby node1
    [r...@node1 ~]# crm status
    ============
    Last updated: Thu Oct 28 17:30:01 2010
    Stack: openais
    Current DC: node2 - partition with quorum
    Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
    4 Nodes configured, 4 expected votes
    2 Resources configured.
    ============

    Node node1: standby
    Online: [ node2 node3 node4 ]

     Clone Set: StreamIP (unique)
         ClusterIP:0    (ocf::heartbeat:IPaddr2):    Stopped
         ClusterIP:1    (ocf::heartbeat:IPaddr2):    Stopped
         ClusterIP:2    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:3    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:4    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:5    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:6    (ocf::heartbeat:IPaddr2):    Started node4
         ClusterIP:7    (ocf::heartbeat:IPaddr2):    Started node4

    I found a solution (to prevent the loss) by setting clone-node-max to 3:

    [r...@node1 ~]# crm resource meta StreamIP set clone-node-max 3
    [r...@node1 ~]# crm status
    ============
    Last updated: Thu Oct 28 17:35:05 2010
    Stack: openais
    Current DC: node2 - partition with quorum
    Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
    4 Nodes configured, 4 expected votes
    2 Resources configured.
    ============

    Node node1: standby
    Online: [ node2 node3 node4 ]

     Clone Set: StreamIP (unique)
         ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:1    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:2    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:3    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:4    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:5    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:6    (ocf::heartbeat:IPaddr2):    Started node4
         ClusterIP:7    (ocf::heartbeat:IPaddr2):    Started node4

    The problem is that nothing changes when node1 comes back online:

    [r...@node1 ~]# crm node online node1
    [r...@node1 ~]# crm status
    ============
    Last updated: Thu Oct 28 17:37:43 2010
    Stack: openais
    Current DC: node2 - partition with quorum
    Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
    4 Nodes configured, 4 expected votes
    2 Resources configured.
    ============

    Online: [ node1 node2 node3 node4 ]

     Clone Set: StreamIP (unique)
         ClusterIP:0    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:1    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:2    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:3    (ocf::heartbeat:IPaddr2):    Started node2
         ClusterIP:4    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:5    (ocf::heartbeat:IPaddr2):    Started node3
         ClusterIP:6    (ocf::heartbeat:IPaddr2):    Started node4
         ClusterIP:7    (ocf::heartbeat:IPaddr2):    Started node4
    There is NO TRAFFIC on node1.
    If I set clone-node-max back to 2, all nodes revert to their
    original state.

    So, my question is: how can I avoid such "hand-made" changes (or is
    it possible to automate clone-node-max adjustments)?

    Thanks!

    You could use location constraints for the clones, something like:

    location StreamIP_0_node1 StreamIP:0 200: node1
    location StreamIP_0_node2 StreamIP:0 100: node2

    This way if node1 is up, it will run there, but if node1 fails it
    will move to node2. And if you don't define resource stickiness,
    when node1 comes back online, the resource migrates back to it.
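
    (Untested, but if you did NOT want that fail-back, stickiness could
    be defined cluster-wide, for example:

        crm configure rsc_defaults resource-stickiness=100

    and the clones would then stay where they are when node1 recovers.)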


I already tried to do so, but such a configuration does not seem to be accepted:

crm(live)configure# location location_marker_0 StreamIP:0 200: node1
crm(live)configure# commit
element rsc_location: Relax-NG validity error : Expecting an element rule, got nothing
element rsc_location: Relax-NG validity error : Element constraints has extra content: rsc_location
element configuration: Relax-NG validity error : Invalid sequence in interleave
element configuration: Relax-NG validity error : Element configuration failed to validate content
element cib: Relax-NG validity error : Element cib failed to validate content
crm_verify[20887]: 2010/10/29_16:00:21 ERROR: main: CIB did not pass DTD/schema validation
Errors found during check: config not valid

Here the issue is with the name of the resource in the location constraint: the resource is StreamIP, and it seems you cannot reference individual clone instances, only the parent clone. This is probably the expected behavior in this case.
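
For comparison, a constraint on the parent clone itself should pass validation (I haven't tested this here, and the constraint id is made up):

crm(live)configure# location StreamIP_prefers_node1 StreamIP 200: node1
crm(live)configure# commit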

Now you've got me thinking about how such a setup would work. The way I see it, there's probably a better way of doing this.

Create 8 clusterip resources, clusterip{1..8}.

For each pair of clusterip resources (1+2, 3+4, etc.), set a location constraint with a score of 2x on the preferred node (location clusterip1_on_node1 clusterip1 200: node1, location clusterip2_on_node1 clusterip2 200: node1) and six location constraints with a score of x for the other nodes, as sketched below.

This way, you have 2 clusterip resources always preferring one node, with failover to any of the other 3 available nodes if the current node fails. Failback happens when the node comes back online, due to the higher score preference for that node.
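
An untested sketch of the constraints for the first pair (the ids are made up, it assumes the clusterip1..clusterip8 primitives are already defined, and x=100):

# pair clusterip1 + clusterip2 prefers node1
location clusterip1_on_node1 clusterip1 200: node1
location clusterip1_on_node2 clusterip1 100: node2
location clusterip1_on_node3 clusterip1 100: node3
location clusterip1_on_node4 clusterip1 100: node4
location clusterip2_on_node1 clusterip2 200: node1
location clusterip2_on_node2 clusterip2 100: node2
location clusterip2_on_node3 clusterip2 100: node3
location clusterip2_on_node4 clusterip2 100: node4

The same pattern would repeat for the 3+4, 5+6 and 7+8 pairs, rotating the preferred node.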

I know this will result in a rather complex set of resources and constraints, so maybe someone has a better / simpler way of doing this.

Regards,

Dan


    I haven't tested this, but it should give you a general idea about
    how it could be implemented.

    Regards,

    Dan






--
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
