Hi,

On 11-09-21 09:39 AM, Dejan Muhamedagic wrote:
> Hi,
>
> On Wed, Sep 21, 2011 at 09:14:45AM -0400, Yves Trudeau wrote:
>> Hi,
>>
>> On Tue, 2011-09-20 at 18:06 +0200, Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Tue, Sep 20, 2011 at 11:16:39AM -0400, Yves Trudeau wrote:
>>>> Hi,
>>>> the whole resource agent script is available here:
>>>>
>>>> https://code.launchpad.net/percona-prm
>>> Is that in any way related to the existing mysql RA?
>> It is not. The existing MySQL is inadequate for replication and
>> unusable with any non-trivial load.
> Isn't it that the existing RA also does replication? If it's no
> good, that should be discussed and fixed. Or at least we should
> try to do that :)

I'll give you some absolute requirements that make the current mysql RA
unattractive:

- If a slave lags too much, it must be kept running and the reader VIPs
  removed from it. If no slaves are left, the reader VIPs must be on the
  master. The current agent only sets master_score.
- The script assumes that master_log_file and master_pos are consistent
  across servers. That is not the case: after a promotion, the new
  master's master_log_file and position must be provided.
- Upon demote, existing connections to the MySQL instance must be killed.
- After promotion of a new master, slaves must be allowed to finish
  applying their relay logs before the change to the new master is
  applied.

I believe the MySQL resource and the replication should be considered
distinct since they are very different. That's why I am leaning toward a
distinct type of resource. I agree that they could be merged at some point
if it is interesting to do so.

There is currently no good, solid HA solution using MySQL replication, and
I believe Pacemaker has what is needed to deliver it. People right now use
tools like MMM, which are very far from perfect but do meet the
requirements above. I _want_ a solution that works. If you can show me
that the current RA can do that with few changes, I'll be glad to help.
Until then, I'll develop my mysql_replication RA, which already works but
still has a few rough edges.

Regards,

Yves

>> I had a discussion with Florian a
>> few months ago about it. Basically if a slave lags behind, killing it
>> will do no good. The RA I am writing deal _only_ with replication and
>> the associated logic.
> You mean it doesn't support any other mode of operation?
>
>>>> In order to make thing easier to follow I added the return codes of the
>>>> agent to the lrmd log.
>>>>
>>>> Sep 15 16:54:08 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:6: probe
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:09 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:7: promote
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:12: demote
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:14: demote
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:15: stop
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:19: start
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:14 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:20: promote
>>>>
>>>> + exit 0
>>>>
>>>> Sep 15 16:54:17 testvirtbox1 lrmd: [30902]: info:
>>>> rsc:p_MySQL_replication:0:25: monitor
>>>>
>>>> + exit 8
>>>>
>>>>
>>>> What I don't understand is why there is no "monitor" call after the
>>>> first promote at 16:54:09.
>>> There is and it's called probe. Probe is a monitor with interval
>>> set to 0.
>> I know about probe... If you read my question, I am asking why there no
>> monitor _after_ the promote. probe is the first method on the script
>> after pacemaker start. Basically I want to know why there is a demote
>> after the promote when the promote returned success.
> Sorry, misread your question. Yes, the sequence of actions looks
> really strange. Cannot offer any explanation unless there was
> some event which further influenced the placement, but that's
> not very likely in such a short timeframe.
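
To make the sequence in the excerpt above easier to read, here is a minimal
Python sketch that pairs each lrmd operation with the exit code traced right
after it and names the code using the standard OCF return values. The names
parse_ops and describe are made up for this illustration, and the sample
assumes the wrapped log lines have been joined back onto one line each. It
only summarizes the log; it does not explain the demote.

```python
import re

# Standard OCF return codes (subset relevant to a master/slave agent).
OCF_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    7: "OCF_NOT_RUNNING",
    8: "OCF_RUNNING_MASTER",
    9: "OCF_FAILED_MASTER",
}

# Log excerpt from this thread, with each wrapped line re-joined.
LOG = """\
Sep 15 16:54:08 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:6: probe
+ exit 0
Sep 15 16:54:09 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:7: promote
+ exit 0
Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:12: demote
+ exit 0
Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:14: demote
+ exit 0
Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:15: stop
+ exit 0
Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:19: start
+ exit 0
Sep 15 16:54:14 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:20: promote
+ exit 0
Sep 15 16:54:17 testvirtbox1 lrmd: [30902]: info: rsc:p_MySQL_replication:0:25: monitor
+ exit 8
"""

# "<Mon> <day> <HH:MM:SS> ... rsc:<resource>:<call id>: <operation>"
OP_RE = re.compile(
    r"^\w{3}\s+\d+\s+(?P<time>\d\d:\d\d:\d\d)\s.*\brsc:\S+:\d+: (?P<op>\w+)\s*$"
)
# Shell trace line added by the agent: "+ exit <code>"
EXIT_RE = re.compile(r"^\+ exit (?P<rc>\d+)\s*$")


def parse_ops(log):
    """Return a list of (time, operation, exit_code) tuples in log order."""
    ops = []
    for line in log.splitlines():
        m = OP_RE.match(line)
        if m:
            ops.append([m.group("time"), m.group("op"), None])
            continue
        m = EXIT_RE.match(line)
        # Attach the exit code to the most recent operation still missing one.
        if m and ops and ops[-1][2] is None:
            ops[-1][2] = int(m.group("rc"))
    return [tuple(op) for op in ops]


def describe(op):
    """Format one parsed operation with the symbolic OCF code name."""
    time, name, rc = op
    return f"{time} {name:<8} -> {rc} ({OCF_CODES.get(rc, 'unknown')})"


if __name__ == "__main__":
    for op in parse_ops(LOG):
        print(describe(op))
```

Read this way, every action up to the second promote returns OCF_SUCCESS,
and the closing monitor returns 8, which is OCF_RUNNING_MASTER. For a
master/slave resource, a probe that returns 0 reports the instance as
already running in the slave role.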
>
> Thanks,
>
> Dejan
>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>> Regards,
>>>>
>>>> Yves
>>>>
>>>>
>>>> On Tue, 2011-09-20 at 17:27 +0300, Dan Frincu wrote:
>>>>> Hi,
>>>>>
>>>>> On Tue, Sep 20, 2011 at 4:36 PM, Yves Trudeau<[email protected]>
>>>>> wrote:
>>>>>> Hi,
>>>>>> I am currently developing a master-slave resource agent to handle
>>>>>> MySQL replication in a sane way. So far, the resource agent works
>>>>>> relatively well but I have this strange behavior when promoting a node.
>>>>>> The excerpt below is when a single node is started, look at the promote
>>>>>> -> demote -> promote sequence. From the trace of my resource agent
>>>>>> script, evertything seems alright regarding returns code. Any idea why
>>>>>> this behavior.
>>>>>>
>>>>> Without the actual resource agent I'd say it's easier to speculate and
>>>>> harder to troubleshoot.
>>>>>
>>>>>> Sep 15 16:54:08 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:6: probe
>>>>>> Sep 15 16:54:09 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:7: promote
>>>>>> Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:12: demote
>>>>>> Sep 15 16:54:12 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:14: demote
>>>>>> Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:15: stop
>>>>>> Sep 15 16:54:13 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:19: start
>>>>>> Sep 15 16:54:14 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:20: promote
>>>>>> Sep 15 16:54:17 testvirtbox1 lrmd: [30902]: info:
>>>>>> rsc:p_MySQL_replication:0:25: monitor
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Yves
>>>>>>
>>>>>> _______________________________________________
>>>>>> Linux-HA mailing list
>>>>>> [email protected]
>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>>> See also: http://linux-ha.org/ReportingProblems
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Dan Frincu
>>>>> CCNA, RHCE
