Re: master-slave replication sync problems.

Shawn Green (MySQL) Thu, 26 Aug 2010 18:02:58 -0700

Hello List,

On 8/26/2010 3:00 PM, Daevid Vincent wrote:

ssh to the slave
mysql -uroot -pPASSWORD -P3306 -hlocalhost


show slave status\G

If the Slave IO is NOT Running, but SQL is, then simply try to restart the
slave...

*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.10.10.45
                Master_User: slave
                Master_Port: 3306
                          ...
           Slave_IO_Running: No
          Slave_SQL_Running: Yes

start slave; show slave status\G

Otherwise if it's a replication issue such as this, then you must skip over
the bad SQL queries one at a time till the slave syncs.

*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: 10.10.10.41
                Master_User: slave
                Master_Port: 3306
                          ...
           Slave_IO_Running: Yes
          Slave_SQL_Running: No
            Replicate_Do_DB: agis_core_2008
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 1061
                 Last_Error: Error 'Duplicate key name 'id_operator'' on
query. ....

The SQL statement will give you an idea of where the master and slave went
askew. If these are recent commands you did you can guess as to how much to
increment the SKIP_COUNTER below, otherwise, you just have to do it one at
a time until they sync again.

mysql> stop slave; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; start slave; show
slave status\G

Repeat the above statements over and over until you see two YES rows.

*************************** 1. row ***************************

           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes

-----Original Message-----

From: Norman Khine [mailto:nor...@khine.net]Sent: Thursday, August 26, 2010 6:05 AM

To: mysql@lists.mysql.com
Subject: master-slave replication sync problems.

hello,
i have a working master-slave replication, the problem i find is that
if i restart the MASTER there is a difference in the MASTER_LOG_FILE
and MASTER_LOG_POS on the SLAVE.

what is the correct way to keep the two slaves in sync even after i
restart the server. the way i do it now is to:

[MASTER]
mysql> show master status;

+------------------+----------+----------------------+--------
------------------+
| mysql-bin.000010 | 13405429 | upgrade,tracker,bugs |
mysql,information_schema |
+------------------+----------+----------------------+--------
------------------+
1 row in set (0.00 sec)

[SLAVE]
mysql> stop slave;
mysql> change master to MASTER_HOST='master.domain.com',
MASTER_USER='replicator', MASTER_PASSWORD='xxx',
MASTER_LOG_FILE='mysql-bin.000010', MASTER_LOG_POS=13301215;
mysql> start slave;

is this correct or is there a better way to do this?

thanks

To me, it appears that many of you are not fully versed in the theory ofoperations for how MySQL replication actually functions. Granted, thereare two formats for replication (ROW and STATEMENT) but the generalprocess remains the same. Here is a nutshell summary of the process.


** on the master **

m1) The MySQL master is instructed to change some data.

m2) The data is changed and the results are committed to disk. If yourollback the changes before you commit them, then there is nothing toreplicate. Only the InnoDB engine supports this type of rollback.


m3) The change committed in step 2 is written to the binary log

(repeat from step m1 until the Master is shutdown)


** on the slave - the IO thread **

(assuming that the slave is already configured with compatible data, astarting position, and the proper credentials to act as a slave)

o1) The SLAVE IO thread requests information from the master's binarylogs. This information is identified by a file name and a byte offsetfrom the start of that file

o2) The SLAVE IO thread copies all available information from themaster's binary logs into a local copy of those logs known as the relaylogs.

(repeat from o1 until the SLAVE IO thread is stopped(by error or bycommand) or the slave is shutdown)


** on the slave - the SQL thread **

s1) Once an unapplied change has been completely buffered into the relaylogs, the SLAVE SQL thread attempts to apply the change to the slave'sdata.

s2) If LOG SLAVE UPDATES is enabled, copy the applied change (using thecorrect format) into the slave's binary log.

(repeat from s1 until the SLAVE SQL thread is stopped (by error or bycommand) or the slave is shutdown)

**

As you can tell by this very simplified process description, there is noattempt to rectify one dataset to the other. Replication operates underthe principle that if you perform identical changes to identical sets ofdata, you will end up with identical end results.

Various replication "filters" can omit certain changes from either beingreplicated to or processed by the slave instance. Use these WITH EXTREMECAUTION as they can very easily create situations where the master andslave datasets diverge to the point that a rebuild of the slave data isthe only effective repair.

**

When you get a replication error, such as DUPLICATE KEY, it is theadministrator's responsibility to figure out why and fix it. Maybe someuser on the slave added an extra row to a table? Maybe an INSERT ...SELECT added more rows on the slave than it did on the master? Maybe aDELETE on the master removed less rows than it did on the slave?

Simply skipping those problems (and others) without investigation orcorrection may allow the two datasets (master and slave) to diverge evenmore. The human administrator is required to make a judgment call onwhich version of the conflicting row is the "correct" version. Is is theone on the table or the change coming in from the binary log? Maybe thedata on the slave is "correct" but it has been offset by a few rowsinserted a long time ago.

It's your responsibility to understand and appropriately respond to theerrors not just repeat scripted actions until the problems disappeartemporarily.

--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc.
Office: Blountville, TN

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Re: master-slave replication sync problems.

Reply via email to