Hi, > Someone please clear things up for me a little!
Here is a try :) > Seems the data exceeded the max_allowed_packet size which is 1M on master and > slaves so all slaves stopped replicating. > > Question is, why did this happen and how could this have been avoided? > Should max_allowed_packet also limit the LOAD DATA INFILE size so that it > would stop and warn on the master instead of on the slaves...? There is no constraint like "the loaded file must be < max_allowed_packet". You can load a GBs file. What happens is that this file is written in chunks into the master's binary log. If you do mysqlbinlog on the binlog you'll see something like: LOAD DATA INFILE .... # Append_block ... # Append_block ... # Append_block ... (one Append_block per chunk) # Append_block ... # Execute_load When the slave sees the LOAD DATA INFILE it knows it creates a temporary file. Then every time it sees an Append_block, it copies the file block it contains into the temp file. Finally when it sees the execute load, it loads the temp file into the table. Each chunk is rather small (1MB or slightly more). When you set SQL_SLAVE_SKIP_COUNTER, each chunk counts for 1. Only mysqlbinlog can show you how many chunks there are and so what the right value for SQL_SLAVE_SKIP_COUNTER is (it is at least 2; if file is small there is no Append_block; so then it's the LOAD line plus the Execute_load line = 2). Why this logging in chunks instead of one big file? Concurrency. If we were to write 500 MB to the binary log in one chunk, it would lock the binary log for the time to write 500 MB; during this, all other updates would be stalled (as they need to write to the binary log too). > Because the data loaded wasn't important I did not care much to find a better > solution... I just wanted to skip the errors and let mysql get on with it... > > After I got one slave running I did the following on the rest of the slaves: > SLAVE STOP; > SET GLOBAL max_allowed_packet=10000000; > SET GLOBAL SQL_SLAVE_SKIP_COUNTER=2; > SLAVE START; > I could probably have changed position with CHANGE MASTER TO instead... > (finding the proper "good" position by running mysqlbinlog on the master > logs) > > *** Events description follows: > > 1. LOAD DATA INFILE statement replicated to slaves. The CSV file was larger > than max_allowed_packet: > > Error reading packet from server: log event entry exceeded max_allowed_packet; > Increase max_allowed_packet on master (server_errno=1236) > Got fatal error 1236: 'log event entry exceeded max_allowed_packet; Increase > max_allowed_packet on master' from master when reading data from binary log > Slave I/O thread exiting, read up to log 'george-bin.3794', position 241147667 > Slave I/O thread: connected to master '[EMAIL PROTECTED]:3306\', replication > started in log 'george-bin.3794' at position 241147667 > > 2. max_allowed_packet was increased on the MASTER with the following command: > MASTER: SET GLOBAL max_allowed_packet=10475220 (10M) > Now the complaint is because of the slave's max_allowed_packet instead: > > Slave I/O thread: connected to master '[EMAIL PROTECTED]:3306', replication > started in log 'george-bin.3794' at position 241147667 > Error reading packet from server: Packet too large - increase > max_allowed_packet on this server (server_errno=1153) > Log entry on master is longer than max_allowed_packet (1047552) on slave. If > the entry is correct, restart the server with a higher value of > max_allowed_packet > Slave I/O thread exiting, read up to log 'george-bin.3794', position 241147667 yes, both master and slave need the packet to fit, as they both use the packet in network communication. > 3. So I increase the max_allowed_packet on the slave as well: > SLAVE: SET GLOBAL max_allowed_packet=10475220 > This time when I do the SLAVE START it messes up some more: > > Slave I/O thread: connected to master '[EMAIL PROTECTED]:3306', replication > started in log 'george-bin.3794' at position 241147667 > Error in Log_event::read_log_event(): 'Event too big', data_len: 1069482, > event_type: 8 > > --- Why does it say Event too big? max_allowed_packet should now have been > changed to 10M, data_len above == 1.1M. > --- Or did the replication SQL thread fail to understand the new > max_allowed_packet I set? Yes. When you do a SET GLOBAL it affects only new threads. When you did the SET GLOBAL, slave I/O thread did not exist (it had terminated because of the "Error reading packet from server: Packet too large". BUT slave SQL thread did exist (it had no reason to terminate; there was nothing in the relay log but hey that could be normal). Then after the SET you did START SLAVE. This started only what needed to be started, i.e. the slave I/O thread. So, the slave SQL thread was the one which already existed before the SET. So it did not take into account the SET. You would need to STOP SLAVE; START SLAVE;. > 4. So the next step SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; followed by another > SLAVE START: > > Slave: Error in Exec_load event: could not open file > '/tmp/SQL_LOAD-30020-10061-1.info', Error_code: 2 > Error running query, slave SQL thread aborted. Fix the problem, and restart > the slave SQL thread with "SLAVE START". We stopped at log 'george-bin.3794' > position 242 > 217149 This new SET took the slave at the beginning of the Execute_load probably. So it tried to open the temp file and failed (as the creation of it had been skipped by previous SET). > 5. another set global SQL_SLAVE_SKIP_COUNTER=1 and it was > replicating again! Slave was then after the Execute_load. I fully agree with you that this is all very complicated. I will shortly put a note about the "surprise" with LOAD DATA / SQL_SLAVE_SKIP_COUNTER / max_allowed_packet into the doc. To conclude I recommend, next time you have to work things out: - mysqlbinlog - then SQL_SLAVE_SKIP_COUNTER (or CHANGE MASTER) Regards, Guilhem -- __ ___ ___ ____ __ / |/ /_ __/ __/ __ \/ / Mr. Guilhem Bichot <[EMAIL PROTECTED]> / /|_/ / // /\ \/ /_/ / /__ MySQL AB, Full-Time Software Developer /_/ /_/\_, /___/\___\_\___/ Bordeaux, France <___/ www.mysql.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]