> > Thanks. If possible, can we have:
> >
> > - The exact INSERT statements issued by the MySQL consumer
> > - The UUID values generated for those records
>
> I'll try to get them, sure.
>
> > I followed the master-slave replication lag for some hours, and perceived a
> > pattern in the lag: It gets progressively bigger with time, more or less
> > with a 10 minute increase per hour, reaching lags of 1 to 2 hours. At that
> > point, the data gap happens and the replication lag goes back to few minutes
> > lag. I could only catch a data gap "live" 2 times, so that's definitely not
> > a conclusive statement. But, there's this hypothesis that the two problems
> > are related.
>
> Just for clarity, may I ask how are you testing this?

1) To identify the data gaps I used:

    select left(timestamp, 11), count(*)
    from Edit_11448630
    where timestamp >= '20150415000000' and timestamp < '20150416000000'
    group by 1;

Note that the table name and the timestamps can be adapted as necessary.
This query returns something like:

+---------------------+----------+
| left(timestamp, 11) | count(*) |
+---------------------+----------+
| 20150415000         |     9823 |
| 20150415001         |    10158 |
| 20150415002         |     9473 |
| 20150415003         |     9493 |
| 20150415004         |     9297 |
| 20150415005         |     9390 |
| 20150415010         |     9849 |
| 20150415011         |     9619 |
| 20150415012         |    10038 |
| 20150415013         |     9763 |
| 20150415014         |     9750 |
| 20150415015         |     9633 |
| ...                 |      ... |
+---------------------+----------+

Which lists the number of events existing for each 10-minute slot.
When there's a data gap, the result of the query looks like this:

+---------------------+----------+
| left(timestamp, 11) | count(*) |
+---------------------+----------+
| ...                 |      ... |
| 20150415150         |    21237 |
| 20150415151         |    20677 |
| 20150415152         |    20541 |
| 20150415153         |    19671 |
| 20150415154         |    19623 |
| 20150415155         |    19281 |
| 20150415160         |    19243 |
| 20150415161         |     5708 | <= Gap: 16:20h and 16:30h have no data!
| 20150415164         |    11590 |
| 20150415165         |    18745 |
| ...                 |      ... |
+---------------------+----------+

2) To get the master-slave replication lag I used:

    select timestamp from Edit_11448630 order by 1 desc limit 1;

Again, the table name can be substituted. This gives me, supposedly, the
timestamp of the last inserted event. Comparing that with the current time,
I get the lag.

3) To correlate both, I just happened to be monitoring the progressively
increasing replication lag, and after noticing an abrupt recovery of the
latter, I checked and found a data gap.
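
In case it helps to automate checks 1) and 2), here is a minimal Python sketch of how the query results could be post-processed. It is only an illustration, not something that was run against the database: the function names (slot_to_datetime, find_missing_slots, replication_lag) are hypothetical, and it assumes the slots are the 11-character prefixes returned by left(timestamp, 11), that timestamps are 14-digit YYYYMMDDHHMMSS strings, and that "now" is supplied in the same time zone as the stored timestamps.

```python
from datetime import datetime, timedelta

# An 11-character slot plus a trailing "0" gives a full YYYYMMDDHHMM value.
SLOT_FORMAT = "%Y%m%d%H%M"


def slot_to_datetime(slot):
    """Convert a slot such as '20150415161' to the start time of that slot."""
    return datetime.strptime(slot + "0", SLOT_FORMAT)


def find_missing_slots(slots):
    """Return the 10-minute slots absent between the first and last slot seen."""
    present = set(slots)
    if not present:
        return []
    current = slot_to_datetime(min(present))
    end = slot_to_datetime(max(present))
    missing = []
    while current <= end:
        key = current.strftime(SLOT_FORMAT)[:11]
        if key not in present:
            missing.append(key)
        current += timedelta(minutes=10)
    return missing


def replication_lag(last_timestamp, now):
    """Lag between the newest event timestamp (14 digits) and the given 'now'."""
    return now - datetime.strptime(last_timestamp, "%Y%m%d%H%M%S")


# Slots taken from the second result table above.
slots = ["20150415160", "20150415161", "20150415164", "20150415165"]
print(find_missing_slots(slots))  # ['20150415162', '20150415163']
```

Note that this only flags slots with no rows at all. A partially filled slot (like 16:10h above, with 5708 events) would need a threshold on count(*), which is left out here.
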

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
