On Wed, Sep 17, 2003 at 08:28:36AM +0800, Aris Santillan wrote: > yeah ive tried doing scheduled mysqldump and scp but > dumped files are very big to transfer, so i think that > doing a replication will only copy or update latest query > > i also planned a scheduled uptime for the slave to minimize > hassle on peak hours of database usage.
Here's how I do replication. At a set time interval, the db table is dumped, sorted by all keys. Then I simply run a diff, and if it's the same as the last dump, I remove the file and quit. If it's different, then I run a Perl script. The Perl script knows which fields are keys (taken together, the keys are guaranteed unique), and compares the files line by line. It generates a set of insertions, deletions, and changes. The data starts out tab-delimited, so the script simply adds an "i", "d", or "c" followed by a tab to the start of each record. If it's a delete, the script only includes the keys to save space. The diff file is then gzip'd and sent to the slave side. It's filename is "table.upd.gz". On the other side, it's unzipped and another Perl script turns it into a set of sql statements which are then executed against the database. I have a few sizable data sets (30MB+) that I don't want to push across a network connection each time. Also, I have a web site running on the slave side, so I can't just wipe out a table and reload it. In that case, the diff set works perfectly since only the rows that need to be updated are touched. One of my sites has 200,000 products listed on it. The full data is 25MB or so, the daily diffs are generally around 150K, but under 30K when compressed. That's less than a second of their network bandwidth. Even going over a modem that would be 10-15 seconds. The down side of doing it this way is the cpu time required to come up with the diffs. But it's the tradeoff between that and bandwidth, and my feeling is that bandwidth is probably more precious in the Philippines. Let me know if you want my Perl script. I plan to put it together for download at some point. Michael -- Michael Darrin Chaney [EMAIL PROTECTED] http://www.michaelchaney.com/ -- Philippine Linux Users' Group (PLUG) Mailing List [EMAIL PROTECTED] (#PLUG @ irc.free.net.ph) Official Website: http://plug.linux.org.ph Searchable Archives: http://marc.free.net.ph . To leave, go to http://lists.q-linux.com/mailman/listinfo/plug . Are you a Linux newbie? To join the newbie list, go to http://lists.q-linux.com/mailman/listinfo/ph-linux-newbie
