On Wed, Sep 17, 2003 at 08:28:36AM +0800, Aris  Santillan wrote:
> yeah ive tried doing scheduled mysqldump and scp but
> dumped files are very big to transfer, so i think that 
> doing a replication will only copy or update latest query
> 
> i also planned a scheduled uptime for the slave to minimize
> hassle on peak hours of database usage.

Here's how I do replication.  At a set time interval, the db table is
dumped, sorted by all keys.  Then I simply run a diff, and if it's the
same as the last dump, I remove the file and quit.  If it's different,
then I run a Perl script.  The Perl script knows which fields are keys
(taken together, the keys are guaranteed unique), and compares the files
line by line.  It generates a set of insertions, deletions, and changes.

The data starts out tab-delimited, so the script simply adds an "i",
"d", or "c" followed by a tab to the start of each record.  If it's a
delete, the script only includes the keys to save space.

The diff file is then gzip'd and sent to the slave side.  It's filename
is "table.upd.gz".  On the other side, it's unzipped and another Perl
script turns it into a set of sql statements which are then executed
against the database.

I have a few sizable data sets (30MB+) that I don't want to push across
a network connection each time.  Also, I have a web site running on the
slave side, so I can't just wipe out a table and reload it.  In that
case, the diff set works perfectly since only the rows that need to be
updated are touched.

One of my sites has 200,000 products listed on it.  The full data is
25MB or so, the daily diffs are generally around 150K, but under 30K
when compressed.  That's less than a second of their network bandwidth.
Even going over a modem that would be 10-15 seconds.  

The down side of doing it this way is the cpu time required to come up
with the diffs.  But it's the tradeoff between that and bandwidth, and
my feeling is that bandwidth is probably more precious in the
Philippines.

Let me know if you want my Perl script.  I plan to put it together for
download at some point.

Michael
-- 
Michael Darrin Chaney
[EMAIL PROTECTED]
http://www.michaelchaney.com/
--
Philippine Linux Users' Group (PLUG) Mailing List
[EMAIL PROTECTED] (#PLUG @ irc.free.net.ph)
Official Website: http://plug.linux.org.ph
Searchable Archives: http://marc.free.net.ph
.
To leave, go to http://lists.q-linux.com/mailman/listinfo/plug
.
Are you a Linux newbie? To join the newbie list, go to
http://lists.q-linux.com/mailman/listinfo/ph-linux-newbie

Reply via email to