Re: [Dirvish] Dirvish jobs that run more than 24 hours

Keith Lofstrom Wed, 10 Dec 2008 09:22:31 -0800

Asheesh wrote:
> A problem with Keith's suggestion is that if any user at all is  
> running rsync, then the dirvish cron job will fail to start.

On Wed, Dec 10, 2008 at 10:34:29AM -0600, Richard wrote:
> Keith,  try this line on for size. (You'll have to substitute your vault 
> path or tree: path)
> 
> if ps ax | grep r\\sync.*$VAULTPATH > /dev/null ; then
> 
> 
> OK...well I'm off to do other things.  I *STILL* haven't found WHY that 
> grep statement works!

Keith responds:

I appreciate the contributions - for many if not most situations they
may work better, but not in my own case.  Put them on the wiki, too!

Dirvish is designed to be customized with four scripts:  pre-server,
pre-client, post-client, and post-server, which run before and after
each individual rsync run.  This moves complexity out of dirvish,
which is a Good Thing.

The pre-client script runs on the client, so it sees only the rsync
jobs running on that client.  A failed pre-client script stops only
the one rsync job associated with it - the rest of the
dirvish-spawned rsync jobs are unaffected.
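
As a rough illustration of that kind of pre-client check, here is a
minimal bash sketch.  It is not the script actually in use; the
function names rsync_running and preclient_check are made up for this
example, and the pattern only looks for a command whose first word is
rsync (with or without a leading path), so something like "grep rsync"
does not count as a running rsync job.

```shell
#!/bin/bash
# Hypothetical pre-client helper (names are assumptions, not dirvish API).

# Read a process listing on stdin and succeed if any command line starts
# with "rsync", either bare or with a leading path such as /usr/bin/rsync.
rsync_running() {
    grep -Eq '(^|/)rsync( |$)'
}

# Succeed when no rsync is running on this machine; otherwise complain
# on stderr and return non-zero so the caller can abort this one backup.
preclient_check() {
    if ps -eo args= | rsync_running; then
        echo "pre-client: rsync already running, skipping this backup" >&2
        return 1
    fi
    return 0
}
```

A pre-client script built on this would end with something like
"preclient_check || exit 1", so that only the rsync job tied to this
client is skipped.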

I schedule dirvish to run at around 2AM.  There is normally no reason
to run any other rsync jobs at that time - if there is a human running
rsync at that time, they probably don't want to be slowed down by
dirvish.  If there is a bandwidth limit to a particular client, there
is no advantage to running multiple rsync commands at once - a single
rsync already pipelines its transfers and uses whatever bandwidth is
available.  Two or more rsync jobs just slow each other down, thrash
disk and memory, and make completion time for both more uncertain.
Better to schedule them back-to-back, sequentially.

I do a lot with the following sequence:

1) Front wrapper script before dirvish runs.  I use this to mount 
disks and prepare failure counters and such.  About 100 lines of
bash.

2) Pre-server.  I don't do much with this beyond logging variables,
but it is a good place to check for disk space and wait for other
jobs to complete on the server.  A ping with an upper time limit
might be good here, if the pipe to the client is busy for other
reasons.

3) Pre-client.  I log variables from the client, and now check for
running rsync jobs.   This is a good place to set up stuff on the
client, perhaps mount drives, set up security, and lock out other
processes until dirvish is done.  I've also considered mounting 
the client's VMware guests, and backing up their virtual drives
via file sharing.

4) Dirvish/Rsync.   The basic operation, as simple and reliable
as possible.  Since some backups already run through end-to-end
VPN tunnels, I have considered turning off the ssh encryption that
rsync runs over for those.

5) Post-client. Reverse pre-client setups, and run df and fdisk on
the client to aid reconstruction of client disks.

6) Post-server.  A good place to parse results and increment failure
counters.  I also make a symlink to the tree of each branch when it
completes successfully.

7) Back wrapper script.  This is where I count failures, look at disk
usage, and summarize the results of all the runs.  I also unmount 
backup drives and turn off buses and controllers, so the drives are
isolated and can be swapped into the fireproof safe or moved offsite.
I am still using PATA backup drives, but they are connected to the
CPU through PATA/SATA adapters, because SATA hotswap is well supported
in 2.6.XX kernels.  Another 100 or so lines of bash.
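
The outer shape of that sequence can be sketched in a few lines of
bash.  This is only a skeleton under stated assumptions, not the
100-line wrappers described above: the mount point /backup, the
DRY_RUN switch, and the function names run and nightly_backup are
invented for the example, and "mount /backup" assumes an fstab entry
for the backup drive.  dirvish-runall is the stock command that runs
the configured vaults, which in turn call the four hook scripts.

```shell
#!/bin/bash
# Hypothetical wrapper skeleton; paths and names are assumptions.
BACKUP_MOUNT=${BACKUP_MOUNT:-/backup}
DRY_RUN=${DRY_RUN:-1}

# In dry-run mode just print the command, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

nightly_backup() {
    # Front wrapper: bring the backup disk online, give up if that fails.
    run mount "$BACKUP_MOUNT" || return 1
    # Dirvish itself, including the pre- and post- hook scripts.
    run dirvish-runall
    local status=$?
    # Back wrapper: report disk usage, then isolate the drive again.
    run df -h "$BACKUP_MOUNT"
    run umount "$BACKUP_MOUNT"
    return "$status"
}
```

With DRY_RUN=1 (the default here) calling nightly_backup only prints
the four commands in order; setting DRY_RUN=0 would actually run them.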

So my tendency is to leave dirvish alone, and do the complex and
situation-specific stuff with relatively simple pre- and post- scripts. 
I should post the scripts I use to the wiki.  Real Soon Now.
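
To show how small such pre- and post- scripts can stay, here is a
hedged sketch of two helpers: a disk-space check of the sort a
pre-server script might use, and a failure counter of the sort a
post-server script might bump.  These are not the scripts mentioned
above; the function names, the counter file, and the thresholds are
all assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical helpers for pre-server / post-server scripts.

# Print the available kilobytes on the filesystem holding directory $1,
# using the POSIX single-line df output format.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if at least $2 kilobytes are free under directory $1.
enough_space() {
    local have
    have=$(avail_kb "$1") || return 1
    [ -n "$have" ] && [ "$have" -ge "$2" ]
}

# Add one to the failure count stored in file $1 and print the new count.
# A missing file counts as zero.
bump_failures() {
    local file=$1 count=0
    [ -r "$file" ] && count=$(cat "$file")
    count=$((count + 1))
    echo "$count" > "$file"
    echo "$count"
}
```

A pre-server script could then do something like
"enough_space /backup 10000000 || exit 1", and a post-server script
could call bump_failures on a counter file whenever the run did not
succeed.  As far as I recall, the dirvish.conf man page says dirvish
reports the outcome to post scripts in the DIRVISH_STATUS environment
variable, but check that against your installed version.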

Keith

-- 
Keith Lofstrom          [EMAIL PROTECTED]         Voice (503)-520-1993
KLIC --- Keith Lofstrom Integrated Circuits --- "Your Ideas in Silicon"
Design Contracting in Bipolar and CMOS - Analog, Digital, and Scan ICs
