Some problems with rsync

2003-06-22 Thread Jens Trach
We use rsync to synchronize some folders from our server inside the intranet,
behind a firewall and router, to the internet.

Altogether we sync 23 folders; for example:
# Case 1
rsync -e "ssh -i /path/keyfile -1" -auzP --bwlimit=8 '/path/directory_1' '[EMAIL PROTECTED]:/path/directory_1/'
# Case 2
rsync -e "ssh -i /path/keyfile -1" -auzP --bwlimit=8 '/path/directory_2' '[EMAIL PROTECTED]:/path/directory_2/'

In 12 cases it works as expected: if I delete the folder on the internet
server, rsync creates it again and uploads the files recursively.
But in 11 cases it creates a folder named ^M, inside that folder the
directory which I want to transfer, and inside that the files, recursively.

At first I thought it was because some of the folders are empty, but even
after I created a file named empty.txt inside them, the problem remained.

Why does rsync create these ^M folders?

Everything runs from a batch script which is started by cron, and the
folder names on the source side are typed in correctly.
The keyfile is the same for all transfers and works, because rsync
creates the folders on the target.
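One plausible cause, guessed from the symptom rather than confirmed by the post (^M is a carriage-return character): if the batch script was last edited on a Windows machine it may carry CRLF line endings, and the trailing carriage return becomes part of a path argument. A minimal reproduction, with invented paths:

```shell
# Simulate one line of a batch script saved with DOS (CRLF) line endings:
# the trailing carriage return survives word splitting and ends up in the
# directory name. (Paths here are invented for the demo.)
printf 'mkdir -p /tmp/crlf_demo\r\n' > /tmp/crlf_test.sh
sh /tmp/crlf_test.sh
ls -b /tmp | grep crlf_demo   # the directory name ends in \r, i.e. ^M
```

If this is the cause, converting the script to Unix line endings (e.g. with dos2unix) should make the ^M directories stop appearing.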
Please excuse my bad English; I am German, and English is not my mother
tongue.

Can you help me, please?


Greetings from Germany,
Jens Trach
[EMAIL PROTECTED]
-- 
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


BackupPC 2.0.0 released (backup to disk for WinXX/Linux/Unix)

2003-06-22 Thread cbarratt
BackupPC version 2.0.0 has been released on SourceForge; see
http://backuppc.sourceforge.net.  New features include support
for rsync/rsyncd and internationalization of the CGI interface
(including English, French, Spanish and German).

BackupPC is a high-performance Perl-based package for backing up Linux,
Unix, or WinXX PCs and laptops to a server's disk.  BackupPC is highly
configurable and easy to install and maintain.  SMB (via smbclient),
tar over rsh/ssh, or rsync/rsyncd is used to extract client data.

Given the ever decreasing cost of disks and raid systems, it is now
practical and cost effective to backup a large number of machines onto
a server's local disk or network storage.  This is what BackupPC does.

Key features are pooling of identical files (big savings in server disk
space), compression, and a comprehensive CGI interface that allows users
to browse backups and restore files.
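The pooling idea can be illustrated with a toy sketch (this is not BackupPC's actual implementation; the paths and the choice of md5sum are invented for the demo): keep one stored copy per unique content digest and hardlink files to it.

```shell
# Toy illustration of file pooling (not BackupPC's real scheme): identical
# contents hash to the same digest and share one pooled copy via hardlink.
mkdir -p /tmp/pool_demo/pool
echo "same content" > /tmp/pool_demo/a.txt
echo "same content" > /tmp/pool_demo/b.txt
for f in /tmp/pool_demo/a.txt /tmp/pool_demo/b.txt; do
    d=$(md5sum "$f" | cut -d' ' -f1)          # digest identifies the content
    [ -e "/tmp/pool_demo/pool/$d" ] || ln "$f" "/tmp/pool_demo/pool/$d"
done
ls /tmp/pool_demo/pool | wc -l                # 1: both files share one entry
```

With many hosts backing up largely identical system files, this is where the "big savings in server disk space" come from.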

BackupPC is free software distributed under a GNU GPL license.
BackupPC runs on linux/unix/freenix servers, and has been tested
on linux, unix, Win95, Win98, Win2000, WinXP and Mac OSX clients.

Enjoy!

Craig Barratt


Re: Installing rsync as a service on Windows 2000.

2003-06-22 Thread cbarratt
> An updated version of the "Rsync on NT" document is now available at:
> http://www.tiarnan.phlegethon.org/rsyncntdoc.html

For those of us without the benefit of the M$ Windows NT or 2000
Resource Kit, cygwin's own cygrunsrv does an equivalent (or better)
job than the M$ instsrv/srvany.

An example for installing rsyncd as a WinXX service (assuming rsync.exe and
rsyncd.conf are in c:\rsyncd):

cygrunsrv.exe -I "RSYNC" -p c:\rsyncd\rsync.exe -a "--config=c:\rsyncd\rsyncd.conf --daemon --no-detach"

Then just use the Services panel to start the rsync service.  No need for
registry edits for the command-line arguments.

Craig


Re: rsync backup performance question

2003-06-22 Thread jw schultz
On Sun, Jun 22, 2003 at 05:39:49PM +0200, Ron Arts wrote:
> jw schultz wrote:
> 
> >
> [snip.. and thanks for all your comments]
> >
> >
> >Rsync doesn't perform well on non-local filesystems.
> >
> 
> Really? Won't gigabit ethernet help for NFS, or maybe
> Samba? I only have to rsync a relatively
> low number of files, so no large directory scans.

Certainly the faster the link to the fileserver the better.
A gigabit link to optimised disk arrays can be faster than
local disk.  Just be aware that rsync is a cache thrasher
and will propagate that effect to the fileserver or SAN box,
possibly impacting other processes.


-- 

J.W. Schultz            Pegasystems Technologies
email address:  [EMAIL PROTECTED]

Remember Cernan and Schmitt


Re: various "rsync: connection unexpectedly closed" errors on debian

2003-06-22 Thread jw schultz
On Sun, Jun 22, 2003 at 11:12:56PM +0800, [EMAIL PROTECTED] wrote:
> 
> >>When I sync'ed against other servers, my rsync client process always 
> >>ran without --compress.  Nevertheless, I've tried increasing 
> >>verbosity by 3, and this is what I get:
> >
> >Check the rsync log file. That is where the error message will be.
> 
> 
> Hi.  Okay, on the server I am downloading from, it shows the following 
> (doesn't seem very detailed, if more details are needed, how do I ask 
> the admin to increase it?):
> 
>2003/06/21 18:07:03 [2020] rsync: name lookup failed for
>203.208.246.57: Name or service not known
>2003/06/21 18:07:04 [2020] rsync on gentoo-portage/ from UNKNOWN
>(203.208.246.57)
>2003/06/21 18:12:16 [2020] rsync error: timeout in data send/receive
>(code 30) at io.c(103)

Ahh, there it is.  A timeout.  You must be overriding the
default and setting it too short for the work you want done.

> 
> Whereas on my client, it shows something like this (this clip not taken 
> from the same session, but likely to be similar).
> 
>[snip]
>x11-wm/xpde/xpde-0.3.0.ebuild is uptodate
>generate_files phase=1
>rsync: connection unexpectedly closed (1657342 bytes read so far)
>rsync error: error in rsync protocol data stream (code 12) at io.c(165)
>_exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)
>rsync: connection unexpectedly closed (1493478 bytes read so far)
>rsync error: error in rsync protocol data stream (code 12) at io.c(165)
>_exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)
> 
> Not being much of a programmer, I can only hazard a guess that something 
> in io.c on the server and client don't agree, possibly due to a mistake 
> on my part.  (The server in this case ran "/usr/bin/rsync --daemon 
> --no-detach --safe-links --compress --timeout=1800", but I definitely 
> ran without -z or --compress on my client, so compression should not 
> have been enabled, right?)

--timeout=1800

Evidently 1800 seconds (30 minutes) isn't long enough.

-- 

J.W. Schultz            Pegasystems Technologies
email address:  [EMAIL PROTECTED]

Remember Cernan and Schmitt


Installing rsync as a service on Windows 2000.

2003-06-22 Thread Tiarnan DeBurca

I was looking to install rsync on a Win2k server. The document telling you how
to do this (available at: http://samba.org/rsync/nt.html) is fairly out of date
and has a couple of mistakes.

An updated version of the "Rsync on NT" document is now available at:
http://www.tiarnan.phlegethon.org/rsyncntdoc.html

I think I've gotten rid of the mistakes, but if I've missed anything, feel
free to point it out.
Thanks; your application got me out of a serious hole.

Tiarnán de Burca,
Network Admin,
2003 Special Olympic World Summer Games.


**
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please 
notify the system manager.
**



Re: any way to get --one-file-system in rsyncd.conf?

2003-06-22 Thread Phil Howard
On Sun, Jun 22, 2003 at 10:34:17AM -0700, [EMAIL PROTECTED] wrote:

| > I would like to specify an entry in /etc/rsyncd.conf such that it
| > operates on a --one-file-system basis always.  The path will point
| > to a filesystem mount point, but there is another filesystem that
| > is mounted in a subdirectory.  I want to back up only those files
| > in the pointed to filesystem, and not the one mounted within (in
| > that run, anyway).  I do not see such an option in man rsyncd.conf.
| > Is there an undocumented one available?
| 
| I don't think so.  But an alternative is to use the exclude option in
| rsyncd.conf to exclude any mount points.  There are some caveats -
| see the man page.

Thanks for the response.

I'll probably avoid the exclude and just use a separate set of bind
mounts of the same filesystems in a non-overlapping way.  I was hoping
to cleanly avoid that, but bind mounts are reasonably clean even if
they do clutter /etc/mtab a bit.  Since I'm doing this on Linux, this
is an option.  I'm not sure what my options will be on other systems
if/when I need to run those.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-


Re: any way to get --one-file-system in rsyncd.conf?

2003-06-22 Thread cbarratt
> I would like to specify an entry in /etc/rsyncd.conf such that it
> operates on a --one-file-system basis always.  The path will point
> to a filesystem mount point, but there is another filesystem that
> is mounted in a subdirectory.  I want to back up only those files
> in the pointed to filesystem, and not the one mounted within (in
> that run, anyway).  I do not see such an option in man rsyncd.conf.
> Is there an undocumented one available?

I don't think so.  But an alternative is to use the exclude option in
rsyncd.conf to exclude any mount points.  There are some caveats -
see the man page.
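For the archives, a sketch of how that might look in rsyncd.conf (the module name and the excluded paths are invented for the example; exclude is the standard daemon option, with the man-page caveats Craig mentions):

```
# Hypothetical module: serve / but skip other mounted filesystems.
[rootfs]
    path = /
    read only = yes
    # the excluded directories are mount points on this example host
    exclude = /proc/ /mnt/ /home/
```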

Craig


Re: rsync backup performance question

2003-06-22 Thread Ron Arts
jw schultz wrote:

[snip.. and thanks for all your comments]

> Rsync doesn't perform well on non-local filesystems.

Really? Won't gigabit ethernet help for NFS, or maybe Samba?
I only have to rsync a relatively low number of files, so no
large directory scans.
Ron

--
Netland Internet Services
bedrijfsmatige internetoplossingen
http://www.netland.nl   Kruislaan 419  1098 VA Amsterdam
info: 020-5628282   servicedesk: 020-5628280   fax: 020-5628281
To poldly bow air mobius gumby four: Trek on novocaine.



Re: various "rsync: connection unexpectedly closed" errors on debian

2003-06-22 Thread wearitdown

>>When I sync'ed against other servers, my rsync client process always
>>ran without --compress.  Nevertheless, I've tried increasing
>>verbosity by 3, and this is what I get:
>
>Check the rsync log file. That is where the error message will be.

Hi.  Okay, on the server I am downloading from, it shows the following
(doesn't seem very detailed; if more details are needed, how do I ask
the admin to increase it?):

   2003/06/21 18:07:03 [2020] rsync: name lookup failed for
   203.208.246.57: Name or service not known
   2003/06/21 18:07:04 [2020] rsync on gentoo-portage/ from UNKNOWN
   (203.208.246.57)
   2003/06/21 18:12:16 [2020] rsync error: timeout in data send/receive
   (code 30) at io.c(103)

Whereas on my client, it shows something like this (this clip is not taken
from the same session, but is likely to be similar):

   [snip]
   x11-wm/xpde/xpde-0.3.0.ebuild is uptodate
   generate_files phase=1
   rsync: connection unexpectedly closed (1657342 bytes read so far)
   rsync error: error in rsync protocol data stream (code 12) at io.c(165)
   _exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)
   rsync: connection unexpectedly closed (1493478 bytes read so far)
   rsync error: error in rsync protocol data stream (code 12) at io.c(165)
   _exit_cleanup(code=12, file=io.c, line=165): about to call exit(12)

Not being much of a programmer, I can only hazard a guess that something
in io.c on the server and client don't agree, possibly due to a mistake
on my part.  (The server in this case ran "/usr/bin/rsync --daemon
--no-detach --safe-links --compress --timeout=1800", but I definitely
ran without -z or --compress on my client, so compression should not
have been enabled, right?)





Re: rsync backup performance question

2003-06-22 Thread jw schultz
On Sun, Jun 22, 2003 at 04:20:34PM +0200, Ron Arts wrote:
> jw schultz wrote:
[snip] 
> Would it be feasible to have a separate process pre-creating
> blocksums during the day in separate files (ending in ",rsync")?
> Or, for example, while writing the changed file, the receiver
> would precompute and save the blocksums, for using it on
> the next run? This would save at least half my I/O.

No.  Not with the current codebase.

> 
> >
> >>>The easiest way to manage the scheduling is to have the
> >>>server pull.  If that isn't possible then you will need to
> >>>use an rsync wrapper that keeps the simultaneous runs within
> >>>limits or put a good deal of smarts into the clients.
> >>>
> >>
> >>Yeah, pulling is out of the question, because the server can't
> >>activate the ISDN link. The clients' rsync start time will need
> >>to be hashed across the night.
> >
> >
> >I'd favour a wrapper over depending on hashing the start
> >times.  An alternate approach might be to have the clients
> >open the connection with port forwarding, write a queue file
> >and wait for a completion indicator before closing the
> >connection.  The server could then pull, using the queue
> >files to identify waiting clients.  While a bit more
> >complicated it avoids the temporal gaps caused by the
> >fallback-sleep-retry of the wrappers.
> >
> 
> What do you mean by a wrapper? something that connects,
> check if the server has some resources, and try again later?
> Does it already exist?

Something that would accept the connection, test to see if
it ok and if not either loop until it is ok or return an
error.  The client side would either have to accept the long
delay or retry if the not-ok error were detected.

> This might incur ISDN call-setup costs that might be
> unacceptable. Same thing with keep-line-open-until-server-pulls.
> But on the other hand, this will maximize server performance.

You would still use a start time hash to manage the number
of clients waiting to run.  This would just serve to
maximize server performance by ensuring you don't have an
overload.
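A possible shape for such a client-side start-time hash, offered as a sketch only (the window length and the cksum-of-hostname trick are invented here, not from the thread):

```shell
# Invented sketch: hash the hostname into a minute offset inside an
# 8-hour night window, giving each client a stable, spread-out start time.
WINDOW_MIN=480
HASH=$(hostname | cksum | cut -d' ' -f1)     # stable numeric hash per host
OFFSET=$(( HASH % WINDOW_MIN ))
echo "this client would sleep ${OFFSET} minutes before dialing out"
# sleep $(( OFFSET * 60 )) && rsync ...      # actual transfer elided
```

Because the offset is derived from the hostname rather than drawn at random each night, a client always dials in at the same slot, which makes overload easier to predict and debug.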

> On the other hand, I will probably need to spread the load
> across multiple servers anyway, so maybe something like the
> linux virtual server project would come in handy have
> to look into that too.

Rsync doesn't perform well on non-local filesystems.


-- 

J.W. Schultz            Pegasystems Technologies
email address:  [EMAIL PROTECTED]

Remember Cernan and Schmitt


any way to get --one-file-system in rsyncd.conf?

2003-06-22 Thread Phil Howard
I would like to specify an entry in /etc/rsyncd.conf such that it
operates on a --one-file-system basis always.  The path will point
to a filesystem mount point, but there is another filesystem that
is mounted in a subdirectory.  I want to back up only those files
in the pointed to filesystem, and not the one mounted within (in
that run, anyway).  I do not see such an option in man rsyncd.conf.
Is there an undocumented one available?

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-


Re: rsync backup performance question

2003-06-22 Thread Ron Arts
jw schultz wrote:

> You have a couple of points wrong.  The receiver generates
> the block checksums.  If you are pushing that would be the
> server but if you are pulling it is the client.  In 2.5.6
> and earlier the transmitted block checksums are 6 bytes per
> block with a default block size of 700 bytes so just under
> 1% of file size.  Unless you have a slow CPU the block
> checksum generation will be I/O bound.
>
> The only files that are opened are those where metadata
> indicate the contents are changed.  In those cases you do
> have a lot of disk i/o.  For database backups that will
> probably amount to every file.
>
> The sender only does one read pass on each changed file.
> The receiver does a read pass for the blocksums and later
> reads again the unchanged (possibly relocated) blocks as it
> merges them with the changed data to write the new file.
> Several files may be in process at any given time.  The
> cache capacity of the receiver has a significant impact on
> performance.

Would it be feasible to have a separate process pre-creating
blocksums during the day in separate files (ending in ",rsync")?
Or, for example, while writing the changed file, the receiver
could precompute and save the blocksums for use on the next
run. This would save at least half my I/O.

> The easiest way to manage the scheduling is to have the
> server pull.  If that isn't possible then you will need to
> use an rsync wrapper that keeps the simultaneous runs within
> limits or put a good deal of smarts into the clients.

Yeah, pulling is out of the question, because the server can't
activate the ISDN link. The clients' rsync start times will need
to be hashed across the night.

> I'd favour a wrapper over depending on hashing the start
> times.  An alternate approach might be to have the clients
> open the connection with port forwarding, write a queue file
> and wait for a completion indicator before closing the
> connection.  The server could then pull, using the queue
> files to identify waiting clients.  While a bit more
> complicated it avoids the temporal gaps caused by the
> fallback-sleep-retry of the wrappers.

What do you mean by a wrapper? Something that connects,
checks if the server has some resources, and tries again later?
Does it already exist?

This might incur ISDN call-setup costs that might be
unacceptable. The same goes for keep-line-open-until-server-pulls.
But on the other hand, this will maximize server performance.

On the other hand, I will probably need to spread the load
across multiple servers anyway, so maybe something like the
Linux Virtual Server project would come in handy; I'll have
to look into that too.

> The last thing you want is to thrash the server or cause an
> OOM condition.  If at all possible you will want to avoid
> paging on the server.  The instant you start thrashing
> filesystem cache performance will shrivel.

Definitely.

Ron

--
Netland Internet Services
bedrijfsmatige internetoplossingen
http://www.netland.nl   Kruislaan 419  1098 VA Amsterdam
info: 020-5628282   servicedesk: 020-5628280   fax: 020-5628281
One way to better your lot is to do a lot better...


smime.p7s
Description: S/MIME Cryptographic Signature

Re: rsync backup performance question

2003-06-22 Thread jw schultz
On Sun, Jun 22, 2003 at 01:59:11PM +0200, Ron Arts wrote:
> jw schultz wrote:
> >On Sun, Jun 22, 2003 at 11:42:46AM +0200, Ron Arts wrote:
> >
> >>Dear all,
> >>
> >>I am implementing a backup system, where thousands of postgreSQL
> >>databases (max 1 Gb in size) on as many clients need to be backed
> >>up nightly across ISDN lines.
> >>
> >>Because of the limited bandwidth, rsync is the prime candidate of
> >>course.
> >
> >
> >Only if you are updating an existing file on the backup
> >server with sufficient commonality from one version to the
> >next.  pg_dump --format=t is good.  Avoid the built-in
> >compression in pg_dump as it defeats rsync.  
> 
> Restore time is significant, so I think I need a straight mirror
> of the database files on the client. I think importing
> a multi gigabyte SQL dump will take too long for us (one hour
> is the limit). Have not tried that yet on postgreSQL though.

Try doing a dump-restore test before you make the decision.
The dumps are a lot smaller and more compressible than the
database files and you don't need to shut down the database
to do them.  The database has to be completely shut down to
do a file level backup of the database.

> > gzip with the
> >rsyncable patch and bzip2 are OK if you must compress.
> >
> 
> So unpatched bzip2 is ok? nice to know..
> Maybe I can tar an LVM snapshot, and bzip2 that
> before rsyncing. Thanks for that one.

bzip2 should be OK because the encoding of each block is
independent of the other blocks.  Of course the block size
is a bit on the large side (a minimum of 100KB) and that
will diminish the effectiveness of rsync.  I suspect that if
a single byte anywhere in a bzip2 block is changed rsync may
fail to find any matching rsync blocks within it.  Given the
propensity for databases to change apparently random blocks
bzip2 may be inappropriate.  Only testing and a detailed
understanding of bzip2 internals will tell.

Be careful that the filesystem caches are flushed if you go
the LVM snapshot route or you will wind up with inconsistent
tablespaces.

> 
> >The other issue is individual file size.  Rsync versions
> >prior to what is in CVS start having some performance issues
> >with files larger than the 200-500MB range.  
> >
> 
> I'll keep that in mind.
> 
> >
> >>Potential problems I see are server load (I/O and CPU), and filesystem 
> >>limits.
> >
> >
> >Most of the load is on the sender.  Over ISDN even with
> >rsync compressing the datastream no one update should be CPU
> >or I/O issue.  The issue is scheduling so you don't have too
> >many running simultaneously.
> >
> 
> As I understand the algorithm, the server creates a list of checksums
> (which is around 1% size of the original file), which is not really
> CPU intensive, sends that to the client, and then the client does a lot
> of work finding blocks that are the same as the server file.
> 
> So the server at least reads every file completely that is in the
> rsync tree, am I correct? In my case that means a lot of disk I/O,
> given the total size for all databases (multiple TB's).
> 
> Please correct me if I'm wrong.

You have a couple of points wrong.  The receiver generates
the block checksums.  If you are pushing that would be the
server but if you are pulling it is the client.  In 2.5.6
and earlier the transmitted block checksums are 6 bytes per
block with a default block size of 700 bytes so just under
1% of file size.  Unless you have a slow CPU the block
checksum generation will be I/O bound.

The only files that are opened are those where metadata
indicate the contents are changed.  In those cases you do
have a lot of disk i/o.  For database backups that will
probably amount to every file.

The sender only does one read pass on each changed file.
The receiver does a read pass for the blocksums and later
reads again the unchanged (possibly relocated) blocks as it
merges them with the changed data to write the new file.
Several files may be in process at any given time.  The
cache capacity of the receiver has a significant impact on
performance.
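The "just under 1%" figure above can be sanity-checked with a line of shell arithmetic (the constants are from the post; computed in hundredths of a percent to stay in integers):

```shell
# rsync <= 2.5.6 sends 6 checksum bytes per 700-byte block (per the post).
# For a 1 MB file (the ratio is the same at any size):
BLOCK=700; SUM=6; SIZE=1000000
BLOCKS=$(( (SIZE + BLOCK - 1) / BLOCK ))        # ceil(SIZE / BLOCK)
echo $(( BLOCKS * SUM * 10000 / SIZE ))         # 85 -> 0.85%, "just under 1%"
```

So the checksum pass itself is cheap on the wire; as jw notes, the real cost is the read pass over every changed file.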

> 
> >The easiest way to manage the scheduling is to have the
> >server pull.  If that isn't possible then you will need to
> >use an rsync wrapper that keeps the simultaneous runs within
> >limits or put a good deal of smarts into the clients.
> >
> 
> Yeah, pulling is out of the question, because the server can't
> activate the ISDN link. The clients' rsync start time will need
> to be hashed across the night.

I'd favour a wrapper over depending on hashing the start
times.  An alternate approach might be to have the clients
open the connection with port forwarding, write a queue file
and wait for a completion indicator before closing the
connection.  The server could then pull, using the queue
files to identify waiting clients.  While a bit more
complicated it avoids the temporal gaps caused by the
fallback-sleep-retry of the wrappers.

The last thing you want is to thrash the server or cause an
OOM condition.  If at all possible you will want to avoid
paging on the server.  The instant you start thrashing
filesystem cache performance will shrivel.

Re: rsync backup performance question

2003-06-22 Thread Ron Arts
jw schultz wrote:

> On Sun, Jun 22, 2003 at 11:42:46AM +0200, Ron Arts wrote:
> > Dear all,
> >
> > I am implementing a backup system, where thousands of postgreSQL
> > databases (max 1 Gb in size) on as many clients need to be backed
> > up nightly across ISDN lines.
> >
> > Because of the limited bandwidth, rsync is the prime candidate of
> > course.
>
> Only if you are updating an existing file on the backup
> server with sufficient commonality from one version to the
> next.  pg_dump --format=t is good.  Avoid the built-in
> compression in pg_dump as it defeats rsync.

Restore time is significant, so I think I need a straight mirror
of the database files on the client. I think importing
a multi-gigabyte SQL dump will take too long for us (one hour
is the limit). Have not tried that yet on postgreSQL though.

> gzip with the
> rsyncable patch and bzip2 are OK if you must compress.

So unpatched bzip2 is OK? Nice to know.
Maybe I can tar an LVM snapshot, and bzip2 that
before rsyncing. Thanks for that one.

> The other issue is individual file size.  Rsync versions
> prior to what is in CVS start having some performance issues
> with files larger than the 200-500MB range.

I'll keep that in mind.

> > Potential problems I see are server load (I/O and CPU), and filesystem
> > limits.
>
> Most of the load is on the sender.  Over ISDN, even with
> rsync compressing the datastream, no single update should be a CPU
> or I/O issue.  The issue is scheduling so you don't have too
> many running simultaneously.

As I understand the algorithm, the server creates a list of checksums
(which is around 1% of the size of the original file), which is not really
CPU intensive, sends that to the client, and then the client does a lot
of work finding blocks that are the same as in the server file.

So the server at least reads completely every file that is in the
rsync tree, am I correct? In my case that means a lot of disk I/O,
given the total size of all databases (multiple TBs).

Please correct me if I'm wrong.

> > Does anyone have experience with such setups?
>
> Unlikely on that scale over that sort of link.
>
> I'd suggest experimenting with -v and the --stats options turned on.

I will, thanks.

Ron

--
Netland Internet Services
bedrijfsmatige internetoplossingen
http://www.netland.nl   Kruislaan 419  1098 VA Amsterdam
info: 020-5628282   servicedesk: 020-5628280   fax: 020-5628281
Useless Invention: Leather cutlery.



Re: rsync backup performance question

2003-06-22 Thread jw schultz
On Sun, Jun 22, 2003 at 11:42:46AM +0200, Ron Arts wrote:
> Dear all,
> 
> I am implementing a backup system, where thousands of postgreSQL
> databases (max 1 Gb in size) on as many clients need to be backed
> up nightly across ISDN lines.
> 
> Because of the limited bandwidth, rsync is the prime candidate of
> course.

Only if you are updating an existing file on the backup
server with sufficient commonality from one version to the
next.  pg_dump --format=t is good.  Avoid the built-in
compression in pg_dump as it defeats rsync.  gzip with the
rsyncable patch and bzip2 are OK if you must compress.
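The point that ordinary compression defeats rsync can be demonstrated with a toy sketch (gzip stands in here for pg_dump's compressor; the file sizes and paths are invented): change one input byte and the compressed streams diverge partway through and rsync's block matching finds little to reuse downstream of the change.

```shell
# Two 100 KB inputs, identical except for one byte in the middle.
{ head -c 50000 /dev/zero | tr '\0' 'A'
  head -c 50000 /dev/zero | tr '\0' 'B'; } > /tmp/base
{ head -c 50000 /dev/zero | tr '\0' 'A'; printf 'X'
  head -c 49999 /dev/zero | tr '\0' 'B'; } > /tmp/edit
gzip -nc /tmp/base > /tmp/base.gz     # -n omits the timestamp so headers match
gzip -nc /tmp/edit > /tmp/edit.gz
cmp /tmp/base.gz /tmp/edit.gz || true # reports where the streams diverge
```

gzip's --rsyncable patch addresses exactly this by restarting the compressor periodically so unchanged regions compress to identical bytes again.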

The other issue is individual file size.  Rsync versions
prior to what is in CVS start having some performance issues
with files larger than the 200-500MB range.  

> Potential problems I see are server load (I/O and CPU), and filesystem 
> limits.

Most of the load is on the sender.  Over ISDN, even with
rsync compressing the datastream, no single update should be a CPU
or I/O issue.  The issue is scheduling so you don't have too
many running simultaneously.

The easiest way to manage the scheduling is to have the
server pull.  If that isn't possible then you will need to
use an rsync wrapper that keeps the simultaneous runs within
limits or put a good deal of smarts into the clients.
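One possible shape for such a wrapper, offered purely as a sketch (the cap, the paths, and the EX_TEMPFAIL retry convention are invented here, not from the thread): the server-side rsync is fronted by a script that refuses new sessions past a concurrency cap, and clients treat the refusal as "sleep and retry".

```shell
# Write a hypothetical concurrency-limiting wrapper to a file.
cat > /tmp/rsync-wrapper.sh <<'EOF'
#!/bin/sh
MAX=10                                   # invented cap on simultaneous runs
RUNNING=$(pgrep -x rsync | wc -l)        # count live rsync processes
if [ "$RUNNING" -ge "$MAX" ]; then
    echo "server busy, retry later" >&2
    exit 75                              # EX_TEMPFAIL: client sleeps, retries
fi
exec /usr/bin/rsync "$@"
EOF
chmod +x /tmp/rsync-wrapper.sh
```

The client side then needs a matching loop that backs off and redials on the busy exit code, which is the "good deal of smarts" the paragraph above mentions.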

> Does anyone have experience with such setups?

Unlikely on that scale over that sort of link.

I'd suggest experimenting with -v and the --stats options turned on.

-- 

J.W. Schultz            Pegasystems Technologies
email address:  [EMAIL PROTECTED]

Remember Cernan and Schmitt


Re: good snapshot solution

2003-06-22 Thread Hans-Juergen Beie
jw schultz wrote on 20.06.2003 04:06:

> On Thu, Jun 19, 2003 at 05:44:48PM +0200, [EMAIL PROTECTED] wrote:
> [...]
> > After doing some research I found three candidates:
> >
> > - Dirvish
> > - Rlbackup
> > - Rsync_snapshots by Mike Rubel
> [...]
> > Also, I was wondering if I can still share the snapshotted versions of the
> > files with Samba in an intuitive way. Dirvish, for instance, seems to use a
> > certain subdirectory system, making that somewhat more complex.
>
> Intuitive is a subjective term.  I'm sure dirvish would
> support almost any way you would like.  A discussion of that
> is not appropriate for this list.
>
> > The last one is very straightforward and looks sufficient, but Mike points
> > out that he sees this only working for smaller servers, so I worry a bit
> > about our 900 GB of data. Would this system know how to handle this amount
> > of data?
>
> I'm not sure why, or where, he says that.  I'll grant that
> the configuration might not scale well for Mike's snapshot
> approach but otherwise it should be no more limiting than
> dirvish or rlbackup.  I know of more than one site using
> dirvish for hundreds of servers with multiple terabytes of
> backup data.

rsback is based on Mike's approach.
It was made to handle configurations for different backup scenarios more
flexibly. I'm using it on two production servers, sharing some trees of
the backup repositories (read-only mounted via NFS) with Samba.

See http://www.pollux.franken.de/hjb/rsback

hjb :-?

--
Hans-Juergen Beie
Phone: +49 911 396628 / +49 173 3546274
Fax: +49 911 396663
mailto:[EMAIL PROTECTED]




rsync backup performance question

2003-06-22 Thread Ron Arts
Dear all,

I am implementing a backup system, where thousands of postgreSQL
databases (max 1 Gb in size) on as many clients need to be backed
up nightly across ISDN lines.

Because of the limited bandwidth, rsync is the prime candidate of
course.
Potential problems I see are server load (I/O and CPU), and filesystem 
limits.

Does anyone have experience with such setups?

Ron Arts

--
NeoNova BV
bedrijfsmatige internetoplossingen
http://www.neonova.nl   Kruislaan 419  1098 VA Amsterdam
info: 020-5628292   servicedesk: 020-5628292   fax: 020-5628291

