Re: Direct I/O support (patches included)

2013-02-18 Thread Dag Wieers

On Mon, 18 Feb 2013, Linda Walsh wrote:


Hi dag, I really appreciate your working on this,
but it is really annoyingly hard and tedious.

_I_ am not certain about all the requirements of Direct I/O,
i.e. would have to research (goog/kernel source...etc).

It may be different on different platforms, I _vaguely_ remember
'talking'(email) with someone working on 'dd', and they were telling
me how they had to compensate for a change in the kernel which used
to handle the buffering of partial sector reads/writes for those
who did directio on a device.  Then they decided that much hand-holding
was wrong because, IMO, they basically wanted people to use the buffer
cache -- since for most people, and most things it's a good thing.


It is different on some platforms, albeit manageably so: iirc the iozone
source code has provisions for three variants (VX_DIRECT, O_DIRECT or
O_DIRECTIO). I would already be happy providing the functionality for
systems supporting O_DIRECT (I can't test the others anyway).




Sorry about my example below -- I'd already replaced the --directio in
the shell script with --drop-cache -- which I'd forgotten I already
had in the script (memory for these things is completely gone!)..

But really, it did have the direct-io in the statement before I
changed it into a drop cache.


Ok, the Invalid argument error could be because the filesystem did not 
support direct I/O (e.g. NTFS) or there was a misalignment. On the system 
I wrote this patch for, it simply works out of the box on NFS. (RHEL5 64bit)



Anyway, the first thing I'd want to find out is why it is writing to a socket 
for a local file copy?


Because even when running rsync locally, it works as client-server over a 
socket ?




It's going to be hard for direct I/O to make a difference, if it is workable
at all. The fact that they move a 'window' over the source, emulating a
memory-mapped file, isn't really helpful for lining up memory with the sectors.
At a minimum we need to go for a read size of 4096 (am pretty sure that we have
to do that even on short files, and the kernel will just tell us we got less).


2nd thing -- need to make sure we have a source of memory with aligned
boundaries.


I'd look at the implementation in fio, it's well documented.

First improvement to the patch would be to bail out when we get EINVAL so 
it's clear that direct-io as-is does not work/alignment is not 
implemented.
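
A minimal sketch of that first improvement, assuming the check is placed
wherever the patched code handles a failed open(2) or write(2). The helper
name explain_direct_io_error is hypothetical and not part of the posted
patch; it only turns EINVAL into a message that points at --direct-io
instead of a bare "Invalid argument":

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the posted patch.
 * Returns 1 (after printing a hint) when err is EINVAL, which under
 * O_DIRECT usually means an unsupported filesystem or a misaligned
 * buffer, offset or length. Returns 0 for any other error. */
int explain_direct_io_error(const char *op, const char *path, int err)
{
    if (err != EINVAL)
        return 0;

    fprintf(stderr,
            "%s failed on %s: %s; --direct-io needs filesystem support "
            "and aligned I/O, which is not implemented yet\n",
            op, path, strerror(err));
    return 1;
}
```

A caller would pass the saved errno right after the failing call and exit
when the helper returns 1.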



--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, i...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]
--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: Direct I/O support (patches included)

2013-02-17 Thread Dag Wieers

On Sat, 16 Feb 2013, Linda Walsh wrote:


Dag Wieers wrote:

 On Thu, 14 Feb 2013, Brian K. White wrote:

  On 2/14/2013 9:50 AM, Dag Wieers wrote:
 
Since a --direct-io feature was requested a few times the past decade
with little response and the actual patch is quite trivial, I patched
both v3.0.9 and master branch and included the patches here.
 
  When I drop the 3.0.9 diff into my otherwise working spec file for 3.0.9 
  on opensuse build service, it patches and builds with no error, but 
  make test fails:


 Attached is an updated patch which takes care of the test cases.


---
Tried this patch, but it failed to work:

cmd:
cd /Media && {
 sudo rsync --archive --hard-links --acls -xattrs --whole-file \
 --drop-cache \
 --one-file-system \
 --drop-cache --8-bit-output \
 --human-readable --progress \
 --inplace --del . \
 /backups/Media/.
}


first it deleted a bunch of stuff (~30 files)
deleting Library /
rsync: delete_file: unlink(MediaBack) failed: Operation not permitted (1)
.recycle/
.recycle/SDT27D6.tmp
           0 100%    0.00kB/s    0:00:00 (xfer#1, to-check=1006/1019)
.recycle/Library/
.recycle/Library/[Commie] Kotoura - 04 [FC6C5497].mkv
      32.77K   0%    0.00kB/s    0:00:00
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)


Then a bunch more deletes...(~17)
Then:
rsync: write failed on /backups/Media/./.recycle/Library/[Commie] Kotoura - 04 [FC6C5497].mkv: Invalid argument (22)

rsync: connection unexpectedly closed (57 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]



???


Since you didn't use --direct-io, my patch was not doing anything. Since 
you were using --drop-cache (twice!) this is not a vanilla rsync either.


What was it you were trying to do ?

--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, i...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]


Re: Direct I/O support (patches included)

2013-02-17 Thread Dag Wieers

On Sat, 16 Feb 2013, Linda Walsh wrote:


I wondered about that as well -- could speed things
up by 30% over going through the slow linux buffers.

One thing that the 'dd' people found out though was
that if you do direct I/O, memory and your I/O really
do have to line up -- it may be that only 512 byte alignment
is necessary (or 4096 on some newer disks)...but ideally,
you look at the stat's claim for write size since the last
param in stat isn't the allocation size, but the smallest
optimal write size -- i.e. the stripe size if you have
a RAID...as there, you want to write whole stripes at once,
otherwise you get into a read/modify/write cycle that slows
down your disk I/O with 200% overhead -- *ouch!*...
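
A rough sketch of the alignment idea described above, assuming "the last
param in stat" refers to st_blksize (the preferred I/O block size) and
assuming a 4096-byte memory alignment is sufficient. The helper name
alloc_direct_io_buffer and the DIO_ALIGN constant are made up for
illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>

#define DIO_ALIGN 4096 /* assumed alignment; 512 may be enough on older disks */

/* Hypothetical helper: allocate a buffer whose address is aligned to
 * DIO_ALIGN and whose size is the preferred I/O size reported by stat(2),
 * rounded up to a multiple of DIO_ALIGN. Returns NULL on failure. */
void *alloc_direct_io_buffer(const char *path, size_t *size_out)
{
    struct stat st;
    size_t size = DIO_ALIGN;
    void *buf = NULL;

    assert(path != NULL && size_out != NULL);

    /* st_blksize is the filesystem's preferred block size for I/O. */
    if (stat(path, &st) == 0 && st.st_blksize > 0)
        size = (size_t)st.st_blksize;

    /* Round up so the transfer length is also a multiple of the alignment. */
    size = (size + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;

    if (posix_memalign(&buf, DIO_ALIGN, size) != 0)
        return NULL;

    *size_out = size;
    return buf;
}
```

The caller frees the buffer with free(). File offsets and the final short
block still need their own handling, which this sketch does not attempt.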


True, the patch can be improved. But even without alignment this avoids
excessive buffering when transferring huge files on systems with a lot of
free memory. The behavior I noticed (and which this patch fixes) is that
there are only reads until the buffer cache is filled, and then only writes
until the buffers have been flushed. With direct-io you have reads and
writes happening at the same time.


Since it seems you know what is needed to improve, can you propose a patch ?

(I got some hints from iozone wrt. alignment and portability)

Another solution is fadvise(), although I still had the behavior mentioned 
above using --drop-cache, so it didn't fix my use-case which is why I 
wrote this patch.


Kind regards,
--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, i...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]


Direct I/O support (patches included)

2013-02-14 Thread Dag Wieers

Hi,

Since a --direct-io feature was requested a few times over the past decade
with little response and the actual patch is quite trivial, I patched both
v3.0.9 and master branch and included the patches here.


If this functionality is acceptable I don't mind spending the additional 
effort to update the documentation, etc.


Beware that the underlying filesystem needs Direct I/O support, therefore 
for this to work on NFS, one needs the kernel CONFIG_NFS_DIRECTIO enabled 
(probably default these days anyway). This is the reason why --direct-io 
is not enabled by default and should be used with care. I don't know what
the behavior is if the local or remote side does not support O_DIRECT.


In my tests it provides a much more stable copy process (on NFS) with 
improved performance, but I didn't do any prolonged repeated tests yet (I 
only used it during a large copy that was ongoing so I prefer not to 
juggle with numbers here). Please test yourself and report back.


Kind regards,
--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, i...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]

diff --git a/options.c b/options.c
index 088202e..4f2a3f5 100644
--- a/options.c
+++ b/options.c
@@ -122,6 +122,7 @@ int blocking_io = -1;
 int checksum_seed = 0;
 int inplace = 0;
 int delay_updates = 0;
+int direct_io = 0;
 long block_size = 0; /* long because popt can't set an int32. */
 char number_separator;
 char *skip_compress = NULL;
@@ -746,6 +747,7 @@ void usage(enum logcode F)
   rprintf(F,"     --partial               keep partially transferred files\n");
   rprintf(F,"     --partial-dir=DIR       put a partially transferred file into DIR\n");
   rprintf(F,"     --delay-updates         put all updated files into place at transfer's end\n");
+  rprintf(F,"     --direct-io             don't use buffer cache for files being transferred\n");
   rprintf(F," -m, --prune-empty-dirs      prune empty directory chains from the file-list\n");
   rprintf(F,"     --numeric-ids           don't map uid/gid values by user/group name\n");
   rprintf(F,"     --usermap=STRING        custom username mapping\n");
@@ -977,6 +979,7 @@ static struct poptOption long_options[] = {
   {"partial-dir",      0,  POPT_ARG_STRING, &partial_dir, 0, 0, 0 },
   {"delay-updates",    0,  POPT_ARG_VAL,    &delay_updates, 1, 0, 0 },
   {"no-delay-updates", 0,  POPT_ARG_VAL,    &delay_updates, 0, 0, 0 },
+  {"direct-io",       'n', POPT_ARG_NONE,   &direct_io, 0, 0, 0 },
   {"prune-empty-dirs",'m', POPT_ARG_VAL,    &prune_empty_dirs, 1, 0, 0 },
   {"no-prune-empty-dirs",0,POPT_ARG_VAL,    &prune_empty_dirs, 0, 0, 0 },
   {"no-m",             0,  POPT_ARG_VAL,    &prune_empty_dirs, 0, 0, 0 },
@@ -2654,6 +2657,9 @@ void server_options(char **args, int *argc_p)
 	} else if (keep_partial && am_sender)
 		args[ac++] = "--partial";
 
+	if (direct_io)
+		args[ac++] = "--direct-io";
+
 	if (ignore_errors)
 		args[ac++] = "--ignore-errors";
 
diff --git a/syscall.c b/syscall.c
index fd23d15..9d432c0 100644
--- a/syscall.c
+++ b/syscall.c
@@ -34,6 +34,7 @@
 #endif
 
 extern int dry_run;
+extern int direct_io;
 extern int am_root;
 extern int am_sender;
 extern int read_only;
@@ -69,7 +70,11 @@ int do_symlink(const char *lnk, const char *fname)
 	 * and write the lnk into it. */
 	if (am_root < 0) {
 		int ok, len = strlen(lnk);
-		int fd = open(fname, O_WRONLY|O_CREAT|O_TRUNC, S_IWUSR|S_IRUSR);
+		int flags = O_WRONLY|O_CREAT|O_TRUNC;
+
+		if (direct_io) flags |= O_DIRECT;
+
+		int fd = open(fname, flags, S_IWUSR|S_IRUSR);
 		if (fd < 0)
 			return -1;
 		ok = write(fd, lnk, len) == len;
@@ -190,6 +195,8 @@ int do_open(const char *pathname, int flags, mode_t mode)
 		RETURN_ERROR_IF_RO_OR_LO;
 	}
 
+	if (direct_io) flags |= O_DIRECT;
+
 	return open(pathname, flags | O_BINARY, mode);
 }
 
@@ -461,6 +468,8 @@ int do_open_nofollow(const char *pathname, int flags)
 #endif
 	}
 
+	if (direct_io) flags |= O_DIRECT;
+
 #ifdef O_NOFOLLOW
 	fd = open(pathname, flags|O_NOFOLLOW);
 #else
--- options.c.orig	2013-02-14 13:36:19.0 +0100
+++ options.c	2013-02-14 13:36:33.0 +0100
@@ -120,6 +120,7 @@
 int checksum_seed = 0;
 int inplace = 0;
 int delay_updates = 0;
+int direct_io = 0;
 long block_size = 0; /* long because popt can't set an int32. */
 char *skip_compress = NULL;
 
@@ -381,6 +382,7 @@
   rprintf(F,"     --partial               keep partially transferred files\n");
   rprintf(F,"     --partial-dir=DIR       put a partially transferred file into DIR\n");
   rprintf(F,"     --delay-updates         put all updated files into place at transfer's end\n");
+  rprintf(F,"     --direct-io             don't use buffer cache for files being transferred\n");
   rprintf(F," -m, --prune-empty-dirs      prune empty directory chains from the file-list\n");
   rprintf(F,"     --numeric-ids           don't map uid/gid values by user/group name\n");
   rprintf(F,"     --timeout=SECONDS       set I/O timeout in seconds\n");
@@ -593,6 +595,7

Re: Direct I/O support (patches included)

2013-02-14 Thread Dag Wieers

On Thu, 14 Feb 2013, Brian K. White wrote:


On 2/14/2013 9:50 AM, Dag Wieers wrote:


 Since a --direct-io feature was requested a few times the past decade
 with little response and the actual patch is quite trivial, I patched
 both v3.0.9 and master branch and included the patches here.


When I drop the 3.0.9 diff into my otherwise working spec file for 3.0.9 on 
opensuse build service, it patches and builds with no error, but make test 
fails:


Attached is an updated patch which takes care of the test cases.

Kind regards,
--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, i...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]

--- options.c.direct-io	2011-09-14 00:41:26.0 +0200
+++ options.c	2013-02-14 14:19:08.0 +0100
@@ -120,6 +120,7 @@
 int checksum_seed = 0;
 int inplace = 0;
 int delay_updates = 0;
+int direct_io = 0;
 long block_size = 0; /* long because popt can't set an int32. */
 char *skip_compress = NULL;
 
@@ -381,6 +382,7 @@
   rprintf(F,"     --partial               keep partially transferred files\n");
   rprintf(F,"     --partial-dir=DIR       put a partially transferred file into DIR\n");
   rprintf(F,"     --delay-updates         put all updated files into place at transfer's end\n");
+  rprintf(F,"     --direct-io             don't use buffer cache for files being transferred\n");
   rprintf(F," -m, --prune-empty-dirs      prune empty directory chains from the file-list\n");
   rprintf(F,"     --numeric-ids           don't map uid/gid values by user/group name\n");
   rprintf(F,"     --timeout=SECONDS       set I/O timeout in seconds\n");
@@ -593,6 +595,7 @@
   {"partial-dir",      0,  POPT_ARG_STRING, &partial_dir, 0, 0, 0 },
   {"delay-updates",    0,  POPT_ARG_VAL,    &delay_updates, 1, 0, 0 },
   {"no-delay-updates", 0,  POPT_ARG_VAL,    &delay_updates, 0, 0, 0 },
+  {"direct-io",       'n', POPT_ARG_NONE,   &direct_io, 0, 0, 0 },
   {"prune-empty-dirs",'m', POPT_ARG_VAL,    &prune_empty_dirs, 1, 0, 0 },
   {"no-prune-empty-dirs",0,POPT_ARG_VAL,    &prune_empty_dirs, 0, 0, 0 },
   {"no-m",             0,  POPT_ARG_VAL,    &prune_empty_dirs, 0, 0, 0 },
@@ -2002,6 +2005,9 @@
 	} else if (keep_partial && am_sender)
 		args[ac++] = "--partial";
 
+	if (direct_io)
+		args[ac++] = "--direct-io";
+
 	if (ignore_errors)
 		args[ac++] = "--ignore-errors";
 
--- syscall.c.direct-io	2011-02-21 20:32:51.0 +0100
+++ syscall.c	2013-02-14 14:19:52.0 +0100
@@ -30,6 +30,7 @@
 #endif
 
 extern int dry_run;
+extern int direct_io;
 extern int am_root;
 extern int read_only;
 extern int list_only;
@@ -143,6 +144,8 @@
 		RETURN_ERROR_IF_RO_OR_LO;
 	}
 
+	if (direct_io) flags |= O_DIRECT;
+
 	return open(pathname, flags | O_BINARY, mode);
 }
 
--- tls.c.orig	2013-02-14 23:03:43.0 +0100
+++ tls.c	2013-02-14 23:03:26.0 +0100
@@ -42,6 +42,7 @@
 
 /* These are to make syscall.o shut up. */
 int dry_run = 0;
+int direct_io = 0;
 int am_root = 0;
 int read_only = 1;
 int list_only = 0;
--- t_unsafe.c.orig	2013-02-14 23:17:26.0 +0100
+++ t_unsafe.c	2013-02-14 23:16:36.0 +0100
@@ -24,6 +24,7 @@
 #include "rsync.h"
 
 int dry_run = 0;
+int direct_io = 0;
 int am_root = 0;
 int read_only = 0;
 int list_only = 0;
--- trimslash.c.orig	2013-02-14 23:17:16.0 +0100
+++ trimslash.c	2013-02-14 23:16:15.0 +0100
@@ -22,6 +22,7 @@
 
 /* These are to make syscall.o shut up. */
 int dry_run = 0;
+int direct_io = 0;
 int am_root = 0;
 int read_only = 1;
 int list_only = 0;

Re: RFE: Lockfile option for use in cronjobs

2008-04-10 Thread Dag Wieers

On Wed, 2 Apr 2008, Matt McCutchen wrote:


On Mon, 2008-03-31 at 11:43 +0200, Dag Wieers wrote:

Looking for an easy way to prevent a repetitive rsync to be running
multiple times, I was wondering if it could be useful to have an option
like:

--pidfile /some/path/rsync-mirror-org.pid

So that rsync can be run directly from cron without requiring a wrapper
script to do pidfile handling.

This way rsync on startup could check the pid-file, see if another rsync
is using this pid, and bail out with an error if it is. Otherwise clean up
the stale pidfile and continue.

I think this would be very useful to instruct mirrors how to configure it,
rather than providing some script that needs local customizations.


I'm not convinced that a pidfile is better implemented in rsync than in
a wrapper script, which could be distributed in support/ of the source
tree.  If you don't care about actually having the pid in the file, you
could use the flock(1) utility, which executes a command while holding a
flock(2) lock on a specified file:

flock --nonblock /some/path/rsync-mirror-org.lock rsync ...

If the process goes away, the lock will too, so no manual cleanup of
stale locks is needed.
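
For readers who want the same behaviour without the flock(1) utility, here
is a minimal sketch of the underlying flock(2) call. The helper name
try_lock_file is hypothetical; LOCK_NB corresponds to the --nonblock option
in the command above:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a lock file and take an
 * exclusive, non-blocking flock(2) on it. Returns the open descriptor on
 * success, or -1 with errno set (EWOULDBLOCK if someone else holds it). */
int try_lock_file(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        int saved = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd; /* the lock is released when the descriptor is closed */
}
```

Because the lock belongs to the open file, it disappears when the process
exits for any reason, which is why no stale-lock cleanup is required.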


The flock utility is great. I learn something new every day :)

Thanks a lot !
--
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]


Re: RFE: Lockfile option for use in cronjobs

2008-04-10 Thread Dag Wieers

On Thu, 10 Apr 2008, Dag Wieers wrote:


On Wed, 2 Apr 2008, Matt McCutchen wrote:


On Mon, 2008-03-31 at 11:43 +0200, Dag Wieers wrote:

Looking for an easy way to prevent a repetitive rsync to be running
multiple times, I was wondering if it could be useful to have an option
like:

--pidfile /some/path/rsync-mirror-org.pid

So that rsync can be run directly from cron without requiring a wrapper
script to do pidfile handling.

This way rsync on startup could check the pid-file, see if another rsync
is using this pid, and bail out with an error if it is. Otherwise clean up
the stale pidfile and continue.

I think this would be very useful to instruct mirrors how to configure it,
rather than providing some script that needs local customizations.


I'm not convinced that a pidfile is better implemented in rsync than in
a wrapper script, which could be distributed in support/ of the source
tree.  If you don't care about actually having the pid in the file, you
could use the flock(1) utility, which executes a command while holding a
flock(2) lock on a specified file:

flock --nonblock /some/path/rsync-mirror-org.lock rsync ...

If the process goes away, the lock will too, so no manual cleanup of
stale locks is needed.


The flock utility is great. I learn something new every day :)


Just noticed that the flock utility is fairly recent (since util-linux 
2.13) which means that on most systems you do not have it.


For RHEL that means only available since RHEL5 :-(

That could be one of the reasons to have it as part of rsync (to 
facilitate the distribution of a cron-job or simply make sure your mirrors 
are using best practices without too much complexity).


--
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]


RFE: Lockfile option for use in cronjobs

2008-03-31 Thread Dag Wieers

Hi,

Looking for an easy way to prevent a repetitive rsync from running
multiple times, I was wondering if it could be useful to have an option
like:


--pidfile /some/path/rsync-mirror-org.pid

So that rsync can be run directly from cron without requiring a wrapper 
script to do pidfile handling.


This way rsync on startup could check the pid-file, see if another rsync 
is using this pid, and bail out with an error if it is. Otherwise clean up 
the stale pidfile and continue.
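
The proposed startup check could look roughly like the sketch below. It is
not rsync code; the function names pidfile_is_live and write_pidfile are
invented for illustration, and the liveness test uses kill(2) with signal 0,
which only checks that a process with that pid exists:

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the pidfile names a running process,
 * 0 if the file is missing, unreadable or stale. */
int pidfile_is_live(const char *path)
{
    FILE *f;
    long pid = 0;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fscanf(f, "%ld", &pid) != 1)
        pid = 0;
    fclose(f);

    if (pid <= 0)
        return 0;

    /* Signal 0 sends nothing; EPERM means the process exists but belongs
     * to another user. */
    return kill((pid_t)pid, 0) == 0 || errno == EPERM;
}

/* Hypothetical helper: record our own pid. Returns 0 on success. */
int write_pidfile(const char *path)
{
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%ld\n", (long)getpid());
    return fclose(f) == 0 ? 0 : -1;
}
```

Note that this check cannot tell whether the live pid is actually an rsync,
and there is a window between checking and writing the file in which two
instances could both start.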


I think this would be very useful to instruct mirrors how to configure it, 
rather than providing some script that needs local customizations.


Thanks in advance,
--
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]


Vhost support

2006-08-23 Thread Dag Wieers
Hi,

Maybe this is already supported in some way or another, or has been discussed
in the past, but let me picture the problem first.

Currently mirrors that offer rsync support have their own filesystem 
layout and a path used on one rsync server would be different to the path 
on another rsync server.

Additionally, rsync is restricted to the IP it is listening on. And if you
want to have a second rsync server, you basically need a second IP.

Now, the CentOS project makes use of different public mirrors to 
distribute their binaries and they use a geo-ip system to automatically 
point clients to different servers to spread the load. This works well for 
HTTP, but for rsync it would not work consistently unless they can
re-organize the path structure of public mirrors (or have a dedicated
rsync server for CentOS that follows the path structure we lay out).

Vhost support (much like HTTP 1.1) inside the rsync protocol would allow
an administrator to set up different views or different filesystem
structures for different hostnames to accommodate the universal location
that clients would use.

Much like:

http://mirrors.centos.org/centos/4/updates/i386/RPMS/

currently points on each server to the same location, we could have a:

rsync://mirrors.centos.org/centos/4/updates/i386/RPMS/

I'm interested to know what the rsync developers think of this idea. How 
big a change it would be overall and at what timeframe (given there would 
be extensive development and testing) it could be included, if at all.

Thanks !
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Rsync push is slower compared to pull

2006-08-23 Thread Dag Wieers
On Wed, 23 Aug 2006, Srinivasa Battula wrote:

   It has been observed that rsync push mode is much slower when compared
 to pull (On Identical scenarios). Building/receiving file list takes
 almost same time. But data transfer is much slower, whose transfer
 ratios are ranging from 1:3 to 1:5. On pull operation data transfer
 speed is consistently around 3.5 MB/Sec and it reached 10 MB/Sec.
 However, on push the maximum it could reach is around 2 MB/Sec and hogs
 around 1MB/Sec. Why is this difference? Please help me on this...

Do these identical scenarios also include changing the sender and receiver 
for both push and pull ? That would be required to dismiss the network 
configuration/environment as the possible cause.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


RE: Rsync push is slower compared to pull

2006-08-23 Thread Dag Wieers
On Wed, 23 Aug 2006, Srinivasa Battula wrote:

 On Wed, 23 Aug 2006, Dag Wieers wrote:
  On Wed, 23 Aug 2006, Srinivasa Battula wrote:
  
 It has been observed that rsync push mode is much slower when compared
   to pull (On Identical scenarios). Building/receiving file list takes
   almost same time. But data transfer is much slower, whose transfer
   ratios are ranging from 1:3 to 1:5. On pull operation data transfer
   speed is consistently around 3.5 MB/Sec and it reached 10 MB/Sec.
   However, on push the maximum it could reach is around 2 MB/Sec and hogs
   around 1MB/Sec. Why is this difference? Please help me on this...
  
  Do these identical scenarios also include changing the sender and
  receiver 
  for both push and pull ? That would be required to dismiss the network 
  configuration/environment as the possible cause.

 Source and Target remain same. The way rsync is being invoked is
 different.

My question was meant to urge you to test with the sender and receiver
switched, and to compare those results as well. You should end up with 4 sets
of transfer speed data.

Analysing that should make clear if pulling/pushing is the cause, or if it 
is the network environment (sender/receiver).

Also make sure that the tests are not influenced by other factors (eg. 
other people doing exactly the same test as yours on the same 
network/system :)). Doing the tests multiple times during the day may make 
it more convincing to state your case.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Rsync push is slower compared to pull

2006-08-23 Thread Dag Wieers
On Wed, 23 Aug 2006, wwp wrote:

 On Wed, 23 Aug 2006 14:50:39 +0200 (CEST) Dag Wieers [EMAIL PROTECTED] 
 wrote:
  On Wed, 23 Aug 2006, Srinivasa Battula wrote:
   On Wed, 23 Aug 2006, Dag Wieers wrote:
On Wed, 23 Aug 2006, Srinivasa Battula wrote:

   It has been observed that rsync push mode is much slower when
 compared to pull (On Identical scenarios). Building/receiving file
 list takes almost same time. But data transfer is much slower, whose
 transfer ratios are ranging from 1:3 to 1:5. On pull operation data
 transfer speed is consistently around 3.5 MB/Sec and it reached 10
 MB/Sec. However, on push the maximum it could reach is around 2
 MB/Sec and hogs around 1MB/Sec. Why is this difference? Please help
 me on this...

Do these identical scenarios also include changing the sender and
receiver 
for both push and pull ? That would be required to dismiss the network 
configuration/environment as the possible cause.
  
   Source and Target remain same. The way rsync is being invoked is
   different.
  
  My question implied to urge you to test with the sender and receiver 
  switched. And compare those results as well. You should end up with 4 sets 
  of transfer speed data.
  
  Analysing that should make clear if pulling/pushing is the cause, or if it 
  is the network environment (sender/receiver).
  
  Also make sure that the tests are not influenced by other factors (eg. 
  other people doing exactly the same test as yours on the same 
  network/system :)). Doing the tests multiple times during the day may make 
  it more convincing to state your case.
 
 Wouldn't disk-read and disk-write speeds explain or hide the difference
 sometimes, if Srinivasa does the reverse test (exchange server and client
 machines, perform the same test)? Even, hardware speeds in general?

Could be, but the difference he is now seeing could not be explained by 
disks since the pull and push seems to be sending the data in the same 
direction. (from the same disk to the same disk)

I was not implying that the network necessarily was the problem, but at 
least reversing the direction would give much more information to analyse. 
Or debunk the initial proposed cause.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Packages for rsync 2.6.8 with ACLs

2006-05-02 Thread Dag Wieers
On Mon, 1 May 2006, Matt McCutchen wrote:

 I have finally made packages for rsync 2.6.8 with ACL support.  You can
 download a prepatched source package and RPMs from here:
   http://www.kepreon.com/~matt/myrsync/
 
 Or you can use this yum repository:
   http://www.kepreon.com/~matt/rpm/
 
 The RPM is called rsync-acl so automatic updating tools will know not to
 toss it in favor of plain rsync, but it Provides rsync so other
 packages like rsnapshot will be happy.
 
 At some point in the future, I will get back to improving the ACL
 support.

Are there any reasons why the rsync with acl support is not appropriate as 
a drop-in replacement for the 'normal' rsync ? 

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Long file names

2006-03-25 Thread Dag Wieers
On Sat, 25 Mar 2006, Robert Fitzpatrick wrote:

 I believe this post I found is related to a problem I'm having with long
 files names...I would like to use cwRsync, does my problem suggest this
 is still happening and has not been addressed?
 
 http://lists.samba.org/archive/rsync/2005-April/012117.html
 
 However, I have long file name issues using rsync 2.6.6 from a BSD
 server to the default Cygwin rsync 2.6.6. Should I (and how do I) apply
 this patch? Here is one of the errors I am experiencing...

If you're not running the latest rsync, it's wise to check the ChangeLog 
of the latest version. You'll not only find the answer to the question at 
hand, but you'll learn about other things as well.

http://www.samba.org/ftp/rsync/rsync-2.6.7-NEWS

Look for MAXPATHLEN.

- Some buffer sizes were expanded a bit, particularly on systems where
  MAXPATHLEN is overly small (e.g. cygwin).

That seems to be your problem. So upgrading to 2.6.7 will most likely fix 
this for you.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync: connection unexpectedly closed.

2006-03-17 Thread Dag Wieers
On Thu, 16 Mar 2006, C.Jagadish wrote:

 Dear Mr. Samir,
 
  We are using RedHat 7.2. We are getting same error as reported by you some
 time back (given below).
 
  Did you get any solution?

You may want to update your rsync version to something newer. Error output 
has been improved and it might have been an old bug that is already fixed 
by now.

Since you're running RH 7.2, I would recommend taking the latest EL2 
package from:

http://dag.wieers.com/packages/rsync/

And because RH 7.2 is no longer supported (and has not been for some time
now), it probably can't hurt to update it (either the OS or the package). :)

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync: connection unexpectedly closed.

2006-03-17 Thread Dag Wieers
On Fri, 17 Mar 2006, Dag Wieers wrote:

 On Thu, 16 Mar 2006, C.Jagadish wrote:
 
  Dear Mr. Samir,
  
   We are using RedHat 7.2. We are getting same error as reported by you some
  time back (given below).
  
   Did you get any solution?
 
 You may want to update your rsync version to something newer. Error output 
 has been improved and it might have been an old bug that is already fixed 
 by now.
 
 Since you're running RH 7.2, I would recommend taking the latest EL2 
 package from:
 
   http://dag.wieers.com/packages/rsync/
 
 And because RH 7.2 is no longer supported (and has not been for some time
 now), it probably can't hurt to update it (either the OS or the package). :)

Wayne,

Maybe it's useful to link to 3rd party rsync packages from the rsync 
website ? Maybe it's available from somewhere, but I couldn't find it just 
yet.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync files with certain mtime

2006-02-18 Thread Dag Wieers
On Fri, 17 Feb 2006, Matt McCutchen wrote:

 On Fri, 2006-02-17 at 14:55 +0100, Mario Ohnewald wrote:
  How would i rsync all files which are older than X-Days? I am missing
  some kind of -mtime option. Since this is quite common for backups i am
  wondering how you are doing this kind of stuff.
 
 Rsync doesn't have an option to transfer only files with a certain range
 of mtimes, and in general I think such fancy file selections should be
 made outside of rsync and passed to rsync using --files-from.  Try this
 incantation to transfer files at least 7 days old:
   (cd src && find . -atime +6 -print) | rsync --files-from=- src/ dest/

Few people know you can do this in bash as well, using process substitution:

rsync --files-from=<(find /../src -atime +6 -print) src/ dest/

Not that it matters in this case, but when a command needs several different 
input files, this comes in handy (especially if you have to repeat the same 
set of commands on the command line).
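When the selection criteria outgrow what find offers, the list can also be 
built by a small script. A minimal sketch in Python, assuming the goal is the 
one Mario describes (files whose mtime is at least N days old). The function 
name files_older_than and the src/ and dest/ paths are illustrative only and 
not part of rsync:

```python
import os
import time


def files_older_than(src, days, now=None):
    """Return paths relative to src whose mtime is at least `days` days old."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    selected = []
    for dirpath, _dirnames, filenames in os.walk(src):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.lstat(full).st_mtime <= cutoff:
                # --files-from expects names relative to the source directory
                selected.append(os.path.relpath(full, src))
    return sorted(selected)


if __name__ == "__main__":
    # Print one name per line, meant to be piped into something like:
    #   rsync -a --files-from=- src/ dest/
    for rel in files_older_than("src", 7):
        print(rel)
```

Using mtime here follows the original question; the find commands above test 
atime instead.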

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: RSYNC + iNotify

2006-02-08 Thread Dag Wieers
On Tue, 31 Jan 2006, Ryan Kather wrote:

 I'm looking for a way to continually monitor at least one but possibly 
 multiple directories (and/or individual files).  I would like RSYNC to 
 immediately synchronize the changes to said directory(ies) after they 
 occur.  I believe the best approach for this would be to utilize iNotify 
 enabled kernels and create a plugin for the RSYNC daemon.

 However, before I begin the task of actually writing some code (with my 
 poor abilities), I thought I would inquire if anyone else has already 
 created this or something similar?  Am I over thinking this, or is there 
 a better approach?  Is there a reason not to do this?  

I'm very interested in functionality like this. I remember it being 
brought up on this list before so I would look for similar mails in the 
archive for clues.

How to do it efficiently (e.g. for files in transit or still open), I don't 
know. It also seems to me that you may want a separate daemon that 
implements the rsync protocol itself (instead of relying on an external 
tool), as that allows you to optimize certain things and have less 
overhead.

I'm most interested in writing this in python, using a python-rsync 
implementation and python-inotify.
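One possible way to cope with files that are still being written is to wait 
until a path has been quiet for a while before syncing it. The sketch below 
only shows that batching step. It assumes some event source (for example an 
inotify binding) calls record() for every change; the class and function 
names are made up for illustration, and the quiet period is an arbitrary 
value. It falls back on the external rsync tool rather than implementing the 
protocol:

```python
import time


class ChangeBatcher:
    """Collect changed paths and release them after a quiet period."""

    def __init__(self, quiet_seconds=2.0):
        self.quiet_seconds = quiet_seconds
        self._last_seen = {}  # path -> time of the most recent event

    def record(self, path, now=None):
        """Note that `path` changed (to be called by the event source)."""
        self._last_seen[path] = time.monotonic() if now is None else now

    def take_ready(self, now=None):
        """Remove and return the paths that have been quiet long enough."""
        now = time.monotonic() if now is None else now
        ready = sorted(p for p, t in self._last_seen.items()
                       if now - t >= self.quiet_seconds)
        for p in ready:
            del self._last_seen[p]
        return ready


def rsync_command(src, dest):
    """Build an rsync call that reads the list of files from stdin."""
    return ["rsync", "-a", "--files-from=-", src, dest]
```

A caller would feed the names returned by take_ready(), one per line, to the 
standard input of the command built by rsync_command().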

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


writefd_unbuffered failed to write 32768 bytes: phase unknown

2005-09-15 Thread Dag Wieers
Hi,

I receive this error when rsyncing. I suspect it is caused by network 
problems, but the 'Invalid argument' and 'phase unknown' messages make the 
cause far from obvious.

...
packages/scribus/scribus-1.2.3-0.rf.src.rpm
Read from remote host rsync.sw.be: Invalid argument
rsync: writefd_unbuffered failed to write 32768 bytes: phase unknown 
[sender]: Broken pipe (32)
rsync: connection unexpectedly closed (796308 bytes received so far) 
[sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(434)

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: specifying ssh port #?

2005-08-05 Thread Dag Wieers
On Thu, 4 Aug 2005, Keith Warno wrote:

 Thanks for the replies.  All the proposed solutions should've been
 obvious to me.  :/ Is it Friday yet...? nope... dang...
 
 Yes I like the config file method as well, so long as I remember to use
 the same local port # when establishing the tunnel.

This would make a good FAQ. The multiple solutions (and applications) are 
not that obvious to new users.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Spam to this list

2005-03-25 Thread Dag Wieers
Hi,

I'm not sure what the policy of this list is and I bet everyone has a spam 
filter, so nobody might have noticed, but we got spammed.

Can anyone send mail to the list or do you have to subscribe first ?

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync error: protocol incompatibility (code 2) at main.c(451)

2005-03-15 Thread Dag Wieers
On Tue, 15 Mar 2005, Wayne Davison wrote:

 On Tue, Mar 15, 2005 at 04:15:46AM +0100, Dag Wieers wrote:
  Invalid packet at end of run [sender]
 
 OK, this was a combination of --delay-updates and --hard-links that was
 causing a problem at the end of the run.  I've checked in a fix to CVS,
 and the latest nightly tar file has this fix present.  I'm confident
 that will take care of the problem for you.  Thanks for the help!

Great service. :) I will package rsync-HEAD-20050315-1733GMT and send some 
feedback.

Thanks !
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Premature optimization in f_name_cmp()?

2005-03-14 Thread Dag Wieers
On Sun, 13 Mar 2005, Wayne Davison wrote:

 On Mon, Mar 14, 2005 at 10:40:00AM +1100, Andrew Bartlett wrote:
  Is there a way to make this only look at the 'right' part of sorted-
  flist, given it's sorted, and is this really needed at all?
 
 Yeah, that code is really pretty silly.  I neglected to test my
 assumption about how often it would get triggered, and it is actually
 getting run much more often than expected.

Could this be the reason why my rsync 2.6.4pre2 seems to be doing nothing 
for a long time after it has said how many files it considers?

Would a current CVS checkout have a fix for this so I can verify it? Is 
CVS in proper shape currently?

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Premature optimization in f_name_cmp()?

2005-03-14 Thread Dag Wieers
On Mon, 14 Mar 2005, Wayne Davison wrote:

 On Mon, Mar 14, 2005 at 10:15:18PM +0100, Dag Wieers wrote:
  But there seems to be something wrong with the --fuzzy option 
 
 Yes -- it's a bug in the change I just checked into generator.c to deal
 with the dirname-pointers change.  I've just checked in a fix (and also
 a test case for the --fuzzy option into the test suite so that I'll
 catch such a bug sooner).

Great, thanks a lot once again. I'll check with tomorrow's nightly 
build and report back if I find something unusual.

I'm checking whether rsync with the --fuzzy patch, together with rsyncable 
RPM packages, would be a solution to reduce total bandwidth for people 
mirroring/updating Fedora. There has been a lot of fuss to have patch RPMs 
introduced, but the gain does not justify the overall complexity imo.

And the --fuzzy patch could be an adequate solution to this, if I 
understand the concept.

BTW it might be useful to have the CVS manual page also included with the 
online documentation so people can check out new options before they 
decide to try a development release. (I noticed the current one offered is 
a bit dated already).

Kind regards!
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Premature optimization in f_name_cmp()?

2005-03-14 Thread Dag Wieers
On Mon, 14 Mar 2005, Wayne Davison wrote:

 On Mon, Mar 14, 2005 at 10:15:18PM +0100, Dag Wieers wrote:
  And the --fuzzy option was my main reason to try out the newer rsync.
 
 One other caveat -- the CVS version of rsync has a protocol change in it
 that makes it incompatible with 2.6.4 pre1/pre2 for certain options:
 --fuzzy, --compare-dest, --link-dest, --copy-dest, --hard-link, the
 combination of --inplace --backup, or (only if a partial-file is found
 by the generator) --partial-dir and --delayed-updates.  I recommend just
 updating all your pre1 or pre2 versions of 2.6.4 to the CVS version at
 the same time (if you have any).

I'm doing that (so as not to invalidate my test). Is it still 
--delay-updates, or did the option change name (again)?


 Another note:  if you want --fuzzy to help you with compressed files
 that have differing contents, you should use the --rsyncable option to
 gzip to compress the files.

I understood that. RPM 4.4.1-something now creates rsyncable compressed 
files. I hope Red Hat will finally also ship a zlib that can do rsyncable 
files (and someone fixes the python zlib bindings as well!) and maybe 
retrofit the RPM to older releases. (hope)

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


rsync error: protocol incompatibility (code 2) at main.c(451)

2005-03-14 Thread Dag Wieers
Wayne, others,

Using the same version on server and client, I still got this error. This 
was still based on last night's CVS (HEAD-20050314-1745GMT) and I am 
certain the versions on both sides are identical. :)

...
source/sudosh-1.4.7-1.rf.src.rpm
   98373 100%  536.69kB/s    0:00:00  (798, 99.9% of 244497)
Invalid packet at end of run [sender]
rsync error: protocol incompatibility (code 2) at main.c(451)

The rsync run completed without errors apart from that message.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync error: protocol incompatibility (code 2) at main.c(451)

2005-03-14 Thread Dag Wieers
On Mon, 14 Mar 2005, Wayne Davison wrote:

 On Tue, Mar 15, 2005 at 04:15:46AM +0100, Dag Wieers wrote:
  Invalid packet at end of run [sender]
 
 It would help to know what options you had enabled.

Damn, I was planning to include those. Here they are:

-avHl --progress --delete-after --delay-updates --exclude bert/ 
--exclude dries/ --exclude redhat/6.2 --exclude redhat/8.0
-e /usr/bin/ssh -oCompression=no 

Thanks,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync error: protocol incompatibility (code 2) at main.c(451)

2005-03-14 Thread Dag Wieers
On Mon, 14 Mar 2005, Wayne Davison wrote:

 On Tue, Mar 15, 2005 at 04:15:46AM +0100, Dag Wieers wrote:
  Invalid packet at end of run [sender]
 
 It would help to know what options you had enabled.

An update. I noticed that the files being transferred are not updated at the 
end of the run (--delay-updates). I seem to be missing the files, and on the 
next run they're transferred again. This is probably related to the error, 
because the first few updates worked without producing it.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: rsync error: protocol incompatibility (code 2) at main.c(451)

2005-03-14 Thread Dag Wieers
On Tue, 15 Mar 2005, Dag Wieers wrote:

 On Mon, 14 Mar 2005, Wayne Davison wrote:
 
  On Tue, Mar 15, 2005 at 04:15:46AM +0100, Dag Wieers wrote:
 Invalid packet at end of run [sender]
  
  It would help to know what options you had enabled.
 
 An update. I noticed that the files being transferred are not updated at the 
 end of the run (--delay-updates). I seem to be missing the files, and on the 
 next run they're transferred again. This is probably related to the error, 
 because the first few updates worked without producing it.

Another update. Only some of the hardlinks seem to be actually missing.
Does this make any sense ? I'll be doing some more tests to see if there's 
a pattern or I'm just making things up. Definitely something wrong though 
:)

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Premature optimization in f_name_cmp()?

2005-03-14 Thread Dag Wieers
On Mon, 14 Mar 2005, Wayne Davison wrote:

 On Mon, Mar 14, 2005 at 06:24:15PM +0100, Dag Wieers wrote:
 
  Would a current CVS checkout have a fix for this so I can verify this ?
  Is CVS in proper shape currently ?
 
 Yes, the CVS version (as well as the latest nightly tar file) is in
 good shape and has this fix.  I'm considering releasing what's there as
 2.6.4pre3, so if you'd care to try this out, that would be a nice pre-
 release test for this latest version.

I can confirm that the odd behaviour is fixed with the latest nightly 
build. But there still seems to be something wrong with the --fuzzy 
option:

Starting remote synchronisation.
[EMAIL PROTECTED]'s password:
building file list ...
241923 files to consider
rsync: connection unexpectedly closed (8 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(420)

And the --fuzzy option was my main reason to try out the newer rsync. I 
thought I'd seen previous mails with the exact same message (and 8 bytes 
:)) but can't find them now.

This is perfectly reproducible.

Thanks in advance,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


delay-renames patch

2005-02-21 Thread Dag Wieers
Hi,

I've been using the delay-renames patch for over a month now and would 
like to propose including it in the next release of rsync (if that is 
possible). It is fairly important for people who mirror sets of files that 
include separate metadata files, and it doesn't require much more than 
that option to work (if both sides understand it).

Is there a policy about what can go in the next rsync release or when 
something gets included in CVS ?

Thanks for --delay-renames :)
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]
-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: error in rsync protocol data stream (code 12) at io.c(692)

2005-01-16 Thread Dag Wieers
On Fri, 14 Jan 2005, Wayne Davison wrote:

 On Thu, Jan 13, 2005 at 07:51:33PM +0100, Dag Wieers wrote:
  unexpected tag -111
  rsync error: error in rsync protocol data stream (code 12) at io.c(692)
  
  Can this be caused by the delay-renames patch ?
 
 I tested how the code handles the verification-failed redo phase, both
 with and without the --delayed-rename patch, and I didn't get it to
 fail.  I thereafter diagnosed and checked-in fixes for the two problems
 above.  You'll probably want to check-out the latest CVS source to get
 them (since --delayed-renames makes use of the --partial-dir logic).

I have not seen the problem with the latest CVS sources so far, whereas 
before switching to the new version it was repeatable.

Thanks Wayne !
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: error in rsync protocol data stream (code 12) at io.c(692)

2005-01-14 Thread Dag Wieers
On Thu, 13 Jan 2005, Dag Wieers wrote:

 Hi Wayne, Jeff,
 
 With the same version/build as the last report, I got this error now:
 
   ...
   extra/state/all-packages.list
         145 100%   46.83kB/s    0:00:30  (634, 0.0% of 201605)
   WARNING: fedora/1/en/i386/base/pkglist.dag.bz2 failed verification -- 
   update put into partial-dir (will try again).
   unexpected tag -111
   rsync error: error in rsync protocol data stream (code 12) at io.c(692)
 
 Can this be caused by the delay-renames patch ?

After this, got another one:

source/perl-Net-XMPP-1.0-1.rf.src.rpm
   99575 100%    1.28kB/s    0:00:51  (645, 99.6% of 200819)
WARNING: extra/state/all-packages.list failed verification -- 
update put into partial-dir (will try again).
unexpected tag 60
rsync error: error in rsync protocol data stream (code 12) at io.c(692)

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


error in rsync protocol data stream (code 12) at io.c(692)

2005-01-13 Thread Dag Wieers
Hi Wayne, Jeff,

With the same version/build as the last report, I got this error now:

...
extra/state/all-packages.list
     145 100%   46.83kB/s    0:00:30  (634, 0.0% of 201605)
WARNING: fedora/1/en/i386/base/pkglist.dag.bz2 failed verification -- 
update put into partial-dir (will try again).
unexpected tag -111
rsync error: error in rsync protocol data stream (code 12) at io.c(692)

Can this be caused by the delay-renames patch ?

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Protocol incompatibility with rsync 2.6.4 (cvs) and disk full

2005-01-12 Thread Dag Wieers
Hi,

I got this error after a transaction was interrupted by me 
and a new (slightly updated) transaction was started.

...
fedora/1/en/i386/base/pkglist.dag.bz2
 1246954 100%   67.79kB/s    0:00:17  (27, 1.7% of 201557)
Invalid file index 1777522462 (count=201557)
rsync error: protocol incompatibility (code 2) at sender.c(152)

On both ends exactly the same rsync was used. It's a packaged release 
from CVS yesterday:

http://dag.wieers.com/packages/rsync/

A second run (with again a slightly updated transaction) gives the same 
error on another file:

...
fedora/2/en/x86_64/base/pkglist.dag.bz2
 1503129 100%   57.80kB/s    0:00:25  (64, 10.5% of 201595)
Invalid file index -1487994937 (count=201595)
rsync error: protocol incompatibility (code 2) at sender.c(152)

A third run gives:

fedora/2/en/x86_64/base/pkglist.dag.bz2
 1503129 100%   12.00kB/s    0:01:59  (64, 10.5% of 201595)
Invalid file index -131924025 (count=201595)
rsync error: protocol incompatibility (code 2) at sender.c(152)

And then I noticed my partition was full, made some room, synced again and 
it miraculously worked.

I'm not sure why it gives this weird error when the *sender* has a full 
partition, but apparently it did.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: Preliminary Suggestion For Atomic Transactions

2005-01-05 Thread Dag Wieers
On Thu, 6 Jan 2005, Jeff Pitman wrote:

 On Thursday 06 January 2005 07:04, Carson Gaspar wrote:
  I have no objection to the option, just to the name - don't call
  things atomic if they aren't. Call it delayed-rename, or whatever.
 
 How about --rename-after?

I'd rather see it called something like --near-atomic or something else 
abstract (that is explained in the manual), instead of some action that 
may be mis-interpreted.

But I'm more interested to know if something like this is acceptable to 
get included or not. Not this patch specifically, but the same 
functionality.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


atomic transaction set option to rsync

2005-01-03 Thread Dag Wieers
Hi,

Apparently a change of behaviour from rsync 2.5 to rsync 2.6 affected the 
way I worked. I provide RPM repositories that I mirror using rsync. It is 
important to have the repository meta-data in sync with the data otherwise 
people have errors using Yum or Apt.

In the old days (with older rsyncs) I was able to influence the order in 
which my transaction set was processed by changing the order of 
directories I wanted rsync to mirror, so that the metadata was uploaded 
after most of the data and deletes were done at the end of the 
transaction.

With newer rsyncs, rsync seems to sort the transaction set, so I can no 
longer use this trick to have the metadata uploaded just after the data in 
the same transaction set.

I was wondering if it was possible and acceptable to have an rsync option 
to update the whole transaction in an atomic (or near-atomic) way. This 
would also prevent the current problems when a mirror is rsyncing another 
mirror that is itself still rsyncing. Since I have less bandwidth for 
updates than most other mirrors, I'm often caught in this scenario.

There are other ways to work around it, either by uploading in different 
steps (which is impractical in my scenario) or by using a staging area 
(which is impossible and impractical for large mirror sites).

An option to atomically sync a transaction set would be a godsend for 
situations like these and probably (if there is not too much overhead) a 
behaviour most repository mirrors would want by default.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: atomic transaction set option to rsync

2005-01-03 Thread Dag Wieers
On Mon, 3 Jan 2005, Steve Bonds wrote:

 On Mon, 3 Jan 2005 17:39:19 +0100 (CET), Dag Wieers dag-at-wieers.com wrote:

  There are other ways to work around it, either by uploading in different
  steps (which is impractical in my scenario) or by using a staging area
  (which is impossible and impractical for large mirror sites).
 
 Rsync does a good job of ensuring file-level coherence by using a
 temporary file during the transfer and a quick rename to the original
 at the end.  Unfortunately for you, this is only good for a single
 file.  If this were done on a larger scale, it would serve as an
 atomic transaction-- but then rsync is just using a staging area of
 its own creation.  The same thing could be accomplished by manually
 creating the staging area and only using rsync as the data transport. 
 (Which is really what it's designed for.)

Well, that would be one solution. Upload everything with deterministic 
temporary files that rsync by default would ignore when mirroring. Then 
when the transaction set is finished, hardlink all files to the real name 
and finally remove all temporary files.

(Having deterministic temporary files is important so that if a 
transaction fails it can be continued similarly to how rsync is working 
already)

I think it could work, should not cause much overhead and could save a lot 
of people headaches in situations like mine. (I do understand this is only 
a small number of the many uses of rsync though)
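To make the idea concrete, here is a rough sketch of that final step in 
Python. It is not rsync code. The ".stage." prefix is a made-up convention 
for the deterministic temporary names, and the function names are 
hypothetical. A file that already exists under its real name is replaced by a 
rename, since a hard link cannot overwrite an existing name:

```python
import os

PREFIX = ".stage."  # hypothetical marker for deterministic temporary names


def temp_name(path):
    """Deterministic temporary name next to the real file."""
    head, tail = os.path.split(path)
    return os.path.join(head, PREFIX + tail)


def is_temp(path):
    """True for names a mirroring run would skip by default."""
    return os.path.basename(path).startswith(PREFIX)


def promote(paths):
    """Give every staged file its real name, then remove the temporaries."""
    staged = [p for p in paths if os.path.exists(temp_name(p))]
    for path in staged:
        tmp = temp_name(path)
        try:
            os.link(tmp, path)       # new file: hard-link to the real name
        except FileExistsError:
            os.replace(tmp, path)    # existing file: rename over it
    for path in staged:
        tmp = temp_name(path)
        if os.path.exists(tmp):
            os.unlink(tmp)           # clean up what the link step left behind
    return staged
```

Because the temporary name depends only on the real name, an interrupted run 
can find the files it already uploaded and carry on from there.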

 
 I don't see how uploading in different steps would be impractical. 
 The most bulletproof way to do this would be to sync each rpm and
 header file in one rsync session.  However, for your collection of
 thousands of file pairs, this would indeed be impractical.  Breaking
 it up into 10-20 sessions with several dozen file pairs each would be
 practical and could be automated with some shell or Perl wizardry.

Well, it is impractical for several reasons.

 1. I don't control my mirrors, so unless rsync itself offers this 
    behaviour in a way that mirrors can enable easily, it's going to be 
    very hard to have mirrors do something special for me. I'm pretty 
    sure I don't have that authority, unless it's just a switch they 
    have to enable.

 2. Breaking it up into several sessions is hard because I only use 
    passphrases. I don't allow password-less connections, nor do I 
    sign packages automatically, because I think that is important 
    security-wise. Such a change would reduce my ability to work 
    flexibly (I'm already restricted by some other tools and processes).


 Another option you didn't mention would be to make use of LVM
 snapshots to ensure that your repository is always internally
 consistent even while you're in the middle of an rsync.  The
 disadvantage would be some periodic unavailability while you removed
 and re-created your snapshots.  (i.e. the FTP server is configured to
 serve files from the read-only snap volumes, which need to be
 unmounted and re-snapped when new files are uploaded.)

Impossible for the same reason. I only manage a private server that my 
main mirror has access to. I don't even have control over that main 
mirror. So even when I have a complex solution for myself, it would still 
be impossible for others. Rsync, imho, is the best place to have this 
functionality as it already has all the information to make the correct 
decisions.


 There may be some full site-replication applications out there making
 use of rsync.  I suspect someone here on the list would know.  I've
 always just created my own custom scripts for this.

Well, any solution I can think of either requires the same functionality 
as rsync provides (for which I can only think of sub-optimal 
implementations) or uses rsync a second time (which would slow things down 
and require me to type a password again).


 Thanks again for the RPMs.  I hope you can find a good solution to
 your mirroring dilemma.

Thanks.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: atomic transaction set option to rsync

2005-01-03 Thread Dag Wieers
On Mon, 3 Jan 2005, Wayne Davison wrote:

 On Mon, Jan 03, 2005 at 05:39:19PM +0100, Dag Wieers wrote:
  With newer rsyncs, rsync seems to sort the transaction set, so I
  can no longer use this trick to have the metadata uploaded just after
  the data in the same transaction set.
 
 Rsync has always sorted the list of files to be sent, so this is not
 something that is different between 2.5.x and 2.6.x.  I'd be interested
 in hearing what you believe to be different in how files are processed.

In the past I could say something like:

rsync -a dir1/ dir2/ [EMAIL PROTECTED]:/remote-dir/

and it would process dir1 first and then dir2. I'm not sure when this 
change happened; it may not have been with 2.6.x.

This way I first added the packages/ dir (which contains hardlinks of only 
the packages) and then the repository (packages+metadata), resulting in a 
much smaller window between the start of mirroring the repodata and 
finishing the hardlinking of all packages. (still 5 to 10 minutes !)

If rsync includes the atomic transaction set I explained in a previous 
post, it can be reduced to an even smaller window (only the renaming of 
the files strictly). Even the hardlinking can be done in advance.


  I was wondering if it was possible and acceptable to have an rsync
  option to update the whole transaction in an atomic (or near-atomic)
  way.
 
 One way to do this would be to use the --link-dest option to create a
 new hierarchy of files (with only the changed files getting sent, and
 all unchanged files being hard-linked to the prior files) and then
 moving the whole file-set into place all at once.  Imagine that there
 is a hierarchy you want to update in /dest/cur by running this script:

I know this, but I don't control all the mirrors and I can't ask all my 
mirrors to implement one of these complex alternatives. Besides this 
requires a per mirror configuration (as it requires a fixed directory per 
rsync transaction). Something inherent to rsync (the same way current 
atomic-per-file behaviour works) that kicks in at the end of a transaction 
set instead of after each transfer (of an individual file) works 
everywhere with a single option and no fuss.

I can ask mirrors to use '--atomic' or '--atomic-ts', but I can't ask them 
to re-organise their mirror-scripts just for me.
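For readers who do control their destination, the --link-dest approach from 
the quoted text could look roughly like the sketch below. This is not Wayne's 
script (which is not reproduced above), only an illustration under these 
assumptions: the current tree lives in /dest/cur, the new tree is built in a 
sibling directory, and both are on the same filesystem. The function names 
and the "old" directory are made up:

```python
import os


def link_dest_command(src, cur, new):
    """rsync call that builds `new`, hard-linking unchanged files to `cur`."""
    # An absolute path avoids --link-dest being resolved relative to `new`.
    return ["rsync", "-a", "--link-dest=" + os.path.abspath(cur), src, new]


def swap_into_place(cur, new, old):
    """Move the freshly built tree into place with two renames.

    `old` must not exist yet; the caller removes it after the swap.
    """
    if os.path.exists(cur):
        os.rename(cur, old)   # brief window in which `cur` does not exist
    os.rename(new, cur)
```

The per-mirror directory layout this needs is exactly the configuration 
burden described above.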

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


Re: atomic transaction set option to rsync

2005-01-03 Thread Dag Wieers
On Mon, 3 Jan 2005, Wayne Davison wrote:

 On Tue, Jan 04, 2005 at 02:51:23AM +0100, Dag Wieers wrote:
  In the past I could say something like:
  
  rsync -a dir1/ dir2/ [EMAIL PROTECTED]:/remote-dir/
  
  and it would process first dir1 and then dir2.
 
 The filenames read in from dir1 and dir2 have always been sorted into a
 single list of files, so dir1's files will only be sent prior to dir2's
 files if they sort alphabetically earlier in the list.
 
  This way I first added the packages/ dir (which contains hardlinks of only 
  the packages) and then the repository (packages+metadata).
 
 Ahh, now there's a difference that is affected by the 2.6.x series:
 hard-link handling.  If the first instance of a hard-link is not found,
 rsync holds off on sending the file in the hopes that one of the other
 links for the file will match up with an existing file on the receiving
 side.  This avoids a bug where a new hard-link can cause rsync to
 re-send all the file's data just because it sorted alphabetically
 earlier in the list than the other (existing) link(s).

Ok, it doesn't really matter what caused the change in behaviour. I know I 
was using something that was not advertised as such (or was not set in 
stone), so I was not surprised it happened. The new functionality would be 
much better and safer than how I did it before, would actually work for 
other mirrors too, and is something people could rely on.


  I can ask mirrors to use '--atomic' or '--atomic-ts', but I can't ask
  them to re-organise their mirror-scripts just for me.
 
 Since you'd have to ask them to install a new rsync, maybe just ask them
 to install the attached perl script instead.  Then, they could run
 atomic-rsync ... instead of their current rsync ... command.  (The
 attached script works if they're doing a pull.)

Well, having a newer rsync will be mandatory with the next security update :) 
And if someone had added this 5 years ago, this would be no issue. So 
let's hope it is added sooner rather than later (if accepted).


 The idea of doing a massive number of renames at the end of the transfer
 is interesting, but it is not as atomic as the algorithm implemented by
 the above script.  However, if you'd prefer going that route, I'd
 imagine the implementation sharing a lot of the code that --partial-dir
 uses.  E.g., add an --atomic-dir=.atomic option that causes all finished
 files to be saved off in the .atomic dir (relative to their destination)
 and then add an ending pass that goes back through the file list and
 renames all the .atomic/FOO files.  Something like that should be pretty
 easy to whip up.

Well, the reason why I think a new feature in rsync makes sense is 
that it does not need an extra directory. I would like to get rid of 
the optional directory, as I would like to make it possible for 2 rsyncs 
to run at the same time, and for this to be just another flag to 
add to a script instead of some new logic to make the optional directory 
unique.

I looked at the code, but it only reminded me how little developed my C 
knowledge is.

The changes I think are required are:

Add an --atomic option as boolean and check if the combination of 
options make any sense at all :) (in options.c)

Make the recv_files() function understand it has to delay its 
finish_transfer() call. Save some of the information required to 
make a decisive call in some transaction struct. (like the 
temporary name) (in receiver.c)

Then at the end of recv_files() before delete_files() run 
finish_transaction() on the transaction struct that calls 
finish_transfer() when necessary. (in receiver.c)
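
The steps above could be sketched roughly as follows. This is only an
illustration of the idea, not code from receiver.c: struct transaction,
transaction_add() and transaction_finish() are hypothetical names, and
where rsync's finish_transfer() also handles permissions and ownership,
the sketch only does the rename.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One delayed rename: temporary file -> final destination name. */
struct pending_rename {
    char *tmp_name;
    char *final_name;
};

/* Hypothetical transaction record; not an existing rsync structure. */
struct transaction {
    struct pending_rename *items;
    size_t count;
    size_t capacity;
};

/* Return a malloc'ed copy of s, or NULL if out of memory. */
static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Remember a finished transfer instead of renaming it right away.
 * Returns 0 on success, -1 if out of memory. */
static int transaction_add(struct transaction *t,
                           const char *tmp_name, const char *final_name)
{
    assert(t != NULL && tmp_name != NULL && final_name != NULL);
    if (t->count == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 16;
        struct pending_rename *p = realloc(t->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        t->items = p;
        t->capacity = new_cap;
    }
    char *tmp = dup_str(tmp_name);
    char *fin = dup_str(final_name);
    if (tmp == NULL || fin == NULL) {
        free(tmp);
        free(fin);
        return -1;
    }
    t->items[t->count].tmp_name = tmp;
    t->items[t->count].final_name = fin;
    t->count++;
    return 0;
}

/* Move every recorded file into place at the end of the run and
 * release the record. Returns the number of renames that failed. */
static size_t transaction_finish(struct transaction *t)
{
    size_t failures = 0;
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (rename(t->items[i].tmp_name, t->items[i].final_name) != 0) {
            perror(t->items[i].final_name);
            failures++;
        }
        free(t->items[i].tmp_name);
        free(t->items[i].final_name);
    }
    free(t->items);
    t->items = NULL;
    t->count = t->capacity = 0;
    return failures;
}
```

The receiver would call transaction_add() where it now renames each file,
and transaction_finish() once after the last file. Each rename is still
atomic per file; the set as a whole is not, the window just gets short.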

If we want to speed up this last step it may make sense to split off the 
renaming and the permission/owner changes ?

I hope someone can pick this up that has better C skills or knows the 
code much better than me.


PS The importance of this change is that it makes it much less likely that 
someone starts an rsync between the time I start uploading the metadata 
and the time my rsync finishes.

That window goes from somewhere between 1 and 8 hours, depending on the 
transaction size (50MB to 350MB), to something likely less than 10 secs, 
depending on the number of transaction objects needing a move (avg. 100).

Bringing this 10 secs down to 1 sec or less (or truly atomic by swapping a 
directory) is less important, especially when it has other drawbacks.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[all I want is a warm bed and a kind word and unlimited power]


rsync error: errors selecting input/output files, dirs (code 3) at flist.c(980)

2003-09-14 Thread Dag Wieers
Hi,

I've got another error that is not in the FAQ ;) It happens when rsync is 
run from a script in a directory that no longer exists:

shell-init: could not get current directory: getcwd: cannot access parent 
directories: No such file or directory
building file list ... 
pop_dir /mnt/dar : No such file or directory
rsync error: errors selecting input/output files, dirs (code 3) at flist.c(980)

I guess it happens because the current directory (pwd) has been deleted. 
Nevertheless, I think rsync should handle this situation (as any other 
command can handle it).
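
To show what I mean by handling it, here is a minimal sketch. It is not 
rsync's push_dir/pop_dir code; save_cwd_or_fallback() is a made-up name, 
and falling back to a fixed directory is only an assumption about what a 
reasonable behaviour could be.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to record the current directory in buf. If getcwd() fails
 * (for example because the directory was removed), warn and store
 * the fallback path instead of aborting.
 * Returns 1 if the real cwd was stored, 0 if the fallback was used. */
static int save_cwd_or_fallback(char *buf, size_t size, const char *fallback)
{
    assert(buf != NULL && fallback != NULL && size > 0);
    if (getcwd(buf, size) != NULL)
        return 1;
    fprintf(stderr, "warning: getcwd failed (%s), using %s\n",
            strerror(errno), fallback);
    /* snprintf truncates safely if the fallback does not fit. */
    snprintf(buf, size, "%s", fallback);
    return 0;
}
```

A caller that only needs somewhere valid to return to could pass "/" as 
the fallback and carry on with the transfer.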

The directory in question isn't used by rsync, it was just coincidence I 
was in that directory while it was removed (by rpm) after I ran my 
synchronizing script.

The
pop_dir /mnt/dar : No such file or directory

message blows my mind because that directory exists and has not been 
touched.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]



rsync error: error in rsync protocol data stream (code 12) at io.c(463)

2003-09-13 Thread Dag Wieers
Hi,

I'm having a problem rsyncing one file (since I signed it). It seems that 
the content of a file is able to cause problems in the protocol.

building file list ... 
28820 files to consider
apt/packages/avifile/
apt/packages/avifile/avifile-0.7.34-1.dag.rh90.i386.rpm
rsync: error writing 4 unbuffered bytes - exiting: Broken pipe
rsync error: error in rsync protocol data stream (code 12) at io.c(463)

I'm using rsync-2.5.5-4 (the rsync shipped with RH9). The first time, rsync 
hung (indefinitely); every subsequent run gives the above error.

Kind regards,
--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]



Re: rsync error: error in rsync protocol data stream (code 12) at io.c(463)

2003-09-13 Thread Dag Wieers
On Sat, 13 Sep 2003, Dag Wieers wrote:

 I'm having a problem rsyncing one file (since I signed it). It seems that 
 the content of a file is able to cause problems in the protocol.
 
   building file list ... 
   28820 files to consider
   apt/packages/avifile/
   apt/packages/avifile/avifile-0.7.34-1.dag.rh90.i386.rpm
   rsync: error writing 4 unbuffered bytes - exiting: Broken pipe
   rsync error: error in rsync protocol data stream (code 12) at io.c(463)
 
 I'm using rsync-2.5.5-4 (the rsync shipped with RH9). The first time, rsync 
 hung (indefinitely); every subsequent run gives the above error.

Using rsync-2.5.6 I get essentially the same error:

building file list ... 
28844 files to consider
apt/packages/
apt/packages/avifile/
apt/packages/avifile/avifile-0.7.34-1.dag.rh90.i386.rpm
rsync: writefd_unbuffered failed to write 4 bytes: phase unknown: Broken pipe
rsync error: error in rsync protocol data stream (code 12) at io.c(515)

I'm now going to test with an unpatched rsync, although looking at the Red 
Hat patches I don't see anything that could cause this.


PS: Is there a reason why the Red Hat patches are not applied to the rsync 
sourcecode ? I've attached them for inspection.

--   dag wieers,  [EMAIL PROTECTED],  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]
--- io.c.orig   2003-09-13 22:36:40.0 +0200
+++ io.c        2003-09-13 22:39:13.0 +0200
@@ -509,7 +509,7 @@
              * across the stream */
             io_multiplexing_close();
             rprintf(FERROR, RSYNC_NAME
-                ": writefd_unbuffered failed to write %ld bytes: phase \"%s\": %s\n",
+                ": writefd_unbuffered failed to write %lu bytes: phase \"%s\": %s\n",
                 (long) len, io_write_phase,
                 strerror(errno));
             exit_cleanup(RERR_STREAMIO);
@@ -605,7 +605,7 @@
     }

     while (len) {
-        int n = MIN((int) len, IO_BUFFER_SIZE-io_buffer_count);
+        int n = MIN((ssize_t) len, IO_BUFFER_SIZE-io_buffer_count);
         if (n > 0) {
             memcpy(io_buffer+io_buffer_count, buf, n);
             buf += n;
--- match.c.orig        2003-09-13 22:39:22.0 +0200
+++ match.c     2003-09-13 22:42:59.0 +0200
@@ -153,12 +153,12 @@
     last_i = -1;

     if (verbose > 2)
-        rprintf(FINFO,"hash search b=%ld len=%.0f\n",
+        rprintf(FINFO,"hash search b=%lu len=%.0f\n",
             (long) s->n, (double)len);

     /* cast is to make s->n signed; it should always be reasonably
      * small */
-    k = MIN(len, (OFF_T) s->n);
+    k = MIN(len, (ssize_t) s->n);

     map = (schar *)map_ptr(buf,0,k);

@@ -173,7 +173,7 @@
     end = len + 1 - s->sums[s->count-1].len;

     if (verbose > 3)
-        rprintf(FINFO, "hash search s->n=%ld len=%.0f count=%ld\n",
+        rprintf(FINFO, "hash search s->n=%lu len=%.0f count=%lu\n",
             (long) s->n, (double) len, (long) s->count);

     do {
@@ -190,13 +190,13 @@

         sum = (s1 & 0xffff) | (s2 << 16);
         tag_hits++;
-        for (; j < (int) s->count && targets[j].t == t; j++) {
+        for (; j < (ssize_t) s->count && targets[j].t == t; j++) {
             int l, i = targets[j].i;

             if (sum != s->sums[i].sum1) continue;

             /* also make sure the two blocks are the same length */
-            l = MIN(s->n,len-offset);
+            l = MIN((ssize_t) s->n,len-offset);
             if (l != s->sums[i].len) continue;

             if (verbose > 3)
@@ -216,7 +216,7 @@

             /* we've found a match, but now check to see
                if last_i can hint at a better match */
-            for (j++; j < (int) s->count && targets[j].t == t; j++) {
+            for (j++; j < (ssize_t) s->count && targets[j].t == t; j++) {
                 int i2 = targets[j].i;
                 if (i2 == last_i + 1) {
                     if (sum != s->sums[i2].sum1) break;
@@ -232,7 +232,7 @@

             matched(f,s,buf,offset,i);
             offset += s->sums[i].len - 1;
-            k = MIN((len-offset), s->n);
+            k = MIN((len-offset), (ssize_t) s->n);
             map = (schar *)map_ptr(buf,offset,k);
             sum = get_checksum1((char *)map, k);
             s1 = sum & 0xFFFF;