Re: patch to enable faster mirroring of large filesystems

2001-11-26 Thread Andrew J. Schorr

On Sun, Nov 25, 2001 at 03:21:51AM +1100, Martin Pool wrote:
 On 20 Nov 2001, Dave Dykstra [EMAIL PROTECTED] wrote:
 
   And, by the way, even if the batch stuff accomplishes the same performance
   gains, I would still argue that the --files-from type of behavior
   that I implemented is a nice transparent interface that people might
   like to have.  The ability to pipe in output from gfind -print0 opens
   up some possibilities.
  
  Yes, many people have argued that a --files-from type of option is
  desirable and I agree.
 
 I agree too.  I think there would certainly be no argument with taking
 a patch that did only that.  (Except that it makes the option list
 even more ridiculously bloated.)
 
 I think a better fix is to transfer directories one by one, pipelined,
 rather than walking the whole tree upfront.  This will take a protocol
 change and a fair amount of work, but I think it is possible.  If we
 can get it to just work faster transparently that is better.

I understand your point of view, but I think it is a mistake to
hold rsync's algorithm hostage to the directory tree traversal logic
built into the program.  IMHO, the basic file transfer algorithm of
rsync is terrific, but the program wrapped around it is a bit out of
control.

The spirit of my patch is to expose the low-level rsync algorithm and
to allow people to build up their customized infrastructure outside
of the program instead of having to build it in.  I think this is in
the spirit of Unix tools.  I think if rsync were to expose some of its
low-level capabilities, then we would not have a need for xdelta and rdiff,
projects which are springing up because of rsync's opaqueness.
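
As a rough illustration of the workflow discussed above, where the file list
is built outside rsync and piped in, here is a small Python sketch that emits
and parses a NUL-delimited list in the style of find -print0.  The function
names print0 and parse_files_from are invented for this example; they are not
part of rsync or of the patch.

```python
import os


def print0(root):
    """Return every file under root as NUL-terminated paths, like find -print0."""
    out = bytearray()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            out += os.fsencode(os.path.join(dirpath, name)) + b"\0"
    return bytes(out)


def parse_files_from(stream):
    """Split a NUL-delimited byte string back into a list of paths."""
    # NUL cannot occur inside a path, so names with spaces or newlines survive.
    return [os.fsdecode(p) for p in stream.split(b"\0") if p]
```

NUL is used as the delimiter because it is the one byte that cannot appear in
a file name, which is why the -print0 form is safer than newline-separated
output.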

Anyway, you may not like the way my patch is implemented, but I still argue
that it serves a useful purpose, and it gets the job done for me.

Cheers,
Andy




Request for an rsync-announce mailing list

2001-11-26 Thread Marc Stephenson

Howdy,
It would be nice if there was a low-volume rsync-announce mailing list
for announcements of new releases and security-related information.  This
would help casual users keep current on rsync without the relatively high volume
generated on the rsync list.  Just a suggestion.
-- 
Marc Stephenson   IBM Server Group - Austin, TX
Internet:  [EMAIL PROTECTED]  NOTES: [EMAIL PROTECTED]
Phone:   512-327-5670  T/L 678-3189




Re: Password-less daemon or rsh setup

2001-11-26 Thread Dave Dykstra

On Thu, Nov 22, 2001 at 04:52:02PM -, [EMAIL PROTECTED] wrote:
 I'm new to rsync and trying to configure it to run through either the daemon
 or rsh (ssh is not available).  Doing an 'rlogin production' results in a
 password prompt - which I want rid of, as we're basically going to have a
 robot user running a cron job that utilises rsync.  I'm at a loss as to why
 I can't configure rlogin to use hosts.equiv or similar - any help on this
 would be greatly appreciated as it's definitely the preferred method.

That depends on your operating system, but it's usually a very
simple matter.  The hardest part is usually finding out what the server
thinks the name of the client is.  Logging in and running 'who am i'
will tell you on some operating systems.


 I've had some limited success with the daemon approach - but again I'm
 prompted for a password.  Here's the command I'm trying to run:
 
 backup(nemxb01)$ rsync -nr production::test_module /export/home/nemxb01
 
 If I supply nemxb01's password I get the correct results.  FYI: nemxb01
 doesn't exist on the production box.

That depends on production's rsyncd.conf; if you post that we could tell
you what needs to be changed.  It's surprising that it would prompt for a
password because it needs to be explicitly set up to do so.
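
For reference, a module that prompts for a password generally looks something
like the sketch below.  The path, the user name and the secrets file location
are placeholders, not the poster's actual configuration; "auth users" and
"secrets file" are the rsyncd.conf parameters that turn authentication on.

```
# Hypothetical /etc/rsyncd.conf on "production"
[test_module]
    # placeholder path
    path = /some/exported/dir
    read only = yes
    # These two parameters are what make the daemon ask for a password.
    auth users = nemxb01
    secrets file = /etc/rsyncd.secrets
```

The names listed in "auth users" are checked against the secrets file and do
not have to be system accounts, which would be consistent with nemxb01 not
existing on the production box.  Removing both parameters should make the
module anonymous.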


 If someone could give me a step-by-step approach as to how to configure
 rlogin, and subsequently rsync to run through rsh I'd appreciate it.

- Dave Dykstra




Re: --no-detach option?

2001-11-26 Thread Jos Backus

On Mon, Nov 26, 2001 at 07:37:33PM +1100, Martin Pool wrote:
 I'll apply something like Jos's patch, with a modification to not
 create another bloody global variable but instead stick it in an
 options struct.

Good idea. Btw, OpenSSH also uses structs to pass options around.
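
A minimal sketch of the options-struct idea, written in Python rather than
rsync's C purely for illustration.  DaemonOptions, parse_options and
should_detach are hypothetical names, not anything in the rsync source.

```python
import argparse
from dataclasses import dataclass


@dataclass
class DaemonOptions:
    """All daemon-related settings in one object instead of separate globals."""
    daemon: bool = False
    no_detach: bool = False
    port: int = 873  # the usual rsync daemon port


def parse_options(argv):
    """Parse the command line into a single DaemonOptions value."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--daemon", action="store_true")
    parser.add_argument("--no-detach", action="store_true")
    parser.add_argument("--port", type=int, default=873)
    return DaemonOptions(**vars(parser.parse_args(argv)))


def should_detach(opts):
    """Detach only when running as a daemon and --no-detach was not given."""
    return opts.daemon and not opts.no_detach
```

Functions then take the options object as a parameter, so adding a new flag
means adding one field rather than one more global.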

 People who work with Apache should try the mostly-undocumented -X
 option, which tells it to not detach and also not fork on incoming
 connections.  For debugging that's even more useful.

<off-topic>
Yes, however 2.0.28 still has a problem with this; see my message with subject
``HAVE_SETSID'' problem on the apache-dev list, which unfortunately has been
ignored so far. I notified the maintainer of the FreeBSD port and he told me
he has submitted a bug report to the Apache folks. So hopefully this problem
will be fixed at some point. The remaining issue with ``-X NO_DETACH'' is that
it keeps httpd from creating its own process group, even though it _kills_ the
process group it is in. This is bad news because this process group contains
svscan/supervise/etc. So what is needed for httpd is another option that only
causes apr_proc_detach() to call setsid()|setpgrp()|..., not the other stuff.
All this is really ugly anyway; httpd should simply keep track of its
children's PIDs (I bet it does already) and kill() those instead of a process
group it doesn't own.
</off-topic>
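
To make the "track your own children" suggestion concrete, here is a small
Python sketch.  It is only an illustration of the idea, not how httpd is
implemented; ChildTracker is a made-up name.

```python
import subprocess
import sys


class ChildTracker:
    """Remember every child we start so shutdown never touches a process group."""

    def __init__(self):
        self.children = []

    def spawn(self, argv):
        proc = subprocess.Popen(argv)
        self.children.append(proc)
        return proc

    def shutdown(self, timeout=5.0):
        """Signal each tracked child individually, then reap them all."""
        for proc in self.children:
            if proc.poll() is None:  # still running
                proc.terminate()
        for proc in self.children:
            proc.wait(timeout=timeout)
        codes = [proc.returncode for proc in self.children]
        self.children.clear()
        return codes


if __name__ == "__main__":
    tracker = ChildTracker()
    for _ in range(2):
        tracker.spawn([sys.executable, "-c", "import time; time.sleep(30)"])
    print(tracker.shutdown())
```

Because only the recorded children are signalled, a supervisor such as
svscan or supervise sharing the same process group is left alone.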

S[ai]mple --no-detach manpage patch:

Index: rsync.1
===================================================================
RCS file: /cvsroot/rsync/rsync.1,v
retrieving revision 1.95
diff -u -r1.95 rsync.1
--- rsync.1 14 Aug 2001 02:04:49 -  1.95
+++ rsync.1 26 Nov 2001 18:46:24 -
@@ -296,6 +296,7 @@
  --include-from=FILE     don\'t exclude patterns listed in FILE
  --version               print version number
  --daemon                run as a rsync daemon
+ --no-detach             do not detach from the parent
  --address               bind to the specified address
  --config=FILE           specify alternate rsyncd\.conf file
  --port=PORT             specify alternate rsyncd port number
@@ -714,6 +715,10 @@
 config file (/etc/rsyncd\.conf) on each connect made by a client and
 respond to requests accordingly\. See the rsyncd\.conf(5) man page for more
 details\. 
+.IP 
+.IP \fB--no-detach\fP 
+When running as a daemon, this option instructs rsync to not detach itself and
+become a background process\.
 .IP 
 .IP \fB--address\fP 
 By default rsync will bind to the wildcard address


-- 
Jos Backus _/  _/_/_/Santa Clara, CA
  _/  _/   _/
 _/  _/_/_/ 
_/  _/  _/_/
[EMAIL PROTECTED] _/_/   _/_/_/use Std::Disclaimer;




Re: rsync-ing compressed archives

2001-11-26 Thread Dave Dykstra

On Sun, Nov 25, 2001 at 08:39:23PM +0100, Mauro Condarelli wrote:
 Hi there!
 I'm quite happily using rsync.
 There is only one case where I couldn't figure out how to use it
 efficiently:
 I sometimes have large compressed files (either .tar.gz or .zip) that
 I need to keep synchronized. The exploded files are usually not
 available on the machines I use for rsync (to keep everything unpacked
 would mean wasting a lot of space on the server).
 The problem is that even if I slightly change the contents of the
 archive (just add a file) the compressed form will be almost
 completely different, making the rsync algorithm practically useless.
 Is there any easy way to tell the daemon to unpack (or just
 uncompress) a compressed archive before doing the matching?
 The problem is not severe with .zip archives, but .gz and .bz2 are
 really inefficient.
 
 Thanks in advance for any hint
 Mauro

No, sorry, rsync can't do it.  I answered a similar question just last
week.  Many people have asked but nobody has made a patch.  It's probably
doable but not trivial.
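
The effect Mauro describes can be checked with a short Python sketch.  It is a
simplified stand-in for rsync's block matching (plain substring search, no
rolling checksum), and blocks_found and make_sample are names made up for this
example.

```python
import zlib


def blocks_found(old, new, block_size=512):
    """Count block-aligned blocks of old that occur anywhere in new."""
    blocks = [old[i:i + block_size] for i in range(0, len(old), block_size)]
    return sum(1 for block in blocks if block in new), len(blocks)


def make_sample():
    """Build a file and a copy that differs from it in a single byte."""
    original = b"".join(b"record %06d: some payload text\n" % i
                        for i in range(2000))
    modified = original[:10] + b"X" + original[11:]
    return original, modified


if __name__ == "__main__":
    original, modified = make_sample()
    print("uncompressed:", blocks_found(original, modified))
    # Small block size here because the compressed streams are much shorter.
    print("compressed:  ", blocks_found(zlib.compress(original),
                                        zlib.compress(modified), 64))
```

On the uncompressed data every block except the one containing the change can
be reused.  On the compressed streams the number of reusable blocks is
typically much lower, because a change early in the input alters the encoding
of what follows.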

- Dave Dykstra




password file problem.

2001-11-26 Thread Matt Anderson

Hello everyone.
I can't seem to get the --password-file= option to work correctly.  I'm using 
ssh as the transport.  I've got the file 0600 and only the password with no 
carriage return.
Can someone provide an example of its use?  Here is my try:
rsync -rtvvuz -e ssh --password-file=file /home/matt/* remotecomputer:/home/matt

It still asks for a password; if I type it in manually, it then runs fine.

Thanks!

Matt Anderson




Re: patch to enable faster mirroring of large filesystems

2001-11-26 Thread Martin Pool

On 26 Nov 2001, Andrew J. Schorr [EMAIL PROTECTED] wrote:

 I understand your point of view, but I think it is a mistake to
 hold rsync's algorithm hostage to the directory tree traversal logic
 built into the program.

 IMHO, the basic file transfer algorithm of rsync is terrific, but
 the program wrapped around it is a bit out of control.

I completely agree, and you can see me talking about this in earlier
posts.  I think the large number of command-line options rsync
accumulates, to say nothing of the number of patches not yet accepted
that would add more, indicates some kind of fundamental problem.

The protocol also is pretty crufty.

So breaking it out would be really useful.

Possibly the best way to enable this is to add scripting support.  I
think many people would find it useful if rsync could execute a
fragment of e.g. perl code to decide whether to transfer each file.
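
A sketch of what such a hook could look like, using a Python expression in
place of the Perl fragment mentioned above.  make_filter and select are
hypothetical names, and nothing like this exists in rsync as far as this
thread indicates.

```python
def make_filter(expression):
    """Compile a user-supplied expression over name and size into a predicate."""
    code = compile(expression, "<filter>", "eval")

    def should_transfer(name, size):
        # Note: eval() is not a security sandbox; this is illustration only.
        return bool(eval(code, {"__builtins__": {}},
                         {"name": name, "size": size}))

    return should_transfer


def select(files, should_transfer):
    """Keep the (name, size) pairs that the predicate accepts."""
    return [name for name, size in files if should_transfer(name, size)]
```

For example, make_filter('not name.endswith(".o") and size < 1000000') would
skip object files and anything of a megabyte or more.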

Although rsync is hostage to the directory tree traversal algorithm,
it's also one of the most useful parts...

Just doing directory-by-directory transfer would give an immediate and
substantial improvement to the common case of people using rsync to
transfer big directory trees.
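
The difference between walking the whole tree upfront and handing over one
directory at a time can be sketched in Python as follows.  This only models
the traversal, not rsync's protocol, and both function names are invented for
the example.

```python
import os
from collections import deque


def list_tree_upfront(root):
    """Walk the entire tree before returning anything."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def list_tree_by_directory(root):
    """Yield (directory, files) one directory at a time."""
    pending = deque([root])
    while pending:
        directory = pending.popleft()
        files = []
        with os.scandir(directory) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)  # visit later
                elif entry.is_file(follow_symlinks=False):
                    files.append(entry.path)
        yield directory, files
```

With the generator, a consumer can start transferring the first directory's
files while the rest of the tree has not been read yet, and memory use is
bounded by one directory's listing plus the queue of pending directories.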

 The spirit of my patch is to expose the low-level rsync algorithm and
 to allow people to build up their customized infrastructure outside
 of the program instead of having to build it in.  I think this is in
 the spirit of Unix tools.  I think if rsync were to expose some of its
 low-level capabilities, then we would not have a need for xdelta and rdiff,
 projects which are springing up because of rsync's opaqueness.

To go in this direction you need to decide whether 

 A) To maintain wire compatibility with the existing code.  This would
 be pretty useful because otherwise you'll need to install the new
 version on both machines.  You can kind of kludge around this by
 having two rather different protocols that bifurcate at the first
 packet, which I think is what SSH1/SSH2 do.  There's some limited
 support in the current protocol for protocol versioning.

 B) To preserve the model of every invocation being an idempotent
 "copy this to here", or to make it into a more generalized file
 manipulation protocol like FTP or NFS.
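
Option A, two protocols that bifurcate at the first packet, could be sketched
like this in Python.  The "VERSION <n>" greeting and the handler names are
assumptions made up for the example; they are not rsync's actual wire format.

```python
def legacy_session():
    return "legacy protocol"


def new_session():
    return "new protocol"


# Protocol version -> handler; both stay available in the same binary.
HANDLERS = {1: legacy_session, 2: new_session}


def dispatch(first_packet):
    """Pick a handler from a hypothetical greeting of the form b'VERSION <n>\\n'."""
    prefix = b"VERSION "
    if not first_packet.startswith(prefix):
        raise ValueError("unrecognised greeting")
    peer_version = int(first_packet[len(prefix):].strip().decode("ascii"))
    usable = [v for v in HANDLERS if v <= peer_version]
    if not usable:
        raise ValueError("no protocol version in common")
    # Use the newest version that both sides understand.
    return HANDLERS[max(usable)]
```

An old peer announcing version 1 still gets the existing behaviour, so the new
code only has to be installed on both machines when the new protocol is
actually wanted.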

If one was going to write a new protocol with a new model of operation
then perhaps it would be better to use librsync and make it a
completely separate program.

At this stage I'm inclined to take the conservative answer to both
questions, but having a discussion about them could be useful.

-- 
Martin 




Re: --no-detach option?

2001-11-26 Thread Martin Pool

On 23 Nov 2001, Andre Pang [EMAIL PROTECTED] wrote:
 On Tue, Nov 20, 2001 at 03:05:32PM -0800, Jos Backus wrote:
 
  How about adding a --no-detach option (to be used in combination with
  supervise)? If there's interest I'll provide a patch.

Yes, this is great.  I wanted it today when trying to debug IPv6.
(Incidentally that now seems to work on Linux and at least compile on
other interesting platforms; comments welcome.)

I'll apply something like Jos's patch, with a modification to not
create another bloody global variable but instead stick it in an
options struct.

People who work with Apache should try the mostly-undocumented -X
option, which tells it to not detach and also not fork on incoming
connections.  For debugging that's even more useful.

-- 
Martin