update wrong date, soft links, Debian stable, -L, ...

2002-05-17 Thread John D. Hendrickson

Debian 2.2r5 (Potato)
rsync version 2.3.2  protocol version 21
( the latest stable deb version anyway :)

RE: absolutely older file keeps getting chosen for update

Hi.

I have something like:

rsync -vxuaz  /sendmail/  /mnt/nfs-mount/sendmail

On the NFS drive the files are links to other files in a
subdirectory of that same directory.  Both the links and the files they
point to are newer in every respect (both 'stat' and 'newer' say so)
than the files in /sendmail.

However - rsync chooses the older files in sendmail as updates to
the newer files (newer ones get clobbered) -- but only on soft-linked
files.

If I use the -L option this happens with only one file.  BUT even
after writing the file (on the NFS drive) to make it today's date across
the board, rsync still chooses the older file -- even with the
-L option.



The man page says -u chooses the newer file based on mod time.  The
-L option I thought was a little unclear - it says to treat links like
files - but that leaves me wondering, as I don't know the special rules
for treating links and whether the -a option still affects that.
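
As I understand it, without -L rsync looks at the link itself via
lstat(), while with -L it follows through to the target via stat().
A minimal sketch (the path is just an example) that prints both
timestamps, to see which one -u would end up comparing:

    /* mtimes.c - print a symlink's own mtime vs. its target's mtime,
     * to see which timestamp -u ends up comparing. */
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "sendmail.cf";
        struct stat lst, st;

        if (lstat(path, &lst) < 0 || stat(path, &st) < 0) {
            perror(path);
            return 1;
        }
        /* Without -L rsync looks at the link itself (lstat); with -L
         * it follows the link through to the target (stat). */
        printf("link   mtime: %s", ctime(&lst.st_mtime));
        printf("target mtime: %s", ctime(&st.st_mtime));
        return 0;
    }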

I'll have to admit I haven't searched all 10MB of previous questions.

Anyway, thanks - I get a lot of mileage out of rsync: it's a ton better
than writing a shell script using rcp :)

John Hendrickson

[EMAIL PROTECTED]

[EMAIL PROTECTED]

PS

My most desired feature is:  If you use

user@host:/directory

notation right on host, rsync won't grok it
(you gotta use /directory if you're executing
on host).

That means scripts must be
different depending on which host they
run on.  Hopefully, newer versions of
rsync will resolve the remote spec down
to a local spec (if host is local) before using it.

Guess you knew that :)






Re: Improving the rsync protocol (RE: Rsync dies)

2002-05-17 Thread jw schultz

On Fri, May 17, 2002 at 01:42:31PM -0700, Wayne Davison wrote:
> On Fri, 17 May 2002, Allen, John L. wrote:
> > In my humble opinion, this problem with rsync growing a huge memory
> > footprint when large numbers of files are involved should be #1 on
> > the list of things to fix.
> 
> I have certainly been interested in working on this issue.  I think it
> might be time to implement a new algorithm, one that would let us
> correct a number of flaws that have shown up in the current approach.
> 
> Toward this end, I've been thinking about adding a 2nd process on the
> sending side and hooking things up in a different manner:
> 
> The current protocol has one sender process on the sending side, while
> the receiving side has both a generator process and a receiver process.
> There is only one bi-directional pipe/socket that lets data flow from
> the generator to the sender in one direction, and from the sender to the
> receiver in the other direction.  The receiver also has a couple pipes
> connecting itself to the generator in order to get data to the sender.
> 
> I'd suggest changing things so that a (new) scanning process on the
> sending side would have a bi-directional link with the generator process
> on the receiving side.  This would let both processes descend through
> the tree incrementally and simultaneously (working on a single directory
> at a time) and figure out what files were different.  The list of files
> that needed to be transferred PLUS a list of what files need to be
> deleted (if any) would be piped from the scanner process to the sender
> process, who would have a bi-directional link to the receiver process
> (perhaps using ssh's multi-channel support?).  There would be no link
> between the receiver and the generator.

With 4 stages I don't know that there need to be any bidirectional pipes.
Below I will dis-recommend this unidirectional structure.

scanner - output to generator.
    Generates stat info (one directory at a time).

generator - input from scanner, output to sender.
    Compares stat info from scanner and generates ADD, DEL and
    CHANGE orders, with checksums for CHANGE or --checksum.

sender - input from generator, output to receiver.
    Sends ADD, DEL and CHANGE orders + generates checksums and
    transmits file contents.

receiver - input from sender; output is logging.
    Does the ADDs, DELs and CHANGEs.
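
To make the orders concrete, a rough sketch of the directives that
could flow between these stages (the names and layout here are
invented for illustration, not taken from the rsync source):

    /* Hypothetical encoding of the directives piped between stages. */
    enum order_kind { ORD_ADD, ORD_DEL, ORD_CHANGE };

    struct sum_pair {
        unsigned int  weak;         /* rolling checksum */
        unsigned char strong[16];   /* strong digest (MD4 in rsync) */
    };

    struct order {
        enum order_kind  kind;
        char             path[1024];  /* relative to the transfer root */
        /* For CHANGE (or --checksum) the generator attaches the block
         * checksums the sender needs to build a delta. */
        unsigned int     block_count;
        struct sum_pair *sums;
    };

Capturing that stream to a file is what would make the --batch*-like
replay mentioned below possible: the same orders could be applied to
several receivers.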

> 
> The advantage of this is that the sender and the receiver are really
> very simple.  There is a list of file actions that is being received on
> stdin by the sending process, and this indicates what files to update
> and which files to delete.  (It might even be possible to make sender be
> controlled by other programs.)  These programs would not need to know
> about exclusion lists, delete options, or any of the more esoteric
> options, but would get told things like the timeout settings via the
> stdin pipe.  In this scenario, all error messages would get sent to the
> sender process, who would output them on stdout (flushed).

In most ways I like this description much better.

scanner+generator create a dataset that can be captured
or created another way.  Similarly, the sender output could
be captured or broadcast to update multiple locations, or
redone somewhat like --batch*.  To summarize the outputs:

scanner+generator -- changeset without data
sender            -- changeset with data

This means that it doesn't matter where the scanner or generator
run, except that you must invert the changeset directives.

> The scanner/generator process would be the thing that parses the
> commandline, communicates the exclude list to its opposite process, and
> figures out exactly what to do.  The scanner would spawn the sender, and
> field all the error messages that it generates.  It would then either
> output the errors locally or send them over to the generator for output
> (depending on whether we're pushing or pulling files).

I would describe it more as a case where we parse the
commandline and set up the communication channels, then split
up into whatever parts are needed per the options and
argv[0]:

    scanner+generator | sender | receiver
    scanner+generator | sender > csetdata
    scanner+generator > cset
    receiver < csetdata

> As for who spawns the receiver, it would be nice if this was done by the
> sender (so they could work alone), but an alternative would be to have
> the generator spawn the receiver and then let the receiver hook up
> with the sender via the existing ssh connection.
> 
> This idea is still in its early stages, so feel free to tell me exactly
> where I've missed the boat.
> 
> ..wayne..

I really like this idea.

I haven't been stung yet by these issues but changing the
scanner+generator to work on one directory at a time will
not only remove the memory footprint problem but also should
take care 

Re: Improving the rsync protocol (RE: Rsync dies)

2002-05-17 Thread Wayne Davison

On Fri, 17 May 2002, Wayne Davison wrote:
> so feel free to tell me exactly where I've missed the boat.

[Replying to myself...  hmmm...]

In my description of the _new_ protocol, my references to a generator
process are not really accurate.  The current generator process is
forked off after the initial file-list session figures out what files
need to be checked for differences, and it then churns out rolling
checksums for the sender process.  The "generator" in my previous
description is really just a receiver-side scanner process (that looks
for files that need to be check-summed).  So, either the new receiver
process would handle the checksum generation itself, or we'd need a 3rd
process on the receiver side to generate the checksum data (and it would
need a pipeline into the sender).

As a first step in investigating this further, I'm looking into librsync
to see if it might be easy to create a simple sender/receiver duo using
this library.  If anyone knows where some decent documentation on
librsync is, please let me know (I'm looking for it now, but the tar
doesn't appear to come with any decent docs).  I was wondering if 
librsync manages to implement the protocol without forking off a 
separate generator process...
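
For reference, librsync does expose a whole-file convenience layer
that maps onto the sig/delta/patch phases.  The sketch below is from
memory of the 0.9.x-era headers, so treat the exact signatures as
assumptions and check rsync.h in the tarball:

    /* Rough sketch of a sender-side delta using librsync's whole-file
     * API; signatures are from memory, verify against rsync.h. */
    #include <stdio.h>
    #include <librsync.h>

    int make_delta(const char *oldpath, const char *newpath,
                   const char *deltapath)
    {
        FILE *oldf = fopen(oldpath, "rb");
        FILE *sigf = tmpfile();
        FILE *newf = fopen(newpath, "rb");
        FILE *deltaf = fopen(deltapath, "wb");
        rs_signature_t *sig = NULL;

        if (!oldf || !sigf || !newf || !deltaf)
            return -1;

        /* "Receiver" half: summarize the basis file into a signature. */
        if (rs_sig_file(oldf, sigf, RS_DEFAULT_BLOCK_LEN,
                        RS_DEFAULT_STRONG_LEN, NULL) != RS_DONE)
            return -1;
        rewind(sigf);

        /* "Sender" half: load the signature, emit a delta of the new
         * file.  The far end would apply it with rs_patch_file(). */
        if (rs_loadsig_file(sigf, &sig, NULL) != RS_DONE)
            return -1;
        rs_build_hash_table(sig);
        return rs_delta_file(sig, newf, deltaf, NULL) == RS_DONE ? 0 : -1;
    }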

..wayne..





Re: Improving the rsync protocol (RE: Rsync dies)

2002-05-17 Thread tim . conway

Wayne:  If anybody can make that work, I'd bet you could.  The basic rsync
algorithm is in place, so as you say, it would mostly be a matter of list
generation.  You'd have to hold on to any files with more than one link in
a separate list, to find all the linkage relationships, which could grow a
bit, but it does sound more efficient.  Maybe a third pipe to send the
files over as you go?

Mind:  I'm not offering to help.  It's too complicated for my tiny mind.

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12", 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me Tim?"




Wayne Davison <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/17/2002 02:42 PM

 
To: rsync users <[EMAIL PROTECTED]>
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: Improving the rsync protocol (RE: Rsync dies)
Classification: 



On Fri, 17 May 2002, Allen, John L. wrote:
> In my humble opinion, this problem with rsync growing a huge memory
> footprint when large numbers of files are involved should be #1 on
> the list of things to fix.

I have certainly been interested in working on this issue.  I think it
might be time to implement a new algorithm, one that would let us
correct a number of flaws that have shown up in the current approach.

Toward this end, I've been thinking about adding a 2nd process on the
sending side and hooking things up in a different manner:

The current protocol has one sender process on the sending side, while
the receiving side has both a generator process and a receiver process.
There is only one bi-directional pipe/socket that lets data flow from
the generator to the sender in one direction, and from the sender to the
receiver in the other direction.  The receiver also has a couple pipes
connecting itself to the generator in order to get data to the sender.

I'd suggest changing things so that a (new) scanning process on the
sending side would have a bi-directional link with the generator process
on the receiving side.  This would let both processes descend through
the tree incrementally and simultaneously (working on a single directory
at a time) and figure out what files were different.  The list of files
that needed to be transferred PLUS a list of what files need to be
deleted (if any) would be piped from the scanner process to the sender
process, who would have a bi-directional link to the receiver process
(perhaps using ssh's multi-channel support?).  There would be no link
between the receiver and the generator.

The advantage of this is that the sender and the receiver are really
very simple.  There is a list of file actions that is being received on
stdin by the sending process, and this indicates what files to update
and which files to delete.  (It might even be possible to make sender be
controlled by other programs.)  These programs would not need to know
about exclusion lists, delete options, or any of the more esoteric
options, but would get told things like the timeout settings via the
stdin pipe.  In this scenario, all error messages would get sent to the
sender process, who would output them on stdout (flushed).

The scanner/generator process would be the thing that parses the
commandline, communicates the exclude list to its opposite process, and
figures out exactly what to do.  The scanner would spawn the sender, and
field all the error messages that it generates.  It would then either
output the errors locally or send them over to the generator for output
(depending on whether we're pushing or pulling files).

As for who spawns the receiver, it would be nice if this was done by the
sender (so they could work alone), but an alternative would be to have
the generator spawn the receiver and then let the receiver hook up
with the sender via the existing ssh connection.

This idea is still in its early stages, so feel free to tell me exactly
where I've missed the boat.

..wayne..









Re: I/O error when deleting files

2002-05-17 Thread Bill Houle

OK, but I'm not exactly sure what I'm looking for... 

I don't think the link error is caused by my data (I have no symlinks).
For whatever reason, it appears that a blank entry is leading the file list,
and the 'stat' on that empty name is what is causing the link_stat error.

write(1, " b u i l d i n g   f i l".., 23)  = 23
brk(0x00058668) = 0
brk(0x00060668) = 0
lstat64("", 0xFFBEF8D8) Err#2 ENOENT
write(2, " l i n k _ s t a t :".., 39)  = 39

The rsync child then just says an error has occurred and tells the
parent not to bother deleting files because of it.

waitid(P_PID, 10233, 0xFFBEF890, WEXITED|WTRAPPED|WNOHANG) = 0
Received signal #18, SIGCLD, in poll() [caught]
  siginfo: SIGCLD CLD_EXITED pid=10233 status=0x000C
poll(0xFFBEF870, 0, 20) Err#4 EINTR
waitid(P_ALL, 0, 0xFFBEF3A8, WEXITED|WTRAPPED|WNOHANG) = 0
waitid(P_ALL, 0, 0xFFBEF3A8, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
setcontext(0xFFBEF558)
poll(0xFFBEF870, 0, 11) = 0
waitid(P_PID, 10233, 0xFFBEF890, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
time()  = 1021666775
write(1, " w r o t e   7 9 8 5 5  ".., 54)  = 54
write(1, " t o t a l   s i z e   i".., 43)  = 43
sigaction(SIGUSR1, 0xFFBEF848, 0xFFBEF8C8)  = 0
sigaction(SIGUSR2, 0xFFBEF848, 0xFFBEF8C8)  = 0
waitid(P_PID, 10233, 0xFFBEF888, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
write(2, " r s y n c   e r r o r :".., 55)  = 55

I see that I can turn on "--ignore-errors" mode, but that just seems
dangerous if there were ever anything other than the case I am seeing
here. So I guess my question is: what's with the lstat64("")?
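
For what it's worth, lstat() on a zero-length path is required to fail
with ENOENT, so the trace is consistent with an empty name somehow
landing in the file list:

    /* Minimal repro: an empty path always fails with ENOENT, which is
     * exactly what link_stat reports when a blank name slips into the
     * file list. */
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat st;
        if (lstat("", &st) < 0)
            printf("lstat(\"\"): errno=%d (%s)\n", errno, strerror(errno));
        return 0;
    }

So the thing to hunt for is how an empty entry gets into the list in
the first place -- a stray blank line in an include/exclude file or an
empty argument would be my first guesses, though that is only a guess.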

--bill






> I suggest running truss -f -o trussfile before that command and look for
> the error in the trussfile and see what's happening just before it.  Under
> some circumstances rsync can print that message when a symlink goes
> nowhere, but I'm not sure what the circumstances are.  Perhaps it does that
> because you're using -x, I'm not sure.
> 
> - Dave 
> 
> 
> On Wed, May 15, 2002 at 02:30:00PM -0700, Bill Houle wrote:
> > /usr/local/bin/rsync \
> > -axv --delete \
> > "${MAINDIR}/${FSERVER}/" \
> > "${MAINDIR}/${TSERVER}/"
> > 
> > Both locations are on the same (non-NFS) partition.
> > 
> > All old files are updated and new files copied, but the orphans
> > are never deleted...
> > 
> > --bill
> > 
> > 
> > 
> > > What is the command you are running? 
> > > 
> > > > -Original Message-
> > > > From: [EMAIL PROTECTED] 
> > > > [mailto:[EMAIL PROTECTED]]On
> > > > Behalf Of Bill Houle
> > > > Sent: Monday, 1 January 1601 11:00 AM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: I/O error when deleting files
> > > > 
> > > > 
> > > > When I run rsync in "no-op" mode, all appears fine (it knows what to
> > > > copy, what to delete). But when I run it "for real", I get:
> > > > 
> > > > building file list ... link_stat  : No such file or 
> > > > directory done
> > > > IO error encountered - skipping file deletion
> > > > 
> > > > followed by:
> > > > 
> > > > total size is 98117172  speedup is 563.78
> > > > rsync error: partial transfer (code 23) at main.c(578)
> > > > 
> > > > Any help/pointers -- Solaris 8 on Ultra -- is appreciated.
> > > > 
> > > > --bill
> > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> 





Re: rsynch related question

2002-05-17 Thread tim . conway

I don't know how to do it, only that at least in the case of DCE/DFS, it 
can be done.

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12", 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me Tim?"




"Umadevi C Reddy" <[EMAIL PROTECTED]>
05/17/2002 12:09 PM

 
To: Tim Conway/LMT/SC/PHILIPS@AMEC
cc: [EMAIL PROTECTED]
Subject: Re: rsynch related question
Classification: 




Hi

Thanks for the info.
Let me clarify what I want to achieve from this.
Right now first1 (/afs/tr/software) is running as the master, and after a
period of time second1 (/afs/ddc/software) will take over; after that
first1 will no longer exist. I am in the middle of this transition work.
Currently first1 is owned by someone else, and second1 is what we are
going to maintain in the future. One solution is that I could cut over
and transfer directly, but that is not immediate, and in the meantime I
should see how it works in the second1 environment. For that purpose I
have started maintaining second1 (I need to modify hardcoded paths and
things like that), and I would also like to reflect the changes made in
first1. So I am thinking of using rsync.

Now, can you suggest which approach is feasible?
Thanks a lot


Warm regards,
Uma
IBM Pittsburgh, 11 Stanwix Street, Pittsburgh, PA  15222
Tel: (412) 667-3121
e-mail: [EMAIL PROTECTED]


  
tim.conway@philips.com
05/17/2002 01:34 PM
Please respond to tim.conway

To:      Umadevi C Reddy/India/IBM@IBMIN
cc:      [EMAIL PROTECTED]
Subject: Re: rsynch related question



Doesn't AFS do inter-cell communication?  Why not just have first1
(/afs/tr/software) be the RW replica and second1 (/afs/ddc/software) be
an RO replica?  You could just release them all instead of messing with
rsync at all.  With the source constantly changing, you'll generate a lot
fewer errors that way, and it'll be a lot easier on your resources.

If AFS can't do this, I apologize for the useless info.  My experience is
with IBM/TransARC DCE/DFS (and that's almost 2 years stale).

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12",
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970),
".\n" '
"There are some who call me Tim?"




"Umadevi C Reddy" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/16/2002 05:08 PM


To: [EMAIL PROTECTED]
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: rsynch related question
Classification:



Hello,

   I have a question about rsync'ing; please help me with this.
I have a scenario where I need to sync two software trees across
the network, in different AFS cells.
For example: first1 - /afs/tr/software , second1 - /afs/ddc/software.
Both trees are the same, the first1 cell will be constantly
updated, and I need to sync this software to second1. In this scenario,
what command should I use?

   I will appreciate your great help.

Thanks in advance
Thanks
Uma
















Improving the rsync protocol (RE: Rsync dies)

2002-05-17 Thread Wayne Davison

On Fri, 17 May 2002, Allen, John L. wrote:
> In my humble opinion, this problem with rsync growing a huge memory
> footprint when large numbers of files are involved should be #1 on
> the list of things to fix.

I have certainly been interested in working on this issue.  I think it
might be time to implement a new algorithm, one that would let us
correct a number of flaws that have shown up in the current approach.

Toward this end, I've been thinking about adding a 2nd process on the
sending side and hooking things up in a different manner:

The current protocol has one sender process on the sending side, while
the receiving side has both a generator process and a receiver process.
There is only one bi-directional pipe/socket that lets data flow from
the generator to the sender in one direction, and from the sender to the
receiver in the other direction.  The receiver also has a couple pipes
connecting itself to the generator in order to get data to the sender.

I'd suggest changing things so that a (new) scanning process on the
sending side would have a bi-directional link with the generator process
on the receiving side.  This would let both processes descend through
the tree incrementally and simultaneously (working on a single directory
at a time) and figure out what files were different.  The list of files
that needed to be transferred PLUS a list of what files need to be
deleted (if any) would be piped from the scanner process to the sender
process, who would have a bi-directional link to the receiver process
(perhaps using ssh's multi-channel support?).  There would be no link
between the receiver and the generator.

The advantage of this is that the sender and the receiver are really
very simple.  There is a list of file actions that is being received on
stdin by the sending process, and this indicates what files to update
and which files to delete.  (It might even be possible to make sender be
controlled by other programs.)  These programs would not need to know
about exclusion lists, delete options, or any of the more esoteric
options, but would get told things like the timeout settings via the
stdin pipe.  In this scenario, all error messages would get sent to the
sender process, who would output them on stdout (flushed).
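
As a sketch of how dumb that sender loop could be, here is a toy
reader for a line-oriented action list on stdin; the verbs and framing
are invented for illustration, not a proposed wire format:

    /* Toy action-list reader for a hypothetical dumb sender:
     * reads one directive per line from stdin, e.g.
     *   ADD path/to/file
     *   DEL path/to/file
     *   OPT timeout 60
     * The verbs are invented for illustration only. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[4096];

        while (fgets(line, sizeof line, stdin)) {
            line[strcspn(line, "\n")] = '\0';
            if (strncmp(line, "ADD ", 4) == 0) {
                /* checksum/transfer the named file to the receiver */
                printf("would send: %s\n", line + 4);
            } else if (strncmp(line, "DEL ", 4) == 0) {
                /* tell the receiver to unlink it */
                printf("would delete: %s\n", line + 4);
            } else if (strncmp(line, "OPT ", 4) == 0) {
                /* settings like timeouts arrive in-band, per the idea */
                printf("option: %s\n", line + 4);
            } else {
                fprintf(stderr, "unknown directive: %s\n", line);
            }
        }
        return 0;
    }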

The scanner/generator process would be the thing that parses the
commandline, communicates the exclude list to its opposite process, and
figures out exactly what to do.  The scanner would spawn the sender, and
field all the error messages that it generates.  It would then either
output the errors locally or send them over to the generator for output
(depending on whether we're pushing or pulling files).

As for who spawns the receiver, it would be nice if this was done by the
sender (so they could work alone), but an alternative would be to have
the generator spawn the receiver and then let the receiver hook up
with the sender via the existing ssh connection.

This idea is still in its early stages, so feel free to tell me exactly
where I've missed the boat.

..wayne..





Re: Rsync dies

2002-05-17 Thread Eric Ziegast

> In my humble opinion, this problem with rsync growing a huge memory
> footprint when large numbers of files are involved should be #1 on
> the list of things to fix.

I think many would agree.  If it were trivial, it'd probably be
done by now.

Fix #1 (what most people do):

Split the files/paths to limit the size of each job.

What someone could/should do here is at least edit the
"BUGS" section of the manual to talk about the memory
restrictions.

Fix #2 (IMHO, what should be done to rsync):

File caching of results (or using a file-based database of
some sort) is the way to go.  Instead of maintaining a
data structure entirely in memory, open a (g)dbm file or add
hooks into the db(3) libraries to store file metadata and
checksums.

It'll be slower than an all-memory implementation, but large
jobs will at least finish predictably.
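
As a sketch of what that could look like with gdbm (the record
layout and filenames are invented for illustration):

    /* Sketch of Fix #2 with gdbm: spill per-file metadata to a disk
     * hash instead of holding the whole file list in memory. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <gdbm.h>

    struct file_meta {
        long long size;
        long      mtime;
        unsigned  mode;
    };

    int main(void)
    {
        GDBM_FILE db = gdbm_open("flist.gdbm", 0, GDBM_WRCREAT, 0600, NULL);
        struct file_meta m = { 1234, 1021666775L, 0644 };
        char *name = "etc/sendmail.cf";
        datum key, val, out;

        if (!db)
            return 1;
        key.dptr = name;
        key.dsize = strlen(name);
        val.dptr = (char *) &m;
        val.dsize = sizeof m;
        gdbm_store(db, key, val, GDBM_REPLACE);  /* one record per file */

        out = gdbm_fetch(db, key);               /* later passes re-read */
        if (out.dptr) {
            struct file_meta got;
            memcpy(&got, out.dptr, sizeof got);
            printf("%s: %lld bytes\n", name, got.size);
            free(out.dptr);
        }
        gdbm_close(db);
        return 0;
    }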

Fix #3 (what I did):

If you really really need to efficiently transfer large
numbers of files, come up with your own custom process.

I used to run a large web site with thousands of files and
directories that needed to be distributed to dozens of
servers atomically.  Using rsync, I'd run into memory
problems and worked around them with Fix #1.  Another
problem was running rsync in parallel.  The source directory 
was scanned O(N) times when it needed to be scanned only 
once.  The source content server was pummeled by the 
multiple simultaneous instances.  I resorted to making my 
own single-threaded rsync-like program in Perl that behaves 
more like Fix #2 and runs very efficiently.

I've spent some time cleaning up this program so that
I can publish it, but priorities (*) are getting in the
way.  When I get some time, you'll see it posted here.

--
Eric Ziegast

(*) Looking for a full-time job is a full-time job.  :^(
Will consult for food.




Re: Rsync dies

2002-05-17 Thread Randy Kramer

Allen, John L. wrote:
> In my humble opinion, this problem with rsync growing a huge memory
> footprint when large numbers of files are involved should be #1 on
> the list of things to fix.  It seems that every fifth post is a
> complaint about this problem!  Sorry if this sounds like ungrateful
> whining, but I think rsync would be greatly improved if the overall
> way it processes multiple files were overhauled to be much more
> resource friendly.

Thanks for that opening ;-)

My guess is that such a change would be a fairly big undertaking.  If
someone considers going that far, I'd like to suggest at least
considering dividing rsync functionality into two (or a few) programs.

(This isn't completely fair, but when I hear about Linux programs that
do one thing only (and do it well), I laugh to myself and think about
the --brush_teeth option that must be in there somewhere. ;-) )

It seems to me it would be nice to have a simple module (i.e.,
standalone program) whose basic function in life is to use the rsync
algorithm to transfer a single file efficiently.

Then, one or more other programs (that can call the first program as
needed) can be provided to do things like traverse a directory to find
out what files must be "rsync'd", change ownership and permission if
required, consider includes and excludes, and whatever else.  Seems to
me the programming would be simpler, the documentation could be simpler
and more understandable, and users could be less confused.

Just another $.02.

Randy Kramer




High CPU load and slow performance on NT/W2K

2002-05-17 Thread Petter Österlund


Hi!

I am experiencing very high CPU load and long update times using
rsync.exe on NT/W2K. The versions I have tested on NT are 2.4.6, 2.5.1-2
and 2.5.5-2, with cygwin1.dll version 1.3.10. They all show the same kind
of problem.
Is there anything that I might have missed, or is the performance not
better than this?

On the other side of the transfer I have used NT/W2K, Solaris and Linux.
I have been executing the rsync command on NT/W2K as well as on the Unix
hosts. The outcome is the same: the NT/W2K machine gets loaded to 100%.
Transferring between the UNIX hosts is no problem. The machines are on a
100Mb/s LAN.

I am doing an update of a 15 MB text file, like this:
(fyrsol2: Solaris2.7, bserv: Win2000)

-
fyrsol2 # touch x.txt ; time rsync-2.5.2 -avv --progress x.txt 
bserv::tooltest
rsync: building file list...

rsync: expand file_list to 4000 bytes, did move
rsync: 1 files to consider.
x.txt
   15806883 100%   10.27MB/s0:00:00
total: matches=10081  tag_hits=10081  false_alarms=0 data=0
wrote 40481 bytes  read 60542 bytes  8081.84 bytes/sec
total size is 15806883  speedup is 156.47
1.50u 0.39s 0:12.15 15.5%


Notice the transfer time is 12 seconds! (The 10.27MB/s reading must be 
false).
During this the CPU load of the NT/W2K host is 100% for at least 6-8 
seconds.
(The file can be transferred in 1.5 seconds using FTP!)

But now to the really funny part: if I change the command by adding the -z 
option
I get an unexpected performance gain:

--
fyrsol2 # touch x.txt ; time rsync-2.5.2 -avvz --progress x.txt 
bserv::tooltest
rsync: building file list...

rsync: expand file_list to 4000 bytes, did move
rsync: 1 files to consider.
x.txt
   15806883 100%3.67MB/s0:00:00
total: matches=10081  tag_hits=10081  false_alarms=0 data=0
wrote 158 bytes  read 60542 bytes  9338.46 bytes/sec
total size is 15806883  speedup is 260.41
4.23u 0.27s 0:06.03 74.6%
-

The update now is done in half the time and the CPU load of the NT/W2K host
is only touching 100% during a very short time at the end of the update.

Any suggestions?

/Petter






Re: rsynch related question

2002-05-17 Thread Umadevi C Reddy


Hi

Thanks for the info.
Let me clarify what I want to achieve from this.
Right now first1 (/afs/tr/software) is running as the master, and after a
period of time second1 (/afs/ddc/software) will take over; after that
first1 will no longer exist. I am in the middle of this transition work.
Currently first1 is owned by someone else, and second1 is what we are
going to maintain in the future. One solution is that I could cut over
and transfer directly, but that is not immediate, and in the meantime I
should see how it works in the second1 environment. For that purpose I
have started maintaining second1 (I need to modify hardcoded paths and
things like that), and I would also like to reflect the changes made in
first1. So I am thinking of using rsync.

Now, can you suggest which approach is feasible?
Thanks a lot


Warm regards,
Uma
IBM Pittsburgh, 11 Stanwix Street, Pittsburgh, PA  15222
Tel: (412) 667-3121
e-mail: [EMAIL PROTECTED]


   

tim.conway@philips.com
05/17/2002 01:34 PM
Please respond to tim.conway

To:      Umadevi C Reddy/India/IBM@IBMIN
cc:      [EMAIL PROTECTED]
Subject: Re: rsynch related question




Doesn't AFS do inter-cell communication?  Why not just have first1
(/afs/tr/software) be the RW replica and second1 (/afs/ddc/software) be
an RO replica?  You could just release them all instead of messing with
rsync at all.  With the source constantly changing, you'll generate a lot
fewer errors that way, and it'll be a lot easier on your resources.

If AFS can't do this, I apologize for the useless info.  My experience is
with IBM/TransARC DCE/DFS (and that's almost 2 years stale).

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12",
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970),
".\n" '
"There are some who call me Tim?"




"Umadevi C Reddy" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/16/2002 05:08 PM


To: [EMAIL PROTECTED]
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: rsynch related question
Classification:



Hello,

   I have a question about rsync'ing; please help me with this.
I have a scenario where I need to sync two software trees across
the network, in different AFS cells.
For example: first1 - /afs/tr/software , second1 - /afs/ddc/software.
Both trees are the same, the first1 cell will be constantly
updated, and I need to sync this software to second1. In this scenario,
what command should I use?

   I will appreciate your great help.

Thanks in advance
Thanks
Uma













RE: Rsync dies

2002-05-17 Thread Allen, John L.

In my humble opinion, this problem with rsync growing a huge memory
footprint when large numbers of files are involved should be #1 on
the list of things to fix.  It seems that every fifth post is a
complaint about this problem!  Sorry if this sounds like ungrateful
whining, but I think rsync would be greatly improved if the overall
way it processes multiple files were overhauled to be much more
resource friendly.

John. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 17, 2002 01:44 PM
To: C.Zimmermann
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Rsync dies


Yeah.  You'll have to find a way to break the job up into smaller pieces. 
It's a pain, but I have a similar situation - 3M+ files in 130+Gb.  I 
can't get the whole thing in one chunk, no matter how fast a server with 
however much memory, even on Gb ethernet (for the server).  In my case, 
the filesystem is on NAS, and the NAS has only 100bT simplex (half-duplex, 
to some).
I have some code that can be used to analyze your system before the sync 
and choose directories containing no more than a maximum number of items 
below them.  Iterating through the list and using -R can let you get the 
whole thing done, though --delete and -H become less certain (not 
dangerous, but if you don't name anything containing a deleted directory 
because it didn't come up on your list, you'll never tell the destination 
to delete it, and if you have two hard links to the same file but hit 
them in two separate runs, you now have two copies on disk).
Let me know if you want it.  I'm sure you can figure how to modify it for 
your environment.

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12", 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me Tim?"




"C.Zimmermann" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/17/2002 02:08 AM

 
To: <[EMAIL PROTECTED]>
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: Rsync dies
Classification: 



I'm trying to rsync a 210 GB filesystem with approx 1,500,000 files.

Rsync always dies after about 29 GB without any error messages.
I'm using rsync version 2.5.5, protocol version 26.

Has anyone an idea?

Thanks, Clemens











Re: rsynch related question

2002-05-17 Thread Sriram Ramkrishna

He's on two different cells so I don't think he would be able to
do what you're stating.  I think he would have to have a db server
in his environment that's part of the source cell.  But I'm not sure
about that.

sri

On Fri, May 17, 2002 at 11:34:45AM -0600, [EMAIL PROTECTED] wrote:
> Doesn't AFS do inter-cell communication?  Why not just have first1
> (/afs/tr/software) be the RW replica and second1 (/afs/ddc/software) be
> an RO replica?  You could just release them all instead of messing with
> rsync at all.  With the source constantly changing, you'll generate a lot
> fewer errors that way, and it'll be a lot easier on your resources.
> 
> If AFS can't do this, I apologize for the useless info.  My experience is 
> with IBM/TransARC DCE/DFS (and that's almost 2 years stale).
> 
> Tim Conway
> [EMAIL PROTECTED]
> 303.682.4917
> Philips Semiconductor - Longmont TC
> 1880 Industrial Circle, Suite D
> Longmont, CO 80501
> Available via SameTime Connect within Philips, n9hmg on AIM
> perl -e 'print pack("n12", 
> 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
> ".\n" '
> "There are some who call me Tim?"
> 
> 
> 
> 
> "Umadevi C Reddy" <[EMAIL PROTECTED]>
> Sent by: [EMAIL PROTECTED]
> 05/16/2002 05:08 PM
> 
>  
> To: [EMAIL PROTECTED]
> cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
> Subject: rsynch related question
> Classification: 
> 
> 
> 
> Hello,
> 
> I have a question about rsync'ing; please help me with this.
> I have a scenario where I need to sync two software trees across
> the network, in different AFS cells.
> For example: first1 - /afs/tr/software , second1 - /afs/ddc/software.
> Both trees are the same, the first1 cell will be constantly
> updated, and I need to sync this software to second1. In this scenario,
> what command should I use?
> 
>I will appreciate your great help.
> 
> Thanks in advance
> Thanks
> Uma
> 
> 
> 
> 
> 
> 




Re: Rsync dies

2002-05-17 Thread tim . conway

Yeah.  You'll have to find a way to break the job up into smaller pieces. 
It's a pain, but I have a similar situation - 3M+ files in 130+Gb.  I 
can't get the whole thing in one chunk, no matter how fast a server with 
however much memory, even on Gb ethernet (for the server).  In my case, 
the filesystem is on NAS, and the NAS has only 100bT simplex (half-duplex, 
to some).
I have some code that can be used to analyze your system before the sync 
and choose directories containing no more than a maximum number of items 
below them.  Iterating through the list and using -R can let you get the 
whole thing done, though --delete and -H become less certain (not 
dangerous, but if you don't name anything containing a deleted directory 
because it didn't come up on your list, you'll never tell the destination 
to delete it, and if you have two hard links to the same file but hit 
them in two separate runs, you now have two copies on disk).
Let me know if you want it.  I'm sure you can figure how to modify it for 
your environment.
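
For the curious, the analysis pass can be approximated in a few lines;
this sketch (threshold and output format invented, no guard against
symlink loops) prints each directory along with a count of the entries
beneath it, and a wrapper would pick the top-most directories under
the limit to feed to rsync -R:

    /* Sketch of the "analyze before you sync" pass: count the entries
     * beneath each directory and report the subtrees small enough to
     * hand to one rsync -R run. */
    #include <stdio.h>
    #include <string.h>
    #include <dirent.h>

    #define MAX_ITEMS 10000L   /* hypothetical per-job limit */

    static long count_tree(const char *dir)
    {
        char path[4096];
        long n = 0;
        struct dirent *e;
        DIR *d = opendir(dir);

        if (!d)
            return 0;          /* not a directory (or unreadable): adds 0 */
        while ((e = readdir(d)) != NULL) {
            if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
                continue;
            snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
            n += 1 + count_tree(path);
        }
        closedir(d);
        if (n <= MAX_ITEMS)
            printf("%ld\t%s\n", n, dir);   /* small enough for one run */
        return n;
    }

    int main(int argc, char **argv)
    {
        count_tree(argc > 1 ? argv[1] : ".");
        return 0;
    }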

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12", 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me Tim?"




"C.Zimmermann" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/17/2002 02:08 AM

 
To: <[EMAIL PROTECTED]>
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: Rsync dies
Classification: 



I'm trying to rsync a 210 GB filesystem with approx 1,500,000 files.

Rsync always dies after about 29 GB without any error messages.
I'm using rsync version 2.5.5, protocol version 26.

Has anyone an idea?

Thanks, Clemens










Re: rsynch related question

2002-05-17 Thread tim . conway

Doesn't AFS do inter-cell communication?  Why not just have first1
(/afs/tr/software) be the RW replica and second1 (/afs/ddc/software) be
an RO replica?  You could just release them all instead of messing with
rsync at all.  With the source constantly changing, you'll generate a lot
fewer errors that way, and it'll be a lot easier on your resources.

If AFS can't do this, I apologize for the useless info.  My experience is 
with IBM/TransARC DCE/DFS (and that's almost 2 years stale).

Tim Conway
[EMAIL PROTECTED]
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack("n12", 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me Tim?"




"Umadevi C Reddy" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
05/16/2002 05:08 PM

 
To: [EMAIL PROTECTED]
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: rsynch related question
Classification: 



Hello,

   I have a question about rsync'ing; please help me with this.
I have a scenario where I need to sync two software trees across
the network, in different AFS cells.
For example: first1 - /afs/tr/software , second1 - /afs/ddc/software.
Both trees are the same, the first1 cell will be constantly
updated, and I need to sync this software to second1. In this scenario,
what command should I use?

   I will appreciate your great help.

Thanks in advance
Thanks
Uma









Re: Status Query - Please respond - Re: Patch to avoid 'Connection reset by peer' error for rsync on cygwin

2002-05-17 Thread Max Bowsher

Combined reply:

Mark - Point taken. But even if it worked correctly everywhere, to me there
seems to be something aesthetically wrong about just letting sockets close
themselves. Kind of like a malloc() without a free().

Wayne - Wouldn't the atexit solution require that we keep a list of fds to
close? Anyway, I think it is now up to me to make a patch which fixes the
problem in the most discreet and unsweeping manner possible, and then post it
for Martin to decide.
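
On the fd-list question: the registry can stay tiny.  A sketch of the
one-atexit() approach (names invented, not proposed rsync code):

    /* Sketch of the single-atexit() idea: remember each socket fd as
     * it is created, shut them all down at exit. */
    #include <stdlib.h>
    #include <sys/socket.h>

    #define MAX_SOCKS 16
    static int socks[MAX_SOCKS];
    static int nsocks;

    static void close_all_sockets(void)
    {
        int i;
        for (i = 0; i < nsocks; i++)
            shutdown(socks[i], SHUT_RDWR);  /* force both directions shut */
    }

    void remember_socket(int fd)   /* call wherever a socket is created */
    {
        if (nsocks == 0)
            atexit(close_all_sockets);     /* the one cygwin-kludge hook */
        if (nsocks < MAX_SOCKS)
            socks[nsocks++] = fd;
    }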


Max.


Mark Eichin <[EMAIL PROTECTED]> wrote:
>> Just because Linux lets you get away without it doesn't mean it's a good idea.
>
> Except this has nothing to do with linux - this is unix behaviour that
> goes "all the way back", it's part of the process model.  It's part of
> what exiting *does*, so it *is* a bug in cygwin if it isn't doing the
> cleanup (and really, cygwin does so many other bizarre things under
> the covers to create Unix/POSIX semantics in the Very Different win32
> environment, I'm more than a little surprised it isn't doing this
> already... skeptical, even, though I haven't been deep inside cygwin
> in years.)

Wayne Davison <[EMAIL PROTECTED]> wrote:
> On Thu, 16 May 2002, Max Bowsher wrote:
>> That just moves the shutdown call from where you finish with the fd to
>> where you start using the fd - that's got to be less intuitive.
>
> Being more or less intuitive is not the point.  The idea was to have as
> little cygwin kludge code as possible.  Thus, we'd just have one call to
> atexit() during startup, with the single cleanup function being able to
> handle any and all opened sockets, and we're done (if this is even
> feasible -- I haven't looked into it).  This was prompted by Martin's
> statement that he considers this a cygwin bug -- I was assuming that he
> didn't want to make sweeping changes to all the cleanup code in rsync.
> Whether he wants to handle this in a more invasive manner is up to him.
>
> ..wayne..





Re: [patch] suggestions for -v option

2002-05-17 Thread Dick Streefland

On Thursday 2002-05-16 15:55, Dave Dykstra wrote:
| I'm afraid we've got too much history behind the current way to change
| that now.  Undoubtedly there's lots of scripts around that expect the
| current behavior.  The --log-format option is intended for situations
| like yours.  Try "--log-format %f".  If the log-format option isn't
| flexible enough for what you want, a patch to it is much more likely
| to be accepted, especially if it is upward compatible.

I overlooked the --log-format option. I tried it, and it comes close
to what I want, but it doesn't report directory updates, file
deletions and hard links.  While I don't care very much about
directory time updates, deleted files and hard links are important.
I'm not sure whether the log_send() and log_recv() interfaces are
suitable to log these actions.

To ensure backward compatibility, a new option to suppress the
unwanted messages could be added, e.g. --no-stats. The attached patch
implements this. For completeness, I've attached a separate patch
for the second proposed change.

-- 
Dick Streefland   Altium Software BV
[EMAIL PROTECTED]   (@ @)  http://www.altium.com
oOO--(_)--OOo---


--- rsync-2.5.5/flist.c.origThu Mar 14 22:20:20 2002
+++ rsync-2.5.5/flist.c Fri May 17 12:01:04 2002
@@ -34,6 +34,7 @@
 extern struct stats stats;
 
 extern int verbose;
+extern int do_stats;
 extern int do_progress;
 extern int am_server;
 extern int always_checksum;
@@ -72,7 +73,7 @@
 
 static int show_filelist_p(void)
 {
-   return verbose && recurse && !am_server;
+   return verbose && do_stats >= 0 && recurse && !am_server;
 }
 
 static void start_filelist_progress(char *kind)
--- rsync-2.5.5/main.c.orig Wed Mar 27 06:10:44 2002
+++ rsync-2.5.5/main.c  Fri May 17 11:53:18 2002
@@ -57,7 +57,7 @@
extern int remote_version;
int send_stats;
 
-   if (do_stats) {
+   if (do_stats > 0) {
/* These come out from every process */
show_malloc_stats();
show_flist_stats();
@@ -93,7 +93,7 @@
stats.total_read = r;
}
 
-   if (do_stats) {
+   if (do_stats > 0) {
if (!am_sender && !send_stats) {
/* missing the bytes written by the generator */
rprintf(FINFO, "\nCannot show stats as receiver because remote 
protocol version is less than 20\n");
@@ -118,7 +118,7 @@
   (double)stats.total_read);
}

-   if (verbose || do_stats) {
+   if ((verbose && do_stats == 0) || do_stats > 0) {
rprintf(FINFO,"wrote %.0f bytes  read %.0f bytes  %.2f bytes/sec\n",
   (double)stats.total_written,
   (double)stats.total_read,
--- rsync-2.5.5/options.c.orig  Tue Mar 19 21:16:42 2002
+++ rsync-2.5.5/options.c   Fri May 17 11:51:58 2002
@@ -260,6 +260,7 @@
   rprintf(F," --blocking-io   use blocking IO for the remote shell\n");  
   rprintf(F," --no-blocking-ioturn off --blocking-io\n");  
   rprintf(F," --stats give some file transfer stats\n");  
+  rprintf(F," --no-stats  suppress file transfer stats with -v\n");  
   rprintf(F," --progress  show progress during transfer\n");  
   rprintf(F," --log-format=FORMAT log file transfers using specified 
format\n");  
   rprintf(F," --password-file=FILEget password from FILE\n");
@@ -345,7 +346,8 @@
   {"compress",'z', POPT_ARG_NONE,   &do_compression , 0, 0, 0 },
   {"daemon",   0,  POPT_ARG_NONE,   &am_daemon , 0, 0, 0 },
   {"no-detach",0,  POPT_ARG_NONE,   &no_detach , 0, 0, 0 },
-  {"stats",0,  POPT_ARG_NONE,   &do_stats , 0, 0, 0 },
+  {"stats",0,  POPT_ARG_VAL,&do_stats ,  1, 0, 0 },
+  {"no-stats", 0,  POPT_ARG_VAL,&do_stats , -1, 0, 0 },
   {"progress", 0,  POPT_ARG_NONE,   &do_progress , 0, 0, 0 },
   {"partial",  0,  POPT_ARG_NONE,   &keep_partial , 0, 0, 0 },
   {"ignore-errors",0,  POPT_ARG_NONE,   &ignore_errors , 0, 0, 0 },
--- rsync-2.5.5/rsync.1.origWed Feb  6 22:21:19 2002
+++ rsync-2.5.5/rsync.1 Fri May 17 11:59:36 2002
@@ -305,6 +305,7 @@
  --blocking-io   use blocking IO for the remote shell
  --no-blocking-ioturn off --blocking-io
  --stats give some file transfer stats
+ --no-stats  suppress file transfer stats with -v
  --progress  show progress during transfer
  --log-format=FORMAT log file transfers using specified format
  --password-file=FILEget password from FILE
@@ -778,6 +779,10 @@
 This tells rsync to print a verbose set of statistics
 on the file transfer, allowing you to tell how effective the rsync
 algorithm is for your data\&.
+.IP 
+.IP "\fB--no-stats\fP" 
+When y

Rsync dies

2002-05-17 Thread C.Zimmermann

I'm trying to rsync a 210 GB filesystem with approx 1,500,000 files.

Rsync always dies after about 29 GB without any error messages.
I'm using rsync version 2.5.5, protocol version 26.

Has anyone an idea?

Thanks, Clemens


