Re: LGPL relicense port of rsync
> > > I guess I could write an initial protocol specification - but it would
> > > not be complete and I wouldn't be able to relicense my library to
> > > LGPL anyway.
> > >
> > > So I guess I have convinced myself that it is not worth the effort
> > > trying. Time is probably better spent coding ;) And that's OK too, it
> > > is not that big of a deal anyway.
> >
> > Or think about following. You insist that your Java library is
> > derivative work from the C program. OK. However, I believe a
> > "translation into other languages" doesn't mean you make changes into
> > the workflow by code restructuring, introducing another data
> > structures, classes and so on. More such changes you made, less it just
> > a "translation" and more an inspiration. Often I read in code not
> > "based on" but "inspired by".
> >
> > Anyway, you have written every line in Java. This means you're a
> > copyright holder on this. Thus you're allowed to license your work as
> > you wish. In case you still insist it is a derivative work, you're
> > required to allow the usage of your code under GPL. But! As a copyright
> > holder you're allowed to give an arbitrary license additionally and
> > even on a per case basis.
> >
> > This was my opinion. Additional references to approve or disapprove are
> > welcome :)
>
> You might be right but I am a bit hesitant.
>
> http://programmers.stackexchange.com/questions/58338/when-porting-code-must-i-follow-the-original-license
>
> http://programmers.stackexchange.com/questions/90232/original-author-rights-in-a-licensed-software-project?rq=1
>
> http://programmers.stackexchange.com/questions/86754/is-it-possible-to-rewrite-every-line-of-an-open-source-project-in-a-slightly-dif

These are talking about different situations:

 - 'porting' in the sense of making code run on a different platform
   while still having some code in common
 - line-by-line rewrite or translation
 - writing a new program using the rsync source as documentation of the
   protocol, as you are doing

In my (not a lawyer) opinion, the last of them does not create a
copyright derivative, and (separately) I don't object to you doing that
on GPL'd work that I wrote. I would consider the first two to be a
violation.

I think you have a couple of cheap options to get some clarity:

 - mail the other key authors listed above explaining what you're doing
   and ask if they object
 - mail the FSF or SFLC as custodians of the L/GPL

> I think that the best thing would be if rsync would be split into a
> library part (LGPL) and application part (GPL). This could make the
> rsync protocol even more used.
>
> But again, it could be quite some substantial work, both coding (?)
> but also getting permissions from previous contributors to relicense
> the library part.

-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Fwd: Delete some excluded files in rsync
Begin forwarded message:

From: Karel Kulhavy [EMAIL PROTECTED]
Date: 7 March 2006 18:01:43
To: Martin Pool [EMAIL PROTECTED]
Subject: Delete some excluded files in rsync

Hello

I suggest that a feature be added to rsync: one could separately
specify excluded files that should be deleted on the receiver and
excluded files that shouldn't be deleted on the receiver.

I am using rsync for remote updating of my website
http://ronja.twibright.com and this feature would be handy because some
files are generated on the server, since they cannot be generated on the
laptop where the files are edited, and shouldn't be deleted. The
remaining excluded files should be deleted on the receiver if they got
accidentally copied in the past, for example because the rsync script
wasn't tuned properly.

Regards,
CL

-- 
Martin Pool
Re: [librsync-users] MD4 second-preimage attack
On Tue, 2006-02-21 at 14:58 -0800, [EMAIL PROTECTED] wrote:
> A year ago we discussed the strength of the MD4 hash used by rsync and
> librsync, and one of the points mentioned was that only collision
> attacks are known on MD4.

Could you please forward this into the bug tracker so it's not lost?

-- 
Martin
Re: Spam to this list
John Van Essen wrote:
> Off list to rsync list owner (feel free to reply on-list if you like):
>
> On Fri, 25 Mar 2005, Dag Wieers [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > I'm not sure what the policy of this list is and I bet everyone has a
> > spam filter, so nobody might have noticed, but we got spammed.

The policy is to block as much spam as possible without blocking
legitimate posts. A 100% solution is impossible, even if we had human
moderation (humans make mistakes).

It seems that these posts got through during a surge of spam when the
filter hit its maximum-process limit. During the day of the 24th more
than 60 spam messages to the list were blocked.

> I got several. Delivered to the mailing list from:
>
>   cpe-24-243-54-175.satx.res.rr.com [24.243.54.175]
>   unknown [219.252.105.93]
>   unknown [218.59.89.16]
>   unknown [200.159.206.55]
>
> The first one has been in the dul.dnsbl.sorbs.net blacklist since Oct.
>
> I use these 4 DNS-based blacklists in the mail server that I manage:
>
>   sbl-xbl.spamhaus.org
>   list.dsbl.org
>   dul.dnsbl.sorbs.net
>   web.dnsbl.sorbs.net
>
> And they have helped a LOT.
>
> The other 3 have no reverse DNS entries. A machine with no reverse DNS
> that is sending email is not very likely to be a legitimate email
> server. It's much more likely a compromised machine on a clueless ISP's
> network. Rejecting email from those unidentified machines also has
> helped a lot.

Using any of those measures alone tends to block legitimate posters,
particularly those running their own mail server, which to my mind is a
greater harm than letting occasional spam go through. Our purpose here
is to run a mailing list, not punish ISPs.

So we use all the things you named as part of a weighted score.

-- 
Martin
Re: rsync filename heuristics
On 5 Jan 2005, Rusty Russell [EMAIL PROTECTED] wrote:
> On Tue, 2005-01-04 at 18:24 +0100, Robert Lemmen wrote:
> > hi rusty,
> >
> > i read on some webpage about rsync and debian that you wrote a patch to
> > rsync that let's it uses heuristics when deciding which local file to
> > use. could you tell me whether this is planned to be included in a rsync
> > release? could i have that patch?
>
> Hmm, good question. This is from 2.5.4, and can't remember how well it
> worked. Good luck!

I'm not the rsync maintainer anymore, but I think it would be cool if
this were merged, if the current team feels OK about it.

> Rusty.

diff -urN rsync-2.5.4/Makefile.in rsync-2.5.4-fuzzy/Makefile.in
--- rsync-2.5.4/Makefile.in	2002-02-26 05:48:25.0 +1100
+++ rsync-2.5.4-fuzzy/Makefile.in	2002-04-03 16:35:55.0 +1000
@@ -28,7 +28,7 @@
 ZLIBOBJ=zlib/deflate.o zlib/infblock.o zlib/infcodes.o zlib/inffast.o \
 	zlib/inflate.o zlib/inftrees.o zlib/infutil.o zlib/trees.o \
 	zlib/zutil.o zlib/adler32.o
-OBJS1=rsync.o generator.o receiver.o cleanup.o sender.o exclude.o util.o main.o checksum.o match.o syscall.o log.o backup.o
+OBJS1=rsync.o generator.o receiver.o cleanup.o sender.o exclude.o util.o main.o checksum.o match.o syscall.o log.o backup.o alternate.o
 OBJS2=options.o flist.o io.o compat.o hlink.o token.o uidlist.o socket.o fileio.o batch.o \
 	clientname.o
 DAEMON_OBJ = params.o loadparm.o clientserver.o access.o connection.o authenticate.o
diff -urN rsync-2.5.4/alternate.c rsync-2.5.4-fuzzy/alternate.c
--- rsync-2.5.4/alternate.c	1970-01-01 10:00:00.0 +1000
+++ rsync-2.5.4-fuzzy/alternate.c	2002-04-03 17:04:15.0 +1000
@@ -0,0 +1,117 @@
+#include "rsync.h"
+
+extern char *compare_dest;
+extern int verbose;
+
+/* Alternate methods for opening files, if local doesn't exist */
+/* Sanity check that we are about to open regular file */
+int do_open_regular(char *fname)
+{
+	STRUCT_STAT st;
+
+	if (do_stat(fname, &st) == 0 && S_ISREG(st.st_mode))
+		return do_open(fname, O_RDONLY, 0);
+
+	return -1;
+}
+
+static void split_names(char *fname, char **dirname, char **basename)
+{
+	char *slash;
+
+	slash = strrchr(fname, '/');
+	if (slash) {
+		*dirname = fname;
+		*slash = '\0';
+		*basename = slash+1;
+	} else {
+		*basename = fname;
+		*dirname = ".";
+	}
+}
+
+static unsigned int measure_name(const char *name,
+				 const char *basename,
+				 const char *ext)
+{
+	int namelen = strlen(name);
+	int extlen = strlen(ext);
+	unsigned int score = 0;
+
+	/* Extensions must match */
+	if (namelen <= extlen || strcmp(name+namelen-extlen, ext) != 0)
+		return 0;
+
+	/* Now score depends on similarity of prefix */
+	for (; *name==*basename && *name; name++, basename++)
+		score++;
+	return score;
+}
+
+int open_alternate_base_fuzzy(const char *fname)
+{
+	DIR *d;
+	struct dirent *di;
+	char *basename, *dirname;
+	char mangled_name[MAXPATHLEN];
+	char bestname[MAXPATHLEN];
+	unsigned int bestscore = 0;
+	const char *ext;
+
+	/* FIXME: can we assume fname fits here? */
+	strcpy(mangled_name, fname);
+
+	split_names(mangled_name, &dirname, &basename);
+	d = opendir(dirname);
+	if (!d) {
+		rprintf(FERROR,"recv_generator opendir(%s): %s\n",
+			dirname,strerror(errno));
+		return -1;
+	}
+
+	/* Get final extension, eg. .gz; never full basename though. */
+	ext = strrchr(basename + 1, '.');
+	if (!ext)
+		ext = basename + strlen(basename); /* ext = "" */
+
+	while ((di = readdir(d)) != NULL) {
+		const char *dname = d_name(di);
+		unsigned int score;
+
+		if (strcmp(dname,".")==0 ||
+		    strcmp(dname,"..")==0)
+			continue;
+
+		score = measure_name(dname, basename, ext);
+		if (verbose > 4)
+			rprintf(FINFO,"fuzzy score for %s = %u\n",
+				dname, score);
+		if (score > bestscore) {
+			strcpy(bestname, dname);
+			bestscore = score;
+		}
+	}
+	closedir(d);
+
+	/* Found a candidate. */
+	if (bestscore != 0) {
+		char fuzzyname[MAXPATHLEN];
+
+		snprintf(fuzzyname,MAXPATHLEN,"%s/%s", dirname, bestname);
+		if (verbose > 2)
+			rprintf(FINFO,"fuzzy match %s->%s\n",
+				fname, fuzzyname);
+		return do_open_regular(fuzzyname);
+	}
+	return -1;
+}
+
+int open_alternate_base_comparedir(const char *fname)
+{
+	char fnamebuf[MAXPATHLEN];
+	/* try the file at
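
For anyone who wants to try the scoring heuristic outside the rsync tree, here is a minimal standalone sketch of the idea behind measure_name() in the patch: a candidate only counts if it ends in the same extension, and its score is the length of the leading prefix it shares with the wanted name. The function name fuzzy_score() is made up for this sketch and none of rsync's own helpers are used; it is an illustration, not the code that was (or would be) merged.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical standalone version of the patch's scoring idea.
 * Returns 0 when `candidate` does not end in `ext` (or consists of
 * nothing but the extension); otherwise returns the number of leading
 * characters that `candidate` and `wanted` have in common. */
static unsigned int fuzzy_score(const char *candidate,
                                const char *wanted,
                                const char *ext)
{
    size_t candlen, extlen;
    unsigned int score = 0;

    assert(candidate != NULL && wanted != NULL && ext != NULL);
    candlen = strlen(candidate);
    extlen = strlen(ext);

    /* Extensions must match, and the name must be longer than the extension. */
    if (candlen <= extlen || strcmp(candidate + candlen - extlen, ext) != 0)
        return 0;

    /* Count matching leading characters, stopping at the first mismatch
     * or at the end of the candidate. */
    while (*candidate != '\0' && *candidate == *wanted) {
        score++;
        candidate++;
        wanted++;
    }
    return score;
}
```

With this, fuzzy_score("rsync-2.5.4.tar.gz", "rsync-2.5.5.tar.gz", ".gz") counts the shared prefix "rsync-2.5.", while a candidate ending in ".bz2" scores zero and would never be picked as a basis file.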
Re: A question about rsync
On 7 Jun 2004, Guo jing [EMAIL PROTECTED] wrote:
> Thanks for your answer!
>
> Yes, my question is that if we can get a good result when the file is
> changing while it is being copied by rsync.
>
> In my test, if the file is being augmented while it is being copied
> using rsync, I can get a normal copy on the other end and the result
> file is the same as what the source file is when the rsync scanning.
> The same result can be gotten if the source file is reduced and the
> blocks were not occupied.
>
> As you said, if the source file is reduced and the blocks were occupied
> by other files, there will be a file with other file's content and an
> abnormal end on the other end.
>
> So, is this true that we can't deal with this problem except to do some
> changes with the OS ?

Yes, or don't change the file while it is being copied.

-- 
Martin
Re: Bug reporting
On 1 Jun 2004, John Summerfield [EMAIL PROTECTED] wrote:
> The jitterbug link on http://rsync.samba.org/nobugs.html no longer
> works. I suggest it either be fixed or removed.

Thanks, fixed.

> You make bug-reporting needlessly difficult, I think. I dislike the
> need to subscribe to a mailing-list and potentially receive lots of
> email that doesn't interest me. I have plenty of other email to keep me
> amused.

I don't think you need to subscribe to post. I put the address directly
on the nobugs page to make it easier for people to write to it. Did you
have any other suggestions about how to make it better?

The reason we took Jitterbug and faq-o-matic down is that people seemed
to get help more promptly when they wrote to the list.

> What I wanted most to do is ensure you know about rsyncx and consider
> working with the authors to create a unified product that supports
> resource forks when built on OSX.
>
> See http://www.macosxlabs.org./rsyncx/rsyncx.html
>
> Their CVS repository is at http://www.opendarwin.org
>
> It seems a shame to have two projects where one will do.

Well, sometimes there are reasons not to glom everything into one big
program.

-- 
Martin
Re: I20 Drivers Crash system when used with Rsync
On 30 May 2004, Dennis R. Gesker [EMAIL PROTECTED] wrote:
> Note: I don't know if this is a problem with the I20 drivers or Rsync
> so I'm submitting to both the Kernel Bugzilla and the Rsync mailing
> list. I couldn't find a bugzilla for Rsync. I hope this was the correct
> way to submit this issue.
>
> Distribution: Debian
>
> Hardware Environment: Intel 850MV motherboard, Pentium 4 processor,
> 1Gig of RAM, Adaptec 2400A RAID controller. Both the motherboard and
> controller card have the most recent BIOS/firmware installed. The
> Adaptec card is capable of RAID configuration but currently it is
> configured to view each of the attached IDE drives as individual
> drives. None of the card's RAID features are presently being used.
> Network is 100MB/s Switched Ethernet. Network cables and connects have
> been tested and verified.
>
> Software Environment: Very basic/vanilla Debian system install (Sid
> branch). Software package is rsync.
>
> Problem Description: When transferring many sometimes large files (3Gig
> in some cases) for backup purposes using rsync either via an ssh shell
> or rsync server the I20 drivers cause a kernel panic. The system seems
> to report increases in queue depth, shortly afterward the system
> completely hangs indicating a kernel panic.

A kernel panic is by definition a kernel bug, not an application bug.
Good luck! :-)

p.s. kernel bug reports ought to say what kernel you're using.

> Steps to reproduce: Transfer files using rsync. Last specific command
> issued at prompt that reproduced this error was:
>
>   rsync --bwlimit=2048 -vv -r -e ssh --delete --exclude lost+found rsync://[EMAIL PROTECTED]:873:/bu/area1/blue/* /bu/area1/blue
>
> This error does not seem to occur when transferring the same file set
> using cp over nfs or scp. However, this does happen using rsync over
> nfs.

-- 
Martin
Re: A question about rsync
On 31 May 2004, Guo jing [EMAIL PROTECTED] wrote:
> hello,
>
> I am a student in China. I like the linux and usually use the rsync to
> backup my documents. Last week when I use it, I find a question I want
> to discuss with you.
>
> The condition is like this: The source file that I want to rsync to
> another computer is 129M before I start the rsync. During the running
> of the rsync, the file was changed and became to about 50M, then the
> rsync ended. When I view the destination, I found that the file was
> 129M. And there were some contents of the files added when the rsync
> was running.
>
> After that, I do some tests about the rsync:
>
> 1. After I start the rsync to backup a file, I delete the file during
>    the rsync is running, I found the file can been backuped normally.
>
> 2. While the rsync is backuping a file name sourfile (50M), I add some
>    content by the command
>        cat addfile >> sourfile
>    to enlarge the file to 100M. After the rsync finished, I found the
>    file is still 50M.
>
> The question is that, how the rsync copy a file to another computer at
> the first time ? My attitude is that it remembers the physical blocks
> the file used when the rsync scanned. Then, rsync will send the blocks
> to the destination no matter if the file or the block has changed. So,
> is that right?? Who can tell me how the rsync decide which contents
> should to send to the destination?

I'm not sure I understand the question, sorry.

If you change a file while it is being copied by rsync you may end up
with undefined results on the other end. There is not much that can be
done about this without os-level version control.

> Sorry, my English is very poor. Thanks for your read and answer!!

-- 
Martin
CVS update: rsyncweb
Date:	Tue Jun 1 09:08:29 2004
Author:	mbp

Update of /data/cvs/rsyncweb
In directory dp.samba.org:/tmp/cvs-serv5025

Modified Files:
	nobugs.html
Log Message:
Clean up mention of mailing list.

Revisions:
nobugs.html		1.8 => 1.9
	http://www.samba.org/cgi-bin/cvsweb/rsyncweb/nobugs.html.diff?r1=1.8&r2=1.9
_______________________________________________
rsync-cvs mailing list
[EMAIL PROTECTED]
http://lists.samba.org/mailman/listinfo/rsync-cvs
CVS update: rsyncweb
Date:	Tue Jun 1 02:07:38 2004
Author:	mbp

Update of /data/cvs/rsyncweb
In directory dp.samba.org:/tmp/cvs-serv10518

Modified Files:
	features.html
Log Message:
typo

Revisions:
features.html		1.2 => 1.3
	http://www.samba.org/cgi-bin/cvsweb/rsyncweb/features.html.diff?r1=1.2&r2=1.3
faq-o-matic gone
The rsync faq-o-matic was broken during the recent machine migration.
Since there was relatively little useful content and a lot of unanswered
or pointless questions, I am going to remove the links to it.

If anyone feels like maintaining an FAQ please do so.

-- 
Martin
CVS update: rsyncweb
Date:	Fri May 28 02:25:19 2004
Author:	mbp

Update of /data/cvs/rsyncweb
In directory dp.samba.org:/tmp/cvs-serv15115

Modified Files:
	header.html
Log Message:
remove dead faq-o-matic

Revisions:
header.html		1.14 => 1.15
	http://www.samba.org/cgi-bin/cvsweb/rsyncweb/header.html.diff?r1=1.14&r2=1.15
(fwd from psdasilva@esoterica.pt) rsync: Request for a feature
----- Forwarded message from Paulo da Silva [EMAIL PROTECTED] -----

From: Paulo da Silva [EMAIL PROTECTED]
Subject: rsync: Request for a feature
Date: Sun, 02 May 2004 17:09:11 +0100
To: [EMAIL PROTECTED]
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040317
X-Spam-Status: No, hits=-0.9 required=3.2 tests=BAYES_10 autolearn=ham version=2.63

Hi.

First of all, thank you for maintaining this very useful program. It
helps me a lot in a lot of tasks that would otherwise be very tedious
and time consuming.

However, I think that a small feature could make it yet more powerful
when used as a backup tool.

The idea was to have a switch so that files could be kept compressed at
the destination. These compressed files could then be restored the same
way, specifying a switch telling that source files are to be
uncompressed. Files with known extensions (.gz, .zip, ...) should not be
compressed/uncompressed. All files must keep the original names
unchanged.

Ex.

  export RSYNC_COMPRESSED_EXTS=".gz .zip ..."  ;# Extensions from files not to be compressed

  Backup:
    rsync -av --delete --zip MyDir/ BackupDir
  Restore:
    rsync -av --delete --unzip BackupDir/ MyDir

This is only a suggestion. You may find a better solution.

Thank you.
Paulo da Silva

----- End forwarded message -----

-- 
Martin
Re: test message only
On 23 Apr 2004, Jim Salter [EMAIL PROTECTED] wrote:
> This is a test message - my apologies for it, but everything I send is
> getting bounced.

Our spamfilter was a little too hasty. It should be OK now.

-- 
Martin
OT: fyi, spam
Just as background information: our spam filter caught 14000 attempted
spams in the last two weeks. Suggestions on blocking more are welcome
but the vast majority is already blocked.

I think we removed the @samba.org whitelist.

-- 
Martin
CVS update: rsyncweb
Date:	Mon Jan 12 00:49:28 2004
Author:	mbp

Update of /data/cvs/rsyncweb
In directory dp.samba.org:/tmp/cvs-serv6859

Modified Files:
	lists.html
Log Message:
Fix link to Smart Questions document.

Revisions:
lists.html		1.5 => 1.6
	http://www.samba.org/cgi-bin/cvsweb/rsyncweb/lists.html.diff?r1=1.5&r2=1.6
Re: rsync / ssh -i
On 4 Dec 2003, Michael [EMAIL PROTECTED] wrote:
> I know that with ssh I can issue the -i command to use a different
> identity. Is there anyway to use the -i command with rsync and ssh?
> Thank you.

Use the IdentityFile and Host keywords in your ssh_config:

  Host suzy-alt-key
    HostName suzy.foo.org
    IdentityFile ~/.ssh/id_some_other_dsa

-- 
Martin
linux.conf.au -- Adelaide, January 2004
CVS update: rsyncweb
Date:	Thu Dec 4 10:59:33 2003
Author:	mbp

Update of /data/cvs/rsyncweb
In directory dp.samba.org:/tmp/cvs-serv18506

Modified Files:
	index.html
Log Message:
Clarify that the problem is with 2.5.6 *and earlier*. Add CVE index.

Revisions:
index.html		1.17 => 1.18
	http://www.samba.org/cgi-bin/cvsweb/rsyncweb/index.html.diff?r1=1.17&r2=1.18
Re: rsync-bugs and unclear semantics when copying multiple source-dirs to one target
On 24 Nov 2003, Dirk Pape [EMAIL PROTECTED] wrote:
> Dear Martin Pool,
>
> I tried to ask via the rsync-mailing list but never got an answer. So I
> contact you directly.
>
> I refer to the rsync syntax
>
>   rsync [OPTION]... SRC [SRC]... DEST
>
> with more than one SRC, which is mentioned in the man-pages. We use
> this form to overlay a target directory tree from more than one sources
> (class, group1, group2, ..., machine) to yield a customized cloned
> directory.
>
> There are some glitches and bugs when using this form of rsync
> commands, one of which I have described in the here attached mail to
> the rsync mailing list. This is a platform specific bug.

The heart of the problem is that you are trying to write the same file
from several different source directories. I think this just will not
work predictably in the current design of rsync, because it builds a
single list of all files at the start of the transfer. Furthermore the
order in which files are transferred is rather strange, for reasons of
historical compatibility.

I think we do not make any guarantees about what happens if the same
relative path occurs in several source directories: the behaviour is
undefined.

I agree that it would be nice if it processed the source directories in
the order they are given, but that is not how it works.

At the moment your options are:

 - Fix rsync to support this behaviour.
 - Transfer the directories one at a time to build up the destination.
   This has several problems, one being that there may be many redundant
   transfers and another that the state will be inconsistent for longer.
 - Make a single source directory that has the state you want.
 - Ditto, but use union bind mounts to synthesize it from several
   directories, assuming that your OS supports that.
 - Use some other tool.
 - Do several rsync transfers using exclude/include options to pick the
   right directories from each overlay.

The last is possibly the most promising. You could even write a little
Perl script to build the exactly correct include lists.

> There is another glitch, which I will describe here:
>
> if you have the following directory structure (-> is softlink)
>
>   ./dir1/dir/a
>   ./dir2/dir -> ../dir3/dir
>   ./dir3/dir/b
>
> and do
>
>   rsync -av --delete dir1/ dir2/ target
>
> you get
>
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/a
>   ./dir3/dir/b
>
> I would expect either
>
> Variant 1:
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/b
> (contents of /dir1/dir is ignored because dir is overlayed with a
> symlink in dir2)
>
> or
>
> Variant 2:
>   ./target/dir -> ../dir3/dir
>   ./dir3/dir/b
> (./dir1/dir/a is copied following the overlayed symlink *but* the
> --delete then also has to follow the symlink)
>
> I would prefer strongly to see variant 1 or a new option to protect
> target directories from changing contents by linking into them.
>
> For your motivation: Our more complex scenario is like that:
>
> We have
>
>   class/usr/share/bugzilla/some_files
>   machine/usr/share/bugzilla -> /local/usr/share/bugzilla
>
> and we do something like
>
>   rsync -av --delete --exclude local class/ machine/ targethost:/
>
> the --exclude local protects files in targethost:/local from being
> deleted but not from being overwritten with files which are present in
> class/usr/share/bugzilla/ on the src-host.
>
> I would like to see an option (or standard semantics) to simply killing
> a directory's sub-filelist when the directory is overlayed by a symlink
> in a source directory given later in the command line. Maybe it would
> suffice to do that only if the symlink points to a directory which is
> outside all source dirs or element of an exclude list.
>
> I hope you understand and can help me.
>
> Thanks,
> Dirk Pape.

From: Dirk Pape [EMAIL PROTECTED]
Subject: bug (filelist) for platforms solaris and darwin (macosx) and *not* linuxi386
Date: Sun, 28 Sep 2003 13:19:45 +0200
To: [EMAIL PROTECTED]
X-Mailer: Mulberry/3.1.0b7 (Mac OS X)

I have found a nasty bug when a file, which is in some of many sources,
shall be copied to a target. The linux-Version works well but rsync
2.5.{2|5|6} under solaris9 (gcc 2.95.3) and darwin (gcc 3.1) do not.

The decision which file (out of which src) shall be copied depends on
the number of src dirs given on the command line.

This bug bites us very hard, because we decided to rely on rsync to
build local directories by overlaying different directories from a
server and need to be sure to have a consistent semantics in what
version of the file appears in the local directory.

I stripped our situation down to a (yet fairly complex) test archive, so
you can reproduce the situation.

Here is the script, which is also in the archive:

#!/bin/bash
rsyncpath=rsync
$rsyncpath -av --delete dir1/ dir2/ merged12
$rsyncpath -av --delete dir1/ dir2/ dir3/ merged123
# as dir3 only consists of an empty dir subdir we expect
# that merged12 and merged123 have identical files in them
# but merged*/subdir/s0/LOOKATTHIS differ
Re: rsync rcp
On 30 Oct 2003, [EMAIL PROTECTED] wrote:
> I was hoping that since you guys are the authors to rsync that you
> could answer a simple question for me. I'm trying to transfer files via
> the rsh/rexec protocol by remotely executing a cat command, i.e. cat
> foo.txt and then sending data through the socket to the stdin of the
> remote process. This all works fine, except for the fact that I have to
> close the socket to force an end of file. My question is, does
> rcp/rsync close a socket when it sends files to signify an end of
> file? If not, how does it send multiple files without closing the
> socket?

It uses a binary protocol to delimit files and describe metadata such
as their name and ownership. As you say you cannot use the end-of-file
mark more than once.

It is conceptually similar to a tar file. So if you wanted to send
multiple files with just rsh, you could do

  tar c mydir | ssh somehost tar x

[EMAIL PROTECTED] is a better forum for questions.

-- 
Martin
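
To make the "delimit files without closing the socket" point concrete, here is a minimal sketch of length-prefixed framing. The record layout (4-byte big-endian name length, 4-byte big-endian data length, then name and data) and the names frame_write() and frame_read() are invented for this example; this is not rsync's or tar's actual wire format, only the general technique of telling the receiver how many bytes belong to each file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout (NOT rsync's real protocol):
 *   bytes 0-3  name length, big-endian
 *   bytes 4-7  data length, big-endian
 *   then the name bytes, then the data bytes. */

static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Append one file record to buf. Returns the number of bytes written,
 * or 0 if the record does not fit in cap bytes. */
static size_t frame_write(unsigned char *buf, size_t cap, const char *name,
                          const void *data, uint32_t datalen)
{
    uint32_t namelen;
    size_t need;

    assert(buf != NULL && name != NULL && (data != NULL || datalen == 0));
    namelen = (uint32_t)strlen(name);
    need = 8 + (size_t)namelen + datalen;
    if (need > cap)
        return 0;
    put_u32(buf, namelen);
    put_u32(buf + 4, datalen);
    memcpy(buf + 8, name, namelen);
    if (datalen != 0)
        memcpy(buf + 8 + namelen, data, datalen);
    return need;
}

/* Parse one record from the start of buf. Returns the number of bytes
 * consumed, or 0 if fewer than a whole record is available. */
static size_t frame_read(const unsigned char *buf, size_t len,
                         const unsigned char **name, uint32_t *namelen,
                         const unsigned char **data, uint32_t *datalen)
{
    if (len < 8)
        return 0;
    *namelen = get_u32(buf);
    *datalen = get_u32(buf + 4);
    if (*namelen > len - 8 || *datalen > len - 8 - *namelen)
        return 0;
    *name = buf + 8;
    *data = buf + 8 + *namelen;
    return 8 + (size_t)*namelen + *datalen;
}
```

Because each record carries its own lengths, the receiver knows where one file stops and the next begins, so any number of files can share one connection and end-of-file is only needed once, at the very end.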
Re: The rsync daemon as a gateway service?
That's an interesting idea.

As a temporary measure you might use different TCP ports rather than
module names to distinguish different services, and then use TCP
redirectors.

-- 
Martin
Re: Filesystem panic
On 22 Oct 2003, Morten [EMAIL PROTECTED] wrote:
> Hi,
>
> I'm running RH9, 2.4.20-18.9. Each night, the server mounts an external
> FAT32 disk using firewire, and performs backups to it using rsync.
>
> Twice within the past 3 months, the backup process has resulted in
> machine crash (complete hang, hardware reboot needed). From
> /var/log/messages:
>
>   Oct 22 04:02:20 yoda kernel: Filesystem panic (dev 08:21).
>   Oct 22 04:02:20 yoda kernel: fat_free: deleting beyond EOF
>   Oct 22 04:02:20 yoda kernel: File system has been set read-only

You probably need to report this to the vfat fs maintainer

  VFAT FILESYSTEM:
  P: Gordon Chaffee
  M: [EMAIL PROTECTED]
  L: [EMAIL PROTECTED]
  W: http://bmrc.berkeley.edu/people/chaffee
  S: Maintained

> From the rsync error log, I can see that the filesystem becomes
> read-only, and that it begins to fail the synchronization task with
>
>   mkstemp SAGER_igangvaerende/2003_07_Fremtidens_Uni/Fase_4/Diagrammer/Celletyper/.Delecelle.psd.Rz49bM failed: Read-only file system
>
> Which is understandable. After doing this for a while, the error
> message changes to "Too many open files". And I suspect that this is
> what causes the machine to hang.

Can you please try to reproduce the problem, and then do

  lsof -p PID_OF_RSYNC

for each rsync process sometime before it starts complaining about too
many open files. Then kill rsync to avoid the problem.

> Is there any way to configure rsync to abort execution once the first
> error occurs?

Not at the moment.

-- 
Martin
Re: doing an md5sum rsync?
On 9 Sep 2003 Greger Cronquist [EMAIL PROTECTED] wrote:
> See also unison, http://www.cis.upenn.edu/~bcpierce/unison/ which does
> exactly this (and synchronizes using the rsync algorithm).

Yes, Unison is very cool. I hadn't realized that it detected renames
though.

-- 
Martin
Re: performance suggestion: sparse files
On 9 Sep 2003 Jon Howell [EMAIL PROTECTED] wrote:
> > Actually you can guess by looking at the allocated-blocks measure,
> > and use this to guess whether it's preallocated zeros or sparse,
> > which might be useful for backups. But there is no way around
> > reading the blocks.
>
> Sure. Bummer; that's a lot of memory bus bandwidth (having the kernel
> zero-fill the blocks, then having rsync zero-compare them) wasted.

If the program mmaps the file the kernel will fill the vm with COW
references to the zero page and it will be quite cheap.

> Seems like a fcntl() is in order.

Repeat after me: premature optimization is the root of all evil.

-- 
Martin
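
A minimal sketch of the "allocated-blocks measure" guess mentioned above, using the st_size and st_blocks fields that stat() reports. It assumes st_blocks counts 512-byte units, which holds on Linux but is left to the implementation by POSIX, and the helper names are made up for the example. It is only a heuristic: it cannot tell where the holes are, and filesystems that compress or store tiny files inline can fool it.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Assumption: st_blocks is counted in 512-byte units (true on Linux;
 * POSIX leaves the unit up to the implementation). */
#define ASSUMED_BLOCK_UNIT 512

/* Returns 1 if the logical size is larger than the space allocated,
 * which suggests the file contains holes; 0 otherwise. */
static int size_exceeds_allocation(off_t size, blkcnt_t blocks)
{
    assert(size >= 0 && blocks >= 0);
    return size > (off_t)blocks * ASSUMED_BLOCK_UNIT;
}

/* Returns 1 if path looks sparse, 0 if it does not (or is not a
 * regular file), and -1 if stat() fails. */
static int looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return 0;
    return size_exceeds_allocation(st.st_size, st.st_blocks);
}
```

A 1 MB file with only eight 512-byte blocks allocated would be flagged as sparse, whereas a 4096-byte file occupying eight blocks would not. As noted in the thread, this only helps decide how to treat the file; the data still has to be read to find out which parts are zero.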
Re: rsyncing *to* live system
On 26 Aug 2003 jw schultz [EMAIL PROTECTED] wrote:
> On Wed, Aug 27, 2003 at 09:25:41AM +1200, Steve Wray wrote:
> > Hi there,
> >
> > I have been asked to develop a system for keeping a bunch of machines
> > remotely configured and updated. The client has asked for this to be
> > implemented using rsync.
> >
> > The machines involved are located at remote sites and physical access
> > by competent personnel is non trivial. And the systems are running
> > Debian.
> >
> > I am a little concerned at the prospect of using rsync to replace
> > running binaries, open files and system libraries.
> >
> > I've searched for an example where rsync has been used in this way.
> > So far I have found nothing; people use it to backup a live
> > filesystem; we are tasked with doing the reverse (sort of). And there
> > are people who use rsync to replicate systems (rolling out a bunch of
> > identical boxes; typically these receive the rsync *before* they go
> > live not after).
> >
> > So, can anyone please give me arguments or reasons for or against
> > using rsync in this way? References to sites which currently use
> > rsync in this way would be much appreciated.
>
> There are some difficulties that can occur depending on how you
> structure your filesystems.
>
> It is possible to produce temporary dangling symlinks. Rsync may remove
> the destination of the link before the symlink is updated or deleted
> (see --delete-after); or if rsync creates or updates a symlink before
> the destination is created.
>
> You can get inter-file inconsistencies. The file sets are not updated
> atomically so different config files and binaries may be updated at
> slightly different times. Because rsync processes the file list in
> lexical order the window size will depend on the relative remoteness of
> files in the directory hierarchy so files in the same directory have
> small windows but files in different subtrees will have a somewhat
> larger window.
>
> Here is an example of a bad case: a program depends on a shared
> library, and needs to be recompiled when a new version of the library
> is released. Your transfer upgrades the program before it updates the
> library (or vice versa) and the program crashes.

I agree with JW and will just add that the inter-file inconsistencies
could be far worse if the transfer is ever interrupted due to e.g. a
network outage. If you interrupt it at the right (or wrong) time it is
possible that rsync will no longer be able to run.

dpkg knows how to upgrade software in a safe and sane way, avoiding all
these problems. Let it do its job. By all means use rsync to transfer
the packages, but then run apt or dpkg.

In addition, once you upgrade software, you will want to restart
daemons to make sure the upgraded stuff is used. dpkg handles that
too...

-- 
Martin
Re: Looking for atime reset...
On 9 Sep 2003 Saylor, Ted [EMAIL PROTECTED] wrote:
> I find rsync an excellent tool when I need to move multi-gigabyte filesystems, because I can do most of the copying during the week, then a quick cleanup sweep in our 4 hour outage window.
>
> I do need to somehow get the atimes to copy over, because as it stands now I lose the age information (which we will soon be using for auto-archiving) on things I copy with rsync.
>
> Would it be that hard to enhance rsync to copy the atime along with the current mtime info? Does anyone have a speedy script, perl, or C program to clean up the atime after the final rsync is done?

You don't say what operating system or filesystem you're using, but on Linux there is no standard way to change the atimes of a file, so there is nothing rsync can do about it. If you persuade your friendly neighbourhood kernel hacker / vendor to add an operation to do this then I suppose rsync could support it.

--
Martin
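The question above asks for a script that fixes up atimes after the final rsync pass. What follows is only a minimal sketch of that idea, not something from the thread. It assumes the platform offers a POSIX utime()-style call that accepts an explicit access time (Python's os.utime wraps one); whether that is good enough on the poster's unnamed OS and filesystem is untested here. The function names copy_atime and copy_atimes_tree are made up for illustration.

```python
import os


def copy_atime(src, dst):
    """Set dst's access time to src's, leaving dst's modification time alone."""
    src_stat = os.stat(src)
    dst_stat = os.stat(dst)
    # os.utime always sets both timestamps, so hand dst's own mtime back in.
    os.utime(dst, ns=(src_stat.st_atime_ns, dst_stat.st_mtime_ns))


def copy_atimes_tree(src_root, dst_root):
    """Walk src_root and copy atimes onto same-named regular files under dst_root.

    Returns the number of files whose atime was set.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_root, rel, name)
            if os.path.isfile(dst):
                copy_atime(src, dst)
                copied += 1
    return copied
```

Note that setting timestamps this way updates the destination file's ctime, and that both trees must be reachable as local paths (for example over a network mount) for this sketch to apply.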
Re: Operation not permitted?
On 9 Sep 2003 Max Kipness [EMAIL PROTECTED] wrote:
> Can someone tell me what the problem is here. I am doing an rsync on a sendmail spool directory to a folder that is a samba mount.

What do you mean by a samba mount? A filesystem mounted over smbfs?

> Why is rsync trying to change owner?

Because you told it to, using the -a, -o or -g options.

> Does it have to?

You asked for it, you got it :-)

> I tried manually changing owner (as root) on a file that is sitting on the samba mount and I got the same operation not permitted error.

Assuming you're using smbfs, it's because smbmount logs in to the server as a single NT user, and all files appear to be owned by that user. Ownership is not preserved. cifsfs may fix this, but you need to ask about that elsewhere.

--
Martin
Re: doing an md5sum rsync?
On 7 Sep 2003 Marc MERLIN [EMAIL PROTECTED] wrote:
> I don't know if this has been requested before, but I would really like for rsync to compute an md5sum for each file at the source and destination (with a flag turned off by default of course), and it would realize that I renamed files at the source by noticing a matching md5sum between different filenames.
>
> It would then rename the destination instead of deleting it and resending the entire source, just because the filename changed.
>
> This would also take care of me moving files between directory trees, and again do a mv instead of a delete/resend (if I rsync the root of all that of course).
>
> Or is this possible already?

This is not possible yet. It is on my wishlist for a future program.

Of course remotely detecting files that have moved between directories might mean having the server hash every file on the filesystem. So it might be quite expensive...

--
Martin
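The rename-detection idea above can be sketched roughly as follows. This is an illustration only and not an rsync feature: it assumes both trees are visible as local paths, hashes every file on both sides (which is exactly the cost Martin warns about), and uses hypothetical helper names digest_tree and plan_renames.

```python
import hashlib
import os


def digest_tree(root):
    """Map MD5 hex digest -> sorted list of relative paths under root."""
    by_digest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            h = hashlib.md5()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    h.update(chunk)
            rel = os.path.relpath(path, root)
            by_digest.setdefault(h.hexdigest(), []).append(rel)
    for paths in by_digest.values():
        paths.sort()
    return by_digest


def plan_renames(src_root, dst_root):
    """Return sorted (old_rel, new_rel) pairs that could be renamed at the destination.

    A pair is produced when a destination file no longer exists at that path
    in the source, but identical content exists at a source path that the
    destination lacks.
    """
    src = digest_tree(src_root)
    dst = digest_tree(dst_root)
    src_paths = {p for paths in src.values() for p in paths}
    dst_paths = {p for paths in dst.values() for p in paths}
    renames = []
    for digest, new_paths in src.items():
        missing = [p for p in new_paths if p not in dst_paths]
        stale = [p for p in dst.get(digest, []) if p not in src_paths]
        renames.extend(zip(stale, missing))
    return sorted(renames)
```

Applying the returned pairs with a rename before running the normal transfer would leave only genuinely new or changed files to send.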
Re: Add a feature : disk and partition cloning
On 2 Sep 2003 francis.mit [EMAIL PROTECTED] wrote:
> Bonjour,
>
> Today I use rsync for updating some 40 Debian/Linux boxes; rsync is great. Now I need to update a whole disk or partition (NTFS) from an image or another disk or partition (the multiboot case). Can I hope rsync will do this task some day?

I agree that it would be a cool feature. It's unlikely that the existing codebase would be extended for it, but something like rdiff might support it eventually. In the meantime, just dd across ssh.

> The rsync algorithm would be great for this task, wouldn't it?

Not directly; the basic rsync algorithm cannot update in place. You might adapt it to do so though.

> I don't mind how much work this feature needs, but are some folks interested?

You don't mind how much work other people do for you? How gracious. Or were you volunteering to write it? If so, adding in-place updates to rdiff would be a good place to start.

--
Martin
Re: rsync daemon and secrets file
On Mon, 25 Aug 2003 12:49:36 -0400 Hardy Merrill [EMAIL PROTECTED] wrote:
> rsync -avv [EMAIL PROTECTED]::test-secret/one_secret /tmp/rsync_test_secret

Yes, that's better.

> Although 'man rsync' does technically describe this in the CONNECTING TO AN RSYNC SERVER OVER A REMOTE SHELL PROGRAM section with this command:
>
>     rsync -av --rsh="ssh -l ssh-user" [EMAIL PROTECTED]::module[/path] local-path

But that's not what you're doing here. You're just connecting to an rsync server over TCP.

> IMHO, it would enhance user understanding to provide a concrete EXAMPLE of this. Also, it would help in 'man rsyncd.conf' not only to see an example of an rsyncd.conf file, but also to see examples of the different transfers that could be done with that rsyncd.conf file. I'm not criticizing, just merely noticing an area that, given some attention, could increase user understanding and decrease support.

Could you please draft a couple of paragraphs to add to the manual that you think would improve it? If you post them here I will check them and commit them.

Thanks,
--
Martin

GNU does not eliminate all the world's problems, only some of them. -- The GNU Manifesto
Re: mknod / rsync error
On 22 Aug 2003 16:11:21 +0200 Lars Bungum [EMAIL PROTECTED] wrote:
> Greetings!
>
> I'm experiencing these problems as described in this mail:
>
> ---
> From: Thomas Quinot ([EMAIL PROTECTED])
> Subject: Rsync 2.5.5: FreeBSD mknod can't create FIFO's
> Newsgroups: mailing.unix.rsync
> Date: 2002-06-24 06:05:25 PST
>
> The following patch (adapted to rsync 2.5.5 from the one posted in Dec. 2000, http://lists.samba.org/pipermail/rsync/2000-December/003349.html)

n.b. the patch quoted in this mail was truncated.

That looks reasonable to me.

--
Martin

GNU does not eliminate all the world's problems, only some of them. -- The GNU Manifesto
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 8 Jun 2003, Donovan Baarda [EMAIL PROTECTED] wrote:
> regarding librsync... It is still in sort-of-active development on SourceForge by a variety of developers... a new release is waiting in CVS for me to finally get around to releasing it, but I'm busy on a big contract at the moment so it's currently on hold pending some more cygwin/win32 testing. It is in active use by projects like rdiff-backup.
>
> AFAIK, rproxy is pretty much dead, and the only version that exists depends on a very old version of libhsync. The closest thing to this available now is the http proxy proof of concept with xdelta, but it's radically different in many ways to the old rproxy (due to xdelta not using signatures).

The main reason why rproxy is dead is that dynamically-generated HTML files, where in principle rproxy does best, are just not a very important problem for many people at the moment. For users with ADSL or better, HTML is not a problem; binaries may be. I realize there are people in the world who are still using modem connections where rproxy might be a win, but I don't see any of them asking for commit access.

A positive thing for rproxy is that both Mozilla and Apache2 are pretty stable now, and they have good interfaces for adding streaming compression support. Neither Apache 1.2 nor Netscape, which were dominant when I started on it, could handle this very well.

> This is largely still true, except libhsync changed back to librsync and now has its own project on SourceForge separate from the mostly defunct rproxy. librsync itself has no wire format, being just a general purpose signature/delta/patch library implementing the rsync algorithm.
>
> The comments about rsync never using libhsync/librsync are still true for the foreseeable future. There are many things rsync includes that are still missing from librsync, and the rsync implementation is very tightly coupled, with many backwards compatibility issues. Even when librsync reaches the point of being as good or better than rsync at signature/delta/patch calculation, it would be a major task to fit it into rsync.

I think it's best at the moment to let rsync continue as a nice stable program, good at what it does. Wayne and I have been toying with replacements in the background and perhaps in the future something better will come out, and perhaps it will use librsync.

> rsync also has more active development, mostly in the form of incremental feature additions and the resulting bugfix fire-fighting, all of which lead to an even more tangled implementation. Occasionally there are efforts to re-write and clean up sections of the code, but they are (rightly) regarded cautiously because of the breakage risk involved for little immediate gain.
>
> The librsync code in CVS is still largely not very good. It is pretty messy and needs a good cleanup.

True. :-/ I think I got a bit mentally twisted up by trying to support nonblocking operation, which I still think is very important to the library being generally useful. Doing this in C is a bit hard. But it could certainly be done much better.

The other thing I would really like to see is a more thorough test suite.

> The API is mostly OK though, and it _does_ work quite well, with no known bugs. I have some plans for a major cleanup and optimisation of the code based on my experiences with pysync. I have a patch submitted that I plan to commit after the next release that optimises and cleans up the delta calculation code quite a bit.
>
> The next big thing in delta calculation is probably going to be the vcdiff encoding format, which should allow a common delta format for various applications and supports self-referencing deltas, which makes it capable of compression. According to the xdelta project this has already been implemented, and I'm keen to see Josh's code, as it could be used as the basis for a cleanup/replacement of at least the patch component of librsync.

Yes, that sounds good.

> I think someone has a Perl wrapper for librsync that was being used as a test bed for rsync 3 type development (superlifter?).

superlifter was my prototype. It uses Python, and in fact just calls out to rdiff at the moment. At the moment I see it as another layer above librsync/rdiff that provides pipelined delta-compressed remote network IO, optionally over SSL or SSH. On top of this you could build a batch transfer like rsync 2.6, or an interactive client, or a backup system like Duplicity, or a real-time mirror based on dnotify.

> For the future I can see continued support of the existing rsync code. I would also like to see librsync adopt vcdiff as its delta format, and get a major cleanup, possibly by re-using some xdelta code. There are many common elements to the xdelta and rsync algorithms, and I see no reason why a single library couldn't support both (as pysync does). It would be nice if librsync and/or xdelta could become _the_ delta library.

I heartily
Re: patch draft for extended attributes on linux
On 25 Jun 2003, Wayne Davison [EMAIL PROTECTED] wrote:
> On Wed, Jun 25, 2003 at 10:34:38AM +1000, Martin Pool wrote:
> > There is no mtime for xattrs, so they are transferred every time as part of the file list.
>
> One possibly better solution would be to create some kind of CRC of the xattr data (MD4/MD5/whatever) and send just that in the file list for each file. This would allow you to figure out when to update the xattr data, but the protocol would need to be modified to send the xattr data during the file-update phase (and possibly to allow the receiver to request just an xattr update without doing a file update).

That's a pretty good idea. For the moment I just wanted a minimal patch, as traffic size is not an overwhelming consideration for the particular user I was helping.

However, for many realistic cases the xattrs are quite small. It is entirely possible for a file's attribute names and values together to be smaller than a 20-byte SHA1. (Well, perhaps not with my inefficient packing, but in principle they might be.)

In cases where xattrs are used for security information, it might not be sufficient to apply them just at the end of the transfer. That might make the permissions on the temporary file too weak. Or perhaps not; I just didn't want to think about it. :-)

--
Martin
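The trade-off discussed above (send a checksum of the xattr data, unless the data itself is smaller than the checksum) can be sketched like this. It is a toy illustration under stated assumptions: the packing format, the "raw"/"sha1" tags and the function names are invented here and are not the encoding used by the draft patch or by any rsync release.

```python
import hashlib
import struct


def pack_xattrs(xattrs):
    """Serialise {name: value} (bytes -> bytes) as length-prefixed pairs, in name order.

    Hypothetical layout: 2-byte name length, 4-byte value length, name, value.
    """
    out = bytearray()
    for name in sorted(xattrs):
        value = xattrs[name]
        out += struct.pack("!HI", len(name), len(value)) + name + value
    return bytes(out)


def file_list_entry(xattrs):
    """Choose what to put in the file list for one file's xattrs.

    Returns ("raw", packed) when the packed data is no larger than a SHA-1
    digest, otherwise ("sha1", digest) so the receiver can ask for the data
    only when its own digest differs.
    """
    packed = pack_xattrs(xattrs)
    digest = hashlib.sha1(packed).digest()  # always 20 bytes
    if len(packed) <= len(digest):
        return ("raw", packed)
    return ("sha1", digest)
```

With this particular packing, a single attribute named user.a with a one-byte value packs to 13 bytes and is sent raw, while anything over 20 bytes is replaced by its digest.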
patch draft for extended attributes on linux
This draft patch adds support for transferring extended attributes with a new --xattr option. It ought to work on Linux with XFS or ext2/ext3 filesystems with the SGI/bestbits attribute system.

It is partially working, but there seems to be some kind of hang bug while transferring the file list. I suspect it might be provoking a problem in io.c.

You need to rerun autoconf, autoheader and configure after applying it.

There is no mtime for xattrs, so they are transferred every time as part of the file list. This means that they will be updated correctly if you change attributes but do not change the file.

I wrote this because it was required by a colleague. I have mixed feelings about whether this ought to be merged, even once it's working correctly. rsync hardly needs more options or protocol variations. :-( (Amusingly enough I once said -xattr instead of --xattr and it silently did something else.)

diff -urpdN -x .ignore -x packaging -x cvs.log -x configure -x config.h.in -x autom4te.cache -x config.log -x .cvsignore -x dummy -x .svn -x ID -x TAGS rsync-2.5.6/Makefile.in xa/Makefile.in
--- rsync-2.5.6/Makefile.in	2003-01-21 05:26:14.000000000 +1100
+++ xa/Makefile.in	2003-06-24 15:08:09.000000000 +1000
@@ -34,7 +34,7 @@ OBJS1=rsync.o generator.o receiver.o cle
 	main.o checksum.o match.o syscall.o log.o backup.o
 OBJS2=options.o flist.o io.o compat.o hlink.o token.o uidlist.o socket.o \
 	fileio.o batch.o clientname.o
-OBJS3=progress.o pipe.o
+OBJS3=progress.o pipe.o xattr.o
 DAEMON_OBJ = params.o loadparm.o clientserver.o access.o connection.o authenticate.o
 popt_OBJS=popt/findme.o popt/popt.o popt/poptconfig.o \
 	popt/popthelp.o popt/poptparse.o
diff -urpdN -x .ignore -x packaging -x cvs.log -x configure -x config.h.in -x autom4te.cache -x config.log -x .cvsignore -x dummy -x .svn -x ID -x TAGS rsync-2.5.6/cleanup.c xa/cleanup.c
--- rsync-2.5.6/cleanup.c	2003-01-27 14:35:08.000000000 +1100
+++ xa/cleanup.c	2003-06-24 16:16:58.000000000 +1000
@@ -26,7 +26,7 @@
  * shutdown() of socket connections.  This eliminates the abortive
  * TCP RST sent by a Winsock-based system when the close() occurs.
  **/
-void close_all()
+void close_all(void)
 {
 #ifdef SHUTDOWN_ALL_SOCKETS
 	int max_fd;
diff -urpdN -x .ignore -x packaging -x cvs.log -x configure -x config.h.in -x autom4te.cache -x config.log -x .cvsignore -x dummy -x .svn -x ID -x TAGS rsync-2.5.6/configure.in xa/configure.in
--- rsync-2.5.6/configure.in	2003-01-28 16:27:40.000000000 +1100
+++ xa/configure.in	2003-06-24 20:27:45.000000000 +1000
@@ -5,7 +5,7 @@ AC_CONFIG_SRCDIR([byteorder.h])
 AC_CONFIG_HEADER(config.h)
 AC_PREREQ(2.52)
 
-RSYNC_VERSION=2.5.6
+RSYNC_VERSION=2.5.6-xa
 AC_SUBST(RSYNC_VERSION)
 
 AC_MSG_NOTICE([Configuring rsync $RSYNC_VERSION])
@@ -267,6 +267,7 @@ AC_CHECK_HEADERS(glob.h mcheck.h sys/sys
 AC_CHECK_HEADERS(netdb.h)
 AC_CHECK_HEADERS(malloc.h)
 AC_CHECK_HEADERS(float.h)
+AC_CHECK_HEADERS(attr/xattr.h)
 
 AC_CHECK_SIZEOF(int)
 AC_CHECK_SIZEOF(long)
@@ -414,6 +415,7 @@ AC_CHECK_FUNCS(waitpid wait4 getcwd strd
 AC_CHECK_FUNCS(fchmod fstat strchr readlink link utime utimes strftime)
 AC_CHECK_FUNCS(memmove lchown vsnprintf snprintf asprintf setsid glob strpbrk)
 AC_CHECK_FUNCS(strlcat strlcpy strtol mtrace mallinfo setgroups)
+AC_CHECK_FUNCS(lgetxattr)
 
 AC_CACHE_CHECK([for working socketpair],rsync_cv_HAVE_SOCKETPAIR,[
 AC_TRY_RUN([
diff -urpdN -x .ignore -x packaging -x cvs.log -x configure -x config.h.in -x autom4te.cache -x config.log -x .cvsignore -x dummy -x .svn -x ID -x TAGS rsync-2.5.6/flist.c xa/flist.c
--- rsync-2.5.6/flist.c	2003-01-19 05:00:23.000000000 +1100
+++ xa/flist.c	2003-06-25 08:29:52.000000000 +1000
@@ -1,7 +1,7 @@
 /*
    Copyright (C) Andrew Tridgell 1996
    Copyright (C) Paul Mackerras 1996
-   Copyright (C) 2001, 2002 by Martin Pool [EMAIL PROTECTED]
+   Copyright (C) 2001-2003 by Martin Pool [EMAIL PROTECTED]
 
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
@@ -422,6 +422,12 @@ static void send_file_entry(struct file_
 	}
 #endif
 
+#if SUPPORT_XATTRS
+	if (opt_xattr) {
+		xalist_send(f, file->xattrs);
+	}
+#endif
+
 #if SUPPORT_HARD_LINKS
 	if (preserve_hard_links && S_ISREG(file->mode)) {
 		if (remote_version < 26) {
@@ -457,7 +463,9 @@ static void send_file_entry(struct file_
 }
 
 
-
+/**
+ * This matches up with send_file_entry()
+ **/
 static void receive_file_entry(struct file_struct **fptr,
 			       unsigned flags, int f)
 {
@@ -555,6 +563,13 @@ static void receive_file_entry(struct fi
 			sanitize_path(file->link, file->dirname);
 		}
 	}
+
+#if SUPPORT_XATTRS
+	if (opt_xattr) {
+		xalist_receive(f, file);
+	}
+#endif
+
 #if SUPPORT_HARD_LINKS
 	if (preserve_hard_links
Re: patch draft for extended attributes on linux
On 24 Jun 2003, jw schultz [EMAIL PROTECTED] wrote:
> I don't much care for sending the xattrs as part of the file list. Even the 4KB ext[23] _currently_ limit it to is huge.

I would have preferred to do it during the regular transfer, rather than in the file list, but that seemed to make it a bit harder to ensure that the attributes were always applied even if the file was not otherwise modified, or if it were a symlink, etc.

> Nevertheless I do think it worth having something in patches.

Yes, if I can get this working I think the best place for it to end up is as an unofficial patch.

--
Martin
Re: patch draft for extended attributes on linux
On 24 Jun 2003, jw schultz [EMAIL PROTECTED] wrote:
> That lack of an mtime for xattr could well cause difficulties for backup systems as well. Perhaps a note to the filesystems people is in order. The problem is that you can't use mtime for these. It really needs its own timestamp, perhaps as a mandatory system attribute.

Yes, I think so. If that still makes sense when I'm finished on this I will send mail. Perhaps it would fit well into the reiserfs 'tree of small things' model of the world.

> I don't much care for sending the xattrs as part of the file list. Even the 4KB ext[23] _currently_ limit it to is huge.

I'm not sure what is typical here. The situation I'm working on is replicating a Samba share which is storing ACLs and EAs in XFS EAs. Most of them will be pretty small, and most files won't have them.

For a small tree with a short XA on each file and no other changes, it's like this:

    [data]$ ~/work/rsync/xa/rsync -aPzv distcc-2.7.1/ dest --xattr
    building file list ...
    161 files to consider
    wrote 12365 bytes  read 20 bytes  24770.00 bytes/sec
    total size is 1115540  speedup is 90.07

    [data]$ ~/work/rsync/xa/rsync -aPzv distcc-2.7.1/ dest
    building file list ...
    161 files to consider
    wrote 3027 bytes  read 20 bytes  6094.00 bytes/sec
    total size is 1115540  speedup is 366.11

I think it's tolerable.

--
Martin
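To put the two runs above side by side, here is a small worked calculation using only the numbers printed in the transcript. The variable names are invented for the example, and the interpretation (that the whole difference in bytes written is xattr overhead) is an assumption, since the runs differ only in the --xattr flag.

```python
FILES = 161
TOTAL_SIZE = 1115540  # "total size is" from both runs

with_xattr = {"wrote": 12365, "read": 20}
without_xattr = {"wrote": 3027, "read": 20}


def speedup(run):
    """Total size divided by bytes actually moved in both directions."""
    return TOTAL_SIZE / (run["wrote"] + run["read"])


# Extra bytes written when --xattr is given, and the average per file.
extra_bytes = with_xattr["wrote"] - without_xattr["wrote"]
extra_per_file = extra_bytes / FILES

print(f"extra bytes with --xattr: {extra_bytes}")
print(f"average extra per file:   {extra_per_file:.1f}")
print(f"speedup with --xattr:     {speedup(with_xattr):.2f}")
print(f"speedup without:          {speedup(without_xattr):.2f}")
```

The difference is 9338 bytes, or 58 bytes per file on average for this tree, and the computed speedups agree with the 90.07 and 366.11 that rsync reported.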
Re: Oops more testing was required....
On 17 Jun 2003, Rogier Wolff [EMAIL PROTECTED] wrote:
> Oops. Missed one line in the last patch

Thank you. That looks good.

If we're going to make this more accurate it might be worthwhile to actually look at how long we really did sleep for, and use that to adjust time_to_sleep rather than resetting to zero.

Also I'd prefer the variable be called micros_to_sleep or us_to_sleep. Small point I know.

> diff -ur rsync-2.5.6.orig/io.c rsync-2.5.6/io.c
> +++ rsync-2.5.6/io.c Tue Jun 17 23:43:49 2003
> @@ -416,10 +416,19 @@
>   * use a bit less bandwidth than specified, because it doesn't make up
>   * for slow periods.  But arguably this is a feature.  In addition, we
>   * ought to take the time used to write the data into account.
> + *
> + * During some phases of big transfers (file XXX is uptodate) this is
> + * called with a small bytes_written every time. As the kernel has to
> + * round small waits up to guarantee that we actually wait at least
> + * the requested number of microseconds, this can become grossly
> + * inaccurate. We therefore keep a cumulating number of microseconds
> + * to wait, and only actually perform the sleep when the rounding
> + * becomes insignificant. (less than 10%) -- REW.
>   **/
>  static void sleep_for_bwlimit(int bytes_written)
>  {
>  	struct timeval tv;
> +	static int time_to_sleep = 0;
>
>  	if (!bwlimit)
>  		return;
> @@ -427,9 +436,13 @@
>  	assert(bytes_written >= 0);
>  	assert(bwlimit > 0);
>
> -	tv.tv_usec = bytes_written * 1000 / bwlimit;
> -	tv.tv_sec = tv.tv_usec / 1000000;
> -	tv.tv_usec = tv.tv_usec % 1000000;
> +	time_to_sleep += bytes_written * 1000 / bwlimit;
> +
> +	if (time_to_sleep < 100000) return;
> +
> +	tv.tv_sec = time_to_sleep / 1000000;
> +	tv.tv_usec = time_to_sleep % 1000000;
> +	time_to_sleep = 0;
>
>  	select(0, NULL, NULL, NULL, &tv);
>  }

--
Martin
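The suggestion above, to measure how long the sleep really took and carry the difference forward instead of resetting to zero, might look roughly like the following. This is a sketch in Python rather than a patch to io.c. The class name, the injectable clock and sleep parameters, and the choice to let the owed time go negative after an oversleep are all assumptions made for the example; the 1000/bwlimit scaling and the 100000 microsecond threshold are taken from the quoted patch.

```python
import time


class BandwidthLimiter:
    """Accumulate owed sleep time and only sleep once it is worth the rounding."""

    MIN_SLEEP_US = 100_000  # threshold from the quoted patch

    def __init__(self, kbytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.kbytes_per_sec = kbytes_per_sec
        self.clock = clock
        self.sleep = sleep
        self.owed_us = 0  # microseconds we still owe; negative means credit

    def wrote(self, nbytes):
        """Record nbytes written; sleep if enough time is owed.

        Returns the measured sleep in microseconds (0 if no sleep happened).
        """
        if not self.kbytes_per_sec:
            return 0
        # Same scaling as the C code: bytes * 1000 / (KB per second).
        self.owed_us += nbytes * 1000 // self.kbytes_per_sec
        if self.owed_us < self.MIN_SLEEP_US:
            return 0
        start = self.clock()
        self.sleep(self.owed_us / 1_000_000)
        slept_us = round((self.clock() - start) * 1_000_000)
        # Subtract what we really slept instead of resetting to zero, so an
        # oversleep is paid back by shorter waits later.
        self.owed_us -= slept_us
        return slept_us
```

Injecting the clock and sleep functions keeps the arithmetic testable without real delays; in normal use the defaults give wall-clock behaviour.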
Re: Oops more testing was required....
On 18 Jun 2003, jw schultz [EMAIL PROTECTED] wrote:
> On Wed, Jun 18, 2003 at 09:09:59PM +1000, Martin Pool wrote:
> > On 17 Jun 2003, Rogier Wolff [EMAIL PROTECTED] wrote:
> > > Oops. Missed one line in the last patch
> >
> > Thank you. That looks good.
> >
> > If we're going to make this more accurate it might be worthwhile to actually look at how long we really did sleep for, and use that to adjust time_to_sleep rather than resetting to zero.
>
> That would have to be a platform specific thing since not all systems modify the timeout value to reflect the amount of time not slept. Nevertheless that is a nice idea.

Right, I know that is not portable but I forgot to say so. As Rogier says, you need to call gettimeofday() or some such.

--
Martin
Re: Smoother bandwidth limiting
On 4 Feb 2003, jw schultz [EMAIL PROTECTED] wrote:
> Yes but I'd like to hear from some people who know network performance programming.

I know only enough to be mildly dangerous. :-)

I don't think you can do this optimally in userspace, because there is lots of buffering between what we write to the kernel and what gets onto the wire, which is generally what the user cares about. It will interact with the MTU, which is generally small enough not to matter, but also with the TCP window size.

I think by throttling our connection we will also change the TCP window dynamic behaviour. In particular with no bwlimit rsync will often be blocked on network IO, but it may not be with bwlimit. This might make a difference to whether the Nagle algorithm comes into effect to get packets pushed out.

There is also some kind of interaction with routers with their own queues (as for ADSL, etc), and performance on fast networks may be very different. So I would be a bit cautious of applying patches based on one person's experience.

Doing larger writes is likely to make the bandwidth more jerky, as the kernel buffer is filled up, drains, and then pauses. That might make rsync's interaction with interactive traffic more harmful than it ought to be. But bringing it right down to 1024B doesn't sound good: it's likely to generate smaller-than-MTU packets, which nobody really wants.

So by all means tweak it, but I think trying to make it run at the exact specified limit is unlikely to pay off.

--
Martin
Re: Smoother bandwidth limiting
On 15 May 2003, Paul Slootman [EMAIL PROTECTED] wrote:
> I can't really see that doing smaller writes will lead to packets being padded, unless you're doing really small writes (ref. the ATM 48-byte packets); the TCP and IP headers will always be added, which means that the extra overhead of those will have a larger impact than any padding.
>
> So, I'd suggest that 1024 isn't that bad a number for all cases; it'll fit comfortably into most MTU sizes, and for dialup PPP it'll be split into two packets without that much overhead. If not concerned with the dialup PPP case, I'd go for something like 1400.

Of course a write() does not necessarily correspond to a TCP frame, which does not necessarily correspond to an IP packet. But nevertheless I would suggest avoiding writes that are this short.

In addition to the headers that Paul mentioned, there are other per-packet costs such as Ethernet lead-in and trailer times, and the hardware, interrupt and OS overhead for processing packets.

Consider also that some people use rsync on fast networks, and they won't appreciate small packets *or* getting more system calls to process a given amount of data.

Needlessly causing each packet to hold 30% less data than it normally would is very wasteful. The point of bwlimit is after all to help users have more bandwidth for other applications.

Checking for bwlimit after every, say, 4k I can imagine, but below that is dubious. I'm happy to be proved wrong though.

--
Martin
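The "30% less data" figure above is consistent with a simple back-of-envelope calculation, sketched here under assumptions that are not stated in the thread: a 1500-byte Ethernet MTU, a 20-byte IPv4 header and a 20-byte TCP header with no options. As noted above, a write() does not map one-to-one onto packets, so this is only the worst case where each 1024-byte write goes out as its own segment.

```python
MTU = 1500        # assumed Ethernet MTU, bytes
IP_HEADER = 20    # assumed IPv4 header without options
TCP_HEADER = 20   # assumed TCP header without options

FULL_PAYLOAD = MTU - IP_HEADER - TCP_HEADER  # 1460 bytes of data per full packet


def shortfall(write_size):
    """Fraction of a full packet's payload left unused by one short write."""
    return 1 - write_size / FULL_PAYLOAD


def header_overhead(payload):
    """Fraction of each packet taken up by the IP and TCP headers."""
    headers = IP_HEADER + TCP_HEADER
    return headers / (payload + headers)


for size in (1024, 1400, FULL_PAYLOAD):
    print(f"{size:5d}-byte writes: {shortfall(size):6.1%} less data per packet, "
          f"{header_overhead(size):5.1%} header overhead")
```

Under these assumptions a 1024-byte segment carries about 30% less data than a full one, while Paul's 1400-byte suggestion leaves only a few percent unused.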
Re: You have emailed an address at dslreports.com
On 16 Jun 2003, Lapo Luchini [EMAIL PROTECTED] wrote:
> Each time I send a message to the ML I receive this message... (this misled me into double-posting some days ago).
>
> Could someone please unsubscribe the blocked address? But I guess that's not possible, as anyone else should have noticed this, too... =(

Done. (I saw it too.)

--
Martin
Re: Interactive Rsync Authentication Problem
On 29 May 2003, Andrew Klein [EMAIL PROTECTED] wrote:
> The getpassphrase() call is identical to getpass() except it returns 256 chars maximum. Of course you would have to mess with autoconf but I don't think that should be too hard.
>
> Based on the autoconf stuff in the latest rsync release, the compile check would be something along these lines:
>
> AC_CACHE_CHECK([for getpassphrase],rsync_cv_HAVE_GETPASSPHRASE,[
> AC_TRY_COMPILE([#include <unistd.h>],
> [char *pass; pass = getpassphrase("Password: ");],
> rsync_cv_HAVE_GETPASSPHRASE=yes,rsync_cv_HAVE_GETPASSPHRASE=no)])
> if test x$rsync_cv_HAVE_GETPASSPHRASE = xyes; then
>     AC_DEFINE(HAVE_GETPASSPHRASE)
> fi

Can you try that and tell us if it actually works? It's OK if you can't get the autoconf stuff straight, but it would be good to know that getpassphrase() actually solves the problem before worrying.

Better yet, send a patch that adds an appropriately-licenced readpassphrase()/getpassphrase() to the lib/ directory?

Someone wrote:
> I love the fact that the man page for getpass() under Linux says don't use this, but does not provide any alternative. Mmmm... Linux - it's so secure! ;-)

Solaris fnmatch("ass", "hat", 0) used to return true!

--
Martin
Re: support@microsoft e-mails is a VIRUS
On 20 May 2003, jw schultz [EMAIL PROTECTED] wrote:
> > Is there any way you can stop sending these e-mails to everybody on the list? I've received maybe 3 or 4 of them since yesterday.
>
> One possible solution to reduce the spam/virus traffic on the list would be to close the list so that only people on the list can send to it. The rsync team has, so far, rejected that approach.

We want to keep the list as open as possible.

Many people post to the list without subscribing, because it is the main support forum for a product. It is not really closed in the way that a list for a development team is. So there would be a lot of mails blocked.

If they're automatically bounced then it is annoying for rsync users. If they're deferred then the delay is annoying, and somebody needs to spend time reading through the queue. At the moment I don't think that would be a good use of time.

The only real solution is to send spammers and virus writers to jail.

In the meantime we have set up spam and virus filters. As jw says, you are only seeing a small fraction of the literally hundreds of attacks we suffer every day.

--
Martin
Re: Feature request: true multiple sources
On 14 Jun 2003, Gregory Brauer [EMAIL PROTECTED] wrote:
> I am a big fan of rsync, but the more I use it, the more I become frustrated at rsync's asymmetrical functionality. For instance, I can do this:
>
>     rsync /A/ /B/ desthost:/AB
>
> but not this:
>
>     rsync srchost:/A/ srchost:/B/ /AB

rsync allows remote shell wildcards:

    rsync 'srchost:/{A,B}/' /AB

The limitations are in your own mind. (Well, at least this one is. :-)

--
Martin
Re: Multistreaming rsync
On 10 Feb 2003, Cockram, Michael L (ISI) [EMAIL PROTECTED] wrote:
> Newbie here! I am not sure if this is possible or not, but is it possible to multistream the connections that rsync is making? Say I had a directory with a bunch of huge files. Is there a way of telling rsync to make multiple connections for different groups of files? Am I making sense?

Just run different rsync processes for different subdirectories. There is no support in the program itself.

> Are there TCP window limitations on rsync like ftp has?

What do you mean? TCP windows are pretty much invisible to applications.

--
Martin
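The advice above, one rsync process per subdirectory, can be scripted along these lines. This is a minimal sketch with invented function names. It assumes rsync is on the PATH, that the destination directory already exists, and that plain -a is the wanted option set; files sitting directly in the top-level source directory are not covered and would need one more rsync run.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(src_root, dest, rsync="rsync"):
    """Return one rsync argv list per immediate subdirectory of src_root.

    dest may be a local path or a host:/path style destination.
    """
    commands = []
    for name in sorted(os.listdir(src_root)):
        path = os.path.join(src_root, name)
        if os.path.isdir(path):
            target = dest.rstrip("/") + "/" + name + "/"
            commands.append([rsync, "-a", path + "/", target])
    return commands


def run_parallel(commands, workers=4):
    """Run the commands concurrently and return their exit codes in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda argv: subprocess.run(argv).returncode, commands))
```

Whether several streams actually finish sooner than one depends on where the bottleneck is (disk, CPU or network), so it is worth measuring with a small worker count first.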
CVS update: rsync
Date: Tue Jun 17 04:46:32 2003
Author: mbp

Update of /data/cvs/rsync
In directory dp.samba.org:/tmp/cvs-serv10975

Modified Files:
    authenticate.c

Log Message:
Add a comment about using getpassphrase() or readpassphrase() rather than getpass(). No code change.

Revisions:
    authenticate.c 1.22 => 1.23
    http://www.samba.org/cgi-bin/cvsweb/rsync/authenticate.c.diff?r1=1.22&r2=1.23

___
rsync-cvs mailing list
[EMAIL PROTECTED]
http://lists.samba.org/mailman/listinfo/rsync-cvs
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 12 Jun 2003, jw schultz [EMAIL PROTECTED] wrote:

> Mind you, that means making the server lightweight with the client doing all the logic and a nearly stateless connection. Much like my earlier post on this thread posited.

I was wondering today if that would make it easier to gain confidence in the design's security. Making the semantics of an operation less dependent on a lot of accumulated state probably helps, all other things being equal.

--
Martin
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 12 Jun 2003, Brad Hards [EMAIL PROTECTED] wrote:

> On Wed, 11 Jun 2003 11:25 am, Martin Pool wrote:
> > That could be a pretty nice thing. We use little rsync shares on workstations here for sharing files, and I know some people do the same with FTP. What aside from SLP would make this more useful?
>
> A standardised way of describing the share would be good. By this, I don't mean a software implementation, but a user / admin configuration. Think Standard Operating Procedures.
>
> The other thing that would be nice would be a search capability - find me the shares with a copy of rsync-2.5.6.tar.bz2

OK, interesting.

> 1. I'm thinking about something that, as a minimum, doesn't do plain text passwords. I admire clever attacks as much as the next guy, but the next guy doesn't want some kewl hax0r with a copy of tcpdump uploading warez either. Probably SASL is worth a look.

Yes, SASL looks like the way to go, at least for authentication. Some things I read indicate that SASL is not a good choice for encryption/integrity. So perhaps we should use SASL just for authentication, and SSL for confidentiality/integrity.

> Does that make any sense? Why run this _only_ over TCP? Obviously you don't want to re-invent TCP/IP error handling, but the protocol shouldn't rely on such a system. File transfer can potentially run connectionless.

It sounds like you're talking about something like NFS (XDR-RPC) that can run over UDP or TCP?

I wouldn't rely on TCP specifically, but I think it's OK to rely on a byte stream channel, such as TCP or SSH. I suppose if you're going to do UDP then you might want to try to do multicast too, but that makes things like error handling a lot harder.

But I do think there should be a layer at which there are distinct messages, and that what goes under that might be something other than a byte stream in future.
-- Martin
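The idea of a message layer sitting on top of a byte stream (TCP, SSH, or something else later) can be illustrated with length-prefixed framing. This is a minimal sketch rather than any protocol proposed in the thread: the 4-byte big-endian length header and the function names are assumptions made for the example.

```python
import io
import struct

# Assumed wire format: 4-byte big-endian payload length, then the payload.
HEADER = struct.Struct("!I")


def write_message(stream, payload):
    """Write one framed message to a binary stream."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError if the stream ends early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(stream):
    """Return the next payload, or None on a clean end of stream."""
    first = stream.read(1)
    if not first:
        return None
    header = first + read_exact(stream, HEADER.size - 1)
    (length,) = HEADER.unpack(header)
    return read_exact(stream, length)
```

Because the reader always knows where a message ends, a truncated message shows up as an explicit error instead of silently corrupting the next one. The same `read_message`/`write_message` interface could later sit on a transport that is not a byte stream at all.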
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 12 Jun 2003, jw schultz [EMAIL PROTECTED] wrote:

> Leave the communications protocol to the communications layer. You don't save anything by coding reordering and retransmission at the packet level; that is infrastructure.
>
> Connectionless is fine. Lightweight sessions is better. If you lose a connection a restart is possible. It is preferable to not have to authenticate and negotiate protocol versions and encryption with every message.
>
> Think in terms of transactions. Each transaction is atomic. If a transaction doesn't complete you have the means to roll-back and retry. If a connection breaks between transactions, or leaving a transaction incomplete, you start a new connection and pick up where you left off.

I agree with all this. To extend on what jw says:

I think it's fine to (if desired) negotiate SSL, authentication, and compression at the start of a connection. They generally require multiple round trips and it would be wasteful to do them more frequently when per-connection is natural.

On the other hand it would be nice if the client could pick up an interrupted transfer halfway through the tree, rather than needing to start from the beginning as rsync 2.x does.

--
Martin
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 11 Jun 2003, Donovan Baarda [EMAIL PROTECTED] wrote:

> On Wed, 2003-06-11 at 13:59, Martin Pool wrote:
> > On 11 Jun 2003, Donovan Baarda [EMAIL PROTECTED] wrote:
> > > The vcdiff standard is available as RFC3284, and Josh is listed as one of the authors.
> >
> > Yes, I've just been reading that. I seem to remember that it was around as an Internet-Draft when I started, but it didn't seem clear that it would become standard so I didn't use it.
>
> I'm not sure if this is the same one... I vaguely recall something like this too, but I think it was an attempt to add delta support to http and had the significant flaw of not supporting rsync's delta-from-signature. It may have come out of the early xdelta http proxy project. IMHO rproxy's http extensions for delta support were better because they were more general.

Yes, the most recent version of the Mogul delta-http proposal I read assumed that the server had a complete history of the document to generate diffs. This is fine if you're serving e.g. software distributions or content from a version control system and have the history, but not very general.

> I forget if I saw this in Tridge's thesis, but I definitely noticed that librsync uses a modified zlib to make feeding data to the compressor and throwing away the compressed output more efficient. I have implemented this in pysync too, though I don't use a modified zlib... I just throw the compressed output away.

Yes, I remember that, but that's not rzip.

By the way the gzip hack is an example of a place where I think a bit of extra compression doesn't justify cluttering up the code. I think I'd rather just compress the whole stream with plain gzip and be done.

See http://samba.org/~tridge/phd_thesis.pdf pg 86ff.

rzip is about using block search algorithms to find widely-separated identical blocks in a file. (I won't go into detail because tridge's explanation is quite clear.) I am pretty sure you could encode rzip into VCDIFF.
I am not sure if VCDIFF will permit an encoding as efficient as you might get from a format natively designed for rzip, but perhaps it will be good enough that using a standard format is a win anyhow. Perhaps building a VCDIFF and then using bzip/gzip/lzop across the top would be acceptable.

In fact rzip has more in common with xdelta than rsync, since it works entirely locally and can find blocks of any length.

rzip's advantage compared to gzip/bzip2 is that it can use compression windows of unlimited size, as compared to a maximum of 900kB for bzip2. Holding an entire multi-100MB file in memory and compressing it in a single window is feasible on commodity hardware.

> The self referencing compression idea is neat but would be a... challenge to implement. For it to be effective, the self-referenced matches would need to be non-block aligned like xdelta, which tends to suggest using xdelta to do the self-reference matches on top of rsync for the block aligned remote matches. Fortunately xdelta and rsync have heaps in common, so implementing both in one library would be easy (see pysync for an example).
>
> If I didn't have paid work I would be prototyping it in pysync right now. If anyone wanted to fund something like this I could make myself available :-)

I may get a chance to work full time on replication again soon, so I'm trying to work out where we're up to.

> Yeah, my big complaint about librsync at the moment is it is messy. Just cleaning up the code alone will be a big improvement. I would guess that at least 30% of the code could be trimmed away, leaving a cleaner and more extensible core, and because messy leads to inefficient, it would be faster too.

If I'd had more time this letter would have been shorter.

--
Martin
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 10 Jun 2003, Brad Hards [EMAIL PROTECTED] wrote:

> Yep. Also, I was playing with the idea of rsync with Service Location Protocol to use as a replacement for the crappy practice of sharing data over floppy disks. The rough concept was that each machine had a shared directory, which you could conveniently label and advertise over SLP.

That could be a pretty nice thing. We use little rsync shares on workstations here for sharing files, and I know some people do the same with FTP. What aside from SLP would make this more useful?

> Go superlifter! For what it is worth, the things I identified during the abortive kioslave / SLPv2 share development:
>
> 1. More secure than FTP.
> 2. Easy to label shares/directories and provide fine grained access control, if desired.
> 3. Client side library that doesn't require hellish text parsing, or at least hides it from you.
> 4. Well delimited packets, so you can tell when one has been dropped.

Can you give more detail on those? What do you mean by packets being dropped? How can that happen on a TCP channel?

--
Martin
Re: [librsync-devel] Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 11 Jun 2003, Donovan Baarda [EMAIL PROTECTED] wrote:

> The vcdiff standard is available as RFC3284, and Josh is listed as one of the authors.

Yes, I've just been reading that. I seem to remember that it was around as an Internet-Draft when I started, but it didn't seem clear that it would become standard so I didn't use it.

> I also had some correspondence with Josh ages ago where he talked about how self-referencing deltas can directly do compression of the miss data without using things like zlib and by default give you the benefits of rsync's context compression without the overheads (rsync runs a decompressor _and_ a compressor on the receiving end just to regenerate the compressed hit context data).

Something possibly similar is mentioned in tridge's thesis. I was talking to him a while ago and (iirc) he thought it would be good to try it again, since it does well with the large amounts of memory and CPU time that are available on modern machines.

I strongly agree with what you said a while ago about code simplicity being more valuable than squeezing out every last bit.

--
Martin
Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 9 Jun 2003, Brad Hards [EMAIL PROTECTED] wrote:

> On Sun, 8 Jun 2003 15:43 pm, Donovan Baarda wrote:
> > The comments about rsync never using libhsync/librsync are still true for the foreseeable future. There are many things rsync includes that are still missing from librsync, and the rsync implementation is very tightly coupled, with many backwards compatibility issues. Even when librsync reaches the point of being as good or better than rsync at signature/delta/patch calculation, it would be a major task to fit it into rsync.
>
> The downside to not having a library that is wire-compatible with rsync --daemon is that it is damn difficult to write something that works as a VFS / kioslave type device. I had a hack at this, by wrapping the rsync executable, and it worked a bit, but it was way too fragile for any real use: http://www.cuneata.net/rsync-kio.html

I guess the reason why you're interested in doing it is so that you can browse public rsync mirrors from Konqueror/whatever?

Speaking only for myself, I don't think this is worth spending time on. It would be hard to write a wire-compatible library, and hard to refactor rsync into such a library.

Not only might a new tool be written more easily without baggage, it might also (in a couple of years) persuade people running mirror sites to switch. I know many of them are unhappy with rsync at the moment:

- large memory usage
- no really good ways to restrict client usage
- ...

--
Martin
Re: state of the rsync nation? (revisited 6/2003 from 11/2000)
On 8 Jun 2003, Donovan Baarda [EMAIL PROTECTED] wrote:

> The next big thing in delta calculation is probably going to be the vcdiff encoding format, which should allow a common delta format for various applications and supports self-referencing deltas, which makes it capable of compression. According to the xdelta project this has already been implemented, and I'm keen to see Josh's code, as it could be used as the basis for a cleanup/replacement of at least the patch component of librsync.

Do you have a link for this? Josh plays his cards pretty close to his chest. The XDelta page seems to be even more inactive than librsync :-/

--
Martin
(fwd) PATCH: managing permissions with rsyncd.conf options
--
Martin

---BeginMessage---
This is a patch to control unix permissions when uploading to a rsyncd-server by setting rsyncd.conf options.

cu, Stefan
--
Stefan Nehlsen | ParlaNet Administration | [EMAIL PROTECTED] | +49 431 988-1260

rsyncd.conf options to handle file permissions (stolen from samba)

This patch is made to provide more control over the permissions of files and directories that are uploaded to a rsyncd-server.

Normally when files and directories are uploaded to a rsyncd they are created with the permissions of the source. Especially when user and group are set to special values using the uid and gid directives, it does not make much sense to use the source permission pattern.

There is a patch introducing a new chmod command line option, but normally you may want to control the permissions on the server side.

The patch below will allow you to modify file and directory permissions by using 4 new rsyncd.conf directives. I'm sure that those 2 patches will not break each other and it really makes sense to use them both.

You may know these options from samba :-)

create mask

When a file is created (or touched) by rsyncd the permissions will be taken from the source file bit-wise 'AND'ed with this parameter. This parameter may be thought of as a bit-wise MASK for the UNIX modes of a file. Any bit not set here will be removed from the modes set on a file when it is created.

The default value of this parameter is set to 0 to provide the default behaviour of older versions.

Following this rsync will bit-wise 'OR' the UNIX mode created from this parameter with the value of the force create mode parameter which is set to 000 by default.

This parameter does not affect directory modes. See the parameter directory mask for details.

See also the force create mode parameter for forcing particular mode bits to be set on created files. See also the directory mask parameter for masking mode bits on created directories.
Default: create mask = 0
Example: create mask = 0644

force create mode

This parameter specifies a set of UNIX mode bit permissions that will always be set on a file created by rsyncd. This is done by bitwise 'OR'ing these bits onto the mode bits of a file that is being created or having its permissions changed. The default for this parameter is (in octal) 000. The modes in this parameter are bitwise 'OR'ed onto the file mode after the mask set in the create mask parameter is applied.

See also the parameter create mask for details on masking mode bits on files.

Default: force create mode = 000
Example: force create mode = 0644

directory mask

When a directory is created (or touched) by rsyncd the permissions will be taken from the source directory bit-wise 'AND'ed with this parameter. This parameter may be thought of as a bit-wise MASK for the UNIX modes of a directory. Any bit not set here will be removed from the modes set on a directory when it is created.

The default value of this parameter is set to 0 to provide the default behaviour of older versions.

Following this rsync will bit-wise 'OR' the UNIX mode created from this parameter with the value of the force directory mode parameter which is set to 000 by default.

This parameter does not affect file modes. See the parameter create mask for details.

See also the force directory mode parameter for forcing particular mode bits to be set on created directories. See also the create mask parameter for masking mode bits on created files.

Default: directory mask = 0
Example: directory mask = 0755

force directory mode

This parameter specifies a set of UNIX mode bit permissions that will always be set on a directory created by rsyncd. This is done by bitwise 'OR'ing these bits onto the mode bits of a directory that is being created. The default for this parameter is (in octal) 000, which will not add any extra permission bits to a created directory.
This operation is done after the mode mask in the parameter directory mask is applied.

See also the parameter directory mask for details on masking mode bits on created directories.

Default: force directory mode = 000
Example: force directory mode = 0755

diff -ur rsync-2.5.5/loadparm.c rsync-2.5.5-umask/loadparm.c
--- rsync-2.5.5/loadparm.c Mon Mar 25 05:04:23 2002
+++ rsync-2.5.5-umask/loadparm.c Sun Mar 2 22:53:16 2003
@@ -140,6 +140,10 @@
 int timeout;
 int max_connections;
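The mask-then-force arithmetic described for create mask / force create mode (and for the directory equivalents) can be checked with a few lines of Python. This is a sketch of the documented rule only, not code from the patch: the function name is made up, and it does not model the special default of 0, which the description says preserves the old behaviour (presumably meaning "no masking").

```python
import stat


def apply_upload_mode(source_mode, mask, force):
    """Return (source permissions AND mask) OR force, as described above.

    source_mode may include file-type bits; only permission bits are used.
    """
    perms = stat.S_IMODE(source_mode)
    return (perms & mask) | force


# A world-writable source file uploaded with "create mask = 0644".
print(oct(apply_upload_mode(0o100777, 0o644, 0o000)))  # 0o644
# A private source directory with "directory mask = 0755" and
# "force directory mode = 0055".
print(oct(apply_upload_mode(0o040700, 0o755, 0o055)))  # 0o755
```

The order matters: bits removed by the mask can be put back by the force value, but never the other way round.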
(fwd from kladit@t-online.de) files of length zero
- Forwarded message from Klaus Dittrich [EMAIL PROTECTED] -

From: [EMAIL PROTECTED] (Klaus Dittrich)
Subject: files of length zero
Date: Tue, 11 Mar 2003 17:08:47 +0100
To: [EMAIL PROTECTED]
User-Agent: Mutt/1.4i
X-Bogosity: No, tests=bogofilter, spamicity=0.00, version=0.10.2

Hi Martin,

MS-Windows users here sometimes end up with files of length zero when something on Windows crashes. They often have many files open, and after a crash they don't realize that parts of their work got lost.

Every night a backup server using rsync copies those zero-length files and thereby destroys the copies backed up the day before by making them zero length too.

Can you build an option into rsync that handles files of length zero the same way as deleted ones, thereby preserving the old file?

--
Regards Klaus

- End forwarded message -

--
Martin
Re: rsync in-place (was Re: rsync 1tb+ each day)
On 4 Feb 2003, jw schultz [EMAIL PROTECTED] wrote:

> The reason why in-place updating is difficult is that rsync expects the unchanged blocks in the old file may be relocated. Data inserted into or removed from the file does not require the rest of the file to be retransmitted. Unchanged blocks will be copied from the old locations in the old file to new locations in the new file.
>
> In-place updates require that blocks not relocate. It may be possible by disallowing matches having differing offsets. That would require deeper investigation.

Of course the other place where people want this is for transfers of block devices, where the rename is just not possible.

I looked a little at doing this in librsync. The naive solution is to merely prohibit the delta from referring to blocks that have been already overwritten. I will probably eventually add at least this option.

You might try this in rsync. A lot of other code to do with e.g. setting permissions makes the assumption of the rename model, though. It would take a fair amount of testing.

Of course this model really falls down in some cases. Consider the case of one block inserted at the beginning. Then with the naive no-backreferences approach every block will be overwritten just before it's needed. :(

You can imagine a smarter algorithm that does non-sequential writes to the output so as to avoid writing over blocks that will be needed later. Alternatively, if you assume some amount of temporary storage, then it might be possible to still produce output as a stream.

Really for your problem the practical solution is just to dump the whole file, perhaps allowing for sparse blocks. As other people have observed, by design rsync does a lot more disk IO than network.

--
Martin
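The "naive no-backreferences" rule and the insert-at-the-beginning failure case can be shown with a toy model. This is only a sketch under simplifying assumptions: files are lists of whole, block-aligned blocks compared by equality, and the function names are invented here; neither rsync nor librsync works on data in this form.

```python
def plan_inplace(old_blocks, new_blocks):
    """Build a delta that is safe to apply in place, writing sequentially.

    Writing output block i destroys old block i, so a copy from old block j
    is only allowed when j >= i. Everything else must be sent as a literal.
    """
    positions = {}
    for j, block in enumerate(old_blocks):
        positions.setdefault(block, []).append(j)

    delta = []
    for i, block in enumerate(new_blocks):
        usable = [j for j in positions.get(block, []) if j >= i]
        if usable:
            delta.append(("copy", usable[0]))
        else:
            delta.append(("literal", block))
    return delta


def apply_inplace(blocks, delta):
    """Apply the delta to the list of blocks itself, front to back."""
    for i, (op, arg) in enumerate(delta):
        value = blocks[arg] if op == "copy" else arg
        if i < len(blocks):
            blocks[i] = value
        else:
            blocks.append(value)
    del blocks[len(delta):]
    return blocks


old = ["a", "b", "c", "d"]
# Removing the first block: every remaining block can still be copied.
print(plan_inplace(old, ["b", "c", "d"]))
# Inserting one block at the front: every block degrades to a literal.
print(plan_inplace(old, ["x", "a", "b", "c", "d"]))
```

In the second case each old block sits one position behind where it is needed, so it has already been overwritten by the time the writer reaches it. That is exactly the situation described above where the whole file ends up being retransmitted.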
Re: proposal to fork the list (users/developers)
On 30 Jan 2003, Green, Paul [EMAIL PROTECTED] wrote:

> I tend to be someone who automatically looks for trends, and the nice thing about having just one list is that it lets me know where people are having problems. Judging by the number of questions we get, one of the biggest challenges for inexperienced rsync users is knowing why a particular file is included or excluded.

Yes, that's definitely a large advantage of having a single list.

> Way in the back of my mind I see a need for an option that, for every file included or excluded, says which rule was used to make the decision. Nice and simple.

I came to the same conclusion in a similar way a while ago. If you use -vv for rsync, you should see messages about exactly this. :-)

--
Martin

Debian: giving you the power to shoot yourself in each toe individually. -- ajt
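The kind of "which rule decided this?" report discussed here can be mocked up in a few lines. This is a deliberately simplified sketch: it uses shell-style matching from Python's `fnmatch` module and a first-match-wins rule list, with unmatched files included, and it ignores the anchoring, directory and `**` semantics of real rsync patterns. The function name and rule format are made up for the example.

```python
from fnmatch import fnmatchcase


def explain(path, rules):
    """Return (decision, pattern) for a path; the first matching rule wins.

    rules is a list of ("include" | "exclude", pattern) pairs. A path that
    matches no rule is included, and the pattern is reported as None.
    """
    for action, pattern in rules:
        if fnmatchcase(path, pattern):
            return action, pattern
    return "include", None


rules = [("exclude", "*.o"), ("include", "*.c"), ("exclude", "*")]
for name in ["main.c", "main.o", "README"]:
    decision, pattern = explain(name, rules)
    print(f"{name}: {decision} (rule: {pattern})")
```

Printing the deciding pattern next to every file is usually enough to spot the common mistake of a broad exclude appearing before a more specific include.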
Re: reconnect ssh connection?
On 30 Jan 2003, David Garamond [EMAIL PROTECTED] wrote:

> has someone come up with a trick to let disconnected ssh connections be recovered without terminating and having to restart rsync (perhaps by wrapping ssh or something)?

Ooh, interesting idea... You might do it with some kind of wrapper at both ends...

Alternatively, by changing ssh options perhaps you can get the process to stay open even if the link goes away, by increasing timeouts and so on...

--
Martin
Re: Proposal that we now create two branches - 2_5 and head
On 30 Jan 2003, Donovan Baarda [EMAIL PROTECTED] wrote:

> On Thu, 2003-01-30 at 07:40, Green, Paul wrote:
> > jw schultz [mailto:[EMAIL PROTECTED]] wrote:
> > [general discussion of forthcoming patches removed]
> >
> > All well and good. But the question before this thread is: are the changes big and disruptive enough to make a second branch for the event of a security or other critical bug?
>
> Agreed. [...]
>
> After reading the arguments, I support the "delay the branch till it absolutely must happen" approach... i.e. don't branch until a bugfix needs to go in to a stable version and HEAD is way too unstable to be released with the fix as the new stable.

Yes, that is generally a better approach. Remember, you can always go back and create a branch from the release later on if such a situation occurs.

Personally, I'm more interested in eventually starting from scratch with something like duplicity, rzync, or superlifter. I think the way Subversion builds on the experience but not the code from CVS is pretty good.

Obviously there are downsides to this approach: it may be a long time before the code is ready, and people may not want to switch for a while after that. But it may be more fun, and eventually yield a cleaner solution. I hope other people are interested in continuing work on librsync and projects based on it.

I think the parallels between rsync and CVS are actually reasonably strong:

- good tools, and de facto standards both for the free software community
- showing signs of age in underlying assumptions (file-by-file versioning in CVS, shared filelist in rsync)
- knotty code and interface that are a bit hard to refactor
- most existing users have it working properly and don't *want* disruptive changes, just bug fixes or perhaps small additional features
- new approach offers substantial benefits
- doing something new is not urgent

All the above is just for me personally. Continuing to move rsync itself forward as and when appropriate is still a good thing.
> Actually, a bigger attitude issue for me is having a separate rsync-devel and rsync-user lists. I have almost unsubscribed many times because of the numerous newbie user questions;

Me too.

Samba does this with samba-technical and samba. I think at this point the user list for samba only has slightly more traffic than rsync. I think apache may now be the same too.

Plenty of people post user questions to samba-technical despite prominent notices that it is only for developers. They tend to both piss off developers and go unanswered at least some of the time. It's probably due both to "my question *is* technical" and "if the developers read it they might answer". I'm not sure what a good solution would be: probably a clearer name would help. Perhaps rsync-dev?

What do people feel about this?

> I'm only interested in the devel stuff. I'm sure there are many users who have unsubscribed because of the numerous long technical posts.

--
Martin
Re: [trivial patch] link overloaded
On 29 Jan 2003, jw schultz [EMAIL PROTECTED] wrote:

> This is just a trivial documentation change. The word link is overloaded. It refers to symlinks, hardlinks and network links. When looking for references to file links in the manpages the network references get in the way.

+1

--
Martin
Re: Proposal that we now create two branches - 2_5 and head
On 28 Jan 2003, Green, Paul [EMAIL PROTECTED] wrote:

> I think splitting the branches will also let us be a little more experimental in the development branch, at least until we get near the next release phase, because we'll always have the field release in which to make crucial bug fixes available quickly.

I agree that this would be a good approach if and only if there is energy to do lots of development in the head branch. What do you have in mind?

--
Martin
list filtering
Because of the enormous amount of traffic being generated by Windows viruses[0] I have turned on Mailman attachment filtering on the high-traffic samba.org lists.

Lists will now pass only text/plain MIME parts through to the list. multipart/alternative messages with both text and html forms will have the HTML form removed, and messages in only HTML will be squashed to text. Messages which cannot be handled in any of these ways will be rejected.

To send patches or log files to the list, you need to either insert them inline into your message, or make sure they're marked as text/plain. On most systems, just making the name be *.txt should be sufficient.

I hope everybody's enjoying their SQL Server experience :-)

--
Martin
samba.org postmaster

[0] ... automated notifications about viruses, users complaining about viruses, users complaining about automated notification, users complaining about users complaining, scanners complaining about perfectly ordinary attachments, etc
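The filtering policy described above (keep text/plain, drop the HTML alternative, squash HTML-only mail to text, reject the rest) can be approximated with Python's standard `email` package. This is a rough sketch of the policy, not Mailman's implementation: the function names are invented, the HTML-to-text step is a crude tag stripper, and headers and per-part decisions are ignored.

```python
import re
from email import message_from_string
from html import unescape


def html_to_text(html):
    """Very rough HTML squashing: drop tags, decode entities, tidy spaces."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def decode_part(part):
    """Decode a leaf MIME part to str, tolerating unknown charsets."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        return payload.decode("utf-8", errors="replace")


def list_safe_text(raw_message):
    """Return the text to pass to the list, or None to reject the message."""
    msg = message_from_string(raw_message)
    plain, html = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue
        kind = part.get_content_type()
        if kind == "text/plain":
            plain.append(decode_part(part))
        elif kind == "text/html":
            html.append(decode_part(part))
    if plain:
        return "\n".join(plain)  # any HTML alternative is simply dropped
    if html:
        return "\n".join(html_to_text(h) for h in html)
    return None
```

A patch attached as a text/plain part survives this filter unchanged, which matches the advice above about naming attachments *.txt.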
Re: signing tarballs
[replied to list]

There was a discussion about this on the Samba list a while ago:

http://lists.samba.org/pipermail/samba-technical/2002-November/040931.html

Briefly:

We should create a team signing key, with a lifetime of about a year. It has to be relatively short to allow for turnover in the people who have access to the key.

The signing key must only be stored on secure machines, certainly *not* on samba.org. (If it was on samba.org, somebody who compromised that machine could also generate new signatures and it would be pointless.)

The key should be signed by team members and other relevant people; we should also sign each others' keys.

The key should be on the keyservers and on the web site.

Unless you've already done so I'll create the key and send the private half to you and the public half to the website, keyservers, and list.

--
Martin
Re: SPAM on List...
On 9 Dec 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

> I will agree that the SAMBA lists are being kept more spam free than some of the other mail servers that I get e-mail on.

Just as an interesting data point: our bogofilter setup caught 60 spam messages in the last 24 hours aimed at lists.samba.org lists.

> And while you are saying that you are not in favor of using blocking lists, you are blocking Korea by some method, but that could be just something that bogofilter has figured out.

We're using korea.services.net. Unfortunately the spam:ham ratio for Korea is so bad that this seems to be the only appropriate solution. We check the headers on samples of rejected messages and there are dozens of spams per day and I haven't seen a nonspam message yet.

> It is your servers and your decisions on how to allocate your resources. No spam blocking method is 100%. And I am not complaining about your efforts. I was just posting some methods of spam blocking in use, and of course my biased opinions on them.

Thanks, understood.

If I'm defensive it's only because maintaining these things is generally a thankless task. People (not you, John) complain and whine when spam gets through, but nobody sees the work that goes into keeping the other 99% out and keeping things running smoothly.

--
Martin
Re: SPAM on List...
On 10 Dec 2002, jw schultz [EMAIL PROTECTED] wrote:

> First let me say that Martin (and any other list managers) is doing pretty well. Although there was a brief rise in the volume of spam leaking through during the transition it has settled down quite nicely. This is an arms war and I don't expect perfection. Kudos!

Thanks!

> I can almost second that. That seems to hold true for the last couple of months. Perhaps html is already blocked. I do know that some valid mail may come in with Content-Type: Multipart/Alternative where one is text/plain. Although I don't like the waste of bandwidth I could see accepting that. It is the stuff that is only html that should definitely be bounced.

I've wondered about installing something like mimedefang to handle these things. It would be nice to get rid of TNEF attachments too. I won't start this until we have some experience with the new stopspam-bogofilter setup.

There are some complications:

As Tim points out, some people don't control whether their mailer sends HTML or not. So we would need to fall back to html-text conversion, rather than bouncing such messages. This makes it not a good way to detect spam.

Some people need to send patches/log files/whatever to the lists as attachments.

What's not there can't break. Unless it's clearly useful, it shouldn't be installed.

Given that some people can't change their HTML setup (not under their control or too clueless) I'm not sure if notification messages are useful.

> The other clear indicator that comes up more often here seems to be non-English messages. Care has to be taken not to block just because of a few words but if the message is mostly non-English or is in a charset incompatible with English it should be bounced.

The previous bouncer did explicitly block non-latin character sets. However, there was a nasty failure mode which caused some non-junk messages to be blocked.
People writing from (say) China may be using a mail client that sends messages in a Chinese character set. Some of those character sets contain latin characters, so they may have in fact been writing a purely English message, or perhaps an English message with a part-Chinese sig block. Discarding these messages was incorrect; what was worse was that the old system gave no indication of how to fix the problem and the messages were dropped without review. :-( As an amusing example of going too far in the other direction, a certain government body has XXX as a blackword in their mail filter, and a single occurrence is enough to cause the messages to bounce. Of course people pretty regularly write XXX for don't care values... And let's not even think about byte sex. :-) -- Martin -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
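The failure mode described above (an English message rejected only because of its declared character set) can be illustrated with a small sketch. This is not how the samba.org filter works; the function name `mostly_ascii` and the 0.9 threshold are assumptions made up for illustration. The idea is to decode the body using the declared charset and look at what the text actually contains, instead of judging by the charset label.

```python
def mostly_ascii(raw: bytes, charset: str, threshold: float = 0.9) -> bool:
    """Return True if the decoded body consists mostly of ASCII characters.

    The decision is based on the decoded text, not on the charset label,
    so English text sent in a Chinese charset is not penalised.
    """
    text = raw.decode(charset, errors="replace")
    if not text:
        return True
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    return ascii_chars / len(text) >= threshold


# Hypothetical sample bodies, all encoded as GB2312.
english = "Does rsync preserve hard links when copying a whole tree?"
with_sig = english + "\n-- \n你好世界"
chinese = "你好世界"

for body in (english, with_sig, chinese):
    print(mostly_ascii(body.encode("gb2312"), "gb2312"))
```

A character-ratio test like this is only one possible heuristic; a statistical filter such as bogofilter makes the decision from token frequencies instead.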
Re: SPAM on List...
On 9 Dec 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

If it was on any of the reputable blocking lists, I would not be able to receive any of the SAMBA lists, and you would be getting the bounces. It has since been removed from some of them. I.P. based blocking has shown to be the only thing that motivates some domains to act on abuse reports.

I really don't care about abuse reports anymore. There is an inexhaustible supply of other spam sources. Desirable as it may be to have ISPs behave properly, it will not reduce the amount of spam.

And the bounce message can contain an alternate contact means such as a web form if someone needs a white-listing.

A major goal of this exercise is to reduce or eliminate the number of messages that require manual handling because they waste admin time, and they are often dropped.

Our previous experience was that IP blacklists have significant false-positive and false-negative rates. In addition, IP blacklists seem to often go mad when the admins start pursuing a campaign against some ISP in a way that does not agree with our goals. For example, the previously-reputable ORBS server blacklisted most of Australia a few years ago. Basically I want the decisions to be made by samba team admins, not by other people.

Some time last fall apparently Korea passed an OPT-OUT with the equivalent of ADV in the headers law. Right after that, a list that I subscribe to at a major university went from 2 spams a week to over 8 spams a day. 99% from Korea.

We no longer accept any mail from Korea. :-(

Now the other thing to consider is that when the filter makes a mistake and deletes a legitimate message, it is quite a while before the sender figures out, if at all, that the message did not get through.

Our filter sends intelligible, actionable bounce messages. This is an enormous improvement over the previous system, which said something like "error 10".

If the message is bounced, the sender knows immediately, and can use the alternate contact information, such as a web form to request a whitelisting.

As RFC 2822 requires, mail to postmaster is not filtered, and is read by a human. People can report problems there.

They also know that there is probably a problem with their ISP or with the particular block list, and they have the information needed to fix it.

That's bogus. If my ISP is blocked it is very difficult for me to change -- at home I am on a 12 month contract with my DSL provider, for example. Even if I did move, it's very unlikely that my leaving would persuade them to change/enforce their AUP. People with business hosting are in an even more difficult situation.

Filtering makes spam your problem. Using a blocking list makes spam the problem of the ISP sending the spam. Eventually almost no one will accept e-mail from them, either from local blocking lists, or public ones.

You describe a long-term solution in which spam-friendly ISPs are gradually ostracised. I'm not quite sure I believe you that there is a clear distinction, that bona fide ISPs are really able to stop spam, and that being ostracised will ever really cut them off. But regardless, these are long-term, global measures.

What I care about is reducing admin load and spam transmission on samba.org right now. Our bogofilter setup seems to be doing *extremely well* at just that; I can see it catching many more messages and getting far fewer false positives than before, and it is no longer necessary to clear queues by hand. I looked through the queue when I installed it and there were many posters who just happened e.g. to be from China and whose messages were basically dropped.

Unless people have specific complaints about the new setup I intend to keep going along this path.

-- Martin
Re: Head Rotor VE 12/08A
On 8 Dec 2002, [EMAIL PROTECTED] wrote:

Howdy... Can we get RID of this member? This is the 2nd time I have seen this posted. Now after the first time, I figured it would have been put into a SPAM filter, and thereby the member would not be able to post SPAM to the list again, but that does not seem to be the case.

We're working on improving spam filtering for the list using bogofilter. At the moment we catch about 100 spams per day going to the samba lists, so the percentage is not so bad. The only real solution is to jail spammers.

I still suggest we go to a closed list, whereby e-mail addresses are verified by a person before being allowed to post. It would help with SPAM, and when a member posts SPAM, they are put into moderated mode, and if they do it a 2nd time, they are banned...permanently.

samba.org already has lots of trouble with people who are not able to follow simple instructions about how to subscribe, unsubscribe, post, etc. Going to a closed list would cause more administrative work, and would also inconvenience posters who want to e.g. read via a local list. So at the moment I don't want to do that.

-- Martin

Open a medium-sized can of Spam (retain the can (retain the spam too))
Re: rsync] Re: bug reporting.. bugzilla
On 9 Dec 2002, R P Herrold [EMAIL PROTECTED] wrote:

Really a better FAQ editor process seems more useful. Isn't this the purpose of a CVS and commit privileges -- set up one or more trusted editors with rights, and delegate that aspect.

Anybody who wants to maintain the FAQ-O-Matic already has the necessary access. If somebody starts working on it and feels that CVS would be more appropriate then of course we can switch to that.

For the last year, I have acted as editor on the RPM website; there is also an open editorial mailing list, and provided content (dreadfully little) gets slotted in. I monitor all the mailing lists in the area (five primary ones) and watch for common questions or misunderstandings; those which are well answered end up summarized and on the site. I particularly look for the postings by the lead maintainer and a few others for the 'nuggets' -- the answers float by and may be picked out of the stream and tossed up on the riverbank of a FAQ.

Yes, that's the process I had in mind. It's just a matter of some set of people finding the time and motivation to do it.

-- Martin
(fwd from david@interactiveinstitute.se) Bugs in rsync
- Forwarded message from David Jonsson [EMAIL PROTECTED] -

From: David Jonsson [EMAIL PROTECTED]
Subject: Bugs in rsync
Date: Fri, 19 Jul 2002 18:38:59 +0200 (CEST)
To: Martin Pool [EMAIL PROTECTED], Andrew Tridgell [EMAIL PROTECTED]

First, thanks for a great tool!

I run rsync supplied with RedHat 7.3

    rsync --version
    rsync version 2.5.4 protocol version 26
    Copyright (C) 1996-2002 by Andrew Tridgell and others
    http://rsync.samba.org/
    Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles, IPv6, 64-bit system inums, 64-bit internal inums

and I experience the following errors when I issue this command

    rsync -a --delete * /mnt/navhdb2

1. Files beginning with . will not get deleted at the destination if they don't exist in the source. (I detected that it leaves erased .forward files.)

2. I had replaced a directory with a symbolic link at the source but the destination kept the directory.

Please respond to me just so I know the reasons for my problems.

David

- End forwarded message -

-- Martin
spam filter on rsync list
    # 0.40 0.40      156.153.255.237
    # 0.40 0.307692  delivered-to
    # 0.40 0.228571  for
    # 0.40 0.164948  from
    # 0.40 0.116364  lists.samba.org
    # 0.40 0.080706  mbp
    # 0.40 0.055292  nov
    # 0.40 0.037553  palrel12.hp.com
    # 0.40 0.025353  postfix
    # 0.40 0.017046  received
    # 0.40 0.011429  return-path
    # 0.40 0.007648  rsync
    # 0.40 0.005112  samba.org
    # 0.40 0.003414  wed
    # 0.40 0.002278  with

In response to rampant abuse, I have installed a new spam filter, Bogofilter, on the rsync mailing list. Experiments have indicated that it should get a smaller rate of false negatives or positives than the existing system. If there are any problems, please mail me or the postmaster.

-- Martin
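The token dump at the top of this message lists a per-token value (always 0.40 here) followed by a running value. The running column is numerically consistent with the naive-Bayes combining rule popularised by Paul Graham, P = prod(p) / (prod(p) + prod(1 - p)); that reading is an inference from the numbers, not something the message states, and the function name `combine` is made up for this sketch.

```python
from math import prod


def combine(probs):
    """Combine per-token spam probabilities into a single score.

    Uses P = prod(p) / (prod(p) + prod(1 - p)), assuming the tokens
    are independent.
    """
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# Running score after 1, 2, 3, ... tokens that each score 0.40.
for n in range(1, 16):
    print(n, round(combine([0.40] * n), 6))
```

With every token below 0.5, each additional token pushes the combined score further towards "not spam", which is the trend visible in the dump.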
Re: 2.5.6 release
On 5 Nov 2002, jw schultz [EMAIL PROTECTED] wrote:

This might be a good time for tagging 2.5.6 perhaps. A fair number of bugfixes have gone in, popt updates, and a few new features. It has been stable for about 2 months. Unless there is something in the pipeline it sounds like time to release and start on 2.5.7cvs.

Sounds good to me. I'll do a 2.5.6pre to check next week, unless somebody else really wants to do it.

-- Martin
Re: superlifter design notes and a new proposal
On 4 Aug 2002, Wayne Davison [EMAIL PROTECTED] wrote:

Your previous proposal sounded quite a bit more fine-grained than what rZync is doing. For instance, it sounded like you would have much more primitive building-block messages and move much of the controlling smarts into something like a python-language scripting layer. While rZync allows ftp-level control (such as send this file, send this directory tree, delete this file, create this directory) it does this with a small number of higher-level command messages.

OK, good. I think that's a good idea.

My rZync app currently operates on each arg independently, but I recently discovered that this makes it incompatible with rsync when merging directories and such. For instance, the command rsync -r dir1/ dir2/ dir3 merges the file list and removes duplicates before starting the transfer to dir3.

This is a substantial source of cruft in the current code, and one of the reasons claimed to make an up-front traversal necessary. I think a more efficient, and possibly simpler solution, would be to first examine all of the source directories and determine their relationships. Basically, you might discover that dir2 is in fact a subdirectory of dir1, or the same (or vice versa), in which case you can eliminate it. Or you might discover that they're disjoint. Given that directories are trees, I don't think there are any other possibilities.

Doing this in a way that properly respects various symlink options will be a little complex, but I think it is in principle possible. It is also something quite amenable to being thoroughly exercised in isolation as a unit test.

I am pretty sure that you can do this by just examining dir1 and dir2. You do need to look at the filesystem to find out about symlinks and so on, but I think you do not need to traverse their contents. It is pretty complex, so there might be some case I've missed.
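A minimal sketch of the relationship check described above, under a strong simplifying assumption: it compares paths purely lexically and ignores symlinks and `..` components, which is exactly the part the message calls "a little complex". The function name `relationship` and its return strings are hypothetical.

```python
from pathlib import PurePosixPath


def relationship(a: str, b: str) -> str:
    """Classify two directory paths without touching the filesystem.

    Returns "same", "a contains b", "b contains a" or "disjoint".
    Purely lexical: symlinks and ".." are NOT resolved.
    """
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    if pa == pb:
        return "same"
    if pa in pb.parents:
        return "a contains b"
    if pb in pa.parents:
        return "b contains a"
    return "disjoint"


print(relationship("dir1/", "dir1"))
print(relationship("dir1/", "dir1/sub/deeper"))
print(relationship("dir1/sub", "dir1"))
print(relationship("dir1", "dir2"))
```

Because the function has no side effects it is easy to exercise in isolation as a unit test, as the message suggests; a real implementation would first have to decide how each symlink option affects path resolution.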
I got rid of the multi-IO idiom of rsync in favor of sending all data via messages and limiting each chunk to 32K to allow other messages to be mixed into the middle of a large file's data-stream (such as verbose output).

OK, that makes sense. I guess 32k is as good a number as any.

I think the basic idea of how rZync envisions a new protocol working is a good one -- not so much the specifics of the bytes sent in the message-header format, but how the messages flow, how each side handles the messages in a single process, how all I/O is handled by a single function, etc. There's certainly lots of room for improvement, though.

I've started looking at the code, and it looks very nice. It's certainly easier to read than rsync. Would you mind putting in some more comments to help me along though?

I had a couple of internal thoughts about how the code for a next release ought to go. Please don't take them as criticisms of your right to write experimental code however you want, or as an attempt to dictate how we run things. I just want to raise the issues.

Global names should be distinguished with some kind of prefix, as in librsync: rz_ or whatever. If this ever turns into a library that gets linked into something else it will help; in the meantime it helps keep clear what is part of the project and what's pulled in from elsewhere.

I really liked mkproto.awk when I first saw it, but now I'm not so keen. I think maintaining header files by hand is in some ways a good thing, because it forces you to think about whether a particular function really needs to be exported to the rest of the program, or to the world at large.

From rzync.h:

    #define MSG_HELLO 1
    #define MSG_QUIT 3
    #define MSG_NO_QUIT_YET 4 // XXX needed??
    #define MSG_ABORT 5
    #define MSG_NOTE_DIRNAME 6
    #define MSG_NOTE_FILENAME 7
    #define MSG_DEC_REFCNT 8

These might work better as an enum, so that gdb can show symbolic values.
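To illustrate the two ideas above (symbolic message codes, and splitting file data into chunks of at most 32K so other messages can be interleaved), here is a small sketch. It is written in Python rather than C, it is not rZync's code, and the names `Msg`, `MAX_CHUNK` and `chunks` are invented; the numeric values are the ones quoted from rzync.h, reading the last define as the value 8.

```python
from enum import IntEnum


class Msg(IntEnum):
    """Message codes quoted from rzync.h, as a symbolic enumeration."""
    HELLO = 1
    QUIT = 3
    NO_QUIT_YET = 4
    ABORT = 5
    NOTE_DIRNAME = 6
    NOTE_FILENAME = 7
    DEC_REFCNT = 8


MAX_CHUNK = 32 * 1024  # upper bound on the payload of one data message


def chunks(data: bytes, size: int = MAX_CHUNK):
    """Yield successive slices of data, each at most `size` bytes long."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


print(repr(Msg(5)))
print([len(c) for c in chunks(b"x" * 70000)])
```

The benefit of the enumeration is the same as the one claimed for a C enum under gdb: a value read off the wire prints as a name instead of a bare integer.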
    typedef struct {
        char *names[MAX_ID_LIST_LEN];
        long nums[MAX_ID_LIST_LEN];
        int count;
    } ID;

Linus has a rule about not using typedefs for structures, because it's good to be clear about whether something is a structure or whatever. I'm inclined to agree. So I would refer to that thing as struct rz_id or something.

Being 64-bit clean probably implies declaring rz_time_t, rz_uid_t and so on, and using that rather than the native types, which will be pretty random.

This also reminds me that I hadn't responded to jw's question about why I thought his pipelined approach was more conducive to a batch protocol than an interactive protocol. To make the pipelined protocol as efficient as rsync will require the complexity of his backchannel implementation, which I think will be harder to get right than a single-process message-oriented protocol. If every stage is a separate process, it seems less clear how to implement something like an interactive mkdir or a delete
Re: superlifter design notes and a new proposal
I think there was some confusion earlier in the thread about the redo thing in rsync 2. It's not for handling files that have changed during the transfer. My understanding of this is that it is used when the whole-file md4 hash shows that the block checksum actually made a mistake in transferring the file.

-- Martin
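The redo idea described above can be sketched as a verify-and-retry loop: rebuild the file, compare a whole-file hash with the sender's, and try again if they differ. This is a simplified illustration, not rsync's code. SHA-256 stands in for MD4 because MD4 is not reliably available in current Python builds, and `receive_file` and the `reconstruct` callback are hypothetical names.

```python
import hashlib


def receive_file(reconstruct, expected_digest: str, max_tries: int = 2) -> bytes:
    """Rebuild a file and check it against the sender's whole-file hash.

    `reconstruct(attempt)` returns the candidate file contents. A mismatch
    means the block matching was fooled (for example by a weak-checksum
    collision), so the transfer is redone.
    """
    for attempt in range(max_tries):
        data = reconstruct(attempt)
        if hashlib.sha256(data).hexdigest() == expected_digest:
            return data
    raise IOError("whole-file checksum still wrong after redo")


original = b"the quick brown fox"
digest = hashlib.sha256(original).hexdigest()

# Simulated receiver: the first pass is corrupted, the redo pass is correct.
result = receive_file(lambda attempt: b"garbage" if attempt == 0 else original, digest)
print(result == original)
```

How the second pass differs from the first (for instance, stronger block checksums) is left out here; the sketch only shows where the whole-file hash fits.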
Re: new rsync release needed soon?
On 31 Jul 2002, Dave Dykstra [EMAIL PROTECTED] wrote:

Yes I think a new release is needed soon, but there's more patches than that that should get in.

We need to weigh up getting functions in vs making steps small enough that the chance of breakage is acceptable. I am afraid that at the moment our only means of getting really good cross-platform test coverage for rsync is to throw a release out, and so that inclines me towards being conservative in what we put in. Hopefully we can try to get people on the list testing -rc releases more aggressively.

A bunch of them have been posted and I was hoping you were keeping track of them and would be putting more of them in.

I will try to read back through the list and see about merging them this week, with a view to a release candidate on about the 11th, and a release about a week after that.

The patch that I'd most like to see get in is JD Paul's patch for using SSH and daemon mode together. We still don't have an agreement on what the syntax should be. I think the combination of -e ssh and :: which he implemented is the most understandable syntax and we should just go with it.

I agree that it would be really good to support it. However, -e and :: seem to be a persistent source of confusion for new users. I'm not sure if this change will help those people, or what if anything would be better. (More later on this.)

-- Martin
Re: superlifter design notes and a new proposal
I've been thinking a bit more about Wayne and jw's ideas.

My first draft was proposing what you might call a fine-grained rpc system, with operations like list this directory, delete this file, calculate the checksum of this file. I think Wayne's rzync system was kind of like that too.

One unusual feature of rsync compared to most protocols is that a single request causes an enormous amount of stuff to happen: there is only one request/response per connection at the moment, really. It is a very CISC-like protocol.

I wonder what we could achieve if we stay broadly within that model, of both parties knowing about the whole job, and working in tandem, rather than one of them controlling the other per file? So the client will send something more or less equivalent to its whole command line.

This would be a more conservative design in some ways, because it is more similar to the existing system. It also perhaps avoids some of the issues about pipelining that have been giving me trouble at least.

While staying with that overall approach, we may still be able to make some improvements in

- documenting the protocol
- doing one directory at a time
- possibly, doing librsync deltas of directories
- just one process on either end
- getting rid of interleaved streams on top of TCP
- sending errors as distinct packets, including a reference to the file that caused them (if any)
- handling ACLs, EAs, and other incidental things
- holding the connection open and doing more operations afterwards

What made me start thinking this way is the realization that the basic idea of cooperating processes (rather than client-server) is not really causing us any trouble at the moment. Other things in that list are, like the interleaved error stream, or the 3-process model. But perhaps sending the arguments across the network and having the remote process know what to do is not such a problem.

I will try to write up a more detailed description of this idea later on.

-- Martin
Re: new rsync release needed soon?
On 1 Aug 2002, Dave Dykstra [EMAIL PROTECTED] wrote:

Another change that I think really ought to go in is something like the one at http://lists.samba.org/pipermail/rsync/2002-February/006371.html to get the correct error codes out of rsync. But first I think we really need to hear from Tridge why he put that code there in the first place. Martin, did you ever ask him? If not, can you please get him to look at it?

I will follow that up with him.

-- Martin
Re: Useless option combos (was Re: --password-file switch)
On 30 Jul 2002, Wayne Davison [EMAIL PROTECTED] wrote:

On Tue, 30 Jul 2002, Martin Pool wrote:

The --password-file option only applies to rsync daemon connections, not ssh.

Perhaps we should make rsync complain about such options that don't make sense (another example being trying to use -e with a :: hostspec)?

There's a patch in cvs to make it complain about -e with ::. The manual actually already says that --password-file does not affect remote shells, but I have made it a bit more obvious. I agree that a warning would be good.

Shall we do a new release soon? There's just one more change I would like to put in, which is partially rolling back the IPv6 patch so that it uses the old code, unmodified, if --disable-ipv6 is specified. I'm not sure this needs to go in before the next release though. I think it would reduce the overall level of pain, particularly on older platforms.

-- Martin
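The kind of check discussed above can be sketched as a small validation function. This is an illustration in Python, not rsync's option parser; the function name `check_options`, its parameters, and the warning texts are all hypothetical. It reflects only the two combinations named in this message: -e together with a :: hostspec, and --password-file on a non-daemon transfer.

```python
from typing import List, Optional


def check_options(hostspec: str, rsh: Optional[str], password_file: Optional[str]) -> List[str]:
    """Return warnings for option combinations that have no effect.

    A hostspec containing "::" is treated as an rsync daemon connection;
    anything else is treated as a remote-shell (or local) transfer.
    """
    warnings = []
    is_daemon = "::" in hostspec
    if is_daemon and rsh is not None:
        warnings.append("-e is ignored with a :: daemon hostspec")
    if not is_daemon and password_file is not None:
        warnings.append("--password-file only applies to daemon connections")
    return warnings


print(check_options("host::module/path", rsh="ssh", password_file=None))
print(check_options("host:/path", rsh="ssh", password_file="/etc/rsync.pw"))
print(check_options("host::module/path", rsh=None, password_file="/etc/rsync.pw"))
```

If a later release gives -e a meaning together with ::, as proposed elsewhere on the list, the first warning would have to go.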
Re: timestamp on symlink
On 29 Jul 2002, Donovan Baarda [EMAIL PROTECTED] wrote:

This is because most of python's os.xxx methods de-reference symlinks. You get this error because 'nothere' doesn't exist. The correct way to get time info on symlinks is to use os.lstat(), which doesn't de-reference links.

I realize you can get the time that way (although not on all platforms), but how do you set it? As jw says, there is no lutime().

-- Martin
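A short sketch of the difference being discussed, using a dangling link named `nothere` as in the quoted text. The reading part (`os.stat` versus `os.lstat`) matches the message. The writing part uses `os.utime(..., follow_symlinks=False)`, which exists in current Python 3 but only works where the platform supports it; that is checked through `os.supports_follow_symlinks`, and it postdates this thread. The function name `symlink_times_demo` is made up.

```python
import os
import tempfile


def symlink_times_demo():
    """Show stat vs lstat on a dangling symlink, then try to set its mtime."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "link")
        os.symlink("nothere", link)  # target does not exist

        try:
            os.stat(link)            # follows the link, so this fails
            target_visible = True
        except FileNotFoundError:
            target_visible = False

        # Setting times on the link itself is platform dependent.
        can_set = os.utime in os.supports_follow_symlinks
        if can_set:
            os.utime(link, ns=(0, 10**18), follow_symlinks=False)

        mtime_ns = os.lstat(link).st_mtime_ns  # describes the link itself
        return target_visible, can_set, mtime_ns


print(symlink_times_demo())
```

On platforms where `can_set` is false the link's timestamp can still be read but not changed, which is the situation the message describes.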
Re: --password-file switch
On 30 Jul 2002, Jochen K?chelin [EMAIL PROTECTED] wrote:

How can I use the --password-file switch with rsync in order not to be prompted for the user's password so I can run rsync in a cron job?

    rsync -uavrpog -e ssh /www [EMAIL PROTECTED]:/DESTINATION/`date +%A` --password-file=/quellen/RSYNC_PASSWD

does not work! I always get a prompt to enter user root's password!

The --password-file option only applies to rsync daemon connections, not ssh. You need to set up an ssh key to make ssh connections with no password; see the recent thread or the ssh manual for instructions.

-- Martin
Re: superlifter design notes (OpenVMS perspective)
On 27 Jul 2002, jw schultz [EMAIL PROTECTED] wrote:

The server has no need to deal with client limitations. I am saying that the protocol would make the bare minimum of limitations (null termination, no nulls in names).

It probably also makes sense to follow NFS4 in representing paths as a vector of components, rather than as a single string with '/'s in it or whatever. ['home', 'mbp', 'work', 'rsync'] avoids any worries about / vs \ vs :, and just lets the client do whatever makes sense.

I don't know a lot about i18n support, but it does seem that programs will need to know what encoding to use for the filesystem on platforms that are not natively Unicode. On Unix it probably makes sense to default to UTF-8, but latin-1 or others are equally likely. This is independent of the choice of message locale. I think the W32 APIs are defined in Unicode so we don't need to worry.

Quoting, translating, or rejecting illegal characters could all make sense depending on context.

I guess I see John's backup vs distribution question as hopefully being different profiles or wrappers around a single codebase, rather than different programs. Perhaps the distinction he's getting at is whether the audience for the client who uploaded the data is the same client, or somebody else?

-- Martin
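The component-vector idea above can be sketched as follows: the wire format carries a list of names, and each end turns the list into a local path using its own separator and its own notion of illegal characters. The function name `to_local_path` and the particular sets of illegal characters are assumptions for illustration; this version takes the "rejecting" option mentioned in the message rather than quoting or translating.

```python
def to_local_path(components, sep="/", illegal="/\0"):
    """Join path components using the local separator.

    Components that cannot be represented locally are rejected instead of
    being silently reinterpreted as extra path levels.
    """
    for part in components:
        if part in ("", ".", "..") or any(ch in illegal for ch in part):
            raise ValueError(f"component not representable here: {part!r}")
    return sep.join(components)


parts = ["home", "mbp", "work", "rsync"]
print(to_local_path(parts))                                # Unix-style
print(to_local_path(parts, sep="\\", illegal='\\/:\0'))    # Windows-style
```

A name such as `a/b` is a legal single component on some systems but not on Unix, so the Unix-style call rejects it rather than creating a directory `a` containing `b`.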
Re: superlifter design notes (was Re: ...
On 27 Jul 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

A program serving source files for distribution does not need to be that concerned with preserving exact file attributes, but may need to track suggested file attributes for the various client platforms. A program that is replicating for backup purposes must not have any loss of data, including any operating specific file attributes. That is why I posted previously that they should be designed as two separate but related programs.

I'm not sure that the application space for rsync really divides neatly into two parts like that. Can you expand a bit more on how you think they would be used?

-- Martin
Re: superlifter design notes (was Re: Latest rZync release: 0.06)
I'm inclined to agree with jw that truthfully representing time and leap seconds is a problem for the operating system, not for us. We just need to be able to accurately represent whatever it tells us, without thinking very much about the meaning.

Somebody previously pointed out that timestamp precision is not a property of the kernel, but rather of the filesystem on which the files are stored. In general there may be no easy way to determine it ahead of time: you can (if you squint) imagine a network filesystem with nanosecond resolution that's served by something with rather less. I suspect the only way to know may be to set the time and then read it back.

You can also imagine that in the next few years some platform may change to a format that accurately represents leap seconds, whether by TAI or something else. (I'm not sure if I'd put money on it.) Presumably that machine's POSIX interface will do a lossy conversion back to regular Unix time to support old apps. If we merely used that information, then when replicating between two such machines, files whose mtime happened to fall near a leap second would be inaccurate. That would contradict our goal of preserving precision as much as possible, even if we can't tell if it is accurate. Ideally, we would use the native interface so as to be able to get the machine's full precision, and that would imply something like TAI internally.

Whether this is worth doing depends on whether you reckon any platform will actually move to a filesystem that can represent leap seconds. As jw says, practically all machines have clocks with more than one second of inaccuracy, so handling leap seconds is not practically important. Certainly they might use it within their ntp code, but I don't know if they'll expose it to applications.

What is the actual format of TAI?

64-bit signed seconds-since-1970, plus optionally nanoseconds, plus optionally attoseconds. (There's something rather fascinating about using attoseconds.)

To be fair, it seems that TAI is an international standard, and djb just made up libtai, not the whole thing. (Mind you, from some standards I've seen, that would be a good reason to walk briskly away.)

One drawback, which is not really djb's fault, is that if you inadvertently use a TAI value as a Unix value it will be about 10 seconds off -- almost, but not quite, correct. I'd hate to have bugs like that but presumably they can be avoided by using the interface correctly.

On the other hand, sint32 unix time is clearly running out, and if we have to use something perhaps it might as well be TAI. I would kind of prefer just a single 64-bit quantity measured in (say) nanoseconds, and compromise on being able to time the end of the universe, but I don't think I care enough to invent a new standard.

-- Martin
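The trade-off in the last paragraph of this message can be made concrete with a little arithmetic: how far past 1970 a signed 32-bit count of seconds reaches, compared with a signed 64-bit count of nanoseconds. The helper name `last_instant` is made up for this sketch, and it uses ordinary Unix (UTC-based) time, not TAI.

```python
from datetime import datetime, timezone


def last_instant(max_count: int, units_per_second: int) -> datetime:
    """Latest whole second representable by a counter starting at 1970."""
    seconds = max_count // units_per_second
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


sint32_seconds = last_instant(2**31 - 1, 1)
sint64_nanos = last_instant(2**63 - 1, 10**9)

print("signed 32-bit seconds end at    ", sint32_seconds.isoformat())
print("signed 64-bit nanoseconds end at", sint64_nanos.isoformat())
```

So a single 64-bit nanosecond field gives roughly 292 years either side of 1970, far short of "the end of the universe", which is the compromise the message refers to.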
Re: rsync (dammit) on RTOS
On 22 Jul 2002, Biju Perumal [EMAIL PROTECTED] wrote:

Thanks Martin. I need to port it to QNX. Any idea of available implementations of rsync on QNX?

I don't know if anybody has done it, but as far as I know QNX is pretty similar to Unix so it should not be too hard. Why not try compiling it? If you have trouble consult a QNX guru and/or post a clear and detailed description of the problem to this list.

-- Martin
Re: superlifter design notes (was Re: Latest rZync release: 0.06)
On 21 Jul 2002, jw schultz [EMAIL PROTECTED] wrote:

From what I can see rsync is very clever. The biggest problems I see are its inability to scale for large trees, a little bit of accumulated cruft and featuritis, and excessively tight integration.

Yes, I think that's basically the problem.

One question that may (or may not) be worth considering is to what degree you want to be able to implement new features by changing only the client. So with NFS (I'm not proposing we use it, only an example), you can implement any kind of VM or database or whatever on the client, and the server doesn't have to care. The current protocol is just about the opposite: the two halves have to be quite intimately involved, so adding rename detection would require not just small additions but major surgery on the server.

What I am seeing is a multi-stage pipeline. Instead of one side driving the other with command and response codes each side (client/server) would set up a pipeline containing those components that are needed with the appropriate plumbing. Each stage would largely look like a simple utility reading from input; doing one thing; writing to output, error and log. The output of each stage is sent to the next uni-directionally with no handshake required.

So it's like a Unix pipeline? (I realize you're proposing pipelines as a design idea, rather than as an implementation.)

So, we could in fact prototype it using plain Unix pipelines? That could be interesting.

Choose some files:

    find ~ | lifter-makedirectory > /tmp/local.dir

Do an rdiff transfer of the remote directory to here:

    rdiff sig /tmp/local.dir /tmp/local.dir.sig
    scp /tmp/local.dir.sig othermachine:/tmp
    ssh othermachine 'find ~ | lifter-makedirectory | rdiff delta /tmp/local.dir.sig - ' > /tmp/remote.dir.delta
    rdiff patch /tmp/local.dir /tmp/remote.dir.delta /tmp/remote.dir

For each of those files, do whatever:

    for file in `lifter-dirdiff /tmp/local.dir /tmp/remote.dir`
    do
        ...
    done

Of course the commands I've sketched there don't fix one of the key problems, which is that of traversing the whole directory up front, but you could equally well write them as a pipeline that is gradually consumed as it finds different files. Imagine

    lifter-find-different-files /home/mbp/ othermachine:/home/mbp/ | \
        xargs -n1 lifter-move-file

(I'm just making up the commands as I go along; don't take them too seriously.)

That could be very nice indeed.

I am just a little concerned that a complicated use of pipelines in both directions will make us prone to deadlock. It's possible to cause local deadlocks if e.g. you have a child process with both stdin and stdout connected to its parent by pipes. It gets potentially more hairy when all the pipes are run through a single TCP connection.

I don't think that concern rules this design out by any means, but we need to think about it. One of the design criteria I'd like to add is that it should preferably be obvious by inspection that deadlocks are not possible.

timestamps should be represented as seconds from Epoch (SuS) as unsigned 32 int. It will be 90 years before we exceed this by which time the protocol will be extended to use uint64 for milliseconds.

I think we should go to milliseconds straight away: if I remember correctly, NTFS already stores files with sub-second precision, and some Linux filesystems are going the same way. A second is a long time in modern computing! (For example, it's possible for a command started by Make to complete in less than a second, and therefore apparently not change a timestamp.) I think there will be increasing pressure for sub-second precision in much less than 90 years, and it would be sensible for us to support it from the beginning. The Java file APIs, for example, already work in nanoseconds(?). Transmitting the precision of the file sounds good.

I think by default user and groups should only be handled numerically.
I think by default we should use names, because that will be least surprising to most people. I agree we need to support both. Names are not universally unique, and need to be qualified, by a NIS domain or NT domain, or some other means. I want to be able to say: map MAPOOL2@ASIAPAC - [EMAIL PROTECTED] - [EMAIL PROTECTED] when transferring across machines. We probably cannot assume UIDs are any particular length; on NT they correspond to SIDs (?) which are 128-bit(?) things, typically represented by strings like S1-212-123-2323-232323 So on the whole I think I would suggest following NFSv4 and just using strings, with the intreptation of them up to the implementation, possibly with guidance from the admin. When textual names are used a special chunk in the datastream would specify a node+ID - name equivalency immediately before the first use of that number. It seems like in general there is a need to have
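
To make the timestamp trade-off above concrete, here is a minimal Python sketch. It is not taken from rsync or superlifter: the function names are made up, and using milliseconds as the unit simply follows the suggestion in the message. It packs a modification time as an unsigned 64-bit count of milliseconds since the Epoch in network byte order, and works out the year in which an unsigned 32-bit count of seconds would wrap.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_mtime_ms(seconds: float) -> bytes:
    """Pack a timestamp as uint64 milliseconds since the Epoch, big-endian.

    Hypothetical wire format, for illustration only.
    """
    return struct.pack("!Q", round(seconds * 1000))


def decode_mtime_ms(wire: bytes) -> float:
    """Inverse of encode_mtime_ms; returns seconds as a float."""
    (milliseconds,) = struct.unpack("!Q", wire)
    return milliseconds / 1000


def uint32_seconds_wrap_year() -> int:
    """Year in which an unsigned 32-bit seconds-since-Epoch counter overflows."""
    return (EPOCH + timedelta(seconds=2**32)).year


if __name__ == "__main__":
    wire = encode_mtime_ms(1027209600.25)
    print(len(wire), decode_mtime_ms(wire))
    print(uint32_seconds_wrap_year())
```

The 64-bit field costs four extra bytes per timestamp compared with 32-bit seconds, which is the kind of overhead a cheap stream compressor would be expected to absorb.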
Re: superlifter design notes (was Re: Latest rZync release: 0.06)
People have proposed network-endianness, ASCII fields, etc. Here's a straw-man proposal on handling this for people to criticize, ignite, feed to horses, etc. I don't have any specific numbers to back it up, so take it with a grain of salt. Experiments would be pretty straightforward.

Swabbing to/from network endianness is very cheap. On 486s and higher it is a single inlined instruction and I think takes about one cycle. On big-endian machines it is free. The cost is barely worth considering: if you are flipping words as fast as you can you will almost certainly be limited by memory bandwidth, not by the work of swapping them.

BER-style variable length fields, on the other hand, are much more CPU-intensive, because you need to look at the top bit, mask it, shift, continue. If you're going to use a protocol that difficult, I think you might as well use ASCII hex or decimal numbers.

All other things being equal, having a readable protocol is good. A little redundancy in the protocol can help make it readable and also help detect errors. For example, distcc's 4-char commands make it easy for humans to visually parse a packet, and they mean that errors in transmission almost always cause an immediate failure. At the same time they're cheap to process -- it's just a uint32 compare.

Arguably we should use x86-endianness because it's the most common architecture at the moment, but I don't think the performance justifies using something non-standard. Anyhow, I would hope that if it gets off the ground, this protocol might still be in use in ten years, in which time x86 may no longer be dominant. Bigendian also has the minor advantage that it's easier to read in packet dumps.

Negotiated protocols are a bad idea because they needlessly multiply the test domain. Samba has to deal with Microsoft protocols which are in theory negotiated-endian, but in practice of course Microsoft never test anything but Intel, so BE support is broken and people writing non-x86 servers need to negotiate Intel endianness. Even assuming we're smarter than they are, I don't think we need to make our lives difficult in this way.

Lempel-Ziv is ideal for the exact case of compressing 0x0001 into a couple of bits. Even a very cheap compressor such as lzo (about half the speed of memcpy) will do well on that kind of case; presumably numbers like uint64 0, 1, 2, etc will occur often in packet headers and get tightly compressed. I think it will probably deal with filenames for us too.

So, as a straw man:

- use XDR-like network-endian 32 and 64 bit fields
- keep all fields 4-byte aligned
- make strings int32 length-preceded, and padded to a 4-byte boundary
- don't worry about interning or compressing filenames, just send them as plain UTF-8 relative to a working directory
- send things like usernames as strings too
- make operation names (or whatever) be human-readable, either variable-length strings or 4-byte tokens that happen to be readable as ASCII

--
Martin

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
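
The straw-man encoding above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the message layout (a 4-byte token, one uint64, one string), the token `FILE`, and all function names are hypothetical and do not come from distcc, rsync, or any superlifter code.

```python
import struct


def pack_string(text: str) -> bytes:
    """int32 length, UTF-8 bytes, then zero padding to a 4-byte boundary."""
    data = text.encode("utf-8")
    padding = -len(data) % 4
    return struct.pack("!i", len(data)) + data + b"\0" * padding


def unpack_string(buf: bytes, offset: int = 0):
    """Return (text, offset of the next field)."""
    (length,) = struct.unpack_from("!i", buf, offset)
    start = offset + 4
    text = buf[start:start + length].decode("utf-8")
    return text, start + length + (-length % 4)


def token_code(token: bytes) -> int:
    """A 4-byte readable token is also a uint32, so matching is one integer compare."""
    if len(token) != 4:
        raise ValueError("tokens are exactly 4 bytes")
    (code,) = struct.unpack("!I", token)
    return code


def pack_message(token: bytes, value: int, name: str) -> bytes:
    """Hypothetical message: token, network-endian uint64, padded string."""
    token_code(token)  # validates the token length
    return token + struct.pack("!Q", value) + pack_string(name)


def unpack_message(buf: bytes):
    token = buf[:4]
    (value,) = struct.unpack_from("!Q", buf, 4)
    name, end = unpack_string(buf, 12)
    return token, value, name, end


if __name__ == "__main__":
    message = pack_message(b"FILE", 5, "src/a.c")
    print(message)
    print(unpack_message(message))
```

Every field starts on a 4-byte boundary, and the token is visible as plain text in a packet dump, which are the two properties the list above asks for.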
Re: superlifter design notes (OpenVMS perspective)
On 22 Jul 2002, John E. Malmberg [EMAIL PROTECTED] wrote:

> > Qualities
> >
> > 1. Be reasonably portable: at least in principle, it should be possible to port to Windows, OS X, and various Unixes without major changes.
>
> In general, I would like to see OpenVMS in that list.

Yes, OpenVMS, perhaps also QNX and some other TCP/IP-capable RTOSs.

Having a portable protocol is a bit more important than a portable implementation. I would hope that with a new system, even if the implementation was unix-bound, you would at least be able to write a new client, reusing some of the code, that worked well on ITS.

> A clean design allows optimization to be done by the compiler, and tight optimization should be driven by profiling tools.

Right. So, for example, glib has a very smart assembly ntohl() and LZO is tight code. I would much rather use them than try to reduce the byte count by a complicated protocol.

> > 4. Keep the socket open until the client gets bored. (Avoids startup time; good for on-line mirroring; good for interactive clients.)
>
> I am afraid I do not quite understand this one. Are you referring to a server waiting for a reconnect for a while instead of reconnecting?

What I meant is that I would like to be able to open a connection to a server, download a file, leave the connection open, decide I need another file, and then get that one too. You can do this with FTP, and (kind of) HTTP, but not rsync, which needs to know the command up front. Of course the server can drop you too by a timeout or whatever.

> If so, that seems to be a standard behavior for network daemons.
>
> > 5. Similarly, no silly tricks with forking, threads, or nonblocking IO: one process, one IO.
>
> Forking or multiple processes can be high cost on some platforms. I am not experienced enough with Posix threads to judge their portability.
>
> But as long as it is done right, non-blocking I/O is not a problem for me.
>
> If you structure the protocol processing where no subroutine ever posts a write and then waits for a read, you can set up a library that can be used either blocking or non-blocking.

Yes, that's how librsync is structured.

Is it reasonable to assume that some kind of poll/select arrangement is available everywhere? In other words, can I check to see if input is available from a socket without needing to block trying to read from it?

I would hope that only a relatively small layer needs to know about how and when IO is scheduled. It will make callbacks (or whatever) to processes that produce and consume data. That layer can be adapted, or if necessary, rewritten, to use whatever async IO features are available on the relevant platform.

> Test programs that internally fork() are very troublesome for me. Starting a few hundred individually by a script are not.

If we always use fork/exec (aka spawn()) is that OK? Is it only processes that fork and that then continue executing the same program that cause trouble?

> I can only read UNIX shell scripts of minor complexity.

Apparently Python runs on VMS. I'm in favour of using it for the test suite; it's much more effective than sh.

> > 12. Try to keep the TCP pipe full in both directions at all times. Pursuing this intently has worked well in rsync, but has also led to a complicated design prone to deadlocks.
>
> Deadlocks can be avoided.

Do you mean that in the technical sense of deadlock avoidance? i.e. checking for a cycle of dependencies and failing? That sounds undesirably complex.

> Make sure if an I/O is initiated, that the next step is to return to the protocol dispatching routine.
>
> > 9. Model files as composed of a stream of bytes, plus an optional table of key-value attributes. Some of these can be distinguished to model ownership, ACLs, resource forks, etc.
>
> Not portable. This will effectively either exclude all non-UNIX or make it very difficult to port to them.

"Non-UNIX" is not completely fair; as far as I know MacOS, Amiga, OS/2, Windows, BeOS, and QNX are {byte stream + attributes + forks} too.

I realize there are platforms which are record-oriented, but I don't have much experience on them. How would the rsync algorithm even operate on such things?

Is it sufficient to model them as ascii+linefeeds internally, and then do any necessary translation away from that model on IO?

> BINARY files are no real problem. The binary is either meaningful on the client or server or it is not. However file attributes may need to be maintained. If the file attributes are maintained, it would be possible for me to have an OpenVMS indexed file moved up to a UNIX server, and then back to another OpenVMS system and be usable.

Possibly it would be nice to have a way to stash attributes that cannot be represented on the destination filesystem, but perhaps that is out of scope.

> I recall seeing a comment somewhere in this thread about timestamps being left to 16 bits.

No, 32 bits. 16 bits is obviously silly.

File timestamps
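
The poll/select question raised earlier in this message ("can I check to see if input is available from a socket without needing to block?") can be illustrated with a minimal Python sketch. It only shows what the check looks like on a platform where select() works on sockets; it says nothing about whether OpenVMS or any other particular system provides it, and the helper name `input_ready` is made up for this example.

```python
import select
import socket


def input_ready(sock: socket.socket, timeout: float = 0.0) -> bool:
    """Return True if a read on sock would not block.

    With timeout=0.0 this is a pure poll: select() returns immediately.
    """
    readable, _writable, _errors = select.select([sock], [], [], timeout)
    return bool(readable)


if __name__ == "__main__":
    left, right = socket.socketpair()
    try:
        print(input_ready(right))        # nothing has been sent yet
        left.sendall(b"ping")
        print(input_ready(right, 1.0))   # data is now waiting
        print(right.recv(4))
    finally:
        left.close()
        right.close()
```

Keeping this check inside one small scheduling layer, as suggested above, means the rest of the code never needs to know whether the platform offers select(), poll(), or something else.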
Re: superlifter design notes (OpenVMS perspective)
> User-Agent: Mozilla/5.0 (X11; U; OpenVMS COMPAQ_AlphaServer_DS10_466_MHz; en-US; rv:1.1a) Gecko/20020614

If something as complex as Mozilla can run on OpenVMS then I guess we really have no excuse :-)

--
Martin
Re: superlifter design notes and rZync feedback
One more link, about variable-length vs fixed-length encodings:

http://ntrg.cs.tcd.ie/undergrad/4ba2/presentation/xdrandber.html

(The HTML is a bit broken, view the source.)

Basically they make the somewhat obvious point that variable-length encodings are much slower to handle than fixed-length. I don't know if the difference is so great that fixed-length fields plus lzo could produce a smaller result with less work than a variable-length encoding. I wouldn't be surprised either way, actually.

One way to look at it is this: in the case where you're CPU-bound, not network-bound, then you'll definitely want to use something like XDR. In the case where you're completely network-bound, then you probably want to use gzip -9 or even bzip2, and whether the underlying protocol is fixed or variable-length probably doesn't matter. So perhaps XDR plus compression is a good tradeoff across a wider domain. (Or perhaps not.)

--
Martin
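
A small Python sketch of the comparison being discussed, under loudly stated assumptions: the variable-length format here is a generic BER-style base-128 encoding (high bit set on every byte except the last), not rzync's actual field format, and zlib stands in for lzo because lzo is not in the Python standard library. The sample values are invented, and the output is a size comparison only, not a speed measurement.

```python
import struct
import zlib


def encode_varint(n: int) -> bytes:
    """Base-128, most significant group first; high bit marks 'more bytes follow'."""
    if n < 0:
        raise ValueError("only non-negative integers")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(groups))


def decode_varint(buf: bytes, offset: int = 0):
    """Return (value, offset of the next field): test top bit, mask, shift, continue."""
    value = 0
    while True:
        byte = buf[offset]
        offset += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, offset


def encode_fixed(values) -> bytes:
    """XDR-like: every value is a network-endian uint64."""
    return b"".join(struct.pack("!Q", v) for v in values)


if __name__ == "__main__":
    values = [i % 4 for i in range(1000)]  # small header-like numbers
    fixed = encode_fixed(values)
    variable = b"".join(encode_varint(v) for v in values)
    print("fixed:", len(fixed))
    print("variable:", len(variable))
    print("fixed + zlib:", len(zlib.compress(fixed)))
```

The decode loop shows where the per-byte work of a variable-length format comes from, while the fixed-width decoder is a single `struct.unpack` per field.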
Re: rsync anti-FUD
On 18 Jul 2002, Paul Nendick [EMAIL PROTECTED] wrote:

> I'm working on a commercial project that would benefit immensely from the use of rsync. However, I cannot convince management that rsync is a worthy tool due to the rote "it's shareware, it's not supported" FUD. Are there any documented, corporate users of rsync? Testimonials? In short, how do I drag this risk-averse group out of the FTP age into the rsync present?

I work for HP. We use it extensively, indeed so much so that it would probably be impossible to count the number of users.

If you want support, I'm sure HP's consulting group would be interested in helping you out and very capable. If you do not already have an account manager there, I can find somebody good for you to speak to. They can probably produce a nice pointyhead-friendly Powerpoint slideshow about the strengths of open source :-)

rsync is a mature product, with many established users. As other people have said, it is the de-facto standard for filesystem mirroring.

rsync's stability means that new features do not go in very fast; however, there is active work on extending it to new areas and capabilities, including xdelta, Unison, librsync, rzync, lift, pysync, and others. As far as I know, there is no new work going into FTP as a protocol, although people are doing some nice work on implementation, such as ProFTPd. So you need not fear rsync leading you into a dead end.

One of the nice things about open source is that you are not locked in to a single provider. If, at some time during the project, you decide you want to pay for commercial support, you can do so. If you do pay for commercial support and it turns out that you're not happy with the company, you can change.

rsync is not shareware anyhow; it is Free Software, or, if you prefer, Open Source Software. Shareware is sometimes the worst of both worlds -- half-hearted support, but no opportunity to fix things yourself or seek alternative help.

I don't know what FTP implementation you're using, but I suspect most of them will be either open source, shareware (on Windows), or a thin veneer of Unix-vendor gloss on an old BSD implementation.

A good way to proceed might be to post a brief description of what it is you want to do to the list. I'm sure several people will be able to tell you "that's easy", "that's possible", or "rsync's not the right tool". Drawing on the freely-available resource of experienced users is probably the best thing you can do to reduce risk.

--
Martin
Re: strip setuid/setgid bits on backup (was Re: small security-related rsync extension)
On 16 Jul 2002, Dan Stromberg [EMAIL PROTECTED] wrote:

> If by sillyrename, you mean busy text files are renamed to .nfs*,

"sillyrename" is in fact the technical term for this. I am not making it up. I'm pretty sure Callaghan's book calls it that, Sun people call it that, and it is the term used in the Linux NFSv3 implementation, etc.

http://www.google.com/search?q=sillyrename

> then I think you're missing how it works yourself, I'm sorry to say. You just unlink something on the server, and it happens, like magic. Maybe that happens on the client side - but that's really beside the point. Rename will probably do just as well.

"it happens, like magic". Uh-huh.

My understanding of sillyrename, from memory and a brief perusal of the kernel source (I don't have Callaghan here), is as follows:

It is a purely client-side behaviour, to handle the fact that Unix files may be still open when unlinked. This relies on Unix having an in-memory use count, in addition to the on-disk link count. The problem is that the NFS server may reboot while the client has the file open, therefore losing its in-memory use count, and causing the file data to be garbage-collected by fsck.

As a workaround, deletes or link-replacement on the client for a file still in use are handled by moving that file to a temporary name, so that from the point of view of other clients the file has gone. When the use count drops to zero, the client removes the .nfs file.

If the client crashes or the net is partitioned before the use count goes to zero, then the .nfs file may remain indefinitely, which is why you need a reaper run from cron.

If this is wrong please explain how.

This is, incidentally, a much better solution for replacing in-use files than rsync's backups, because it only affects in-use files, and they are gc'd when no longer in use. Replacing local files and letting the kernel handle it is even better, because it can never leak.

> .nfs* may well suffer from the same "the backup file is still setuid" problem though.

Yes; if you replace in-use setuid binaries in such a way that sillyrename orphans may be generated, then they may still be setuid, and that may be a security problem. I agree.

These files should only be generated by edge cases where the program is in use when replaced, and where the client loses contact with the server or abruptly reboots. Presumably if you're installing a security update to a program then you need to restart that program fairly promptly, so the window should be small. Of course small window != zero, but there is no need to unnecessarily panic.

It looks like the root problem is that replacing a setuid file from an NFS client may cause a setuid sillyrename file to remain under some circumstances. I haven't tested it, but I can believe that might happen. Is that what you're trying to say?

If this is true, then it is a problem with NFS, not with rsync. The failure would presumably occur in the same way if you used dpkg, rpm, pkgadmin or cp to replace the files.

> I'm finding it hard to see why this makes the issue moot.

It is moot because you can just run rsync direct to the NFS server. This is faster and avoids the security hole. If you disagree, please explain why.

> I'm also finding it hard to understand why security might be so unimportant to you. I seriously wish you'd read bugtraq for a few months before making such a short sighted decision.

I have a pretty good understanding of Unix security, and I do consider it important. If you want changes to rsync you have to make a clear case, not just wave your hands and say "like magic".

--
Martin
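
The "reaper run from cron" mentioned in this message could look roughly like the Python sketch below. Everything specific in it is an assumption made for illustration: the function name, the seven-day age threshold, the use of modification time as the age test, and the idea of running it on the server against the exported directory. It deliberately defaults to a dry run.

```python
import os
import stat
import time


def reap_silly_renames(root, max_age_days=7.0, now=None, dry_run=True):
    """Find (and optionally delete) old .nfs* leftovers under root.

    Returns the list of paths that were, or would be, removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    reaped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(".nfs"):
                continue
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except FileNotFoundError:
                continue  # a client cleaned it up while we were walking
            # Only plain files that have not been touched for a while.
            if stat.S_ISREG(info.st_mode) and info.st_mtime < cutoff:
                if not dry_run:
                    os.remove(path)
                reaped.append(path)
    return reaped


if __name__ == "__main__":
    for leftover in reap_silly_renames("."):
        print("would remove", leftover)
```

Choosing a generous age threshold matters here, because a recent .nfs file may still belong to a client that has the original file open.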
Re: superlifter design notes and rZync feedback
On 18 Jul 2002, Wayne Davison [EMAIL PROTECTED] wrote:

> (definitely NOT rzync).

Great. (Excuse my overreaction :-)

> Re: rzync's variable-length fields: Note that my code allows more variation than just 2 or 4 bytes -- e.g., I size the 8-byte file-size value to only as many bytes as needed to actually store the length. I agree that we should question whether this complexity is needed, but I don't agree that it is wrong on principle. There are two areas where field-sizing is used: in the directory-info compression (which is very similar to what rsync does, but with some extra field-sizing thrown in for good measure), and in the transmission protocol itself:

OK. If the protocol said that all integers are encoded in a UTF-8-ish or BER-ish variable length scheme that would sound perfectly reasonable to me. I had misunderstood the document as suggesting that some fields should be defined to be different lengths to others; that would worry me.

There is still a question on the relative merits of having known-length headers (easier to manage buffers, know how much to read, etc), vs making them as small as possible.

I think I mentioned this -- I'd like to have a reasonable means to choose a compression scheme at connection time. bzip2 would be good for modems; lzo for 100Mbps. (I think of bzip2 as simmering on the stove all day, and lzo as lightly blanching :-)

> I still have questions about how best to handle the transfer of directory info. I'm thinking that it might be better to remove the rsync-like downsizing of the data and to use a library like zlib to remove the huge redundancies in the dir data during its transmission.

Ben Escoto suggested a stack like this:

    1. The specification for an abstract protocol designed to allow a single threaded application to get good performance using a single, possibly low bandwidth/high latency pipe. No specific file commands would enter in at this stage, but error reporting and recovery, some kind of security policy, and some other stuff I'm omitting would be included.

    2. A library to make it easy for applications to work with protocols that have the form in 1. A well-written interface to a scripting language (probably python) would be considered a core part of this.

    3. Specification for a more specific, rsync-like protocol, and maybe another library (again with at least a scripting wrapper) to make it easy for applications to implement the protocol.

    4. The model application rsync3 which shows off what the protocol can do. Ideally this part should be really short and sweet.

I think that's a good way to play it, because there is enough work in each section that they're non-trivial layers, but they're also sufficiently separate to allow a lot of good experimentation or adaptation.

I'd hope that by getting a good foundation in #1 and #2, we would be able to experiment with doing binary deltas on directories, or not, or something else again.

I would hope that working only at layer 4, you'd be able to implement a client that could detect remote renames (by scanning for files with the same size, looking at their checksums, etc.)

I wonder if this layering is excessive, but I think that all the layers are necessary, and a first implementation could be simple in many cases. For example, 2 could initially be trivially implemented in a way that only supports non-pipelined operation.

> In the protocol itself, there are only two variable-size elements that go into each message header. While this increases complexity quite a bit over a fixed-length message header, it shouldn't be too hard to automate a test that ensures that the various header combinations (particularly boundary conditions) encode and decode properly. I don't know if this level of message header complexity is actually needed (this is one of the things that we can use the test app to check out), but if we decide we want it, I believe we can adequately test it to ensure that it will not be a sinkhole of latent bugs.

OK, good.

> Re: rzync's name cache. I've revamped it to be a very dependable design that no longer depends on lock-step synchronization in the expiration of old items (just in the creation of new items, which is easy to achieve).
>
> Some comments on your registers: You mention having something like 16 registers to hold names. I think you'll find this to be inadequate, but it does depend on exactly how much you plan to cache names outside of the registers, how much retransmission of names you consider to be acceptable, and whether you plan to have a move mode where the source file is deleted.

Yes, I agree that 16 is probably too small; the next round number would be 256. If we use something like BER it could be unboundedly big. However, since using a name causes server-side resources to be allocated, that's probably no good. We don't want somebody abusing a public server by allocating a zillion names; on the other hand I
Re: strip setuid/setgid bits on backup (was Re: small security-related rsync extension)
On 11 Jul 2002, Dan Stromberg [EMAIL PROTECTED] wrote:

> > I don't get what you are doing. Where did these insecure suid root files come from in the first place?
>
> Have you ever read bugtraq on a regular basis? They're coming out of the woodwork.
>
> > Another question would be, why do you want to keep them around at all?

Presumably so that people can undo the changes if something goes wrong? For your situation, it might work better to dump them all into a mode 700 backup directory.

It seems like the overarching problem is different focuses: Dan wants rsync to be a software-distribution mechanism (which is certainly a good use for it), in which case stripping setuid bits is obviously quite desirable. But for a bit-perfect backup tool, it's probably wrong.

I have been thinking about what general strategies software tools use to address this problem of focus. They seem to be

1- Add a pile of built-in options (--strip-setuid) -- rsync's strategy to date.

2- Build a common layer, and then variations on the program to suit different purposes. I think rdiff-backup is kind of like this. It has the advantage that end users who just want to do backups or software distribution or mirroring don't need to deal with

3- Make the program call out to various scripts that can control its behavior -- the CVS server is like this, for example, with loginfo scripts and so on.

4- Make the program's interfaces and performance characteristics be such that it can easily be controlled by a scripting language. Subversion is trying to be like this. The --log-format proposal for rsync goes in this direction, though needing a new socket for each invocation rather cripples it.

5- Make the whole program intimately intermingled with a scripting language, like emacs or (perhaps) Mozilla.

--
Martin
Re: strip setuid/setgid bits on backup (was Re: small security-related rsync extension)
On 12 Jul 2002, Dan Stromberg [EMAIL PROTECTED] wrote:

> Because when we update, for example, bash, everybody's bash is going to die on them if we don't keep around backups (segfault as you demand page from a binary that has Mostly the Same Stuff in Different Places).

rsync creates a new file, and then atomically moves it into place on successful completion. You should never end up with the file being part-changed, assuming you don't use --partial or -P. It does not normally unlink as such, although I think it might try to use that as a big hammer if the rename fails.

So up until the file is completely transferred and the replacement takes place, everyone will keep seeing the old file. Afterwards, people who had the old file open will keep seeing it, and people who open the new one will get the new one.

This is, as far as I know, the same approach and the same semantics that most unix software-distribution systems, such as dpkg or rpm, will give you.

It may break on systems (like HP-UX?) that don't let you rename or remove a file while it's being executed. I don't know what you're meant to do there, except shut down everything on the machine. Presumably you don't have one of those or -b would be failing.

> Or does rsync unlink and recreate rather than overwriting? In that case, we might just end up with a bunch of .nfs* files if we don't keep backups. Rumor has it, however, that depending on the .nfs* mechanism doesn't always work. I haven't seen it fail myself, but one of the other guys here, who's pretty experienced, sounds pretty convinced that it fails sometimes.

It's possible that you can get .nfs* orphans, but only you can know whether they're common in your environment. If I understand correctly, the only problem with that would be that the old, setuid text still hangs around in the .nfs file.

I would be inclined to say that it's not rsync's problem if unlink() is unreliable, so just run a sillyrename reaper and be done.

> > Is it possible to just rsync onto the NFS server, rather than onto the clients? That would probably be faster, and avoid sillyrename.
>
> I considered this, but I wasn't sure NFS/TheKernel would allow demand paging from an inaccessible binary on all of our supported *ix platforms now and into the future. Are you? We currently support Linux, Solaris, Irix and Tru64, and may add and drop some in the future.

I suspect that any machines that let you rename or unlink in-use text files will not care whether they have an accessible name or not. Unfortunately experiment is probably the only way to tell.

--
Martin
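
The write-then-rename behaviour described in this message can be sketched in Python as follows. This illustrates the general technique only and is not rsync's code: the function name is invented, and the dot-prefixed temporary name merely imitates the style of name mentioned elsewhere in this thread.

```python
import os
import tempfile


def replace_atomically(path, data: bytes, mode=0o644):
    """Write data to a temporary file beside path, then rename it into place.

    Readers see either the complete old file or the complete new one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(prefix="." + os.path.basename(path) + ".",
                               dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
            out.flush()
            os.fsync(out.fileno())
        os.chmod(tmp, mode)
        os.replace(tmp, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Do not leave a half-written temporary file behind.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "tool")
        replace_atomically(target, b"old version\n")
        replace_atomically(target, b"new version\n")
        with open(target, "rb") as current:
            print(current.read())
```

The temporary file is created in the destination directory because a rename is only atomic when source and target are on the same filesystem.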
superlifter design notes (was Re: Latest rZync release: 0.06)
I've put a cleaned-up version of my design notes up here

http://samba.org/~mbp/superlifter/design-notes.html

It's very early days, but (gentle :-) feedback would be welcome. It has some comments on Wayne's rzync design, which on the whole looks pretty clever.

I don't have any worthwhile code specifically towards this yet, but I have been experimenting with the protocol ideas in distcc

http://distcc.samba.org/

I like the way it has worked out there: the protocol is simple and easy to understand, the bugs more or less found themselves, and it feels like I'm using TCP in a natural way -- all of these much more so than rsync at the moment. (Of course, the rsync problem is much more complicated.)

--
Martin
Re: strip setuid/setgid bits on backup (was Re: small security-related rsync extension)
On 8 Jul 2002, Dave Dykstra [EMAIL PROTECTED] wrote:

> The idea of the rsync client executing programs has been discussed before and rejected because it could easily be done by an external program if rsync simply passes it filenames. The only case I can see for having rsync execute programs is in the daemon; that was once approved in principle but nobody ever implemented it.
>
> What we need, have long wanted, and even once had someone volunteer for (but it was never completed), is a major upgrade to the --log-format option to allow a lot more flexibility in what gets printed, and to have it work consistently with and without --dry-run. This would work too with lots of files because the names get streamed out as they're processed. See for example the thread around
>
> http://lists.samba.org/pipermail/rsync/1999-January/000954.html

I'm pretty sure I'm with Dave on this. I think it would be reasonable when over ssh to have a way to run a script on the remote machine, and have that script also get a copy of the log.

--
Martin
strip setuid/setgid bits on backup (was Re: small security-related rsync extension)
Any thoughts on whether this should go in? I can see arguments either way. It seems like we ought to think about whether it would be better to do it as part of a generalized --chmod or --chmod-backup facility.

--
Martin

On 21 Jun 2002, Dan Stromberg [EMAIL PROTECTED] wrote:

> Included below is a shar archive containing two patches that together:
>
> 1) make backup files get their setuid and setgid bits stripped by default
>
> 2) add a -s option that allows backup files to continue to have these privileges
>
> This means that if you update a collection of binaries with rsync, and one or more of them has a local-root security problem, the backup file(s) created when you fix the problem in your source archive won't remain exploitable.
>
> The patch is relative to 2.5.4.
>
> The backup-dir support is attempted but untested. We're using the default backup behavior (with ~) in production. I'd be pleased if someone who uses backup-dir were to try it out and let me know how it goes.
>
> I'd also be pleased if this were to find its way into the main distribution in some form.
>
> Thank you.
#!/bin/sh
# This is a shell archive (shar 3.32)
# made 06/21/2002 20:17 UTC by [EMAIL PROTECTED]
# Source directory /dcslibsrc/network/rsync/exportable-patches
#
# existing files WILL be overwritten
#
# This shar contains:
# length mode name
# -- -- --
# 1798 -rw-r--r-- backup-priv-backups
# 1339 -rw-r--r-- options-priv-backups
#
if touch 2>&1 | fgrep 'amc' > /dev/null
 then TOUCH=touch
 else TOUCH=true
fi
# = backup-priv-backups ==
echo "x - extracting backup-priv-backups (Text)"
sed 's/^X//' << 'SHAR_EOF' > backup-priv-backups &&
X*** backup.c.t Sun May 6 23:59:37 2001
X--- backup.c Fri Jun 21 13:15:51 2002
X***************
X*** 29,34 ****
X--- 29,56 ----
X  extern int preserve_devices;
X  extern int preserve_links;
X  extern int preserve_hard_links;
X+ extern int priv_backups;
X+ 
X+ #ifdef HAVE_CHMOD
X+ static int strip_perm(char *fname)
X+ {
X+     struct stat buf;
X+     if (link_stat(fname,&buf) != 0) {
X+         rprintf(FERROR,"stat failed\n");
X+         return 0;
X+     }
X+ 
X+     if (S_ISREG(buf.st_mode) && (buf.st_mode & (S_ISUID | S_ISGID))) {
X+         mode_t new_mode;
X+         new_mode = buf.st_mode & 01777;
X+         if (do_chmod(fname,new_mode) != 0) {
X+             rprintf(FERROR,"chmod failed\n");
X+             return 0;
X+         }
X+     }
X+     return 1;
X+ }
X+ #endif
X  
X  /* simple backup creates a backup with a suffix in the same directory */
X  static int make_simple_backup(char *fname)
X***************
X*** 46,54 ****
X              rsyserr(FERROR, errno, "rename %s to backup %s", fname, fnamebak);
X              return 0;
X          }
X!     } else if (verbose > 1) {
X!         rprintf(FINFO,"backed up %s to %s\n",fname,fnamebak);
X      }
X      return 1;
X  }
X  
X--- 68,86 ----
X              rsyserr(FERROR, errno, "rename %s to backup %s", fname, fnamebak);
X              return 0;
X          }
X!     } else {
X!         if (verbose > 1) {
X!             rprintf(FINFO,"backed up %s to %s\n",fname,fnamebak);
X!         }
X! #ifdef HAVE_CHMOD
X!         if (!priv_backups && strip_perm(fnamebak) == 0) {
X!             return 0;
X!         } else if (verbose > 1) {
X!             rprintf(FINFO,"Stripped setuid and/or setgid from %s\n",fnamebak);
X!         }
X! #endif
X      }
X+ 
X      return 1;
X  }
X  
X***************
X*** 271,276 ****
X--- 303,314 ----
X              fname, keep_name, strerror(errno));
X      };
X      set_perms (keep_name, file, NULL, 0);
X+     /* may mean an extra stat */
X+ #ifdef HAVE_CHMOD
X+     if (!priv_backups && strip_perm(keep_name) == 0) {
X+         return 0;
X+     }
X+ #endif
X      free_file (file);
X      free (file);
X  
SHAR_EOF
$TOUCH -am 06211315102 backup-priv-backups &&
chmod 0644 backup-priv-backups ||
echo "restore of backup-priv-backups failed"
set `wc -c backup-priv-backups`;Wc_c=$1
if test "$Wc_c" != "1798"; then
    echo original size 1798, current size $Wc_c
fi
# = options-priv-backups ==
echo "x - extracting options-priv-backups (Text)"
sed 's/^X//' << 'SHAR_EOF' > options-priv-backups &&
X--- options.c.t Fri Jun 21 08:56:31 2002
X+++ options.c Fri Jun 21 09:41:41 2002
X@@ -21,6 +21,9 @@
X #include "rsync.h"
X #include "popt.h"
X 
X+#ifdef HAVE_CHMOD
X+int priv_backups = 0;
X+#endif
X int make_backups = 0;
X int whole_file = -1;
X int copy_links = 0;
X@@ -188,6 +191,7 @@
X   rprintf(F," -b, --backup                make backups (default %s suffix)\n",BACKUP_SUFFIX);
X   rprintf(F,"     --backup-dir            make backups into this directory\n");
X   rprintf(F,"     --suffix=SUFFIX         override backup suffix\n");
X+
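The core of the patch above is a small permission-stripping step. As a rough standalone sketch of that idea (not rsync code: `strip_setid_bits` is a hypothetical name, and plain `lstat`/`chmod` stand in for rsync's `link_stat` and `do_chmod` wrappers), it might look like this:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of rsync: clear the setuid and setgid
 * bits on a regular file, as the patch does for backup files.
 * Returns 1 if bits were stripped, 0 if there was nothing to strip,
 * and -1 on error.
 */
int strip_setid_bits(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* lstat so that a symlink is never followed when inspecting the mode */
    if (lstat(path, &st) != 0) {
        perror("stat failed");
        return -1;
    }

    /* only regular files that actually carry setuid or setgid are touched */
    if (!S_ISREG(st.st_mode) || !(st.st_mode & (S_ISUID | S_ISGID)))
        return 0;

    /* keep the rwx and sticky bits (01777), drop setuid and setgid */
    if (chmod(path, st.st_mode & 01777) != 0) {
        perror("chmod failed");
        return -1;
    }
    return 1;
}
```

A caller would run this on the backup name right after the rename, for example `strip_setid_bits("prog~")`. Unlike the patch's `strip_perm`, this sketch distinguishes "stripped" from "nothing to do", which would let a caller print the "Stripped setuid and/or setgid" message only when something actually changed.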
avoiding temporary files (Re: about rsync)
On 22 Jun 2002, macgiver [EMAIL PROTECTED] wrote:

> hi
>
> i love rsync, but i want to know how it is possible to let rsync download a file with the same filename, and not a temp filename like: package.tar.gz.hzmkjz5 or so...
>
> i don't want to use temp filenames when downloading with rsync. Why? Because i'm writing a program with a progress bar and it sucks with the temp filename.

That feature is quite tightly tied in to the design of rsync. We need to use a temporary filename because rsync needs access to the old file to do delta encoding, and in any case many people want the file atomically replaced when it's complete.

> what do you suggest me? :(

Run with --progress and read stdout, or just look for .package.tar.gz.* and see how big it is. The name is unpredictable, but there will be only one. You could even set the temporary directory to make it more predictable.

Oh, and please use the list so that other people can make suggestions.

> greetings!
> macgiver [EMAIL PROTECTED]

--
Martin

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
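The second suggestion in the reply (look for `.package.tar.gz.*` and see how big it is) could be sketched roughly as follows. This is only an illustration: `temp_file_size` is a hypothetical helper, it assumes the temporary file sits in the directory being scanned and is named `.<name>.<random suffix>`, and it assumes the file name contains no glob metacharacters.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return the current size in bytes of the first
 * entry in `dir` whose name matches ".<name>.*", or -1 if there is no
 * such entry (or the directory cannot be read).
 */
long long temp_file_size(const char *dir, const char *name)
{
    char pattern[1024];
    char path[4096];
    struct dirent *ent;
    struct stat st;
    long long size = -1;
    DIR *d;

    assert(dir != NULL && name != NULL);

    /* build the glob pattern, e.g. ".package.tar.gz.*" */
    if (snprintf(pattern, sizeof pattern, ".%s.*", name) >= (int)sizeof pattern)
        return -1;

    d = opendir(dir);
    if (d == NULL)
        return -1;

    while ((ent = readdir(d)) != NULL) {
        if (fnmatch(pattern, ent->d_name, 0) != 0)
            continue;
        if (snprintf(path, sizeof path, "%s/%s", dir, ent->d_name) >= (int)sizeof path)
            continue;
        if (stat(path, &st) == 0) {
            size = (long long)st.st_size;
            break; /* there should be only one temporary file per target */
        }
    }
    closedir(d);
    return size;
}
```

A progress bar could poll `temp_file_size(".", "package.tar.gz")` once a second and compare the result with the expected final size. Once the transfer completes and the temporary file is renamed into place, the function returns -1 again.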
Re: Possible UID/GID bug in chrooted shells?
On 12 Jun 2002, Tom Worley [EMAIL PROTECTED] wrote:

> Dear Martin,
>
> Sorry to mail you directly, but I've had no joy trying to get round this problem (read the faqs, posted on the mailing list, RTFM a lot, etc.)
>
> This is (slightly updated) what I posted to the mailing list:
>
> I'm stuck on a problem with rsync...
>
> We've got a chrooted shell with rsync and all the needed libs inside (and not much else). We're using rsync over ssh to send the files into this chrooted session. The rsync binary in the chrooted session is SUID root so that it can create the files with the correct UID/GID. When the following is run, it creates all the files as root.staff, not as the test user/group, or the correct UID/GID of the original files, so the SUID root is working. We've also tried extracting files from tar that belong to another user (that is the files inside the tar) and when tar is suid root in the chroot it extracts them with the correct UID/GID.
>
> This is the command we used:
>
> rsync --delete-excluded --delete -essh -avz --numeric-ids /home/admin/ test@localhost:/home/backup
>
> (from outside the chroot, the test user being inside it)
>
> The test user's shell is the chrooted session,

What do you mean by that? Their /etc/passwd shell is some chrooted session program? If you wrote it, please post the source; otherwise, what is its name?

Do you know you cannot just run /usr/sbin/chroot as a regular user? It's a privileged operation; it must be done before changing uid.

> and the session works fine through ssh, rsync runs without errors, but all the files created are owned by root. If we try the same but to a non-chrooted user (and suid root to the rsync binary outside the chroot, yeah yeah, it's just a test), it correctly creates the files with the right UID/GID. I've even tried copying the complete /etc/passwd and shadow files into the chroot jail, but that didn't help.
>
> We'd rather not have to set up users/passwords for several hundred users for rsync and run it as a daemon (and send the password securely somehow to each person).
>
> Could it be a bug in the way rsync sets the UID/GID of the files?
>
> Running Debian Linux Sid, up to date as of this morning, and rsync:
>
> rsync version 2.5.6cvs protocol version 26
>
> from debian packages, linux 2.4.18 kernel, chroot 2.0.11 on an i686.
>
> Kind regards, and TIA,
>
> Regards,
> Tom Worley

--
Martin
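The ordering mentioned in the reply (chroot is privileged, so it has to happen before the uid is changed) can be sketched as below. This is a generic illustration, not the poster's shell wrapper and not rsync code: `enter_jail` is a hypothetical name, and the sketch assumes a Linux-style system where `chroot` and `setgroups` are available and the process starts as root.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: confine the calling process to `jail_dir`, then
 * drop from root to the given uid/gid. The order matters: chroot()
 * needs privilege, so it must come before setuid(). Returns 0 on
 * success and -1 on the first failing step.
 */
int enter_jail(const char *jail_dir, uid_t uid, gid_t gid)
{
    assert(jail_dir != NULL);

    /* move into the jail first so no working directory outside it survives */
    if (chdir(jail_dir) != 0)
        return -1;

    /* privileged step: fails with EPERM for a regular user */
    if (chroot(".") != 0)
        return -1;

    /* drop supplementary groups and the group id while still root */
    if (setgroups(1, &gid) != 0)
        return -1;
    if (setgid(gid) != 0)
        return -1;

    /* drop the user id last; after this, root cannot be regained */
    if (setuid(uid) != 0)
        return -1;

    return 0;
}
```

A login wrapper would call something like `enter_jail("/home/jail", uid, gid)` and then exec the real shell. If a step fails partway through, the process is left in an intermediate state, so a real wrapper should exit at that point rather than continue.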