Re: Anyone syncing database files?
You can do a hot backup by simply setting all the tablespaces in backup mode (alter tablespace 'BLAA' begin backup). Then you can rsync the database files and do an 'end backup' on each tablespace afterwards. This will give you files that are consistent, and the database will be recoverable from those files. A hot backup is useless unless you do this...

eric

Brian Cuttler wrote:
I second that. We backup Oracle on Solaris, but we insist that Oracle is down.

On Fri, Aug 20, 2004 at 10:48:20PM +0200, Jan-Benedict Glaw wrote:
On Fri, 2004-08-20 16:43:07 -0400, Loukinas, Jeremy [EMAIL PROTECTED] wrote in message [EMAIL PROTECTED]:
Anyone syncing say 18-20gb Oracle dbf files..?

No problem -- as long as you don't try to back up hot database files. Shut down Oracle first :)

MfG, JBG
Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

---
Brian R Cuttler [EMAIL PROTECTED]
Computer Systems Support  (v) 518 486-1697
Wadsworth Center          (f) 518 473-6384
NYS Department of Health  Help Desk 518 473-0773
Re: Professional grade rsync?
I do a daily sync of 2.4 million files to about 20 locations. I think rsync is very much 'professional' grade. I have done this sync of 2.4 million files both as a single rsync of the tree and by splitting the tree up into multiple rsyncs (essentially a 'for dir in $(ls); do rsync $dir dest:blaa; done').

Watch your memory usage on both sides of the rsync. How much RAM is rsync using? How much RAM is on the src and dest boxes? You might also consider 2.6.2 -- it has some improvements in memory usage and speed.

eric

King, Daniel wrote:
Hi, folks. We've gone where no man has gone before. On HP-UX, rsync bombs at about 1.75 million directories, files and links (combined sum) in a single transfer. Is there a professional-grade alternative on HP-UX for folks willing to pay for it? It wouldn't even need to be network-aware, just from single-system areas to the same box, but with the nifty delete and update features that rsync has. My searches turn up unison and some other tools (BSD mirror, etc.), but rsync has beaten every other open-source solution hands down on the scalability side of things. Now we need more...

Thanks,
A. Daniel King, System Analyst
HP-UX, Linux, Solaris
Re: [rsync-announce] Rsync 2.6.1 released (includes security note)
[EMAIL PROTECTED] wrote:
Rsync version 2.6.1 has been released. It is primarily a performance release that requires less memory to run, makes fewer write calls to the socket (lowering the system CPU time), does less string copying (lowering the user CPU time), and also reduces the amount of data that is transmitted over the wire. There have also been quite a few bug fixes. See the release NEWS for the full details.

I just tried a simple test of 2.6.1. I still need to do more testing, but here is some data that shows a decrease in cpu time. The test results varied depending on other external loading, but here is some representative data:

rsync 2.5.6 both ends -- 200,000 files (source files read via NFS, dest is over ssh to local disk)
real    27m48.939s
user     0m14.910s
sys      0m19.490s
approx memory both sides 30M -- receiver had an 18M jump at the end to 48M

rsync 2.6.1 both ends -- same tree, same setup
real    15m30.547s
user     0m9.840s
sys      0m16.910s
approx memory both sides 22M -- receiver had an 18M jump at the end from 20M to 38M

Real time shows a good speedup (not 100% meaningful unless you have a very controlled test environment). User time shows a good speedup. Sys time also shows a speedup. Memory use appears to be less.

These performance fixes look like they could be helpful. My current bottlenecks are probably still the link speeds to remote locations and the NFS source read, but this new rsync might help some too.

eric
Re: measuring 2.6.1 improvements
Wayne Davison wrote:
On Tue, Apr 27, 2004 at 11:52:11AM -0600, Eric Whiting wrote:
... but here is some representative data:
Thanks for the confirming stats. Some questions/comments: You didn't mention what command you ran, so I'm curious if the 18M jump at the end was due to using --delete-after? (The way the delete options are written uses a lot of memory for large file sets.)

I'm using --delete, but not --delete-after. The command was something like this:

/usr/bin/rsync -a --rsync-path=/usr/bin/rsync --delete --partial -vv -e '/usr/local/bin/ssh' --log-format='%o %f %b %l' --timeout 7200 --force --compress --exclude-from=/path --exclude 'pattern' /source/dir/from/nfs remotehost:/remote/path

I was watching the memory usage with a bash command-line while loop:

while(true); do cat /proc/$pid/status | grep VmSize; sleep 60; done

On my last runs I changed the sleep to 5 at the end to get finer resolution on the memory jump. I was not expecting to see that jump -- I was also curious. Some more cut/paste:

dest_host:/usr/bin # while(true); do cat /proc/24624/status | grep VmSize; sleep 60; done
VmSize:  6380 kB
VmSize:  8196 kB
VmSize:  9236 kB
VmSize: 10796 kB
VmSize: 12096 kB
VmSize: 14168 kB
VmSize: 15468 kB
VmSize: 16768 kB
VmSize: 18068 kB
VmSize: 20408 kB   - memory jump starts after this point (sender stays stable at 22M)
VmSize: 26016 kB
(ctrl-c to change the resolution)
dest_host:/usr/bin # while(true); do cat /proc/24624/status | grep VmSize; sleep 5; done
VmSize: 35884 kB
VmSize: 35884 kB
VmSize: 35884 kB
VmSize: 37964 kB
VmSize: 22412 kB
VmSize: 22412 kB
VmSize: 22412 kB
VmSize: 22412 kB
VmSize: 22412 kB
VmSize: 22412 kB

The per-file memory improvement is probably the bulk of what you saw in your test. If you want to see an even larger improvement, compare the use of the -H option (--hard-link) between versions (and note how much less data is sent over the wire too).

I'm confused. I don't think I have hard links in this tree. This improvement will not help me without hard links, will it?
I'm curious if you noticed an improvement in shared memory on the receiving side staying shared (that is, if your fork() uses copy-on-write shared memory in the first place).

Not sure about my fork() -- here is the sys info: linux kernel 2.4.21, glibc-2.3.2-6, suse 8.2

eric
Re: Running rsync without ssh
If you are moving data across a network, then you need a network transport mechanism. If you specify a [EMAIL PROTECTED]:dir in the source or destination, then you need a transport. ssh is the default transport in 2.5.7. If you want to use rsh, then you must specify -e rsh (or another mechanism).

If you are syncing locally, without going from one system to another, then you do not need a -e flag and you will not have the [EMAIL PROTECTED]:dir specification -- simply two path names.

eric

Richmond Dyes wrote:
What is the command structure for the -e flag to run rsync without the ssh shell? I am using rsync 2.5.7.
Re: orphan dirs and files with --delete
Wayne Davison wrote:
On Mon, Mar 22, 2004 at 04:49:15PM -0700, Eric Whiting wrote:
rsync (2.5.[67]) --delete fails on dirs with the w bit cleared. (example below) Rsync will sync a dir with w bit clear, but will not remove it with --delete.
It's actually debatable what the correct behavior is in this case. Using rm -rf on that same directory tree would also fail, so rsync is currently behaving in a manner consistent with other unix/linux tools. I'm inclined to leave it that way.

Wayne,

Thanks, yes, I agree with your statements. This is not a clear 'problem', nor is there a 'right' way to fix it. Yet I think there are still reasons to change this behavior.

If we depend on -a and --delete to keep sites in sync, then what I have described is a scenario where rsync can create/mirror read-only dirs/files but will refuse to delete them. Rsync is already doing the chmod required for the creation of those dirs/files -- therefore a chmod for deletion is not that inconsistent with the overall purpose of the tool. Rsync is not emulating rm, nor other 'single' commands. Rsync is a tool that combines features of chmod/chown/compress/gzip/tar/find/ssh/rsh/etc -- that is what makes it so useful.

I have workarounds for this issue and I can deal with it (manual chmods, a private patch, or running as root). Running rsync as root does not have this problem: the root user has permission to rm dirs/files with the w bit clear (under most OSs). Yet I don't like to run these syncs as root. In the case that just came up, we had to go to the destination boxes and do some chmods; the source files/dirs on the master site were already gone, so we did not have other simple options. Of course another fix is to keep the source tree 'clean' by not having any files/dirs with the w bit clear for the file owner.

I saw that the CVS tree had a change regarding suid behavior. This too might be another solution, but it is also something that I might be hesitant to use as well
(and I'm not sure it would fix the problem).

I think this 'small' issue will come up again. Rsync is a very useful tool that is only gaining in popularity, and I don't think I'll be the only person who runs into this. Just my 2 cents.

eric
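The manual-chmod workaround mentioned above can be sketched with local throwaway directories: restore write permission on the destination tree, then let --delete do its job.

```shell
#!/bin/sh
# Reproduce the read-only-dir problem, then work around it with chmod.
WSRC=$(mktemp -d)
WDST=$(mktemp -d)
mkdir "$WSRC/dir1"
touch "$WSRC/dir1/f"
chmod a-w "$WSRC/dir1"
rsync -a "$WSRC/" "$WDST/"            # mirrors the read-only dir to the dest

chmod -R u+w "$WSRC"                  # clean up the source...
rm -rf "$WSRC/dir1"

chmod -R u+w "$WDST"                  # workaround: make the dest deletable again
rsync -a --delete "$WSRC/" "$WDST/"   # now the orphaned dir goes away
```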
orphan dirs and files with --delete
rsync (2.5.[67]) --delete fails on dirs with the w bit cleared (example below). Rsync will sync a dir with the w bit clear, but will not remove it with --delete. This is not a big problem, but it will create situations where there are 'orphaned' files. Has anyone else had this problem?

It looks like a change would be needed in robust_unlink (util.c). This function would have to do a chmod on dirs that are locked down before it does the unlink. (Syncing as user root doesn't have this problem.) The CHECK_RO macro in syscall.c only checks for the file being RO. It doesn't check for the dir being RO.

eric

Here is an example:

COMMANDS:
--
cd /tmp
# cleanup
chmod -R a+w source dest
rm -rf source dest
# create a dir and subdir and chmod
mkdir source
cd source
touch file1 file2
mkdir dir1; touch dir1/file3 dir1/file4
chmod a-w dir1
# rsync to dest
mkdir /tmp/dest
rsync --delete -av /tmp/source/ /tmp/dest
# clean up source a little bit
chmod a+w dir1
rm -rf dir1
# attempt to clean up dest with rsync (this --delete will fail)
rsync --delete -av /tmp/source/ /tmp/dest
cd /tmp

OUTPUT (of final rsync):
---
/tmp/source rsync --delete -av /tmp/source/ /tmp/dest
building file list ... done
delete_one: unlink dir1/file4: Permission denied
delete_one: unlink dir1/file3: Permission denied
./
wrote 102 bytes read 20 bytes 244.00 bytes/sec
total size is 0 speedup is 0.00
rsync error: some files could not be transferred (code 23) at main.c(620)
Re: using rsync with ssh to distribute to multiple hosts
2 things to do that will fix things:

1. Read 'man ssh' and create public/private ssh keys with an empty passphrase. This will let the rsync run without a password -- you can cron it and it will just work. You could also use .rhosts and sync over rsh, but ssh with the keys is a better solution.

2. Consider running the destination syncs in parallel. Use simple bash for loops with '&' in the commands, or use perl's Parallel::ForkManager to run several at once. I have used both methods with good success.

eric

Jim Stafki wrote:
Hello, I have tried to look this up in the archives before posting, but I can not connect to the news server at gmane. I hope you can help, or perhaps point me to documentation somewhere... I have a filesystem on an aix box that must be pushed periodically to several unix boxes (over 100), and the directive from our unix admins is to use rsync, which can use ssh to perform the transfer. I would like some opinions about the best way this should be done.

I have written a korn shell script that uses a loop to execute the rsync command against a list of hostnames. This works fine, but for one inconvenience: the user must enter the password for every box that is in the list. It is the same password in most cases, but this is simply not acceptable. Is there a better way to do this with rsync? Anyone have thoughts about this?

Thanks
Jim
Re: rsync 1tb+ each day
Kenny Gorman wrote:
I am rsyncing 1tb of data each day. I am finding in my testing that actually removing the target files each day and then rsyncing is faster than doing a compare of the source/target files and then rsyncing over the delta blocks. This is because we have a fast link between the two boxes, and our disk is fairly slow. I am finding that the creation of the temp file (the 'dot file') is actually the slowest part of the operation. This has to be done for each file because the timestamp and at least a couple of blocks are guaranteed to have changed (oracle files).

I'm seeing similar behavior here. I ran rsync tests on oracle database files. I have four online snapshots (daily) from hot backups of a database. A given file will appear in all the snapshots as the same size with just a handful of changed blocks.

I tested with a small 256M datafile. rsync -av shows me about 200 kBytes of changes in the datafile between each snapshot (about 1/1000th of the file has actually changed between the hot backups). Rsync reports the speedup as somewhere between 800 and 1000, just as I would expect. This speedup is a number that accounts only for bytes transferred (or not transferred) -- not real time. When I time the rsync -av runs and compare to rsync -W, the real run times are approximately the same (with similar results if I rm the file on the dest and rsync -av). Block checksum and rewrite overhead, right?

In other words -- the 'speedup is 963.22' message really only means 'the network bytes transferred savings is 963x'. Over a slow network this might be proportional to a real run-time savings, but on a fast network there might not be any 'speedup' in the sense of real elapsed time. I see similar speedup numbers for some 1G database files -- but the real time is 2x when the destination is there and mostly correct. For this type of data it is 'faster', from a time standpoint, to remove the destination file before running rsync. Not what I would expect.
eric
Re: rsync 1tb+ each day
I think the -W option might do what you describe here.

eric

Kenny Gorman wrote:
I am rsyncing 1tb of data each day. I am finding in my testing that actually removing the target files each day and then rsyncing is faster than doing a compare of the source/target files and then rsyncing over the delta blocks. This is because we have a fast link between the two boxes, and our disk is fairly slow. I am finding that the creation of the temp file (the 'dot file') is actually the slowest part of the operation. This has to be done for each file because the timestamp and at least a couple of blocks are guaranteed to have changed (oracle files).

My question is this: Is it possible to tell rsync to update the blocks of the target file 'in-place' without creating the temp file (the 'dot file')? I can guarantee that no other operations are being performed on the file at the same time. The docs don't seem to indicate such an option. Thx in advance..
-kg
Re: rsync 1tb+ each day
Replying to self after re-reading the original message...

-W will probably help in that it disables the incremental checksum block checking/scanning for the very large files. This is a good option to consider if you have a very fast network. rsync with -W will still probably create the temporary dot-file, and will not do the file create/sync in place. (I might be wrong.)

I have some 2+G domino nsf files that I sync every day using rsync -- I have not seen the incremental checksum block checking help much on these files either, so I think I'll try -W on that sync. It might help in sync time but hurt in terms of network loading.

I think some have suggested different -B options for larger files as well -- but I'm not sure what might work best with oracle datafiles. Probably a -B that is the same size as the db_block_size.

eric

Eric Whiting wrote:
I think the -W option might do what you would have described here.
eric
Kenny Gorman wrote:
[...]
Re: rsync 1tb+ each day
I've learned some good things from this discussion. Thanks.

Kenny, I have one concern/idea -- the original post says the 'disk is fairly slow'. That is one bottleneck that should probably be examined a little more. How fast are your disks? How fast is your network? An IDE disk with DMA disabled might run 5 MB/s, while with DMA enabled you can see up to 45 MB/s. Perhaps this is a root-cause problem that has already been looked at, but perhaps it would be good to look at it again. Also, do you have enough RAM on the destination to do some caching of the file for the multiple reads of the file? That might also help.

eric

jw schultz wrote:
On Tue, Feb 04, 2003 at 11:29:48AM -0800, Kenny Gorman wrote:
[...]

As others have mentioned, -W (--whole-file) will help here. The reason the temp-file is so slow is that it is reading blocks from the disk and writing them to other blocks on the same disk. This means every block that is unchanged must be transferred twice over the interface, where changed blocks are only transferred once. If the files are very large this is guaranteed to cause a seek storm. Further, all of this happens after the entire file has been read once to generate the block checksums. Unless your tree is smallish, reads from the checksum pass will have been flushed from cache by the time you do the final transfer. --whole-file eliminates most of the disk activity.
You no longer do the block checksum pass, and you replace the local copying (read+write) with a simple write from the network. Most likely your network is faster than the disks. For files that change, but change very little, your disk subsystem would have to be more than triple the speed of your network for the rsync algorithm (as opposed to the utility) to be of benefit. If the files change a lot then you merely need double the speed.

--
J.W. Schultz  Pegasystems Technologies
email address: [EMAIL PROTECTED]
Remember Cernan and Schmitt
Re: rsync in-place (was Re: rsync 1tb+ each day)
jw schultz wrote:
I was thinking more in terms of no block relocation at all. Checksums only match if at the same offset. The receiver simply discards (or never gets) info about blocks that are unchanged. It would just lseek and write, with a possible truncate at the end.

This would seem to help a lot on larger database files. Why look at a 700-byte block of data from a source file and try to find a matching block by fully scanning block checksums at all offsets in an 8G destination datafile? And then do it again for every 700 bytes? (I read the rsync technical paper -- but I might be confused.) In the case of Oracle data files, the only place a meaningful/syncable delta will occur is at the same offset. Yes, this is a special case -- but it has the potential to really help in rsyncing oracle datafiles during a hot backup or when syncing from a snapshot to nearstore storage. This approach should be faster than the -W option for very large Oracle datafiles (which often have small amounts of changed blocks). It should also be faster than deleting the destination files and resending (-W), as has been suggested.

You can imagine a smarter algorithm that does non-sequential writes to the output so as to avoid writing over blocks that will be needed later. Alternatively, if you assume some amount of temporary storage, then it might be possible to still produce output as a stream. I really doubt it is worthwhile doing to rsync. This principally applies to block-oriented files such as devices and database files. For the most part rsync handles these fine.

agreed. The original post still raises an interesting issue -- it should not be faster to remove destination files before running rsync. That is counter to one of the main purposes of rsync: efficiently detect and send only the deltas.

eric
Re: directories that change into symlinks
I have observed this same problem. Are you running --delete-after? I assumed it might be related to that option.

eric

David Garamond wrote:
Our daily backup is done using the rdiff-backup tool, which in turn utilizes rsync/librsync to do the actual mirroring work. A few days ago we did a refactoring and renamed a bunch of directories. For backward compatibility we maintain the old names by symlinking them to the new names. So, for example, oldname1/ now becomes newname1/, and oldname1 is now a symlink to newname1/.

We found that now the mirroring cannot complete; rsync doesn't seem to be able to handle this. It tries to do an rmdir(oldname1) followed by symlink(newname1,oldname1). However, since the directory oldname1/ in the old mirror is not empty, rmdir fails with "Directory not empty" and thus symlink fails too with "File exists" (since oldname1 has not been deleted yet).

Any pointers? We looked at the available rsync options but have found no clue yet.

--
dave
Re: rsync-2.5.5 memory eater problem
I noticed this behavior earlier. Thanks for the patch.

eric

Ruediger Oertel wrote:
Hi, we ran into a little problem with rsync-2.5.5.

Setup: you run rsync-2.5.5 as a normal rsync over ssh (i.e. not connecting to an rsync server). If you start such an rsync but interrupt the pulling process with Ctrl-C, the process on the other side may start to allocate all memory on the remote machine.

As far as we have analyzed the problem, the remote rsync process wants to issue an error message that it could not write 4096 bytes, receives a SIGPIPE, and errno is set to EPIPE in the fwrite call (line 279 in log.c). Then exit_cleanup is called, which in turn will call the same routine again and will of course get EPIPE again, and the loop continues (as long as we have memory to recursively call the same functions again and again).

As a quick workaround we have the following patch, since there is no real point in trying to issue an error message if you have already received EPIPE on write:

--- log.c
+++ log.c	2002/10/08 11:55:52
@@ -276,7 +276,7 @@
 	if (!f) exit_cleanup(RERR_MESSAGEIO);
 
-	if (fwrite(buf, len, 1, f) != 1) exit_cleanup(RERR_MESSAGEIO);
+	if (fwrite(buf, len, 1, f) != 1 && errno != EPIPE) exit_cleanup(RERR_MESSAGEIO);
 
 	if (buf[len-1] == '\r' || buf[len-1] == '\n') fflush(f);
 }

--
with kind regards (mit freundlichem Grinsen),
Ruediger Oertel ([EMAIL PROTECTED],[EMAIL PROTECTED])
--- This machine does not crash randomly. It only schedules a hangup when deemed appropriate.
still seeing hangs -- strace
I still see 2.5.5 hangs where the destination rsync exits (for whatever reason -- I don't see any errors) and the source waits on something forever. The interesting thing is that when I strace the child rsync pid on the source side, rsync exits as it should. Strace must introduce a timing delay of some sort that allows one of the processes to properly detect the end of the channel. I forgot to do a netstat on the socket this time -- I'll do it next time. I see this hang 3-4 times each week.

STRACE OF THE PARENT: (does this forever)
wait4(7259, 0xbc38, WNOHANG, NULL) = 0
gettimeofday({1028901467, 81808}, NULL) = 0
gettimeofday({1028901467, 81976}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 2}) = 0 (Timeout)
gettimeofday({1028901467, 101509}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
gettimeofday({1028901467, 111529}, NULL) = 0
wait4(7259, 0xbc38, WNOHANG, NULL) = 0
gettimeofday({1028901467, 111881}, NULL) = 0
gettimeofday({1028901467, 112936}, NULL) = 0
select(0, NULL, NULL, NULL, {0, 2}) = 0 (Timeout)
gettimeofday({1028901467, 132002}, NULL) = 0
wait4(7259, 0xbc38, WNOHANG, NULL) = 0
gettimeofday({1028901467, 132362}, NULL) = 0

STRACE OF CHILD: this causes rsync to exit properly
---
select(7, [3 4], [], NULL, NULL) = 1 (in [4])
read(4, "", 16384) = 0
close(4) = 0
select(7, [3], [3], NULL, NULL) = 1 (out [3])
write(3, "\222\265\316\340q\310\35\244\233T\33\203\5\247~v\246\5"..., 32) = 32
select(7, [3], [], NULL, NULL) = 1 (in [3])
read(3, "4\217\321\351`\230\t\32\nx\2150\246\2\352Z\255\33?\314"..., 8192) = 96
close(6) = 0
select(7, [3], [3], NULL, NULL) = 1 (out [3])
write(3, "\361RK\276\257\23u\2265JmI\272(|4\346b.\2760@k\237-_x\252"..., 32) = 32
ioctl(0, TCGETS, 0xba08) = -1 EINVAL (Invalid argument)
fcntl64(0, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(0, F_SETFL, O_RDWR) = 0
ioctl(1, TCGETS, 0xba08) = -1 EINVAL (Invalid argument)
fcntl64(1, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(1, F_SETFL, O_RDWR) = 0
ioctl(2, TCGETS, 0xba08) = -1 ENOTTY (Inappropriate ioctl for device)
fcntl64(2, F_GETFL) = 0x8801 (flags O_WRONLY|O_NONBLOCK|O_LARGEFILE)
fcntl64(2, F_SETFL, O_WRONLY|O_LARGEFILE) = 0
gettimeofday({1028901480, 966810}, NULL) = 0
shutdown(3, 2 /* send and receive */) = 0
close(3) = 0
_exit(0) = ?

rsync version 2.5.5 protocol version 26
Copyright (C) 1996-2002 by Andrew Tridgell and others
http://rsync.samba.org/
Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles, IPv6, 64-bit system inums, 64-bit internal inums
Re: Linux and Solaris performance
Dave North wrote:
We have an interesting quandary here. When I'm rsync'ing my directory tree (100 directories, each containing 1000 files) I see some strange results. All of these machines are on their own network segment (100t) in our QA lab:

Solaris-Solaris: time taken 11m32s
Solaris-RH Linux 7.2: time taken 206s
RH Linux-RH Linux: time taken 1m59s

In each instance I have rsync running as a server on the target machine (Linux in each case). I just can't explain why the performance is just so AMAZING on Linux. Any ideas?

You did not mention what hardware you are running. If you are running Ultra-10 boxes, then they have a slow IDE hard disk controller chipset (CMD640, I think). If your linux box has a faster IDE/SCSI controller (anything that runs faster than 12.5 Mbytes/s), then that would explain some of the difference. For the linux-linux rsync I'd guess you are seeing the network as the bottleneck; for solaris-solaris I'd guess you are seeing the disk (or some other HW) as the bottleneck. Of course, if your solaris HW is a V880 then I'm out to lunch on my guess.

eric
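A quick way to check the disk-vs-network question raised above is a dd throughput test. The file size here is tiny so the sketch runs instantly; for a meaningful number you would use a file larger than RAM so the page cache cannot hide the disk speed.

```shell
#!/bin/sh
TMP=$(mktemp -d)
# Write test -- dd prints a bytes/sec summary line on stderr.
dd if=/dev/zero of="$TMP/io_test" bs=1024 count=256 2>&1 | tail -1
# Read test.
dd if="$TMP/io_test" of=/dev/null bs=1024 2>&1 | tail -1
```

Compare the reported rates against the network (100BaseT tops out around 12.5 Mbytes/s) to see which side is the bottleneck.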
Re: bug report: errors in file transfer
Yes, you should be concerned about this problem. I suggest these things:

1. Try running without -z. Some versions of rsync (2.5.4?) had a libz bug.
2. Better yet, upgrade both sides to 2.5.5 and retry.
3. Make sure your solaris box has the latest NFS patches.

eric

Michael Lachmann wrote:
Hi! I just encountered a serious problem with rsync. I used rsync to copy a big directory between two computers. The source machine was a sun, the destination was a linux box. The destination directory did not exist before the copy started. I used the following command to copy the directory over:

rsync -aSz source_machine:projects/ projects2

Because I was somewhat paranoid, I checked whether all files were transferred correctly using the following command, issued on both the source and the destination machines:

find . -not -type d -exec cksum \{\} \; | sort +2 > /tmp/w

Then I used 'diff -b' to compare the two files. It turned out that some of the files transferred had a different checksum. When I manually checked whether the files indeed differed, I found that they were not transferred correctly! This means that rsync did not transfer all the files correctly. The file sizes were correct, but their contents differed.

I then tried to overcome the problem using rsync -caSz instead. This gave only one file that had mistakes. Then I tried rsync -az ... which still had errors. With my current setup, this problem is repeatable. The files that have errors are usually big files.

Now, a bit about the setup: the source is a sun, with the partition mounted over NFS from a disk-server. The destination is a linux box, with a partition of type ext3 on LVM, debian sid.
source rsync version is: --- rsync version 2.5.2 protocol version 26 Copyright (C) 1996-2002 by Andrew Tridgell and others http://rsync.samba.org/ Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles, no IPv6, 64-bit system inums, 64-bit internal inums -- destination rsync: -- rsync version 2.5.4 protocol version 26 Copyright (C) 1996-2002 by Andrew Tridgell and others http://rsync.samba.org/ Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles, IPv6, 64-bit system inums, 64-bit internal inums -- Since I am using rsync for most of my backups and file transfers, I am VERY worried about this problem. Am I doing anything wrong? Michael
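The find/cksum/diff verification Michael describes can be wrapped in a small function. A sketch (the function name and the /tmp manifest paths are my own choices; both trees are assumed reachable as local paths):

```shell
# Compare checksum manifests of two directory trees, as in the
# find/cksum/diff procedure above. cksum prints "crc size name", so
# sorting on field 3 lines the two manifests up by file name.
# Returns 0 if every file's CRC and size match, non-zero otherwise.
verify_tree() {
    src=$1; dst=$2
    ( cd "$src" && find . -type f -exec cksum {} \; | sort -k 3 ) > /tmp/src.sums
    ( cd "$dst" && find . -type f -exec cksum {} \; | sort -k 3 ) > /tmp/dst.sums
    diff /tmp/src.sums /tmp/dst.sums
}
```

Any files this reports can then be re-sent with rsync -c, which forces a full-file checksum comparison instead of trusting size and timestamp.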
Re: file size 0 or null content
I think I was one of the 3 people to see this problem. Our setup was a Solaris 8 client writing to a Network Appliance NFS server. We do our rsyncs direct to the box with the disk on it when we can, but in the case of the Netapp storage we have to use NFS. I guess I should be sure that the automatic --whole-file option is disabled when I'm writing to NFS disk like that -- but in the case where ssh is the transport, isn't the rolling checksum active even if the source/dest are distinct hosts and both running against NFS storage? If not then I need to disable that somehow. The --checksum flag cleaned up the file with nulls for me. eric Dave Dykstra wrote: You're the third person I've heard about lately saying they saw nulls in files copied by rsync, but the first on such an old version. What are the operating systems of the NFS client and server? The case I heard about before was a solaris client and sunos4 server and I suspected it was a problem with NFS. To recover from the situation you can add a --checksum option so rsync will detect the corrupted files; otherwise it skips over those that it has previously copied and that have matching timestamps and sizes. By the way, rsync is designed to work most efficiently when it operates on local disks, so if you can run rsync on the NFS server you'll probably be better off. When going over NFS as you are, it appears to rsync as if it is a local disk-to-disk copy and the most recent versions automatically add a --whole-file option to disable the rsync rolling checksum algorithm. - Dave Dykstra On Thu, May 16, 2002 at 10:51:35PM -0700, girish adiga wrote: Hi All, I am using rsync version 2.4.5 protocol version 24. Sometimes the file size will be zero or the file contains NULLs when using rsync. The command I have used is pasted below. I want to sync file c to my local m/c. cd /net/a/ ; for file in `ls */c`; do rsync -rR --delete $file Local Dir Path; done This is run every hour from a cron job to update file c in the local directory.
Is there anything wrong in the command? Any help would be greatly appreciated. Thanks, Girish
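Based on Dave's explanation, a safer invocation for NFS-backed copies keeps the delta algorithm on and adds a full-checksum pass. A hedged sketch (the paths are hypothetical, --no-whole-file assumes an rsync version that accepts the --no- prefix, and the command is only echoed here rather than run):

```shell
# --no-whole-file: undo the automatic --whole-file that rsync adds
# for apparently-local copies, keeping the rolling checksum active.
# -c: compare full-file checksums, catching already-corrupted files
# whose sizes and timestamps still match.
RSYNC_FLAGS="-a --no-whole-file -c"
echo "rsync $RSYNC_FLAGS /net/src/projects/ /net/dst/projects2"
```

As Dave notes, running rsync on the NFS server itself (so the copy goes over rsh/ssh instead of NFS) is the better fix when it is possible.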
Re: Block of unwanted zeros in a dest file
Dave Dykstra wrote: Eric, which version of the Sun operating system were you running, and was there any NFS involved? What's a NAS disk? Yes, NFS on both ends. Solaris 8 on the destination. Netapps NAS boxes on both ends. NAS -- Network Attached Storage. Netapps F720 NFS 'toaster' boxes. (lots of $$ but they mostly work well.) I'm using 2.5.5 now and am not aware of this problem happening again. I'll just consider it fixed. eric I had a user who reported having rsync (a CVS version between 2.5.4 and 2.5.5) produce a file filled with nulls, but he was copying from a Solaris client which was mounting over NFS from a Sunos4 server, and the files were not readable because he was running as root on the Solaris client and the filesystem was not exported from the Sunos4 server to allow root access. The problem went away when he used another operating system. We speculated it may have been that the Sunos4 NFS implementation wasn't returning the proper error code. I haven't seen anybody else report problems with rsync producing files of nulls, but it's pretty disconcerting. - Dave Dykstra On Mon, Mar 25, 2002 at 12:28:06PM -0700, Eric Whiting wrote: I'm syncing from a linux box (NAS disk) to a sun (NAS disk). I just found a file on the destination sun with zeros from bytes 8192 to 32767. (the source file had lots of 'good' random bytes). The rest of the file compares properly. Repeatedly running rsync to send the file didn't fix it. I ran a -c transfer (checksum) and it fixed the file. I'm using 2.5.3 on both sides with -z turned off. (I thought that was somewhat 'safe') Is this a known issue? I'll upgrade to 2.5.4 (with the fixed zlib) and see if it acts the same. eric
Re: memory requirements was RE: out of memory in build_hash_table
Granzow, Doug (NCI) wrote: From what I've observed by running top while rsync is running, its memory usage appears to grow gradually, not exponentially. An rsync may take several hours to complete. (I have one running now that started over four hours ago. The filesystem contains 236 GB of data in 2.4 million files. It is currently taking up 1351MB of memory on the mirror server and 646MB on the source server.) All filesystems are Veritas filesystems, in case that is relevant. I just killed a 750MB rsync (2.5.5) process. It had been running for over 12 hours (around 62GB in 1.2 million files). This was a local rsync without any ssh/remsh/rsyncd involved. I went back to the 'for file in *;do rsync blaa;done' method. This helps it run in a reasonable amount of time and RAM. I'd still like to be able to run a single rsync on the top level (without the for loop). eric
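The for-loop workaround Eric describes can be sketched as a small wrapper. The echo line is a hypothetical stand-in for the real rsync call, and the function name is mine:

```shell
# Sync a large tree one top-level directory at a time, so each rsync
# builds an in-memory file list for only that subtree and total RAM
# use stays bounded. A real child body would be something like:
#   rsync -a "$d" dest:/backup/
sync_per_dir() {
    top=$1
    for d in "$top"/*/ ; do
        [ -d "$d" ] || continue
        echo "syncing $d"    # placeholder for the rsync invocation
    done
}
```

One trade-off to note: with --delete, each per-directory run only sees its own subtree, so removals at the top level of the tree are never propagated.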
2.5.3pre compile warnings
/inftrees.c:404: warning: unused parameter `z' Dave Dykstra wrote: On Wed, Feb 06, 2002 at 04:29:49PM -0700, Eric Whiting wrote: Dave, I tried the snapshot... I get an error (after a ./configure;make). gcc -I. -I. -g -O2 -DHAVE_CONFIG_H -Wall -W -c main.c -o main.o main.c: In function `do_cmd': main.c:184: `RSYNC_RSH' undeclared (first use in this function) I added the #define to config.h and it compiled ok with a lot of warnings.. (SUSE 7.3) Martin forgot to run autoconf and autoheader. I just ran them and checked in the files, and they're in unpacked now. - Dave
Re: SIGUSR1 or SIGINT error
Make that 2 of us who need to specify a large timeout. I have found that I have to set the timeout to a large value (1) to get the rsyncs to run successfully. Leaving it at the default seemed to cause timeout/hang problems. Of course I'm still running a 2.4.6dev version. I had troubles with 2.5.[01]. (solaris/linux mix of rsync clients/servers) I need to try 2.5.2 as soon as I get a chance. Looks like some good fixes are happening in 2.5.2. eric On Wed, 2002-02-06 at 10:39, [EMAIL PROTECTED] wrote: When I was getting these, I traced the process and its children (solaris: truss -f). I found that one of the spawned threads was experiencing an io timeout while the filelist was building. I had set no timeout, but it did it at 60 seconds every time. I found that this corresponded to a SELECT_TIMEOUT parameter, which was set to 60 if IO_TIMEOUT was 0. By setting my timeout to 86400 (1 day), I stopped those. Of course, then, it choked farther along, but that's another story. Try setting a timeout, even if you don't want one. Make it the longest the process should ever take. Tim Conway [EMAIL PROTECTED] 303.682.4917 Philips Semiconductor - Longmont TC 1880 Industrial Circle, Suite D Longmont, CO 80501 Available via SameTime Connect within Philips, n9hmg on AIM perl -e 'print pack(, 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), .\n ' There are some who call me Tim? Dave Dykstra [EMAIL PROTECTED] Sent by: [EMAIL PROTECTED] 02/06/2002 10:16 AM To: David Birnbaum [EMAIL PROTECTED] cc: [EMAIL PROTECTED] (bcc: Tim Conway/LMT/SC/PHILIPS) Subject: Re: SIGUSR1 or SIGINT error Classification: On Tue, Feb 05, 2002 at 11:28:54AM -0500, David Birnbaum wrote: I suspected that might be the case...now...how to determine the real problem? Does rsync log it somewhere? lsof shows that STDERR/STDOUT are going to /dev/null, so I hope it's not writing it there.
Nothing informative in syslog, just the message about the SIG: Feb 5 09:49:41 hite rsyncd[9279]: [ID 702911 daemon.warning] rsync error: received SIGUSR1 or SIGINT (code 20) at rsync.c(229) Any clues? I'm sorry, but I don't have any more suggestions. - Dave Dykstra
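Tim's advice above — always set an explicit timeout, sized to the longest the job should ever take — can be folded into a small retry wrapper. The wrapper is my sketch; the 86400-second value is the one-day timeout from his message, and the rsync command in the comment is hypothetical:

```shell
# Run a command, retrying once if it fails (rsync exits non-zero on
# a timeout). Intended use, per the advice above:
#   run_with_retry rsync --timeout=86400 -a src/ host::module
run_with_retry() {
    tries=0
    while [ "$tries" -lt 2 ]; do
        "$@" && return 0
        tries=$((tries + 1))
    done
    return 1
}
```

Retrying covers the "choked farther along" case: a second pass only has to transfer whatever the first pass missed.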
Re: SIGUSR1 or SIGINT error
Dave, I tried the snapshot... I get an error (after a ./configure;make). gcc -I. -I. -g -O2 -DHAVE_CONFIG_H -Wall -W -c main.c -o main.o main.c: In function `do_cmd': main.c:184: `RSYNC_RSH' undeclared (first use in this function) I added the #define to config.h and it compiled ok with a lot of warnings.. (SUSE 7.3) I'll try it out. eric On Wed, 2002-02-06 at 15:41, Dave Dykstra wrote: Looks like a fix for that went into 2.5.0. See revision 1.87 at http://cvs.samba.org/cgi-bin/cvsweb/rsync/io.c Tim David, what version are you running? 2.5.2 has some serious problems, Eric. Try the latest development snapshot at rsync://rsync.samba.org/ftp/unpacked/rsync/ or ftp://rsync.samba.org/pub/unpacked/rsync/ - Dave Dykstra
Re: rsync w/ openssh hangs on linux
It seems similar to Ed Santiago's problem. http://lists.samba.org/pipermail/rsync/2001-December/005511.html This also seems similar to what I reported last month. http://lists.samba.org/pipermail/rsync/2001-December/005628.html I had to go back to 2.4.6dev+waynenohang. That works for me. I run solaris/linux hosts. I'll try 2.5.1 again, but I'm pretty sure there is still a timing/IPC issue. eric Dave Dykstra wrote: On Fri, Jan 04, 2002 at 11:29:13AM -0800, J Robinson wrote: Hello All: I'm having repeatable problems copying files between linux machines using rsync-2.5.1 and various recent flavors of openssh. The symptom when testing with a single 2M file is that rsync hangs when the file is 97% transferred. It seems from looking at various archives that this problem may be non-trivial. I tried applying the patch at http://www.clari.net/~wayne/rsync-nohang.patch to rsync-2.5.1, (even though it's for 2.5.0) but it didn't seem to make things any better. I'm wondering if there are any tips to get around this problem or ??? I don't believe anybody else has reported a problem quite like that lately. First, give us the output of netstat -a on both sides for the relevant connection so we can see if there is data in a send queue somewhere that's not being sent; that could indicate a TCP implementation problem which rsync has a great tendency to expose because of the way it stresses TCP. Please try to give us as much information as you can so somebody else could possibly repeat it, including exact versions of all software involved on both sides, the speed of your network, and the exact command you're using. - Dave Dykstra
Re: rsync 2.5.0 bit length overflow
Martin Pool wrote: On 30 Nov 2001, Thomas J Pinkl [EMAIL PROTECTED] wrote: I'm seeing: bit length overflow code 4 bits 6-7 in the output of rsync 2.5.0 between two Red Hat Linux systems. One is RH 6.1 (kernel 2.2.19-6.2.1, glibc 2.1.3-22), the other is RH 7.2 (kernel 2.4.9-13, glibc 2.2.4-19). Both systems have rsync 2.5.0. On the RH 6.1 box, I run this command: rsync -v --stats -az -e rsh -H --exclude-from FILE /some/dir/. rh72:/some/dir/. Anyone have an idea of what this means? That comes from zlib. It seems like a harmless warning. Is it reproducible? Are the file trees transferred correctly? I see a similar warning every night. In my case the src is 2.5.0 and dest is 2.4.6pre1. I'll upgrade the dest and see if it still happens. It only happens on certain dirs -- I'm guessing dirs with a large number of files, but I'm not sure yet... eric bash$ grep 'bit length overflow' */*errors | wc 1274 3822 54680
bit length overflow
2.5.0 -- thanks for doing the new release with all the fixes. It appears to be working fine. I'm seeing a 'bit length overflow' warning to STDERR. (with -vv) It doesn't appear to be an error -- zlib/trees.c seems to indicate that this is a situation that is properly handled. -vv in older versions (2.4.7pre1) didn't seem to kick this error out. Anyone else seeing this? Is this just a case of increased verbosity? eric
Re: my rsyncs *do* exit, if I don't use -v
What is holding up 2.4.7? 2.4.6 frequently has these hang problems that are fixed in the cvs tree. But linux distros keep shipping 2.4.6 and users keep having troubles. I think we need to release an rsync 2.4.7. eric Jessica Koeppel wrote: If I use -v or -vv, the rsyncs will hang forever (or at least a few days, I haven't let them run any longer than that, so I'm not sure if they'd eventually exit on their own or not). The information on rsync and OS version was sent in my previous email. For now I'm just going to run without -v, as I need to get this going and don't have time to get intimately involved with rsync. But if someone's heard of this before and/or is working on this bug (I assume it's a bug?), and you need more info from me, let me know if there's anything I can do. --jessica
Re: Rsync Throughput issues
Ben Ricker wrote: I am using rsync between a Redhat Linux box and an AIX RS/6000. We have about 30GB of database we need to sync to a backup server. Sounds good, right? The problem is that rsync is so slow when we do the initial dump. We have files that are 1-5GB. It takes around 14-20 hours to rsync the file structure, which means our daily backup will miss that day's data because we are waiting for the data to rsync. How many files are in the tree? Are all the files 1-5G? Are the source files on new/fast disk? For older systems, the 100Mbit network is often faster than disk. I have a 40G tree with 1 million files in it. I just killed the tree on a development box and restarted an rsync to it. I think it might take 10+ hours mainly because of how many files are in the tree -- not so much because of the total size. I'll let you know. I know it takes about 2 hours just to verify that all 1 million files are in sync even when there are no changes to the tree. At first we were using ssh as the agent but we thought that might be slowing us down. We then tried to use rsync as a server and push the data to the module/zone of the server. Consistently, it takes roughly 2 minutes to transfer 10MB of data. That seems way too long. Sounds like a reverse dns lookup failure. Long-shot debug question -- do you have dns/nis/files all working properly? Sometimes reverse dns lookups can take a few minutes to fail or time out and thus affect the overall speed of transfers. Repeated reverse dns failures can add up fast. I thought it was a network problem so we tried ftp. A 10MB file took 90% of the time (about 20 seconds). This is on a 100Mbit full-duplex switched network with both boxes on the SAME subnet. I think you still have other problems. An ftp of a 10MB file takes my boxes about 1s on a 100Mbit wire. Be sure to try ftp both get and put. Oftentimes a duplex mismatch or speed problem will only affect traffic in one direction and not the other.
I think you might have some network issues. Do a ifconfig on the linux box to see what sort of errors and collisions you are seeing. Am I missing something? Should I expect these speeds? Could there be something on the App switch throttling rsync for some reason? Any thought will help, Ben Ricker System Administrator Wellinx.com
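The reverse-DNS theory above is easy to test directly by timing a lookup. A sketch (the function name is mine, and getent is assumed available, as it is on Linux and Solaris):

```shell
# Time a host lookup in whole seconds. A healthy resolver answers in
# well under a second; multi-second results point at the dns/nis
# misconfiguration suspected above.
lookup_secs() {
    start=$(date +%s)
    # A failed lookup is fine here; we only care about elapsed time.
    getent hosts "$1" > /dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}
```

Running this on the rsync server with the client's IP (and vice versa) shows whether each connection is paying a resolver timeout before any data moves.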
Re: new rsync feature
I'd like to have that feature available. eric Martin Pool wrote: i didn't see that there was a way to tell rsync to change ownership and group assignment on all files transferred to a specific user. meaning transfer all these files to this remote machine and then chown them to 'doug', no matter who owns them here on the local machine. i need this, so i added this feature in my local tree. seems to work ok on Solaris and Mac OS X. the code was good and easy to modify (yay). You might as well send the patch through to the list. Please send it in context diff format (-c). Does anybody else want this or want it not to go in? -- Martin
Re: unexpected EOF in read_timeout
Michelle/Tim, I occasionally get this message under 2.4.7pre1 -- I just call it a network error and ignore it, but perhaps there is still an rsync protocol problem. TCP is pretty good about retry/resend under bad network conditions. Network hiccups shouldn't really be hitting us this often or this severely, right? Is the receiving side giving up or hanging up too soon? One good thing about this error -- at least it isn't hanging forever like 2.4.6 would often do. my errors look like this: rsync: connection to server unexpectedly closed (69872 bytes read so far) rsync error: error in rsync protocol data stream (code 12) at io.c(255) eric willem wrote: [EMAIL PROTECTED] wrote: Michelle: I get the same sort of results. Solaris 7, rsync 2.4.6, with the nohang patches, or with 2.4.7pre1. If you trace the execution, you'll probably find i/o errors. I can't find out what they are, and would expect the kernel to take care of any disk or network errors, leaving me wondering what rsync is choking on. We've got all the latest patches to Solaris, so I'm pretty sure it's not a case of rsync overstressing the tcp implementation, especially as I bwlimit the transfer to 1000kbps. I've got 2328543 objects (files, directories, symlinks) in 116473553K, on NAS at each end, reached by NFS. I have to break my jobs up into smaller chunks, which makes it hard to have it delete things. Anybody else have any ideas? Tim Conway
Michelle Brownsworth [EMAIL PROTECTED]@lists.samba.org on 09/10/2001 06:34:16 PM Sent by: [EMAIL PROTECTED] To: [EMAIL PROTECTED] cc: (bcc: Tim Conway/LMT/SC/PHILIPS) Subject: RE: unexpected EOF in read_timeout Classification: Sadly, I don't seem to be having any better luck with 2.4.7pre1: su-2.05# ./rsync_nova.sh receiving file list ... done wrote 16 bytes read 6620 bytes 13272.00 bytes/sec total size is 1309032 speedup is 197.26 receiving file list ... done wrote 16 bytes read 4458 bytes 2982.67 bytes/sec total size is 6037628 speedup is 1349.49 receiving file list ... done wrote 16 bytes read 450 bytes 932.00 bytes/sec total size is 34693 speedup is 74.45 receiving file list ... done mysql/kendall/car.MYD Connection to nova closed by remote host. rsync: connection to server unexpectedly closed (26372 bytes read so far) rsync error: received SIGUSR1 or SIGINT (code 20) at rsync.c(229) rsync error: received SIGUSR1 or SIGINT (code 20) at main.c(693) - Michelle Brownsworth System Administrator PrimeLogic Communications http://www.primelogic.com Sorry to get your hopes up, but I've no ideas. I'm wondering how you limit your bandwidth use under solaris; I want to do that for an ftp connection. Can you help me with that? -- __ Eric T. Whiting AMI Semiconductors
Re: rsync 2.4.6 hangs.
There are 2 patches on this mailing list that fix the 2.4.6 hangs for me. Check the archives and you should be able to find them. I hope to see a 2.4.7 out soon that includes some or all of those hang fixes. eric Steve Ladendorf wrote: I see there is quite a bit of discussion on the list about rsync hanging on a variety of conditions, one of which I seem to be having. We use rsync to update remote servers over night. Each system syncs with a master server holding around 4.0 gig of data. Normally there is very little change on the master so rsync only updates a few files each night. My problem is that I am now upgrading the servers and need to sync all the data to the new systems and rsync is hanging like crazy. Sometimes it moves a few hundred meg and sometimes it only moves one or two files and then hangs. My master server is RedHat 6.2 running rsync 2.4.6 (from source - no RPM). The remote server is RedHat 7.0 with rsync 2.4.6 (from source - no RPM). Things were actually better when the master was running rsync 2.4.3 but it still hung. The remote server is a PIII 850 with 512 megs of RAM so I doubt this is a memory problem. Even at 100 bytes per file it would not consume all the memory on the remote system. The master only has 112 Meg, but I assume the file list is kept on the remote and not the master. My command is rsync 192.168.0.12:/home/image /home/image -e ssh2 -avz --delete Is there something I'm doing wrong? What can I do to prevent rsync from hanging all the time? I've tried removing the --delete and -z options but it still hangs. Any help would be greatly appreciated. --- Steve Ladendorf Network Manager sladendorf at blakeschool.org The Blake School "As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." --Albert Einstein
Re: Don't use -v option with 2.4.6
John, Wayne's fixes solved 2.4.6 hangs I had been seeing for a long time. Actually I've been running 2.3.2 because the 2.4.6 hangs were killing me... Now I'm slowly moving all my apps up to the 2.4.6+wayne patch version of rsync. I hope his patch (or something similar) gets included in the official rsync tree. Give it a try -- Thanks to Wayne. eric Wayne Davison wrote: Remi Laporte wrote: Rsync 2.4.6 -v option is bugged and cause hangs of the transfers, so if you have such problem, first think to remove the -v. On Thu, 7 Jun 2001, John E. Mayorga wrote: Can anyone else confirm this? My recently posted anti-hang patch should hopefully fix this for you. The latest change I made was explicitly to deal with the hang that comes at the end of the data transfer when all the verbose directory-tweaking messages caused the generator process to hang because the sender process wasn't reading the error channel. You can get the patch from here to try it out: http://www.clari.net/~wayne/rsync-nohang.patch ..wayne.. -- __ Eric T. Whiting AMI Semiconductors
Re: FW: Problem with large include files
I still have been unable to move to 2.4.6 for a similar reason -- hangs. I haven't ever detected the #files in a dir issue, but I do still see hang problems in 2.4.6. (both solaris and linux) I see this on localhost to localhost rsync's as well as rsyncs over ssh. eric [EMAIL PROTECTED] wrote: I've seen a problem similar to the include file, but with just running rsync in the following mode copying from one directory to another; rsync -avo --delete --stats /dir1 /dir2 I was using version 2.3.1 for the longest time with no problem, and I just recently moved to 2.4.6. If my directories have more then 61,000 files in them, the process just hangs. Now, in order to use 2.4.6, I must use a script that chops the update process into smaller then 60,000 file chunks. Tim == Tim W. Renwick mailto:[EMAIL PROTECTED] | Put me on the highway Philips Semiconductors (408)474-5370 | and show me a sign 1109 McKay Drive, M/S 41 Fax (408)474-5252 | and take it to the San Jose, CA 95131 SERI trenwick@usvlsjs1 | limit one more time! - Eagles [EMAIL PROTECTED]@[EMAIL PROTECTED] on 05/15/2001 10:36:47 Sent by:[EMAIL PROTECTED] To: [EMAIL PROTECTED]@SMTP cc: [EMAIL PROTECTED]@SMTP Subject:RE: FW: Problem with large include files Classification: How do I go about registering this bug with the include file. It would be good get this bug fixed as I would like to be able to back to 2.4.6 (or whatever) as it is faster and it has bandwidth limiting. Will let you know the results of the testing. Cheers Mark -Original Message- From: Dave Dykstra [mailto:[EMAIL PROTECTED]] Sent: Wednesday, 16 May 2001 01:24 To: Wilson, Mark - MST Cc: RSync List (E-mail) Subject: Re: FW: Problem with large include files On Tue, May 15, 2001 at 03:31:23PM +1200, Wilson, Mark - MST wrote: ... Do you have any idea what the maximum number of files you can have in an include file is (for the current version)? No, I don't. It probably depends on a lot of variables. How do you want your test on 2.3.2 done? 
ie LAN or high speed WAN, numbers of files, sizes of files, things to time, daemon v rsh. What I'd like to see is a case that might make the biggest difference with and without the optimization: - probably use the LAN - the largest number of files that you can get to work - small files - time the whole run with the time command, CPU time and elapsed time - I don't know about daemon vs rsh, but the daemon leaves the most under rsync's control so that may be preferable - Dave Dykstra
Re: rsync across nfs
I have a similar setup and goal as Tim. As another approach to the problem I've been pushing the syncs out to the destinations rather than having the destinations pull the data. I have 2G RAM on the source box (Solaris with a netapps disk) and I push the data via rsync/ssh to destinations in parallel with a simple perl script that forks a child per destination. (and loops over a few top level dirs) That doesn't really relate to Dan's question does it?? Oh well, I guess my opinion is that if you can rsync using rsh/ssh (push or pull) you will be much happier than if you have to use NFS. eric [EMAIL PROTECTED] wrote: One thing you can do to decrease your load is to add the -W option. If you're reading it via nfs to do the checksumming, you have to read the entire file anyway, so you might as well just move the entire file, instead of wasting processor power and reading the entire file twice (or more, actually). The single-processor bsd machine would max out at one transfer at a time, probably. Using it as rsyncd, though, gives you the advantage of letting you use the "max connections" option, and having the individual machines retry until they succeed, thus controlling the load on the solaris machine. I am developing a similar solution for our system, where we have a single master copy of a set of tools, with identical copies all over the world that must be kept up to date. Tim Conway [EMAIL PROTECTED] 303.682.4917 Philips Semiconductor - Colorado TC 1880 Industrial Circle Suite D Longmont, CO 80501 [EMAIL PROTECTED]@[EMAIL PROTECTED] on 04/11/2001 03:08:28 PM Sent by: [EMAIL PROTECTED] To: [EMAIL PROTECTED]@SMTP cc: Subject: rsync across nfs Classification: I currently use rsync across an nfs mount. This nfs server is a sparc solaris machine mounting to approx 30 freebsd and 10 linux machines.
When a typical rsync occurs to replicate data across all these machines, they all rsync: /usr/local/bin/rsync -az --delete --force /home/cvs/website/config /website where /home/cvs is an nfs mount and /website is just the local drive. Problem is they all hit the solaris box at once, driving it to load averages as high as 75 for the 10-20 seconds that this occurs. My question is: would I be better off taking a single-processor freebsd machine running an rsync server socket type deal, getting all 40 machines or so to connect this way, or would that be worse? I like having all my machines update at once; I'm just looking for an efficient way such that a) I could even dedicate 1 box as just an rsync server, and b) a single-processor machine could actually handle that kind of load. My guess is I would probably have to stripe some drives together, as IO may be a problem with that many webservers connecting at once? Or will cpu be more of a factor? I have considered changing rsync times around so that they don't all connect to the same server at once... this is done from crontab btw, but I like the way it currently is. Any suggestions would be much appreciated. -- Dan +--+ | BRAVENET WEB SERVICES | | [EMAIL PROTECTED] | | make installworld | | ln -s /var/qmail/bin/sendmail /usr/sbin/sendmail | | ln -s /var/qmail/bin/newaliases /usr/sbin/newaliases | +__+
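The push-in-parallel scheme Eric describes (his was a perl script forking one child per destination) can be sketched in shell. The destination names and the echo placeholder are hypothetical:

```shell
# Fork one background child per destination and wait for all of them
# to finish. A real child body would be an rsync over ssh, e.g.:
#   rsync -az --delete /home/cvs/website/config "$dest":/website/
push_all() {
    for dest in "$@"; do
        (
            echo "pushed to $dest"   # placeholder for the rsync call
        ) &
    done
    wait
}
```

Pushing spreads the fan-out cost over time under the source box's control, which addresses Dan's load spike; the rsyncd "max connections" option Tim mentions is the equivalent throttle for the pull direction.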
Re: I also am getting hang/timeout using rsync 2.4.6 --daemon
I see the same problem on a linux box running without any network involved: rsync -av dir1 dir2 (where dir[12] are both on local ide disks). I posted a few weeks ago regarding my 2.4.6 hang (not running --daemon): http://lists.samba.org/pipermail/rsync/2001-January/003552.html. I found two 'fixes': 1. run without the -v, 2. or cntl-c and rerun the command to clean up the rest of the tree. I still do a lot of solaris/linux rsyncs but I'm still using 2.3.2 on those. eric Scott Russell wrote: All - I understand the focus of the discussion below is Linux - Solaris time outs, but is anyone else seeing the same problem under Linux - Linux when running 2.4.6 rsync --daemon? Currently I'm seeing it from the client end. Both of the servers I'm pulling from were updated to 2.4.6 and then I started seeing problems. Since I don't have visibility to the server side there isn't too much else I can say for sure. On my end (client side) the system is Red Hat 6.0 + errata and a 2.2.18 kernel using rsync 2.4.6. In my logs I see "unexpected EOF in read_timeout" usually right after the file list is received. Running rsync with -v -v -v doesn't show anything special about the problem. -- Scott On Tue, Feb 27, 2001 at 01:29:45PM -0600, Dave Dykstra wrote: Re: http://lists.samba.org/pipermail/rsync/2001-February/003628.html Some more email was exchanged the last couple of days on the subject of a TCP hang between Solaris and Linux: Date: Mon, 26 Feb 2001 12:06:08 -0600 From: Dave Dykstra [EMAIL PROTECTED] To: "David S.
Miller" [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: Linux 2.2.16 through 2.2.18preX TCP hang bug triggered by rsync

David, Regarding the patch you sent us (below) to try to help solve the problem of Solaris sending ACKs with a sequence number that was out of window: we have now completed extensive testing with it and analysis of tcpdumps, and have determined that the patch is working as expected (accepting the ACKs) but it isn't enough to work around the state that Solaris gets itself into; the connection still hangs. It looks like Alexey was right. Linux is able to make further progress getting data sent to Solaris, but it isn't enough to recover the whole session; the Linux receive window stays at 0, so I presume the rsync application isn't reading the data because it's waiting for the Solaris side to complete something.

Oddly, every 30 or 60 seconds after this situation occurs, Linux sends another 2-4 1460-byte packets and they're acknowledged by Solaris. It seems unlikely that the rsync application would be sending exact multiples of 1460, but I didn't do a trace during the hang to see if it was generating extra data for some reason. I have attached the tcpdump in case you're interested. Recall that 'static' is Linux and 'dynamic' is Solaris. We have added our interpretation on some of the lines.

We also have had an initial response from Sun, where they recommended upgrading with a certain patch, but that too hasn't solved the problem (the attached tcpdump is with the Solaris patch in place). Thanks for your help, and I'll let you know if we ever get a satisfactory answer from Sun. - Dave Dykstra

-- Regards, Scott Russell ([EMAIL PROTECTED]) Linux Technology Center, System Admin, RHCE. T/L 441-9289 / External 919-543-9289 http://bzimage.raleigh.ibm.com/webcam

-- Eric T. Whiting, AMI Semiconductors, 2300 Buckskin Road, Pocatello, ID 83201, (208) 234-6717, (208) 234-6659 (fax), [EMAIL PROTECTED]
Re: Problem With Rsync and ssh2
You should be able to find 3 rsync processes: 1 on the source box and 2 on the destination. Do a truss on the pid of the 'hung' processes and report back what state they are in. Also, a netstat -na showing the queues on the socket (port 22) might be helpful. What version of rsync? What version of ssh? eric

Ed Young wrote: Hi, Does anyone have any idea why rsync makes this file *.pq on the client? When I run this command: /opt/bin/rsync -avz -e /usr/local/bin/ssh allegan:/var/data /var it creates the following: receiving file list ... done data/ldm/ data/xcd/ data/ldm/ldm.pq And then just hangs. The OS on the server is Solaris 2.8 and the client is Solaris 2.7. Thanks, Ed Young Senior Systems Administrator, Department of Geography and Earth Resources, College of Natural Resources, Utah State University, Logan, Utah 84322-5240. Phone: (435) 797-7133 Fax: (435) 797-4048 Email: [EMAIL PROTECTED]
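The inspection steps above can be sketched as a short command sequence. `truss` is Solaris-specific (on Linux the rough equivalents are `strace -p` and `ss`); since there is no hung rsync in this demo, our own pid (`$$`) stands in and the Solaris-only commands are left as comments:

```shell
#!/bin/sh
# Sketch of the hung-rsync triage steps (Solaris syntax in the comments).
PID=$$    # in real use: the pid of the hung rsync, from ps
# 1. Find the rsync processes: expect 1 on the source box, 2 on the dest.
ps -ef | grep '[r]sync' || echo "no rsync processes running"
# 2. Which syscall is the hung process blocked in?
#      truss -p $PID            (Linux: strace -p $PID)
# 3. Send/recv queue depths on the ssh socket:
#      netstat -na | grep '\.22 '
echo "would truss pid $PID"
```

The `grep '[r]sync'` trick keeps the grep process itself out of the listing, so you only see real rsync processes.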
Re: rsync 2.4.6 hangs in waitid() on Solaris 2.6 system
On Wed, Jan 24, 2001 at 11:59:18AM -0500, John Stoffel wrote: Hi all, This is a followup to bug report 2779 on the rsync bug tracking web site. I'm also seeing the hang in waitid() on the master process when trying to do an rsync on a single host. snip Thanks, John John Stoffel - Senior Unix Systems Administrator - Lucent Technologies [EMAIL PROTECTED] - http://www.lucent.com - 978-952-7548

Other people have reported similar experiences, but nobody has pointed to a problem in rsync; the problem is more likely to be in NFS on the NetApp or Solaris machines. I believe most NFS traffic goes over UDP, but do you happen to know if it is using TCP? We have seen many problems with TCP connections when rsync is communicating between two different machines.

Here is some more data. Different setup, yet similar end result -- but no answers. Once again I saw a very similar hang last weekend. I put a new HD in my home PC and was rsyncing from one IBM IDE disk to a second IBM IDE hard disk. Linux 2.2.18, rsync 2.4.6. After moving about 6G of files, rsync would stop. I checked the /proc/pid(s)/fd dir and didn't see any open 'real' files (like I do when rsync is actually moving data). I Ctrl-C'd the rsync, ran the same cmd again, and it would finish up the job. I love that about rsync. I killed the whole dest directory tree and ran it again (for testing). It still hung. Same file it hung on, and same resolution. I'll run it again and provide better details. I realize that rsync has sometimes taken unfair blame for TCP bugs, NFS bugs, ssh bugs, rsh bugs, OS bugs, etc. -- but I still think there might be something that can be improved -- either a hard-to-find problem or a different way to handle an infrequent exception. Is there an int64 problem here? A compiler mess-up? A cast from int64 to int that confuses something? I was not using -W, just rsync -av path1 path2 (no :'s in any path).
eric

Try using "-W" to disable the rsync rolling checksum algorithm when copying between two NFS mounts, because that causes extra NFS traffic. Rsync's algorithm is optimized for minimizing network traffic between its two halves at the expense of extra local access, and in your case the "network" is between processes on the same machine and the "local" is over a network. - Dave Dykstra
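A minimal illustration of the suggestion, using throwaway local directories (on the real setup both ends would be NFS mounts; the fallback `cp` branch is only there so the sketch runs on a box without rsync installed):

```shell
#!/bin/sh
# -aW = archive + --whole-file: skip the delta algorithm entirely, so the
# existing destination copy is never read back just to compute block
# checksums -- a win when "local" disk access is really NFS traffic.
SRC=`mktemp -d`; DST=`mktemp -d`
echo payload > "$SRC/f"
if command -v rsync >/dev/null 2>&1; then
    rsync -aW "$SRC/" "$DST/"
else
    cp "$SRC/f" "$DST/f"    # stand-in for the demo only
fi
cat "$DST/f"
```

With --whole-file, any file whose timestamp/size differ is simply sent in full, which trades network bytes (irrelevant here) for far fewer NFS reads.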
Re: OpenSSH hang (was Re: I also am getting hang/timeout using rsync 2.4.6--daemon)
More info/questions: (hung rsync is 17407)

# /usr/proc/bin/pstack 17407
17407: /usr/local/bin/rsync -rptgoD --partial --delete-after -vv --delete -e
 ff215d08 poll         (ffbefad0, 0, 14)
 ff1ce6d4 select       (ffbefad0, ff2369f4, ff2369f4, ff2369f4, ff2369f4, 0) + 298
 0001b568 msleep       (14, ffbefc54, 40, 0, ff233968, 1b638) + 78
 0001b620 wait_process (4400, ffbefc54, 1, 49800, 52d10, 1) + c
 0001c694 client_run   (7, 4, 4400, 1, ffbefdf0, ffbefcc8) + 114
 0001cb0c start_client (1, ffbefdf0, 1, 4f400, ff233968, ff1a01e0) + 314
 0001cd54 main         (2, ffbefdf0, ffbefdfc, 4f338, 0, 0) + 1e8
 000160c8 _start       (0, 0, 0, 0, 0, 0) + 5c

# /usr/proc/bin/pfiles 17407
17407: /usr/local/bin/rsync -rptgoD --partial --delete-after -vv --delete -e
 Current rlimit: 64 file descriptors
  0: S_IFCHR mode:0666 dev:85,0 ino:216127 uid:0 gid:3 rdev:13,2 O_RDONLY|O_NONBLOCK|O_LARGEFILE
  1: S_IFREG mode:0644 dev:32,727 ino:970354 uid:0 gid:1 size:9990679 O_WRONLY|O_APPEND
  2: S_IFREG mode:0644 dev:32,727 ino:970356 uid:0 gid:1 size:0 O_WRONLY|O_NONBLOCK|O_APPEND
  3: S_IFDOOR mode:0444 dev:206,0 ino:53988 uid:0 gid:0 size:0 O_RDONLY|O_LARGEFILE FD_CLOEXEC door to nscd[299]
  4: S_IFSOCK mode:0666 dev:201,0 ino:16261 uid:0 gid:0 size:0 O_RDWR|O_NONBLOCK
  7: S_IFSOCK mode:0666 dev:201,0 ino:60422 uid:0 gid:0 size:0 O_RDWR|O_NONBLOCK

Neil Schellenberger wrote: Eric, Since the poll is nfds=0 and timeo=20 (i.e. almost certainly msleep(20)), and since waitpid is looking for 17408, this actually really has to be the call of wait_process() at main.c:532, where rsync is (apparently) waiting for ssh to die. The reason that the timeout has no effect is that only io_flush() is being called in this loop, and since there is (presumably) nothing more to be written, the usual I/O loop stuff (including check_timeout()) is not being called. If you can manage it, both pstack and pfiles output would be useful, to check if my guess of main.c:532 is right and to see if the pipe to ssh is still open.
If the pipe is still open, we may have our culprit (it doesn't realise it should be exiting?). If it's closed, we'll need to know what ssh is up to. Regards, Neil -- Neil Schellenberger | Voice : (613) 599-2300 ext. 8445 | CrossKeys Systems Corporation | Fax : (613) 599-2330 | 350 Terry Fox Drive | E-Mail: [EMAIL PROTECTED] | Kanata, Ont., Canada, K2K 2W5 | URL : http://www.crosskeys.com/ + Greg Moore (1975-1999), Gentleman racer and great Canadian +
logging transfers to a file
I understand the log-format option for the rsync command: --log-format=FORMAT (log file transfers using the specified format). I understand the 'log file' option for the rsyncd daemon. What I want to do is log from the rsync command (not the daemon) to a file. So far all I can get is my log format string written to stdout. How do I tell the rsync command to log to a file? I'd like this command to make a logfile in /tmp, but it only writes the info to stdout. Is there a flag I'm missing? BTW, I can't get all the % strings to work quite right in the command-line rsync, but I get most of the important ones. Thanks, eric

SAMPLE COMMAND: rsync -a --log /tmp/logfile --log-format "%o %f %l %b" /bin /tmp