Re: rsync mechanics question
On 10.5.2007, at 2:16, Tom Riley wrote:

> I have 2 mounts on a single computer the production mount is 100gigs
> (/msgstore) and contains rough 17 million small files (email message
> store), and a newly created 500g ufs file system (/mnt)
>
> However, the curiosity comes in with my source data taking up 86gigs of
> data on a 100g partition, and as the copy progresses the destination
> drive is reporting 240 gigs of usage.
>
> So as far as I can tell, rsync is working and the data integrity seems
> good, it's simply taking up 2.5 times the space.

The simplest possibility is that the two partitions use different block sizes, so a small file that occupies one block takes more disk space on the partition with the larger block size.

izidor

-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
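The block-size effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file sizes and the two allocation units (1024 and 8192 bytes) are made up, `allocated_bytes` and `total_allocated` are hypothetical helper names, and the model ignores inodes, indirect blocks and any fragment handling the real filesystems do.

```python
def allocated_bytes(file_size: int, alloc_unit: int) -> int:
    """Round a file size up to a whole number of allocation units."""
    if file_size == 0:
        return 0
    # Ceiling division without floating point.
    return -(-file_size // alloc_unit) * alloc_unit


def total_allocated(sizes, alloc_unit: int) -> int:
    """Space consumed by all files when each is rounded up separately."""
    return sum(allocated_bytes(size, alloc_unit) for size in sizes)


# Hypothetical small-message sizes in bytes (not taken from the thread).
sizes = [300, 1500, 5000, 9000]

small_unit_total = total_allocated(sizes, 1024)   # assumed 1 KiB unit
large_unit_total = total_allocated(sizes, 8192)   # assumed 8 KiB unit

print(small_unit_total, large_unit_total)
```

With millions of files that are each smaller than the allocation unit, the rounding loss per file adds up, which is why comparing the allocation unit of the two filesystems is a cheap first check.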
Re: rsync mechanics question
Tom Riley wrote:
> However, the curiosity comes in with my source data taking up 86gigs of
> data on a 100g partition, and as the copy progresses the destination
> drive is reporting 240 gigs of usage.
>
> So as far as I can tell, rsync is working and the data integrity seems
> good, it's simply taking up 2.5 times the space.

Do you need the -S (--sparse) option? Omitting it when some of the source files are sparse is one reason files take more space when they are copied on unix in general. If there are sparse files, this option will reduce their size at the destination to something more reasonable, but I don't know if they'll be exactly the same size.

Secondly, do you need the -H (--hard-links) option? Omitting it when some of the source files are hard linked would cause multiple copies of the same file to be created on the destination.

To be sure of a clean copy with -S and -H, I think you need to start with an empty destination the first time. This will show you if those options have helped.

You can check if these options are relevant without actually copying, using "du" to get the number of bytes and "df -i" to get the number of inodes used on the source disk, "find . | wc -l" to get the number of inodes (approximately) that will be created without -H, and "find . -printf '.+((%s+4095)/4096*4096)\n' | bc | tail -n1" (works on Linux anyway) to get the number of bytes (approximately) that will be created without both -S and -H.

> This crosses realms of expertise that I'm a bit light on, and am fast
> coming up to speed on. I'm trying to determine if there is some mechanic
> within the rsync process that could account for the used space. James
> mentioned that rsync creates temp files which could account for double
> disk usage, and I'm following up on that.

It only creates one temp file at a time, though, and moves it into place before starting the next one. So if the largest individual file is 1G, you'd only expect 1G at most extra during the transfer, and nothing by the end. It cannot possibly explain taking 2.5 times the space.

-- Jamie
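The same check can be sketched in Python for systems where GNU find is not available. This is a rough illustration, not part of rsync: `summarize_tree` is a hypothetical helper, it only looks at regular files, and it assumes a Unix-like system where `st_blocks` counts 512-byte units. A large gap between `names` and `unique_files` hints that -H matters; `bytes_per_name` far above `allocated_bytes` hints at sparse files (or hard links).

```python
import os
import stat


def summarize_tree(root):
    """Compare what is stored under root with what a naive copy would write."""
    seen = set()          # (device, inode) pairs already counted
    names = 0             # regular-file names, every hard link counted
    unique_files = 0      # distinct inodes behind those names
    bytes_per_name = 0    # apparent size summed over every name
    allocated = 0         # space actually allocated, each inode counted once

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, sockets and so on
            names += 1
            bytes_per_name += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique_files += 1
                allocated += st.st_blocks * 512

    return {
        "names": names,
        "unique_files": unique_files,
        "bytes_per_name": bytes_per_name,
        "allocated_bytes": allocated,
    }
```

Calling `summarize_tree("/msgstore")` (the source mount named in the thread) would return the four counters as a dictionary; on 17 million files the walk itself will take a while.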
delete old logs
I've several web servers that use log4j to generate the logs. The log file currently appended to is called filename. The older, non-current logs are called filename.-mm-dd-hh. Obviously, rotation takes place once every hour and it's done automatically by log4j. Old logs are never deleted by log4j, something else must delete them.

I need to rsync all the filename* files from all web servers periodically to a "log server", and a safe copy is made to a backup log server. Each web server is running an rsync daemon, and a cron job on the main log server cycles through all web servers periodically.

Logs older than X hours must be deleted from the web servers, but only provided that a copy already exists on the log server AND another copy exists on the backup log server (any log file must exist in at least two places after any given rsync transfer).

Because I want to minimize the number of protocols and applications involved in the process, I'd like to perform the deletion of old log files from the web servers using rsync. Another reason to do that is that I want to handle everything (log transfer, archival, deletion) via one single cron job on the log server, instead of various cron jobs on the log server and on each web server, in order to minimize collisions.

The problem is, the files that must be deleted are on the sender, and the rsync documentation that I'm looking at doesn't seem to provide any clue as to how to delete files on the sender.

Once some of these conditions are relaxed, the problem appears quite solvable (e.g., mount the log directories via NFS and then do rsync and deletion over the NFS mount), but first I'd like to make sure that there's no way to accomplish everything solely via rsync.

Please enlighten me. Thanks,

-- 
Florin Andrei
http://florin.myip.org/
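Whatever tool ends up doing the deletion, the selection rule in the message can be written down precisely. The sketch below is only that rule, in Python, under stated assumptions: `deletable_logs` is a hypothetical helper, the inputs (a name-to-mtime mapping for the web server and two sets of names already present on the log server and the backup log server) are assumed to have been gathered some other way, and it says nothing about how rsync itself would remove files.

```python
def deletable_logs(web_logs, on_log_server, on_backup_server,
                   now, max_age_hours, current_name):
    """Return names that are safe to delete from a web server.

    web_logs         -- dict mapping file name to mtime (epoch seconds)
    on_log_server    -- set of names already copied to the log server
    on_backup_server -- set of names already copied to the backup server
    current_name     -- the log still being appended to; never deleted
    """
    cutoff = now - max_age_hours * 3600
    return sorted(
        name
        for name, mtime in web_logs.items()
        if name != current_name
        and mtime < cutoff              # older than X hours
        and name in on_log_server       # first copy exists
        and name in on_backup_server    # second copy exists
    )
```

Keeping the rule as a pure function makes it easy to run from the single cron job on the log server and to test without touching any real files.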
RE: rsync mechanics question
Matt,

Thanks for the reply. To clarify, I'm doing the following:

I have 2 mounts on a single computer: the production mount is 100gigs (/msgstore) and contains roughly 17 million small files (email message store), and there is a newly created 500g ufs file system (/mnt).

I'd like to minimize the downtime required to make the cutover of data, so I'm doing low-priority repeated rsyncs of the source data over to the new 500g partition. Just prior to the downtime, I'll stop mail services, do a final cold rsync, and change the mount points.

Rsync is working fine, and doing its job. However, the curiosity comes in with my source data taking up 86gigs of data on a 100g partition, and as the copy progresses the destination drive is reporting 240 gigs of usage.

So as far as I can tell, rsync is working and the data integrity seems good, it's simply taking up 2.5 times the space.

This crosses realms of expertise that I'm a bit light on, and am fast coming up to speed on. I'm trying to determine if there is some mechanic within the rsync process that could account for the used space. James mentioned that rsync creates temp files which could account for double disk usage, and I'm following up on that.

A second possibility is that because the disk is so large, Solaris is doing something funky with the minimum block size per inode assignment, causing small files to consume more space on a larger file system.

Ideally, I'd prefer the data copied from /msgstore to consume roughly the same amount of the new disk as the old so I can have 400gigs of growth rather than 250gigs.

Make sense? Any thoughts or suggestions would be welcomed.

-Tom

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
> Matt McCutchen
> Sent: Wednesday, May 09, 2007 2:56 PM
> To: Tom Riley
> Cc: rsync@lists.samba.org
> Subject: Re: rsync mechanics question
>
> On 5/9/07, Tom Riley <[EMAIL PROTECTED]> wrote:
> > I've been using rsync (2.6.9) to migrate a 90g message store volume and
> > I'm running into some interesting results.
>
> Please be more specific about what is going wrong. If you get an
> error message, please send the exact text. If rsync is successful,
> what does it do that you didn't expect/want?
>
> > Does rsync copy files at a file copy level or is it attempting to do
> > some block level copying?
>
> Rsync works at file level, not block level. (Of course, if a source
> file is itself a filesystem image, then one could say that rsync works
> at block level for that image.)
>
> Matt
Re: Using --remove-source-files with backup?
On 5/7/07, ScottZ <[EMAIL PROTECTED]> wrote:
> With --source-backup the tree structure of the original source file is
> also being included in the --backup-dir directory.

That is an idiosyncrasy of the quick-and-dirty way I implemented the source backup, and I documented it in the patched man page. To avoid it, cd into the source directory first so you can give the source argument as "." or just a filename.

When Wayne made a modified version of the source-backup patch to include with HEAD of rsync, he found it more convenient to follow the backup-dir path starting from the source dir rather than the source argument path starting from the backup dir. It's not clear to me whether either of these behaviors is the best possible.

Matt
Re: rsync mechanics question
On 5/9/07, Tom Riley <[EMAIL PROTECTED]> wrote:
> I've been using rsync (2.6.9) to migrate a 90g message store volume and
> I'm running into some interesting results.

Please be more specific about what is going wrong. If you get an error message, please send the exact text. If rsync is successful, what does it do that you didn't expect/want?

> Does rsync copy files at a file copy level or is it attempting to do
> some block level copying?

Rsync works at file level, not block level. (Of course, if a source file is itself a filesystem image, then one could say that rsync works at block level for that image.)

Matt
Re: rsync feature needed: preserve atime
On 5/9/07, Dave Dykstra <[EMAIL PROTECTED]> wrote:
> What does it really mean to preserve access times? When rsync reads a
> file to copy it, it will change the access time just because it is
> reading it, so the backup should then have the correct access time, the
> time the backup file was created.

I tested HEAD of rsync + atimes.diff and it indeed behaves this way, which is silly: rsync destroys the data it is told to preserve! The atimes.test doesn't catch this because it uses a zero-length file, so there is no read to actually hit the atime.

> Some tools have an option to reset the access time of the file they
> copy, but in so doing the tools update the inode change time of the
> input file which is generally more important to users than access time.
> It is not possible to preserve both as a filesystem user like rsync.

It is possible on some filesystems to read a file without hitting its atime by opening it with O_NOATIME. The same issues apply to GNU tar, so there was an extensive discussion on the bug-tar mailing list of how not to destroy atimes of source files (including use of O_NOATIME):

http://lists.gnu.org/archive/html/bug-tar/2005-09/msg00035.html

Personally, I don't like access times because they're impure in the sense that reading shouldn't write, and as far as I know they don't have any important uses. All of my computer's filesystems are mounted noatime.

Matt
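For readers who have not met O_NOATIME, here is a minimal Python sketch of the idea. It is not how rsync or the atimes patch does it; `read_without_touching_atime` is a hypothetical helper. It assumes Linux, where the flag exists and is only permitted for the file's owner (or a privileged process). On platforms without the flag the sketch falls back to an ordinary read, which does update the atime.

```python
import os


def read_without_touching_atime(path):
    """Read a whole file, asking the kernel not to update its access time."""
    # os.O_NOATIME only exists on Linux; elsewhere this adds no flag at all.
    flags = os.O_RDONLY | getattr(os, "O_NOATIME", 0)
    try:
        fd = os.open(path, flags)
    except PermissionError:
        # The kernel refuses O_NOATIME on files we do not own: plain open.
        fd = os.open(path, os.O_RDONLY)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()
```

Unlike resetting the atime afterwards with a utime-style call, this approach leaves the inode change time alone, which is the trade-off Dave describes in the quoted text.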
Re: preserve EA?
On Wed, 2007-05-09 at 16:32 -0400, Matt McCutchen wrote:
> On 5/9/07, Ming Zhang <[EMAIL PROTECTED]> wrote:
> > side question, if i know one file only have EA changed, thus mtime is
> > updated, can i force the rsync to do EA update only? regular rsync run
> > will do checksum stuff if mtime changed, and find out all content are
> > same which generate too many computation overheads.
>
> Yes, you can use --size-only, which makes rsync assume that
> corresponding files with the same size don't need the contents
> transferred.

cool. thx!

> Matt
Re: preserve EA?
On 5/9/07, Ming Zhang <[EMAIL PROTECTED]> wrote:
> side question, if i know one file only have EA changed, thus mtime is
> updated, can i force the rsync to do EA update only? regular rsync run
> will do checksum stuff if mtime changed, and find out all content are
> same which generate too many computation overheads.

Yes, you can use --size-only, which makes rsync assume that corresponding files with the same size don't need the contents transferred.

Matt
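The difference between the default quick check and --size-only can be modelled in a few lines. This is a simplified Python model, not rsync's code: `needs_transfer` is a hypothetical function, and it ignores options such as --checksum and --modify-window that also influence the decision.

```python
def needs_transfer(src_size, src_mtime, dst_size, dst_mtime, size_only=False):
    """Decide whether file contents would be sent, in a simplified model.

    dst_size is None when the destination file does not exist yet.
    """
    if dst_size is None:
        return True                 # nothing there yet: always transfer
    if src_size != dst_size:
        return True                 # sizes differ: always transfer
    if size_only:
        return False                # --size-only: same size is good enough
    return src_mtime != dst_mtime   # default quick check also compares mtime


# Same size, newer mtime on the source (for example after a metadata change):
print(needs_transfer(2048, 1178740000, 2048, 1178730000))
print(needs_transfer(2048, 1178740000, 2048, 1178730000, size_only=True))
```

The price of --size-only is visible in the model too: an edit that changes contents but not length is skipped, so it is only safe when you know the contents are unchanged.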
Re: rsync mechanics question
On 5/9/07, JamesDR <[EMAIL PROTECTED]> wrote:
> By using --inplace (if I read that right) you'll be xfering the entire
> 90GB store over the network.

Not exactly. --inplace only prevents the receiver from matching source data using data at an earlier offset in the old destination file (because the receiver will overwrite the old data too soon to use it). (Note to Wayne: I think the man page should state this explicitly, especially because that will make the remark about sorting data matches meaningful to new users.) Thus, if I reason correctly, inserting data in the source file spoils all the matches from there on, while deleting or modifying data does not do any more harm than it would without --inplace.

Matt
Re: preserve EA?
On Wed, 2007-05-09 at 16:07 -0400, Matt McCutchen wrote:
> On 5/9/07, Ming Zhang <[EMAIL PROTECTED]> wrote:
> > sorry that i should do more check. the rsync man page on site does not
> > have this while fc6 rsync man page has -X support. i guess there are
> > some extra patches floating around.
>
> That extra patch is floating at "patches/xattrs.diff" in rsync source
> distributions through version 2.6.9. The support for preserving
> extended attributes has been merged into the main rsync in CVS, which
> will eventually be released as rsync 3.0.0.

ic. thx for the info.

side question, if i know one file only have EA changed, thus mtime is updated, can i force the rsync to do EA update only? regular rsync run will do checksum stuff if mtime changed, and find out all content are same which generate too many computation overheads.

> Matt
Re: preserve EA?
On 5/9/07, Ming Zhang <[EMAIL PROTECTED]> wrote:
> sorry that i should do more check. the rsync man page on site does not
> have this while fc6 rsync man page has -X support. i guess there are
> some extra patches floating around.

That extra patch is floating at "patches/xattrs.diff" in rsync source distributions through version 2.6.9. The support for preserving extended attributes has been merged into the main rsync in CVS, which will eventually be released as rsync 3.0.0.

Matt
Re: rsync mechanics question
JamesDR wrote:
> Tom Riley wrote:
>> Hey All,
>>
>> I’ve been using rsync (2.6.9) to migrate a 90g message store volume and
>> I’m running into some interesting results. I have two FC storage arrays
>> attached to a Sunfire V280R, running Solaris 8. My 100gig volume is on a
>> Sun StoreEdge 3510, and my new 500gig partition is on an HP EVA.
>>
>> I used the syntax: rsync -a --stats --delete $SRC/$dir/ $DST/$dir
>>
>> At present time, the destination volume is 2.5 times the size of the
>> original volume. My first thought was this may be a drive geometry
>> issue, and I’ve been working with Sun to get a solution. They believe
>> rsync is doing a block level copy, instead of a file level copy. That
>> doesn’t seem to jive with what I’ve been reading, but wanted to get some
>> more experienced eyes looking at the problem.
>>
>> Does rsync copy files at a file copy level or is it attempting to do
>> some block level copying? Has anyone experienced this sort of bloated
>> expansion of space? Any advice would be greatly appreciated.
>>
>> -Tom
>
> It creates temporary files during the xfer then moves them over once
> completed. I think you'll want to use --inplace. So if the store is 90GB
> in size, while the xfer is going on it could be using 180GB +/- (depending.)
>
> http://rsync.samba.org/ftp/rsync/rsync.html has more info.

Re-reading that, you may want to weigh the storage needed against network bandwidth. By using --inplace (if I read that right) you'll be xfering the entire 90GB store over the network. It's a toss-up between network bandwidth (and time to xfer) and storage space.

-- 
Thanks,
James
Re: rsync fails to sync files
On 5/9/07, Paul Slootman <[EMAIL PROTECTED]> wrote:
> On Tue 08 May 2007, Wayne Davison wrote:
> > You can read the very latest manpage with my improvements here:
> >
> > http://rsync.samba.org/ftp/rsync/nightly/rsync.html
> >
> > E.g., there's extra quick-check discussion in the DESCRIPTION section.
>
> Yes, that's a great improvement. This is typically one of those things
> that you don't miss when you're familiar with rsync, but which is quite
> essential info otherwise...

I agree. Making important default behaviors like this one more prominent is a major goal of the man page rewrite that C Sights and I were discussing. I hope to continue working on it at some point.

Matt
Re: rsync mechanics question
Tom Riley wrote:
> Hey All,
>
> I’ve been using rsync (2.6.9) to migrate a 90g message store volume and
> I’m running into some interesting results. I have two FC storage arrays
> attached to a Sunfire V280R, running Solaris 8. My 100gig volume is on a
> Sun StoreEdge 3510, and my new 500gig partition is on an HP EVA.
>
> I used the syntax: rsync -a --stats --delete $SRC/$dir/ $DST/$dir
>
> At present time, the destination volume is 2.5 times the size of the
> original volume. My first thought was this may be a drive geometry
> issue, and I’ve been working with Sun to get a solution. They believe
> rsync is doing a block level copy, instead of a file level copy. That
> doesn’t seem to jive with what I’ve been reading, but wanted to get some
> more experienced eyes looking at the problem.
>
> Does rsync copy files at a file copy level or is it attempting to do
> some block level copying? Has anyone experienced this sort of bloated
> expansion of space? Any advice would be greatly appreciated.
>
> -Tom

It creates temporary files during the xfer then moves them over once completed. I think you'll want to use --inplace. So if the store is 90GB in size, while the xfer is going on it could be using 180GB +/- (depending.)

http://rsync.samba.org/ftp/rsync/rsync.html has more info.

-- 
Thanks,
James
Re: rsync feature needed: preserve atime
On Tue, May 08, 2007 at 07:09:06PM -0400, Matt McCutchen wrote:
> On 5/8/07, Brent Thompson <[EMAIL PROTECTED]> wrote:
> >Often we need to preserve the information atime conveys, but I have found
> >no way to get rsync to preserve this, nor any hint it is being worked on.
> >It would be great if 'rsync -t' also set atime not just mtime -- or maybe a
> >new option is desired for atime.
>
> The standard version of rsync does not support preserving atimes, but
> the rsync source distribution includes a patch "patches/atimes.diff"
> that adds an option --atimes to preserve atimes. I recommend you
> compile your own copy of rsync including this patch (reply if you need
> help/instructions for this) and then use it with its --atimes option.

What does it really mean to preserve access times? When rsync reads a file to copy it, it will change the access time just because it is reading it, so the backup should then have the correct access time, the time the backup file was created.

Some tools have an option to reset the access time of the file they copy, but in so doing the tools update the inode change time of the input file which is generally more important to users than access time. It is not possible to preserve both as a filesystem user like rsync.

- Dave
rsync mechanics question
Hey All,

I've been using rsync (2.6.9) to migrate a 90g message store volume and I'm running into some interesting results. I have two FC storage arrays attached to a Sunfire V280R, running Solaris 8. My 100gig volume is on a Sun StoreEdge 3510, and my new 500gig partition is on an HP EVA.

I used the syntax:

    rsync -a --stats --delete $SRC/$dir/ $DST/$dir

At present, the destination volume is 2.5 times the size of the original volume. My first thought was this may be a drive geometry issue, and I've been working with Sun to get a solution. They believe rsync is doing a block level copy, instead of a file level copy. That doesn't seem to jibe with what I've been reading, but I wanted to get some more experienced eyes looking at the problem.

Does rsync copy files at a file copy level or is it attempting to do some block level copying? Has anyone experienced this sort of bloated expansion of space? Any advice would be greatly appreciated.

-Tom
Re: preserve EA?
On Wed, 2007-05-09 at 14:51 -0400, Ming Zhang wrote:
> Hi all
>
> When rsync replicate one file to remote side, will the extended
> attributes be copied as well? Thanks,

sorry that i should do more check. the rsync man page on site does not have this while fc6 rsync man page has -X support. i guess there are some extra patches floating around.

sorry for the noise.

> Ming
preserve EA?
Hi all

When rsync replicate one file to remote side, will the extended attributes be copied as well?

Thanks,
Ming
Re: rsync fails to sync files
On Tue 08 May 2007, Wayne Davison wrote:
> You can read the very latest manpage with my improvements here:
>
> http://rsync.samba.org/ftp/rsync/nightly/rsync.html
>
> E.g., there's extra quick-check discussion in the DESCRIPTION section.

Yes, that's a great improvement. This is typically one of those things that you don't miss when you're familiar with rsync, but which is quite essential info otherwise...

Paul Slootman