Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-16 Thread Simon Hobson
Andrew Gideon wrote: >> btrfs has support for this: you make a backup, then create a btrfs >> snapshot of the filesystem (or directory), then the next time you make a >> new backup with rsync, use --inplace so that just changed parts of the >> file are written to the same blocks and btrfs will ta

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-16 Thread Andrew Gideon
On Tue, 14 Jul 2015 08:59:25 +0200, Paul Slootman wrote: > btrfs has support for this: you make a backup, then create a btrfs > snapshot of the filesystem (or directory), then the next time you make a > new backup with rsync, use --inplace so that just changed parts of the > file are written to th
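
The btrfs workflow quoted above (snapshot the existing backup, then let rsync rewrite only the changed parts in place) might be sketched roughly as follows. This only builds the two command lines and does not run them. The function name and all paths are hypothetical, and adding `--no-whole-file` is an assumption on my part: for purely local copies rsync defaults to whole-file transfers, which would defeat the point of `--inplace` here.

```python
def snapshot_then_sync(source, backup_subvol, snapshot_path):
    """Return the two commands, in order, as argv lists (nothing is executed).

    backup_subvol is assumed to be a btrfs subvolume holding the last backup;
    snapshot_path is where a read-only snapshot of it should be kept.
    """
    # 1. Freeze the previous backup as a read-only snapshot.
    snapshot = ["btrfs", "subvolume", "snapshot", "-r",
                backup_subvol, snapshot_path]
    # 2. Update the live backup in place so unchanged blocks stay shared
    #    with the snapshot.
    sync = ["rsync", "-a", "--inplace", "--no-whole-file",
            source.rstrip("/") + "/", backup_subvol.rstrip("/") + "/"]
    return [snapshot, sync]
```

Each returned list could be handed to `subprocess.run(cmd, check=True)` once the paths have been checked by hand.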

Re: Fwd: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-16 Thread Ken Chase
yeah, i read somewhere that zfs DOES have separate tuning for metadata and data cache, but i need to read up on that more. as for heavy block duplication: daily backups of the whole system = a lot of dupe. /kc On Thu, Jul 16, 2015 at 05:42:32PM +, Andrew Gideon said: >On Mon, 13 Jul 201

Re: Fwd: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-16 Thread Andrew Gideon
On Mon, 13 Jul 2015 17:38:35 -0400, Selva Nair wrote: > As with any dedup solution, performance does take a hit and it's often > not worth it unless you have a lot of duplication in the data. This is so only in some volumes in our case, but it appears that zfs permits this to be enabled/disabled

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-14 Thread Simon Hobson
Ken Chase wrote: > And what's performance like? I've heard lots of COW systems performance > drops through the floor when there's many snapshots. For BTRFS I'd suspect the performance penalty to be fairly small. Snapshots can be done in different ways, and the way BTRFS and (I think) ZFS do it

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-14 Thread Ken Chase
And what's performance like? I've heard lots of COW systems performance drops through the floor when there's many snapshots. /kc On Tue, Jul 14, 2015 at 08:59:25AM +0200, Paul Slootman said: >On Mon 13 Jul 2015, Andrew Gideon wrote: >> >> On the other hand, I do confess that I am sometime

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-14 Thread Paul Slootman
On Mon 13 Jul 2015, Andrew Gideon wrote: > > On the other hand, I do confess that I am sometimes miffed at the waste > involved in a small change to a very large file. Rsync is smart about > moving minimal data, but it still stores an entire new copy of the file. > > What's needed is a file sy

Fwd: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Selva Nair
On Mon, Jul 13, 2015 at 5:19 PM, Simon Hobson wrote: > > What's needed is a file system that can do what hard links do, but at the > > file page level. I imagine that this would work using the same Copy On > > Write logic used in managing memory pages after a fork(). > > Well some (all ?) enterp

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Simon Hobson
Andrew Gideon wrote: > However, you've made me a little > apprehensive about storebackup. I like the lack of a need for a "restore > tool". This permits all the standard UNIX tools to be applied to > whatever I might want to do over the backup, which is often *very* > convenient. Well if y

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Andrew Gideon
On Mon, 13 Jul 2015 15:40:51 +0100, Simon Hobson wrote: > The thing here is that you are into "backup" tools rather than the > general purpose tool that rsync is intended to be. Yes, that is true. Rsync serves so well as a core component to backup, I can be blind about "something other than rsy

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Ken Chase
inotifywatch or equiv, there's FSM stuff (filesystem monitor) as well. constantData had a product we used years ago - a kernel module that dumped out a list of any changed files out some /proc or /dev/* device and they had a whole toolset that ate the list (into some db) and played it out as it co

Re: rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Simon Hobson
Andrew Gideon wrote: > These both bring me to the idea of using some file system auditing > mechanism to drive - perhaps with an --include-from or --files-from - > what rsync moves. > > Where I get stuck is that I cannot envision how I can provide rsync with > a limited list of files to move

rsync --link-dest and --files-from lead by a "change list" from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Andrew Gideon
On Mon, 13 Jul 2015 02:19:23 +, Andrew Gideon wrote: > Look at tools like inotifywait, auditd, or kfsmd to see what's easily > available to you and what best fits your needs. > > [Though I'd also be surprised if nobody has fed audit information into > rsync before; your need doesn't seem all

Re: cut-off time for rsync ?

2015-07-12 Thread Andrew Gideon
On Thu, 02 Jul 2015 20:57:06 +1200, Mark wrote: > You could use find to build a filter to use with rsync, then update the > filter every few days if it takes too long to create. If you're going to do something of that sort, you might want instead to consider truly tracking changes. This catches

Re: cut-off time for rsync ?

2015-07-03 Thread Simon Hobson
Ken Chase wrote: > You have NO IDEA how long it takes to scan 100M files > on a 7200 rpm disk. Actually I do have some idea! > Additionally, I don't know if linux (or freebsd or any unix) can be told to > cache > metadata more aggressively than data That had gone through my mind - how much RA

Re: cut-off time for rsync ?

2015-07-02 Thread Ken Chase
On Wed, Jul 01, 2015 at 02:05:50PM +0100, Simon Hobson said: >As I read this, the default is to look at the file size/timestamp and if they match then do nothing as they are assumed to be identical. So unless you have specified this, then files which have already been copied should be igno

Re: cut-off time for rsync ?

2015-07-02 Thread Ken Chase
Yes if rsync could keep a 'last state file' that'd be great, which would require the target be unchanged by any other process/usage - this is however the case with many of our uses here - as a backup only target. Then it could just load the target statefile, and only scan the source for changes vs

Re: cut-off time for rsync ?

2015-07-02 Thread Dirk van Deun
> What is taking time, scanning inodes on the destination, or recopying the > entire > backup because of either source read speed, target write speed or a slow > interconnect > between them? It takes hours to traverse all these directories with loads of small files on the backup server. That is

Re: cut-off time for rsync ?

2015-07-02 Thread Mark
You could use find to build a filter to use with rsync, then update the filter every few days if it takes too long to create. I have used a script to build a filter on the source server to exclude anything over 5 days old, invoked when the sync starts, but it only parses around 2000 files per
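
A minimal sketch of the "build a list of recent files, then hand it to rsync" idea, assuming a plain directory walk is acceptable on the source side. The function names, the five-day window and the use of ctime (inode change time) rather than mtime are illustrative assumptions, not what the script mentioned above actually does.

```python
import os
import time


def recently_changed(root, days, now=None):
    """Return paths, relative to root, of non-directory entries whose
    ctime falls within the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # entry vanished while we were walking
            if st.st_ctime >= cutoff:
                changed.append(os.path.relpath(full, root))
    return changed


def write_files_from(paths, list_path):
    """Write one relative path per line, the format --files-from reads."""
    with open(list_path, "w") as out:
        for path in paths:
            out.write(path + "\n")
```

The resulting list could then be passed along the lines of `rsync -a --files-from=changed.txt /home/ backuphost:/home/` (host and paths hypothetical). Note that this still walks the whole source tree; it only spares the destination server the full scan, and it will not propagate deletions.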

Re: cut-off time for rsync ?

2015-07-01 Thread Ken Chase
What is taking time, scanning inodes on the destination, or recopying the entire backup because of either source read speed, target write speed or a slow interconnect between them? Do you keep a full new backup every day, or are you just overwriting the target directory? /kc On Wed, Jul 01, 20

Re: cut-off time for rsync ?

2015-07-01 Thread Simon Hobson
> The goal was not to reduce storage, it was to reduce work. A full > rsync takes more than the whole night, and the destination server is > almost unusable for anything else when it is doing its rsyncs. I > am sorry if this was unclear. I just want to give rsync a hint that > comparing files an

Re: cut-off time for rsync ?

2015-07-01 Thread Dirk van Deun
> If your goal is to reduce storage, and scanning inodes doesn't matter, > use --link-dest for targets. However, that'll keep a backup for every > time that you run it, by link-desting yesterday's copy. The goal was not to reduce storage, it was to reduce work. A full rsync takes more than the wh

Re: cut-off time for rsync ?

2015-07-01 Thread Dirk van Deun
> >I used to rsync a /home with thousands of home directories every > >night, although only a hundred or so would be used on a typical day, > >and many of them have not been used for ages. This became too large a > >burden on the poor old destination server, so I switched to a script > >that uses

Re: cut-off time for rsync ?

2015-06-30 Thread Ken Chase
If your goal is to reduce storage, and scanning inodes doesn't matter, use --link-dest for targets. However, that'll keep a backup for every time that you run it, by link-desting yesterday's copy. You end up with a backup tree dir per day, with files hardlinked against all other backup dirs. My (and
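
The one-directory-per-day scheme described above might be driven by something like the following sketch. It only assembles the rsync command line. The date-named directory layout and the function name are assumptions for illustration, and the backup root is assumed to be local to the machine running the script so that yesterday's directory can be checked for.

```python
import datetime
import os


def link_dest_command(source, backup_root, today=None):
    """Build an rsync argv that writes today's backup into
    backup_root/YYYY-MM-DD, hardlinking unchanged files against
    yesterday's directory when it exists."""
    today = today or datetime.date.today()
    yesterday = today - datetime.timedelta(days=1)
    target = os.path.join(backup_root, today.isoformat())
    # --link-dest is interpreted relative to the destination directory
    # unless it is absolute, so pass an absolute path.
    previous = os.path.abspath(os.path.join(backup_root, yesterday.isoformat()))
    cmd = ["rsync", "-a"]
    if os.path.isdir(previous):
        cmd.append("--link-dest=" + previous)
    cmd += [source.rstrip("/") + "/", target + "/"]
    return cmd
```

On the first run there is no previous directory, so the command degrades to a plain full copy; every later run stores only changed files and hardlinks the rest.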

Re: cut-off time for rsync ?

2015-06-30 Thread Fabian Cenedese
At 10:32 30.06.2015, Dirk van Deun wrote: >Hi, > >I used to rsync a /home with thousands of home directories every >night, although only a hundred or so would be used on a typical day, >and many of them have not been used for ages. This became too large a >burden on the poor old destination server

cut-off time for rsync ?

2015-06-30 Thread Dirk van Deun
Hi, I used to rsync a /home with thousands of home directories every night, although only a hundred or so would be used on a typical day, and many of them have not been used for ages. This became too large a burden on the poor old destination server, so I switched to a script that uses "find -cti