Re: rsync --server command line options

2018-10-09 Thread Ken Chase via rsync
. is the 'current directory' notation in unix.

.. is the parent directory.

/kc


On Mon, Oct 08, 2018 at 01:57:09PM -0700, Parke via rsync said:
  >Hello,
  >
  >I ran the following commands:
  >
  >rsync /tmp/foo remote:
  >rsync remote:/tmp/foo .
  >
  >On the remote computer, the following commands were executed:
  >
  >rsync --server -e.LsfxC . .
  >rsync --server --sender -e.LsfxC . /tmp/foo
  >
  >Does anyone know, what is the meaning of the three dots/periods in the
  >above two commands?  The first command ends with two dots (". .") and
  >the second command has one dot (namely, the dot before /tmp/foo).
  >
  >(Yes, I know that --server and --sender are intended for internal use
  >only.  Despite that, I want to try to get two rsync children to talk
  >to each other over a pipe created by a non-rsync parent.)
  >
  >Thank you,
  >
  >Parke
  >
  >-- 
  >Please use reply-all for most replies to avoid omitting the mailing list.
  >To unsubscribe or change options: 
https://lists.samba.org/mailman/listinfo/rsync
  >Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

-- 
Ken Chase - Heavy Computing Inc. - Guelph Canada




Re: Solution for rsync overall progress data display

2017-11-26 Thread Ken Chase via rsync
With --no-i-r (--no-inc-recursive) you at least get some idea of the # of files to check.


/kc


On Sun, Nov 26, 2017 at 09:34:25PM +, Simon Hobson via rsync said:
  >> I'm looking for a solution to display overall rsync progress on an LCD display as a bargraph.
  >> I have found 2 parameters:
  >> 
  >> --progress
  >> This option tells rsync to print information showing the progress of the transfer. This gives a bored user something to watch. Implies --verbose if it wasn't already specified.
  >> 
  >> While rsync is transferring a regular file, it updates a progress line that looks like this:
  >> 
  >>   782448  63%  110.64kB/s    0:00:04
  >> 
  >> But they are not showing the overall progress during the transfer, which is what I need.
  >
  >Bear in mind that until the sync is almost finished, rsync does NOT know how much is left to do. AIUI, one thread is running a compare, working down the directory tree and building a list of files that aren't up to date on the target. Another thread is then taking files from this list and syncing them.
  >So at any point in time, there is a queue of files to be synced which is NOT complete, and a process that's syncing those files one at a time. Until the first thread is done, there isn't even a list of files, and until the sync is running, there isn't information on how much needs to be transferred for each of those files.
  >
  >It's well worth reading Andrew Tridgell's PhD thesis, where the algorithm is detailed. It's quite readable and gives a good insight into how rsync works.
  >https://www.samba.org/~tridge/phd_thesis.pdf
  >

--
Ken Chase - Heavy Computing Inc. Guelph Canada



Re: How do you exclude a directory that is a symlink?

2017-03-03 Thread Ken Chase
Considering you can't INCLUDE a directory that is a symlink... which would
be really handy right now for me to resolve a mapping of 103 -> meaningful_name
for backups. Instead I'm resorting to temporary bind mounts of 103 onto
meaningful_name, and when the bind mount isn't there, the --del is emptying
meaningful_name accidentally at times.

I think both situations could benefit from a --resolve-cmd-line-links switch
to resolve COMMAND LINE-SUPPLIED symlinks.

http://unix.stackexchange.com/questions/153262/get-rsync-to-dereference-symlinked-dirs-presented-on-cmdline-like-find-h

/kc


On Fri, Mar 03, 2017 at 07:41:10AM -0500, Steve Dondley said:
  >A thousand greetings,
  >
  >I'm trying to rsync a directory from a server to my local machine that has
  >a symbolic link to a directory I don't want to download. I have an
  >"exclude" option to exclude the symlink which works fine. However, if I add
  >a --copy-links option to the command, it appears to override my "exclude"
  >directive and the contents of the symlinked directory gets downloaded
  >anyway.
  >
  >I suspect I need some kind of --filter option. I read the documentation (or
  >at least tried do) regarding the --filter option but a mortal, casual user
  >like me could not make heads or tails of it.
  >
  >Thanks.



-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.



Re: large rsync fails with assertion error - bug #11506 was #6542 not fixed

2015-09-14 Thread Ken Chase
I opened a new bug for this but didn't see it come across the list. I have a
file tree I can reproduce this on readily (though it's quite huge :/ ).

Any way to get around this? This is a major travesty for a backup scripting
situation I have; I expect others to bump into this too.

The new bug is

https://bugzilla.samba.org/show_bug.cgi?id=11506

I can execute specific tests if required and log output; please advise.

/kc

On Wed, Sep 09, 2015 at 01:42:20AM -0400, Ken Chase said:
  >Ok I found a bug about this:
  >
  >https://bugzilla.samba.org/show_bug.cgi?id=6542
  >
  >and it says fixed by upgrade. I found a way to upgrade. Using:
  >
  >rsync  version 3.1.1  protocol version 31
  > on receiving side that issues the rsync command, and
  >
  >rsync  version 3.1.1  protocol version 31
  > on the remote sending side.
  >
  >I'm still getting the same thing:
  >
  >rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
  >
  >/kc
  >
  >
  >On Wed, Sep 09, 2015 at 12:58:30AM -0400, Ken Chase said:
  >  >rsyncing a tree of perhaps 30M files, getting this:
  >  >
  >  >rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
  >  >
  >  >then a bit more output and the parent catches up to the child:
  >  >
  >  >rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
  >  >rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]
  >  >
  >  >it's from a remote system. No errors visible (kernel or otherwise) on either end.
  >  >Hints?
  >  >
  >  >source:
  >  >rsync  version 3.1.1  protocol version 31
  >  >
  >  >dest, where commands are issued from:
  >  >rsync  version 3.0.9  protocol version 30
  >  >
  >  >I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports
  >  >and I don't really want to mess with this production machine too much.
  >  >
  >  >/kc
  >  >-- 
  >  >Ken Chase - Toronto Canada
  >  >




Re: large rsync fails with assertion error

2015-09-08 Thread Ken Chase
Ok I found a bug about this:

https://bugzilla.samba.org/show_bug.cgi?id=6542

and it says fixed by upgrade. I found a way to upgrade. Using:

rsync  version 3.1.1  protocol version 31
 on receiving side that issues the rsync command, and

rsync  version 3.1.1  protocol version 31
 on the remote sending side.

I'm still getting the same thing:

rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.

/kc


On Wed, Sep 09, 2015 at 12:58:30AM -0400, Ken Chase said:
  >rsyncing a tree of perhaps 30M files, getting this:
  >
  >rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.
  >
  >then a bit more output and the parent catches up to the child:
  >
  >rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
  >rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]
  >
  >it's from a remote system. No errors visible (kernel or otherwise) on either end.
  >Hints?
  >
  >source:
  >rsync  version 3.1.1  protocol version 31
  >
  >dest, where commands are issued from:
  >rsync  version 3.0.9  protocol version 30
  >
  >I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports
  >and I don't really want to mess with this production machine too much.
  >
  >/kc
  >-- 
  >Ken Chase - Toronto Canada
  >



large rsync fails with assertion error

2015-09-08 Thread Ken Chase
rsyncing a tree of perhaps 30M files, getting this:

rsync: hlink.c:126: match_gnums: Assertion `gnum >= hlink_flist->ndx_start' failed.

then a bit more output and the parent catches up to the child:

rsync: writefd_unbuffered failed to write 8 bytes to message fd [receiver]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1532) [receiver=3.0.9]

it's from a remote system. No errors visible (kernel or otherwise) on either end.
Hints?

source:
rsync  version 3.1.1  protocol version 31

dest, where commands are issued from:
rsync  version 3.0.9  protocol version 30

I'll have to try upgrading dest to 3.1.1 but it's not in wheezy-backports
and I don't really want to mess with this production machine too much.

/kc
-- 
Ken Chase - Toronto Canada



Re: [Bug 3099] Please parallelize filesystem scan

2015-07-17 Thread Ken Chase
I don't understand - scanning metadata is sped up by thrashing the head
all over the disk instead of mostly-sequentially scanning through?

How does that work out?

/kc


On Fri, Jul 17, 2015 at 02:37:21PM +, samba-b...@samba.org said:
  https://bugzilla.samba.org/show_bug.cgi?id=3099
  
  --- Comment #8 from Chip Schweiss c...@innovates.com ---
  I would argue that optionally all directory scanning should be made 
parallel.  
  Modern file systems perform best when request queues are kept full.  The
  current mode of rsync scanning directories does nothing to take advantage of
  this.   
  
  I currently use scripts to split a couple dozen or so rsync jobs in to
  literally 100's of jobs.   This reduces execution time from what would be 
days
  to a couple hours every night.   There are lots of scripts like this 
appearing
  on the net because the current state of rsync is inadequate.  
  
  This ticket could reasonably be combined with 5124.
  
  -- 
  You are receiving this mail because:
  You are the QA Contact for the bug.
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.



Re: [Bug 3099] Please parallelize filesystem scan

2015-07-17 Thread Ken Chase
Sounds to me like maintaining the metadata cache is important - and tuning the
filesystem to do so would be more beneficial than caching writes, especially
with a backup target, where a write already written will likely never be read
again (and it isn't a big deal if it is, since so few files are changed compared
to the total # of inodes to scan).

Your report of the minutes for the re-sync shows the unthrashed cache is highly
valuable. So all we need to do is tune the backup target (and even the
operational servers themselves) to maintain more metadata. I don't know how much
RAM is used per inode, but I'd throw in another 4-8 GB just for metadata caching
per box, or even more, if it meant scanning was sped up.

(Really, one only needs it on the backup target - if you can run all the backups
in parallel, and there are N servers to back up, they can all run at 1/N speed,
as long as scanning metadata on the backup target is fast enough to keep up with
it all -- my total data written is only 20-30 GB, for example, which at a
reasonable speed (20-30 MB/s even, which is slow) is only 15 minutes of total
writing. Even 200-300 GB changed would be 150 minutes at that rate, and the rate
could easily be 4x faster.)

So, tuning caches to prefer metadata seems to be key. How?

As we've discussed before, letting the filesystem handle it throws away precious
metadata cache, so tracking your own changes (since the backup system will never
be used for anything else, right? :) would be beneficial. Of course the danger
is using the backup system for anything else and changing any of the target
info - inconsistencies would crop up and make the backup worthless very quickly.
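On the "how": on Linux, one knob that biases the page cache toward dentry/inode (metadata) caching is vm.vfs_cache_pressure; values below the default of 100 make the kernel hold metadata longer before reclaiming it. Whether it frees enough RAM for 100M inodes is untested here - a sketch, assuming a dedicated backup box:

```
# /etc/sysctl.conf fragment (or apply at runtime with
# "sysctl -w vm.vfs_cache_pressure=50" as root).
# Default is 100; lower values bias memory reclaim away from the
# dentry/inode caches, keeping more filesystem metadata in RAM.
vm.vfs_cache_pressure = 50
```

Setting it to 0 is documented as risky (the kernel may never reclaim metadata and can hit out-of-memory conditions), so modest values are the usual choice.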

/kc

On Fri, Jul 17, 2015 at 03:18:02PM +, Schweiss, Chip said:
  Modern file systems have many internal queues, and service many clients 
simultaneously.  They arrange their work to maximize throughput in both read 
and write operations.This is the norm on any enterprise file system, be it 
Hitachi, Oracle, Dell, HP, Isilon, etc.  You will get significantly higher 
throughput if you hit it with multiple threads.   These systems have elaborate 
predictive read ahead caches and perform best when multiple threads hit them.
  
  Using the test case of a single server with a simple file system such as 
ext3/4, or xfs, no gains will be seen in multithreading rsync.   Use an 
enterprise file system with 100's of TBs and the more threads you use the 
faster you will go.   Metadata and data on these systems ends up across 100's 
of disks.   Single threads end up severely bound by latency.  This is why 
multi-threading should be optional.  It doesn't help everyone.
  
  For example, one of my rsync jobs moving from a ZFS system in St. Louis, 
Missouri to a Hitachi HNAS in Minneapolis, Minnesota has over 100 million 
files.   Each day 50 to 100 thousand files get added or updated.   A single 
rsync job would take weeks to parse this job and send the changes.   I split it 
into 120 jobs and it typically completes in 2 hours when no humans are using 
the systems.   A re-sync immediately afterwards, again with 120 jobs, scans 
both ends in minutes.
  
  -Chip
  
  -Original Message-
  From: rsync [mailto:rsync-boun...@lists.samba.org] On Behalf Of Ken Chase
  Sent: Friday, July 17, 2015 9:51 AM
  To: samba-b...@samba.org
  Cc: rsync...@samba.org
  Subject: Re: [Bug 3099] Please parallelize filesystem scan
  
  I dont understand - scanning metadata is sped up by thrashing the head
  all over the disk instead of mostly-sequentially scanning through?
  
  How does that work out?
  
  /kc
  
  
  On Fri, Jul 17, 2015 at 02:37:21PM +, samba-b...@samba.org said:
https://bugzilla.samba.org/show_bug.cgi?id=3099

--- Comment #8 from Chip Schweiss c...@innovates.com ---
I would argue that optionally all directory scanning should be made 
parallel.
Modern file systems perform best when request queues are kept full.  The
current mode of rsync scanning directories does nothing to take advantage 
of
this.

I currently use scripts to split a couple dozen or so rsync jobs in to
literally 100's of jobs.   This reduces execution time from what would be 
days
to a couple hours every night.   There are lots of scripts like this 
appearing
on the net because the current state of rsync is inadequate.

This ticket could reasonably be combined with 5124.

  
  --
  Ken Chase - k...@heavycomputing.ca skype:kenchase23 Toronto Canada
  Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
  

Re: Fwd: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-16 Thread Ken Chase

Yeah, I read somewhere that ZFS DOES have separate tuning for metadata
and data cache, but I need to read up on that more.

As for heavy block duplication: daily backups of the whole system = a lot of
dupe.

/kc


On Thu, Jul 16, 2015 at 05:42:32PM +, Andrew Gideon said:
  On Mon, 13 Jul 2015 17:38:35 -0400, Selva Nair wrote:
  
   As with any dedup solution, performance does take a hit and its often
   not worth it unless you have a lot of duplication in the data.
  
  This is so only in some volumes in our case, but it appears that zfs 
  permits this to be enabled/disabled on a per-volume basis.  That would 
  work for us.
  
  Is there a way to save cycles by offering zfs a hint as to where a 
  previous copy of a file's blocks may be found?
  
   - Andrew
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.



Re: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-14 Thread Ken Chase
And what's performance like? I've heard that many COW systems' performance
drops through the floor when there are many snapshots.

/kc


On Tue, Jul 14, 2015 at 08:59:25AM +0200, Paul Slootman said:
  On Mon 13 Jul 2015, Andrew Gideon wrote:
   
   On the other hand, I do confess that I am sometimes miffed at the waste 
   involved in a small change to a very large file.  Rsync is smart about 
   moving minimal data, but it still stores an entire new copy of the file.
   
   What's needed is a file system that can do what hard links do, but at the 
   file page level.  I imagine that this would work using the same Copy On 
   Write logic used in managing memory pages after a fork().
  
  btrfs has support for this: you make a backup, then create a btrfs
  snapshot of the filesystem (or directory), then the next time you make a
  new backup with rsync, use --inplace so that just changed parts of the
  file are written to the same blocks and btrfs will take care of the
  copy-on-write part.
  
  
  Paul
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.



Re: rsync --link-dest and --files-from lead by a change list from some file system audit tool (Was: Re: cut-off time for rsync ?)

2015-07-13 Thread Ken Chase
inotifywatch or equivalent; there's FSM (filesystem monitor) stuff as well.

ConstantData had a product we used years ago - a kernel module that dumped
a list of any changed files out of some /proc or /dev/* device, and they
had a whole toolset that ate the list (into some DB) and played it out
as it constantly tried to keep up with replication to a target (kinda like
DRBD but async). They got eaten by some large backup company and the product
was later priced at 5x what we had paid for it (in the mid $x000s/y).

This 2003-4 technology is certainly available in some form now.

If you only copy the changes, you're likely saving a lot of time.

/kc


On Mon, Jul 13, 2015 at 01:53:43PM +, Andrew Gideon said:
  On Mon, 13 Jul 2015 02:19:23 +, Andrew Gideon wrote:
  
   Look at tools like inotifywait, auditd, or kfsmd to see what's easily
   available to you and what best fits your needs.
   
   [Though I'd also be surprised if nobody has fed audit information into
   rsync before; your need doesn't seem all that unusual given ever-growing
   disk storage.]
  
  I wanted to take this a bit further.  I've thought, on and off, about 
  this for a while and I always get stuck.
  
  >I use rsync with --link-dest as a backup tool.  For various reasons, this 
  is not something I want to give up.  But, esp. for some very large file 
  systems, doing something that avoids the scan would be desirable.
  
  I should also add that I mistrust time-stamp, and even time-stamp+file-
  size, mechanism for detecting changes.  Checksums, on the other hand, are 
  prohibitively expensive for backup of large file systems.
  
  These both bring me to the idea of using some file system auditing 
  mechanism to drive - perhaps with an --include-from or --files-from - 
  what rsync moves.
  
  Where I get stuck is that I cannot envision how I can provide rsync with 
  a limited list of files to move that doesn't deny the benefit of --link-
  dest: a complete snapshot of the old file system via [hard] links into a 
  prior snapshot for those files that are unchanged.
  
  Has anyone done something of this sort?  I'd thought of preceding the 
  rsync with a cp -Rl on the destination from the old snapshot to the new 
  snapshot, but I still think that this will break in the face of hard 
  links (to a file not in the --files-from list) or a change to file 
  attributes (ie. a chmod would effect the copy of a file in the old 
  snapshot).
  
  Thanks...
  
   Andrew
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.



Re: [Bug 11378] Please add a '--line-buffered' option to rsync to make logging/output more friendly with pipes/syslog/CI systems/etc.

2015-07-04 Thread Ken Chase
Imagine it: all those updates when transferring large files with -P - 100,000 lines
of progress PER file...
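The flip side: when you do want each update on its own line, the \r-based redraw is trivial to unfold with tr (the trick Karl suggests below). A sketch with printf standing in for rsync -P output:

```shell
# rsync -P repaints one line using carriage returns; simulate that
# stream with printf (hypothetical percentages).
progress_stream() { printf '  10%%\r  55%%\r 100%%\n'; }

# Turning \r into \n makes every repaint a separate, loggable line.
progress_stream | tr '\r' '\n'
```

That is what produces the flood described above: every repaint becomes its own log line.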

/kc


On Sat, Jul 04, 2015 at 06:56:21PM +, samba-b...@samba.org said:
  https://bugzilla.samba.org/show_bug.cgi?id=11378
  
  --- Comment #3 from Karl O. Pinc k...@meme.com ---
  On Sat, 04 Jul 2015 17:56:25 +
  samba-b...@samba.org wrote:
  
   --- Comment #2 from Nathan Neulinger nn...@neulinger.org ---
   Perhaps the naming is not correct on my suggested option (and I'll
   admit, I completely missed the outbuf option) - unfortunately, outbuf
   doesn't actually solve the problem. 
   
   The goal is to get incremental progress output while running rsync
   through a build system or similar.
  
  What would happen if you piped the rsync output through
  tr and changed \r to \n?
  
  
  
  Karl k...@meme.com
  Free Software:  You don't pay back, you pay forward.
   -- Robert A. Heinlein
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.


Re: cut-off time for rsync ?

2015-07-02 Thread Ken Chase
Yes, if rsync could keep a 'last state' file that'd be great - which would
require that the target be unchanged by any other process/usage. This is however
the case with many of our uses here, as a backup-only target.

Then it could just load the target statefile, and only scan the source
for changes vs the last-state file.

Can't think of any way around this issue with rsync alone, without some external
parsing of previous logs, etc.

This is unfortunately why I never use 5400/5900 rpm disks on my backup targets,
and use RAID 10 not 5, for speed. A little more $ in the end, but necessary
to scan 50-80M inodes per night in my ~6hr backup window.
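Dirk's find-based selection (quoted below) can be piped straight into rsync. A local sketch using mtime to pick recently touched files (hypothetical paths; GNU find/touch assumed):

```shell
# Hypothetical tree: one stale file, one recently modified file.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst"
echo old > "$tmp/src/old"; touch -d '30 days ago' "$tmp/src/old"
echo new > "$tmp/src/new"

# Select files modified within the last 7 days and sync only those.
# %P prints the path relative to the search root, which is the form
# --files-from expects.
find "$tmp/src" -type f -mtime -7 -printf '%P\n' \
  | rsync -a --files-from=- "$tmp/src/" "$tmp/dst/"

ls "$tmp/dst"
```

This moves the selection burden to the (fast) source side, at the cost of trusting mtimes, so anything touched without an mtime update is silently skipped.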

/kc


On Thu, Jul 02, 2015 at 11:43:37AM +0200, Dirk van Deun said:
   What is taking time, scanning inodes on the destination, or recopying the 
entire
   backup because of either source read speed, target write speed or a slow 
interconnect
   between them?
  
  It takes hours to traverse all these directories with loads of small
  files on the backup server.  That is the limiting factor.  Not
  even copying: just checking the timestamp and size of the old copies.
  
  The source server is the actual live system, which has fast disks,
  so I can afford to move the burden to the source side, using the find
  utility to select homes that have been touched recently and using
  rsync only on these.
  
  But it would be nice if a clever invocation of rsync could remove the
  extra burden entirely.
  
  Dirk van Deun
  -- 
  Ceterum censeo Redmond delendum

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.


Re: cut-off time for rsync ?

2015-07-02 Thread Ken Chase
On Wed, Jul 01, 2015 at 02:05:50PM +0100, Simon Hobson said:

  As I read this, the default is to look at the file size/timestamp and if
  they match then do nothing as they are assumed to be identical. So unless
  you have specified this, then files which have already been copied should be
  ignored - the check should be quite low in CPU, at least compared to the
  cost of generating a file checksum etc.

This belies the issue of many rsync users not sufficiently abusing rsync to do
backups like us idiots do! :) You have NO IDEA how long it takes to scan 100M
files on a 7200 rpm disk. It becomes the dominant issue - CPU isn't the issue at
all. (Additionally, I would think that metadata scanning could max out only 2
cores anyway - one for rsync's userland work, and another core of kernel time
running the fs code scanning inodes.)

This is why throwing away all that metadata seems silly. Keeping detailed logs
and parsing them before the copy would be good, but that requires an external
selection script before rsync starts, handing rsync a list of files to copy
directly. Unfortunate, because rsync's scan method is quite advanced but doesn't
avoid this pitfall.

Additionally, I don't know if Linux (or FreeBSD or any Unix) can be told to
cache metadata more aggressively than data - not much point for the latter on a
backup server. The former would be great. I don't know how big metadata is in
RAM per inode, either, for typical OSes.

/kc
-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.


Re: [Bug 11378] New: Please add a '--line-buffered' option to rsync to make logging/output more friendly with pipes/syslog/CI systems/etc.

2015-07-02 Thread Ken Chase
How about

@andrewTO alias unbuf='stdbuf -i0 -o0 -e0'

then unbuf rsync

i have not tested this in any way.

--progress would be some interesting stuff to parse, especially with all the
screen redrawing of the K/s line, as well as background deletes and scans
overwriting while --progress of the previous file occurs. Ever try to parse
ANSI screen-draw output?

/kc


On Thu, Jul 02, 2015 at 06:00:19PM +, samba-b...@samba.org said:
  https://bugzilla.samba.org/show_bug.cgi?id=11378
  
  Bug ID: 11378
 Summary: Please add a '--line-buffered' option to rsync to make
  logging/output more friendly with pipes/syslog/CI
  systems/etc.
 Product: rsync
 Version: 3.1.1
Hardware: All
  OS: All
  Status: NEW
Severity: enhancement
Priority: P5
   Component: core
Assignee: way...@samba.org
Reporter: nn...@neulinger.org
  QA Contact: rsync...@samba.org
  
  Created attachment 11225
-- https://bugzilla.samba.org/attachment.cgi?id=11225&action=edit
  patch to implement --line-buffered option
  
  Behavior change with --line-buffered would be primarily to --progress - which
  would output a newline after percentage update instead of just a
  carriage-return.
  
  During a normal operation with smaller files you'd never notice the 
difference,
  but with large files (say recurrent sync of ISO images or similar) - you'd 
get
  a MUCH more usable output trace in the build system and logs instead of it 
all
  being merged onto one line of output.
  
  -- 
  You are receiving this mail because:
  You are the QA Contact for the bug.

-- 
Ken Chase - k...@heavycomputing.ca Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: cut-off time for rsync ?

2015-07-01 Thread Ken Chase
What is taking time, scanning inodes on the destination, or recopying the entire
backup because of either source read speed, target write speed or a slow 
interconnect
between them?

Do you keep a full new backup every day, or are you just overwriting the target
directory?

/kc


On Wed, Jul 01, 2015 at 10:06:57AM +0200, Dirk van Deun said:
   If your goal is to reduce storage, and scanning inodes doesn't matter,
   use --link-dest for targets. However, that'll keep a backup for every
   time that you run it, by link-desting yesterday's copy.
   
  The goal was not to reduce storage, it was to reduce work.  A full
  rsync takes more than the whole night, and the destination server is
  almost unusable for anything else when it is doing its rsyncs.  I
  am sorry if this was unclear.  I just want to give rsync a hint that
  comparing files and directories that are older than one week on
  the source side is a waste of time and effort, as the rsync is done
  every day, so they can safely be assumed to be in sync already.
  
  Dirk van Deun
  -- 
  Ceterum censeo Redmond delendum

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: cut-off time for rsync ?

2015-06-30 Thread Ken Chase
If your goal is to reduce storage, and scanning inodes doesn't matter,
use --link-dest for targets. However, that'll keep a backup for every
time you run it, by link-desting yesterday's copy.

You end up with a backup tree dir per day, with files hardlinked against
all the other backup dirs. My solution (and that of many others here) is to

mv $ancientbackup $today; rsync --del --link-dest=$yest source:$dirs $today 

creating gaps in the ancient sequence of days of backups - so I end up
keeping (very roughly) 1, 2, 3, 4, 7, 10, 15, 21, 30, 45, 60, 90, 120, 180
day-old backups. (Of course this isn't exactly how it works - there's some
binary counting going on, so the elimination isn't exactly like that: every
day, each of those backups gets a day older. There are some Tower-of-Hanoi-like
solutions to this for automated backups.)

This means something twice as old has half as many backups for the same time
range, meaning I keep the same frequency*age value for each backup timerange
into the past.

The result is a set of dirs dated (in my case) 20150630, for example, each of
which looks exactly like the actual source tree I backed up, but only taking up
space for files changed since yesterday. (Caveat: it's hardlinked against all
the other backups, thus using no more space on disk. HOWEVER, some server
software like postfix doesn't like hardlinked files in its spool due to
security concerns - so if you boot/use the backup itself without making a plain
copy (which is recommended), 1) postfix et al. will yell, and 2) you will be
modifying the whole set of dirs that point to any inode you just booted/used.)

My solution avoids scanning the source twice (which in my case - backing up
5 servers with 10M files each, daily - is a huge cost), important because the
scan time takes longer than the backup/transfer time (on a gigE network, about
20,000 changed files per 10M seems average per box of 5). Also it's production
gear - as little time as possible thrashing the box (and its poor metadata
cache) is important for performance. Getting the backups done during the night
lull is therefore required. I don't have time to delete (nor the disk RMA-cycle
patience for deleting) 10M files on the receiving side just to spend 5 hours
recreating them; 20,000 seems better to me.

You could also use --backup and --backup-dir, but I don't do it that way.

/kc


On Tue, Jun 30, 2015 at 10:32:31AM +0200, Dirk van Deun said:
  Hi,
  
  I used to rsync a /home with thousands of home directories every
  night, although only a hundred or so would be used on a typical day,
  and many of them have not been used for ages.  This became too large a
  burden on the poor old destination server, so I switched to a script
  that uses find -ctime -7 on the source to select recently used homes
  first, and then rsyncs only those.  (A week being a more than good
  enough safety margin in case something goes wrong occasionally.)
  
  Is there a smarter way to do this, using rsync only ?  I would like to
  use rsync with a cut-off time, saying if a file is older than this,
  don't even bother checking it on the destination server (and the same
  for directories -- but without ending a recursive traversal).  Now
  I am traversing some directories twice on the source server to lighten
  the burden on the destination server (first find, then rsync).
  
  Best,
  
  Dirk van Deun
  -- 
  Ceterum censeo Redmond delendum

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: rsync very slow with large include/exclude file list

2015-06-15 Thread Ken Chase
This is similar to using fuzzy / -y in a large directory: O(n^2) behaviour
occurs and can be incredibly slow. No caching of MD5s for the directory seems
to occur (and even if it did, there would still be O(n^2) comparisons).

/kc


On Mon, Jun 15, 2015 at 06:02:14PM -0500, ray vantassle said:
 I investigated the rsync code and found the reason why.
 For every file in the source, it searches the entire filter-list looking
 to see if that filename is on the exclude/include list. Most aren't, so
 it compares (350K - 72K) * 72K names (the non-listed files) plus (72K *
 72K/2) names (the ones that are listed), for a total of about
 22,608,000,000 strcmp's. That's 22 BILLION comparisons. (I may have left
 off a zero there, it might be 220 B.)

 I'm working on a fix to improve this. The first phase was to just
 improve the existing code without changing the methodology.
 The set I've been testing with is local-local machine, dry-run, 216K files
 in the source directory, 25,000 files in the exclude-from list.
 The original rsync takes 488 seconds.
 The improved code takes 300 seconds.

 The next phase was to improve the algorithm for handling large
 filter lists: change the unsorted linear search to a sorted binary
 search (skiplist).
 This improved code takes 2 seconds.

 The original code does 4,492,304,682 strcmp's.
 The fully improved code does 6,472,564 - 98.5% fewer.

 I am cleaning up the code and will submit a patchfile soon.



-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


feature request: rsync dereference symlinks on cmdline

2015-05-15 Thread Ken Chase
This post

http://unix.stackexchange.com/questions/153262/get-rsync-to-dereference-symlinked-dirs-presented-on-cmdline-like-find-h

explains most of what I want; basically, I'm looking for a find -H analogue
for rsync.

Reason is so that I can hit a source (or target!) dir in rsync by making a nice
dir of symlink maps.

For example, OpenVZ names its containers with ID numbers, which isn't very
conducive to careful handling/recognition:

100/ 101/ 102/ 103/ 

etc

I'd like to create a dir of symlinks, a map (I think this would work on the
target too?):

customer1 -> ../production/100
customer2 -> ../production/101
customer3 -> ../production/102

and have rsync write dirs

customer1/ customer2/ customer3/ 

in the target.

Obviously I could do this by iterating over the source's 100 101 102 and
pointing at custom target names, etc., but that gets tedious and requires
manually updating the script to catch any new sources that are added.

Obviously I don't want --copy-links, as I want only those links mentioned on
the command line to be dereferenced, not those inside the tree.

/kc
-- 
Ken Chase - Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: rsync --delete

2015-04-16 Thread Ken Chase
Wow, it took me a few seconds to figure out what you were trying to do.

What's wrong with rm?

Also, I think trying to leverage the side effect of disqualifying all source
files just to get the delete behaviour (very clever but somewhat obtuse!) risks
creating a temporary file of some kind in the target at the start of the
operation - and if you can't even mkdir, that exceeds the disk quota
immediately and fails.

/kc


On Thu, Apr 16, 2015 at 12:20:52PM +0300, ? ?? said:
  Hi, Rsync.
  
  I want rsync's help to delete a folder with a large number of files and 
folders. Tried this:
  rsync -a --no-D --delete /dev/null 
/home/rc-41/data/061/2015-04-01-07-04/
  skipping non-regular file null
  
  rsync -a --no-D --delete /dev/zero 
/home/rc-41/data/061/2015-04-01-07-04/
  skipping non-regular file zero
  
  
  That's how it turns out
  rsync -a --delete /empty_folder/ 
/home/rc-41/data/061/2015-04-01-07-04/
  But this option is not satisfactory, as when the disk is 100% full, creating
an empty folder does not work:
  
  mkdir /empty_folder/
  Disk quota exceeded
  
  Got an error.
  
  
  find /home/rc-41/data/061/2015-04-01-07-04/ -delete
  I know not suitable
  
  rm -rf /home/rc-41/data/061/2015-04-01-07-04/
  is also not suitable
  
  
  
  How to do it differently?
  
  
  -- 
  Sincerely,
Dugin Sergey mailto: d...@qwarta.ru
QWARTA
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: rsync --delete

2015-04-16 Thread Ken Chase
Problem is he's trying to rsync into the target dir and have the side effect
of the delete, so an empty dir would necessarily need to be in the target, and
thus created there, triggering the quota block.

He tried to avoid this by using device files and then 'blocking all device
files', but I think rsync first figures out there's nothing to do, so it just
stops and doesn't do the delete. Wonder if --delete-before would help there,
perhaps.

However, this is a REALLY obtuse way of running rm - unless of course he's
trying to inject some kind of options into a script that can only run rsync,
or something wonky like that.

/kc


On Thu, Apr 16, 2015 at 11:23:59AM -0400, Kevin Korb said:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
  
  I don't understand what is wrong with rm either.
  
  But if you must have an empty directory is there a tmpfs where you can
  make one?  Is there already an empty one like /var/empty?
  
  On 04/16/2015 10:13 AM, Ken Chase wrote:
   Wow, it took me a few seconds to figure out what you were trying to
   do.
   
   What's wrong with rm?
   
   Also I think trying to leverage the side of disqualifying all
   source files just to get the delete effect (very clever but
   somewhat obtuse!) risks creating a temporary file of some kind in
   the target at the start of the operation, and if you cant even
   mkdir then that exceeds disk quota immediately and fails.
   
   /kc
   
   
   On Thu, Apr 16, 2015 at 12:20:52PM +0300, ? ?? said:
   Hi, Rsync.
   
   I want to help rsink delete a folder with a large number of files
   and folders. Tried this: rsync -a --no-D --delete /dev/null
   /home/rc-41/data/061/2015-04-01-07-04/ skipping
   non-regular file null
   
   rsync -a --no-D --delete /dev/zero
   /home/rc-41/data/061/2015-04-01-07-04/ skipping
   non-regular file zero
   
   
   That's how it turns out rsync -a --delete /empty_folder/
   /home/rc-41/data/061/2015-04-01-07-04/ But this
   option is not satisfied as if the disk is 100% filled to create
   an empty folder does not work
   
   mkdir /empty folder/ Disk quota ekstseeded
   
   Got an error.
   
   
   find /home/rc-41/data/061/2015-04-01-07-04/ -delete I
   know not suitable
   
   rm -rf /home/rc-41/data/061/2015-04-01-07-04/ is also
   not suitable
   
   
   
   How to do it differently?
   
   
   -- Sincerely, Dugin Sergey mailto: d...@qwarta.ru QWARTA
   
   
  
  - -- 
  ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
   Kevin Korb  Phone:(407) 252-6853
   Systems Administrator   Internet:
   FutureQuest, Inc.   ke...@futurequest.net  (work)
   Orlando, Floridak...@sanitarium.net (personal)
   Web page:   http://www.sanitarium.net/
   PGP public key available on web site.
  ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
  -BEGIN PGP SIGNATURE-
  Version: GnuPG v2
  
  iEYEARECAAYFAlUv1A8ACgkQVKC1jlbQAQfaGACfR7g0t19aeY5KiUTcsxBJqEVy
  tjcAnR63Viq8B0NZ4p+GgwMO+ZENjdPZ
  =aHlw
  -END PGP SIGNATURE-

-- 
Ken Chase - k...@heavycomputing.ca 



Re: Recycling directories and backup performance. Was: Re: rsync --link-dest won't link even if existing file is out of date (fwd)

2015-04-16 Thread Ken Chase
How do you handle snapshotting? Or do you leave that to the block/filesystem
virtualization layer?

/kc


On Fri, Apr 17, 2015 at 01:35:27PM +1200, Henri Shustak said:
   Our backup procudures have provision for looking back at previous 
directories, but there is not much to be gained with recycled directories.  
Without recycling, and after a failure, the latest available backup may not 
have much in it.
  
  Just wanted to point out that LBackup has a number of checks in place to 
detect failures during a backup. If this happens, then that backup is not 
labeled as a successful snapshot. 
  
  At present, when the next snapshot is started, the previous incomplete 
snapshot(s) are not used as a link-dest source. As mentioned, this is something 
I have been looking at for a while. However, there are some edge cases which 
need to be handled carefully if you use incomplete backups as a link-dest 
source. I am sure these problems are all tractable; I have simply not spent 
enough time on them.
  
  -
  This email is protected by LBackup, an open source backup solution.
  http://www.lbackup.org
  
  
  
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: Can I let rsync only transer a part of file within specific byte ranges?

2015-04-15 Thread Ken Chase
rsync doesn't do that.

Why not use a range GET with an HTTP server and wget client, or just ssh:

ssh remotehost 'dd if=file bs=500 count=1' > file ?
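With GNU dd the same trick generalizes to arbitrary byte ranges (remotehost and paths are hypothetical; skip_bytes/count_bytes need coreutils >= 8.16). Shown locally on a generated file:

```shell
printf '%01000d' 0 > /tmp/myfile    # 1000-byte stand-in for the remote file
# First 500 bytes:
dd if=/tmp/myfile bs=500 count=1 status=none > /tmp/first500
wc -c < /tmp/first500               # prints 500
# Arbitrary range over ssh, e.g. bytes 1000-1499 (hypothetical host):
# ssh remotehost "dd if=/path/to/myfile skip=1000 count=500 \
#     iflag=skip_bytes,count_bytes status=none" > mid500
rm -f /tmp/myfile /tmp/first500
```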

/kc


On Wed, Apr 15, 2015 at 11:02:36AM +, Hongyi Zhao said:
  Hi all,
  
  Suppose I have a file on the remote rsync server:
  
  rsync://path/to/myfile
  
  And I want to retrieve only a part of the file, based on a range of bytes, to 
  my local host - say 0-499, meaning only transfer the first 500 bytes of 
  that file.
  
  Is this possible with rsync client? 
  
  Regards
  -- 
  .: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
  

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: rsync --link-dest won't link even if existing file is out of date

2015-04-15 Thread Ken Chase

-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


Re: rsync --link-dest won't link even if existing file is out of date

2015-04-06 Thread Ken Chase
This has been a consideration. But it pains me that a tiny change/addition
to the rsync option set would save much time and space for other legit use
cases.

We know rsync very well, we dont know ZFS very well (licensing kept the
tech out of our linux-centric operations). We've been using it but we're
not experts yet.

Thanks for the suggestion.

/kc

On Mon, Apr 06, 2015 at 12:07:05PM -0400, Kevin Korb said:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
  
  Since you are in an environment with millions of files I highly
  recommend that you move to ZFS storage and use ZFS's subvolume
  snapshots instead of --link-dest.  It is much more space efficient,
  rsync run time efficient, and the old backups can be deleted in
  seconds.  Rsync doesn't have to understand anything about ZFS.  You
  just rsync to the same directory every time and have ZFS do a snapshot
  on that directory between runs.
  
  On 04/06/2015 01:51 AM, Ken Chase wrote:
   Feature request: allow --link-dest dir to be linked to even if file
   exists in target.
   
   This statement from the man page is adhered to too strongly IMHO:
   
   This option works best when copying into an empty destination
   hierarchy, as rsync treats existing files as definitive (so it
   never looks in the link-dest dirs when a destination file already
   exists).
   
    I was surprised by this behaviour, as generally the scheme with rsync is
    to be efficient / save space.
   
   When the file is out of date but exists in the --l-d target, it
   would be great if it could be removed and linked. If an option was
   supplied to request this behaviour, I'd actually throw some money
   at making it happen.  (And a further option to retain a copy if
   inode permissions/ownership would otherwise be changed.)
   
   Reasoning:
   
   I backup many servers with --link-dest that have filesystems of
   10+M files on them.  I do not delete old backups - which take 60min
   per tree or more just so rsync can recreate them all in an empty
   target dir when 1% of files change per day (takes 3-5 hrs per
   backup!).
   
   Instead, I cycle them in with mv $olddate $today then rsync --del
   --link-dest over them - takes 30-60 min depending. (Yes, some
   malleability of permissions risk there, mostly interested in
   contents tho).  Problem is, if a file exists AT ALL, even out of
   date, a new copy is put overtop of it per the above man page
   decree.
   
   Thus much more disk space is used. Running this scheme with moving
   old backups to be written overtop of accumulates many copies of the
   exact same file over time.  Running pax -rpl over the copies before
   rsyncing to them works (and saves much space!), but takes a very
   long time as it traverses and compares 2 large backup trees
   thrashing the same device (in the order of 3-5x the rsync's time,
   3-5 hrs for pax - hardlink(1) is far worse, I suspect a some
   non-linear algorithm therein - it ran 3-5x slower than pax again).
   
   I have detailed an example of this scenario at
   
   
http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists
  
which also indicates --delete-before and --whole-file do not help
   at all.
   
   /kc
   
  
  - -- 
  ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
   Kevin Korb  Phone:(407) 252-6853
   Systems Administrator   Internet:
   FutureQuest, Inc.   ke...@futurequest.net  (work)
   Orlando, Floridak...@sanitarium.net (personal)
   Web page:   http://www.sanitarium.net/
   PGP public key available on web site.
  ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
  -BEGIN PGP SIGNATURE-
  Version: GnuPG v2
  
  iEYEARECAAYFAlUirykACgkQVKC1jlbQAQc83ACfa7lawkyPFyO9kDE/D8aztql0
  AkAAoIQ970yTCHB1ypScQ8ILIQR6zphl
  =ktEg
  -END PGP SIGNATURE-

-- 
Ken Chase - ken att heavycomputing.ca Toronto Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.


rsync --link-dest won't link even if existing file is out of date

2015-04-05 Thread Ken Chase
Feature request: allow --link-dest dir to be linked to even if file exists
in target.

This statement from the man page is adhered to too strongly IMHO:

This option works best when copying into an empty destination hierarchy, as
rsync treats existing files as definitive (so it never looks in the link-dest
dirs when a destination file already exists).

I was surprised by this behaviour, as generally the scheme with rsync is to be
efficient and save space.

When the file is out of date but exists in the --l-d target, it would be great
if it could be removed and linked. If an option was supplied to request this
behaviour, I'd actually throw some money at making it happen.  (And a further
option to retain a copy if inode permissions/ownership would otherwise be
changed.)

Reasoning:

I back up many servers with --link-dest that have filesystems of 10M+ files on
them. I do not delete old backups - deletion takes 60 min per tree or more,
just so rsync can recreate them all in an empty target dir when ~1% of files
change per day (which takes 3-5 hrs per backup!).

Instead, I cycle them in with mv $olddate $today then rsync --del --link-dest
over them - takes 30-60 min depending. (Yes, some malleability of permissions
risk there, mostly interested in contents tho).  Problem is, if a file exists
AT ALL, even out of date, a new copy is put overtop of it per the above man
page decree.

Thus much more disk space is used: running this scheme, moving old backups to
be written over accumulates many copies of the exact same file over time.
Running pax -rpl over the copies before rsyncing to them works (and saves much
space!), but takes a very long time as it traverses and compares 2 large backup
trees while thrashing the same device (on the order of 3-5x the rsync's time,
3-5 hrs for pax - hardlink(1) is far worse; I suspect some non-linear algorithm
therein, as it ran 3-5x slower than pax again).

I have detailed an example of this scenario at

http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists

which also indicates --delete-before and --whole-file do not help at all.

/kc
-- 
Ken Chase - k...@heavycomputing.ca skype:kenchase23 +1 416 897 6284 Toronto 
Canada
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front 
St. W.