Re: Workaround for hardlink count problem?

2012-09-11 Thread Arne Jansen
On 11.09.2012 01:38, Jan Engelhardt wrote:
> 
> On Tuesday 2012-09-11 01:09, Martin Steigerwald wrote:
>>>> What about:
>>>>
>>>> - copy first backup version
>>>> - btrfs subvol create first next
>>>> - copy next backup version
>>>> - btrfs subvol create previous next
>>>
>>> Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
>>> useful here? That is, if the original hardlinks are caused by multiple
>>> backup versions of the same file.
>>
>> Sure, I meant subvol snapshot in the above example. Thanks for noticing.
>>
>> But I do not use --inplace as it conflicts with some other rsync options I 
>> like to use:
> 
> It is a tradeoff.
> 
> rsync "--inplace" leads to fragmentation, which is detrimental to the
> speed of reads (and of the read-write cycles rsync performs) on big files
> (multi-GB) that are regularly updated, and it is probably even worse
> for smaller-than-GB files because, percentage-wise, they end up even
> more fragmented.
> 
> $ filefrag */vm/intranet.dsk
> snap-2012-08-15/vm/intranet.dsk: 23 extents found
> snap-2012-08-16/vm/intranet.dsk: 23 extents found
> snap-2012-08-17/vm/intranet.dsk: 4602 extents found
> snap-2012-08-18/vm/intranet.dsk: 6221 extents found
> snap-2012-08-19/vm/intranet.dsk: 6604 extents found
> snap-2012-08-20/vm/intranet.dsk: 6694 extents found
> snap-2012-08-21/vm/intranet.dsk: 6650 extents found
> snap-2012-08-22/vm/intranet.dsk: 6760 extents found
> snap-2012-08-23/vm/intranet.dsk: 7226 extents found
> snap-2012-08-24/vm/intranet.dsk: 7159 extents found
> snap-2012-08-25/vm/intranet.dsk: 7464 extents found
> snap-2012-08-26/vm/intranet.dsk: 7746 extents found
> snap-2012-08-27/vm/intranet.dsk: 8017 extents found
> snap-2012-08-28/vm/intranet.dsk: 8145 extents found
> snap-2012-08-29/vm/intranet.dsk: 8393 extents found
> snap-2012-08-30/vm/intranet.dsk: 8474 extents found
> snap-2012-08-31/vm/intranet.dsk: 9150 extents found
> snap-2012-09-01/vm/intranet.dsk: 8900 extents found
> snap-2012-09-02/vm/intranet.dsk: 9218 extents found
> snap-2012-09-03/vm/intranet.dsk: 9575 extents found
> snap-2012-09-04/vm/intranet.dsk: 9760 extents found
> snap-2012-09-05/vm/intranet.dsk: 9839 extents found
> snap-2012-09-06/vm/intranet.dsk: 9907 extents found
> snap-2012-09-07/vm/intranet.dsk: 10006 extents found
> snap-2012-09-08/vm/intranet.dsk: 10248 extents found
> snap-2012-09-09/vm/intranet.dsk: 10488 extents found
> 
> Without --inplace (leaving it out is a prerequisite for -S), however,
> rsync will recreate a file whenever it has been touched. While this easily
> avoids fragmentation (since the new file won't share any data blocks with
> the old one), it can take up more space for the big files.
> 
>> -ax --acls --xattrs --sparse --hard-links --del --delete-excluded --
> 
> I knew short options would be helpful here: -axAXSH
> (why don't they just become the standard... they are in almost
> every rsync invocation I have ever written)
> 
>>   -S, --sparse
>>  Try to handle sparse files efficiently so they  take  up
>>  less space on the destination.  Conflicts with --inplace
>>  because it’s not possible to overwrite data in a  sparse
>>  fashion.
> 
> Oh, and if anybody from the rsync camp reads this: with hole punching
> now supported in Linux, there is no reason not to support "-S" together
> with "--inplace", I think.

I sent a patch for this quite some time ago:

https://bugzilla.samba.org/show_bug.cgi?id=7194

Feel free to push it :)

-Arne




Re: Workaround for hardlink count problem?

2012-09-11 Thread Martin Steigerwald
On Tuesday, 11 September 2012, Jan Engelhardt wrote:
> 
> On Tuesday 2012-09-11 01:09, Martin Steigerwald wrote:
> >> > What about:
> >> > 
> >> > - copy first backup version
> >> > - btrfs subvol create first next
> >> > - copy next backup version
> >> > - btrfs subvol create previous next
> >> 
> >> Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
> >> useful here? That is, if the original hardlinks are caused by multiple
> >> backup versions of the same file.
> >
> >Sure, I meant subvol snapshot in the above example. Thanks for noticing.
> >
> >But I do not use --inplace as it conflicts with some other rsync options I 
> >like to use:
> 
> It is a tradeoff.
> 
> rsync "--inplace" leads to fragmentation, which is detrimental to the
> speed of reads (and of the read-write cycles rsync performs) on big files
> (multi-GB) that are regularly updated, and it is probably even worse
> for smaller-than-GB files because, percentage-wise, they end up even
> more fragmented.
> 
> $ filefrag */vm/intranet.dsk
> snap-2012-08-15/vm/intranet.dsk: 23 extents found
> snap-2012-08-16/vm/intranet.dsk: 23 extents found
> snap-2012-08-17/vm/intranet.dsk: 4602 extents found
> snap-2012-08-18/vm/intranet.dsk: 6221 extents found
[…]
> snap-2012-08-25/vm/intranet.dsk: 7464 extents found
[…]
> snap-2012-09-09/vm/intranet.dsk: 10488 extents found
> 
> Without --inplace (leaving it out is a prerequisite for -S), however,
> rsync will recreate a file whenever it has been touched. While this easily
> avoids fragmentation (since the new file won't share any data blocks with
> the old one), it can take up more space for the big files.

Yes. But I do not care about that as much as about sparse files. For the
example I gave, those sparse files would consume about 1 GiB more on the
SSD after a backup restore. I would rather have some duplicated files on
the 2 TB backup hard disk.

As for recreating the sparse nature of the files, I'd have to format new
hardfiles and copy tons of mail files over within E-UAE. Thus I prefer
not to lose the sparseness in the backup.

> >-ax --acls --xattrs --sparse --hard-links --del --delete-excluded --
> 
> I knew short options would be helpful here: -axAXSH
> (why don't they just become the standard... they are in almost
> every rsync invocation I have ever written)

Hey, I like the long ones. I do not have to look up in the manpage what
each option means ;)

> >   -S, --sparse
> >  Try to handle sparse files efficiently so they  take  up
> >  less space on the destination.  Conflicts with --inplace
> >  because it’s not possible to overwrite data in a  sparse
> >  fashion.
> 
> Oh, and if anybody from the rsync camp reads this: with hole punching
> now supported in Linux, there is no reason not to support "-S" together
> with "--inplace", I think.

Hmm, maybe I will forward this to them.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Workaround for hardlink count problem?

2012-09-10 Thread Jan Engelhardt

On Tuesday 2012-09-11 01:09, Martin Steigerwald wrote:
>> > What about:
>> > 
>> > - copy first backup version
>> > - btrfs subvol create first next
>> > - copy next backup version
>> > - btrfs subvol create previous next
>> 
>> Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
>> useful here? That is, if the original hardlinks are caused by multiple
>> backup versions of the same file.
>
>Sure, I meant subvol snapshot in the above example. Thanks for noticing.
>
>But I do not use --inplace as it conflicts with some other rsync options I 
>like to use:

It is a tradeoff.

rsync "--inplace" leads to fragmentation, which is detrimental to the
speed of reads (and of the read-write cycles rsync performs) on big files
(multi-GB) that are regularly updated, and it is probably even worse
for smaller-than-GB files because, percentage-wise, they end up even
more fragmented.

$ filefrag */vm/intranet.dsk
snap-2012-08-15/vm/intranet.dsk: 23 extents found
snap-2012-08-16/vm/intranet.dsk: 23 extents found
snap-2012-08-17/vm/intranet.dsk: 4602 extents found
snap-2012-08-18/vm/intranet.dsk: 6221 extents found
snap-2012-08-19/vm/intranet.dsk: 6604 extents found
snap-2012-08-20/vm/intranet.dsk: 6694 extents found
snap-2012-08-21/vm/intranet.dsk: 6650 extents found
snap-2012-08-22/vm/intranet.dsk: 6760 extents found
snap-2012-08-23/vm/intranet.dsk: 7226 extents found
snap-2012-08-24/vm/intranet.dsk: 7159 extents found
snap-2012-08-25/vm/intranet.dsk: 7464 extents found
snap-2012-08-26/vm/intranet.dsk: 7746 extents found
snap-2012-08-27/vm/intranet.dsk: 8017 extents found
snap-2012-08-28/vm/intranet.dsk: 8145 extents found
snap-2012-08-29/vm/intranet.dsk: 8393 extents found
snap-2012-08-30/vm/intranet.dsk: 8474 extents found
snap-2012-08-31/vm/intranet.dsk: 9150 extents found
snap-2012-09-01/vm/intranet.dsk: 8900 extents found
snap-2012-09-02/vm/intranet.dsk: 9218 extents found
snap-2012-09-03/vm/intranet.dsk: 9575 extents found
snap-2012-09-04/vm/intranet.dsk: 9760 extents found
snap-2012-09-05/vm/intranet.dsk: 9839 extents found
snap-2012-09-06/vm/intranet.dsk: 9907 extents found
snap-2012-09-07/vm/intranet.dsk: 10006 extents found
snap-2012-09-08/vm/intranet.dsk: 10248 extents found
snap-2012-09-09/vm/intranet.dsk: 10488 extents found

Without --inplace (leaving it out is a prerequisite for -S), however,
rsync will recreate a file whenever it has been touched. While this easily
avoids fragmentation (since the new file won't share any data blocks with
the old one), it can take up more space for the big files.

>-ax --acls --xattrs --sparse --hard-links --del --delete-excluded --

I knew short options would be helpful here: -axAXSH
(why don't they just become the standard... they are in almost
every rsync invocation I have ever written)
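
For reference, going by the rsync(1) man page, the short cluster expands to
roughly the long options quoted above (double-check against the rsync
version at hand):

   -a = -rlptgoD (recurse, symlinks, perms, times, group, owner, devices/specials)
   -x = --one-file-system
   -A = --acls
   -X = --xattrs
   -S = --sparse
   -H = --hard-links

--del and --delete-excluded have no single-letter equivalents.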

>   -S, --sparse
>  Try to handle sparse files efficiently so they  take  up
>  less space on the destination.  Conflicts with --inplace
>  because it’s not possible to overwrite data in a  sparse
>  fashion.

Oh, and if anybody from the rsync camp reads this: with hole punching
now supported in Linux, there is no reason not to support "-S" together
with "--inplace", I think.
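
For the curious, a minimal sketch of what in-place hole punching already
looks like from userspace, assuming a recent util-linux fallocate(1) and a
filesystem with FALLOC_FL_PUNCH_HOLE support (file name and offsets are
made up for illustration):

$ dd if=/dev/zero of=test.img bs=1M count=64    # fully allocated test file
$ fallocate --punch-hole --offset 1MiB --length 32MiB test.img
$ du -h --apparent-size test.img                # logical size is unchanged
$ du -h test.img                                # allocated size has shrunk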


Re: Workaround for hardlink count problem?

2012-09-10 Thread Martin Steigerwald
On Monday, 10 September 2012, Fajar A. Nugraha wrote:
> On Mon, Sep 10, 2012 at 4:12 PM, Martin Steigerwald wrote:
> > On Saturday, 8 September 2012, Marc MERLIN wrote:
> >> I was migrating a backup disk to a new btrfs disk, and the backup
> >> had a lot of hardlinks to collapse identical files to cut down on
> >> inode count and disk space.
> > 
> >> Then, I started seeing:
> > […]
> > 
> >> Has someone come up with a cool way to work around the too many link
> >> error and only when that happens, turn the hardlink into a file copy
> >> instead? (that is when copying an entire tree with millions of
> >> files).
> > 
> > What about:
> > 
> > - copy first backup version
> > - btrfs subvol create first next
> > - copy next backup version
> > - btrfs subvol create previous next
> 
> Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
> useful here? That is, if the original hardlinks are caused by multiple
> backup versions of the same file.

Sure, I meant subvol snapshot in the above example. Thanks for noticing.

But I do not use --inplace as it conflicts with some other rsync options I 
like to use:

-ax --acls --xattrs --sparse --hard-links --del --delete-excluded --exclude-from "debian-exclude"

Yes, it was --sparse:

   -S, --sparse
  Try to handle sparse files efficiently so they  take  up
  less space on the destination.  Conflicts with --inplace
  because it’s not possible to overwrite data in a  sparse
  fashion.

As I have some pretty big sparse files, I went without --inplace:

martin@merkaba:~/Amiga> du -sch M-Archiv.hardfile Messages.hardfile
241M    M-Archiv.hardfile
726M    Messages.hardfile
966M    total
martin@merkaba:~/Amiga> ls -lh M-Archiv.hardfile Messages.hardfile
-rw-r- 1 martin martin 1.0G Mar 27  2005 M-Archiv.hardfile
-rw-r- 1 martin martin 1.0G Sep 10 17:33 Messages.hardfile
martin@merkaba:~/Amiga>

(my old mail from when I used an Amiga as my main machine, still accessible
via E-UAE ;)

Anyway, I think that will be solved by btrfs send/receive.
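
A rough sketch of how that could look once send/receive is usable, with
placeholder subvolume names and /backup standing in for a directory on the
target btrfs filesystem (the incremental form needs the parent snapshot to
exist on both sides):

# initial full transfer of a read-only snapshot
btrfs subvolume snapshot -r /data /data/snap-2012-09-10
btrfs send /data/snap-2012-09-10 | btrfs receive /backup

# later runs: send only the difference against the previous snapshot
btrfs subvolume snapshot -r /data /data/snap-2012-09-11
btrfs send -p /data/snap-2012-09-10 /data/snap-2012-09-11 | btrfs receive /backup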

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Workaround for hardlink count problem?

2012-09-10 Thread Fajar A. Nugraha
On Mon, Sep 10, 2012 at 4:12 PM, Martin Steigerwald wrote:
> On Saturday, 8 September 2012, Marc MERLIN wrote:
>> I was migrating a backup disk to a new btrfs disk, and the backup had a
>> lot of hardlinks to collapse identical files to cut down on inode
>> count and disk space.
>>
>> Then, I started seeing:
> […]
>> Has someone come up with a cool way to work around the too many link
>> error and only when that happens, turn the hardlink into a file copy
>> instead? (that is when copying an entire tree with millions of files).
>
> What about:
>
> - copy first backup version
> - btrfs subvol create first next
> - copy next backup version
> - btrfs subvol create previous next

Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
useful here? That is, if the original hardlinks are caused by multiple
backup versions of the same file.

Personally, if I need a feature not yet implemented in btrfs, I'd just
switch to something else for now, like zfs, and revisit btrfs later when
the needed features have been merged.

-- 
Fajar


Re: Workaround for hardlink count problem?

2012-09-10 Thread Martin Steigerwald
On Saturday, 8 September 2012, Marc MERLIN wrote:
> I read the discussions on hardlinks, and saw that there was a proposed
> patch (although I'm not sure if it's due in 3.6 or not, or whether I
> can apply it to my 3.5.3 tree).
> 
> I was migrating a backup disk to a new btrfs disk, and the backup had a
> lot of hardlinks to collapse identical files to cut down on inode
> count and disk space.
> 
> Then, I started seeing:
[…]
> Has someone come up with a cool way to work around the too many link
> error and only when that happens, turn the hardlink into a file copy
> instead? (that is when copying an entire tree with millions of files).

What about:

- copy first backup version
- btrfs subvol create first next
- copy next backup version
- btrfs subvol create previous next

I have been using this scheme for my backups for quite a while, except that
I do the backup first and then create a read-only snapshot. And at some
point I remove old snapshots.

Works like a charm and is easily scriptable.
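
For reference, a minimal sketch of such a script; paths, subvolume names,
the exclude file and the retention policy are just placeholders, and error
handling is left out:

#!/bin/sh
# One-time setup (not part of the nightly run):
#   btrfs subvolume create /mnt/backup/current
SRC=/
DST=/mnt/backup
TODAY=$(date +%Y-%m-%d)

# back up into the writable "current" subvolume
rsync -axAXSH --del --delete-excluded \
  --exclude-from=/etc/backup/debian-exclude \
  "$SRC" "$DST/current/"

# freeze this run as a read-only snapshot
btrfs subvolume snapshot -r "$DST/current" "$DST/snap-$TODAY"

# expire old snapshots as needed, e.g.:
#   btrfs subvolume delete "$DST/snap-2012-08-01"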

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Workaround for hardlink count problem?

2012-09-08 Thread Marc MERLIN
I read the discussions on hardlinks, and saw that there was a proposed patch
(although I'm not sure if it's due in 3.6 or not, or whether I can apply it 
to my 3.5.3 tree).

I was migrating a backup disk to a new btrfs disk, and the backup had a lot
of hardlinks to collapse identical files to cut down on inode count and
disk space.

Then, I started seeing:

cp: cannot create hard link 
`../dshelf3/backup/saroumane/20080319/var/lib/dpkg/info/libaspell15.postrm' to 
`../dshelf3/backup/moremagic/oldinstall/var/lib/dpkg/info/libncurses5.postrm': 
Too many links
cp: cannot create hard link 
`../dshelf3/backup/saroumane/20080319/var/lib/dpkg/info/libxp6.postrm' to 
`../dshelf3/backup/moremagic/oldinstall/var/lib/dpkg/info/libncurses5.postrm': 
Too many links
cp: cannot create symbolic link 
`../dshelf3/backup/saroumane/20020317_oldload/usr/share/doc/menu/examples/system.fvwmrc':
 File name too long
cp: cannot create hard link 
`../dshelf3/backup/saroumane/20061218/var/lib/dpkg/info/libxxf86vm1.postrm' to 
`../dshelf3/backup/moremagic/oldinstall/var/lib/dpkg/info/libncurses5.postrm': 
Too many links
cp: cannot create hard link 
`../dshelf3/backup/saroumane/20061218/var/lib/dpkg/info/libxxf86dga1.postrm' to 
`../dshelf3/backup/moremagic/oldinstall/var/lib/dpkg/info/libncurses5.postrm': 
Too many links
cp: cannot create hard link 
`../dshelf3/backup/saroumane/20061218/var/lib/dpkg/info/libavc1394-0.postrm' to 
`../dshelf3/backup/moremagic/oldinstall/var/lib/dpkg/info/libncurses5.postrm': 
Too many links

What's interesting is the 'File name too long' one, but more generally, I'm
trying to find a userspace workaround for this by unlinking files that go
beyond the hardlink count that btrfs can support for now.

Has someone come up with a cool way to work around the too many link error
and only when that happens, turn the hardlink into a file copy instead?
(that is when copying an entire tree with millions of files).

I realize I could parse the errors and pipe that into some crafty shell to
do this, but if there is a smarter, ready-made solution, I'm all ears :)
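
The crude version I could script myself would look roughly like this
(untested sketch, GNU cp and bash assumed, paths are placeholders): a
hardlink-preserving first pass, then a second pass that plain-copies
whatever is still missing because "Too many links" made cp skip it.

#!/bin/bash
SRC=/mnt/oldbackup
DST=/mnt/dshelf3/backup

# first pass: keeps hardlinks within the tree, logs the failures
cp -a "$SRC/." "$DST/" 2>cp-errors.log

# second pass: anything that did not make it becomes an independent copy
cd "$SRC" || exit 1
find . \( -type f -o -type l \) -print0 |
while IFS= read -r -d '' f; do
    [ -e "$DST/$f" ] || [ -L "$DST/$f" ] || cp -a --parents "$f" "$DST/"
done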

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  