Cross-subvolume reflink copy (BTRFS_IOC_CLONE over subvolume boundaries)

2012-02-13 Thread Hubert Kario
It's been nearly a year since the patches needed to implement a reflinked copy 
between subvolumes were posted 
(http://permalink.gmane.org/gmane.comp.file-systems.btrfs/9865) and I still 
get an "Invalid cross-device link" error with Linux 3.2.4 when I try to do a cp 
--reflink between subvolumes.

This is a *very* useful feature to have (think offline file-level deduplication 
for one thing).

From what I was able to find in the archives, the only objection (userland 
operation crossing subvolume boundaries) was rebutted by Chris Mason. 

Is there something else that I missed?

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix segmentation fault when opening invalid file system

2012-02-19 Thread Hubert Kario
Signed-off-by: Hubert Kario 
---
 btrfslabel.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/btrfslabel.c b/btrfslabel.c
index c9f4684..3c3566b 100644
--- a/btrfslabel.c
+++ b/btrfslabel.c
@@ -55,6 +55,8 @@ static void change_label_unmounted(char *dev, char *nLabel)
 * and as read-write.
 */
root = open_ctree(dev, 0, 1);
+   if (!root) /* errors are printed by open_ctree() */
+ return;
 
trans = btrfs_start_transaction(root, 1);
strncpy(root->fs_info->super_copy.label, nLabel, BTRFS_LABEL_SIZE);
-- 
1.7.9



Re: btrfs-convert processing time

2012-02-20 Thread Hubert Kario
(sorry for the duplicate, previous one has broken signature)
On Monday 20 of February 2012 13:51:29 Olivier Bonvalet wrote:
> Hi,
> 
> I'm trying to convert two ext4 FS to btrfs, but I'm surprised by the
> time needed to do that conversion.
> 
> The first FS is on a 500GiB block device, and btrfs-convert is running
> since more than 48h :
> root  1978 25.6 47.7 748308 732556 ?   DFeb18 944:44
> btrfs-convert /dev/vg-backup/backup
> 
> The second is on a 340GiB block device, and the processing time is similar :
> root 30192  5.2  3.0  73180 61520 pts/0RFeb18 121:04
> btrfs-convert /dev/vg-balor/backup
> 
> 
> Is it "normal", or is there a problem during the processing ?
> 
> Thanks,
> Olivier

How much free space did those FS have? How many files were on them (were they 
dominated by small or large files)?

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs-convert processing time

2012-02-20 Thread Hubert Kario
On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:
> On 20/02/2012 14:20, Hubert Kario wrote:
> > On Monday 20 of February 2012 13:51:29 Olivier Bonvalet wrote:
> >> Hi,
> >> 
> >> I'm trying to convert two ext4 FS to btrfs, but I'm surprised by the
> >> time needed to do that conversion.
> >> 
> >> The first FS is on a 500GiB block device, and btrfs-convert is running
> >> since more than 48h :
> >> root  1978 25.6 47.7 748308 732556 ?   DFeb18 944:44
> >> btrfs-convert /dev/vg-backup/backup
> >> 
> >> The second is on a 340GiB block device, and the processing time is
> >> similar : root 30192  5.2  3.0  73180 61520 pts/0RFeb18
> >> 121:04 btrfs-convert /dev/vg-balor/backup
> >> 
> >> 
> >> Is it "normal", or is there a problem during the processing ?
> >> 
> >> Thanks,
> >> Olivier
> > 
> > How much free space did those FS have? How many files were on them (were
> > they dominated by small or large files)?
> > 
> > Regards,
> 
> Lot of small files (like compressed email from Maildir), and lot of
> hardlinks, and probably low free space (near 15% I suppose).
> 
> 
> So I think I have my answer :)
> 

Yes, this is probably the worst possible combination.

Please keep us updated, just so we have exact numbers for new users.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs-convert processing time

2012-02-20 Thread Hubert Kario
Chris: What will btrfs-convert do when it encounters a directory with more 
hardlinks than the btrfs limit?

On Monday 20 of February 2012 21:00:34 Fajar A. Nugraha wrote:
> On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario  wrote:
> > On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:
> >> Lot of small files (like compressed email from Maildir), and lot of
> >> hardlinks, and probably low free space (near 15% I suppose).
> >> 
> >> 
> >> So I think I have my answer :)
> > 
> > Yes, this is probably the worst possible combination.
> > 
> > Please keep us updated. Just to have exact numbers for new users.
> 
> ... although it would probably fail anyway due to btrfs hardlink limit
> in the same directory.

This problem crossed my mind as well.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH 2/2] fix reporting of status for degraded arrays

2012-02-20 Thread Hubert Kario
Signed-off-by: Hubert Kario 
---
 scrub.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/scrub.c b/scrub.c
index 9dca5f6..630a1bf 100644
--- a/scrub.c
+++ b/scrub.c
@@ -1010,7 +1010,7 @@ static int scrub_fs_info(int fd, char *path,
if (!fi_args->num_devices)
return 0;
 
-   di_args = *di_ret = malloc(fi_args->num_devices * sizeof(*di_args));
+   di_args = *di_ret = calloc(fi_args->num_devices, sizeof(*di_args));
if (!di_args)
return -errno;
 
-- 
1.7.9



[PATCH 2/2] btrfs: handle dev info ioctl on degraded mount

2012-02-20 Thread Hubert Kario
Signed-off-by: Hubert Kario 
---
 fs/btrfs/ioctl.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3dede5c..d536816 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2120,7 +2120,10 @@ static long btrfs_ioctl_dev_info(struct btrfs_root 
*root, void __user *arg)
di_args->bytes_used = dev->bytes_used;
di_args->total_bytes = dev->total_bytes;
memcpy(di_args->uuid, dev->uuid, sizeof(di_args->uuid));
-   strncpy(di_args->path, dev->name, sizeof(di_args->path));
+   if (dev->name)
+   strncpy(di_args->path, dev->name, sizeof(di_args->path));
+   else
+   di_args->path[0] = '\0';
di_args->path[BTRFS_DEVICE_PATH_NAME_MAX-1] = '\0';
 
 out:
-- 
1.7.9



[PATCH 1/2] btrfs: ensure null-termination of string copied from userspace

2012-02-20 Thread Hubert Kario
Signed-off-by: Hubert Kario 
---
 fs/btrfs/ioctl.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index dae5dfe..3dede5c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2121,6 +2121,7 @@ static long btrfs_ioctl_dev_info(struct btrfs_root *root, 
void __user *arg)
di_args->total_bytes = dev->total_bytes;
memcpy(di_args->uuid, dev->uuid, sizeof(di_args->uuid));
strncpy(di_args->path, dev->name, sizeof(di_args->path));
+   di_args->path[BTRFS_DEVICE_PATH_NAME_MAX-1] = '\0';
 
 out:
if (ret == 0 && copy_to_user(arg, di_args, sizeof(*di_args)))
-- 
1.7.9



Re: [PATCH 2/2] fix reporting of status for degraded arrays

2012-02-20 Thread Hubert Kario
Sorry for the subject; this is a single patch, not part of a series.

Hubert Kario


Re: Balance RAID10 with odd device count

2012-02-22 Thread Hubert Kario
On Wednesday 22 of February 2012 09:56:27 Xavier Nicollet wrote:
> On 21 February 2012 at 07:54, Hugo Mills wrote:
> >Some time ago, I proposed the following scheme:
> > nCmSpP
> > 
> >where n is the number of copies (suffixed by C), m is the number of
> > 
> > stripes for that data (suffixed by S), and p is the number of parity
> > blocks (suffixed by P). Values of zero are omitted.
> > 
> >So btrfs's RAID-1 would be 2C, RAID-0 would be 1CnS, RAID-5 would
> > 
> > be 1CnS1P, and RAID-6 would be 1CnS2P. DUP would need a special
> > indicator to show that it wasn't redundant in the face of a whole-disk
> > failure: 2CN
> 
> Seems clear. However, is the S really relevant ?
> It would be simpler without it, wouldn't it ?

It depends on how striping will be implemented. Generally it provides 
information on how many spindles the data is using. With a static 
configuration it would be useless, but once you start changing the number of 
drives in the set, it becomes necessary to know whether you're under- or over-
utilising the disks.
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Btrfs filesystem resize syntax

2012-02-22 Thread Hubert Kario
What happened to the patches to --help and the man pages that explained the 
multi-device use-case:

 btrfs filesystem resize [<devid>:][+/-]<size>[gkm]|max 

(the <devid>: bit is missing)
It is explained on the wiki: 
http://btrfs.ipv5.de/index.php?title=Btrfs(command)
but I don't see those patches in Chris' btrfs-progs tree.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [RFC] btrfs auto snapshot

2012-02-23 Thread Hubert Kario
On Wednesday 17 of August 2011 10:15:46 Anand Jain wrote:
>   btrfs auto snapshot feature will include:
>   Initially:

>   - snapshot destination will be subvol/.btrfs/snapshot@ and
> snapshot/.btrfs/snapshot@ for subvolume and snapshot
> respectively

Is there some reason not to use the format used by the shadow_copy2 overlay for
Samba? (The one providing Shadow Volume Copy functionality for Windows
clients.)

You can get the current date in this format like this:

@GMT-`date -u '+%Y.%m.%d-%H.%M.%S'`

For example: @GMT-2012.02.23-10.34.32

This way, when the volume is exported using Samba, you can easily export
past copies too, without creating links.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: [RFC] btrfs auto snapshot

2012-02-23 Thread Hubert Kario
On Thursday 23 of February 2012 20:02:38 Anand Jain wrote:
>   autosnap code is available either end of this week or early
>   next week and what you will notice is autosnap snapshots
>   are named using uuid.
>
>   Main reason to drop time-stamp based names is that,
>  - test (clicking on Take-snapshot button) which took more
>than one snapshot per second was failing.
>  - a more descriptive creation time is available using a
> command line option as in the example below.
>   -
>   # btrfs su list -t tag=@minute,parent=/btrfs/sv1 /btrfs
>   /btrfs/.autosnap/6c0dabfa-5ddb-11e1-a8c1-0800271feb99 Thu Feb 23 13:01:18
> 2012 /btrfs/sv1 @minute
> /btrfs/.autosnap/5669613e-5ddd-11e1-a644-0800271feb99 Thu Feb 23 13:15:01
> 2012 /btrfs/sv1 @minute -
>
>   As of now code for time-stamp as autosnap snapshot name is
>   commented out, if more people wanted it to be a time-stamp
>   based names, I don't mind having that way. Please do let me know.

I'd say that having it as a configuration option (Samba-style snapshot naming
vs. UUID-based) would be sufficient. The question remains what the default
should be.

That being said, what use case would require snapshots taken more often than
once per second? I doubt that you can actually take snapshots every second on a
busy file system, let alone more often. On a lightly-used one they will be
identical and just clutter the name-space.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

smime.p7s
Description: S/MIME cryptographic signature


Re: btrfs-convert processing time

2012-03-31 Thread Hubert Kario
On Friday 30 of March 2012 21:23:52 Olivier Bonvalet wrote:
> On 22/02/2012 at 17:55, Olivier Bonvalet wrote:
> > So, the btrfs-convert for the smaller drive is done... after near 5
> > days. Which stats can I give you ?
> > 
> > It's a 340GB LVM block device, and "btrfs filesystem df /backup/" say
> > that :
> > 
> > Data: total=225.97GB, used=181.94GB
> > System: total=32.00MB, used=24.00KB
> > Metadata: total=111.00GB, used=91.56GB
> > 
> > 
> > If I mount the ext2_saved/image copy, I can see 257GB of data (78% of
> > the block device used), with 17M of inodes.
> > 
> > 
> > The other btrfs-convert stay running.
> 
> Well, the second one is still running :
> 
> root! backup:~# uptime
>   21:17:43 up 41 days, 19:59,  1 user,  load average: 2.06, 1.90, 1.88
> 
> 
> root! backup:~# ps auxw | grep btrfs
> root  1978 25.6 74.3 1269072 1141212 ? DFeb18 15421:50
> btrfs-convert /dev/vg-backup/backup

Wow, that's /long/...


> root! backup:~# iostat -k
> Linux 2.6.42.3-dae-xen (backup)   30/03/2012  _x86_64_(2 CPU)
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>10,540,062,36   54,210,09   32,74
> 
> Device:tpskB_read/skB_wrtn/skB_readkB_wrtn
> xvda724,70   175,50  3181,13  634342597 11498370400
> xvdy  0,00 0,00 0,00   1381  0
> xvdz  0,00 0,00 0,00   4065  0
> dm-0  0,49 3,22 1,78   116274976435028
> dm-1837,57   171,69  3178,57  620593936 11489105416
> dm-2  0,34 0,59 0,7821208002830584
> 
> 11498370400 kB written, so, 11TB written to convert a 518GB partition ?
> 
> 
> but more important : is it safe to abort the process ?

I'm quite sure you won't get a mountable filesystem, either ext or btrfs. 
btrfs-convert has to move at least some ext3 metadata blocks, so the ext FS will 
be broken. I don't know whether btrfs-convert writes the whole btrfs tree first 
and then updates it as it finds new inodes in ext3, or just goes over it 
sequentially. Either way, you'll get at best a btrfs with lots of files missing.

So the short answer is: no.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Backup of btrfs with metadata?

2012-04-24 Thread Hubert Kario
On Monday 23 of April 2012 13:53:57 Ken D'Ambrosio wrote:
> I know this question was asked, oh, a year ago, and the answer was
> "No."  But I'm wondering if anything's changed in the interim.
> Specifically, shy of "dd", is there any way to back up the files and
> metadata on a btrfs partition?

btrfs send is a work in progress; currently there are problems in tree
traversal that are being worked on.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: Interpreting Output of "btrfs fi show"

2012-04-28 Thread Hubert Kario
On Thursday 26 of April 2012 20:54:47 Duncan wrote:
> Helmut Hullen posted on Thu, 26 Apr 2012 13:11:00 +0200 as excerpted:
> > Hallo, Bart,
> >> 
> >> Well I think there is a btrfs superblock still present from the
> >> full-disk filesystem. Due to the offset of the first partition from the
> >> start of the disk, this superblock was not overwritten when you created
> >> the filesystem inside the partition.
> > 
> > Sounds familiar ...
> > 
> > I now use to delete about the first 10 MByte of the target disk via "dd
> > if=/dev/zero"
> 
> But /unlike/ reiserfs, which was only affected with the well warned as
> don't-use-unless-you-have-to fsck --rebuild-tree option, it seems that
> due to btrfs scan, etc, btrfs has its similar problem in more routine
> operation.

I'd say that this kind of problem is basically impossible in btrfs because 
of the FS UUID written all over the tree.

What we see here is a superblock written in a *very* specific place 
on the partition, which just happens to be aligned so that the whole disk 
looks like btrfs.

I don't think it's actually possible for btrfs to put a file containing a btrfs 
filesystem image in a place where the underlying block device could seem to 
have btrfs /too/. It depends on whether the metadata block is allocated 
before the data block on disk. It /may/ be possible in mixed data-metadata 
allocation mode.
Chris or Josef, can you confirm?

Still, a "zero-superblock" option would be useful for the btrfs tool. I'll 
see what I can do about this.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: cross-subvolume cp --reflink

2012-04-28 Thread Hubert Kario
On Sunday 01 of April 2012 11:42:23 Jérôme Poulin wrote:
> On Sun, Apr 1, 2012 at 11:27 AM, Norbert Scheibner  wrote:
> > Some users tested this patch successfully for week,s or months in 2 or 3
> > kernel versions since then, true?
> If this feature must be implented in VFS in another patch, why not
> just activate what works and make the future patch disable it again?

Why would (should) it be implemented in VFS? A reflink copy is completely
different from a normal copy and a hard link.

Subvolume boundaries in btrfs exist *only* in btrfs and are not visible in VFS.

IMHO it's strictly btrfs business, and not supporting reflink copy between
arbitrary directories is a bug.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: Interpreting Output of "btrfs fi show"

2012-04-30 Thread Hubert Kario
On Sunday 29 of April 2012 04:15:24 Duncan wrote:
> > Still, a "zero-superblock" option would be useful for the btrfs tool.
> > I'll see what I can do about this.
> 
> Yes, indeed.  Particularly since various bits of btrfs functionality 
> depend on scanning for filesystems (presumably their superblocks), and 
> output like that in the OP can be confusing indeed, as well as 
> potentially dangerous in recovery situations, where the wrong one might 
> be activated by accident.  (FWIW, there's an mdadm --zero-superblock 
> option.  I should take note of this thread and be sure I use it when next 
> I redo my layouts, probably when I switch some of them to btrfs instead, 
> tho that's going to be a bit as I'm waiting for N-way-mirroring, aka 
> proper raid1 mode, not the 2-way-only-mirroring that btrfs calls raid1 
> mode currently.)

mdadm --zero-superblock removes the MD superblock; it doesn't modify the data 
part of the partition, it just zeroes the MD metadata.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Interpreting Output of "btrfs fi show"

2012-04-30 Thread Hubert Kario
On Sunday 29 of April 2012 08:13:48 Martin Steigerwald wrote:
> Am Donnerstag, 26. April 2012 schrieb Bart Noordervliet:
> > On Thu, Apr 26, 2012 at 11:06, Thomas Rohwer  wrote:
> > >> As for the two filesystems shown in btrfs fi show... I have no clue
> > >> what that is about. Did you maybe make a mistake to create a btrfs
> > >> filesystem on the whole disk at first?
> > > 
> > > That is possible. But afterwards I certainly repartioned the device
> > > and created a btrfs filesystem on /dev/sda1. Maybe this info is only
> > > in the partition table? I understand that I should avoid mounting
> > > /dev/sda in this situation.
> > 
> > Well I think there is a btrfs superblock still present from the
> > full-disk filesystem. Due to the offset of the first partition from
> > the start of the disk, this superblock was not overwritten when you
> > created the filesystem inside the partition. But they very much
> > overlap and the full-disk superblock will probably eventually be
> > overwritten by elements from the partition filesystem. How you would
> > go about erasing the stale superblock and whether it is safe to do so
> > I can't say though.
> 
> There is the command wipefs. Whether it's safe to use here I do not know. I
> wouldn't try it without a backup.

Sorry, but I'm unable to find it. Is it a `btrfs` tool option or a standalone 
application (in a similar form to `btrfs-zero-log`)?

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH] btrfs: add command to zero out superblock

2012-04-30 Thread Hubert Kario
Signed-off-by: Hubert Kario 
---
 cmds-device.c  |   42 ++
 man/btrfs.8.in |7 +++
 2 files changed, 49 insertions(+)

diff --git a/cmds-device.c b/cmds-device.c
index db625a6..1edb091 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -246,11 +246,53 @@ static int cmd_scan_dev(int argc, char **argv)
return 0;
 }
 
+static const char * const cmd_zero_dev_usage[] = {
+   "btrfs device zero-superblock  [ ...]",
+   "Remove btrfs filesystem superblock from devices.",
+   "WARNING! This command will make filesystem residing on the devices",
+   "completely unmountable!",
+   NULL
+};
+
+static int cmd_zero_dev(int argc, char **argv)
+{
+   int fd;
+   char *file;
+   int arg_processed;
+   int ret = 0;
+   u64 device_len;
+   int mixed_mode_needed;
+   const int ZERO_END = 1;
+
+   if( argc < 2 ) {
+   usage(cmd_zero_dev_usage);
+   }
+
+   for(arg_processed = 1; arg_processed < argc; arg_processed++) {
+   file = argv[arg_processed];
+
+   fd = open(file, O_RDWR);
+   if (fd < 0) {
+   fprintf(stderr, "Unable to open %s\n", file);
+   ret |= 1;
+   continue;
+   }
+
+   ret |= btrfs_prepare_device(fd, file, ZERO_END, &device_len,
+   &mixed_mode_needed);
+
+   close(fd);
+   }
+
+   return ret;
+}
+
 const struct cmd_group device_cmd_group = {
device_cmd_group_usage, NULL, {
{ "add", cmd_add_dev, cmd_add_dev_usage, NULL, 0 },
{ "delete", cmd_rm_dev, cmd_rm_dev_usage, NULL, 0 },
{ "scan", cmd_scan_dev, cmd_scan_dev_usage, NULL, 0 },
+   { "zero-superblock", cmd_zero_dev, cmd_zero_dev_usage, NULL, 0 
},
{ 0, 0, 0, 0, 0 }
}
 };
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index be478e0..a840f7e 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -39,6 +39,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBdevice delete\fP\fI  [...]  \fP
 .PP
+\fBbtrfs\fP \fBdevice zero-superblock\fP\fI  [...] \fP
+.PP
 \fBbtrfs\fP \fBscrub start\fP [-Bdqru] {\fI\fP|\fI\fP}
 .PP
 \fBbtrfs\fP \fBscrub cancel\fP {\fI\fP|\fI\fP}
@@ -230,6 +232,11 @@ Finally, if \fB--all-devices\fP is passed, all the devices 
under /dev are
 scanned.
 .TP
 
+\fBdevice zero-superblock\fR\fI  [..]\fR
+The space on the disk where btrfs metadata can reside is overwritten with
+zeros.
+.TP
+
 \fBscrub start\fP [-Bdqru] {\fI\fP|\fI\fP}
 Start a scrub on all devices of the filesystem identified by \fI\fR or on
 a single \fI\fR. Without options, scrub is started as a background
-- 
1.7.10



[PATCH] man: fix btrfs man page formatting

2012-04-30 Thread Hubert Kario

Signed-off-by: Hubert Kario 
---
 man/btrfs.8.in |2 ++
 1 file changed, 2 insertions(+)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index a840f7e..d410093 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -156,6 +156,8 @@ The start position and the number of bytes to deframention 
can be specified by \
 NOTE: defragmenting with kernels up to 2.6.37 will unlink COW-ed copies of 
data, don't 
 use it if you use snapshots, have de-duplicated your data or made copies with 
 \fBcp --reflink\fP.
+.TP
+
 \fBsubvolume find-new\fR\fI  \fR
 List the recently modified files in a subvolume, after \fI\fR ID.
 .TP
-- 
1.7.10



[PATCH v2 0/5] Allow for quick removing of btrfs signature from block device

2012-05-01 Thread Hubert Kario
If a btrfs filesystem is created on a raw block device (whole disk) and
partitions with btrfs filesystems are later created on top of it, a subsequent
`btrfs device scan` won't remove the btrfs signature from the raw block device.

This patch series adds a `btrfs device zero-superblock` command to remove the
btrfs signature from a device, as well as fixing a few minor problems in the
btrfs_prepare_device function.

This obsoletes my previous patch "btrfs: add command to zero out superblock".

V1->V2 changes: better error reporting when there are problems with only one
of the devices passed as parameters.

Hubert Kario (5):
  btrfs: add command to zero out superblock
  handle null pointers in btrfs_prepare_device
  Remove unused option in btrfs_prepare_device
  better error handling in btrfs_prepare_device()
  btrfs: remove unused variables

 btrfs-vol.c|2 +-
 cmds-device.c  |   45 -
 man/btrfs.8.in |7 +++
 mkfs.c |   12 ++--
 utils.c|   41 -
 utils.h|4 ++--
 6 files changed, 84 insertions(+), 27 deletions(-)

-- 
1.7.10



[PATCH v2 2/5] handle null pointers in btrfs_prepare_device

2012-05-01 Thread Hubert Kario
When calling the function from `btrfs device zero-superblock` we don't need
the additional information returned and don't want the "SMALL VOLUME"
warning printed.

Signed-off-by: Hubert Kario 

diff --git a/utils.c b/utils.c
index ee7fa1b..6773be0 100644
--- a/utils.c
+++ b/utils.c
@@ -557,7 +557,7 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, 
u64 *block_count_ret,
}
zero_end = 1;
 
-   if (block_count < 1024 * 1024 * 1024 && !(*mixed)) {
+   if (mixed && block_count < 1024 * 1024 * 1024 && !(*mixed)) {
printf("SMALL VOLUME: forcing mixed metadata/data groups\n");
*mixed = 1;
}
@@ -588,7 +588,9 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, 
u64 *block_count_ret,
exit(1);
}
}
-   *block_count_ret = block_count;
+
+   if (block_count_ret)
+   *block_count_ret = block_count;
return 0;
 }
 
-- 
1.7.10



[PATCH v2 1/5] btrfs: add command to zero out superblock

2012-05-01 Thread Hubert Kario
Signed-off-by: Hubert Kario 

diff --git a/cmds-device.c b/cmds-device.c
index db625a6..05a549c 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -246,11 +246,58 @@ static int cmd_scan_dev(int argc, char **argv)
return 0;
 }
 
+static const char * const cmd_zero_dev_usage[] = {
+   "btrfs device zero-superblock  [ ...]",
+   "Remove btrfs filesystem superblock from devices.",
+   "WARNING! This command will make filesystem residing on the devices",
+   "completely unmountable!",
+   NULL
+};
+
+static int cmd_zero_dev(int argc, char **argv)
+{
+   int fd;
+   char *file;
+   int arg_processed;
+   int ret = 0;
+   int n;
+   u64 device_len;
+   int mixed_mode_needed = 1; /* keep btrfs_prepare_device() quiet */
+   const int ZERO_END = 1;
+
+   if( argc < 2 ) {
+   usage(cmd_zero_dev_usage);
+   }
+
+   for(arg_processed = 1; arg_processed < argc; arg_processed++) {
+   file = argv[arg_processed];
+
+   fd = open(file, O_RDWR);
+   if (fd < 0) {
+   fprintf(stderr, "Unable to open %s\n", file);
+   ret |= 1;
+   continue;
+   }
+
+   n = btrfs_prepare_device(fd, file, ZERO_END, &device_len,
+   &mixed_mode_needed);
+   if (n) {
+   fprintf(stderr, "Error when zeroing out %s\n", file);
+   ret |= n;
+   }
+
+   close(fd);
+   }
+
+   return ret;
+}
+
 const struct cmd_group device_cmd_group = {
device_cmd_group_usage, NULL, {
{ "add", cmd_add_dev, cmd_add_dev_usage, NULL, 0 },
{ "delete", cmd_rm_dev, cmd_rm_dev_usage, NULL, 0 },
{ "scan", cmd_scan_dev, cmd_scan_dev_usage, NULL, 0 },
+   { "zero-superblock", cmd_zero_dev, cmd_zero_dev_usage, NULL, 0 
},
{ 0, 0, 0, 0, 0 }
}
 };
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index be478e0..a840f7e 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -39,6 +39,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBdevice delete\fP\fI  [...]  \fP
 .PP
+\fBbtrfs\fP \fBdevice zero-superblock\fP\fI  [...] \fP
+.PP
 \fBbtrfs\fP \fBscrub start\fP [-Bdqru] {\fI\fP|\fI\fP}
 .PP
 \fBbtrfs\fP \fBscrub cancel\fP {\fI\fP|\fI\fP}
@@ -230,6 +232,11 @@ Finally, if \fB--all-devices\fP is passed, all the devices under /dev are
 scanned.
 .TP
 
+\fBdevice zero-superblock\fR\fI <dev> [<dev>..]\fR
+The space on the disk where btrfs metadata can reside is overwritten with
+zeros.
+.TP
+
\fBscrub start\fP [-Bdqru] {\fI<path>\fP|\fI<device>\fP}
Start a scrub on all devices of the filesystem identified by \fI<path>\fR or on
a single \fI<device>\fR. Without options, scrub is started as a background
-- 
1.7.10

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/5] Remove unused option in btrfs_prepare_device

2012-05-01 Thread Hubert Kario
zero_end is set explicitly to 1 inside the function, so the device end
will always be zeroed out

Signed-off-by: Hubert Kario 

diff --git a/btrfs-vol.c b/btrfs-vol.c
index 0efdbc1..c7b9f80 100644
--- a/btrfs-vol.c
+++ b/btrfs-vol.c
@@ -150,7 +150,7 @@ int main(int ac, char **av)
if (cmd == BTRFS_IOC_ADD_DEV) {
int mixed = 0;
 
-   ret = btrfs_prepare_device(devfd, device, 1, &dev_block_count, &mixed);
+   ret = btrfs_prepare_device(devfd, device, &dev_block_count, &mixed);
if (ret) {
fprintf(stderr, "Unable to init %s\n", device);
exit(1);
diff --git a/cmds-device.c b/cmds-device.c
index 05a549c..a28752f 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -107,7 +107,7 @@ static int cmd_add_dev(int argc, char **argv)
continue;
}
 
-   res = btrfs_prepare_device(devfd, argv[i], 1, &dev_block_count, &mixed);
+   res = btrfs_prepare_device(devfd, argv[i], &dev_block_count, &mixed);
if (res) {
fprintf(stderr, "ERROR: Unable to init '%s'\n", argv[i]);
close(devfd);
@@ -263,7 +263,6 @@ static int cmd_zero_dev(int argc, char **argv)
int n;
u64 device_len;
int mixed_mode_needed = 1; /* keep btrfs_prepare_device() quiet */
-   const int ZERO_END = 1;
 
if( argc < 2 ) {
usage(cmd_zero_dev_usage);
@@ -279,8 +278,8 @@ static int cmd_zero_dev(int argc, char **argv)
continue;
}
 
-   n = btrfs_prepare_device(fd, file, ZERO_END, &device_len,
-   &mixed_mode_needed);
+   n = btrfs_prepare_device(fd, file, &device_len,
+&mixed_mode_needed);
if (n) {
fprintf(stderr, "Error when zeroing out %s\n", file);
ret |= n;
diff --git a/mkfs.c b/mkfs.c
index c531ef2..7d1165f 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1209,7 +1209,6 @@ int main(int ac, char **av)
u32 sectorsize = 4096;
u32 nodesize = leafsize;
u32 stripesize = 4096;
-   int zero_end = 1;
int option_index = 0;
int fd;
int ret;
@@ -1264,7 +1263,6 @@ int main(int ac, char **av)
   "metadata/data groups\n");
mixed = 1;
}
-   zero_end = 0;
break;
case 'V':
print_version();
@@ -1311,7 +1309,7 @@ int main(int ac, char **av)
exit(1);
}
first_file = file;
-   ret = btrfs_prepare_device(fd, file, zero_end, &dev_block_count, &mixed);
+   ret = btrfs_prepare_device(fd, file, &dev_block_count, &mixed);
if (block_count == 0)
block_count = dev_block_count;
} else {
@@ -1376,7 +1374,6 @@ int main(int ac, char **av)
 
btrfs_register_one_device(file);
 
-   zero_end = 1;
while(ac-- > 0) {
int old_mixed = mixed;
 
@@ -1404,8 +1401,7 @@ int main(int ac, char **av)
close(fd);
continue;
}
-   ret = btrfs_prepare_device(fd, file, zero_end,
-  &dev_block_count, &mixed);
+   ret = btrfs_prepare_device(fd, file, &dev_block_count, &mixed);
mixed = old_mixed;
BUG_ON(ret);
 
diff --git a/utils.c b/utils.c
index 6773be0..e2c72ad 100644
--- a/utils.c
+++ b/utils.c
@@ -536,8 +536,7 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
return 0;
 }
 
-int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
-int *mixed)
+int btrfs_prepare_device(int fd, char *file, u64 *block_count_ret, int *mixed)
 {
u64 block_count;
u64 bytenr;
@@ -555,7 +554,6 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
fprintf(stderr, "unable to find %s size\n", file);
exit(1);
}
-   zero_end = 1;
 
if (mixed && block_count < 1024 * 1024 * 1024 && !(*mixed)) {
printf("SMALL VOLUME: forcing mixed metadata/data groups\n");
@@ -581,12 +579,10 @@ int btrfs_prepare_device(int fd, char *file, int zero_end, u64 *block_count_ret,
zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
}
 
-   if (zero_end) {
-   ret = zero_dev_end(fd, block_count);
-   if (ret) {
-

[PATCH v2 4/5] better error handling in btrfs_prepare_device()

2012-05-01 Thread Hubert Kario
btrfs_prepare_device used to abort the whole application on any error,
even when there were other queued tasks that could succeed; now it
returns a non-zero value on error.

Add more descriptive error messages: print failing device name and
cause of error.

Signed-off-by: Hubert Kario 

diff --git a/mkfs.c b/mkfs.c
index 7d1165f..9a58f67 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1310,6 +1310,10 @@ int main(int ac, char **av)
}
first_file = file;
ret = btrfs_prepare_device(fd, file, &dev_block_count, &mixed);
+   if (ret) {
+   fprintf(stderr, "Unable to init device %s\n", file);
+   exit(1);
+   }
if (block_count == 0)
block_count = dev_block_count;
} else {
diff --git a/utils.c b/utils.c
index e2c72ad..139ba8a 100644
--- a/utils.c
+++ b/utils.c
@@ -546,13 +546,13 @@ int btrfs_prepare_device(int fd, char *file, u64 *block_count_ret, int *mixed)
ret = fstat(fd, &st);
if (ret < 0) {
fprintf(stderr, "unable to stat %s\n", file);
-   exit(1);
+   return 1;
}
 
block_count = device_size(fd, &st);
if (block_count == 0) {
fprintf(stderr, "unable to find %s size\n", file);
-   exit(1);
+   return 1;
}
 
if (mixed && block_count < 1024 * 1024 * 1024 && !(*mixed)) {
@@ -568,21 +568,30 @@ int btrfs_prepare_device(int fd, char *file, u64 *block_count_ret, int *mixed)
 
ret = zero_dev_start(fd);
if (ret) {
-   fprintf(stderr, "failed to zero device start %d\n", ret);
-   exit(1);
+   fprintf(stderr, "failed to zero device %s start: %s\n", file,
+   strerror(-ret));
+   return 1;
}
 
for (i = 0 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
bytenr = btrfs_sb_offset(i);
-   if (bytenr >= block_count)
+   /* don't zero the superblock if it's on device end boundary,
+* it will be zeroed by zero_dev_end() anyway */
+   if (bytenr + BTRFS_SUPER_INFO_SIZE >= block_count)
break;
-   zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+   ret = zero_blocks(fd, bytenr, BTRFS_SUPER_INFO_SIZE);
+   if (ret) {
+   fprintf(stderr, "failed to zero superblock no. %i "
+   "(at %lli) on device %s: %s\n", i, bytenr,
+   file, strerror(-ret));
+   return 1;
+   }
}
 
ret = zero_dev_end(fd, block_count);
if (ret) {
-   fprintf(stderr, "failed to zero device end %d\n", ret);
-   exit(1);
+   fprintf(stderr, "failed to zero device %s end: %s\n", file, strerror(-ret));
+   return 1;
}
 
if (block_count_ret)
-- 
1.7.10



[PATCH v2 5/5] btrfs: remove unused variables

2012-05-01 Thread Hubert Kario
Signed-off-by: Hubert Kario 

diff --git a/cmds-device.c b/cmds-device.c
index a28752f..b1e70f9 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -261,8 +261,6 @@ static int cmd_zero_dev(int argc, char **argv)
int arg_processed;
int ret = 0;
int n;
-   u64 device_len;
-   int mixed_mode_needed = 1; /* keep btrfs_prepare_device() quiet */
 
if( argc < 2 ) {
usage(cmd_zero_dev_usage);
@@ -278,8 +276,7 @@ static int cmd_zero_dev(int argc, char **argv)
continue;
}
 
-   n = btrfs_prepare_device(fd, file, &device_len,
-&mixed_mode_needed);
+   n = btrfs_prepare_device(fd, file, NULL, NULL);
if (n) {
fprintf(stderr, "Error when zeroing out %s\n", file);
ret |= n;
-- 
1.7.10



Re: [PATCH v2 0/5] Allow for quick removing of btrfs signature from block device

2012-05-01 Thread Hubert Kario
On Tuesday 01 of May 2012 14:43:37 Tomasz Torcz wrote:
> On Tue, May 01, 2012 at 02:38:01PM +0200, Hubert Kario wrote:
> > This patch series adds `btrfs device zero-superblock <dev>` command to remove
> > the btrfs signature from the device as well as fix few minor problems in
> > btrfs_prepare_device function.
> 
>   Shouldn't you rather extend “wipefs” from util-linux?

wipefs v2.21.1 already detects btrfs, so there's no need to.

I'd guess that most people will be thinking about btrfs RAID management in LVM 
or MD terms, so the zero-superblock option should be available in the btrfs 
tool proper (a similar option is available in mdadm).

Adding it is trivial (look at my V1 patch) and does not introduce special 
code.

What's more, wipefs damages only the first superblock (so the fs is still 
mountable using btrfs-select-super -s 1), while btrfs device 
zero-superblock removes all superblocks.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Create subvolume from a directory?

2012-05-01 Thread Hubert Kario
On Thursday 29 of March 2012 09:24:44 Liu Bo wrote:
> On 03/29/2012 12:54 AM, Goffredo Baroncelli wrote:
> > Could you elaborate which would be the issue ?
> > "cp --reflink"-ing a file is not different than snapshotting a file. In
> > any case I could mount a snapshot and not the source subvolume.
> 
> We already have a debate about this "cross-link device":
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/9864
> 
> "cp --reflink" will use clone feature, which can share data among files, but
> metadata is preserved individually.
> 
> My case is that I can mount both a subvolume and a snapshot via "-o
> subvol=xxx" or "-o subvolid=xxx".

And how is this different from a regular snapshot of a subvolume? In the end you 
get two files pointing to the same data on the disk while having different 
metadata.

Let me rephrase it:

People don't want to be able to do:

mount /dev/lvm/btrfs /mnt/a -t btrfs -o subvol=volA
mount /dev/lvm/btrfs /mnt/b -t btrfs -o subvol=volB
cp --reflink=always /mnt/a/file /mnt/b

Just like you can't make hardlinks across `mount --bind` mountpoints, you 
shouldn't be able to cp --reflink across mountpoints. That's expected, as this 
*does* break VFS semantics.

*But* people want to be able to do this: 

mount /dev/lvm/btrfs /mnt/ -t btrfs
btrfs subvol create /mnt/subvol
big-file-creator > /mnt/subvol/BIG-file
btrfs subvol snapshot /mnt/subvol /mnt/subvol-bak
big-file-editor /mnt/subvol/BIG-file
rm /mnt/subvol-bak/BIG-file
cp --reflink=always /mnt/subvol/BIG-file /mnt/subvol-bak/BIG-file

This does not cross any VFS boundaries.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs support for efficient SSD operation (data blocks alignment)

2012-05-01 Thread Hubert Kario
On Tuesday 01 of May 2012 18:04:25 Martin wrote:
> Are 16kByte blocks/sectors useful to btrfs?
> 
> Or rather, can btrfs usefully use 16kByte blocks?

Yes, and they are already supported using -l and -n flags:

mkfs.btrfs -l $((4*4096)) -n $((4*4096)) /dev/sda1

You can also set the sector size to 16kB, but this requires 16kB memory pages.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [PATCH v2 1/5] btrfs: add command to zero out superblock

2012-05-02 Thread Hubert Kario
On Wednesday 02 of May 2012 16:28:43 David Sterba wrote:
> On Tue, May 01, 2012 at 02:40:29PM +0200, Hubert Kario wrote:
> > +static const char * const cmd_zero_dev_usage[] = {
> > +   "btrfs device zero-superblock <dev> [<dev>...]",
> 
> FYI, this step is named 'clear superblock' in kernel code as done after the
> device is removed, and I suggest to consider to name the command like
> 'clear-superblock' or 'clear-sb'.

A similar function in mdadm is called zero-superblock, so I just reused the 
name (following the principle of least surprise). Users, even admins, 
generally don't read kernel code...
 
> Also, kernel clears only the "superblock magic" string, ie. the
> "_BHRfS_M" bytes. This leaves the rest of the superblock intact, for
> possible recovery when it's cleared accidentally.

That's when a device is removed from the filesystem, not when a filesystem is 
just not used any more and you want to re-purpose the devices.

> I had prototyped a similar utility (in perl, so nothing for progs
> inclusion for now) and rewrote the magic string with _BHRf$_M ie. the
> S -> $ for visual similarity with the action performed. This allows to
> detect cleared superblocks and activate them back eventually.
> 
> I'm not sure if this is useful and sensible usecase, clearing superblock
> is a one-time action anyway, so it's more for the sake of tool
> flexibility.

Clearing the superblock is not a light decision and should generally be performed 
just before formatting the partition with some other fs or as a physical volume for 
LVM. IMHO recoverability of a "cleared" superblock is a feature hardly anyone 
would use.
 
> To your implementation: I think adding a function doing the superblock
> reset would be enough here. Something like this (in pseudocode):
> 
> for (i = 0 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
>   bytenr = btrfs_sb_offset(i);
>   "break if bytenr > device size"
>   memset(superblock buffer, CLEARPATTERN, sizeof(...))
> }
> write_all_supers(root);

That's exactly what btrfs_prepare_device does. And it's the function run by btrfs 
just before btrfs dev add, and by mkfs. Duplicating its code would be a bad 
idea.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [PATCH v2 1/5] btrfs: add command to zero out superblock

2012-05-03 Thread Hubert Kario
On Wednesday 02 of May 2012 19:36:29 David Sterba wrote:
> On Wed, May 02, 2012 at 06:42:16PM +0200, Hubert Kario wrote:
> > > I'm not sure if this is useful and sensible usecase, clearing superblock
> > > is a one-time action anyway, so it's more for the sake of tool
> > > flexibility.
> > 
> > Clearing superblock is not a light decision and should generally be
> > performed just before formatting the partition with some other fs or
> > physical volume for LVM. IMHO recoverability of "cleared" superblock is a
> > function hardly anyone would use.
> 
> googled, a few users asking about recovering from md zero-superblock, and
> the solution was to recreate the array, md is said to be smart and
> recognize traces of previous array and will not destroy it if the
> parameters are same. Point for md, btrfs does not do this.

Nice, I didn't know about this. Such functionality would be nice to have.
But I don't think that "recreate the array if the parameters are the 
same" is actually a good idea; it leaves lots of room for error. A pair of functions:

btrfs dev zero-superblock
btrfs dev restore-superblock

would be a better solution IMO
 
> > > To your implementation: I think adding a function doing the superblock
> > > reset would be enough here. Something like this (in pseudocode):
> > > 
> > > for (i = 0 ; i < BTRFS_SUPER_MIRROR_MAX; i++) {
> > > 
> > >   bytenr = btrfs_sb_offset(i);
> > >   "break if bytenr > device size"
> > >   memset(superblock buffer, CLEARPATTERN, sizeof(...))
> > > 
> > > }
> > > write_all_supers(root);
> > 
> > That's exactly what btrfs_prepare_device does. And it's a function run by
> > btfs just before btrfs dev add and by mkfs. Duplicating its code would be
> > a bad idea.
> 
> Not 'exactly' IMO:
> 
> * calls TRIM/discard on the device
> * zeroes first 2 megabytes
> * zeroes all reachable superblocks
> * zeroes last 2 megabytes
> 
> Too many undocumented and unobvious side-efects.

True. But close enough ;)

> Code duplication can be avoided by factoring the 'zero superblock' into
> a function and calling it from btrfs_prepare_device().

Then there's also the "actually zero" vs "reversibly destroy" difference, but 
it's trivial to handle with a single option.

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: ran cppcheck and going to make cosmetic changes--where do I submit code?

2012-05-03 Thread Hubert Kario
On Thursday 03 of May 2012 09:06:31 Josh Marshall wrote:
> I ran cppcheck on btrfs and it looks like it says there could be a few
> cosmetic changes, such as reducing the scope of some variables.  I'm
> guessing that it won't be too terrible if I do make the changes, but
> where do I submit the code when I'm done?

You send the patches to this mailing list.

Look at `git format-patch` and `git send-email` as far as tools are concerned.
If you're looking for tips on creating patches for Linux, see
Documentation/SubmitChecklist and Documentation/SubmittingPatches in your
Linux checkout.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: Create subvolume from a directory?

2012-05-03 Thread Hubert Kario
On Wednesday 02 of May 2012 18:33:37 David Sterba wrote:
> On Tue, May 01, 2012 at 07:09:21PM +0200, Hubert Kario wrote:
> > Let me rephrase it:
> >
> > People don't want to be able to do:
> >
> > mount /dev/lvm/btrfs /mnt/a -t btrfs -o subvol=volA
> > mount /dev/lvm/btrfs /mnt/b -t btrfs -o subvol=volB
> > cp --reflink=always /mnt/a/file /mnt/b
> >
> > Just like you can't do hardlinks over `mount --bind` mountpoints, you
> > shouldn't be able to cp reflink over mountpoints. That's expected as this
> > *does* break VFS semantics.
>
> Proposed fix (incremental on top of the cross-subvol):

I'm a noob as far as kernel development is concerned so take my comments for
what they're worth.

> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2321,6 +2321,10 @@ static noinline long btrfs_ioctl_clone(struct file
> *file, unsigned long srcfd, goto out_drop_write;
> }
>
> +   ret = -EXDEV;
> +   if (src_file->f_path.mnt != file->f_path.mnt)

I'm not sure about this comparison. Is the f_path struct member used just as a
reference to some general kernel structure?

> +   goto out_fput;
> +
> src = src_file->f_dentry->d_inode;
>
> ret = -EINVAL;

Other than that, looks OK.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: Hot data Tracking

2012-05-03 Thread Hubert Kario
On Thursday 03 of May 2012 15:09:25 Waxhead wrote:
> David Sterba wrote:
> > On Sat, Feb 11, 2012 at 05:49:41AM +0100, Timo Witte wrote:
> >> What happened to the hot data tracking feature in btrfs? There are a lot
> >> of old patches from aug 2010, but it looks like the feature has been
> >> completly removed from the current version of btrfs. Is this feature
> >> still on the roadmap?
> > 
> > Removed? AFAIK it hasn't been ever merged, though it's be a nice
> > feature. There were suggestions to turn it into a generic API for any
> > filesystem to use, but this hasn't happened.
> > 
> > The patches are quite independent and it was easy to refresh them on top
> > of current for-linus branch. A test run did not survive a "random"
> > xfstest, 013 this time, so I probably mismerged some bits. The patchset
> > lives in branch foreign/ibm/hotdatatrack in my git repo.
> > 
> > 
> > david
> 
> Someone recently mentioned bcache in another post who seems to cover
> this subject fairly well.

bcache makes one very specific assumption that isn't met by btrfs: overwriting old 
data in a file writes the data to the same place on the disk, and the same goes for 
metadata. In other words, it won't work with a COW file system.

> However would it not make sense if btrfs
> actually was able to automatically take advantage of whatever disks is
> added to the pool? For example if you have 10 disk of different size and
> performance in a raid5/6 like configuration would it not be feasible if
> btrfs automagically (option) could manage it's own cache? For example it
> could reserve a chunk of free space as cache (based on how much data is
> free) and stripe data over all disks (cache). When the filesystem
> becomes idle or at set intervals it could empty the cache or
> move/rebalance pending writes over to the original raid5/6 like setup.
> As far as I remember hot data tracking was all about moving the data
> over to the fastest disk. Why not utilize all disks and benefit from
> disks working together?

For this to work, you need a feature set that allows hot data movement between 
disks and data restriping. Such a cache feature would then reuse much of the same 
code.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs scrub BUG: unable to handle kernel NULL pointer dereference

2012-05-04 Thread Hubert Kario
On Friday 04 of May 2012 08:00:59 Arne Jansen wrote:
> On 04.05.2012 05:19, Mark Murawski wrote:
> > I think I have some failing hard drives, they are disconnected for now.
> > 
> > stan {~} root# btrfs filesystem show
> > Label: none  uuid: d71404d4-468e-47d5-8f06-3b65fa7776aa
> > 
> > Total devices 2 FS bytes used 6.27GB
> > devid1 size 9.31GB used 8.16GB path /dev/sde6
> > *** Some devices missing
> > 
> > Label: none  uuid: b142f575-df1c-4a57-8846-a43b979e2e09
> > 
> > Total devices 8 FS bytes used 1.35TB
> > devid3 size 149.05GB used 119.01GB path /dev/sdf
> > devid6 size 920.34GB used 408.00GB path /dev/sde7
> > devid7 size 1.36TB used 885.00GB path /dev/sdd
> > devid4 size 149.05GB used 119.01GB path /dev/sdc
> > devid2 size 465.76GB used 435.00GB path /dev/sdb
> > devid8 size 1.36TB used 885.00GB path /dev/sda
> > *** Some devices missing
> > 
> > stan {~} root# btrfs scrub status /storage
> > Killed
> 
> This is probably the same issue this patch fixes:
> 
> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/16302
> 
> It is included in a recent RC.

OK, that will teach me to write more descriptive commit comments:
http://permalink.gmane.org/gmane.comp.file-systems.btrfs/15856
http://permalink.gmane.org/gmane.comp.file-systems.btrfs/15857

Still, is the first one patching a real vulnerability?

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs and 1 billion small files

2012-05-07 Thread Hubert Kario
On Monday 07 of May 2012 11:28:13 Alessio Focardi wrote:
> Hi,
> 
> I need some help in designing a storage structure for 1 billion of small
> files (<512 Bytes), and I was wondering how btrfs will fit in this
> scenario. Keep in mind that I never worked with btrfs - I just read some
> documentation and browsed this mailing list - so forgive me if my questions
> are silly! :X
> 
> 
> On with the main questions, then:
> 
> - What's the advice to maximize disk capacity using such small files, even
> sacrificing some speed?
> 
> - Would you store all the files "flat", or would you build a hierarchical
> tree of directories to speed up file lookups? (basically duplicating the
> filesystem Btree indexes)
> 
> 
> I tried to answer those questions, and here is what I found:
> 
> it seems that the smallest block size is 4K. So, in this scenario, if every
> file uses a full block I will end up with lots of space wasted. Wouldn't
> change much if block was 2K, anyhow.
> 
> I tough about compression, but is not clear to me the compression is handled
> at the file level or at the block level.
> 
> Also I read that there is a mode that uses blocks for shared storage of
> metadata and data, designed for small filesystems. Haven't found any other
> info about it.
> 
> 
> Still is not yet clear to me if btrfs can fit my situation, would you
> recommend it over XFS?
> 
> XFS has a minimum block size of 512, but BTRFS is more modern and, given the
> fact that is able to handle indexes on his own, it could help us speed up
> file operations (could it?)
> 
> Thank you for any advice!
> 

btrfs will inline such small files in metadata blocks.

I'm not sure about the limits on directory size, but I'd guess that going over a 
few tens of thousands of files in a single flat directory will have speed 
penalties.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: kernel 3.3.4 damages filesystem (?)

2012-05-08 Thread Hubert Kario
On Tuesday 08 of May 2012 12:00:00 Helmut Hullen wrote:
> Hallo, Fajar,
> 
> Du meintest am 08.05.12:
> >>> And you can use three BTRFS filesystems the same way as three Ext4
> >>> filesystems if you prefer such a setup if the time spent for
> >>> restoring the backup does not make up the cost for one additional
> >>> disk for you.
> >> 
> >> But where's the gain? If a disk fails I have a lot of tools for
> >> repairing an ext2/3/4 system.
> > 
> > It won't work if you use it in RAID0 (e.g. with LVM spanning three
> > disks, then use ext4 on top of the LV).
> 
> But when I use ext2/3/4 I neither need RAID0 nor do I need LVM.
> 
> > As others said, if your only concern is "if a disk is dead, I want to
> > be able to access data on other disks", then simply use btrfs as
> > three different fs, mounted on three directories.
> 
> But then I don't need especially btrfs.
> 
> > btrfs will shine when:
> > - you need checksum and self-healing in raid10 mode
> > - you have lots of small files
> > - you have highly compressible content
> > - you need snapshot/clone feature
> 
> For my video collection (mpeg2) nothing fits ...
> 
> The only advantage I see with btrfs is
> 
> adding a bigger disk
> deleting/removing a smaller disk
> 
> with really simple commands.

Playing the Devil's advocate here (not that I don't use The Other Linux FS 
;)

I don't see btrfs commands much different from
pvcreate /dev/new-disk
vgextend videos-volume-42 /dev/new-disk
pvmove /dev/old-disk /dev/new-disk
vgreduce videos-volume-42 /dev/old-disk
resize2fs /dev/videos-volume-42/logical-volume

Unlike with shrinking, there's really no room for error. Messing up those 
commands will give quite clear error messages and definitely won't destroy 
data (unless a hardware error occurs). And the FS on the LV is online all 
the time, just like with btrfs.

The only difference is that with btrfs you can both extend and shrink the FS 
online, with ext2/3/4 you can only extend online...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Hubert Kario
On Tuesday 08 of May 2012 04:45:51 cwillu wrote:
> On Tue, May 8, 2012 at 1:36 AM, Fajar A. Nugraha  wrote:
> > On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer  
wrote:
> >> Hi,
> >> 
> >> I have a quite unreliable SSD here which develops some bad blocks from
> >> time to time which result in read-errors.
> >> Once the block is written to again, its remapped internally and
> >> everything is fine again for that block.
> >> 
> >> Would it be possible to create 2 btrfs partitions on that drive and
> >> use it in RAID1 - with btrfs silently repairing read-errors when they
> >> occur?
> >> Would it require special settings, to not fallback to read-only mode
> >> when a read-error occurs?
> > 
> > The problem would be how the SSD (and linux) behaves when it
> > encounters bad blocks (not bad disks, which is easier).
> > 
> > If it does "oh, I can't read this block. I just return an error
> > immediately", then it's good.
> > 
> > However, in most situation, it would be like "hmmm, I can't read this
> > block, let me retry that again. What? still error? then lets retry it
> > again, and again.", which could take several minutes for a single bad
> > block. And during that time linux (the kernel) would do something like
> > "hey, the disk is not responding. Why don't we try some stuff? Let's
> > try resetting the link. If it doesn't work, try downgrading the link
> > speed".
> > 
> > In short, if you KNOW the SSD is already showing signs of bad blocks,
> > better just throw it away.
> 
> The excessive number of retries (basically, the kernel repeating the
> work the drive already attempted) is being addressed in the block
> layer.
> 
> "[PATCH] libata-eh don't waste time retrying media errors (v3)", I
> believe this is queued for 3.5

I just hope they don't remove retries completely; I've seen the second or 
third try return correct data on multiple disks from different vendors. 
(Which allowed me to use dd to write the data back to force relocation.)

But yes, Linux is a bit too overzealous with regard to retries...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [PATCH v2 1/5] btrfs: add command to zero out superblock

2012-05-09 Thread Hubert Kario
On Wednesday 09 of May 2012 19:18:07 David Sterba wrote:
> On Thu, May 03, 2012 at 03:11:45PM +0200, Hubert Kario wrote:
> > nice, didn't know about this. Such functionality would be nice to have.
> > But then I don't think that a "recreate the array if the parameters are
> > the
> > same" is actually a good idea, lots of space for error. A pair of
> > functions:
> > 
> > btrfs dev zero-superblock
> > btrfs dev restore-superblock
> 
> As a user, I'm not sure what can I expect from the restore command. From
> where does it restore? Eg. a file?
> 
> As a tester I have use for a temporary clearing of a superblock on a
> device, then mount it with -o degraded, work work, and then undo
> clearing. So, my idea is like
> 
>   btrfs device zero-superblock --undo
> 
> with the obvious sanity checks. A regular user would never need to call
> this.

Yes, that's a better idea.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: kernel 3.3.4 damages filesystem (?)

2012-05-10 Thread Hubert Kario
On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> Am Mittwoch, 9. Mai 2012 schrieb Kaspar Schleiser:
> > Hi,
> > 
> > On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > > Regarding btrfs, AFAIK even "btrfs -d single" suggested above works
> > > not "per file", but per allocation extent, so in case of one disk
> > > failure you will lose random *parts* (extents) of random files,
> > > which in effect could mean no file in your whole file system will
> > > remain undamaged.
> > 
> > Maybe we should evaluate the possiblility of such a "one file gets on
> > one disk" feature.
> > 
> > Helmut Hullen has the use case: Many disks, totally non-critical but
> > nice-to-have data. If one disk dies, some *files* should lost, not some
> > *random parts of all files*.
> > 
> > This could be accomplished by some userspace-tool that moves stuff
> > around, combined with "file pinning"-support, that lets the user make
> > sure a specific file is on a specific disk.
> 
> Yeah, basically I think thats the whole point Helmut is trying to make.
> 
> I am not sure whether that should be in userspace. It could be just an
> allocation mode like "raid0" or "single". Such as "single" as in one file
> is really on one disk and thats it.

I was thinking that "linear" would be a good name for the old-style allocator.

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs RAID with enterprise SATA or SAS drives

2012-05-10 Thread Hubert Kario
On Wednesday 09 of May 2012 22:01:49 Daniel Pocock wrote:
> There is various information about
> - enterprise-class drives (either SAS or just enterprise SATA)
> - the SCSI/SAS protocols themselves vs SATA
> having more advanced features (e.g. for dealing with error conditions)
> than the average block device
> 
> For example, Adaptec recommends that such drives will work better with
> their hardware RAID cards:
> 
> http://ask.adaptec.com/cgi-bin/adaptec_tic.cfg/php/enduser/std_adp.php?p_f
> aqid=14596 "Desktop class disk drives have an error recovery feature that
> will result in a continuous retry of the drive (read or write) when an
> error is encountered, such as a bad sector. In a RAID array this can
> cause the RAID controller to time-out while waiting for the drive to
> respond."
> 
> and this blog:
> http://www.adaptec.com/blog/?p=901
> "major advantages to enterprise drives (TLER for one) ... opt for the
> enterprise drives in a RAID environment no matter what the cost of the
> drive over the desktop drive"
> 
> My question..
> 
> - does btrfs RAID1 actively use the more advanced features of these
> drives, e.g. to work around errors without getting stuck on a bad block?

There are no (short) timeouts in btrfs that I know of.

> - if a non-RAID SAS card is used, does it matter which card is chosen?
> Does btrfs work equally well with all of them?

If you're using btrfs RAID, you need an HBA, not a RAID card. If the RAID 
card can work as an HBA (usually labelled "JBOD mode") then you're good to 
go.

For example, HP CCISS controllers can't work in JBOD mode.

If you're using the RAID feature of the card, then you need to look at 
general Linux support; btrfs doesn't do anything with the block devices 
that other filesystems don't.
 
> - ignoring the better MTBF and seek times of these drives, do any of the
> other features passively contribute to a better RAID experience when
> using btrfs?

Whether they really have higher MTBF values is debatable...

Seek times do matter very much to btrfs; a fast CPU is also a good thing 
to have, especially if you want to use data compression or large node and 
leaf sizes.

> - for someone using SAS or enterprise SATA drives with Linux, I
> understand btrfs gives the extra benefit of checksums, are there any
> other specific benefits over using mdadm or dmraid?

Because btrfs knows when a drive is misbehaving (thanks to checksums) and 
is returning bad data, it can detect problems much faster than RAID (which 
doesn't use the redundancy to check whether the data it returns is 
actually correct). Both hardware and software RAID implementations depend 
on the drives to return IO errors. In effect, the data is safer on btrfs 
than on regular RAID.

Besides that, there's online resize (both shrinking and growing) and the 
(currently not implemented) ability to set the redundancy level on a 
per-file basis. In other words, with btrfs you can have a file with RAID6 
redundancy and a second one with RAID10 redundancy in a single directory.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: kernel 3.3.4 damages filesystem (?)

2012-05-10 Thread Hubert Kario
On Thursday 10 of May 2012 21:15:30 Hugo Mills wrote:
> On Thu, May 10, 2012 at 09:43:58PM +0200, Hubert Kario wrote:
> > On Thursday 10 of May 2012 12:40:49 Martin Steigerwald wrote:
> > > Am Mittwoch, 9. Mai 2012 schrieb Kaspar Schleiser:
> > > > Hi,
> > > > 
> > > > On 05/08/2012 10:56 PM, Roman Mamedov wrote:
> > > > > Regarding btrfs, AFAIK even "btrfs -d single" suggested above
> > > > > works
> > > > > not "per file", but per allocation extent, so in case of one disk
> > > > > failure you will lose random *parts* (extents) of random files,
> > > > > which in effect could mean no file in your whole file system will
> > > > > remain undamaged.
> > > > 
> > > > Maybe we should evaluate the possiblility of such a "one file gets
> > > > on
> > > > one disk" feature.
> > > > 
> > > > Helmut Hullen has the use case: Many disks, totally non-critical but
> > > > nice-to-have data. If one disk dies, some *files* should lost, not
> > > > some
> > > > *random parts of all files*.
> > > > 
> > > > This could be accomplished by some userspace-tool that moves stuff
> > > > around, combined with "file pinning"-support, that lets the user
> > > > make
> > > > sure a specific file is on a specific disk.
> > > 
> > > Yeah, basically I think thats the whole point Helmut is trying to
> > > make.
> > > 
> > > I am not sure whether that should be in userspace. It could be just an
> > > allocation mode like "raid0" or "single". Such as "single" as in one
> > > file
> > > is really on one disk and thats it.
> > 
> > I was thinking that "linear" would be good name for old style allocator.
> 
>Please do distinguish between the replication level (e.g. "single",
> "RAID-1") and the allocator algorithm. These are distinct. Also, note
> that both of those work on the scale of chunks/block groups. There is
> a further consideration, which is the allocation of file data to block
> groups, which is a whole different thing again (and not something I
> know a great deal about), but which will also affect the desired
> outcome quite a lot.

Yes, I know about that.

I was thinking more along the lines of "how quickly can we restore the 
availability of the old allocator".

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: KVM on top of BTRFS

2012-06-12 Thread Hubert Kario
On Monday 11 of June 2012 23:53:34 Alex wrote:
> Matthew Hawn  yahoo.com> writes:
> > What are the recommendations for running KVM images on BTRFS systems using
> 
> Install:
> virt-install --connect qemu:///system -n china -r 256 --disk
> path=/var/lib/libvirt/images/china.img,size=4 -c
> /home/alex/debian-testing-amd64-CD-1.iso --vnc --noautoconsole --os-type
> linux --os-variant debianwheezy --accelerate --network=bridge:br0 --hvm
> 
> This runs much slower than expected - have done many debian bare minimum
> installs like this before. Can't hear any disk thrashing. Doesn't appear to
> be CPU or memory bound - will double check.

From what I heard, this is caused by slow KVM CD virtualisation.

Try to install it and do some tests then.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: ENOSPC on heterogeneous raid 0

2010-12-12 Thread Hubert Kario
On Wednesday 08 of December 2010 22:53:25 William Sheffler wrote:
> Hello btrfs community.
> 
> First off, thanks for all your hard work... I have been following
> btrfs with interest for several years now and very much look forward
> to the day it replaces ext4. The real killer feature (of btrfs
> specifically) for me is the ability to add *and remove* devices from a
> filesystem, as this allows rolling upgrades of my server's disks. I
> have a 16 port 3ware 1650SE on which I have a number of small raid
> units and it will be fantastic to be able to remove the oldest,
> upgrade, and add the new storage back. I had previously been using
> ZFS, but since ZFS doesn't allow removal of devices, this rolling
> upgrade strategy doesn't work.
> 
> My question is this: can btrfs handle striping (raid 0) across
> heterogeneous devices? I seem to be losing any capacity on the larger
> disk beyond what is available on the smaller disk. I really hope there
> is some simple fix!

Yes, it can handle striping over devices of different sizes; unfortunately 
you're still limited to *

If you want to use all the available space, use "-d single" when creating 
the volume.

for details, read the recent thread "800GB free, but no space left"
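The capacity difference can be illustrated with a little shell arithmetic. This is a simplified model (the sizes below are hypothetical example values, and real btrfs allocates in chunks, so the numbers are approximate): raid0 must stripe every allocation across both devices, while -d single can fill each device independently.

```shell
# Example device sizes in GiB (hypothetical values; btrfs allocates in
# chunks, so these figures are approximations).
small=500
large=2000

# -d raid0: every stripe spans both devices, so allocation stops once
# the smaller one is full -- roughly 2 x min(sizes).
raid0_usable=$(( 2 * small ))

# -d single: chunks land on one device at a time, so both devices can
# fill up completely.
single_usable=$(( small + large ))

echo "raid0 usable:  ${raid0_usable} GiB"
echo "single usable: ${single_usable} GiB"
```

With these example sizes, raid0 leaves 1500 GiB of the larger disk unusable, which matches the capacity loss described above.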

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Offline Deduplication for Btrfs

2011-01-06 Thread Hubert Kario
On Thursday 06 of January 2011 10:51:04 Mike Hommey wrote:
> On Thu, Jan 06, 2011 at 10:37:46AM +0100, Tomasz Chmielewski wrote:
> > >I have been thinking a lot about de-duplication for a backup application
> > >I am writing. I wrote a little script to figure out how much it would
> > >save me. For my laptop home directory, about 100 GiB of data, it was a
> > >couple of percent, depending a bit on the size of the chunks. With 4 KiB
> > >chunks, I would save about two gigabytes. (That's assuming no MD5 hash
> > >collisions.) I don't have VM images, but I do have a fair bit of saved
> > >e-mail. So, for backups, I concluded it was worth it to provide an
> > >option to do this. I have no opinion on whether it is worthwhile to do
> > >in btrfs.
> > 
> > Online deduplication is very useful for backups of big,
> > multi-gigabyte files which change constantly.
> > Some mail servers store files this way; some MUA store the files
> > like this; databases are also common to pack everything in big files
> > which tend to change here and there almost all the time.
> > 
> > Multi-gigabyte files which only have few megabytes changed can't be
> > hardlinked; simple maths shows that even compressing multiple files
> > which have few differences will lead to greater space usage than a
> > few megabytes extra in each (because everything else is
> > deduplicated).
> > 
> > And I don't even want to think about IO needed to offline dedup a
> > multi-terabyte storage (1 TB disks and bigger are becoming standard
> > nowadays) i.e. daily, especially when the storage is already heavily
> > used in IO terms.
> > 
> > 
> > Now, one popular tool which can deal with small changes in files is
> > rsync. It can be used to copy files over the network - so that if
> > you want to copy/update a multi-gigabyte file which only has a few
> > changes, rsync would need to transfer just a few megabytes.
> > 
> > On disk however, rsync creates a "temporary copy" of the original
> > file, where it packs unchanged contents together with any changes
> > made. For example, while it copies/updates a file, we will have:
> > 
> > original_file.bin
> > .temporary_random_name
> > 
> > Later, original_file.bin would be removed, and
> > .temporary_random_name would be renamed to original_file.bin. Here
> > goes away any deduplication we had so far, we have to start the IO
> > over again.
> 
> Sounds like all you need is cp --reflink=always and rsync --inplace.
> 
> Haven't tested is that works well, though.

It works very well; btrfs with snapshots, compression and rsync --inplace 
has better storage utilisation than lessfs at around 10-15 snapshots with 
around 600GB of test data in small files.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Synching a Backup Server

2011-01-07 Thread Hubert Kario
On Friday, January 07, 2011 00:07:37 Carl Cook wrote:
> On Thu 06 January 2011 14:26:30 Carl Cook wrote:
> > According To Doyle...
> 
> Er, Hoyle...
> 
> I am trying to create a multi-device BTRFS system using two identical
> drives. I want them to be raid 0 for no redunancy, and a total of 4TB.
> But in the wiki it says nothing about using fdisk to set up the drive
> first.  It just basically says for me to: mkfs.btrfs -m raid0 /dev/sdc
> /dev/sdd

I'd suggest at least 
mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
if you really want raid0

> 
> Seems to me that for mdadm I had to set each drive as a raid member,
> assemble the array, then format.  Is this not the case with BTRFS?
> 
> Also in the wiki it says "After a reboot or reloading the btrfs module,
> you'll need to use btrfs device scan to discover all multi-device
> filesystems on the machine".  Is this not done automatically?  Do I have
> to set up some script to do this?
> 

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Synching a Backup Server

2011-01-07 Thread Hubert Kario
On Thursday, January 06, 2011 22:52:25 Freddie Cash wrote:
> On Thu, Jan 6, 2011 at 1:42 PM, Carl Cook  wrote:
> > On Thu 06 January 2011 11:16:49 Freddie Cash wrote:
> >> > Also with this system, I'm concerned that if there is corruption on
> >> > the HTPC, it could be propagated to the backup server.  Is there some
> >> > way to address this?  Longer intervals to sync, so I have a chance to
> >> > discover?
> >> 
> >> Using snapshots on the backup server allows you to go back in time to
> >> recover files that may have been accidentally deleted, or to recover
> >> files that have been corrupted.
> > 
> > How?  I can see that rsync will not transfer the files that have not
> > changed, but I assume it transfers the changed ones.  How can you go
> > back in time?  Is there like a snapshot file that records the state of
> > all files there?
> 
> I don't know the specifics of how it works in btrfs, but it should be
> similar to how ZFS does it.  The gist of it is:
> 
> Each snapshot gives you a point-in-time view of the entire filesystem.
>  Each snapshot can be mounted (ZFS is read-only; btrfs is read-only or
> read-write).  So, you mount the snapshot for 2010-12-15 onto /mnt,
> then cd to the directory you want (/mnt/htpc/home/fcash/videos/) and
> copy the file out that you want to restore (cp coolvid.avi ~/).
> 
> With ZFS, things are nice and simple:
>   - each filesystem has a .zfs/snapshot directory
>   - in there are sub-directories, each named after the snapshot name
>   - cd into the snapshot name, the OS auto-mounts the snapshot, and off you
> go
> 
> Btrfs should be similar?  Don't know the specifics.
> 
> How it works internally, is some of the magic and the beauty of
> Copy-on-Write filesystems.  :)

I usually create subvolumes in btrfs root volume:

/mnt/btrfs/
|- server-a
|- server-b
\- server-c

then create snapshots of these directories:

/mnt/btrfs/
|- server-a
|- server-b
|- server-c
|- snapshots-server-a
|  |- @GMT-2010.12.21-16.48.09
|  \- @GMT-2010.12.22-16.45.14
|- snapshots-server-b
\- snapshots-server-c

This way I can use samba's shadow_copy module to publish the snapshots 
to Windows clients.
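For reference, the @GMT-... directory names above are exactly the format samba's shadow_copy module parses: UTC time as @GMT-%Y.%m.%d-%H.%M.%S. A small sketch of generating one; the subvolume paths are just the example ones from this message, and the btrfs command is only echoed, not run:

```shell
# Timestamp in the @GMT format that samba's shadow_copy module expects
# (note -u: the timestamp must be in UTC).
snapname=$(date -u '+@GMT-%Y.%m.%d-%H.%M.%S')

# The snapshot command this would feed into (echoed here, not executed):
echo "btrfs subvolume snapshot /mnt/btrfs/server-a \\"
echo "    /mnt/btrfs/snapshots-server-a/$snapname"
```

Getting the UTC part right matters: shadow_copy silently ignores directories whose names don't match the pattern.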
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Atomic file data replace API

2011-01-07 Thread Hubert Kario
On Friday, January 07, 2011 17:12:11 Chris Mason wrote:
> Excerpts from Olaf van der Spek's message of 2011-01-07 10:17:31 -0500:
> > On Fri, Jan 7, 2011 at 4:13 PM, Chris Mason wrote:
> > >> That's not what I asked. ;)
> > >> I asked to wait until the first write (or close). That way, you don't
> > >> get unintentional empty files.
> > >> One step further, you don't have to keep the data in memory, you're
> > >> free to write them to disk. You just wouldn't update the meta-data
> > >> (yet).
> > > 
> > > Sorry ;) Picture an application that truncates 1024 files without
> > > closing any of them.  Basically any operation that includes the kernel
> > > waiting for applications because they promise to do something soon is
> > > a denial of service attack, or a really easy way to run out of memory
> > > on the box.
> > 
> > I'm not sure why you would run out of memory in that case.
> 
> Well, lets make sure I've got a good handle on the proposed interface:
> 
> 1) fd = open(some_file, O_ATOMIC)
> 2) truncate(fd, 0)
> 3) write(fd, new data)
> 
> The semantics are that we promise not to let the truncate hit the disk
> until the application does the write.
> 
> We have a few choices on how we do this:
> 
> 1) Leave the disk untouched, but keep something in memory that says this
> inode is really truncated
> 
> 2) Record on disk that we've done our atomic truncate but it is still
> pending.  We'd need some way to remove or invalidate this record after a
> crash.
> 
> 3) Go ahead and do the operation but don't allow the transaction to
> commit until the write is done.
> 
> option #1: keep something in memory.  Well, any time we have a
> requirement to pin something in memory until userland decides to do a
> write, we risk oom.

Userland already has a file descriptor allocated (which can fail anyway 
because of OOM); I see no problem in increasing kernel memory usage by 4 
bytes (if not less) just to note that the application wants to see the 
file as truncated (1 bit) and that the next write has to be atomic (a 2nd 
bit?).

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Synching a Backup Server

2011-01-10 Thread Hubert Kario
On Sunday 09 of January 2011 12:46:59 Alan Chandler wrote:
> On 07/01/11 16:20, Hubert Kario wrote:
> > I usually create subvolumes in btrfs root volume:
> > 
> > /mnt/btrfs/
> > |- server-a
> > |- server-b
> > \- server-c
> > 
> > then create snapshots of these directories:
> > 
> > /mnt/btrfs/
> > |- server-a
> > |- server-b
> > |- server-c
> > |- snapshots-server-a
> > |  |- @GMT-2010.12.21-16.48.09
> > |  \- @GMT-2010.12.22-16.45.14
> > |- snapshots-server-b
> > \- snapshots-server-c
> > 
> > This way I can use the shadow_copy module for samba to publish the
> > snapshots to windows clients.
> 
> Can you post some actual commands to do this part

# create the default subvolume and mount it
mkfs.btrfs /dev/sdx
mount /dev/sdx /mnt/btrfs
# to be able to snapshot individual servers we have to put them to individual
# subvolumes
btrfs subvolume create /mnt/btrfs/server-a
btrfs subvolume create /mnt/btrfs/server-b
btrfs subvolume create /mnt/btrfs/server-c
# copy data over
rsync --exclude /proc [...] r...@server-a:/ /mnt/btrfs/server-a
rsync --exclude /proc [...] r...@server-b:/ /mnt/btrfs/server-b
rsync --exclude /proc [...] r...@server-c:/ /mnt/btrfs/server-c
# create snapshot directories (in the default subvolume)
mkdir /mnt/btrfs/{snapshots-server-a,snapshots-server-b,snapshots-server-c}
# create snapshot from the synced data:
btrfs subvolume snapshot /mnt/btrfs/server-a \
    /mnt/btrfs/snapshots-server-a/@GMT-2010.12.21-16.48.09
# copy new data over:
rsync --inplace --exclude /proc [...] r...@server-a:/ /mnt/btrfs/server-a
# make a new snapshot
btrfs subvolume snapshot /mnt/btrfs/server-a \
    /mnt/btrfs/snapshots-server-a/@GMT-2010.12.22-16.45.14

In the end we have 5 subvolumes, 2 of which are snapshots of server-a.
> 
> I am extremely confused about btrfs subvolumes v the root filesystem and
> mounting, particularly in relation to the default subvolume.
> 
> For instance, if I create the initial file system using mkfs.btrfs and
> then mount it on /mnt/btrfs is there already a default subvolume?  or do
> I have to make one?  What happens when you unmount the whole filesystem
> and then come back
> 
> The wiki also makes the following statement
> 
> *"Note:* to be mounted the subvolume or snapshot have to be in the root
> of the btrfs filesystem."
> 
> 
> but you seems to have snapshots at one layer down from the root.
> 
> 
> I am trying to use this method for my offsite backups - to a large spare
> sata disk loaded via a usb port.
> 
> I want to create the main filesystem (and possibly a subvolume - this is
> where I start to get confused) and rsync my current daily backup files
> to it.  I would then also (just so I get the correct time - rather than
> do it at the next cycle, as explained below) take a snapshot with a time
> label. I would transport this disk offsite.
> 
> I would repeat this in a months time with a totally different disk
> 
> In a couple of months time - when I come to recycle the first disk for
> my offsite backup, I would mount the retrieved disk (and again I am
> confused - mount the complete filesystem or the subvolume?)  rsync
> (--inplace ? - is this necessary) again the various backup files from my
> server and take another snapshot.

You mount the default subvolume; this way you have access to all the data 
on the HDD. --inplace is necessary.

> 
> I am hoping that this would effectively allow me to leave the snapshot I
> took last time in place, as because not everything will have changed it
> won't have used much space - so effectively I can keep quite a long
> stream of backup snapshots in place offsite.

yes

> 
> Eventually of course the disk will start to become full, but I assume I
> can reclaim the space by deleting some of the old snapshots.

yes, of course:

btrfs subvolume delete /mnt/btrfs/snapshots-server-a/@GMT-2010.12.21-16.48.09

will reclaim the space used up by the deltas
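Pruning old snapshots can be scripted too. Below is a sketch of a retention sweep run against a throw-away directory of mock snapshot names (the dates are hypothetical, and the real command, btrfs subvolume delete, is only echoed). It relies on the fact that the @GMT naming sorts lexicographically in time order, so a plain string comparison against a cutoff name is enough:

```shell
# Mock snapshot directory with hypothetical snapshot names.
snapdir=$(mktemp -d)
mkdir "$snapdir/@GMT-2010.11.01-03.00.00" \
      "$snapdir/@GMT-2010.12.22-16.45.14"

# Keep everything at or after this cutoff; @GMT names sort
# lexicographically in time order, so string comparison suffices.
cutoff='@GMT-2010.12.01-00.00.00'

to_delete=''
for snap in "$snapdir"/@GMT-*; do
    name=${snap##*/}
    if [ "$name" \< "$cutoff" ]; then
        # A real run would do: btrfs subvolume delete "$snap"
        echo "would delete: $name"
        to_delete="$to_delete$name "
    fi
done
rm -rf "$snapdir"
```

With these mock names, only the November snapshot falls before the cutoff and would be deleted, reclaiming the space held by its deltas.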

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Backup Command

2011-01-10 Thread Hubert Kario
On Monday 10 of January 2011 14:25:32 Carl Cook wrote:
> Here is my proposed cron:
> 
> btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
> 
> rsync --archive --hard-links --delete-during --delete-excluded --inplace
> --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex hex:///home
> /media/backups/hex
> 
> btrfs subvolume snapshot droog:///home
> /media/backups/snapshots/droog-{DATE}
> 
> rsync --archive --hard-links --delete-during --delete-excluded --inplace
> --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog
> droog:///home /media/backups/droog
> 
> Comments?  Criticisms?

This will make the dates associated with snapshots offset by how often cron is 
run.

In other words, if you run the above script daily, you will have data from 
2011.01.01 in the hex-2011.01.02 directory.

I save the current date, take an LVM snapshot on the source, rsync 
--inplace the data over, and then take a local snapshot, naming the folder 
with the saved date. This way the date in the name of the backup directory 
is accurate to about a second.
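That sequence can be sketched in shell. The lvcreate/rsync/btrfs steps below are only echoed placeholders (the paths and options are illustrative); the point is merely that the timestamp is captured before the transfer starts:

```shell
# Capture the timestamp *before* the transfer, so the snapshot name
# reflects when the data was taken, not when a possibly long rsync
# happened to finish.
stamp=$(date -u '+@GMT-%Y.%m.%d-%H.%M.%S')

# Placeholder steps -- a real script would execute these:
echo "lvcreate --snapshot ...                      # freeze the source"
echo "rsync --inplace ... /mnt/btrfs/server-a      # long-running copy"
echo "btrfs subvolume snapshot /mnt/btrfs/server-a \\"
echo "    /mnt/btrfs/snapshots-server-a/$stamp"
```

If the rsync takes hours, the snapshot name still matches the moment the source was frozen.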
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Backup Command

2011-01-11 Thread Hubert Kario
On Tuesday, January 11, 2011 14:54:38 Ivan Labáth wrote:
> On 01/10/11 14:36, Hubert Kario wrote:
> > On Monday 10 of January 2011 14:25:32 Carl Cook wrote:
> >> Here is my proposed cron:
> >> 
> >> btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
> >> 
> >> rsync --archive --hard-links --delete-during --delete-excluded --inplace
> >> --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex
> >> hex:///home /media/backups/hex
> >> 
> >> btrfs subvolume snapshot droog:///home
> >> /media/backups/snapshots/droog-{DATE}
> >> 
> >> rsync --archive --hard-links --delete-during --delete-excluded --inplace
> >> --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog
> >> droog:///home /media/backups/droog
> >> 
> >> Comments?  Criticisms?
> > 
> > This will make the dates associated with snapshots offset by how often
> > cron is run.
> > 
> > In other words, if you run above script daily you will have data from
> > 2011.01.01 in the hex-2011.01.02 directory.
> > 
> > I do save the current date, do a LVM snapshot on the source, rsync
> > --inplace data over and do a local snapshot naming the folder using the
> > saved date. This way the date in the name of backup directory is exact
> > to about a second.
> 
> If you are mounting a LVM snapshot of an already mounted filesystem,
> would you be willing verify that it is really a snapshot that is mounted?
> 
> e.g. touch /mnt/live/its_alive && ls /mnt/snapshot/
> 
> I am nearly willing to bet it is not a snapshot.

Well, by "LVM snapshot on the source" I meant:
1. do lvcreate --snapshot
2. mount the newly created volume
3. use the new directory as the base for rsync
4. arrange umount and destruction of the snapshot after rsync completes 
(no matter whether it was successful)

and this will in fact not make the "its_alive" visible in /mnt/snapshot

You have to use this procedure whenever you use LVM snapshots for backup, 
no matter where you copy the data to. That's why I shortened it to a 
single point -- it's not the part that is important from the btrfs 
perspective.

Regards.
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Backup Command

2011-01-11 Thread Hubert Kario
On Tuesday, January 11, 2011 15:33:38 Ivan Labáth wrote:
> On 01/11/11 15:19, Hubert Kario wrote:
> > On Tuesday, January 11, 2011 14:54:38 Ivan Labáth wrote:
> >> On 01/10/11 14:36, Hubert Kario wrote:
> >>> On Monday 10 of January 2011 14:25:32 Carl Cook wrote:
> >>>> Here is my proposed cron:
> >>>> 
> >>>> btrfs subvolume snapshot hex:///home
> >>>> /media/backups/snapshots/hex-{DATE}
> >>>> 
> >>>> rsync --archive --hard-links --delete-during --delete-excluded
> >>>> --inplace --numeric-ids -e ssh
> >>>> --exclude-from=/media/backups/exclude-hex hex:///home
> >>>> /media/backups/hex
> >>>> 
> >>>> btrfs subvolume snapshot droog:///home
> >>>> /media/backups/snapshots/droog-{DATE}
> >>>> 
> >>>> rsync --archive --hard-links --delete-during --delete-excluded
> >>>> --inplace --numeric-ids -e ssh
> >>>> --exclude-from=/media/backups/exclude-droog droog:///home
> >>>> /media/backups/droog
> >>>> 
> >>>> Comments?  Criticisms?
> >>> 
> >>> This will make the dates associated with snapshots offset by how often
> >>> cron is run.
> >>> 
> >>> In other words, if you run above script daily you will have data from
> >>> 2011.01.01 in the hex-2011.01.02 directory.
> >>> 
> >>> I do save the current date, do a LVM snapshot on the source, rsync
> >>> --inplace data over and do a local snapshot naming the folder using the
> >>> saved date. This way the date in the name of backup directory is exact
> >>> to about a second.
> >> 
> >> If you are mounting a LVM snapshot of an already mounted filesystem,
> >> would you be willing verify that it is really a snapshot that is
> >> mounted?
> >> 
> >> e.g. touch /mnt/live/its_alive && ls /mnt/snapshot/
> >> 
> >> I am nearly willing to bet it is not a snapshot.
> > 
> > well, by "LVM snapshot on the source" I meant:
> > 1. do lvcreate --snapshot
> > 2. mount newly created volume
> > 3. use the new directory as the base for rsync
> > 4. arrange umount and destruction of the snapshot after rsync completes
> > (no matter if it was successful)
> > 
> > and this will in fact not make the "its_alive" visible in /mnt/snapshot
> > 
> > You have to use this procedure if you use LVM snapshots for backup no
> > matter to where do you copy data. That's why I shortened it to a single
> > point -- it's not the part that is important from btrfs perspective.
> > 
> > Regards.
> 
> The point I was trying to make is: it does not work with btrfs.
> Try the above with a btrfs and you will be surprised.
> If the source volume uses another filesystem, it should work properly.
> 
> regards,
> ivan

Yes, you are right, but I don't see a point in using LVM snapshots with 
btrfs; after all, the ability to snapshot at the fs level is one of its 
defining features...

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Btrfs balance

2011-01-20 Thread Hubert Kario
On Thursday, January 20, 2011 14:40:00 Helmut Hullen wrote:
> Hallo, Andreas,
> 
> Du meintest am 20.01.11:
> > Maybe it is a very stupid question but I want to ask it anyway. In
> > general, 'btrfs filesystem balance' takes very long to finish and
> > produces lots of IO. So what are the classical usage scenarios, when
> > it is (really) worth doing a balance?
> 
> Here (Kernel 2.6.37, btrfs git Nov. 2010): balancing two disks/
> partitions with 2 and 1.5 TByte needs about 24 hours (CPU 1.5 GHz).
> 
That's the effect of lots of IO. IMHO it shouldn't take more than 4-8h per 1TB
on 7200rpm disks in an otherwise idle system, so it's rather inefficient
right now.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Adding a disk fails

2011-01-20 Thread Hubert Kario
On Friday 21 of January 2011 00:10:54 Carl Cook wrote:
> On Thu 20 January 2011 14:13:22 Goffredo Baroncelli wrote:
> > To add another disk you don't have to run mkfs.btrfs. For example:
> > 
> > # add the first disk
> > mkfs.btrfs /dev/sdb
> > # mount the disk
> > mount /dev/sdb /media/backups
> > 
> > # add another disk to the first one
> > btrfs device add /dev/sdc /media/backup
> 
> Thanks Goffredo but as I say, I did this and it responds with
>  "ERROR: error adding the device '/dev/sdc'"
> ... it doesn't give a clue.

You still have a btrfs superblock on /dev/sdc; do a
dd if=/dev/zero of=/dev/sdc bs=8192
(overkill, but I don't remember exactly which blocks have to be zeroed to
destroy the btrfs superblock). Then:
btrfs device add /dev/sdc /media/backup
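A less heavy-handed alternative (a sketch, not from the original thread; it assumes util-linux's wipefs is available, and DEV is a placeholder you must replace after double-checking the device name, since this destroys all data on it):

```shell
# Wipe stale filesystem signatures before re-adding the device. If wipefs
# is not available, fall back to zeroing the first few MiB with dd, which
# is usually enough to destroy the superblock.
DEV=/dev/sdX   # placeholder -- substitute the real device, carefully!
wipefs -a "$DEV" 2>/dev/null || dd if=/dev/zero of="$DEV" bs=1M count=4 2>/dev/null
btrfs device add "$DEV" /media/backup
```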

> 
> > Note1: the filesystem has to be mounted
> > Note2: the medatada will be in raid1, the data in raid0
> > If you shutdown the system, at the reboot you should "scan" all the
> > device in order to find the btrfs ones
> > 
> > # find the btrfs device
> > btrfs device scan
> 
> This must be done at every boot?  If so, where is recommended, in rc.local?

Yes. rc.local is too late, unless you also mount the volume from there
instead of using /etc/fstab.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


btrfs RAID1 woes and tiered storage

2011-01-21 Thread Hubert Kario
I've been experimenting lately with btrfs RAID1 implementation and have to say 
that it is performing quite well, but there are few problems:

* when I purposefully damage partitions on which btrfs stores data (for 
  example, by changing the case of letters), it will read the other copy and 
  return correct data. It doesn't report this fact in dmesg every time, but it 
  does correct the copy with the wrong checksum
* when both copies are damaged, it returns the damaged block as it is 
  written(!) and only adds a warning in dmesg with the exact same wording as 
  for a single-copy corruption(!!)
* from what I could find, btrfs doesn't record anywhere the number of 
  detected and fixed corruptions

I don't know if this is the final design, and while the first and last points 
are minor inconveniences, the second one is quite major: as it stands, it 
doesn't prevent silent corruption from going unnoticed. I think that reading 
from such blocks should return EIO (unless mounted nodatasum), or at least 
emit a broadcast message noting that a corrupted block is being returned to 
userspace.



I've also been thinking about tiered storage (meaning 2+, not only two-tiered) 
and have some ideas about it.

I think that there need to be 3 different mechanisms working together to 
achieve high performance:
* ability to store all metadata on selected volumes (probably read optimised 
  SSDs)
* ability to store all newly written data on selected volumes (write optimised 
  SSDs)
* ability to differentiate between often written, often read and infrequently 
  accessed data (and based on this information, ability to move this data to 
  fast SSDs, slow SSDs, fast RAID, slow RAID or MAID)

While the first two are rather straight-forward, the third one needs some 
explanation. I think that for this to work, we should save not only the time 
of last access to a file and its last change time, but also a few past values 
(I think that at least 8 to 16 ctimes and atimes are necessary, but this will 
need testing). I'm not sure how and exactly when to move this data around to 
keep the arrays balanced, but a userspace daemon would be most flexible.

This solution won't work well for file systems with a few very large files of 
which only small parts change often; in other words, it won't be doing 
block-level tiered storage. From what I know, databases would benefit most 
from such a configuration, but then most databases can already partition 
tables into different files based on access rate. As such, making its 
granularity file-level would make this mechanism easy to implement while 
still useful.

On second thought: it won't be exactly file-level granular. If we introduce 
snapshots into the mix, the new version can have its data regularly accessed 
while the old snapshot doesn't; this way the obsolete blocks can be moved to 
slow storage.
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs, broken design?

2011-01-21 Thread Hubert Kario
On Friday 21 of January 2011 09:11:57 Benoît Thiébault wrote:
> Ok, thanks, I will read the project wiki more carefully then :-)
> Beware however that Wikipedia is the first place to look for information
> for a lot of people (whether this is a good practice or not) and it
> currently does not provide a very good advertisement to btrfs.

It's still aimed more at experts than at advanced users, though this will 
change when btrfsck becomes available.
 
> Le 21 janv. 2011 à 07:46, Chester a écrit :
> > Btrfs has its own wiki page at https://btrfs.wiki.kernel.org which you
> > may find more helpful than what is on wikipedia.
> > 
> > 2011/1/20 Benoît Thiébault :
> >> Thanks for your answer
> >> 
> >> Le 20 janv. 2011 à 22:20, Chris Mason a écrit :
> >>> There was a bug fixed as part of that discussion, and I think I also
> >>> better described the way the tree balancing works to Edward.
> >> 
> >> Maybe the wikipedia article should be modified then, because it is not
> >> very reinsuring :-)
> >> 
> >>> A final release?  We'll keep improving things for a long time.  The
> >>> biggest missing feature today is btrfsck, which I'm working on full
> >>> time right now.
> >>> 
> >>> -chris
> >> 
> >> Still on the wikipedia page, it's written "Btrfs 1.0 (with finalized
> >> on-disk format) was originally slated for a late 2008 release,[5] but a
> >> stable release has not been made as of January 2011.", which is also
> >> very confusing.
> >> 
> >> According to you, is the version of btrfs in 2.6.37 ready for
> >> production? I mean, are there still chances that I may loose all my
> >> data if I use btrfs?
> >> 
> >> Last question, do you know when the RAID-5 like capabilities will be
> >> available?
> >> 
> >> Sorry to ask so many questions, I fully understand you and your team are
> >> working very hard on the project, but I was very confused by the
> >> Wikipedia article.
> >> 
> >> Kind regards,
> >> 
> >> Ben

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Adding a disk fails

2011-01-21 Thread Hubert Kario
On Friday, January 21, 2011 11:16:49 cac...@quantum-sci.com wrote:
> On Thu 20 January 2011 22:55:54 Hubert Kario wrote:
> > You still have a btrfs on /dev/sdc, do a
> > dd if=/dev/zero of=/dev/sdc bs=8192
> > (overkill, but I don't remember which blocks have to be zeroed to destroy
> > btrfs superblock)
> 
> I gave up and started over.  Maybe it should be clarified in the wiki that
> no mkfs should be done on additional volumes, as it's counterintuitive to
> some.
> 
> > yes. rc.local is too late, unless you will also mount the volume from
> > there and not using /etc/fstab
> 
> How and when then?

in, or before /etc/rcS.d/S35mountall.sh

> 
> > That depends - it has to be done before mounting. And if the device is
> > part of the boot partition then you may put the "scan" command into an
> > init-ramdisk.
> 
> This does not seem to have been sent to the listserv, as I only got it in a
> reply.  This is likely incomplete.

It was sent to the list; your MUA may have merged messages with identical
Message-IDs.

> 
> > Using something like "device=/dev/sdb,device=/dev/sdc" on fstab mount
> > options should also work.
> 
> You mean as a substitute for a scan?

It would seem so, but I haven't tried it.
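A sketch of what such an fstab entry might look like (device names here are examples only; I haven't verified this form myself):

```
# /etc/fstab -- hypothetical multi-device btrfs entry
/dev/sdb  /media/backup  btrfs  device=/dev/sdb,device=/dev/sdc  0  0
```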

> 
> On my other server the btrfs array seems to mount just fine without any
> measures to scan.  I don't understand this,  nor why no specific advice is
> given about scanning on boot in the wiki.

your distro may have already put it in the initrd or initramfs (for example, 
Fedora and Arch Linux do)

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl



Re: Synching a Backup Server

2011-01-22 Thread Hubert Kario
k mkfs.btrfs to create sub-volumes of an existing
> btrfs setup?

All Linux file systems use mkfs.<fstype>; there's no reason why btrfs 
shouldn't. For creation of the FS you use one command, for management you use 
another. I'd say that's a pretty sane division.

> 
> What would be great is if there was an image that showed the layers in
> Btrfs and how they interacted with the userspace tools.

It would either be
* very complicated (if it included the different allocation groups and how 
  they interact) and useless for users, or
* very simple (you put one fs on many disks; a snapshotable part of the FS is 
  called a subvolume) and pointless...

> Having a set of graphics that compared the layers in Btrfs with the
> layers in the "normal" Linux disk/filesystem partitioning scheme, and
> the LVM layering, would be best.

btrfs doesn't have layers to compare, so it's rather hard to make such a graph.

> There's lots of info in the wiki, but no images, ASCII-art, graphics,
> etc.  Trying to picture this mentally is not working.  :)


-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH 1/2] add advanced use of --help to help message

2011-01-23 Thread Hubert Kario
explain how to use the

btrfs <command> --help

command in the help message

Signed-off-by: Hubert Kario 
---
 btrfs.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 46314cf..b84607a 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -151,6 +151,8 @@ static void help(char *np)
print_help(np, cp);
 
printf("\n\t%s help|--help|-h\n\t\tShow the help.\n",np);
+   printf("\n\t%s  --help\n\t\tShow detailed help for a command 
or\n\t\t"
+"subset of commands.\n",np);
printf("\n%s\n", BTRFS_BUILD_VERSION);
 }
 
-- 
1.7.3.5



[PATCH 2/2] add detailed help messages to btrfs command

2011-01-23 Thread Hubert Kario
extend the

btrfs <command> --help

command to print a detailed help message if available, but fall back to the
basic help message if the detailed one is unavailable

add a detailed help message for the 'filesystem defragment' command

little tweaks in comments

Signed-off-by: Hubert Kario 
---
 btrfs.c |  101 ++
 1 files changed, 68 insertions(+), 33 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index b84607a..bd6f6f8 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -23,6 +23,9 @@
 #include "btrfs_cmds.h"
 #include "version.h"
 
+#define BASIC_HELP 0
+#define ADVANCED_HELP 1
+
 typedef int (*CommandFunction)(int argc, char **argv);
 
 struct Command {
@@ -31,8 +34,10 @@ struct Command {
   if >= 0, number of arguments,
   if < 0, _minimum_ number of arguments */
char*verb;  /* verb */
-   char*help;  /* help lines; form the 2nd onward they are
-  indented */
+   char*help;  /* help lines; from the 2nd line onward they 
+   are automatically indented */
+char*adv_help;  /* advanced help message; from the 2nd line 
+   onward they are automatically indented */
 
/* the following fields are run-time filled by the program */
char**cmds; /* array of subcommands */
@@ -47,73 +52,96 @@ static struct Command commands[] = {
{ do_clone, 2,
  "subvolume snapshot", " [/]\n"
"Create a writable snapshot of the subvolume  with\n"
-   "the name  in the  directory."
+   "the name  in the  directory.",
+  NULL
},
{ do_delete_subvolume, 1,
  "subvolume delete", "\n"
-   "Delete the subvolume ."
+   "Delete the subvolume .",
+  NULL
},
{ do_create_subvol, 1,
  "subvolume create", "[/]\n"
"Create a subvolume in  (or the current directory if\n"
-   "not passed)."
+   "not passed).",
+  NULL
},
{ do_subvol_list, 1, "subvolume list", "\n"
-   "List the snapshot/subvolume of a filesystem."
+   "List the snapshot/subvolume of a filesystem.",
+  NULL
},
{ do_find_newer, 2, "subvolume find-new", " \n"
-   "List the recently modified files in a filesystem."
+   "List the recently modified files in a filesystem.",
+  NULL
},
{ do_defrag, -1,
  "filesystem defragment", "[-vcf] [-s start] [-l len] [-t size] 
| [|...]\n"
-   "Defragment a file or a directory."
+   "Defragment a file or a directory.",
+  "[-vcf] [-s start] [-l len] [-t size] | 
[|...]\n"
+  "Defragment file data or directory metadata.\n"
+"-v be verbose\n"
+"-c compress the file while defragmenting\n"
+"-f flush data to disk immediately after 
defragmenting\n"
+"-s start   defragment only from byte onward\n"
+"-l len defragment only up to len bytes\n"
+"-t sizeminimal size of file to be considered for 
defragmenting\n"
},
{ do_set_default_subvol, 2,
  "subvolume set-default", " \n"
"Set the subvolume of the filesystem  which will be 
mounted\n"
-   "as default."
+   "as default.",
+  NULL
},
{ do_fssync, 1,
  "filesystem sync", "\n"
-   "Force a sync on the filesystem ."
+   "Force a sync on the filesystem .",
+  NULL
},
{ do_resize, 2,
  "filesystem resize", "[+/-][gkm]|max \n"
"Resize the file system. If 'max' is passed, the filesystem\n"
-   "will occupe all available space on the device."
+   "will occupe all available space on the device.",
+  NULL
},
{ do_show_filesystem, 999,
  "filesystem show", "[|]\n"
"Show the info of a btrfs filesystem. If no  or \n"
-   "is passed, info of all the btrfs filesystem are shown."
+   "is passed, info of all the btrfs filesystem are shown.",
+

Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-01-23 Thread Hubert Kario
On Sunday 23 of January 2011 15:07:09 Goffredo Baroncelli wrote:
> Hello Hubert,
> 
> please update the man page too.

I started to do it but then I noticed that you have made a few changes to the
man page yourself.

I'm quite new to the git/email patch workflow and I'm not sure whether I 
should post the patch from my tree with your patch applied or not.
You have modified some of the same lines I want to fix, and as such my patch 
may not apply cleanly. So, which tree should I base my patch on?

Regards
Hubert Kario

> On 01/23/2011 01:42 PM, Hubert Kario wrote:
> > extend the
> > 
> > btrfs  --help
> > 
> > command to print detailed help message if available but fallback to
> > basic help message if detailed is unavailable
> > 
> > add detailed help message for 'filesystem defragment' command
> > 
> > little tweaks in comments
> > 
> > Signed-off-by: Hubert Kario 
> > ---
> > 
> >  btrfs.c |  101
> >  ++ 1 files
> >  changed, 68 insertions(+), 33 deletions(-)
> > 
> > diff --git a/btrfs.c b/btrfs.c
> > index b84607a..bd6f6f8 100644
> > --- a/btrfs.c
> > +++ b/btrfs.c
> > @@ -23,6 +23,9 @@
> > 
> >  #include "btrfs_cmds.h"
> >  #include "version.h"
> > 
> > +#define BASIC_HELP 0
> > +#define ADVANCED_HELP 1
> > +
> > 
> >  typedef int (*CommandFunction)(int argc, char **argv);
> >  
> >  struct Command {
> > 
> > @@ -31,8 +34,10 @@ struct Command {
> > 
> >if >= 0, number of arguments,
> >if < 0, _minimum_ number of arguments */
> > 
> > char*verb;  /* verb */
> > 
> > -   char*help;  /* help lines; form the 2nd onward they are
> > -  indented */
> > +   char*help;  /* help lines; from the 2nd line onward they
> > +   are automatically indented */
> > +char*adv_help;  /* advanced help message; from the 2nd
> > line +   onward they are automatically
> > indented */
> > 
> > /* the following fields are run-time filled by the program */
> > char**cmds; /* array of subcommands */
> > 
> > @@ -47,73 +52,96 @@ static struct Command commands[] = {
> > 
> > { do_clone, 2,
> > 
> >   "subvolume snapshot", " [/]\n"
> > 
> > "Create a writable snapshot of the subvolume  with\n"
> > 
> > -   "the name  in the  directory."
> > +   "the name  in the  directory.",
> > +  NULL
> > 
> > },
> > { do_delete_subvolume, 1,
> > 
> >   "subvolume delete", "\n"
> > 
> > -   "Delete the subvolume ."
> > +   "Delete the subvolume .",
> > +  NULL
> > 
> > },
> > { do_create_subvol, 1,
> > 
> >   "subvolume create", "[/]\n"
> > 
> > "Create a subvolume in  (or the current directory if\n"
> > 
> > -   "not passed)."
> > +   "not passed).",
> > +  NULL
> > 
> > },
> > { do_subvol_list, 1, "subvolume list", "\n"
> > 
> > -   "List the snapshot/subvolume of a filesystem."
> > +   "List the snapshot/subvolume of a filesystem.",
> > +  NULL
> > 
> > },
> > { do_find_newer, 2, "subvolume find-new", " \n"
> > 
> > -   "List the recently modified files in a filesystem."
> > +   "List the recently modified files in a filesystem.",
> > +  NULL
> > 
> > },
> > { do_defrag, -1,
> > 
> >   "filesystem defragment", "[-vcf] [-s start] [-l len] [-t size]
> >   | [|...]\n"
> > 
> > -   "Defragment a file or a directory."
> > +   "Defragment a file or a directory.",
> > +  "[-vcf] [-s start] [-l len] [-t size] |
> > [|...]\n" +  "Defragment file data or directory
> > metadata.\n"
> > +"-v be verbose\n"
> > +"-c c

Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-01-23 Thread Hubert Kario
On Sunday 23 of January 2011 15:54:12 Goffredo Baroncelli wrote:
> There are a lot of patches regarding the btrfs-tool.
> Unfortunately, the btrfs maintainer are very busy in other areas of the
> project. I don't know when (if) these patches will be applied.
> However I think that is better have a misaligned patch instead nothing.

Just as I thought. I'll post it in a minute.

Regards
Hubert Kario



[PATCH] update man page to new defragment command interface

2011-01-23 Thread Hubert Kario
Update the

btrfs filesystem defragment

command explanation. Add an explanation of the advanced parameters and notes
about general usage.

Add a few notes about the

btrfs <command> --help

usage; fix related grammar.

Signed-off-by: Hubert Kario 
---
 man/btrfs.8.in |   33 ++---
 1 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 26ef982..cba2de1 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -15,7 +15,7 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBsubvolume set-default\fP\fI  \fP
 .PP
-\fBbtrfs\fP \fBfilesystem defrag\fP\fI | [|...]\fP
+\fBbtrfs\fP \fBfilesystem defragment\fP\fI [-vcf] [-s start] [-l len] [-t 
size] | [|...]\fP
 .PP
 \fBbtrfs\fP \fBfilesystem sync\fP\fI  \fP
 .PP
@@ -34,6 +34,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBhelp|\-\-help|\-h \fP\fI\fP
 .PP
+\fBbtrfs\fP \fB \-\-help \fP\fI\fP
+.PP
 .SH DESCRIPTION
 .B btrfs
 is used to control the filesystem and the files and directories stored. It is
@@ -60,12 +62,12 @@ returns an error.
 
 If a command is terminated by
 .I --help
-, the relevant help is showed. If the passed command matches more commands,
-the help of all the matched commands are showed. For example
+, the detailed help is showed. If the passed command matches more commands,
+detailed help of all the matched commands is showed. For example
 .I btrfs dev --help
 shows the help of all
 .I device*
-command.
+commands.
 
 .SH COMMANDS
 .TP
@@ -98,12 +100,29 @@ mount time via the \fIsubvol=\fR option.
 
 \fBsubvolume set-default\fR\fI  \fR
 Set the subvolume of the filesystem \fI\fR which is mounted as 
-\fIdefault\fR. The subvolume is identified by \fB\fR, which 
+\fIdefault\fR. The subvolume is identified by \fI\fR, which 
 is returned by the \fBsubvolume list\fR command.
 .TP
 
-\fBfilesystem defragment\fP\fI | [|...]\fR
-Defragment files and/or directories.
+\fBfilesystem defragment\fP\fI [-vcf] [-s start] [-l len] [-t size] 
| [|...]\fR
+Defragment file data and/or directory metadata. To defragment all files in a
+directory you have to specify each one on its own or use your shell wildcards.
+
+\fB-v\fP be verbose
+
+\fB-c\fP compress file contents while defragmenting
+
+\fB-f\fP flush filesystem after defragmenting
+
+\fB-s start\fP defragment only from byte \fIstart\fR onward
+
+\fB-l len\fP defragment only up to \fIlen\fR bytes
+
+\fB-t size\fP defragment only files at least \fIsize\fR bytes big
+
+NOTE: defragmenting with kernels up to 2.6.37 will unlink COW-ed copies of 
data, don't 
+use it if you use snapshots, have de-duplicated your data or made copies with 
+\fBcp --reflink\fP.
 .TP
 
 \fBdevice scan\fR \fI[ [..]]\fR
-- 
1.7.3.5



Re: Defragmentation Not Working with Directory as Argument

2011-01-24 Thread Hubert Kario
On Monday, January 24, 2011 19:11:07 Mitch Harder wrote:
> Defragmentation is not working when a directory is supplied as the
> argument.  When a file name is supplied as the argument,
> defragmentation works well.
> 
> This can be demonstrated as follows:
> 
> # mount /dev/sdb2 /mnt/btrfs
> # cd /mnt/btrfs/
> # cp /usr/src/linux/fs/btrfs/zlib.c  .(any suitable file will do,
> small text files seemed appropriate)
> # cp zlib.c fragmented-zlib.c
> # btrfs fi sync /mnt/btrfs
> # cat zlib.c >> fragmented-zlib.c
> # btrfs fi sync /mnt/btrfs
> # filefrag zlib.c fragmented-zlib.c
> zlib.c: 1 extent found
> fragmented-zlib.c: 2 extents found
> # btrfs filesystem defragment /mnt/btrfs/
> # btrfs fi sync /mnt/btrfs
> # filefrag zlib.c fragmented-zlib.c
> zlib.c: 1 extent found
> fragmented-zlib.c: 2 extents found
> 
> Now, supply a filename as the argument
> 
> # btrfs filesystem defragment /mnt/btrfs/fragmented-zlib.c
> # btrfs fi sync /mnt/btrfs
> # filefrag zlib.c fragmented-zlib.c
> zlib.c: 1 extent found
> fragmented-zlib.c: 1 extent found
> 
> I've also tested defragmentation on my normal btrfs filesystems, and
> have seen the same results.  I've never seen a file defragmented when
> using a directory as the argument.

That's because this defragments only directory metadata.
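One workaround (my own sketch, not part of the btrfs tools; it assumes find with -exec, and TARGET is a placeholder path) is to walk the tree and defragment each regular file individually:

```shell
# Recursively defragment every regular file under a directory, working
# around the fact that 'filesystem defragment <dir>' touches only the
# directory metadata, not the files inside it.
TARGET=/mnt/btrfs   # placeholder -- point this at your mounted btrfs
find "$TARGET" -type f -exec btrfs filesystem defragment {} \;
```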

> Also, I was wondering if one of the Btrfs developers could clarify the
> intended scope of defragmentation when a directory is supplied as the
> argument.

I submitted appropriate patches a few days back to update the man page and help 
messages.

> I have seen comments on the mailing list indicating that only the
> directory supplied is intended to be defragmented, and defragmentation
> will not recurse to subdirectories below.

that's true

> However, if I look at the code, and read the documentation for the
> older btrfsctl command (which seems to be the basis for the current
> btrfs command), it seems like the intention is to recursively
> defragment the entire tree below the directory supplied in the
> argument.

Then it looks like the userspace part of the implementation changed; I don't 
know why, though.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: How to fasten btrfs?

2011-01-25 Thread Hubert Kario
On Tuesday 25 of January 2011 07:45:02 Magicloud Magiclouds wrote:
> Hi,
>   I am using 2.6.36.3 kernel with btrfs, 512MB memory and a very slow
> disk, no special options for mounting btrfs except noatime. Now I
> found it very slow. When I rm a 5GB movie, it took 20 secs.

try mounting with nodatacow and defragment the directory the file resides in

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: How to fasten btrfs?

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 15:28:11 Magicloud Magiclouds wrote:
> So you were saying that, even an unlink operation, the system walks
> through all file data blocks? I thought unlink wouldn't have to write
> too much data.

No, but if the number of extents is small, the amount of data that has to be 
changed will be lower.

> 
> On Tue, Jan 25, 2011 at 6:30 PM, Hubert Kario  wrote:
> > On Tuesday 25 of January 2011 07:45:02 Magicloud Magiclouds wrote:
> >> Hi,
> >>   I am using 2.6.36.3 kernel with btrfs, 512MB memory and a very slow
> >> disk, no special options for mounting btrfs except noatime. Now I
> >> found it very slow. When I rm a 5GB movie, it took 20 secs.
> > 
> > try mounting with nodatacow and defragment the directory the file resides
> > in
> > 
> > Regards
> > --
> > Hubert Kario
> > QBS - Quality Business Software
> > 02-656 Warszawa, ul. Ksawerów 30/85
> > tel. +48 (22) 646-61-51, 646-74-24
> > www.qbs.com.pl

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Synching a Backup Server

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 18:29:35 Kaspar Schleiser wrote:
> On 01/22/2011 02:55 PM, Hubert Kario wrote:
> >> It looks like ZFS, Btrfs, and LVM should work in similar manners, but
> >> the overloaded terminology (pool, volume, sub-volume, filesystem are
> >> different in all three) and new terminology that's only in Btrfs is
> >> confusing.
> > 
> > With btrfs you need to have *a* filesystem, once you have it, you can add
> > and remove disks/partitions from it, no need to use 'mkfs.btrfs', just
> > 'btrfs'.
> 
> That's just a design decision, right? There's no need for a "default" or
> "root" subvolume.
> 
> It should be rather easy to change btrfs so that you first have to
> create a "storage pool" which combines disks for btrfs, and on top of
> that you can create "filesystems" which are just subvolumes.
> 
> The creation of a "storage pool" could be very similar to the current
> mkfs, just without the creation of a root subvolume.
> 
> A new, simpler mkfs would then just create a subvolume on top of the
> "storage pool" that can be mounted.
> 
> Regards,
> Kaspar

I'm not sure, but for btrfs to support storage pools the way ZFS does would 
require a change in the disk layout.

Besides, I don't see *why* this should be done...

And as far as I know, ZFS doesn't support different redundancy levels for 
different files residing in the same directory. With btrfs you can have 
~/1billion$-project.tar.gz with triple redundancy and ~/temp.video.mkv with no 
redundancy...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Synching a Backup Server

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 18:59:39 Freddie Cash wrote:
> On Tue, Jan 25, 2011 at 9:43 AM, Hubert Kario  wrote:
> > Besides, I don't see *why* this should be done...
> > 
> > And as far as I know ZFS doesn't support different reduncancy levels for
> > different files residing in the same directory. You can have
> > ~/1billion$-project.tar.gz with triple redundancy and ~/temp.video.mkv
> > with no reduncancy with btrfs...
> 
> With ZFS, redundancy (mirror, raidz1, raidz2, raidz3) is done at the
> storage pool layer, and affects the entire pool.  You can mix and
> match redundancy levels (combine mirror vdevs and raidz vdevs in the
> same pool), but there's no way to control what data blocks go to which
> vdev, as it's all just one giant pool of storage.
> 
> However, there is a "copies" property for each filesystem that affects
> how many copies of data blocks are stored, to increase the redundancy
> for that filesystem.  For example, you can create a storage pool using
> 2 mirror vdevs (4 drives; equivalent to a RAID10 setup); then create a
> filesystem with copies=2.  Thus, any blocks written to that filesystem
> will be stored twice, each of which is then striped across the two
> vdevs, and then mirrored to each disk in the vdevs, potentially
> leading to 4 (or more) blocks of data written to disk.
> 
> This is similar to using Linux md to create RAID arrays underneath LVM
> volume groups.  The redundancy is managed via md; the filesystems just
> see a collection of blocks to write to.
> 
> The big difference (from what I understand) between ZFS and Btrfs is
> the layering.  ZFS separate storage management from filesystem
> management, so redundancy happens at lower layers and the filesystem
> just sends blocks to the pool.  Whereas Btrfs combines them into one,
> so that redundancy is managed at the filesystem level and can be
> changed on a per-directory (or per-sub-volume?) basis, with the
> filesystem handling the writes and the redundancy.

Right now you can't change the raid level at all, but there are hooks planned 
to enable selecting the raid level on a per-file basis.

btrfs allows for better management of space and less over-provisioning.

So I'd say that management of storage space with btrfs is even easier than 
with ZFS:

the admin sets the default redundancy level for the whole file system (let's 
say it's a 4-disk system) to RAID1 with two copies.
After setting up the system, he sets the redundancy level in directories with 
databases to RAID10.
Users storing big files use RAID5 for some files.

one of the drives fails; the admin removes the drive from the set and 
schedules a rebalance.

the set is smaller, but all redundancy is preserved.

New drives arrive and are added to the fs. The FS is rebalanced a second time 
to achieve better performance (the space would be usable even without it).

 
> I don't pretend to understand all the intricacies of how Btrfs works
> (I'm working on it), but the layering in ZFS is very nice and easy to
> work with in comparison.  Interesting how ZFS is considered the
> "rampant layering violation", though.  ;)  :)  :D

btrfs is much simpler from user point of view :)

as for the rampant layering violation: most of the code that deals with stored 
data isn't concerned with the raid level, in contrast with ZFS. In other 
words, it's in the code, not the interface.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Fwd: Re: Suggestion for sticky-compression mount setting (default mount options)]

2011-02-08 Thread Hubert Kario
On Tuesday 08 of February 2011 20:59:13 Kirk Wolff wrote:
> forwarded message below:
> 
> 
> It would be very useful.
> I also propose the same for `-o ssd` option. USB drive may be used on
> different computers, and we need the way to specify it only once. As for
> the options being per-file, I think having them as per subvolume defaults
> looks better: we can override the defaults on per-file basis while having
> reasonable defaults for each subvolume. (There may be also global per-pool
> defaults, which may be overriden by subvolumes).

I think that if the file is compressed it should remain compressed, and the "c" 
attr should be used for that (both to report it and to set it).

There should also be a sticky way to specify *which* type of compression to 
use, system-wide, and whether to compress newly created files at all; 
'-o ssd' definitely should go in the same basket.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Error mounting multi-device fs after restart

2011-02-08 Thread Hubert Kario
On Tuesday 08 of February 2011 21:25:55 Diwaker Gupta wrote:
> Searching through the archives, it seems others have faced similar
> issues due to sudden power outages. AFAIK we did not have any power
> outage.

SysRq+B will have the same effect; an OOPS or BUG will have a similar effect.

> I've run badblocks on all of the 10 drives and three of them had a few
> bad blocks. I'm inclined to rule out bad disks as the root cause. In
> any case, isn't this exactly the kind of situation btrfs should
> protect users against?

And in the end it will; unfortunately, at the moment it will only report in 
dmesg that the read data doesn't match the stored checksum. If you have 
redundancy in place it will try to read the other copy of the data. That's it.

As a side note, if a drive made in the past 5 years has bad blocks detectable 
by `badblocks`, it's long gone; it has probably been silently corrupting data 
for a long time now.

> A 'btrfsck' aborts on all of the drives. I've tried running it with
> '-s 1' as well as '-s 2' with no success. Does that mean that none of
> the drives have any copy of the superblock intact?

-s 1 and -s 2 will try to read the backup copies of the superblock, not the 
superblock copies on other devices. The regular code should perform the latter 
by itself.
 
> Diwaker
> 
> On Mon, Feb 7, 2011 at 11:46 AM, Diwaker Gupta  
wrote:
> > Hello,
> > 
> > We have 10 1-TB drives hosting a multi-device btrfs filesystem,
> > configured with raid1+0 for both data and metadata. After some package
> > upgrades over the weekend I restarted the system and it did not come
> > back up afterwards. I booted using a rescue disk and ran btrfsck (next
> > branch from Chris's git repository). Unfortunately btrfsck aborts on
> > every single drive with errors like this:
> > 
> > parent transid verify failed on 12050980864 wanted 377535 found 128327
> > parent transid verify failed on 12074557440 wanted 422817 found 126691
> > parent transid verify failed on 12057542656 wanted 422786 found 126395
> > parent transid verify failed on 12075556864 wanted 423004 found 126691
> > bad block 12095545344
> > parent transid verify failed on 12079190016 wanted 422826 found 105147
> > leaf parent key incorrect 12097544192
> > bad block 12097544192
> > 
> > I'm running 10.04 Ubuntu Lucid with the lts-backport x86_64 kernel:
> > 2.6.35-23-server
> > 
> > Attempting to mount the filesystem blocks indefinitely, with
> > /var/log/messages getting filled with the 'parent transid verify'
> > errors.

Define *indefinitely*.
Are the drives not working?
If the drives are working, have you tried waiting 2-3 days, possibly longer?
10TB is a *lot* of data.

> > 
> > IIUC the 'btrfs-select-super' utility is not really helpful in our
> > case. At this point, my only priority is to somehow rescue the data
> > from the filesystem. I'd really appreciate if someone on the list
> > could help me out.

getting the FS mountable is your best bet at the moment (apart from diving 
into the drive with dd in one hand and hexdump in the other...)

> > 
> > I'm happy to provide any other information required. Please CC me on
> > replies as I'm not subscribed to the list.
> > 
> > Thanks,
> > Diwaker
> 

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH] explain filesystem resize devid option

2011-02-09 Thread Hubert Kario
Adds an explanation to the help message and the man page of how to use 
`filesystem resize' to resize only a single device rather than all devices of 
a file system.

Signed-off-by: Hubert Kario 
---
patch to apply cleanly requires my previous patches adding advanced help 
functionality

 btrfs.c        |   10 +++++++---
 man/btrfs.8.in |    6 ++++--
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index bd6f6f8..a28a573 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -98,10 +98,14 @@ static struct Command commands[] = {
   NULL
},
{ do_resize, 2,
-	  "filesystem resize", "[+/-]<size>[gkm]|max <filesystem>\n"
+	  "filesystem resize", "[<devid>:][+|-]<size>[gkm]|max <filesystem>\n"
 		"Resize the file system. If 'max' is passed, the filesystem\n"
-		"will occupe all available space on the device.",
-	  NULL
+		"will occupy all available space on the device.",
+	  "[<devid>:][+|-]<size>[gkm]|max <filesystem>\n"
+		"Resize the file system. If no <devid> is specified, change is\n"
+		"applied to every device in file system. If devid is specified\n"
+		"the change in size is applied only to selected device.\n"
+		"To get device numbers use `btrfs filesystem show\' command."
 	},
 	{ do_show_filesystem, 999,
 	  "filesystem show", "[<uuid>|<label>]\n"
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index cba2de1..afb9824 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -19,7 +19,7 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem sync\fP\fI <path> \fP
 .PP
-\fBbtrfs\fP \fBfilesystem resize\fP\fI [+/\-]<size>[gkm]|max <filesystem>\fP
+\fBbtrfs\fP \fBfilesystem resize\fP\fI [<devid>:][+|\-]<size>[gkm]|max <filesystem>\fP
 .PP
 \fBbtrfs\fP \fBdevice scan\fP\fI [<device> [<device>..]]\fP
 .PP
@@ -138,9 +138,11 @@ Force a sync for the filesystem identified by \fI<path>\fR.
 .\" Some wording are extracted by the resize2fs man page
 .\"
 
-\fBfilesystem resize\fR\fI [+/\-]<size>[gkm]|max <filesystem>\fR
+\fBfilesystem resize\fR\fI [<devid>:][+|\-]<size>[gkm]|max <filesystem>\fR
 Resize a filesystem identified by \fI<path>\fR.
 The \fI<size>\fR parameter specifies the new size of the filesystem.
+To change size of only one device the device id can be specified before \fI:\fR.
+Device identifiers are printed by \fBfilesystem show\fR command.
 If the prefix \fI+\fR or \fI\-\fR is present the size is increased or decreased
 by the quantity \fI<size>\fR.
 If no units are specified, the unit of the \fI<size>\fR parameter defaults to
-- 
1.7.4



Re: [PATCH] Btrfs-progs new btrfs_error() macro to deprecate fprintf(stderr, ...)

2011-02-10 Thread Hubert Kario
On Thursday, February 10, 2011 15:54:55 Eduardo Silva wrote:
> Hi,
> 
> This patch add a new macro called btrfs_error(...) which deprecate the
> use of fprintf(stderr, ...)
> 
> regards,
> 
> Eduardo

Sorry, but I don't see a reason for such a change. IMHO it only makes the code 
_less_ readable.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Space used by snapshot

2011-02-17 Thread Hubert Kario
On Thursday, February 17, 2011 12:13:53 Roman Kapusta wrote:
> Hello all,
> 
> Is there any way how to obtain information how much space is
> physically allocated by given subvolume?
> I cannot find any. I'm interested in two values:
> 
> - physical space allocated by SUBVOLUME INCLUDING all space shared by
> other subvolumes

That's the space reported by `du -hs /mnt/btrfs/subvolume`

> 
> - physical space allocated by SUBVOLUME EXCLUDING all space shared by
> other subvolumes

That's not implemented right now
 
> Currently I can use only du, which is not reporting what I want to know.
> 
> Regards
> 
> Roman Kapusta

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: Space used by snapshot

2011-02-17 Thread Hubert Kario
On Thursday, February 17, 2011 13:38:58 Roman Kapusta wrote:
> On Thu, Feb 17, 2011 at 12:54, Hubert Kario  wrote:
> > On Thursday, February 17, 2011 12:13:53 Roman Kapusta wrote:
> >> Hello all,
> >> 
> >> Is there any way how to obtain information how much space is
> >> physically allocated by given subvolume?
> >> I cannot find any. I'm interested in two values:
> >> 
> >> - physical space allocated by SUBVOLUME INCLUDING all space shared by
> >> other subvolumes
> > 
> > That's the space reported by `du -hs /mnt/btrfs/subvolume`
> 
> it should be `du -hsx /mnt/btrfs/subvolume`
> but what if there are reflinked/CoW files inside this subvolume?

True, forgot about them.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs volume backup

2011-02-18 Thread Hubert Kario
On Friday, February 18, 2011 12:24:24 Yuri D'Elia wrote:
> Is there a way to backup/restore a btrfs volume without losing all the
> associated metadata? (something like xfsdump/restore?)
> 
> brtfs-image looks somewhat what one would need, except that it clears the
> actual data (for debugging purposes I guess?). I guess it also doesn't
> work on a mounted filesystem, so fixing btrfs-image would be of limited
> use right now.
> 
There's nothing like it right now; the only way to back up a whole btrfs 
volume is to use dd.

Something like "zfs send" is in the plans, though.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: btrfs fi df returns nothing

2011-03-09 Thread Hubert Kario
There's a daily kernel build PPA:
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/2011-03-08-natty/

Sorry to note, but have you even tried googling for it?

On Wednesday, March 09, 2011 12:20:28 Brian J. Murrell wrote:
> I am using btrfs on Ubuntu's LTS kernel (2.6.32-22-generic) and have
> updated the btrfs-tools to 0.19+20101101-1~lucid1.  I have also built
> the latest from
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.gi
> t.
> 
> The problem is that I don't get anything from btrfs fi df $mountpoint.
> i.e., for the btrfs filesystem mounted at /mnt/btrfs-test/
> 
> # ./btrfs fi df /mnt/btrfs-test/
> #
> 
> See, no output.  Is my kernel module too old for that feature?  Maybe it
> is:
> 
> stat64("/mnt/btrfs-test/", {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> open("/mnt/btrfs-test/",
> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 3
> fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> ioctl(3, 0xc0109414, 0x999c320) = -1 ENOTTY (Inappropriate ioctl
> for device)
> exit_group(-1)  = ?
> 
> If my kernel module is too old, anyone know of a successful backport of
> a newer btrfs for Ubuntu Lucid?  In a PPA or otherwise?
> 
> Cheers,
> b.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [PATCH v1 0/3] btrfs-progs: scrub interface

2011-03-12 Thread Hubert Kario
On Saturday 12 of March 2011 19:31:32 Jan Schmidt wrote:
> This patch set for btrfs-progs is meant to be used together with the scrub
> kernel patches submitted by Arne Jansen (latest Subject: [PATCH v3 0/6]
> btrfs: scrub).
> 
(snip)
> 
>  Makefile |4 +-
>  btrfs.c  |   12 +
>  btrfs_cmds.c |3 +-
>  btrfs_cmds.h |4 +
>  ctree.h  |2 +-
>  ioctl.h  |   68 +++-
>  scrub.c  | 1425 ++
>  7 files changed, 1513 insertions(+), 5 deletions(-)
>  create mode 100644 scrub.c

Please, update the man pages.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: BTRFS fsck tool

2011-03-15 Thread Hubert Kario
On Sunday, March 13, 2011 00:53:00 Ric Wheeler wrote:
> On 03/12/2011 05:49 PM, Spelic wrote:
> > On 03/10/2011 02:02 PM, Chris Mason wrote:
> >> Cutting the power isn't problem unless you're using something
> >> where cache flushes are not supported.
> > 
> > Some disks lie about cache flush having completed.
> 
> This is really not true for modern enterprise class drives. You might have
> more issues with USB thumbdrives and other really low end parts.

btrfs is supposed to be an ext3/4 replacement; it _will_ be used with low-end 
parts (commodity SATA HDDs)

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: cloning single-device btrfs file system onto multi-device one

2011-04-17 Thread Hubert Kario
On Monday 21 of March 2011 17:24:50 Stephane Chazelas wrote:
> Hiya,
> 
> I'm trying to move a btrfs FS that's on a hardware raid 5 (6TB
> large, 4 of which are in use) to another machine with 3 3TB HDs
> and preserve all the subvolumes/snapshots.
> 
> Is there a way to do that without using a software/hardware raid
> on the new machine (that is just use btrfs multi-device).
> 
> If fewer than 3TB were occupied, I suppose I could just resize
> it so that it fits on one 3TB hd, then copy device to device
> onto a 3TB disk, add the 2 other ones and do a "balance", but
> here, I can't do that.
> 
> I suspect that if compression was enabled, the FS could fit on
> 3 TB, but AFAICT, compression is enabled at mount time and would
> only apply to newly created files. Is there a way to compress
> files already in a btrfs filesystem?

You can compress files already on disk using
btrfs filesystem defragment -c /path/to/file
but defragmenting breaks snapshotting (at least it did 2 months ago; I don't 
know if that's still true)

> 
> Any help would be appreciated.
> Stephane

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: Online Deduplication for Btrfs (Master's thesis)

2012-12-17 Thread Hubert Kario
On Monday 17 of December 2012 13:05:01 Martin Křížek wrote:
> * Limitations
> Not really limitations, but this is a list of situations when dedup will
> not be triggered:
> - compression - basically, dedup is kind of compression, might be worth
> looking into it in the future though

I don't see why it would be incompatible, compressed blocks are data like 
any other. COW and subvolume snapshots work with compressed nodes just as 
well as with regular ones...

> - data across subvolumes

Wasn't "cp --reflink" across subvolumes merged to mainline quite a while ago? 
Under 3.6.9 it works fine for me... Also, blocks are shared between subvolumes 
if one subvolume is a snapshot of the other.

Besides, I think that doing an rsync from the remote server and snapshotting 
the subvolume dedicated to that server will be the standard approach.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: ssd option for USB flash drive?

2011-05-19 Thread Hubert Kario


smime.p7m
Description: S/MIME encrypted message


Re: ssd option for USB flash drive?

2011-05-19 Thread Hubert Kario
Sorry, looks like the list mailer doesn't like S/MIME messages.

On Thursday 19 of May 2011 21:04:54 Hubert Kario wrote:
> On Wednesday 18 of May 2011 00:02:52 Stephane Chazelas wrote:
> > Hiya,
> > 
> > I've not found much detail on what the "ssd" btrfs mount option
> > did. Would it make sense to enable it to a fs on a USB flash
> > drive?
> 
> yes, enabling discard is pointless though (no USB storage supports it
> AFAIK).
> 
> > I'm using btrfs (over LVM) on a Live Linux USB stick to benefit
> > from btrfs's compression and am trying to improve the
> > performance.
> 
> ssd mode won't improve performance by much (if any).
> 
> You need to remember that USB2.0 is limited to about 20-30MiB/s (depending
> on CPU) so it will be slow no matter what you do
> 
> > Would anybody have any recommendation on how to improve
> > performance there? Like what would be the best way to
> > enable/increase writeback buffer or any way to make sure writes
> > are delayed and asynchronous? Would disabling read-ahead help?
> > (at which level would it be done?). Any other tip (like
> > disabling atime, aligning blocks/extents, figure out erase block
> > sizes if relevant...)?
> 
> aligning logical blocks to erase blocks can give some performance but the
> only way to make it really fast is not to use USB
> 
> > Many thanks in advance,
> > Stephane

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [btrfs-delalloc-]

2011-06-27 Thread Hubert Kario
On Monday 27 of June 2011 11:04:06 Proskurin Kirill wrote:
> Hello all.
> 
> What we have:
> SL6 - kernel 2.6.32-131.2.1.el6.x86_64
> btrfs on mdadm RAID5 with 8 HDD - 27T partition.
> 
> I see this at top:
> 1182 root  20   0 000 R 100.0  0.0  16:39.73
> [btrfs-delalloc-]
> 
> And LA is grow. What is this and how can I fix it?

delalloc is the delayed allocation kernel thread; its activity probably means 
something is writing large amounts of data to the file system.

2.6.32 is *old* as far as btrfs is concerned; there have been many bug fixes 
and performance improvements since.

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH] remove unused variables

2011-06-29 Thread Hubert Kario
Fixes compilation warnings with gcc 4.6.0 20110429.

Signed-off-by: Hubert Kario 
---
 mkfs.c|3 ---
 volumes.c |2 --
 2 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 1b5ef06..d40b2e8 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1060,7 +1060,6 @@ static int make_image(char *source_dir, struct btrfs_root *root, int out_fd)
struct btrfs_trans_handle *trans;
 
struct stat root_st;
-   int root_len;
 
struct directory_name_entry dir_head;
 
@@ -1070,8 +1069,6 @@ static int make_image(char *source_dir, struct btrfs_root *root, int out_fd)
goto fail;
}
 
-   root_len = strlen(source_dir);
-
INIT_LIST_HEAD(&dir_head.list);
 
trans = btrfs_start_transaction(root, 1);
diff --git a/volumes.c b/volumes.c
index 61af845..95c2e0d 100644
--- a/volumes.c
+++ b/volumes.c
@@ -868,7 +868,6 @@ int btrfs_alloc_data_chunk(struct btrfs_trans_handle *trans,
struct list_head *dev_list = &extent_root->fs_info->fs_devices->devices;
struct list_head *cur;
struct map_lookup *map;
-   u64 physical;
u64 calc_size = 8 * 1024 * 1024;
int num_stripes = 1;
int sub_stripes = 0;
@@ -920,7 +919,6 @@ int btrfs_alloc_data_chunk(struct btrfs_trans_handle *trans,
btrfs_set_stack_stripe_devid(stripe, device->devid);
btrfs_set_stack_stripe_offset(stripe, dev_offset);
memcpy(stripe->dev_uuid, device->uuid, BTRFS_UUID_SIZE);
-   physical = dev_offset;
index++;
}
 
-- 
1.7.5.1


[PATCH] initialize all fields of commands[] struct

2011-06-29 Thread Hubert Kario
Some commands don't provide an advanced help message (shown when
invoked as `btrfs command subcommand --help`);
initialize those fields to NULL (meaning "use the regular help message").

Signed-off-by: Hubert Kario 
---
this patch is for the branch integration-20110626 in 
http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ integration-20110626

 btrfs.c |   18 --
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index e117172..8875911 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -132,29 +132,35 @@ static struct Command commands[] = {
 	},
 	{ do_balance_cancel, 1,
 	  "balance cancel", "<path>\n"
-		"Cancel the balance operation running on <path>."
+		"Cancel the balance operation running on <path>.",
+	  NULL
 	},
 	{ do_scrub_start, -1,
 	  "scrub start", "[-Bdqr] <path>|<device>\n"
-		"Start a new scrub."
+		"Start a new scrub.",
+	  NULL
 	}, 
 	{ do_scrub_cancel, 1,
 	  "scrub cancel", "<path>|<device>\n"
-		"Cancel a running scrub."
+		"Cancel a running scrub.",
+	  NULL
 	}, 
 	{ do_scrub_resume, -1,
 	  "scrub resume", "[-Bdqr] <path>|<device>\n"
-		"Resume previously canceled or interrupted scrub."
+		"Resume previously canceled or interrupted scrub.",
+	  NULL
 	}, 
 	{ do_scrub_status, -1,
 	  "scrub status", "[-d] <path>|<device>\n"
-		"Show status of running or finished scrub."
+		"Show status of running or finished scrub.",
+	  NULL
 	}, 
 	{ do_change_label, -1,
 	  "filesystem label", "<device> [<newlabel>]\n"
 		"With one argument, get the label of filesystem on <device>.\n"
 		"If <newlabel> is passed, set the filesystem label to <newlabel>.\n"
-		"The filesystem must be unmounted.\n"
+		"The filesystem must be unmounted.\n",
+	  NULL
 	},
 	{ do_scan, 999,
 	  "device scan", "[--all-devices|<device> [<device>..]]\n"
-- 
1.7.5.1


Re: btrfs-progs: integration branch updated

2011-07-01 Thread Hubert Kario
On Thursday 30 of June 2011 23:19:02 Hugo Mills wrote:
>After a reorganisation of patches, and sending a bunch of them to
> Chris, I've also updated the integration branch to match that. It's
> available from:
> 
> http://git.darksatanic.net/repo/btrfs-progs-unstable.git/
> integration-20110630
> 
>The shortlog of 17 patches in this branch beyond the ones I've sent
> to Chris is below.
> 
>Hugo.

Is there a reason why my patch for the compilation errors 
(http://www.spinics.net/lists/linux-btrfs/msg10965.html)
wasn't integrated? Without it, I can't compile btrfs-progs with a new gcc 
(4.6.0 20110429):

volumes.c: In function ‘btrfs_alloc_data_chunk’:
volumes.c:891:6: error: variable ‘physical’ set but not used 
[-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors

Regards,
Hubert


Re: [PATCH] Btrfs: initial online fsck support

2011-07-02 Thread Hubert Kario
On Friday 01 of July 2011 19:48:36 Andi Kleen wrote:
> Li Zefan  writes:
> > This is an initial version of online fsck. What it does is:
> > 
> > - check the dir item and dir index pointing to a file.
> > - check the structure of extents of a file.
> > 
> > As furthur work, we should consider:
> > 
> > - fix but not only check the structure of a file.
> > - verify the extent allocation tree on the fly.
> 
> It's scary to have a fsck in kernel space. Is there no way to do
> this from user space?
> 
> -Andi

There will be a userspace fsck (Chris Mason is working on it ATM).

Two of the big features of btrfs are self-healing and online fsck; those have 
to be implemented in kernel space.
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: [PATCH] Btrfs: initial online fsck support

2011-07-02 Thread Hubert Kario
On Saturday 02 of July 2011 19:04:43 Andi Kleen wrote:
> > The two big features of btrfs are self-healing and online fsck, those
> > have to
> 
> Are they?

they are scheduled to be, just like RAID5/6, mixed RAID levels in a single FS...
 
> > be implemented in kernel space.
> 
> Why? There have been online fscks in user space in the past,
> e.g. the various schemes using LVM snapshots for ext* and
> other related work on the BSD FFS. I don't see any principal
> reason why it couldn't be done for btrfs either.

Doing an fsck on an LVM snapshot of btrfs:
1. is impossible (UUIDs)
2. won't fix errors

I have to note that I don't know how the FFS fsck is implemented.
 
> A good fsck is quite complex and you are unlikely to want all
> that code in kernel space.

A complete one, yes, but it's not quite pointless; ZFS does it like this and 
admins rather like it.

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: TRIM support

2011-07-03 Thread Hubert Kario
On Sunday 03 of July 2011 00:40:46 Leonidas Spyropoulos wrote:
> On Sat, Jul 2, 2011 at 11:39 PM, Leonidas Spyropoulos
> 
>  wrote:
> > On Sat, Jul 2, 2011 at 8:45 PM, Calvin Walton  
wrote:
> >> On Sat, 2011-07-02 at 19:08 +0100, Leonidas Spyropoulos wrote:
> >>> Hello,
> >>> 
> >>> I just installed an archlinux with btrfs root partition and would like
> >>> to set the correct mount properties
> >>> Following this:
> >>> https://wiki.archlinux.org/index.php/Solid_State_Drives
> >>> it says there that I should use the discard mount parameter to enable
> >>> TRIM.
> >>> 
> >>> I would like to ask by using ssd mount parameter would TRIM be enabled?
> >>> The SSD is Intel 320 Series 120Gb
> >> 
> >> No, the "ssd" mount parameter has nothing to do with TRIM.
> >> 
> >> The "ssd" mount parameter adjusts a couple of tuning parameters where
> >> the default setting is designed to improve performance on spinning HDD,
> >> and instead tunes for the random-access ability of an SSD.
> >> 
> >> The ssd option is automatically enabled if the kernel detects that your
> >> drive is an SSD (you can check with 'cat /proc/mounts').
> >> 
> >> The discard option is not currently automatically enabled; I think there
> >> may have been some performance issues in certain cases with drives that
> >> have slow trim implementations. But feel free to give it a try.
> >> 
> >> --
> >> Calvin Walton 
> 
> On the same system when I try to compile the btrfs-tools I get an error.
> Since on the wiki you mention only the packages for Fedora and Debian,
> 
> Which are the requirements for the btrfs tools?
> 
> PS: AUR package is broken as well.

the AUR package is OK; the problem is that the sources don't compile with new gcc.

Download Hugo's integration branch
http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ integration-20110630
and apply my patch to it:
http://www.spinics.net/lists/linux-btrfs/msg10965.html
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: TRIM support

2011-07-03 Thread Hubert Kario
On Sunday 03 of July 2011 14:56:40 Leonidas Spyropoulos wrote:
> On Sun, Jul 3, 2011 at 1:20 PM, Hubert Kario  wrote:
> > On Sunday 03 of July 2011 00:40:46 Leonidas Spyropoulos wrote:
> >> On Sat, Jul 2, 2011 at 11:39 PM, Leonidas Spyropoulos
> >> 
> >>  wrote:
> >> > On Sat, Jul 2, 2011 at 8:45 PM, Calvin Walton
> >> > 
> > 
> > wrote:
> >> >> On Sat, 2011-07-02 at 19:08 +0100, Leonidas Spyropoulos wrote:
> >> >>> Hello,
> >> >>> 
> >> >>> I just installed an archlinux with btrfs root partition and would
> >> >>> like to set the correct mount properties
> >> >>> Following this:
> >> >>> https://wiki.archlinux.org/index.php/Solid_State_Drives
> >> >>> it says there that I should use the discard mount parameter to
> >> >>> enable TRIM.
> >> >>> 
> >> >>> I would like to ask by using ssd mount parameter would TRIM be
> >> >>> enabled? The SSD is Intel 320 Series 120Gb
> >> >> 
> >> >> No, the "ssd" mount parameter has nothing to do with TRIM.
> >> >> 
> >> >> The "ssd" mount parameter adjusts a couple of tuning parameters where
> >> >> the default setting is designed to improve performance on spinning
> >> >> HDD, and instead tunes for the random-access ability of an SSD.
> >> >> 
> >> >> The ssd option is automatically enabled if the kernel detects that
> >> >> your drive is an SSD (you can check with 'cat /proc/mounts').
> >> >> 
> >> >> The discard option is not currently automatically enabled; I think
> >> >> there may have been some performance issues in certain cases with
> >> >> drives that have slow trim implementations. But feel free to give it
> >> >> a try.
> >> >> 
> >> >> --
> >> >> Calvin Walton 
> >> 
> >> On the same system when I try to compile the btrfs-tools I get an error.
> >> Since on the wiki you mention only the packages for Fedora and Debian,
> >> 
> >> Which are the requirements for the btrfs tools?
> >> 
> >> PS: AUR package is broken as well.
> > 
> > the AUR package is OK, problem is that the sources don't compile with new
> > gcc.
> > 
> > Download Hugo's integration branch
> > http://git.darksatanic.net/repo/btrfs-progs-unstable.git/
> > integration-20110630
> 
> I download the files:
> 
> git clone  http://git.darksatanic.net/repo/btrfs-progs-unstable.git/
> integration-20110630
> 
> > and apply my patch to it:
> > http://www.spinics.net/lists/linux-btrfs/msg10965.html
> 
> Then I tried to apply the patch you mentioned:
> 
> patch < rem.diff
> 
> but it's failing:
> The rem.diff is the file attached
> 
> > --
> > Hubert Kario
> > QBS - Quality Business Software
> > 02-656 Warszawa, ul. Ksawerów 30/85
> > tel. +48 (22) 646-61-51, 646-74-24
> > www.qbs.com.pl
> 
> Here is the error I am getting:
> patching file mkfs.c
> Hunk #1 FAILED at 1060.
> Hunk #2 FAILED at 1070.
> 2 out of 2 hunks FAILED -- saving rejects to file mkfs.c.rej
> patching file volumes.c
> Hunk #1 FAILED at 868.
> Hunk #2 FAILED at 920.
> 2 out of 2 hunks FAILED -- saving rejects to file volumes.c.rej
> 
> I think the file I created is wrong.
> What is the accepted format for the patch command?

Sorry, looks like I changed tabs to spaces while posting.
The following one should apply cleanly.

try this:

git clone http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ integration-20110630
cd integration-20110630
git checkout integration-20110630
git apply path/to/patch

Subject: [PATCH] remove unused variables

fixes compilation warnings with gcc 4.6.0 20110429

Signed-off-by: Hubert Kario 
---
 mkfs.c    |    3 ---
 volumes.c |    2 --
 2 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 3a49bab..152b9da 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1060,7 +1060,6 @@ static int make_image(char *source_dir, struct btrfs_root *root, int out_fd)
struct btrfs_trans_handle *trans;
 
struct stat root_st;
-   int root_len;
 
struct directory_name_entry dir_head;
 
@@ -1070,8 +1069,6 @@ static int make_image(char *source_dir, struct btrfs_root *root, int out_fd)
goto fail;
}
 
-   root_len = strlen(source_dir);
-
INIT_LIST_HEAD(&dir_head.list);
 
trans = btrfs_start_transaction(root, 1);
diff --git a/volume

Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-12 Thread Hubert Kario
On Monday 11 of July 2011 17:13:13 Jan Schmidt wrote:
> Hi Hubert,
> 
> I have to admit I did not recognize this patch but now Hugo is forcing
> me to use the "detailed help messages" and I've got an improvement to
> suggest:
> 
> On 23.01.2011 13:42, Hubert Kario wrote:
[snip]
> > { do_defrag, -1,
> > 
> >   "filesystem defragment", "[-vcf] [-s start] [-l len] [-t size]
> >   | [|...]\n"
> > 
> > -   "Defragment a file or a directory."
> > +   "Defragment a file or a directory.",
> > +  "[-vcf] [-s start] [-l len] [-t size] |
> > [|...]\n" +  "Defragment file data or directory
> > metadata.\n"
> > +"-v be verbose\n"
> > +"-c compress the file while defragmenting\n"
> > +"-f flush data to disk immediately after
> > defragmenting\n" +"-s start   defragment only from byte
> > onward\n" +"-l len defragment only up to len
> > bytes\n"
> > +"-t sizeminimal size of file to be considered for
> > defragmenting\n"
> 
> Lots of too long lines.

You mean the code or the printed messages? The messages fit an 80-column
screen; I remember I double-checked that.

> 
> I don't like to repeat the synopsis passage. How about adding the
> general ->help when printing ->adv_help as well? This reduces the need
> of duplication.

I think I added it because of differences in formatting.
Also, I'd say we don't want to overload the user with information when he
mistypes a command, so the main help message should be as concise as
possible, while the advanced one may be much more detailed (looking at the
mailing list, `fi df` could definitely use a more verbose help message).

> 
> To prove my point, looking at the current version in Hugo's integration
> branch, your two synopsis lines already got inconsistent regarding the
> -c option :-)

That's because the patches are submitted against Chris's tree, not
Hugo's, so the result is a real patchwork that needs some clean-up.

[snip]

> > @@ -148,10 +184,10 @@ static void help(char *np)
> > 
> > printf("Usage:\n");
> > for( cp = commands; cp->verb; cp++ )
> > 
> > -   print_help(np, cp);
> > +   print_help(np, cp, BASIC_HELP);
> > 
> > printf("\n\t%s help|--help|-h\n\t\tShow the help.\n",np);
> 
> ^
> You did not change this, but as we are here, ...

I wanted to leave as much code unchanged as possible (this /was/ my first 
patch to btrfs-tools)

> 
> > -   printf("\n\t%s  --help\n\t\tShow detailed help for a command
> > or\n\t\t" + printf("\n\t%s  --help\n\t\tShow detailed help for a
> > command or"
> 
>  ^^^
> ... why not extending the general rule so that help messages will be
> printed with --help and -h?

We have to remember that this way we lose the -h switch, which would be
quite intuitive for showing base-2 sizes with `btrfs file df`...

-- 
Hubert


Re: [PATCH 2/2] add detailed help messages to btrfs command

2011-07-12 Thread Hubert Kario
On Tuesday 12 of July 2011 00:22:01 Hugo Mills wrote:
> On Mon, Jul 11, 2011 at 09:11:24PM +0200, Jan Schmidt wrote:
> > On 07/11/2011 08:38 PM, Goffredo Baroncelli wrote:
[snip]
> > > A script extracts from the comment in the source both:
> > > - the text for the man page
> > > - the text for the detailed help.
> 
>Or possibly going the other direction: from the man page (which
> contains all of the information we need to reproduce in the code), it
> should be possible, with appropriate structuring, to retrieve the bits
> that the code needs to know about, and insert them into a table in a
> generated .c file. Just a thought.

I think the first line of the normal help message and of the advanced
help message can, and sometimes should, be different.

The basic one should be as concise as possible, while the advanced one
can be verbose and quite detailed (for example explaining what a
filesystem scrub /is/).
 
>Oh, and the current man page needs some major work on its
> typography -- it's inconsistent with both itself, and with most other
> man pages, as far as I can tell. I did have a patch for that, but it
> was a long time ago, and clashed with almost everything.

Yes, until we have a single current tree for btrfs-progs there will be
inconsistencies that will need to be fixed later.

But I guess that with Hugo's tree we're getting there.
 
> > Does anybody have such a script around? I suppose we're not the first
> > ones writing help texts and man pages.
> > 
> > > So we can reach the following goals:
> > > - the help is linked to the code
> > > - is less likely to forget to update the message
> > > - the man page, the helps are always aligned
> > 
> > Only, we still will need like short and long help. E.g. the full text in
> > the man page may be inappropriate as a --help message. Also, we do need
> > a clever idea to get indentation right in the man pages. I fiddled a lot
> > on the man pages for scrub parameter indentation (to get the second line
> > describing a command line option indented correctly to start below the
> > text of the first line, that was).
> 
>We actually need three levels of help:
> [snip]

I'd say that's the biggest problem: putting it all in code, and the
formatting will be a real pain...

Hubert


Re: btrfs vs data deduplication

2011-09-18 Thread Hubert Kario
On Saturday 09 of July 2011 08:19:30 Paweł Brodacki wrote:
> Hello,
>
> I've stumbled upon this article:
> http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/
>
> Reportedly Sandforce SF1200 SSD controller does internally block-level
> data de-duplication. This effectively removes the additional
> protection given by writing multiple metadata copies. This technique
> may be used, or can be used in the future by manufactureres of other
> drives too.

That's only a problem in a single-disk installation.

> I would like to ask, if the metadata copies written to a btrfs system
> with enabled metadata mirroring are identical, or is there something
> that makes them unique on-disk, therefore preventing their
> de-duplication. I tried googling for the answer, but didn't net
> anything that would answer my question.

There is a difference between the root inode copies; I don't think
there's any difference between the metadata tree copies. I'm quite
certain they are bit-for-bit identical.

> If the metadata copies are identical, I'd like to ask if it would be
> possible to change this without major disruption? I know that changes
> to on-disk format aren't a thing made lightly, but I'd be grateful for
> any comments.

That would be a big change for little to no benefit.

> The increase of the risk of file system corruption introduced by data
> de-duplication on Sandforce controllers was down-played in the
> vendor's reply included in the article, but still, what's the point of
> duplicating metadata on file system level, if storage below can remove
> that redundancy?

You shouldn't depend on a single drive; metadata RAID is there to protect
against single bad blocks, not against a disk crash.

If you want redundancy, use multiple disks, either HDD or SSD. And have
readable backups.

Regards,
Hubert





Re: BackupPC, per-dir hard link limit, Debian packaging

2010-03-02 Thread Hubert Kario
On Tuesday 02 March 2010 03:29:05 Robert Collins wrote:
> As I say, I realise this is queued to get addressed anyway, but it seems
> like a realistic thing for people to do (use BackupPC on btrfs) - even
> if something better still can be written to replace the BackupPC store
> in the future. I will note though, that simple snapshots won't achieve
> the deduplication level that BackupPC does, because the fils don't start
> out as the same: they are identified as being identical post-backup.

Isn't the main idea behind deduplication to merge identical parts of
files together using cow? This way you could have many very similar
images of virtual machines, run the deduplication process, and massively
reduce the space used while maintaining the differences between the
images.

If memory serves me right, the plan is to do it in userland, after the
fact, not while the data is being saved. If such a daemon or program were
available, you would run it on the system after rsyncing the
workstations.

Though the question remains which scheme would reduce space usage more in
your use case. From my experience, hardlinks take less space on disk; I
don't know whether it would be possible to optimise the btrfs cow system
for files that are exactly the same.
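A minimal sketch of such a userland, after-the-fact dedup pass (assuming
filenames without whitespace; on btrfs the reflink copy shares extents
with the original, and --reflink=auto lets the sketch run elsewhere as a
plain copy):

```shell
# Hypothetical after-the-fact dedup pass: group regular files by content
# hash and replace each duplicate with a reflink copy of the first one.
# md5 matches are re-checked with cmp before anything is rewritten.
dedup_dir() {
    find "$1" -type f -exec md5sum {} + |
    awk 'h[$1] { print h[$1], $2; next } { h[$1] = $2 }' |
    while read -r keep dup; do
        cmp -s "$keep" "$dup" || continue          # guard against collisions
        cp --reflink=auto "$keep" "$dup.tmp" && mv "$dup.tmp" "$dup"
    done
}
```

Each duplicate keeps its own inode, so unlike a hardlink the files stay
independently writable after the pass.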

> 
> Cheers,
> Rob
> 

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

System Zarządzania Jakością
zgodny z normą ISO 9001:2000


Re: BackupPC, per-dir hard link limit, Debian packaging

2010-03-02 Thread Hubert Kario
On Wednesday 03 March 2010 00:22:31 jim owens wrote:
> Hubert Kario wrote:
> > On Tuesday 02 March 2010 03:29:05 Robert Collins wrote:
> >> As I say, I realise this is queued to get addressed anyway, but it seems
> >> like a realistic thing for people to do (use BackupPC on btrfs) - even
> >> if something better still can be written to replace the BackupPC store
> >> in the future. I will note though, that simple snapshots won't achieve
> >> the deduplication level that BackupPC does, because the fils don't start
> >> out as the same: they are identified as being identical post-backup.
> > 
> > Isn't the main idea behind deduplication to merge identical parts of
> > files together using cow? This way you could have many very similar
> > images of virtual machines, run the deduplication process and reduce
> > massively the space used while maintaining the differences between
> > images.
> > 
> > If memory serves me right, the plan is to do it in userland on a
> > post-fact filesystem, not when the data is being saved. If such a daemon
> > or program was available you would run it on the system after rsyncing
> > the workstations.
> > 
> > Though the question remains which system would reduce space usage more in
> > your use case. From my experience, hardlinks take less space on disk, I
> > don't know whatever it could be possible to optimise btrfs cow system
> > for files that are exactly the same.
> 
> Space use is not the key difference between these methods.
> The btrfs COW makes data sharing safe.  The hard link method
> means changing a file invalidates the content of all linked files.
> 
> So a BackupPC output should be read-only.

I know that, but if you're using "dumb" tools to replicate systems (say
rsync), you don't want them to overwrite different versions of files, and
you still want to reclaim the disk space used by essentially the same
data.

My idea behind using btrfs as backup storage, with cow rather than
hardlinks for duplicated files, comes from the need to keep archival
copies (something not really possible with hardlinks) in a way similar to
rdiff-backup.

As the first backup I just rsync to the backup server from all
workstations. On subsequent backups I copy the last version to a
.snapshot/todays-date directory using cow, rsync from the workstations,
and then run the deduplication daemon.

This way I get both reduced storage and old copies (handy for user home 
directories...).
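The cycle above can be sketched as follows (paths are examples; a plain
cp stands in for the rsync step so the sketch is self-contained, and
--reflink=auto degrades to a normal copy off btrfs):

```shell
# Hedged sketch of one backup cycle: snapshot the previous state with a
# cow copy, then refresh 'current' from the source. A dedup pass over
# the tree would then reclaim space shared between snapshots.
backup_cycle() {   # backup_cycle SRC CURRENT SNAPDIR
    cp -a --reflink=auto "$2" "$3/$(date +%F)"   # cow copy of last backup
    cp -a "$1/." "$2/"                           # stand-in for: rsync -a SRC/ CURRENT/
}
```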

With such a use case, the ability to use cow while needing amounts of
space similar to hardlinks would be at least useful, if not very
desirable.

That's why I asked whether it's possible to optimise the btrfs cow
mechanism for identical files.

From my testing (directory 584 MiB in size, 17395 files, Arch kernel
2.6.32.9, coreutils 8.4, btrfs-progs 0.19, 10 GiB partition, default mkfs
and mount options):
cp -al
free space decrease: 6176KiB

cp -a --reflink=always
free space decrease: 23296KiB

and in the second run:
cp -al
free space decrease: 6064KiB

cp -a --reflink=always
free space decrease: 23324KiB

that's nearly 4 times more!
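The free-space numbers above can be gathered with a helper along these
lines (a sketch, not the exact method used; `df -kP` reports used space
in 1 KiB blocks, and the sync calls are there because btrfs may delay
allocation):

```shell
# Report how many KiB of the filesystem one copy operation consumes.
# SRC and DST are paths on the same filesystem; any extra arguments are
# passed to cp (e.g. -al, or -a --reflink=always).
space_used_by_copy() {
    src=$1 dst=$2; shift 2
    sync
    before=$(df -kP "$(dirname "$dst")" | awk 'NR==2 {print $3}')
    cp "$@" "$src" "$dst"
    sync
    after=$(df -kP "$(dirname "$dst")" | awk 'NR==2 {print $3}')
    echo $(( after - before ))
}
# e.g.: space_used_by_copy /mnt/data /mnt/data.cow -a --reflink=always
```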
-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: Raid1 with 3 drives

2010-03-05 Thread Hubert Kario
On Friday 05 March 2010 23:13:54 Mike Fedyk wrote:
> On Fri, Mar 5, 2010 at 1:49 PM, Bart Noordervliet  
wrote:
> > Maybe it's worth to consider leaving the burdened raid* terminology
> > behind and name the btrfs redundancy modes more clearly by what they
> > do. For instance "-d double|triple" or "-d 2n|3n". And for raid5/6 "-d
> > single-parity|double-parity" or "-d n+1|n+2".
> 
> +1

Good idea IMHO.

When we are able to specify the redundancy modes on a file-by-file basis,
it will be much less confusing for users to talk about double or triple
replication and single or double parity.

It's a bit silly to talk about "Arrays of Disks" when we mean groups of 
blocks.

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: SSD Optimizations

2010-03-11 Thread Hubert Kario
On Thursday 11 March 2010 08:38:53 Sander wrote:
> Hello Gordan,
> 
> Gordan Bobic wrote (ao):
> > Mike Fedyk wrote:
> > >On Wed, Mar 10, 2010 at 11:49 AM, Gordan Bobic  wrote:
> > >>Are there options available comparable to ext2/ext3 to help reduce
> > >>wear and improve performance?
> 
> With SSDs you don't have to worry about wear.

Sorry, but you do have to worry about wear. I was able to destroy a
relatively new SD card (2007 or early 2008) just by writing to its first
10 MiB over and over again for two or three days. The end of the card
still works without problems, but about 10 sectors at the beginning give
write errors.

And with journaling file systems that write to the same spot over and
over again, you do have to worry about wear leveling. It depends on the
underlying block allocation algorithm, but I'm sure that most of the
cheap SSDs do wear leveling only inside big blocks, not across the whole
drive, making it much easier to hit the 10,000-100,000 erase-cycle limit.

Still, I think that if you can prolong the life of hardware without
noticeable performance degradation, you should do it. It may well help a
drive with some defects last the 3-5 years between upgrades without any
problems.

> 
> > And while I appreciate hopeful remarks along the lines of "I think
> > you'll get more out of btrfs", I am really after specifics of what
> > the ssd mount option does, and what features comparable to the
> > optimizations that can be done with ext2/3/4 (e.g. the mentioned
> > stripe-width option) are available to get the best possible
> > alignment of data and metadata to increase both performance and life
> > expectancy of a SSD.
> 
> Alignment is about the partition, not the fs, and thus taken care of
> with fdisk and the like.
> 
> If you don't create a partition, the fs is aligned with the SSD.

But it does not align internal FS structures to the SSD erase-block size,
and that's what Gordan asked for.

And sorry Gordan, I don't know. But there is an 'ssd_spread' option that
tries to allocate blocks as far as possible (within reason) from each
other. That should, in most cases, make the fs structures reside on erase
blocks by themselves.
I'm afraid you'll need to dive into the code to learn about block
alignment, or one of the developers will need to provide us with the
info.

> 
> > Also, for drives that don't support TRIM, is there a way to make the
> > FS apply aggressive re-use of erased space (in order to help the
> > drive's internal wear-leveling)?
> 
> TRIM has nothing to do with wear-leveling (although it helps reducing
> wear).
> TRIM lets the OS tell the disk which blocks are not in use anymore, and
> thus don't have to be copied during a rewrite of the blocks.
> Wear-leveling is the SSD making sure all blocks are more or less equally
> written to avoid continuous load on the same blocks.

Isn't this all about wear leveling? TRIM has no meaning for magnetic
media. It's used to tell the drive which parts of the medium contain only
junk data and can therefore be used in block rotation, making wear
leveling easier and more effective.

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50


Re: SSD Optimizations

2010-03-11 Thread Hubert Kario
On Thursday 11 March 2010 14:20:23 Gordan Bobic wrote:
> On Thu, 11 Mar 2010 13:59:09 +0100, Stephan von Krawczynski
> 
>  wrote:
> >> >> > > >On Wed, Mar 10, 2010 at 11:49 AM, Gordan Bobic
> >> >> > > >
> >> >> > > >
> >> >> > > >wrote:
> >> >> > > >>Are there options available comparable to ext2/ext3 to help
> >>
> >> reduce
> >>
> >> >> > > >>wear and improve performance?
> >> >> >
> >> >> > With SSDs you don't have to worry about wear.
> >> >>
> >> >> Sorry, but you do have to worry about wear. I was able to destroy a
> >> >> relatively
> >> >> new SD card (2007 or early 2008) just by writing on the first 10MiB
> >>
> >> over
> >>
> >> >> and
> >> >> over again for two or three days. The end of the card still works
> >> >> without
> >> >> problems but about 10 sectors on the beginning give write errors.
> >> >
> >> > Sorry, the topic was SSD, not SD.
> >>
> >> SD == SSD with an SD interface.
> >
> > That really is quite a statement. You really talk of a few-bucks SD card
> > (like the one in my android handy) as an SSD comparable with Intel XE
> 
> only with
> 
> > different interface? Come on, stay serious. The product is not only made
> 
> of
> 
> > SLCs and some raw logic.
> 
> I am saying that there is no reason for the firmware in an SD card to not
> be as advanced. If the manufacturer has some advanced logic in their SATA
> SSD, I cannot see any valid engineering reason to not apply the same logic
> in a SD product.

The _SD_standard_ states that the media has to implement wear leveling,
so any card with an SD logo implements it.

As I stated previously, the algorithms used in SD cards may not be as
advanced as those in top-of-the-line Intel SSDs, but I bet they don't
differ much from the ones used in the cheapest SSD drives.

Besides, why shouldn't we help the drive firmware by
- writing data only in erase-block-sized units
- placing writes smaller than an erase block so that they don't cross an
erase-block boundary
- using TRIM on deallocated parts of the drive

This will not only increase the life of the SSD but also increase its
performance.
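As a toy illustration of the boundary arithmetic, rounding an offset up
to the next erase-block boundary looks like this (512 KiB is only an
assumed erase-block size; real values vary by drive and are rarely
reported):

```shell
# align_up OFFSET_BYTES ERASE_BLOCK_BYTES: round OFFSET up to the next
# erase-block boundary, so a write starting there cannot straddle one.
align_up() {
    echo $(( ( $1 + $2 - 1 ) / $2 * $2 ))
}
# e.g. a partition starting at byte 1000000 would move to
# $(align_up 1000000 524288) = 1048576, the second 512 KiB boundary
```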

> 
> >> > Honestly I would just drop the idea of an SSD option simply because
> 
> the
> 
> >> > vendors implement all kinds of neat strategies in their devices. So
> 
> in
> 
> >> the
> >>
> >> > end you cannot really tell if the option does something constructive
> >> > and not destructive in combination with a SSD controller.
> >>
> >> You can make an educated guess. For starters given that visible sector
> >> sizes are not equal to FS block sizes, it means that FS block sizes can
> >> straddle erase block boundaries without the flash controller, no matter
> >> how
> >> fancy, being able to determine this. Thus, at the very least, aligning
> 
> FS
> 
> >> structures so that they do not straddle erase block boundaries is
> 
> useful
> 
> >> in
> >> ALL cases. Thinking otherwise is just sticking your head in the sand
> >> because you cannot be bothered to think.
> >
> > And your guess is that intel engineers had no glue when designing the XE
> > including its controller? You think they did not know what you and me
> 
> know
> 
> > and
> > therefore pray every day that some smart fs designer falls from heaven
> 
> and
> 
> > saves their product from dying in between? Really?
> 
> I am saying that there are problems that CANNOT be solved on the disk
> firmware level. Some problems HAVE to be addressed higher up the stack.

Exactly. You can't assume that the SSD's firmware understands any and all
file system layouts, especially if they sit on fragmented LVM or other
logical volume manager partitions.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl

System Zarządzania Jakością
zgodny z normą ISO 9001:2000


  1   2   >