[PATCH 2/2] block: Drop dead function blk_abort_queue()

2012-05-24 Thread Asias He
This function is only used by the btrfs code in btrfs_abort_devices()
(apparently in the wrong way). However, btrfs_abort_devices() is not used
anywhere. Let's remove the dead code to avoid any confusion.

Cc: Jens Axboe 
Cc: linux-ker...@vger.kernel.org
Cc: Chris Mason 
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Asias He 
---
 block/blk-timeout.c|   41 -
 include/linux/blkdev.h |1 -
 2 files changed, 42 deletions(-)

diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 7803548..6e4744c 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -197,44 +197,3 @@ void blk_add_timer(struct request *req)
mod_timer(&q->timeout, expiry);
 }
 
-/**
- * blk_abort_queue -- Abort all request on given queue
- * @queue: pointer to queue
- *
- */
-void blk_abort_queue(struct request_queue *q)
-{
-   unsigned long flags;
-   struct request *rq, *tmp;
-   LIST_HEAD(list);
-
-   /*
-* Not a request based block device, nothing to abort
-*/
-   if (!q->request_fn)
-   return;
-
-   spin_lock_irqsave(q->queue_lock, flags);
-
-   elv_abort_queue(q);
-
-   /*
-* Splice entries to local list, to avoid deadlocking if entries
-* get readded to the timeout list by error handling
-*/
-   list_splice_init(&q->timeout_list, &list);
-
-   list_for_each_entry_safe(rq, tmp, &list, timeout_list)
-   blk_abort_request(rq);
-
-   /*
-* Occasionally, blk_abort_request() will return without
-* deleting the element from the list. Make sure we add those back
-* instead of leaving them on the local stack list.
-*/
-   list_splice(&list, &q->timeout_list);
-
-   spin_unlock_irqrestore(q->queue_lock, flags);
-
-}
-EXPORT_SYMBOL_GPL(blk_abort_queue);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4d4ac24..dcd9738 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -809,7 +809,6 @@ extern bool __blk_end_request_err(struct request *rq, int error);
 extern void blk_complete_request(struct request *);
 extern void __blk_complete_request(struct request *);
 extern void blk_abort_request(struct request *);
-extern void blk_abort_queue(struct request_queue *);
 extern void blk_unprep_request(struct request *);
 
 /*
-- 
1.7.10.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] btrfs: Drop unused function btrfs_abort_devices()

2012-05-24 Thread Asias He
1) This function is not used anywhere.

2) Using blk_abort_queue() to abort the queue does not seem correct:
blk_abort_queue() is meant for timeout handling (block/blk-timeout.c).

Cc: Chris Mason 
Cc: linux-btrfs@vger.kernel.org
Cc: Jens Axboe 
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Asias He 
---
 fs/btrfs/disk-io.c |   13 -
 fs/btrfs/disk-io.h |1 -
 2 files changed, 14 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e1fe74a..3521866 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2902,19 +2902,6 @@ int write_ctree_super(struct btrfs_trans_handle *trans,
return ret;
 }
 
-/* Kill all outstanding I/O */
-void btrfs_abort_devices(struct btrfs_root *root)
-{
-   struct list_head *head;
-   struct btrfs_device *dev;
-   mutex_lock(&root->fs_info->fs_devices->device_list_mutex);
-   head = &root->fs_info->fs_devices->devices;
-   list_for_each_entry_rcu(dev, head, dev_list) {
-   blk_abort_queue(dev->bdev->bd_disk->queue);
-   }
-   mutex_unlock(&root->fs_info->fs_devices->device_list_mutex);
-}
-
 void btrfs_free_fs_root(struct btrfs_fs_info *fs_info, struct btrfs_root *root)
 {
spin_lock(&fs_info->fs_roots_radix_lock);
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index ab1830a..05b3fab 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -89,7 +89,6 @@ int btrfs_add_log_tree(struct btrfs_trans_handle *trans,
 int btrfs_cleanup_transaction(struct btrfs_root *root);
 void btrfs_cleanup_one_transaction(struct btrfs_transaction *trans,
  struct btrfs_root *root);
-void btrfs_abort_devices(struct btrfs_root *root);
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 void btrfs_init_lockdep(void);
-- 
1.7.10.2



Re: Cant mount multi-subvolume via fstab

2012-05-24 Thread Rogerio Bastos
On Wed, May 23, 2012 at 07:31:49PM +0200, Goffredo Baroncelli wrote:
> Hi Rogerio,
> 
> On 05/23/2012 05:00 PM, Rogerio Bastos wrote:
> > Hi,
> > 
> > I'm trying to mount several subvolumes during boot via fstab:
> > 
> > UUID=xxx /usr btrfs subvol=usr,ro,nodev 0 0
> > UUID=xxx /home btrfs subvol=home,nodev,nosuid 0 0
> > UUID=xxx /var btrfs subvol=var,nodev 0 0
> > UUID=xxx /var/tmp btrfs subvol=var-tmp,nodev,noexec,nosuid 0 0
> > 
> > But only the first one is mounted. When I try to mount the other
> > subvolumes, I get this error:
> 
> I did some tests. It seems that the problem is that you want to mount
> different subvolumes *of the same filesystem* (/dev/sda3) both in RO
> (first entry) and RW (the other entries).
> 
> Please try removing the 'ro' from the first entry, and let us know
> what happens.

You are right: without 'ro' I can mount all the subvolumes.
Thank you.

> BR
> Goffredo
> 
> > 
> > mount: /dev/sda3 already mounted or /home busy
> > mount: according to mtab, /dev/sda3 is mounted on /usr
> > mount: /dev/sda3 already mounted or /var busy
> > mount: according to mtab, /dev/sda3 is mounted on /usr
> > mount: mount point /var/tmp does not exist
> > 
> > I'm using linux kernel 3.3.6 and mount 2.20 in Debian 7.
> > 
> 

-- 

Rogerio Bastos


Btrfs and more compression algorithms

2012-05-24 Thread Matt
Hi Chris, Hi Josef,

Hi Btrfs-List and all other Btrfs-devs that I've forgotten,


is there a chance we'll see xz file-compression support in Btrfs
anytime soon?

I'm sure folks have been waiting for additional compression support
besides gzip and lzo (bzip2 seems out of question due to its slowness,
there's pbzip2 but that's not included in the kernel).

This would be a really nice bonus, since processors are getting faster
and SSD usage is more and more widespread. Add an efficient
implementation and we would have a fast, extremely efficient and
feature-rich filesystem.

My current situation is that several of my harddrives are almost
completely full, even with forced gzip-compression, so I thought I'd
ask whether any change is planned for the near future.
There's fusecompress but that probably wouldn't end up being as stable
as a btrfs with xz/lzma-support.


Thanks for your consideration and your work on Btrfs !

It got significantly more stable compared to the past :)

(I use it mainly for some small backup hdds.

One troublesome use case remains: when I regularly suspend to RAM or to
disk, the filesystem on one partition [a dedicated partition for the
portage tarball of Gentoo Linux] seems to take some damage, after which
it can't be written to anymore via rsync (or other programs). The bash
session hangs (and nothing gets written to the partition).
Running scrub revealed no issues. I haven't had a chance to test it
yet with the new btrfs-progs, as I haven't suspended in the meantime.)


Kind Regards

Matt


Re: [PATCH 2/2] Btrfs: resize all devices when we dont assign a specific device id

2012-05-24 Thread Goffredo Baroncelli
On 05/24/2012 04:15 AM, Liu Bo wrote:
> On 05/23/2012 01:21 PM, Goffredo Baroncelli wrote:
> 
>> Hi
>>
>> On 05/17/2012 02:08 PM, Liu Bo wrote:
>>> This patch fixes two bugs:
>>>
>>> When we do not assign a device id for the resizer,
>>> - it will only take one device to resize, which is supposed to apply on
>>>   all available devices.
>>> - it will take 'id 1' device as default, and this will cause a bug as we
>>>   may have removed the 'id 1' device from the filesystem.
>>>
>>> After this patch, we can find all available devices by searching the
>>> chunk tree and resize them:
>>
>>
>> I am not sure that this is a sane default for all resizing.
>>
>> If the user wants to resize to MAX, I agree that it is a sane default,
>> but when the user wants to shrink or enlarge by a fixed amount, the
>> user should specify the dev id, because the shrinking and/or the
>> enlarging should be evaluated case by case.
>>
>> My suggestion is to change the code at kernel level so that, in case of
>> a multi-volume filesystem, the user *has* to specify the device to shrink
>> and/or enlarge.
>> It should be the user-space btrfs tool that handles the check and the
>> growing (i.e. if the new size is max, automatically grow all the devices
>> up to max; otherwise the user should specify the device to shrink and/or
>> enlarge).
>>
> 
> 
> Hi,
> 
> It is quite easy to do what you expect, but IMO that will confuse users a lot...

What happens if one device (of a pool) is too small to host a
filesystem enlargement? How is the error returned from kernel space?

In kernel space it is more complex to handle an error. If there is a single
device there is no problem: the error is linked to that single device.
But in the multiple-device case, which one failed?

Think also of other errors, like a missing device.

Returning a generic error and forcing the user to read the log is too
simple an approach.

> 
> With this patch, you can still assign a specific dev id to resize what you want :)

Of course everything that could be done in user space could be done in
kernel space.
I think that as a general rule we should put all the code in user space
unless there are performance problems and/or missing information. In
this case user space can have all the information (see
scrub_fs_info() and scrub_device_info() for how to acquire the filesystem
information) and there is no performance problem.


> 
> thanks,
> liubo
> 
>> BR
>> Goffredo
>>
>>
>>> $ mkfs.btrfs /dev/sdb7
>>> $ mount /dev/sdb7 /mnt/btrfs/
>>> $ btrfs dev add /dev/sdb8 /mnt/btrfs/
>>>
>>> $ btrfs fi resize -100m /mnt/btrfs/
>>> then we can get from dmesg:
>>> btrfs: new size for /dev/sdb7 is 980844544
>>> btrfs: new size for /dev/sdb8 is 980844544
>>>
>>> $ btrfs fi resize max /mnt/btrfs
>>> then we can get from dmesg:
>>> btrfs: new size for /dev/sdb7 is 1085702144
>>> btrfs: new size for /dev/sdb8 is 1085702144
>>>
>>> $ btrfs fi resize 1:-100m /mnt/btrfs
>>> then we can get from dmesg:
>>> btrfs: resizing devid 1
>>> btrfs: new size for /dev/sdb7 is 980844544
>>>
>>> $ btrfs fi resize 2:-100m /mnt/btrfs
>>> then we can get from dmesg:
>>> btrfs: resizing devid 2
>>> btrfs: new size for /dev/sdb8 is 980844544
>>>
>>> Signed-off-by: Liu Bo 
>>> ---
>>>  fs/btrfs/ioctl.c |  101 
>>> --
>>>  1 files changed, 83 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>>> index ec2245d..d9a4fa8 100644
>>> --- a/fs/btrfs/ioctl.c
>>> +++ b/fs/btrfs/ioctl.c
>>> @@ -1250,12 +1250,51 @@ out_ra:
>>> return ret;
>>>  }
>>>  
>>> +static struct btrfs_device *get_avail_device(struct btrfs_root *root, u64 devid)
>>> +{
>>> +   struct btrfs_key key;
>>> +   struct btrfs_path *path;
>>> +   struct btrfs_dev_item *dev_item;
>>> +   struct btrfs_device *device = NULL;
>>> +   int ret;
>>> +
>>> +   path = btrfs_alloc_path();
>>> +   if (!path)
>>> +   return ERR_PTR(-ENOMEM);
>>> +
>>> +   key.objectid = BTRFS_DEV_ITEMS_OBJECTID;
>>> +   key.offset = devid;
>>> +   key.type = BTRFS_DEV_ITEM_KEY;
>>> +
>>> +   ret = btrfs_search_slot(NULL, root->fs_info->chunk_root, &key,
>>> +   path, 0, 0);
>>> +   if (ret < 0) {
>>> +   device = ERR_PTR(ret);
>>> +   goto out;
>>> +   }
>>> +   btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
>>> +   if (key.objectid != BTRFS_DEV_ITEMS_OBJECTID ||
>>> +   key.type != BTRFS_DEV_ITEM_KEY) {
>>> +   device = NULL;
>>> +   goto out;
>>> +   }
>>> +   dev_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
>>> + struct btrfs_dev_item);
>>> +   devid = btrfs_device_id(path->nodes[0], dev_item);
>>> +
>>> +   device = btrfs_find_device(root, devid, NULL, NULL);
>>> +out:
>>> +   btrfs_free_path(path);
>>> +   return device;
>>> +}
>>> +
>>>  static noinline int btrfs_ioctl_resize(struct btrfs_root *root,
>>> void __user *arg)
>>>  {
>>> -   u64 ne

[BUG] atime on ro snapshots is updated when it should not

2012-05-24 Thread Alexander Block
Hello,

if a snapshot was created with -r and thus is read only, accessing
files in it will update the atime. I would expect that atime is not
updated on ro snapshots.

I tried to find out where the ro check is missing. The problem seems
to be that the vfs only checks the mount, the super block and
i_flags. As it has no clue about subvolumes, it never checks them.
My temporary workaround, so that I can continue working, is the patch
at the end of this mail.
Is anyone with more vfs experience able to fix this in a better way?

Thanks,
Alex.

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3524978..6f126e0 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -115,6 +115,10 @@ void btrfs_update_iflags(struct inode *inode)
inode->i_flags |= S_NOATIME;
if (ip->flags & BTRFS_INODE_DIRSYNC)
inode->i_flags |= S_DIRSYNC;
+
+   if (btrfs_root_readonly(ip->root)) {
+   inode->i_flags |= S_NOATIME;
+   }
 }


[PATCH V2] Btrfs: fix the same inode id problem when doing auto defragment

2012-05-24 Thread Miao Xie
Two files in different subvolumes may have the same inode id, so
the rb-tree which is used to manage the defragment objects must take
this into account. This patch fixes the problem.

Signed-off-by: Miao Xie 
---
Changelog v1 -> v2
- cleanup unnecessary comparison.
---
 fs/btrfs/file.c |   49 +++--
 1 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 53bf2d7..0c04a62 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -65,6 +65,21 @@ struct inode_defrag {
int cycled;
 };
 
+static int __compare_inode_defrag(struct inode_defrag *defrag1,
+ struct inode_defrag *defrag2)
+{
+   if (defrag1->root > defrag2->root)
+   return 1;
+   else if (defrag1->root < defrag2->root)
+   return -1;
+   else if (defrag1->ino > defrag2->ino)
+   return 1;
+   else if (defrag1->ino < defrag2->ino)
+   return -1;
+   else
+   return 0;
+}
+
 /* pop a record for an inode into the defrag tree.  The lock
  * must be held already
  *
@@ -81,15 +96,17 @@ static void __btrfs_add_inode_defrag(struct inode *inode,
struct inode_defrag *entry;
struct rb_node **p;
struct rb_node *parent = NULL;
+   int ret;
 
p = &root->fs_info->defrag_inodes.rb_node;
while (*p) {
parent = *p;
entry = rb_entry(parent, struct inode_defrag, rb_node);
 
-   if (defrag->ino < entry->ino)
+   ret = __compare_inode_defrag(defrag, entry);
+   if (ret < 0)
p = &parent->rb_left;
-   else if (defrag->ino > entry->ino)
+   else if (ret > 0)
p = &parent->rb_right;
else {
/* if we're reinserting an entry for
@@ -159,28 +176,35 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
 /*
  * must be called with the defrag_inodes lock held
  */
-struct inode_defrag *btrfs_find_defrag_inode(struct btrfs_fs_info *info, u64 ino,
+struct inode_defrag *btrfs_find_defrag_inode(struct btrfs_fs_info *info,
+u64 root, u64 ino,
 struct rb_node **next)
 {
struct inode_defrag *entry = NULL;
+   struct inode_defrag tmp;
struct rb_node *p;
struct rb_node *parent = NULL;
+   int ret;
+
+   tmp.ino = ino;
+   tmp.root = root;
 
p = info->defrag_inodes.rb_node;
while (p) {
parent = p;
entry = rb_entry(parent, struct inode_defrag, rb_node);
 
-   if (ino < entry->ino)
+   ret = __compare_inode_defrag(&tmp, entry);
+   if (ret < 0)
p = parent->rb_left;
-   else if (ino > entry->ino)
+   else if (ret > 0)
p = parent->rb_right;
else
return entry;
}
 
if (next) {
-   while (parent && ino > entry->ino) {
+   while (parent && __compare_inode_defrag(&tmp, entry) > 0) {
parent = rb_next(parent);
entry = rb_entry(parent, struct inode_defrag, rb_node);
}
@@ -202,6 +226,7 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
struct btrfs_key key;
struct btrfs_ioctl_defrag_range_args range;
u64 first_ino = 0;
+   u64 root_objectid = 0;
int num_defrag;
int defrag_batch = 1024;
 
@@ -214,11 +239,14 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
n = NULL;
 
/* find an inode to defrag */
-   defrag = btrfs_find_defrag_inode(fs_info, first_ino, &n);
+   defrag = btrfs_find_defrag_inode(fs_info, root_objectid,
+first_ino, &n);
if (!defrag) {
-   if (n)
-   defrag = rb_entry(n, struct inode_defrag, rb_node);
-   else if (first_ino) {
+   if (n) {
+   defrag = rb_entry(n, struct inode_defrag,
+ rb_node);
+   } else if (root_objectid || first_ino) {
+   root_objectid = 0;
first_ino = 0;
continue;
} else {
@@ -228,6 +256,7 @@ int btrfs_run_defrag_inodes(struct btrfs_fs_info *fs_info)
 
/* remove it from the rbtree */
first_ino = defrag->ino + 1;
+   root_objectid = defrag->root;
rb_erase(&defrag->rb_node, &fs_info->defrag_inodes);
 
if (btrfs_fs_closing(fs_info))
-- 
1.7.6.5

Re: Preparing single-disk setup for future multi-disk usage

2012-05-24 Thread Calvin Walton
On Thu, 2012-05-24 at 08:05 +0200, Björn Wüst wrote:
> Good morning,
> 
> 
> I currently have a single-disk setup where I want to use btrfs
> filesystem. Yet, I expect to add additional disks to this system in
> the future. Those disks shall be visible to the OS like a single disk,
> i.e. using multi-disk feature in btrfs. While data shall be striped
> among those disks in the future, meta data shall be mirrored for
> better fault tolerance (loss of some data is acceptable, while loss of
> all data is not acceptable).

If you're using the "RAID 0" (striping) mode, you will not be able to
recover any data in a useful form after a disk failure, because each
file will have been broken into small chunks and spread over both disks.
If you use the 'single' allocation mode instead, it is more likely that
individual files will be mostly on one disk, but there are no
guarantees.

If you want data to be recoverable, you should use a redundant raid
mode; otherwise don't expect that you'll be able to save much after a
disk failure.

-- 
Calvin Walton 



Re: [PATCH] Btrfs: fix the same inode id problem when doing auto defragment

2012-05-24 Thread David Sterba
On Thu, May 24, 2012 at 10:42:19AM +0800, Miao Xie wrote:
> @@ -87,9 +102,9 @@ static void __btrfs_add_inode_defrag(struct inode *inode,
>   parent = *p;
>   entry = rb_entry(parent, struct inode_defrag, rb_node);
>  
> - if (defrag->ino < entry->ino)
> + if (__compare_inode_defrag(defrag, entry) < 0)
>   p = &parent->rb_left;
> - else if (defrag->ino > entry->ino)
> + else if (__compare_inode_defrag(defrag, entry) > 0)
>   p = &parent->rb_right;

You don't need to call __compare_inode_defrag() twice; the comparison
result will not change between the two calls.

>   else {
>   /* if we're reinserting an entry for
> @@ -159,28 +174,33 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
>  /*
>   * must be called with the defrag_inodes lock held
>   */
> -struct inode_defrag *btrfs_find_defrag_inode(struct btrfs_fs_info *info, u64 ino,
> +struct inode_defrag *btrfs_find_defrag_inode(struct btrfs_fs_info *info,
> +  u64 root, u64 ino,
>struct rb_node **next)
>  {
>   struct inode_defrag *entry = NULL;
> + struct inode_defrag tmp;
>   struct rb_node *p;
>   struct rb_node *parent = NULL;
>  
> + tmp.ino = ino;
> + tmp.root = root;
> +
>   p = info->defrag_inodes.rb_node;
>   while (p) {
>   parent = p;
>   entry = rb_entry(parent, struct inode_defrag, rb_node);
>  
> - if (ino < entry->ino)
> + if (__compare_inode_defrag(&tmp, entry) < 0)
>   p = parent->rb_left;
> - else if (ino > entry->ino)
> + else if (__compare_inode_defrag(&tmp, entry) > 0)

(same here)

>   p = parent->rb_right;
>   else
>   return entry;
>   }
>  
>   if (next) {
> - while (parent && ino > entry->ino) {
> + while (parent && __compare_inode_defrag(&tmp, entry) > 0) {
>   parent = rb_next(parent);
>   entry = rb_entry(parent, struct inode_defrag, rb_node);
>   }


david


RE: Preparing single-disk setup for future multi-disk usage

2012-05-24 Thread Bjoern Wuest
Hi Hugo,


Thank you very much for the information.


Regards
  Bjoern


-----Original Message-----
From: Hugo Mills [mailto:h...@carfax.org.uk] 
Sent: Thursday, May 24, 2012 10:41 AM
To: Björn Wüst
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Preparing single-disk setup for future multi-disk usage

On Thu, May 24, 2012 at 08:05:23AM +0200, Björn Wüst wrote:
> Good morning,
> 
> 
> I currently have a single-disk setup where I want to use btrfs filesystem.
> Yet, I expect to add additional disks to this system in the future. Those
> disks shall be visible to the OS like a single disk, i.e. using multi-disk
> feature in btrfs. While data shall be striped among those disks in the
> future, meta data shall be mirrored for better fault tolerance (loss of some
> data is acceptable, while loss of all data is not acceptable).
> 
> btrfs supports multi-disk setups and even adding additional devices at a
> later point of time. Thus, it is my preferred choice. However, I am puzzled
> how the mkfs.btrfs command must be parametrized to have RAID1 for meta data
> and RAID0 for data with just a single disk / partition. Could I simply do
> mkfs.btrfs -m raid1 -d raid0 /dev/sdaX (where X is the partition number) ?

   No, this will fail to work, as RAID-1 and -0 both require at least two
disks. What you need to do is use the default settings (mkfs.btrfs
/dev/sdaX) for now, which will give you DUP metadata (two copies of each
block, in different places on the disk), and "single" data (one copy of the
data).

   When you add a new disk, you can convert the replication to something
else, using the balance command:

# btrfs dev add /dev/sdY /mountpoint
# btrfs balance start -dconvert=raid0 /mountpoint

(Note that you don't need to specify -m raid1 to convert the metadata, as
the DUP->RAID-1 conversion is implicit where it's possible).

> Unfortunately, I do not have a disk to test it right now. The disk I am
> planning to use is with the post service still :) . Searching the Web could
> not reveal a similar scenario. All multi-disk examples assume that you
> already have multiple disks and not that you are going to have them.

   The conversion code is quite new, so there's few examples out there right
now.

   Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- There are three things you should never see being made: laws, ---  
standards,  and sausages.



Re: Ceph on btrfs 3.4rc

2012-05-24 Thread Christian Brunner
Same thing here.

I've tried really hard, but even after 12 hours I wasn't able to get a
single warning from btrfs.

I think you cracked it!

Thanks,
Christian

2012/5/24 Martin Mailand :
> Hi,
> the ceph cluster has been running under heavy load for the last 13 hours
> without a problem; dmesg is empty and the performance is good.
>
> -martin
>
> Am 23.05.2012 21:12, schrieb Martin Mailand:
>
>> this patch has been running for 3 hours without a bug and without the warning.
>> I will let it run overnight and report tomorrow.
>> It looks very good ;-)


Re: Preparing single-disk setup for future multi-disk usage

2012-05-24 Thread Hugo Mills
On Thu, May 24, 2012 at 08:05:23AM +0200, Björn Wüst wrote:
> Good morning,
> 
> 
> I currently have a single-disk setup where I want to use btrfs filesystem. 
> Yet, I expect to add additional disks to this system in the future. Those 
> disks shall be visible to the OS like a single disk, i.e. using multi-disk 
> feature in btrfs. While data shall be striped among those disks in the 
> future, meta data shall be mirrored for better fault tolerance (loss of some 
> data is acceptable, while loss of all data is not acceptable).
> 
> btrfs supports multi-disk setups and even adding additional devices at a 
> later point of time. Thus, it is my preferred choice. However, I am puzzled 
> how the mkfs.btrfs command must be parametrized to have RAID1 for meta data 
> and RAID0 for data with just a single disk / partition. Could I simply do 
> mkfs.btrfs -m raid1 -d raid0 /dev/sdaX (where X is the partition number) ?

   No, this will fail to work, as RAID-1 and -0 both require at least
two disks. What you need to do is use the default settings (mkfs.btrfs
/dev/sdaX) for now, which will give you DUP metadata (two copies of
each block, in different places on the disk), and "single" data (one
copy of the data).

   When you add a new disk, you can convert the replication to
something else, using the balance command:

# btrfs dev add /dev/sdY /mountpoint
# btrfs balance start -dconvert=raid0 /mountpoint

(Note that you don't need to specify -m raid1 to convert the metadata,
as the DUP->RAID-1 conversion is implicit where it's possible).

> Unfortunately, I do not have a disk to test it right now. The disk I am 
> planning to use is with the post service still :) . Searching the Web could 
> not reveal a similar scenario. All multi-disk examples assume that you 
> already have multiple disks and not that you are going to have them.

   The conversion code is quite new, so there's few examples out there
right now.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- There are three things you should never see being made: laws, ---  
standards,  and sausages.

