Re: [PATCH v2] btrfs: handle dynamically reappearing missing device

2017-12-03 Thread Anand Jain




[..] would an explicit user intervention be better than an automatic one?


 What user-intervention steps do you have in mind? Just curious.
 Please remember that downtime is not an acceptable recovery option
 in this context, which means the FS should remain available for the
 applications to perform reads and writes.

 -o remount was one option which I had skipped for some time, as
 theoretically it cannot solve the problem (bringing back the missing
 device) either; but now I have verified it, and indeed it does not.

 There is no record of what the original idea was for performing this
 basic volume-manager step.

Thanks, Anand



Re: [PATCH v2] btrfs: handle dynamically reappearing missing device

2017-11-20 Thread Anand Jain



On 11/21/2017 03:28 AM, Goffredo Baroncelli wrote:

On 11/20/2017 09:19 AM, Anand Jain wrote:



On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:

On 11/17/2017 01:36 PM, Anand Jain wrote:

If the device is not present at the time of the (-o degraded) mount,
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted; the device is
then included in the device list, but it misses the open_device part.
So this patch handles that case by going through the open_device steps
which this device missed, and finally adds it to the device alloc list.


What happens if the first device got writes before the "last device" is joined?

Supposing we have a raid1 pair of devices: sda, sdb

- sda disappears (USB detached?)
- the filesystem is mounted with
 mount -o degraded /dev/sdb
- some writes happen on /dev/sdb
- the user reattaches /dev/sda
- udev runs "btrfs dev scan"
- the system joins /dev/sda to the filesystem's disk pool

Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data)
or /dev/sda (old data), or worse, read something from the former and something
from the latter (metadata from sda and data from sdb)?


  Thanks for the test scenario. This case is fine: reads come from
  the disk having the highest generation number, so we pick sdb in the
  case above.
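
  To make the generation point concrete, here is a minimal illustrative
  sketch in the style of the fs/btrfs code (it is not part of this patch
  and the helper name is made up; btrfs_super_generation() is the
  existing accessor for the superblock generation field): a device that
  missed commits while detached carries an older generation in its
  superblock than the device the degraded mount kept writing to.

  #include "ctree.h"  /* struct btrfs_super_block, btrfs_super_generation() */

  /* Hypothetical helper, for illustration only. */
  static bool reappearing_device_is_stale(struct btrfs_super_block *old_sb,
                                          struct btrfs_super_block *latest_sb)
  {
          /* The detached device stopped committing, so its generation lags. */
          return btrfs_super_generation(old_sb) <
                 btrfs_super_generation(latest_sb);
  }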


Ok, so in this case what is the benefit of adding the disk?
With a lower generation number, will the added disk be used at all?


 It will be used for new writes; they will stripe across both disks.
 Balance is only needed for the older single chunks, which in the long
 term would go away once we are able to create degraded raid1 chunks.
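
 As a minimal sketch of what "older single chunks" means here (this is
 not part of the patch; the helper name is made up, while
 BTRFS_BLOCK_GROUP_PROFILE_MASK is the existing profile mask in the
 btrfs headers): block groups allocated while the FS was degraded fall
 back to the single profile, i.e. they have no redundancy profile bits
 set in their flags, and those are exactly the chunks a later balance
 would convert back to raid1.

 /* Hypothetical helper, for illustration only. */
 static inline bool chunk_was_allocated_degraded(u64 block_group_flags)
 {
         /* "single" profile means no RAID/DUP profile bit is set. */
         return (block_group_flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0;
 }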


In this case, would an explicit user intervention be better than an
automatic one?


  Without this patch the disk already joins the raid group.

  But there was a bug: it missed joining the alloc group.

Thanks, Anand




Thanks, Anand



Am I missing something ?

BR
G.Baroncelli









Re: [PATCH v2] btrfs: handle dynamically reappearing missing device

2017-11-20 Thread Goffredo Baroncelli
On 11/20/2017 09:19 AM, Anand Jain wrote:
> 
> 
> On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:
>> On 11/17/2017 01:36 PM, Anand Jain wrote:
>>> If the device is not present at the time of the (-o degraded) mount,
>>> the mount context will create a dummy missing struct btrfs_device.
>>> Later this device may reappear after the FS is mounted; the device is
>>> then included in the device list, but it misses the open_device part.
>>> So this patch handles that case by going through the open_device steps
>>> which this device missed, and finally adds it to the device alloc list.
>>
>> What happens if the first device got writes before the "last device" is joined?
>>
>> Supposing we have a raid1 pair of devices: sda, sdb
>>
>> - sda disappears (USB detached?)
>> - the filesystem is mounted with
>> mount -o degraded /dev/sdb
>> - some writes happen on /dev/sdb
>> - the user reattaches /dev/sda
>> - udev runs "btrfs dev scan"
>> - the system joins /dev/sda to the filesystem's disk pool
>>
>> Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated
>> data) or /dev/sda (old data), or worse, read something from the former and
>> something from the latter (metadata from sda and data from sdb)?
> 
>  Thanks for the test scenario. This case is fine: reads come from
>  the disk having the highest generation number, so we pick sdb in the
>  case above.

Ok, so in this case what is the benefit of adding the disk? With a lower
generation number, will the added disk be used at all?
In this case, would an explicit user intervention be better than an
automatic one?


> 
> Thanks, Anand
> 
> 
>> Am I missing something ?
>>
>> BR
>> G.Baroncelli
>>
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: [PATCH v2] btrfs: handle dynamically reappearing missing device

2017-11-20 Thread Anand Jain



On 11/18/2017 09:52 PM, Goffredo Baroncelli wrote:

On 11/17/2017 01:36 PM, Anand Jain wrote:

If the device is not present at the time of the (-o degraded) mount,
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted; the device is
then included in the device list, but it misses the open_device part.
So this patch handles that case by going through the open_device steps
which this device missed, and finally adds it to the device alloc list.


What happens if the first device got writes before the "last device" is joined?

Supposing we have a raid1 pair of devices: sda, sdb

- sda disappears (USB detached?)
- the filesystem is mounted with
 mount -o degraded /dev/sdb
- some writes happen on /dev/sdb
- the user reattaches /dev/sda
- udev runs "btrfs dev scan"
- the system joins /dev/sda to the filesystem's disk pool

Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data)
or /dev/sda (old data), or worse, read something from the former and something
from the latter (metadata from sda and data from sdb)?


 Thanks for the test scenario. This case is fine: reads come from
 the disk having the highest generation number, so we pick sdb in the
 case above.

Thanks, Anand



Am I missing something ?

BR
G.Baroncelli




Re: [PATCH v2] btrfs: handle dynamically reappearing missing device

2017-11-18 Thread Goffredo Baroncelli
On 11/17/2017 01:36 PM, Anand Jain wrote:
> If the device is not present at the time of the (-o degraded) mount,
> the mount context will create a dummy missing struct btrfs_device.
> Later this device may reappear after the FS is mounted; the device is
> then included in the device list, but it misses the open_device part.
> So this patch handles that case by going through the open_device steps
> which this device missed, and finally adds it to the device alloc list.

What happens if the first device got writes before the "last device" is joined?

Supposing we have a raid1 pair of devices: sda, sdb

- sda disappears (USB detached?)
- the filesystem is mounted with
 mount -o degraded /dev/sdb
- some writes happen on /dev/sdb
- the user reattaches /dev/sda
- udev runs "btrfs dev scan"
- the system joins /dev/sda to the filesystem's disk pool

Because the filesystem is a raid1, btrfs may read from /dev/sdb (updated data)
or /dev/sda (old data), or worse, read something from the former and something
from the latter (metadata from sda and data from sdb)?

Am I missing something ?

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


[PATCH v2] btrfs: handle dynamically reappearing missing device

2017-11-17 Thread Anand Jain
If the device is not present at the time of the (-o degraded) mount,
the mount context will create a dummy missing struct btrfs_device.
Later this device may reappear after the FS is mounted; the device is
then included in the device list, but it misses the open_device part.
So this patch handles that case by going through the open_device steps
which this device missed, and finally adds it to the device alloc list.

So now, with this patch, to bring back the missing device the user can run:

   btrfs dev scan 

Without this kernel patch, 'btrfs fi show' and 'btrfs dev ready' would
tell you that the missing device has reappeared successfully, but in the
kernel FS layer it actually hasn't.

Signed-off-by: Anand Jain 
---
This patch needs:
 [PATCH 0/4]  factor __btrfs_open_devices()

v2:
Add more comments.
Add more change log.
Add a check whether device missing is set, to handle the case where
dev open fails and the user reruns the dev scan.

 fs/btrfs/volumes.c | 63 +++---
 1 file changed, 60 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5bd73edc2602..b3224baa6db4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -725,7 +725,9 @@ static noinline int device_list_add(const char *path,
 
ret = 1;
device->fs_devices = fs_devices;
-   } else if (!device->name || strcmp(device->name->str, path)) {
+   } else if (!device->name ||
+   strcmp(device->name->str, path) ||
+   device->missing) {
/*
 * When FS is already mounted.
 * 1. If you are here and if the device->name is NULL that
@@ -769,8 +771,63 @@ static noinline int device_list_add(const char *path,
rcu_string_free(device->name);
rcu_assign_pointer(device->name, name);
if (device->missing) {
-   fs_devices->missing_devices--;
-   device->missing = 0;
+   int ret;
+   struct btrfs_fs_info *fs_info = fs_devices->fs_info;
+   fmode_t fmode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
+
+   if (btrfs_super_flags(disk_super) &
+   BTRFS_SUPER_FLAG_SEEDING)
+   fmode &= ~FMODE_WRITE;
+
+   /*
+* Missing can be set only when FS is mounted.
+* So here its always fs_devices->opened > 0 and most
+* of the struct device members are already updated by
+* the mount process even if this device was missing, so
+* now follow the normal open device procedure for this
+* device. The scrub will take care of filling the
+* missing stripes for raid56 and balance for raid1 and
+* raid10.
+*/
+   ASSERT(fs_devices->opened);
+   mutex_lock(&fs_devices->device_list_mutex);
+   mutex_lock(&fs_info->chunk_mutex);
+   /*
+* Not failing here even though btrfs_open_one_device()
+* could fail keeps the original behavior as it is
+* for now. That is a fixme for later.
+*/
+   ret = btrfs_open_one_device(fs_devices, device, fmode,
+   fs_info->bdev_holder);
+   if (!ret) {
+   /*
+* Make sure missing is set to 0 only when
+* bdev != NULL, as is expected in most places
+* in the code. Further, if the user fixes the
+* dev and reruns the dev scan, we need to come
+* here again to reset missing.
+*/
+   fs_devices->missing_devices--;
+   device->missing = 0;
+   btrfs_clear_opt(fs_info->mount_opt, DEGRADED);
+   btrfs_warn(fs_info,
+   "BTRFS: device %s devid %llu uuid %pU 
joined\n",
+   path, devid, device->uuid);
+   }
+
+   if (device->writeable &&
+   !device->is_tgtdev_for_dev_replace) {
+   fs_devices->total_rw_bytes +=
+   device->total_bytes;
+   atomic64_add(device->total_bytes -
+