Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread Anand Jain



On 04/21/2018 10:43 AM, Qu Wenruo wrote:



On 2018年04月21日 10:38, Anand Jain wrote:



On 04/20/2018 11:15 PM, David Sterba wrote:

On Fri, Apr 20, 2018 at 10:32:03PM +0800, Anand Jain wrote:



On 04/19/2018 05:38 PM, Qu Wenruo wrote:

Although we have already checked incompat flags manually before really
mounting it, we could still enhance btrfs_check_super_valid() to check
incompat flags for later write time super block validation check.

This patch adds such incompat flags check for
btrfs_check_super_valid(),
currently it won't be triggered, but provides the basis for later write
time check.

Signed-off-by: Qu Wenruo 
---
    fs/btrfs/disk-io.c | 13 +
    1 file changed, 13 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 60caa68c3618..ec123158f051 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4104,6 +4104,19 @@ static int btrfs_check_super_valid(struct
btrfs_fs_info *fs_info)
    ret = -EINVAL;
    }
    +    /*
+ * Before calling btrfs_check_super_valid() we have already
checked
+ * incompat flags. So if we developr new incompat flags, it's
must be
+ * some corruption.
+ */
+    if (btrfs_super_incompat_flags(sb) &
~BTRFS_FEATURE_INCOMPAT_SUPP) {
+    btrfs_err(fs_info,
+    "corrupted incompat flags detected 0x%llx, supported 0x%llx",



2707 features = btrfs_super_incompat_flags(disk_super) &
2708 ~BTRFS_FEATURE_INCOMPAT_SUPP;
2709 if (features) {
2710 btrfs_err(fs_info,
2711 "cannot mount because of unsupported
optional features (%llx)",
2712 features);
2713 err = -EINVAL;
2714 goto fail_alloc;
2715 }


So there's a "user experience" change, now that you pointed out the
other check.


  Its a regression.


If a filesystem with incompat bits set will be mounted,
this will say 'you have corrupted filesystem', which is not IMO what we
want to tell.

Tough the extended btrfs_check_super_valid could catch the corrupted
incompat bits, what we need at mount time is wording from the 2nd
message ("cannot mount unsuppported features").

I think that there should be a separate function that does the
pre-commit checks, calls btrfs_check_super_valid and also validates the
incompat bit.


  We can still print it as 'unsupported optional features', which would
  imply corruption in the non-mount context.


In that case such wording is not obvious enough to info it's super block
corruption.


 If the device is already mounted. Then its a corruption.

Thanks, Anand


So I still prefer current way of checking incompact flags and csum types
manually at mount time before btrfs_validate_super().

David's idea of a separate function to do mount time check and gives
better prompt is pretty good.
Especially for incompat features, as incompat features will increase in
the future and for older kernel info such feature as corruption is not
acceptable.

But currently, there are only 2 members we need to check at mount time
(csum_type and incompat features), and only one caller, current code
looks good enough for its purpose.

Thanks,
Qu



Thanks, Anand


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread Qu Wenruo


On 2018年04月21日 10:38, Anand Jain wrote:
> 
> 
> On 04/20/2018 11:15 PM, David Sterba wrote:
>> On Fri, Apr 20, 2018 at 10:32:03PM +0800, Anand Jain wrote:
>>>
>>>
>>> On 04/19/2018 05:38 PM, Qu Wenruo wrote:
 Although we have already checked incompat flags manually before really
 mounting it, we could still enhance btrfs_check_super_valid() to check
 incompat flags for later write time super block validation check.

 This patch adds such incompat flags check for
 btrfs_check_super_valid(),
 currently it won't be triggered, but provides the basis for later write
 time check.

 Signed-off-by: Qu Wenruo 
 ---
    fs/btrfs/disk-io.c | 13 +
    1 file changed, 13 insertions(+)

 diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
 index 60caa68c3618..ec123158f051 100644
 --- a/fs/btrfs/disk-io.c
 +++ b/fs/btrfs/disk-io.c
 @@ -4104,6 +4104,19 @@ static int btrfs_check_super_valid(struct
 btrfs_fs_info *fs_info)
    ret = -EINVAL;
    }
    +    /*
 + * Before calling btrfs_check_super_valid() we have already
 checked
 + * incompat flags. So if we developr new incompat flags, it's
 must be
 + * some corruption.
 + */
 +    if (btrfs_super_incompat_flags(sb) &
 ~BTRFS_FEATURE_INCOMPAT_SUPP) {
 +    btrfs_err(fs_info,
 +    "corrupted incompat flags detected 0x%llx, supported 0x%llx",
>>
>>> 2707 features = btrfs_super_incompat_flags(disk_super) &
>>> 2708 ~BTRFS_FEATURE_INCOMPAT_SUPP;
>>> 2709 if (features) {
>>> 2710 btrfs_err(fs_info,
>>> 2711 "cannot mount because of unsupported
>>> optional features (%llx)",
>>> 2712 features);
>>> 2713 err = -EINVAL;
>>> 2714 goto fail_alloc;
>>> 2715 }
>>
>> So there's a "user experience" change, now that you pointed out the
>> other check. 
> 
>  Its a regression.
> 
>> If a filesystem with incompat bits set will be mounted,
>> this will say 'you have corrupted filesystem', which is not IMO what we
>> want to tell.
>>
>> Tough the extended btrfs_check_super_valid could catch the corrupted
>> incompat bits, what we need at mount time is wording from the 2nd
>> message ("cannot mount unsuppported features").
>>
>> I think that there should be a separate function that does the
>> pre-commit checks, calls btrfs_check_super_valid and also validates the
>> incompat bit.
> 
>  We can still print it as 'unsupported optional features', which would
>  imply corruption in the non-mount context.

In that case such wording is not obvious enough to info it's super block
corruption.

So I still prefer current way of checking incompact flags and csum types
manually at mount time before btrfs_validate_super().

David's idea of a separate function to do mount time check and gives
better prompt is pretty good.
Especially for incompat features, as incompat features will increase in
the future and for older kernel info such feature as corruption is not
acceptable.

But currently, there are only 2 members we need to check at mount time
(csum_type and incompat features), and only one caller, current code
looks good enough for its purpose.

Thanks,
Qu

> 
> Thanks, Anand
> 
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread Anand Jain



On 04/20/2018 11:15 PM, David Sterba wrote:

On Fri, Apr 20, 2018 at 10:32:03PM +0800, Anand Jain wrote:



On 04/19/2018 05:38 PM, Qu Wenruo wrote:

Although we have already checked incompat flags manually before really
mounting it, we could still enhance btrfs_check_super_valid() to check
incompat flags for later write time super block validation check.

This patch adds such incompat flags check for btrfs_check_super_valid(),
currently it won't be triggered, but provides the basis for later write
time check.

Signed-off-by: Qu Wenruo 
---
   fs/btrfs/disk-io.c | 13 +
   1 file changed, 13 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 60caa68c3618..ec123158f051 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4104,6 +4104,19 @@ static int btrfs_check_super_valid(struct btrfs_fs_info 
*fs_info)
ret = -EINVAL;
}
   
+	/*

+* Before calling btrfs_check_super_valid() we have already checked
+* incompat flags. So if we developr new incompat flags, it's must be
+* some corruption.
+*/
+   if (btrfs_super_incompat_flags(sb) & ~BTRFS_FEATURE_INCOMPAT_SUPP) {
+   btrfs_err(fs_info,
+   "corrupted incompat flags detected 0x%llx, supported 0x%llx",



2707 features = btrfs_super_incompat_flags(disk_super) &
2708 ~BTRFS_FEATURE_INCOMPAT_SUPP;
2709 if (features) {
2710 btrfs_err(fs_info,
2711 "cannot mount because of unsupported optional features 
(%llx)",
2712 features);
2713 err = -EINVAL;
2714 goto fail_alloc;
2715 }


So there's a "user experience" change, now that you pointed out the
other check. 


 Its a regression.


If a filesystem with incompat bits set will be mounted,
this will say 'you have corrupted filesystem', which is not IMO what we
want to tell.

Tough the extended btrfs_check_super_valid could catch the corrupted
incompat bits, what we need at mount time is wording from the 2nd
message ("cannot mount unsuppported features").

I think that there should be a separate function that does the
pre-commit checks, calls btrfs_check_super_valid and also validates the
incompat bit.


 We can still print it as 'unsupported optional features', which would
 imply corruption in the non-mount context.

Thanks, Anand


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Directory entry not persisted on a fsync

2018-04-20 Thread Jayashree Mohan
Hi,

We came across a scenario where inspite of fsync-ing the directory,
the entry was not persisted - the file created under this directory
was lost.

Consider the following workload :

1. creat test/foo
2. mkdir test/A
3. creat test/A/foo
4. fsync test/A/foo
5. fsync test
-crash-

When we recover after the crash, the contents in the directory are as follows:
dir test:
A

dir test/A:
foo

Notice that file foo that was created in step 1 above, is lost inspite
of calling a fsync on its parent directory. On all other
filesystems(ext4, xfs, and f2fs), we see file foo persisted in the
test directory. We expect directory entries to be persisted when the
directory inode is fsynced right? Losing file foo doesn't seem to be
the right behavior.


Thanks,
Jayashree Mohan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: add chattr support for send/receive

2018-04-20 Thread Howard McLauchlan
On 04/18/2018 01:15 AM, Filipe Manana wrote:
> On Wed, Apr 18, 2018 at 12:39 AM, Howard McLauchlan  
> wrote:
>> Presently btrfs send/receive does not propagate inode attribute flags;
>> all chattr operations are effectively discarded upon transmission.
>>
>> This patch adds kernel support for inode attribute flags. Userspace
>> support can be found under the commit:
>>
>> btrfs-progs: add chattr support for send/receive
>>
>> An associated xfstest can be found at:
>>
>> btrfs: add verify chattr support for send/receive test
>>
>> A caveat is that a user with an updated kernel (aware of chattrs) and an
>> older version of btrfs-progs (unaware of chattrs) will fail to receive
>> if a chattr is included in the send stream.
> So we do have several things missing in send besides attribute flags,
> like hole punching for example (there's a list on the wiki).
> We can't just add a new command and introduce such caveat every time
> we implement one of the missing and desired features.
>
> In 2014, while wanting to implement some of those features, I
> introduced a way to bump the send stream version with room (commands)
> for all those missing features, so that all could be implemented later
> without adding further backward incompatibility between kernel
> versions btrfs-progs versions.
> Some of the threads for reference:
>
> https://patchwork.kernel.org/patch/4021491/
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_linux-2Dbtrfs_msg35169.html=DwIFaQ=5VD0RTtNlTh3ycd41b3MUw=UA4c4GV4shA70-jKB4kwcF99U6K6bzKVYdicFvu-DtQ=pWSiTBXI54TJfXnctkAeLJUbyIs9VZqupmGKs8JsETE=rsgeLq1QeHGaRs6xIXtGcGV-UgjCuqnWMATCw_KaxY8=
>
> It never took off, and honestly I don't remember why as no one add
> more comments on the latest versions of the kernel and btrfs-progs
> patchsets.
Thanks for the heads up, I'll go dig up your patches and try to get them 
working on 4.17

Howard

>
>> Signed-off-by: Howard McLauchlan 
>> ---
>> Based on 4.17-rc1
>>
>>  fs/btrfs/ctree.h |   2 +
>>  fs/btrfs/ioctl.c |   2 +-
>>  fs/btrfs/send.c  | 176 +++
>>  fs/btrfs/send.h  |   2 +
>>  4 files changed, 154 insertions(+), 28 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 5474ef14d6e6..a0dc6a8a37eb 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -1436,6 +1436,8 @@ struct btrfs_map_token {
>> unsigned long offset;
>>  };
>>
>> +unsigned int btrfs_flags_to_ioctl(unsigned int flags);
>> +
>>  #define BTRFS_BYTES_TO_BLKS(fs_info, bytes) \
>> ((bytes) >> (fs_info)->sb->s_blocksize_bits)
>>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 632e26d6f7ce..36ce1e589f9e 100644
>> --- a/fs/btrfs/ioctl.c
>> +++ b/fs/btrfs/ioctl.c
>> @@ -106,7 +106,7 @@ static unsigned int btrfs_mask_flags(umode_t mode, 
>> unsigned int flags)
>>  /*
>>   * Export inode flags to the format expected by the FS_IOC_GETFLAGS ioctl.
>>   */
>> -static unsigned int btrfs_flags_to_ioctl(unsigned int flags)
>> +unsigned int btrfs_flags_to_ioctl(unsigned int flags)
>>  {
>> unsigned int iflags = 0;
>>
>> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
>> index 221e5cdb060b..da521a5a1843 100644
>> --- a/fs/btrfs/send.c
>> +++ b/fs/btrfs/send.c
>> @@ -101,6 +101,13 @@ struct send_ctx {
>> u64 cur_inode_last_extent;
>> u64 cur_inode_next_write_offset;
>>
>> +   /*
>> +* state for chattr purposes
>> +*/
>> +   u64 cur_inode_flip_flags;
>> +   u64 cur_inode_receive_flags;
>> +   int receive_flags_valid;
>> +
>> u64 send_progress;
>>
>> struct list_head new_refs;
>> @@ -798,7 +805,7 @@ static int send_rmdir(struct send_ctx *sctx, struct 
>> fs_path *path)
>>   */
>>  static int __get_inode_info(struct btrfs_root *root, struct btrfs_path 
>> *path,
>>   u64 ino, u64 *size, u64 *gen, u64 *mode, u64 *uid,
>> - u64 *gid, u64 *rdev)
>> + u64 *gid, u64 *rdev, u64 *flags)
>>  {
>> int ret;
>> struct btrfs_inode_item *ii;
>> @@ -828,6 +835,8 @@ static int __get_inode_info(struct btrfs_root *root, 
>> struct btrfs_path *path,
>> *gid = btrfs_inode_gid(path->nodes[0], ii);
>> if (rdev)
>> *rdev = btrfs_inode_rdev(path->nodes[0], ii);
>> +   if (flags)
>> +   *flags = btrfs_inode_flags(path->nodes[0], ii);
>>
>> return ret;
>>  }
>> @@ -835,7 +844,7 @@ static int __get_inode_info(struct btrfs_root *root, 
>> struct btrfs_path *path,
>>  static int get_inode_info(struct btrfs_root *root,
>>   u64 ino, u64 *size, u64 *gen,
>>   u64 *mode, u64 *uid, u64 *gid,
>> - u64 *rdev)
>> + u64 *rdev, u64 *flags)
>>  {
>> struct btrfs_path *path;
>> int ret;
>> @@ -844,7 +853,7 @@ 

[PATCH 0/7] Add ioctl to support extended inode flags

2018-04-20 Thread David Sterba
The patchset implements the existing VFS ioctls for reading extended
ioctl flags by btrfs.

There are many flags/attributes/extended/combined, the naming is
confusing, so let's recap what we have:

* generic VFS inode flags (i_flags)
  - S_* namespace
  https://elixir.bootlin.com/linux/v4.17-rc1/source/include/linux/fs.h#L1850
  - FS_IOC_GETFLAGS, FS_IOC_SETFLAGS
  - tools: lsatrr, chattr

* btrfs inode flags, on-disk format, independent of the above, with
  to/from conversions
  https://elixir.bootlin.com/linux/v4.17-rc1/source/fs/btrfs/ctree.h#L1416

* extended attributes, also called XATTR, but they're different entity,
  stored by an inode as key:value pairs
  - tools: getfattr, setfattr

* XFLAGs, another interface to the generic S_* flags, new ioctl added
  because [GS]ETFLAGS is frozen, new bits added, eg. for project quotas
  or DAX, and more that originate in XFS features
  https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/fs.h#L168
  - tools: xfs_io -c lsattr, xfs_io -c chattr

In the future, btrfs will probably get:

- nodefrag -- eg. to disable autodefrag or defrag ioctl
- nosymlinks -- for directories, prevent creating new symlinks
- dax

git://github.com/kdave/btrfs-devel dev/xflags

David Sterba (7):
  btrfs: rename btrfs_update_iflags to reflect which flags it touches
  btrfs: rename btrfs_mask_flags to reflect which flags it touches
  btrfs: rename check_flags to reflect which flags it touches
  btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches
  btrfs: add helpers for FS_XFLAG_* conversion
  btrfs: add FS_IOC_FSGETXATTR ioctl
  btrfs: add FS_IOC_FSSETXATTR ioctl

 fs/btrfs/ctree.h |   2 +-
 fs/btrfs/inode.c |   4 +-
 fs/btrfs/ioctl.c | 185 +--
 3 files changed, 170 insertions(+), 21 deletions(-)

-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/7] btrfs: rename btrfs_update_iflags to reflect which flags it touches

2018-04-20 Thread David Sterba
The btrfs inode flag flavour is now simply called 'inode flags' and the
vfs inode are i_flags. Also rename the internal variable to something
less confusing than 'ip'.

Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h |  2 +-
 fs/btrfs/inode.c |  4 ++--
 fs/btrfs/ioctl.c | 16 
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5474ef14d6e6..08a96e594097 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3235,7 +3235,7 @@ void btrfs_test_inode_set_ops(struct inode *inode);
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 long btrfs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long 
arg);
 int btrfs_ioctl_get_supported_features(void __user *arg);
-void btrfs_update_iflags(struct inode *inode);
+void btrfs_sync_inode_flags_to_i_flags(struct inode *inode);
 int btrfs_is_empty_uuid(u8 *uuid);
 int btrfs_defrag_file(struct inode *inode, struct file *file,
  struct btrfs_ioctl_defrag_range_args *range,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e064c49c9a9a..6a24b159aeeb 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3923,7 +3923,7 @@ static int btrfs_read_locked_inode(struct inode *inode)
break;
}
 
-   btrfs_update_iflags(inode);
+   btrfs_sync_inode_flags_to_i_flags(inode);
return 0;
 
 make_bad:
@@ -6259,7 +6259,7 @@ static void btrfs_inherit_iflags(struct inode *inode, 
struct inode *dir)
BTRFS_I(inode)->flags |= BTRFS_INODE_NODATASUM;
}
 
-   btrfs_update_iflags(inode);
+   btrfs_sync_inode_flags_to_i_flags(inode);
 }
 
 static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9cc4fd25f83d..c3189ebf7c88 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -136,20 +136,20 @@ static unsigned int btrfs_flags_to_ioctl(unsigned int 
flags)
 /*
  * Update inode->i_flags based on the btrfs internal flags.
  */
-void btrfs_update_iflags(struct inode *inode)
+void btrfs_sync_inode_flags_to_i_flags(struct inode *inode)
 {
-   struct btrfs_inode *ip = BTRFS_I(inode);
+   struct btrfs_inode *binode = BTRFS_I(inode);
unsigned int new_fl = 0;
 
-   if (ip->flags & BTRFS_INODE_SYNC)
+   if (binode->flags & BTRFS_INODE_SYNC)
new_fl |= S_SYNC;
-   if (ip->flags & BTRFS_INODE_IMMUTABLE)
+   if (binode->flags & BTRFS_INODE_IMMUTABLE)
new_fl |= S_IMMUTABLE;
-   if (ip->flags & BTRFS_INODE_APPEND)
+   if (binode->flags & BTRFS_INODE_APPEND)
new_fl |= S_APPEND;
-   if (ip->flags & BTRFS_INODE_NOATIME)
+   if (binode->flags & BTRFS_INODE_NOATIME)
new_fl |= S_NOATIME;
-   if (ip->flags & BTRFS_INODE_DIRSYNC)
+   if (binode->flags & BTRFS_INODE_DIRSYNC)
new_fl |= S_DIRSYNC;
 
set_mask_bits(>i_flags,
@@ -317,7 +317,7 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
goto out_drop;
}
 
-   btrfs_update_iflags(inode);
+   btrfs_sync_inode_flags_to_i_flags(inode);
inode_inc_iversion(inode);
inode->i_ctime = current_time(inode);
ret = btrfs_update_inode(trans, root, inode);
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] btrfs: rename btrfs_mask_flags to reflect which flags it touches

2018-04-20 Thread David Sterba
The FS_*_FL flags cannot be easily identified by a variable name prefix
but we still need to recognize them so the 'fsflags' should be closer to
the naming scheme but again the 'fs' part sounds like it's a filesystem
flag. I don't have a better idea for now.

Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c3189ebf7c88..b2c26beffd82 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -93,11 +93,12 @@ static int btrfs_clone(struct inode *src, struct inode 
*inode,
   int no_time_update);
 
 /* Mask out flags that are inappropriate for the given type of inode. */
-static unsigned int btrfs_mask_flags(umode_t mode, unsigned int flags)
+static unsigned int btrfs_mask_fsflags_for_type(struct inode *inode,
+   unsigned int flags)
 {
-   if (S_ISDIR(mode))
+   if (S_ISDIR(inode->i_mode))
return flags;
-   else if (S_ISREG(mode))
+   else if (S_ISREG(inode->i_mode))
return flags & ~FS_DIRSYNC_FL;
else
return flags & (FS_NODUMP_FL | FS_NOATIME_FL);
@@ -218,7 +219,7 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
i_oldflags = inode->i_flags;
mode = inode->i_mode;
 
-   flags = btrfs_mask_flags(inode->i_mode, flags);
+   flags = btrfs_mask_fsflags_for_type(inode, flags);
oldflags = btrfs_flags_to_ioctl(ip->flags);
if ((flags ^ oldflags) & (FS_APPEND_FL | FS_IMMUTABLE_FL)) {
if (!capable(CAP_LINUX_IMMUTABLE)) {
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] btrfs: add helpers for FS_XFLAG_* conversion

2018-04-20 Thread David Sterba
Preparatory work for the FS_IOC_FSGETXATTR ioctl, basic conversions and
checking helpers.

Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f0e6074233fa..a25ad4e6a478 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -338,6 +338,38 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
return ret;
 }
 
+/*
+ * Translate btrfs internal inode flags to xflags as expected by the
+ * FS_IOC_FSGETXATT ioctl. Filter only the supported ones, unknown flags are
+ * silently dropped.
+ */
+static unsigned int btrfs_inode_flags_to_xflags(unsigned int flags)
+{
+   unsigned int xflags = 0;
+
+   if (flags & BTRFS_INODE_APPEND)
+   xflags |= FS_XFLAG_APPEND;
+   if (flags & BTRFS_INODE_IMMUTABLE)
+   xflags |= FS_XFLAG_IMMUTABLE;
+   if (flags & BTRFS_INODE_NOATIME)
+   xflags |= FS_XFLAG_NOATIME;
+   if (flags & BTRFS_INODE_NODUMP)
+   xflags |= FS_XFLAG_NODUMP;
+   if (flags & BTRFS_INODE_SYNC)
+   xflags |= FS_XFLAG_SYNC;
+
+   return xflags;
+}
+
+/* Check if @flags are a supported and valid set of FS_XFLAGS_* flags */
+static int check_xflags(unsigned int flags)
+{
+   if (flags & ~(FS_XFLAG_APPEND | FS_XFLAG_IMMUTABLE | FS_XFLAG_NOATIME |
+ FS_XFLAG_NODUMP | FS_XFLAG_SYNC))
+   return -EOPNOTSUPP;
+   return 0;
+}
+
 static int btrfs_ioctl_getversion(struct file *file, int __user *arg)
 {
struct inode *inode = file_inode(file);
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] btrfs: add FS_IOC_FSGETXATTR ioctl

2018-04-20 Thread David Sterba
The new ioctl is an extension to the FS_IOC_GETFLAGS and adds new
flags and is extensible. This patch allows to return the xflags portion
of the fsxattr structure, other items have no meaning for btrfs or can
be added later.

The original patch was written by Chandan Jay Sharma but was incomplete
and no further revision has been sent. Several cleanups were necessary
to avoid confusion with other ioctls, as we have another flavor of
flags.

Based-on-patches-by: Chandan Jay Sharma 
Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a25ad4e6a478..52b12ab9b82b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -370,6 +370,24 @@ static int check_xflags(unsigned int flags)
return 0;
 }
 
+/*
+ * Set the xflags from the internal inode flags. The remaining items of fsxattr
+ * are zeroed.
+ */
+static int btrfs_ioctl_fsgetxattr(struct file *file, void __user *arg)
+{
+   struct btrfs_inode *binode = BTRFS_I(file_inode(file));
+   struct fsxattr fa;
+
+   memset(, 0, sizeof(fa));
+   fa.fsx_xflags = btrfs_inode_flags_to_xflags(binode->flags);
+
+   if (copy_to_user(arg, , sizeof(fa)))
+   return -EFAULT;
+
+   return 0;
+}
+
 static int btrfs_ioctl_getversion(struct file *file, int __user *arg)
 {
struct inode *inode = file_inode(file);
@@ -5600,6 +5618,8 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_get_features(file, argp);
case BTRFS_IOC_SET_FEATURES:
return btrfs_ioctl_set_features(file, argp);
+   case FS_IOC_FSGETXATTR:
+   return btrfs_ioctl_fsgetxattr(file, argp);
}
 
return -ENOTTY;
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] btrfs: add FS_IOC_FSSETXATTR ioctl

2018-04-20 Thread David Sterba
The new ioctl is an extension to the FS_IOC_SETFLAGS and adds new
flags and is extensible. Don't get fooled by the XATTR in the name, it
does not have anything in common with the extended attributes,
incidentally also abbreviated as XATTRs.

This patch allows to set the xflags portion of the fsxattr structure,
other items have no meaning and non-zero values will result in
EOPNOTSUPP.

Currently supported xflags:

- APPEND
- IMMUTABLE
- NOATIME
- NODUMP
- SYNC

The structure of btrfs_ioctl_fssetxattr copies btrfs_ioctl_setflags but
is simpler on the flag setting side.

The original patch was written by Chandan Jay Sharma but was incomplete
and no further revision has been sent.

Based-on-patches-by: Chandan Jay Sharma 
Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 94 
 1 file changed, 94 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 52b12ab9b82b..4fd61f191bba 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -388,6 +388,98 @@ static int btrfs_ioctl_fsgetxattr(struct file *file, void 
__user *arg)
return 0;
 }
 
+static int btrfs_ioctl_fssetxattr(struct file *file, void __user *arg)
+{
+   struct inode *inode = file_inode(file);
+   struct btrfs_inode *binode = BTRFS_I(inode);
+   struct btrfs_root *root = binode->root;
+   struct btrfs_trans_handle *trans;
+   struct fsxattr fa;
+   unsigned oldflags;
+   unsigned old_i_flags;
+   int ret = 0;
+
+   if (!inode_owner_or_capable(inode))
+   return -EPERM;
+
+   if (btrfs_root_readonly(root))
+   return -EROFS;
+
+   memset(, 0, sizeof(fa));
+   if (copy_from_user(, arg, sizeof(fa)))
+   return -EFAULT;
+
+   ret = check_xflags(fa.fsx_xflags);
+   if (ret)
+   return ret;
+
+   if (fa.fsx_extsize != 0 || fa.fsx_projid != 0 || fa.fsx_cowextsize != 0)
+   return -EOPNOTSUPP;
+
+   ret = mnt_want_write_file(file);
+   if (ret)
+   return ret;
+
+   inode_lock(inode);
+
+   oldflags = binode->flags;
+   old_i_flags = inode->i_flags;
+
+   /* We need the capabilities to change append-only or immutable inode */
+   if (((oldflags & (BTRFS_INODE_APPEND | BTRFS_INODE_IMMUTABLE)) ||
+(fa.fsx_xflags & (FS_XFLAG_APPEND | FS_XFLAG_IMMUTABLE))) &&
+   !capable(CAP_LINUX_IMMUTABLE)) {
+   ret = -EPERM;
+   goto out_unlock;
+   }
+
+   if (fa.fsx_xflags & FS_XFLAG_SYNC)
+   binode->flags |= BTRFS_INODE_SYNC;
+   else
+   binode->flags &= ~BTRFS_INODE_SYNC;
+   if (fa.fsx_xflags & FS_XFLAG_IMMUTABLE)
+   binode->flags |= BTRFS_INODE_IMMUTABLE;
+   else
+   binode->flags &= ~BTRFS_INODE_IMMUTABLE;
+   if (fa.fsx_xflags & FS_XFLAG_APPEND)
+   binode->flags |= BTRFS_INODE_APPEND;
+   else
+   binode->flags &= ~BTRFS_INODE_APPEND;
+   if (fa.fsx_xflags & FS_XFLAG_NODUMP)
+   binode->flags |= BTRFS_INODE_NODUMP;
+   else
+   binode->flags &= ~BTRFS_INODE_NODUMP;
+   if (fa.fsx_xflags & FS_XFLAG_NOATIME)
+   binode->flags |= BTRFS_INODE_NOATIME;
+   else
+   binode->flags &= ~BTRFS_INODE_NOATIME;
+
+   /* 1 item for the inode */
+   trans = btrfs_start_transaction(root, 1);
+   if (IS_ERR(trans)) {
+   ret = PTR_ERR(trans);
+   goto out_drop;
+   }
+
+   btrfs_sync_inode_flags_to_i_flags(inode);
+   inode_inc_iversion(inode);
+   inode->i_ctime = current_time(inode);
+   ret = btrfs_update_inode(trans, root, inode);
+
+   btrfs_end_transaction(trans);
+
+   if (ret) {
+   binode->flags = oldflags;
+   inode->i_flags = old_i_flags;
+   }
+
+out_unlock:
+   inode_unlock(inode);
+out_drop:
+   mnt_drop_write_file(file);
+   return ret;
+}
+
 static int btrfs_ioctl_getversion(struct file *file, int __user *arg)
 {
struct inode *inode = file_inode(file);
@@ -5620,6 +5712,8 @@ long btrfs_ioctl(struct file *file, unsigned int
return btrfs_ioctl_set_features(file, argp);
case FS_IOC_FSGETXATTR:
return btrfs_ioctl_fsgetxattr(file, argp);
+   case FS_IOC_FSSETXATTR:
+   return btrfs_ioctl_fssetxattr(file, argp);
}
 
return -ENOTTY;
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/7] btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches

2018-04-20 Thread David Sterba
Converts btrfs_inode::flags to the FS_*_FL flags.

Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 953473f2a136..f0e6074233fa 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -105,9 +105,10 @@ static unsigned int btrfs_mask_fsflags_for_type(struct 
inode *inode,
 }
 
 /*
- * Export inode flags to the format expected by the FS_IOC_GETFLAGS ioctl.
+ * Export internal inode flags to the format expected by the FS_IOC_GETFLAGS
+ * ioctl.
  */
-static unsigned int btrfs_flags_to_ioctl(unsigned int flags)
+static unsigned int btrfs_inode_flags_to_fsflags(unsigned int flags)
 {
unsigned int iflags = 0;
 
@@ -161,7 +162,7 @@ void btrfs_sync_inode_flags_to_i_flags(struct inode *inode)
 static int btrfs_ioctl_getflags(struct file *file, void __user *arg)
 {
struct btrfs_inode *ip = BTRFS_I(file_inode(file));
-   unsigned int flags = btrfs_flags_to_ioctl(ip->flags);
+   unsigned int flags = btrfs_inode_flags_to_fsflags(ip->flags);
 
if (copy_to_user(arg, , sizeof(flags)))
return -EFAULT;
@@ -221,7 +222,7 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
mode = inode->i_mode;
 
flags = btrfs_mask_fsflags_for_type(inode, flags);
-   oldflags = btrfs_flags_to_ioctl(ip->flags);
+   oldflags = btrfs_inode_flags_to_fsflags(ip->flags);
if ((flags ^ oldflags) & (FS_APPEND_FL | FS_IMMUTABLE_FL)) {
if (!capable(CAP_LINUX_IMMUTABLE)) {
ret = -EPERM;
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/7] btrfs: rename check_flags to reflect which flags it touches

2018-04-20 Thread David Sterba
The FS_*_FL flags cannot be easily identified by a prefix but we still
need to recognize them so the 'fsflags' should be closer to the naming
scheme but again the 'fs' part sounds like it's a filesystem flag. I
don't have a better idea for now.

Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index b2c26beffd82..953473f2a136 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -168,7 +168,8 @@ static int btrfs_ioctl_getflags(struct file *file, void 
__user *arg)
return 0;
 }
 
-static int check_flags(unsigned int flags)
+/* Check if @flags are a supported and valid set of FS_*_FL flags */
+static int check_fsflags(unsigned int flags)
 {
if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
  FS_NOATIME_FL | FS_NODUMP_FL | \
@@ -205,7 +206,7 @@ static int btrfs_ioctl_setflags(struct file *file, void 
__user *arg)
if (copy_from_user(, arg, sizeof(flags)))
return -EFAULT;
 
-   ret = check_flags(flags);
+   ret = check_fsflags(flags);
if (ret)
return ret;
 
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread David Sterba
On Fri, Apr 20, 2018 at 10:32:03PM +0800, Anand Jain wrote:
> 
> 
> On 04/19/2018 05:38 PM, Qu Wenruo wrote:
> > Although we have already checked incompat flags manually before really
> > mounting it, we could still enhance btrfs_check_super_valid() to check
> > incompat flags for later write time super block validation check.
> > 
> > This patch adds such incompat flags check for btrfs_check_super_valid(),
> > currently it won't be triggered, but provides the basis for later write
> > time check.
> > 
> > Signed-off-by: Qu Wenruo 
> > ---
> >   fs/btrfs/disk-io.c | 13 +
> >   1 file changed, 13 insertions(+)
> > 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 60caa68c3618..ec123158f051 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -4104,6 +4104,19 @@ static int btrfs_check_super_valid(struct 
> > btrfs_fs_info *fs_info)
> > ret = -EINVAL;
> > }
> >   
> > +   /*
> > +* Before calling btrfs_check_super_valid() we have already checked
> > +* incompat flags. So if we developr new incompat flags, it's must be
> > +* some corruption.
> > +*/
> > +   if (btrfs_super_incompat_flags(sb) & ~BTRFS_FEATURE_INCOMPAT_SUPP) {
> > +   btrfs_err(fs_info,
> > +   "corrupted incompat flags detected 0x%llx, supported 0x%llx",

> 2707 features = btrfs_super_incompat_flags(disk_super) &
> 2708 ~BTRFS_FEATURE_INCOMPAT_SUPP;
> 2709 if (features) {
> 2710 btrfs_err(fs_info,
> 2711 "cannot mount because of unsupported optional 
> features (%llx)",
> 2712 features);
> 2713 err = -EINVAL;
> 2714 goto fail_alloc;
> 2715 }

So there's a "user experience" change, now that you pointed out the
other check. If a filesystem with incompat bits set will be mounted,
this will say 'you have corrupted filesystem', which is not IMO what we
want to tell.

Tough the extended btrfs_check_super_valid could catch the corrupted
incompat bits, what we need at mount time is wording from the 2nd
message ("cannot mount unsuppported features").

I think that there should be a separate function that does the
pre-commit checks, calls btrfs_check_super_valid and also validates the
incompat bit.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Add ioctl to clear unused space

2018-04-20 Thread Austin S. Hemmelgarn

On 2018-04-20 10:21, David Sterba wrote:

This patchset adds new ioctl similar to TRIM, that provides several
other ways how to clear the unused space.  The changelogs are
incomplete, for preview not for inclusion yet.

+1 for the idea.  This will be insanely useful for certain VM setups.


It compiles and has been tested lightly, the clearing modes depend on hw
capabilities (secure discard, sector unmapping instead of zeros), so
I've tested only zeroing and discard.
I don't have specific experience with blkdev_issue_zeroout, but I think 
the sector unmapping may be covered by the discard support.  When 
dealing with SCSI devices, discard operations get converted to the SCSI 
UNMAP command, which just based on the naming, sounds like what 
blkdev_issue_zeroout is issuing (at least, when dealing with SCSI 
devices).  Pretty much, the SCSI command works just like the ATA TRIM 
command does on ATA devices which zero data on discard (except it's 
SCSI, so it can pretty much always be queued, and has some other 
semantics around it that give it better performance than ATA TRIM).


Assuming that is the case, the unmapping support could be tested using 
the QEMU PV-SCSI driver (which supports the UNMAP command) with 
appropriate backing storage.  I can look at doing this testing myself, 
as I already use PV-SCSI for storage on all my QEMU VM's.


I personally think the zeroing has a usecase and the other modes were
easy to add. Further extensions can be considered, eg. WRITE_SAME,
overwriting with a randomly generated pattern, or some filesystem canary
patterns that can be used to report unused block read as metadata.
Zeroing does have a use-case.  It's wonderfully useful for handling 
compaction of VM disks when using a virtual machine manager that doesn't 
support discard operations for this purpose.  I can also see the secure 
discard support being useful (though not necessarily the sector 
unmapping support for reasons mentioned above).


As this is modelled after the generic FITRIM ioctl, so this could be a
new generic ioctl too. However, last time somebody wanted such ioctl,
there was a pushback. I'll consider making a generic version and send it
for comments to fsdevel eventually.

git://github.com/kdave/btrfs-devel dev/zero-free


David Sterba (6):
   btrfs: extend trim callchain to pass the operation type
   btrfs: add wrapper to switch clearing operation
   btrfs: add zeroing clear operation
   btrfs: add new ioctl BTRFS_IOC_CLEAR_FREE
   btrfs: add discard secure to clear unused space
   btrfs: add more zeroout modes for clearing unused space

  fs/btrfs/ctree.h|   5 +-
  fs/btrfs/extent-tree.c  | 130 ++--
  fs/btrfs/free-space-cache.c |  22 +---
  fs/btrfs/free-space-cache.h |   3 +-
  fs/btrfs/ioctl.c|  42 ++
  include/uapi/linux/btrfs.h  |  26 +
  6 files changed, 199 insertions(+), 29 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] btrfs: Do super block verification before writing it to disk

2018-04-20 Thread Anand Jain






@@ -3563,6 +3565,12 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int 
max_mirrors)
sb = fs_info->super_for_commit;
dev_item = >dev_item;
  
+	if (btrfs_check_super_valid(fs_info, sb, -1)) {

+   btrfs_err(fs_info,
+   "superblock corruption detected before transaction commitment");
+   return -EUCLEAN;
+   }
+
mutex_lock(_info->fs_devices->device_list_mutex);
head = _info->fs_devices->devices;
max_errors = btrfs_super_num_devices(fs_info->super_copy) - 1;


 With patch 1/3 (incompat feature checks) now this set as a
 whole address my concern that I explained earlier. I am ok with
 this approach.

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread Anand Jain



On 04/19/2018 05:38 PM, Qu Wenruo wrote:

Although we have already checked incompat flags manually before really
mounting it, we could still enhance btrfs_check_super_valid() to check
incompat flags for later write time super block validation check.

This patch adds such incompat flags check for btrfs_check_super_valid(),
currently it won't be triggered, but provides the basis for later write
time check.

Signed-off-by: Qu Wenruo 
---
  fs/btrfs/disk-io.c | 13 +
  1 file changed, 13 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 60caa68c3618..ec123158f051 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4104,6 +4104,19 @@ static int btrfs_check_super_valid(struct btrfs_fs_info 
*fs_info)
ret = -EINVAL;
}
  
+	/*

+* Before calling btrfs_check_super_valid() we have already checked
+* incompat flags. So if we developr new incompat flags, it's must be
+* some corruption.
+*/
+   if (btrfs_super_incompat_flags(sb) & ~BTRFS_FEATURE_INCOMPAT_SUPP) {
+   btrfs_err(fs_info,
+   "corrupted incompat flags detected 0x%llx, supported 0x%llx",
+ btrfs_super_incompat_flags(sb),
+ BTRFS_FEATURE_INCOMPAT_SUPP);
+   ret = -EINVAL;
+   }
+
/*
 * The generation is a global counter, we'll trust it more than the 
others
 * but it's still possible that it's the one that's wrong.



This patch should move the existing check in the open_ctree() line 2707
into the btrfs_check_super_valid(), we don't need duplicate checks.

disk-io.c

2707 features = btrfs_super_incompat_flags(disk_super) &
2708 ~BTRFS_FEATURE_INCOMPAT_SUPP;
2709 if (features) {
2710 btrfs_err(fs_info,
2711 "cannot mount because of unsupported optional 
features (%llx)",

2712 features);
2713 err = -EINVAL;
2714 goto fail_alloc;
2715 }


Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6] btrfs: add new ioctl BTRFS_IOC_CLEAR_FREE

2018-04-20 Thread David Sterba
A new ioctl that will clear the free space (by writing zeros) given by
the user range. Similar to the TRIM ioctl.

We need a new ioctl for that because struct fstrim_range does not
provide any existing or reserved member for extensions. The new ioctl
also supports TRIM as the operation type.

Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h   | 11 ++-
 fs/btrfs/extent-tree.c | 48 ++
 fs/btrfs/ioctl.c   | 42 
 include/uapi/linux/btrfs.h | 20 +++
 4 files changed, 112 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 772cb4ccc5f7..d8cc70a6bef7 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -674,15 +674,6 @@ struct btrfs_stripe_hash {
spinlock_t lock;
 };
 
-/*
- * Type of operation that will be used to clear unused blocks.
- */
-enum btrfs_clear_op_type {
-   BTRFS_CLEAR_OP_DISCARD = 0,
-   BTRFS_CLEAR_OP_ZERO,
-   BTRFS_NR_CLEAR_OP_TYPES,
-};
-
 /* used by the raid56 code to lock stripes for read/modify/write */
 struct btrfs_stripe_hash_table {
struct list_head stripe_cache;
@@ -2800,6 +2791,8 @@ int btrfs_error_unpin_extent_range(struct btrfs_fs_info 
*fs_info,
 int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr,
 u64 num_bytes, u64 *actual_bytes,
 enum btrfs_clear_op_type clear);
+int btrfs_clear_free_space(struct btrfs_root *root,
+   struct btrfs_ioctl_clear_free_args *args);
 int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info, u64 type);
 int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 800aaf45e6bd..21a24fff32dd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -11061,6 +11061,54 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, 
struct fstrim_range *range)
return ret;
 }
 
+int btrfs_clear_free_space(struct btrfs_root *root,
+   struct btrfs_ioctl_clear_free_args *args)
+{
+   struct btrfs_fs_info *fs_info = root->fs_info;
+   struct btrfs_block_group_cache *cache = NULL;
+   u64 group_cleared;
+   u64 start;
+   u64 cleared = 0;
+   u64 end;
+   int ret = 0;
+
+   cache = btrfs_lookup_first_block_group(fs_info, args->start);
+
+   while (cache) {
+   if (cache->key.objectid >= (args->start + args->length)) {
+   btrfs_put_block_group(cache);
+   break;
+   }
+
+   start = max(args->start, cache->key.objectid);
+   end = min(args->start + args->length,
+   cache->key.objectid + cache->key.offset);
+
+   if (end - start >= args->minlen) {
+   if (!block_group_cache_done(cache)) {
+   ret = cache_block_group(cache, 0);
+   if (!ret)
+   wait_block_group_cache_done(cache);
+   }
+   ret = btrfs_trim_block_group(cache, _cleared,
+   start, end, args->minlen,
+   args->type);
+
+   if (ret) {
+   btrfs_put_block_group(cache);
+   break;
+   }
+   cleared += group_cleared;
+   }
+
+   cache = next_block_group(fs_info, cache);
+   }
+
+   args->length = cleared;
+
+   return ret;
+}
+
 /*
  * btrfs_{start,end}_write_no_snapshotting() are similar to
  * mnt_{want,drop}_write(), they are used to prevent some tasks from writing
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9cc4fd25f83d..a56d4fb3ae82 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5429,6 +5429,46 @@ static int _btrfs_ioctl_send(struct file *file, void 
__user *argp, bool compat)
return ret;
 }
 
+static int btrfs_ioctl_clear_free(struct file *file, void __user *arg)
+{
+   struct btrfs_fs_info *fs_info;
+   struct btrfs_ioctl_clear_free_args args;
+   u64 total_bytes;
+   int ret;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EPERM;
+
+   if (copy_from_user(, arg, sizeof(args)))
+   return -EFAULT;
+
+   if (args.type >= BTRFS_NR_CLEAR_OP_TYPES)
+   return -EOPNOTSUPP;
+
+   ret = mnt_want_write_file(file);
+   if (ret)
+   return ret;
+
+   fs_info = btrfs_sb(file_inode(file)->i_sb);
+   total_bytes = btrfs_super_total_bytes(fs_info->super_copy);
+   if (args.start > total_bytes) {
+   ret = -EINVAL;
+   goto out_drop_write;
+   }
+
+   ret = 

[PATCH 6/6] btrfs: add more zeroout modes for clearing unused space

2018-04-20 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/extent-tree.c | 12 +++-
 include/uapi/linux/btrfs.h |  5 +
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 285bace8e2c6..bd2ac5779998 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2046,9 +2046,19 @@ static int btrfs_issue_clear_op(struct block_device 
*bdev, u64 start, u64 size,
case BTRFS_CLEAR_OP_DISCARD:
return blkdev_issue_discard(bdev, start >> 9, size >> 9,
GFP_NOFS, flags);
+   case BTRFS_CLEAR_OP_ZERO_NOUNMAP:
+   flags = BLKDEV_ZERO_NOUNMAP;
+   goto zeroout;
+   case BTRFS_CLEAR_OP_ZERO_NOFALLBACK:
+   flags = BLKDEV_ZERO_NOFALLBACK;
+   goto zeroout;
+   case BTRFS_CLEAR_OP_ZERO_NOUNMAP_NOFALLBACK:
+   flags = BLKDEV_ZERO_NOUNMAP | BLKDEV_ZERO_NOFALLBACK;
+   /* fall through */
case BTRFS_CLEAR_OP_ZERO:
+zeroout:
return blkdev_issue_zeroout(bdev, start >> 9, size >> 9,
-   GFP_NOFS, 0);
+   GFP_NOFS, flags);
default:
return -EOPNOTSUPP;
}
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 1becbb241e53..5d1f13045906 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -744,6 +744,11 @@ enum btrfs_clear_op_type {
BTRFS_CLEAR_OP_DISCARD = 0,
BTRFS_CLEAR_OP_ZERO,
BTRFS_CLEAR_OP_DISCARD_SECURE,
+
+   /* Fine tuning for clearing by zeros, see __blkdev_issue_zeroout */
+   BTRFS_CLEAR_OP_ZERO_NOUNMAP,
+   BTRFS_CLEAR_OP_ZERO_NOFALLBACK,
+   BTRFS_CLEAR_OP_ZERO_NOUNMAP_NOFALLBACK,
BTRFS_NR_CLEAR_OP_TYPES,
 };
 
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6] btrfs: add wrapper to switch clearing operation

2018-04-20 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/extent-tree.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index ffdd3aba508c..b317f8ee42a9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2034,6 +2034,18 @@ static int remove_extent_backref(struct 
btrfs_trans_handle *trans,
return ret;
 }
 
+static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size,
+   enum btrfs_clear_op_type clear)
+{
+   switch (clear) {
+   case BTRFS_CLEAR_OP_DISCARD:
+   return blkdev_issue_discard(bdev, start >> 9, size >> 9,
+   GFP_NOFS, 0);
+   default:
+   return -EOPNOTSUPP;
+   }
+}
+
 #define in_range(b, first, len)((b) >= (first) && (b) < (first) + 
(len))
 static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
   u64 *discarded_bytes,
@@ -2081,13 +2093,8 @@ static int btrfs_issue_discard(struct block_device 
*bdev, u64 start, u64 len,
bytes_left = end - start;
continue;
}
-
if (size) {
-   if (clear == BTRFS_CLEAR_OP_DISCARD)
-   ret = blkdev_issue_discard(bdev, start >> 9,
-   size >> 9, GFP_NOFS, 0);
-   else
-   ret = -EOPNOTSUPP;
+   ret = btrfs_issue_clear_op(bdev, start, size, clear);
if (!ret)
*discarded_bytes += size;
else if (ret != -EOPNOTSUPP)
@@ -2103,11 +2110,7 @@ static int btrfs_issue_discard(struct block_device 
*bdev, u64 start, u64 len,
}
 
if (bytes_left) {
-   if (clear == BTRFS_CLEAR_OP_DISCARD)
-   ret = blkdev_issue_discard(bdev, start >> 9,
-   bytes_left >> 9, GFP_NOFS, 0);
-   else
-   ret = -EOPNOTSUPP;
+   ret = btrfs_issue_clear_op(bdev, start, bytes_left, clear);
if (!ret)
*discarded_bytes += bytes_left;
}
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6] btrfs: add discard secure to clear unused space

2018-04-20 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/extent-tree.c | 14 --
 include/uapi/linux/btrfs.h |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 21a24fff32dd..285bace8e2c6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2037,10 +2037,15 @@ static int remove_extent_backref(struct 
btrfs_trans_handle *trans,
 static int btrfs_issue_clear_op(struct block_device *bdev, u64 start, u64 size,
enum btrfs_clear_op_type clear)
 {
+   int flags = 0;
+
switch (clear) {
+   case BTRFS_CLEAR_OP_DISCARD_SECURE:
+   flags = BLKDEV_DISCARD_SECURE;
+   /* fall through */
case BTRFS_CLEAR_OP_DISCARD:
return blkdev_issue_discard(bdev, start >> 9, size >> 9,
-   GFP_NOFS, 0);
+   GFP_NOFS, flags);
case BTRFS_CLEAR_OP_ZERO:
return blkdev_issue_zeroout(bdev, start >> 9, size >> 9,
GFP_NOFS, 0);
@@ -2129,7 +2134,8 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, 
u64 bytenr,
struct btrfs_bio *bbio = NULL;
int rw;
 
-   if (clear == BTRFS_CLEAR_OP_DISCARD)
+   if (clear == BTRFS_CLEAR_OP_DISCARD ||
+   clear == BTRFS_CLEAR_OP_DISCARD_SECURE)
rw = BTRFS_MAP_DISCARD;
else
rw = BTRFS_MAP_WRITE;
@@ -2160,6 +2166,10 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, 
u64 bytenr,
!blk_queue_discard(req_q))
continue;
 
+   if (clear == BTRFS_CLEAR_OP_DISCARD_SECURE &&
+   !blk_queue_secure_erase(req_q))
+   continue;
+
ret = btrfs_issue_discard(stripe->dev->bdev,
  stripe->physical,
  stripe->length,
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 860f9df01614..1becbb241e53 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -743,6 +743,7 @@ enum btrfs_err_code {
 enum btrfs_clear_op_type {
BTRFS_CLEAR_OP_DISCARD = 0,
BTRFS_CLEAR_OP_ZERO,
+   BTRFS_CLEAR_OP_DISCARD_SECURE,
BTRFS_NR_CLEAR_OP_TYPES,
 };
 
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/6] btrfs: add zeroing clear operation

2018-04-20 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h   | 1 +
 fs/btrfs/extent-tree.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 7cde72683b8e..772cb4ccc5f7 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -679,6 +679,7 @@ struct btrfs_stripe_hash {
  */
 enum btrfs_clear_op_type {
BTRFS_CLEAR_OP_DISCARD = 0,
+   BTRFS_CLEAR_OP_ZERO,
BTRFS_NR_CLEAR_OP_TYPES,
 };
 
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index b317f8ee42a9..800aaf45e6bd 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2041,6 +2041,9 @@ static int btrfs_issue_clear_op(struct block_device 
*bdev, u64 start, u64 size,
case BTRFS_CLEAR_OP_DISCARD:
return blkdev_issue_discard(bdev, start >> 9, size >> 9,
GFP_NOFS, 0);
+   case BTRFS_CLEAR_OP_ZERO:
+   return blkdev_issue_zeroout(bdev, start >> 9, size >> 9,
+   GFP_NOFS, 0);
default:
return -EOPNOTSUPP;
}
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/6] Add ioctl to clear unused space

2018-04-20 Thread David Sterba
This patchset adds new ioctl similar to TRIM, that provides several
other ways how to clear the unused space.  The changelogs are
incomplete, for preview not for inclusion yet.

It compiles and has been tested lightly, the clearing modes depend on hw
capabilities (secure discard, sector unmapping instead of zeros), so
I've tested only zeroing and discard.

I personally think the zeroing has a usecase and the other modes were
easy to add. Further extensions can be considered, eg. WRITE_SAME,
overwriting with a randomly generated pattern, or some filesystem canary
patterns that can be used to report unused block read as metadata.

As this is modelled after the generic FITRIM ioctl, so this could be a
new generic ioctl too. However, last time somebody wanted such ioctl,
there was a pushback. I'll consider making a generic version and send it
for comments to fsdevel eventually.

git://github.com/kdave/btrfs-devel dev/zero-free


David Sterba (6):
  btrfs: extend trim callchain to pass the operation type
  btrfs: add wrapper to switch clearing operation
  btrfs: add zeroing clear operation
  btrfs: add new ioctl BTRFS_IOC_CLEAR_FREE
  btrfs: add discard secure to clear unused space
  btrfs: add more zeroout modes for clearing unused space

 fs/btrfs/ctree.h|   5 +-
 fs/btrfs/extent-tree.c  | 130 ++--
 fs/btrfs/free-space-cache.c |  22 +---
 fs/btrfs/free-space-cache.h |   3 +-
 fs/btrfs/ioctl.c|  42 ++
 include/uapi/linux/btrfs.h  |  26 +
 6 files changed, 199 insertions(+), 29 deletions(-)

-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] btrfs: extend trim callchain to pass the operation type

2018-04-20 Thread David Sterba
This is preparatory work to implement clearing free space by zeroing,
much in the same way the FITRIM ioctl works.  This patch simply defines
a symbolic name for the current clearing operation and passes the
parameter around. No functional change.

Signed-off-by: David Sterba 
---
 fs/btrfs/ctree.h| 11 -
 fs/btrfs/extent-tree.c  | 54 +++--
 fs/btrfs/free-space-cache.c | 22 ++
 fs/btrfs/free-space-cache.h |  3 ++-
 4 files changed, 62 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5474ef14d6e6..7cde72683b8e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -674,6 +674,14 @@ struct btrfs_stripe_hash {
spinlock_t lock;
 };
 
+/*
+ * Type of operation that will be used to clear unused blocks.
+ */
+enum btrfs_clear_op_type {
+   BTRFS_CLEAR_OP_DISCARD = 0,
+   BTRFS_NR_CLEAR_OP_TYPES,
+};
+
 /* used by the raid56 code to lock stripes for read/modify/write */
 struct btrfs_stripe_hash_table {
struct list_head stripe_cache;
@@ -2789,7 +2797,8 @@ u64 btrfs_account_ro_block_groups_free_space(struct 
btrfs_space_info *sinfo);
 int btrfs_error_unpin_extent_range(struct btrfs_fs_info *fs_info,
   u64 start, u64 end);
 int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr,
-u64 num_bytes, u64 *actual_bytes);
+u64 num_bytes, u64 *actual_bytes,
+enum btrfs_clear_op_type clear);
 int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info, u64 type);
 int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 5bdb5636d552..ffdd3aba508c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2036,7 +2036,8 @@ static int remove_extent_backref(struct 
btrfs_trans_handle *trans,
 
 #define in_range(b, first, len)((b) >= (first) && (b) < (first) + 
(len))
 static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
-  u64 *discarded_bytes)
+  u64 *discarded_bytes,
+  enum btrfs_clear_op_type clear)
 {
int j, ret = 0;
u64 bytes_left, end;
@@ -2082,8 +2083,11 @@ static int btrfs_issue_discard(struct block_device 
*bdev, u64 start, u64 len,
}
 
if (size) {
-   ret = blkdev_issue_discard(bdev, start >> 9, size >> 9,
-  GFP_NOFS, 0);
+   if (clear == BTRFS_CLEAR_OP_DISCARD)
+   ret = blkdev_issue_discard(bdev, start >> 9,
+   size >> 9, GFP_NOFS, 0);
+   else
+   ret = -EOPNOTSUPP;
if (!ret)
*discarded_bytes += size;
else if (ret != -EOPNOTSUPP)
@@ -2099,8 +2103,11 @@ static int btrfs_issue_discard(struct block_device 
*bdev, u64 start, u64 len,
}
 
if (bytes_left) {
-   ret = blkdev_issue_discard(bdev, start >> 9, bytes_left >> 9,
-  GFP_NOFS, 0);
+   if (clear == BTRFS_CLEAR_OP_DISCARD)
+   ret = blkdev_issue_discard(bdev, start >> 9,
+   bytes_left >> 9, GFP_NOFS, 0);
+   else
+   ret = -EOPNOTSUPP;
if (!ret)
*discarded_bytes += bytes_left;
}
@@ -2108,12 +2115,18 @@ static int btrfs_issue_discard(struct block_device 
*bdev, u64 start, u64 len,
 }
 
 int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr,
-u64 num_bytes, u64 *actual_bytes)
+u64 num_bytes, u64 *actual_bytes,
+enum btrfs_clear_op_type clear)
 {
int ret;
u64 discarded_bytes = 0;
struct btrfs_bio *bbio = NULL;
+   int rw;
 
+   if (clear == BTRFS_CLEAR_OP_DISCARD)
+   rw = BTRFS_MAP_DISCARD;
+   else
+   rw = BTRFS_MAP_WRITE;
 
/*
 * Avoid races with device replace and make sure our bbio has devices
@@ -2121,8 +2134,7 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, 
u64 bytenr,
 */
btrfs_bio_counter_inc_blocked(fs_info);
/* Tell the block device(s) that the sectors can be discarded */
-   ret = btrfs_map_block(fs_info, BTRFS_MAP_DISCARD, bytenr, _bytes,
- , 0);
+   ret = btrfs_map_block(fs_info, rw, bytenr, _bytes, , 0);
/* Error condition is -ENOMEM */
if (!ret) {
struct btrfs_bio_stripe *stripe = bbio->stripes;
@@ 

Re: [PATCH v2 12/16] btrfs: track running balance in a simpler way

2018-04-20 Thread Anand Jain






Ok not that simple, the running status is checked outside of
balance_mutex and there's one more assertion that does not expect the
balance_ctl to exist:

@@ -4031,16 +4032,16 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
 return -ENOTCONN;
 }

-   if (atomic_read(_info->balance_running)) {
+   if (test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags)) {
 atomic_inc(_info->balance_pause_req);
 mutex_unlock(_info->balance_mutex);

 wait_event(fs_info->balance_wait_q,
-  atomic_read(_info->balance_running) == 0);
+  !test_bit(BTRFS_FS_BALANCE_RUNNING, 
_info->flags));

here it's unlocked

 mutex_lock(_info->balance_mutex);
 /* we are good with balance_ctl ripped off from under us */
-   BUG_ON(atomic_read(_info->balance_running));
+   BUG_ON(test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags));

and rewriting the code so this could be checked the same way is not a simple
fixup as I expected.

As there's still the extra balance mutex lock/unlock after the volume
mutex removal, I'll have a look how this could be cleaned up further.

 atomic_dec(_info->balance_pause_req);
 } else {
 ret = -ENOTCONN;


 Makes sense.

Thanks, Anand

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] btrfs: Add incompat flags check for btrfs_check_super_valid()

2018-04-20 Thread David Sterba
On Thu, Apr 19, 2018 at 07:24:48PM +0300, Nikolay Borisov wrote:
> I agree with David, just make it btrfs_validate_super

Ack.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 12/16] btrfs: track running balance in a simpler way

2018-04-20 Thread David Sterba
On Fri, Apr 20, 2018 at 01:58:11PM +0200, David Sterba wrote:
> On Fri, Apr 20, 2018 at 03:52:24PM +0800, Anand Jain wrote:
> > 
> > 
> > On 04/20/2018 12:33 AM, David Sterba wrote:
> > > Currently fs_info::balance_running is 0 or 1 and does not use the
> > > semantics of atomics. The pause and cancel check for 0, that can happen
> > > only after __btrfs_balance exits for whatever reason.
> > > 
> > > Parallel calls to balance ioctl may enter btrfs_ioctl_balance multiple
> > > times but will block on the balance_mutex that protects the
> > > fs_info::flags bit.
> > > 
> > > Signed-off-by: David Sterba 
> > > ---
> > >   fs/btrfs/ctree.h   |  6 +-
> > >   fs/btrfs/disk-io.c |  1 -
> > >   fs/btrfs/ioctl.c   |  6 +++---
> > >   fs/btrfs/volumes.c | 18 ++
> > >   4 files changed, 18 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > > index 011cb9a8a967..fda94a264eb7 100644
> > > --- a/fs/btrfs/ctree.h
> > > +++ b/fs/btrfs/ctree.h
> > > @@ -726,6 +726,11 @@ struct btrfs_delayed_root;
> > >* (device replace, resize, device add/delete, balance)
> > >*/
> > >   #define BTRFS_FS_EXCL_OP16
> > > +/*
> > > + * Indicate that balance has been set up from the ioctl and is in the 
> > > main
> > > + * phase. The fs_info::balance_ctl is initialized.
> > > + */
> > > +#define BTRFS_FS_BALANCE_RUNNING 17
> > 
> > 
> >   Change looks good to me as such and its a good idea to drop
> > fs_info::balance_running;
> > 
> >   However instead of making BTRFS_FS_BALANCE_RUNNING part of
> > btrfs_fs_info::flags
> >   pls make it part of
> > btrfs_balance_control::flags
> > 
> >   Because:
> >   We already have BTRFS_BALANCE_RESUME there.
> >   And at fs_info level BTRFS_FS_EXCL_OP tells someone excl is running.
> >   And if its a balance (either in running or implicit-paused state)
> >   then btrfs_fs_info::balance_ctl is not NULL.
> 
> This would work fine, if the btrfs_balance_control::flags were not copy
> of the ioctl structure. There are the filter flags stored there, in
> addition to BTRFS_BALANCE_RESUME.
> 
> >From that point it's the balance ioctl interface and adding the internal
> runtime flag does not seem to fit.
> 
> The status bit could be tracked separatelly in btrfs_balance_control so
> it does not use the whole filesystem flag namespace, as it's always used
> under balance mutex.
> 
> As this is simple change to the patch, I'll do that.

Ok not that simple, the running status is checked outside of
balance_mutex and there's one more assertion that does not expect the
balance_ctl to exist:

@@ -4031,16 +4032,16 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
return -ENOTCONN;
}

-   if (atomic_read(_info->balance_running)) {
+   if (test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags)) {
atomic_inc(_info->balance_pause_req);
mutex_unlock(_info->balance_mutex);

wait_event(fs_info->balance_wait_q,
-  atomic_read(_info->balance_running) == 0);
+  !test_bit(BTRFS_FS_BALANCE_RUNNING, 
_info->flags));

here it's unlocked


mutex_lock(_info->balance_mutex);
/* we are good with balance_ctl ripped off from under us */
-   BUG_ON(atomic_read(_info->balance_running));
+   BUG_ON(test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags));

and rewriting the code so this could be checked the same way is not a simple
fixup as I expected.

As there's still the extra balance mutex lock/unlock after the volume
mutex removal, I'll have a look how this could be cleaned up further.

atomic_dec(_info->balance_pause_req);
} else {
ret = -ENOTCONN;
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2.1 13/16] btrfs: move and comment read-only check in btrfs_cancel_balance

2018-04-20 Thread David Sterba
Balance cannot be started on a read-only filesystem and will have to
finish/exit before eg. going to read-only via remount.

In case the filesystem is forcibly set to read-only after an error,
balance will finish anyway and if the cancel call is too fast it will
just wait for that to happen.

The last case is when the balance is paused after mount but it's
read-only and cancelling would want to delete the item. The test is
moved after the check if balance is running at all, as it looks more
logical to report "no balance running" instead of "read-only
filesystem".

Signed-off-by: David Sterba 
---

- Add missing unlock

 fs/btrfs/volumes.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a766d2f988c1..9176d77e02ee 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4053,15 +4053,22 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
 
 int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
 {
-   if (sb_rdonly(fs_info->sb))
-   return -EROFS;
-
mutex_lock(_info->balance_mutex);
if (!fs_info->balance_ctl) {
mutex_unlock(_info->balance_mutex);
return -ENOTCONN;
}
 
+   /*
+* A paused balance with the item stored on disk can be resumed at
+* mount time if the mount is read-write. Otherwise it's still paused
+* and we must not allow cancelling as it deletes the item.
+*/
+   if (sb_rdonly(fs_info->sb)) {
+   mutex_unlock(_info->balance_mutex);
+   return -EROFS;
+   }
+
atomic_inc(_info->balance_cancel_req);
/*
 * if we are running just wait and return, balance item is
-- 
2.16.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 13/16] btrfs: move and comment read-only check in btrfs_cancel_balance

2018-04-20 Thread David Sterba
On Fri, Apr 20, 2018 at 05:13:02PM +0800, Anand Jain wrote:
> 
> 
> On 04/20/2018 12:33 AM, David Sterba wrote:
> > Balance cannot be started on a read-only filesystem and will have to
> > finish/exit before eg. going to read-only via remount.
> 
>   It can be paused as well.
> btrfs balance pause /btrfs && mount -o remount,ro /dev/sdb /btrfs
> 
> 
> > @@ -4053,15 +4053,20 @@ int btrfs_pause_balance(struct btrfs_fs_info 
> > *fs_info)
> >   
> >   int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
> >   {
> > -   if (sb_rdonly(fs_info->sb))
> > -   return -EROFS;
> > -
> > mutex_lock(_info->balance_mutex);
> > if (!fs_info->balance_ctl) {
> > mutex_unlock(_info->balance_mutex);
> > return -ENOTCONN;
> > }
> >   
> > +   /*
> > +* A paused balance with the item stored on disk can be resumed at
> > +* mount time if the mount is read-write. Otherwise it's still paused
> > +* and we must not allow cancelling as it deletes the item.
> > +*/
> > +   if (sb_rdonly(fs_info->sb))
> > +   return -EROFS;
> > +
> 
>mutex_unlock(_info->balance_mutex); ?

Ah, of course. Fixed, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 12/16] btrfs: track running balance in a simpler way

2018-04-20 Thread David Sterba
On Fri, Apr 20, 2018 at 03:52:24PM +0800, Anand Jain wrote:
> 
> 
> On 04/20/2018 12:33 AM, David Sterba wrote:
> > Currently fs_info::balance_running is 0 or 1 and does not use the
> > semantics of atomics. The pause and cancel check for 0, that can happen
> > only after __btrfs_balance exits for whatever reason.
> > 
> > Parallel calls to balance ioctl may enter btrfs_ioctl_balance multiple
> > times but will block on the balance_mutex that protects the
> > fs_info::flags bit.
> > 
> > Signed-off-by: David Sterba 
> > ---
> >   fs/btrfs/ctree.h   |  6 +-
> >   fs/btrfs/disk-io.c |  1 -
> >   fs/btrfs/ioctl.c   |  6 +++---
> >   fs/btrfs/volumes.c | 18 ++
> >   4 files changed, 18 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > index 011cb9a8a967..fda94a264eb7 100644
> > --- a/fs/btrfs/ctree.h
> > +++ b/fs/btrfs/ctree.h
> > @@ -726,6 +726,11 @@ struct btrfs_delayed_root;
> >* (device replace, resize, device add/delete, balance)
> >*/
> >   #define BTRFS_FS_EXCL_OP  16
> > +/*
> > + * Indicate that balance has been set up from the ioctl and is in the main
> > + * phase. The fs_info::balance_ctl is initialized.
> > + */
> > +#define BTRFS_FS_BALANCE_RUNNING   17
> 
> 
>   Change looks good to me as such and its a good idea to drop
> fs_info::balance_running;
> 
>   However instead of making BTRFS_FS_BALANCE_RUNNING part of
> btrfs_fs_info::flags
>   pls make it part of
> btrfs_balance_control::flags
> 
>   Because:
>   We already have BTRFS_BALANCE_RESUME there.
>   And at fs_info level BTRFS_FS_EXCL_OP tells someone excl is running.
>   And if its a balance (either in running or implicit-paused state)
>   then btrfs_fs_info::balance_ctl is not NULL.

This would work fine, if the btrfs_balance_control::flags were not copy
of the ioctl structure. There are the filter flags stored there, in
addition to BTRFS_BALANCE_RESUME.

>From that point it's the balance ioctl interface and adding the internal
runtime flag does not seem to fit.

The status bit could be tracked separatelly in btrfs_balance_control so
it does not use the whole filesystem flag namespace, as it's always used
under balance mutex.

As this is simple change to the patch, I'll do that.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Hard link not persisted on fsync

2018-04-20 Thread Jayashree Mohan
Hi Zygo,

Thanks for the reply. Here are few responses about btrfs behavior that
you had questions about in your email.

On Thu, Apr 19, 2018 at 4:41 PM, Zygo Blaxell
 wrote:
>
> On Mon, Apr 16, 2018 at 09:35:24AM -0500, Jayashree Mohan wrote:
> > Hi,
> >
> > The following seems to be a crash consistency bug on btrfs, where in
> > the link count is not persisted even after a fsync on the original
> > file.
> >
> > Consider the following workload :
> > creat foo
> > link (foo, A/bar)
> > fsync(foo)
> > ---Crash---
> >
> > Now, on recovery we expect the metadata of foo to be persisted i.e
> > have a link count of 2. However in btrfs, the link count is 1 and file
> > A/bar is not persisted. The expected behaviour would be to persist the
> > dependencies of inode foo. That is to say, shouldn't fsync of foo
> > persist A/bar and correctly update the link count?
>
> Those dependencies are backward.  foo's inode doesn't depend on anything
> but the data in the file foo, and foo's inode itself.
>
> "foo" and "A/bar" are dirents that both depend on the inode of foo, which
> implies that "A" and "." must be updated atomically with foo's inode.
> If you had called fsync(A) then we'd expect A/bar to exist and the inode
> to have a link count of 2.  If you'd called fsync(.) then...well, you
> didn't modify "." at all, so I guess either outcome is valid as long as
> the inode link count matches the number of dirents referencing the inode.
>
> But then...why does foo exist at all?  I'd expect at least some tests
> would end without foo on disk either, since all that was fsync()ed was the
> foo inode, not the foo dirent in the directory '.'.  Does btrfs combine
> creating foo and updating foo's inode into a single atomic operation?
> I vaguely recall that it does exactly that, in order to solve a bug
> some years ago.  What happens if you add a rename, e.g.
>
> unlink foo2 # make sure foo2 doesn't exist
> creat foo
> rename(foo, foo2)
> link(foo2, A/bar)
> fsync(foo2)
>
> Do you get foo or foo2?  I'd expect foo since you didn't fsync '.',
> but maybe rename implies flush and you get foo2.

This is the workload we tried:

creat foo
rename(foo, foo2)
mkdir A
link(foo2, A/bar)
fsync(foo2)

Btrfs persists foo2 here, and A/bar is not persisted. Also, the link
count of foo2 is 1, while on the other filesystems, A/bar persist as
well and foo2 has a link count 2.
I don't think it's the rename here that implies the flush - removing
the fsync(foo2) ends up not persisting A/bar. So looks like, its the
fsync that forces all dependencies to the disk.

As a variant of the above workload, consider the following:

mkdir A
sync
creat foo
rename(foo, foo2)
link(foo2, A/bar)
fsync(foo2)

The only difference between this workload and the one above is, where
directory A is created and whether its persisted previously or not. In
this modified workload, foo2 has a link count of 2 and ends up
persisting A/bar. Why is A/bar being persisted here even when we did
not explicitly call a fsync on directory A?

We don't seem to correctly understand the reason behind these
different behaviors. It will be useful if you could provide more
insight into this.

> That's not to say that fsync is not a rich source of filesystem bugs.
> btrfs did once have (and maybe still has?) a bug where renames and fsync
> can create a dirent with no inode, e.g.
>
> loop continuously:
> creat foo
> write(foo, data)
> fsync(foo)
> rename(foo, bar)
>
> and crash somewhere in the middle of the loop, which will create a
> dirent "foo" that points to a non-existent inode.
>
> Removing the "fsync" works around the bug.  rename() does a flush anyway,
> so the fsync() wasn't needed, but fsync() shouldn't _create_ a filesystem
> inconsistency, especially when Googling recommends app developers to
> sprinkle fsync()s indiscriminately in their code to prevent their data
> from being mangled.
>
> I haven't been tracking to see if that's fixed yet.  I last saw it on
> 4.11, but I have been aggressively avoiding fsync with eatmydata for
> some years now.
>
> > Note that ext4, xfs and f2fs recover to the correct link count of 2
> > for the above workload.
>
> Do those filesystems also work if you remove the fsync?  That may be
> your answer:  they could be flushing the other metadata earlier, before
> you call fsync().

>> creat foo
>> link (foo, A/bar)
> >fsync(foo)
>>---Crash---

Actually, they don't seem to work if we remove the fsync. The behavior
changes and we neither persist A/bar nor update the link count of foo
to 2. So this confirms that fsync is forcing the metadata and not the
link() syscall itself right?

Thanks,
Jayashree
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 09/16] btrfs: cleanup helpers that reset balance state

2018-04-20 Thread David Sterba
On Fri, Apr 20, 2018 at 10:07:17AM +0300, Nikolay Borisov wrote:
> 
> 
> On 19.04.2018 19:33, David Sterba wrote:
> > +   /* reset_balance_state needs volume_mutex */
> 
> Does it make sense to codify this invariant as lockdep_assert_held in
> reset_balance_state ?

No, the comment and the mutex will be removed in the following patches.

But yeah in general the lockdep annotations are better than the comments
stating which lock is supposed to be held.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bug report] btrfs fail when run xfstests-dev/btrfs/007

2018-04-20 Thread Filipe Manana
On Fri, Apr 20, 2018 at 10:23 AM, Gu, Jinxiang  wrote:
> Hi,
>
> Here's a bug report.
> Kernel v4.17-rc1 failed xfstests-dev/btrfs/007.
> It is not always happens.( occurred times/test times: 1/20)

You don't need to check how many times it fails... You have a fsstress
seed from the .full file, that's all that matters.

I'll take a look at it, it's due to a missing truncate operation.

thanks

>
> And I confirmed this test using kernel v4.16-rc1.
> It also occurred sometimes.
>
> LOG when using kernel v4.17-rc1.
> btrfs/007[failed, exit status 1] - output mismatch (see 
> /home/gujx/xfstests-dev/results//btrfs/007.out.bad)
> --- tests/btrfs/007.out 2017-08-18 12:45:06.560413266 +0800
> +++ /home/gujx/xfstests-dev/results//btrfs/007.out.bad  2018-04-17 
> 13:24:06.887089998 +0800
> @@ -1,4 +1,5 @@
>  QA output created by 007
>  *** test send / receive
> -*** done
> +failed: '/home/gujx/xfstests-dev/src/fssum -r 
> /tmp/tmp.SkEUIXw683/incr.fssum /mnt/scratch/incr'
> +(see /home/gujx/xfstests-dev/results//btrfs/007.full for details)
>  *** unmount
> ...
> (Run 'diff -u tests/btrfs/007.out 
> /home/gujx/xfstests-dev/results//btrfs/007.out.bad'  to see the entire diff)
>
> And the detail can be found in the attachment.
>
> Thanks, Gu Jinxiang
>
>



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs: fix BUG trying to resume balance without resume flag

2018-04-20 Thread Anand Jain
We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance(),
which is not called during the remount. So when resuming from the
paused balance we hit BUG.

 kernel: kernel BUG at fs/btrfs/volumes.c:3890!
 ::
 kernel:  balance_kthread+0x51/0x60 [btrfs]
 kernel:  kthread+0x111/0x130
 ::
 kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ba7d0090bde8

Reproducer:
  On a mounted BTRFS.

  btrfs balance start --full-balance /btrfs
  btrfs balance pause /btrfs
  mount -o remount,ro /dev/sdb /btrfs
  mount -o remount,rw /dev/sdb /btrfs

To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async()
instead of btrfs_recover_balance().

Signed-off-by: Anand Jain 
---
 fs/btrfs/volumes.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 66d2ed96aef1..9c29fdca9075 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3963,6 +3963,10 @@ int btrfs_resume_balance_async(struct btrfs_fs_info 
*fs_info)
return 0;
}
 
+   spin_lock(_info->balance_lock);
+   fs_info->balance_ctl->flags |= BTRFS_BALANCE_RESUME;
+   spin_unlock(_info->balance_lock);
+
tsk = kthread_run(balance_kthread, fs_info, "btrfs-balance");
return PTR_ERR_OR_ZERO(tsk);
 }
@@ -4004,7 +4008,6 @@ int btrfs_recover_balance(struct btrfs_fs_info *fs_info)
 
bctl->fs_info = fs_info;
bctl->flags = btrfs_balance_flags(leaf, item);
-   bctl->flags |= BTRFS_BALANCE_RESUME;
 
btrfs_balance_data(leaf, item, _bargs);
btrfs_disk_balance_args_to_cpu(>data, _bargs);
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 15/16] btrfs: use mutex in btrfs_resume_balance_async

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

While the spinlock does not cause problems, using the mutex is more
correct and consistent with others. The global status of balance is eg.
checked from btrfs_pause_balance or btrfs_cancel_balance with mutex.

Resuming balance happens during mount or ro->rw remount. In the former
case, no other user of the balance_ctl exists, in the latter, balance
cannot run until the ro/rw transition is finished.

Signed-off-by: David Sterba 


Reviewed-by: Anand Jain 

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[bug report] btrfs fail when run xfstests-dev/btrfs/007

2018-04-20 Thread Gu, Jinxiang
Hi,

Here's a bug report.
Kernel v4.17-rc1 failed xfstests-dev/btrfs/007.
It is not always happens.( occurred times/test times: 1/20)

And I confirmed this test using kernel v4.16-rc1.
It also occurred sometimes.

LOG when using kernel v4.17-rc1.
btrfs/007[failed, exit status 1] - output mismatch (see 
/home/gujx/xfstests-dev/results//btrfs/007.out.bad)
--- tests/btrfs/007.out 2017-08-18 12:45:06.560413266 +0800
+++ /home/gujx/xfstests-dev/results//btrfs/007.out.bad  2018-04-17 
13:24:06.887089998 +0800
@@ -1,4 +1,5 @@
 QA output created by 007
 *** test send / receive
-*** done
+failed: '/home/gujx/xfstests-dev/src/fssum -r 
/tmp/tmp.SkEUIXw683/incr.fssum /mnt/scratch/incr'
+(see /home/gujx/xfstests-dev/results//btrfs/007.full for details)
 *** unmount
... 
(Run 'diff -u tests/btrfs/007.out 
/home/gujx/xfstests-dev/results//btrfs/007.out.bad'  to see the entire diff)

And the detail can be found in the attachment.

Thanks, Gu Jinxiang




007.full
Description: 007.full


Re: [PATCH v2 14/16] btrfs: drop lock parameter from update_ioctl_balance_args and rename

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

The parameter controls locking of the stats part but we can lock it
unconditionally, as this only happens once when balance starts. This is
not performance critical.

Add the prefix for an exported function.

Signed-off-by: David Sterba 


 Reviewed-by: Anand Jain 

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Recovery from full metadata with all device space consumed?

2018-04-20 Thread Duncan
Timofey Titovets posted on Fri, 20 Apr 2018 01:32:42 +0300 as excerpted:

> 2018-04-20 1:08 GMT+03:00 Drew Bloechl :
>> I've got a btrfs filesystem that I can't seem to get back to a useful
>> state. The symptom I started with is that rename() operations started
>> dying with ENOSPC, and it looks like the metadata allocation on the
>> filesystem is full:
>>
>> # btrfs fi df /broken
>> Data, RAID0: total=3.63TiB, used=67.00GiB
>> System, RAID1: total=8.00MiB, used=224.00KiB
>> Metadata, RAID1: total=3.00GiB, used=2.50GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> All of the consumable space on the backing devices also seems to be in
>> use:
>>
>> # btrfs fi show /broken Label: 'mon_data'  uuid:
>> 85e52555-7d6d-4346-8b37-8278447eb590
>> Total devices 4 FS bytes used 69.50GiB
>> devid1 size 931.51GiB used 931.51GiB path /dev/sda1
>> devid2 size 931.51GiB used 931.51GiB path /dev/sdb1
>> devid3 size 931.51GiB used 931.51GiB path /dev/sdc1
>> devid4 size 931.51GiB used 931.51GiB path /dev/sdd1
>>
>> Even the smallest balance operation I can start fails (this doesn't
>> change even with an extra temporary device added to the filesystem):
>>
>> # btrfs balance start -v -dusage=1 /broken
>> Dumping filters: flags 0x1, state 0x0, force is off
>>   DATA (flags 0x2): balancing, usage=1
>> ERROR: error during balancing '/broken': No space left on device
>> There may be more info in syslog - try dmesg | tail
>> # dmesg | tail -1
>> [11554.296805] BTRFS info (device sdc1): 757 enospc errors during
>> balance
>>
>> The current kernel is 4.15.0 from Debian's stretch-backports
>> (specifically linux-image-4.15.0-0.bpo.2-amd64), but it was Debian's
>> 4.9.30 when the filesystem got into this state. I upgraded it in the
>> hopes that a newer kernel would be smarter, but no dice.
>>
>> btrfs-progs is currently at v4.7.3.
>>
>> Most of what this filesystem stores is Prometheus 1.8's TSDB for its
>> metrics, which are constantly written at around 50MB/second. The
>> filesystem never really gets full as far as data goes, but there's a
>> lot of never-ending churn for what data is there.
>>
>> Question 1: Are there other steps that can be tried to rescue a
>> filesystem in this state? I still have it mounted in the same state,
>> and I'm willing to try other things or extract debugging info.
>>
>> Question 2: Is there something I could have done to prevent this from
>> happening in the first place?
> 
> Not sure why this happening,
> but if you stuck at that state:
>   - Reboot to ensure no other problems will exists
>   - Add any other external device temporary to FS, as example zram.
> After you free small part of fs, delete external dev from FS and
> continue balance chunks.

He did try adding a temporary device.  Requoting from above:

>> Even the smallest balance operation I can start fails (this doesn't
>> change even with an extra temporary device added to the filesystem):

Never-the-less, that's the right idea in general, but I believe the 
following additional suggestions, now addressed to the original poster, 
will help.

1) Try with -dusage=0.

With any luck there will be some totally empty data chunks, which this 
should free, hopefully getting you at least enough space for the -dusage=1 
to work and free additional space.

The reason this can work is that unlike with actual usage, entirely empty 
chunks don't require writing a fresh block to copy the used extents 
into... because there aren't any.  But of course it does require that 
there's some totally empty chunks available to free, which with your 
numbers is somewhat likely, but not a given, especially since newer 
kernels (well, since some time now, but...) normally free entirely empty 
chunks automatically.

FWIW, 0-usage balances are near instant as all it has to do is eliminate 
the empty chunks from the chunk list.  1% usage balances, once you can do 
them, will go real fast too, and in your state may get you back some 
decent unallocated, tho they probably won't do much for people in less 
extreme unbalance conditions.  10% will do more and take a bit longer, 
but still be fast as it's only writing 1/10th of the chunk size, and as 
long as there's enough chunks at that level, it'll still be returning 10 
for every full one it rewrites.  At 50% it'll take much longer but will 
still be returning 2 chunks for every one it writes.  Above that, the 
payback goes down rather fast, so you're only getting back 1 for 2 
written at 67%, and one for 9 written at 90%.  As such, on spinning rust 
it's rarely worth trying above 70% or so, and often not worth trying 
above 50%, unless of course the filesystem really is almost full and you 
are trying to reclaim every last bit of unused chunk space to unallocated 
you can, regardless of the time it takes.  FWIW I'm on ssd and partition 
up so my filesystems are normally under 100 GiB, so even a full balance 
normally only 

Re: [PATCH v2 13/16] btrfs: move and comment read-only check in btrfs_cancel_balance

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

Balance cannot be started on a read-only filesystem and will have to
finish/exit before eg. going to read-only via remount.


 It can be paused as well.
   btrfs balance pause /btrfs && mount -o remount,ro /dev/sdb /btrfs



@@ -4053,15 +4053,20 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
  
  int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)

  {
-   if (sb_rdonly(fs_info->sb))
-   return -EROFS;
-
mutex_lock(_info->balance_mutex);
if (!fs_info->balance_ctl) {
mutex_unlock(_info->balance_mutex);
return -ENOTCONN;
}
  
+	/*

+* A paused balance with the item stored on disk can be resumed at
+* mount time if the mount is read-write. Otherwise it's still paused
+* and we must not allow cancelling as it deletes the item.
+*/
+   if (sb_rdonly(fs_info->sb))
+   return -EROFS;
+


  mutex_unlock(_info->balance_mutex); ?

Thanks, Anand

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 09/16] btrfs: cleanup helpers that reset balance state

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

The function __cancel_balance name is confusing with the cancel
operation of balance and it really resets the state of balance back to
zero. The unset_balance_control helper is called only from one place and
simple enough to be inlined.

Signed-off-by: David Sterba 


 Reviewed-by: Anand Jain 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 12/16] btrfs: track running balance in a simpler way

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

Currently fs_info::balance_running is 0 or 1 and does not use the
semantics of atomics. The pause and cancel check for 0, that can happen
only after __btrfs_balance exits for whatever reason.

Parallel calls to balance ioctl may enter btrfs_ioctl_balance multiple
times but will block on the balance_mutex that protects the
fs_info::flags bit.

Signed-off-by: David Sterba 
---
  fs/btrfs/ctree.h   |  6 +-
  fs/btrfs/disk-io.c |  1 -
  fs/btrfs/ioctl.c   |  6 +++---
  fs/btrfs/volumes.c | 18 ++
  4 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 011cb9a8a967..fda94a264eb7 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -726,6 +726,11 @@ struct btrfs_delayed_root;
   * (device replace, resize, device add/delete, balance)
   */
  #define BTRFS_FS_EXCL_OP  16
+/*
+ * Indicate that balance has been set up from the ioctl and is in the main
+ * phase. The fs_info::balance_ctl is initialized.
+ */
+#define BTRFS_FS_BALANCE_RUNNING   17



 Change looks good to me as such and its a good idea to drop
   fs_info::balance_running;

 However instead of making BTRFS_FS_BALANCE_RUNNING part of
   btrfs_fs_info::flags
 pls make it part of
   btrfs_balance_control::flags

 Because:
 We already have BTRFS_BALANCE_RESUME there.
 And at fs_info level BTRFS_FS_EXCL_OP tells someone excl is running.
 And if its a balance (either in running or implicit-paused state)
 then btrfs_fs_info::balance_ctl is not NULL.

Thanks, Anand

  
  struct btrfs_fs_info {

u8 fsid[BTRFS_FSID_SIZE];
@@ -991,7 +996,6 @@ struct btrfs_fs_info {
/* restriper state */
spinlock_t balance_lock;
struct mutex balance_mutex;
-   atomic_t balance_running;
atomic_t balance_pause_req;
atomic_t balance_cancel_req;
struct btrfs_balance_control *balance_ctl;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c0482ecea11f..b62559dfb053 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2168,7 +2168,6 @@ static void btrfs_init_balance(struct btrfs_fs_info 
*fs_info)
  {
spin_lock_init(_info->balance_lock);
mutex_init(_info->balance_mutex);
-   atomic_set(_info->balance_running, 0);
atomic_set(_info->balance_pause_req, 0);
atomic_set(_info->balance_cancel_req, 0);
fs_info->balance_ctl = NULL;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7c99f74c200e..3dfd5ab2807b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4506,7 +4506,7 @@ void update_ioctl_balance_args(struct btrfs_fs_info 
*fs_info, int lock,
  
  	bargs->flags = bctl->flags;
  
-	if (atomic_read(_info->balance_running))

+   if (test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags))
bargs->state |= BTRFS_BALANCE_STATE_RUNNING;
if (atomic_read(_info->balance_pause_req))
bargs->state |= BTRFS_BALANCE_STATE_PAUSE_REQ;
@@ -4558,7 +4558,7 @@ static long btrfs_ioctl_balance(struct file *file, void 
__user *arg)
mutex_lock(_info->balance_mutex);
if (fs_info->balance_ctl) {
/* this is either (2) or (3) */
-   if (!atomic_read(_info->balance_running)) {
+   if (!test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags)) {
mutex_unlock(_info->balance_mutex);
/*
 * Lock released to allow other waiters to continue,
@@ -4567,7 +4567,7 @@ static long btrfs_ioctl_balance(struct file *file, void 
__user *arg)
mutex_lock(_info->balance_mutex);
  
  			if (fs_info->balance_ctl &&

-   !atomic_read(_info->balance_running)) {
+   !test_bit(BTRFS_FS_BALANCE_RUNNING, 
_info->flags)) {
/* this is (3) */
need_unlock = false;
goto locked;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e95cc2f09fdd..a766d2f988c1 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3886,13 +3886,14 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
spin_unlock(_info->balance_lock);
}
  
-	atomic_inc(_info->balance_running);

+   ASSERT(!test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags));
+   set_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags);
mutex_unlock(_info->balance_mutex);
  
  	ret = __btrfs_balance(fs_info);
  
  	mutex_lock(_info->balance_mutex);

-   atomic_dec(_info->balance_running);
+   clear_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags);
  
  	if (bargs) {

memset(bargs, 0, sizeof(*bargs));
@@ -4031,16 +4032,16 @@ int btrfs_pause_balance(struct btrfs_fs_info *fs_info)
return -ENOTCONN;
}
  
-	if (atomic_read(_info->balance_running)) {

+   if (test_bit(BTRFS_FS_BALANCE_RUNNING, _info->flags)) 

Re: [PATCH v2 08/16] btrfs: add sanity check when resuming balance after mount

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

Replace a WARN_ON with a proper check and message in case something goes
really wrong and resumed balance cannot set up its exclusive status.
The check is a user friendly assertion, I don't expect to ever happen
under normal circumstances.

Also document that the paused balance starts here and owns the exclusive
op status.

Signed-off-by: David Sterba 


Reviewed-by: Anand Jain 

Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 07/16] btrfs: add proper safety check before resuming dev-replace

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

The device replace is paused by unmount or read only remount, and
resumed on next mount or write remount.

The exclusive status should be checked properly as it's a global
invariant and we must not allow 2 operations run. In this case, the
balance can be also paused and resumed under same conditions. It's
always checked first so dev-replace could see the EXCL_OP already taken,
BUT, the ioctl would never let start both at the same time.

Replace the WARN_ON with message and return 0, indicating no error as
this is purely theoretical and the user will be informed. Resolving that
manually should be possible by waiting for the other operation to finish
or cancel the paused state.

Signed-off-by: David Sterba 


Reviewed-by: Anand Jain 

Thanks, Anand

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 06/16] btrfs: move clearing of EXCL_OP out of __cancel_balance

2018-04-20 Thread Anand Jain



On 04/20/2018 12:33 AM, David Sterba wrote:

Make the clearning visible in the callers so we can pair it with the
test_and_set part.

Signed-off-by: David Sterba 


Reviewed-by: Anand Jain 

Thanks, Anand



---
  fs/btrfs/ioctl.c   |  2 +-
  fs/btrfs/volumes.c | 13 +++--
  2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index b93dea445802..582bde5b7eda 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4656,7 +4656,7 @@ static long btrfs_ioctl_balance(struct file *file, void 
__user *arg)
 * Ownership of bctl and filesystem flag BTRFS_FS_EXCL_OP
 * goes to to btrfs_balance.  bctl is freed in __cancel_balance,
 * or, if restriper was paused all the way until unmount, in
-* free_fs_info.  The flag is cleared in __cancel_balance.
+* free_fs_info.  The flag should be cleared after __cancel_balance.
 */
need_unlock = false;
  
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c

index e0bd181dc9e0..5c83ebc8e199 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3760,8 +3760,6 @@ static void __cancel_balance(struct btrfs_fs_info 
*fs_info)
ret = del_balance_item(fs_info);
if (ret)
btrfs_handle_fs_error(fs_info, ret, NULL);
-
-   clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
  }
  
  /* Non-zero return value signifies invalidity */

@@ -3919,6 +3917,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
if ((ret && ret != -ECANCELED && ret != -ENOSPC) ||
balance_need_close(fs_info)) {
__cancel_balance(fs_info);
+   clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
}
  
  	wake_up(_info->balance_wait_q);

@@ -3927,10 +3926,10 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
  out:
if (bctl->flags & BTRFS_BALANCE_RESUME)
__cancel_balance(fs_info);
-   else {
+   else
kfree(bctl);
-   clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
-   }
+   clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
+
return ret;
  }
  
@@ -4089,8 +4088,10 @@ int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)

mutex_lock(_info->volume_mutex);
mutex_lock(_info->balance_mutex);
  
-		if (fs_info->balance_ctl)

+   if (fs_info->balance_ctl) {
__cancel_balance(fs_info);
+   clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
+   }
  
  		mutex_unlock(_info->volume_mutex);

}


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 09/16] btrfs: cleanup helpers that reset balance state

2018-04-20 Thread Nikolay Borisov


On 19.04.2018 19:33, David Sterba wrote:
> + /* reset_balance_state needs volume_mutex */

Does it make sense to codify this invariant as lockdep_assert_held in
reset_balance_state ?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 07/16] btrfs: add proper safety check before resuming dev-replace

2018-04-20 Thread Nikolay Borisov


On 19.04.2018 19:33, David Sterba wrote:
> The device replace is paused by unmount or read only remount, and
> resumed on next mount or write remount.
> 
> The exclusive status should be checked properly as it's a global
> invariant and we must not allow 2 operations run. In this case, the
> balance can be also paused and resumed under same conditions. It's
> always checked first so dev-replace could see the EXCL_OP already taken,
> BUT, the ioctl would never let start both at the same time.
> 
> Replace the WARN_ON with message and return 0, indicating no error as
> this is purely theoretical and the user will be informed. Resolving that
> manually should be possible by waiting for the other operation to finish
> or cancel the paused state.
> 
> Signed-off-by: David Sterba 

Reviewed-by: Nikolay Borisov 

> ---
>  fs/btrfs/dev-replace.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
> index 7a87ffad041e..346bd460f8e7 100644
> --- a/fs/btrfs/dev-replace.c
> +++ b/fs/btrfs/dev-replace.c
> @@ -908,7 +908,17 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info 
> *fs_info)
>   }
>   btrfs_dev_replace_write_unlock(dev_replace);
>  
> - WARN_ON(test_and_set_bit(BTRFS_FS_EXCL_OP, _info->flags));
> + /*
> +  * This could collide with a paused balance, but the exclusive op logic
> +  * should never allow both to start and pause. We don't want to allow
> +  * dev-replace to start anyway.
> +  */
> + if (test_and_set_bit(BTRFS_FS_EXCL_OP, _info->flags)) {
> + btrfs_info(fs_info,
> + "cannot resume dev-replace, other exclusive operation running");
> + return 0;
> + }
> +
>   task = kthread_run(btrfs_dev_replace_kthread, fs_info, "btrfs-devrepl");
>   return PTR_ERR_OR_ZERO(task);
>  }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 06/16] btrfs: move clearing of EXCL_OP out of __cancel_balance

2018-04-20 Thread Nikolay Borisov


On 19.04.2018 19:33, David Sterba wrote:
> Make the clearning visible in the callers so we can pair it with the
> test_and_set part.
> 
> Signed-off-by: David Sterba 

Reviewed-by: Nikolay Borisov 

> ---
>  fs/btrfs/ioctl.c   |  2 +-
>  fs/btrfs/volumes.c | 13 +++--
>  2 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index b93dea445802..582bde5b7eda 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -4656,7 +4656,7 @@ static long btrfs_ioctl_balance(struct file *file, void 
> __user *arg)
>* Ownership of bctl and filesystem flag BTRFS_FS_EXCL_OP
>* goes to to btrfs_balance.  bctl is freed in __cancel_balance,
>* or, if restriper was paused all the way until unmount, in
> -  * free_fs_info.  The flag is cleared in __cancel_balance.
> +  * free_fs_info.  The flag should be cleared after __cancel_balance.
>*/
>   need_unlock = false;
>  
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index e0bd181dc9e0..5c83ebc8e199 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3760,8 +3760,6 @@ static void __cancel_balance(struct btrfs_fs_info 
> *fs_info)
>   ret = del_balance_item(fs_info);
>   if (ret)
>   btrfs_handle_fs_error(fs_info, ret, NULL);
> -
> - clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
>  }
>  
>  /* Non-zero return value signifies invalidity */
> @@ -3919,6 +3917,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>   if ((ret && ret != -ECANCELED && ret != -ENOSPC) ||
>   balance_need_close(fs_info)) {
>   __cancel_balance(fs_info);
> + clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
>   }
>  
>   wake_up(_info->balance_wait_q);
> @@ -3927,10 +3926,10 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>  out:
>   if (bctl->flags & BTRFS_BALANCE_RESUME)
>   __cancel_balance(fs_info);
> - else {
> + else
>   kfree(bctl);
> - clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
> - }
> + clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
> +
>   return ret;
>  }
>  
> @@ -4089,8 +4088,10 @@ int btrfs_cancel_balance(struct btrfs_fs_info *fs_info)
>   mutex_lock(_info->volume_mutex);
>   mutex_lock(_info->balance_mutex);
>  
> - if (fs_info->balance_ctl)
> + if (fs_info->balance_ctl) {
>   __cancel_balance(fs_info);
> + clear_bit(BTRFS_FS_EXCL_OP, _info->flags);
> + }
>  
>   mutex_unlock(_info->volume_mutex);
>   }
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND v4 0/4] device_list_add() peparation to add reappearing missing device

2018-04-20 Thread Anand Jain



On 04/20/2018 02:22 PM, Gu, Jinxiang wrote:

Hi,

I reproduced this using kernel v4.17-rc1.
It is not always happens.( occurred times/test times: 1/20)


 Though it was reported here, its not related to this patch set.
 Instead its about the send receive.
 Pls post it as a separate thread so it can be followed up properly.

Thanks, Anand

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH RESEND v4 0/4] device_list_add() peparation to add reappearing missing device

2018-04-20 Thread Gu, Jinxiang
Hi,

I reproduced this using kernel v4.17-rc1.
It is not always happens.( occurred times/test times: 1/20)

> -Original Message-
> From: linux-btrfs-ow...@vger.kernel.org 
> [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Anand Jain
> Sent: Tuesday, January 23, 2018 5:53 AM
> To: dste...@suse.cz; linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH RESEND v4 0/4] device_list_add() peparation to add 
> reappearing missing device
> 
> 
> 
> On 01/22/2018 11:26 PM, David Sterba wrote:
> > On Mon, Jan 22, 2018 at 09:31:47PM +0800, Anand Jain wrote:
> >>Problem was mainly due to the patch 3/4, which tried to access the
> >>return pointer even for the failed condition. The fix is to bring the
> >>device point access under the else part as show below [2]. I have
> >>included this fix in V5. Which is tested with btrfs xfstests.
> >>Pls could you consider v5 for 4.16 ?
> >
> > Hm ok, thre's still some time to test it. One more fstests report that
> > appeared before and also with the v5:
> 
>   I will try to nail it down. It passes on bare metal and a VM here.
>   btrfs-progs: I am using your latest master at
>  (git://github.com/kdave/btrfs-progs.git).
> 
> 
> > btrfs/007 4s ...[16:38:09] [16:38:12] [failed, exit status 1] - 
> > output mismatch (see
> /root/test/mmtests/work/sources/xfstests-git-installed/results//btrfs/007.out.bad)
> >  --- tests/btrfs/007.out 2017-09-20 14:24:58.334716658 +0200
> >  +++ 
> > /root/test/mmtests/work/sources/xfstests-git-installed/results//btrfs/007.out.bad
> >2018-01-22 16:38:12.883931593
> +0100
> >  @@ -1,4 +1,5 @@
> >   QA output created by 007
> >   *** test send / receive
> >  -*** done
> >  +failed: 
> > '/root/test/mmtests/work/sources/xfstests-git-installed/src/fssum -r 
> > /tmp/tmp.eZcr17wqNn/incr.fssum
> /root/test/mmtests/scratch_mnt/incr'
> 
>   Looks like fssum on the reverse copied file failed.
> 
> >  +(see
> > /root/test/mmtests/work/sources/xfstests-git-installed/results//btrfs/
> > 007.full for details)
> 
>   Can you pls send me this ?
Please see the attachment.

And I confirmed btrfs/007 using kernel v4.16-rc1.
It also occurred sometimes
> 
> Thanks, Anand
> 
> >   *** unmount
> >  ...
> >  (Run 'diff -u tests/btrfs/007.out
> > /root/test/mmtests/work/sources/xfstests-git-installed/results//btrfs/
> > 007.out.bad'  to see the entire diff)
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
> > in the body of a message to majord...@vger.kernel.org More majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the 
> body of a message to majord...@vger.kernel.org More
> majordomo info at  http://vger.kernel.org/majordomo-info.html
> 





007.full
Description: 007.full