Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-08 Thread Dave Chinner
On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> > On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > > Hello,
> > > 
> > > One thing that comes up a lot every LSF is the fact that we have no 
> > > general way
> > > that we do performance testing.  Every fs developer has a set of scripts 
> > > or
> > > things that they run with varying degrees of consistency, but nothing 
> > > central
> > > that we all use.  I for one am getting tired of finding regressions when 
> > > we are
> > > deploying new kernels internally, so I wired this thing up to try and 
> > > address
> > > this need.
> > > 
> > > We all hate convoluted setups, the more brain power we have to put in to 
> > > setting
> > > something up the less likely we are to use it, so I took the xfstests 
> > > approach
> > > of making it relatively simple to get running and relatively easy to add 
> > > new
> > > tests.  For right now the only thing this framework does is run fio 
> > > scripts.  I
> > > chose fio because it already gathers loads of performance data about it's 
> > > runs.
> > > We have everything we need there, latency, bandwidth, cpu time, and all 
> > > broken
> > > down by reads, writes, and trims.  I figure most of us are familiar 
> > > enough with
> > > fio and how it works to make it relatively easy to add new tests to the
> > > framework.
> > > 
> > > I've posted my code up on github, you can get it here
> > > 
> > > https://github.com/josefbacik/fsperf
> > > 
> > > All (well most) of the results from fio are stored in a local sqlite 
> > > database.
> > > Right now the comparison stuff is very crude, it simply checks against the
> > > previous run and it only checks a few of the keys by default.  You can 
> > > check
> > > latency if you want, but while writing this stuff up it seemed that 
> > > latency was
> > > too variable from run to run to be useful in a "did my thing regress or 
> > > improve"
> > > sort of way.
> > > 
> > > The configuration is brain dead simple, the README has examples.  All you 
> > > need
> > > to do is make your local.cfg, run ./setup and then run ./fsperf and you 
> > > are good
> > > to go.
> > 
> > Why re-invent the test infrastructure? Why not just make it a
> > tests/perf subdir in fstests?
> > 
> 
> Probably should have led with that shouldn't I have?  There's nothing keeping 
> me
> from doing it, but I didn't want to try and shoehorn in a python thing into
> fstests.  I need python to do the sqlite and the json parsing to dump into the
> sqlite database.
> 
> Now if you (and others) are not opposed to this being dropped into tests/perf
> then I'll work that up.  But it's definitely going to need to be done in 
> python.
> I know you yourself have said you aren't opposed to using python in the past, 
> so
> if that's still the case then I can definitely wire it all up.

I have no problems with people using python for stuff like this but,
OTOH, I'm not the fstests maintainer anymore :P

> > > The plan is to add lots of workloads as we discover regressions and such. 
> > >  We
> > > don't want anything that takes too long to run otherwise people won't run 
> > > this,
> > > so the existing tests don't take much longer than a few minutes each.  I 
> > > will be
> > > adding some more comparison options so you can compare against averages 
> > > of all
> > > previous runs and such.
> > 
> > Yup, that fits exactly into what fstests is for... :P
> > 
> > Integrating into fstests means it will be immediately available to
> > all fs developers, it'll run on everything that everyone already has
> > setup for filesystem testing, and it will have familiar mkfs/mount
> > option setup behaviour so there's no new hoops for everyone to jump
> > through to run it...
> > 
> 
> TBF I specifically made it as easy as possible because I know we all hate 
> trying
> to learn new shit.

Yeah, it's also hard to get people to change their workflows to add
a whole new test harness into them. It's easy if it's just a new
command to an existing workflow :P

> I figured this was different enough to warrant a separate
> project, especially since I'm going to add block device jobs so Jens can test
> block layer things.  If we all agree we'd rather see this in fstests then I'm
> happy to do that too.  Thanks,

I'm not fussed either way - it's a good discussion to have, though.

If I want to add tests (e.g. my time-honoured fsmark tests), where
should I send patches?

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-08 Thread Theodore Ts'o
On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> 
> Probably should have led with that shouldn't I have?  There's nothing keeping 
> me
> from doing it, but I didn't want to try and shoehorn in a python thing into
> fstests.  I need python to do the sqlite and the json parsing to dump into the
> sqlite database.

What version of python are you using?  From inspection it looks like
some variant of python 3.x (you're using print as a function w/o using
"from __future import print_function") but it's not immediately
obvious from the top-level fsperf shell script what version of python
your scripts are dependant upon.

This could potentially be a bit of a portability issue across various
distributions --- RHEL doesn't ship with Python 3.x at all, and on
Debian you need to use python3 to get python 3.x, since
/usr/bin/python still points at Python 2.7 by default.  So I could see
this as a potential issue for xfstests.

I'm currently using Python 2.7 in my wrapper scripts for, among other
things, to parse xUnit XML format and create nice summaries like this:

ext4/4k: 337 tests, 6 failures, 21 skipped, 3814 seconds
  Failures: generic/232 generic/233 generic/361 generic/388
generic/451 generic/459

So I'm not opposed to python, but I will note that if you are using
modules from the Python Package Index, and they are modules which are
not packaged by your distribution (so you're using pip or easy_install
to pull them off the network), it does make doing hermetic builds from
trusted sources to be a bit trickier.

If you have a secops team who wants to know the provenance of software
which get thrown in production data centers (and automated downloading
from random external sources w/o any code review makes them break out
in hives), use of PyPI adds a new wrinkle.  It's not impossible to
solve, by any means, but it's something to consider.

- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] btrfs: fix use of error or warning for missing device

2017-10-08 Thread Anand Jain
When device is missing without the -o degraded option then
its an error so report it as an error instead of warning.
And when -o degraded option is provided, log the missing
device as warning.

Signed-off-by: Anand Jain 
---
v2: Rename from
[PATCH 2/2] btrfs: clean up btrfs_report_missing_device() usage
Also drop the idea of moving the DEGRADED option checking
into the function btrfs_report_missing_device() and further
renaming it to check_report_degraded(). If its such a thing
is good idea it can be added on top of this patch. Thxs.
 fs/btrfs/volumes.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6a041bef112c..c76a81938766 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6378,9 +6378,14 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info 
*fs_info,
 }
 
 static void btrfs_report_missing_device(struct btrfs_fs_info *fs_info,
-   u64 devid, u8 *uuid)
+   u64 devid, u8 *uuid, int error)
 {
-   btrfs_warn_rl(fs_info, "devid %llu uuid %pU is missing", devid, uuid);
+   if (error)
+   btrfs_err_rl(fs_info, "devid %llu uuid %pU is missing",
+ devid, uuid);
+   else
+   btrfs_warn_rl(fs_info, "devid %llu uuid %pU is missing",
+ devid, uuid);
 }
 
 static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
@@ -6453,7 +6458,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, 
struct btrfs_key *key,
if (!map->stripes[i].dev &&
!btrfs_test_opt(fs_info, DEGRADED)) {
free_extent_map(em);
-   btrfs_report_missing_device(fs_info, devid, uuid);
+   btrfs_report_missing_device(fs_info, devid, uuid, 1);
return -ENOENT;
}
if (!map->stripes[i].dev) {
@@ -6467,7 +6472,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, 
struct btrfs_key *key,
devid, PTR_ERR(map->stripes[i].dev));
return PTR_ERR(map->stripes[i].dev);
}
-   btrfs_report_missing_device(fs_info, devid, uuid);
+   btrfs_report_missing_device(fs_info, devid, uuid, 0);
}
map->stripes[i].dev->in_fs_metadata = 1;
}
@@ -6586,19 +6591,24 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
device = btrfs_find_device(fs_info, devid, dev_uuid, fs_uuid);
if (!device) {
if (!btrfs_test_opt(fs_info, DEGRADED)) {
-   btrfs_report_missing_device(fs_info, devid, dev_uuid);
+   btrfs_report_missing_device(fs_info, devid,
+   dev_uuid, 1);
return -ENOENT;
}
 
device = add_missing_dev(fs_devices, devid, dev_uuid);
if (IS_ERR(device))
return PTR_ERR(device);
-   btrfs_report_missing_device(fs_info, devid, dev_uuid);
+   btrfs_report_missing_device(fs_info, devid, dev_uuid, 0);
} else {
if (!device->bdev) {
-   btrfs_report_missing_device(fs_info, devid, dev_uuid);
-   if (!btrfs_test_opt(fs_info, DEGRADED))
+   if (!btrfs_test_opt(fs_info, DEGRADED)) {
+   btrfs_report_missing_device(fs_info,
+   devid, dev_uuid, 1);
return -ENOENT;
+   }
+   btrfs_report_missing_device(fs_info, devid,
+   dev_uuid, 0);
}
 
if(!device->bdev && !device->missing) {
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/4] btrfs: add_missing_dev() should return the actual error

2017-10-08 Thread Anand Jain
add_missing_dev() can return device pointer so that IS_ERR/
PTR_ERR can be used to check for the actual error occurred
in the function.

Signed-off-by: Anand Jain 
---
v2: This patch is a split from
[PATCH 1/2] btrfs: fix read_one_chunk() return error code

 fs/btrfs/volumes.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0e8f16c305df..2f500a32089e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6249,7 +6249,7 @@ static struct btrfs_device *add_missing_dev(struct 
btrfs_fs_devices *fs_devices,
 
device = btrfs_alloc_device(NULL, , dev_uuid);
if (IS_ERR(device))
-   return NULL;
+   return device;
 
list_add(>dev_list, _devices->devices);
device->fs_devices = fs_devices;
@@ -6454,9 +6454,12 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, 
struct btrfs_key *key,
map->stripes[i].dev =
add_missing_dev(fs_info->fs_devices, devid,
uuid);
-   if (!map->stripes[i].dev) {
+   if (IS_ERR(map->stripes[i].dev)) {
free_extent_map(em);
-   return -EIO;
+   btrfs_err(fs_info,
+   "failed to init missing dev %llu %ld",
+   devid, PTR_ERR(map->stripes[i].dev));
+   return PTR_ERR(map->stripes[i].dev);
}
btrfs_report_missing_device(fs_info, devid, uuid);
}
@@ -6582,8 +6585,8 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
}
 
device = add_missing_dev(fs_devices, devid, dev_uuid);
-   if (!device)
-   return -ENOMEM;
+   if (IS_ERR(device))
+   return PTR_ERR(device);
btrfs_report_missing_device(fs_info, devid, dev_uuid);
} else {
if (!device->bdev) {
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] btrfs: fix EIO misuse to report missing degraded option

2017-10-08 Thread Anand Jain
EIO is only for the IO failure to the device, avoid it.

Signed-off-by: Anand Jain 
---
v2: This patch is renamed from
 [PATCH 1/2] btrfs: fix read_one_chunk() return error code

 fs/btrfs/volumes.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2f500a32089e..844ae25cff9e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6448,7 +6448,7 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, 
struct btrfs_key *key,
!btrfs_test_opt(fs_info, DEGRADED)) {
free_extent_map(em);
btrfs_report_missing_device(fs_info, devid, uuid);
-   return -EIO;
+   return -ENOENT;
}
if (!map->stripes[i].dev) {
map->stripes[i].dev =
@@ -6581,7 +6581,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
if (!device) {
if (!btrfs_test_opt(fs_info, DEGRADED)) {
btrfs_report_missing_device(fs_info, devid, dev_uuid);
-   return -EIO;
+   return -ENOENT;
}
 
device = add_missing_dev(fs_devices, devid, dev_uuid);
@@ -6592,7 +6592,7 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
if (!device->bdev) {
btrfs_report_missing_device(fs_info, devid, dev_uuid);
if (!btrfs_test_opt(fs_info, DEGRADED))
-   return -EIO;
+   return -ENOENT;
}
 
if(!device->bdev && !device->missing) {
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] btrfs: declare btrfs_report_missing_device() static

2017-10-08 Thread Anand Jain
Signed-off-by: Anand Jain 
---
v2: This patch is a split from
[PATCH 2/2] btrfs: clean up btrfs_report_missing_device() usage

 fs/btrfs/volumes.c | 12 ++--
 fs/btrfs/volumes.h |  2 --
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 844ae25cff9e..6a041bef112c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6377,6 +6377,12 @@ static int btrfs_check_chunk_valid(struct btrfs_fs_info 
*fs_info,
return 0;
 }
 
+static void btrfs_report_missing_device(struct btrfs_fs_info *fs_info,
+   u64 devid, u8 *uuid)
+{
+   btrfs_warn_rl(fs_info, "devid %llu uuid %pU is missing", devid, uuid);
+}
+
 static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
  struct extent_buffer *leaf,
  struct btrfs_chunk *chunk)
@@ -6759,12 +6765,6 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
return -EIO;
 }
 
-void btrfs_report_missing_device(struct btrfs_fs_info *fs_info, u64 devid,
-u8 *uuid)
-{
-   btrfs_warn_rl(fs_info, "devid %llu uuid %pU is missing", devid, uuid);
-}
-
 /*
  * Check if all chunks in the fs are OK for read-write degraded mount
  *
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 6108fdfec67f..ff15208344a7 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -542,7 +542,5 @@ void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
 
 bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
-void btrfs_report_missing_device(struct btrfs_fs_info *fs_info, u64 devid,
-u8 *uuid);
 
 #endif
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-08 Thread Josef Bacik
On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > Hello,
> > 
> > One thing that comes up a lot every LSF is the fact that we have no general 
> > way
> > that we do performance testing.  Every fs developer has a set of scripts or
> > things that they run with varying degrees of consistency, but nothing 
> > central
> > that we all use.  I for one am getting tired of finding regressions when we 
> > are
> > deploying new kernels internally, so I wired this thing up to try and 
> > address
> > this need.
> > 
> > We all hate convoluted setups, the more brain power we have to put in to 
> > setting
> > something up the less likely we are to use it, so I took the xfstests 
> > approach
> > of making it relatively simple to get running and relatively easy to add new
> > tests.  For right now the only thing this framework does is run fio 
> > scripts.  I
> > chose fio because it already gathers loads of performance data about it's 
> > runs.
> > We have everything we need there, latency, bandwidth, cpu time, and all 
> > broken
> > down by reads, writes, and trims.  I figure most of us are familiar enough 
> > with
> > fio and how it works to make it relatively easy to add new tests to the
> > framework.
> > 
> > I've posted my code up on github, you can get it here
> > 
> > https://github.com/josefbacik/fsperf
> > 
> > All (well most) of the results from fio are stored in a local sqlite 
> > database.
> > Right now the comparison stuff is very crude, it simply checks against the
> > previous run and it only checks a few of the keys by default.  You can check
> > latency if you want, but while writing this stuff up it seemed that latency 
> > was
> > too variable from run to run to be useful in a "did my thing regress or 
> > improve"
> > sort of way.
> > 
> > The configuration is brain dead simple, the README has examples.  All you 
> > need
> > to do is make your local.cfg, run ./setup and then run ./fsperf and you are 
> > good
> > to go.
> 
> Why re-invent the test infrastructure? Why not just make it a
> tests/perf subdir in fstests?
> 

Probably should have led with that shouldn't I have?  There's nothing keeping me
from doing it, but I didn't want to try and shoehorn in a python thing into
fstests.  I need python to do the sqlite and the json parsing to dump into the
sqlite database.

Now if you (and others) are not opposed to this being dropped into tests/perf
then I'll work that up.  But it's definitely going to need to be done in python.
I know you yourself have said you aren't opposed to using python in the past, so
if that's still the case then I can definitely wire it all up.

> > The plan is to add lots of workloads as we discover regressions and such.  
> > We
> > don't want anything that takes too long to run otherwise people won't run 
> > this,
> > so the existing tests don't take much longer than a few minutes each.  I 
> > will be
> > adding some more comparison options so you can compare against averages of 
> > all
> > previous runs and such.
> 
> Yup, that fits exactly into what fstests is for... :P
> 
> Integrating into fstests means it will be immediately available to
> all fs developers, it'll run on everything that everyone already has
> setup for filesystem testing, and it will have familiar mkfs/mount
> option setup behaviour so there's no new hoops for everyone to jump
> through to run it...
> 

TBF I specifically made it as easy as possible because I know we all hate trying
to learn new shit.  I figured this was different enough to warrant a separate
project, especially since I'm going to add block device jobs so Jens can test
block layer things.  If we all agree we'd rather see this in fstests then I'm
happy to do that too.  Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] btrfs: Use bd_dev to generate index when dev_state_hashtable add items.

2017-10-08 Thread Gu, Jinxiang
Hi,
sorry for reply too late.

This patch fix for the bug descriped below.
Use MOUNT_OPTIONS="-o check_int" when run xfstest, device
Can not be mount successfully. So xfstest can not run.



> -Original Message-
> From: David Sterba [mailto:dste...@suse.cz]
> Sent: Saturday, September 30, 2017 2:05 AM
> To: Gu, Jinxiang/顾 金香 
> Cc: linux-btrfs@vger.kernel.org; h...@lst.de
> Subject: Re: [PATCH] btrfs: Use bd_dev to generate index when 
> dev_state_hashtable add items.
> 
> On Fri, Sep 29, 2017 at 04:16:35PM +0800, Gu Jinxiang wrote:
> > From: Gu JinXiang 
> >
> > Fix bug of
> >  ().
> 
> Please explain what bug are you fixing.
> 





[PATCH v4 1/5] btrfs: Move leaf and node validation checker to tree-checker.ch

2017-10-08 Thread Qu Wenruo
It's no doubt the comprehensive tree block checker will become larger
and larger, so moving them into their own files is quite reasonable.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/disk-io.c  | 285 +---
 fs/btrfs/tree-checker.c | 310 
 fs/btrfs/tree-checker.h |  23 
 4 files changed, 338 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c
 create mode 100644 fs/btrfs/tree-checker.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 962a95aefb81..88255e133ade 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o zstd.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-  uuid-tree.o props.o hash.o free-space-tree.o
+  uuid-tree.o props.o hash.o free-space-tree.o tree-checker.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c8633f2abdf1..09407384e996 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -50,6 +50,7 @@
 #include "sysfs.h"
 #include "qgroup.h"
 #include "compression.h"
+#include "tree-checker.h"
 
 #ifdef CONFIG_X86
 #include 
@@ -543,284 +544,6 @@ static int check_tree_block_fsid(struct btrfs_fs_info 
*fs_info,
return ret;
 }
 
-#define CORRUPT(reason, eb, root, slot)
\
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
-static int check_extent_data_item(struct btrfs_root *root,
- struct extent_buffer *leaf,
- struct btrfs_key *key, int slot)
-{
-   struct btrfs_file_extent_item *fi;
-   u32 sectorsize = root->fs_info->sectorsize;
-   u32 item_size = btrfs_item_size_nr(leaf, slot);
-
-   if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
-
-   if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /*
-* Support for new compression/encrption must introduce incompat flag,
-* and must be caught in open_ctree().
-*/
-   if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
-   /* Inline extent must have 0 as key offset */
-   if (key->offset) {
-   CORRUPT("inline extent has non-zero key offset",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-
-   /* Compressed inline extent has no on-disk size, skip it */
-   if (btrfs_file_extent_compression(leaf, fi) !=
-   BTRFS_COMPRESS_NONE)
-   return 0;
-
-   /* Uncompressed inline extent size must match item size */
-   if (item_size != BTRFS_FILE_EXTENT_INLINE_DATA_START +
-   btrfs_file_extent_ram_bytes(leaf, fi)) {
-   CORRUPT("plaintext inline extent has invalid size",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   return 0;
-   }
-
-   /* Regular or preallocated extent has fixed item size */
-   if (item_size != sizeof(*fi)) {
-   CORRUPT(
-   "regluar or preallocated extent data item size is invalid",
-   leaf, root, slot);
-   return -EUCLEAN;
-   }
-   if (!IS_ALIGNED(btrfs_file_extent_ram_bytes(leaf, fi), sectorsize) ||
-   !IS_ALIGNED(btrfs_file_extent_disk_bytenr(leaf, fi), sectorsize) ||
-   !IS_ALIGNED(btrfs_file_extent_disk_num_bytes(leaf, fi), sectorsize) 
||
-   !IS_ALIGNED(btrfs_file_extent_offset(leaf, fi), sectorsize) ||
-  

[PATCH v4 4/5] btrfs: tree-checker: Enhance output for check_csum_item

2017-10-08 Thread Qu Wenruo
Output the bad value and expected good value (or its alignment).

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 7bba195ecc8b..42e002f8a560 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -164,15 +164,21 @@ static int check_csum_item(struct btrfs_root *root, 
struct extent_buffer *leaf,
u32 csumsize = btrfs_super_csum_size(root->fs_info->super_copy);
 
if (key->objectid != BTRFS_EXTENT_CSUM_OBJECTID) {
-   CORRUPT("invalid objectid for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "invalid key objectid for csum item, have %llu expect 
%llu",
+   key->objectid, BTRFS_EXTENT_CSUM_OBJECTID);
return -EUCLEAN;
}
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for csum item", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned key offset for csum item, have %llu should 
be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
if (!IS_ALIGNED(btrfs_item_size_nr(leaf, slot), csumsize)) {
-   CORRUPT("unaligned csum item size", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "unaligned item size for csum item, have %u should be 
aligned to %u",
+   btrfs_item_size_nr(leaf, slot), csumsize);
return -EUCLEAN;
}
return 0;
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 5/5] btrfs: tree-checker: Enhance output for check_extent_data_item

2017-10-08 Thread Qu Wenruo
Output the invalid member name and its bad value, along with its
expected value range or alignment.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 98 +++--
 1 file changed, 70 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 42e002f8a560..c7a09cc2a17c 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -32,12 +32,6 @@
 #include "disk-io.h"
 #include "compression.h"
 
-#define CORRUPT(reason, eb, root, slot)
\
-   btrfs_crit(root->fs_info,   \
-  "corrupt %s, %s: block=%llu, root=%llu, slot=%d",\
-  btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
-  reason, btrfs_header_bytenr(eb), root->objectid, slot)
-
 /*
  * Error message should follow the format below:
  * corrupt : , [, ]
@@ -80,6 +74,47 @@ static void generic_err(const struct btrfs_root *root,
va_end(args);
 }
 
+/*
+ * Customized reporter for extent data item, since its key objectid and
+ * offset has its own meaning.
+ */
+__printf(4, 5)
+static void file_extent_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct btrfs_key key;
+   struct va_format vaf;
+   va_list args;
+
+   btrfs_item_key_to_cpu(eb, , slot);
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = 
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d ino=%llu 
file_offset=%llu, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   key.objectid, key.offset, );
+   va_end(args);
+}
+
+/*
+ * Return 0 if the btrfs_file_extent_##name is aligned to @alignment
+ * Else return 1
+ */
+#define CHECK_FE_ALIGNED(root, leaf, slot, fi, name, alignment)
\
+({ \
+   if (!IS_ALIGNED(btrfs_file_extent_##name((leaf), (fi)), (alignment)))\
+   file_extent_err((root), (leaf), (slot), \
+   "invalid %s for file extent, have %llu, should be 
aligned to %u",\
+   (#name), btrfs_file_extent_##name((leaf), (fi)),\
+   (alignment));   \
+   (!IS_ALIGNED(btrfs_file_extent_##name((leaf), (fi)), (alignment)));\
+})
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -89,15 +124,19 @@ static int check_extent_data_item(struct btrfs_root *root,
u32 item_size = btrfs_item_size_nr(leaf, slot);
 
if (!IS_ALIGNED(key->offset, sectorsize)) {
-   CORRUPT("unaligned key offset for file extent",
-   leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "unaligned file_offset for file extent, have %llu 
should be aligned to %u",
+   key->offset, sectorsize);
return -EUCLEAN;
}
 
fi = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item);
 
if (btrfs_file_extent_type(leaf, fi) > BTRFS_FILE_EXTENT_TYPES) {
-   CORRUPT("invalid file extent type", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid type for file extent, have %u expect range [0, 
%u]",
+   btrfs_file_extent_type(leaf, fi),
+   BTRFS_FILE_EXTENT_TYPES);
return -EUCLEAN;
}
 
@@ -106,18 +145,24 @@ static int check_extent_data_item(struct btrfs_root *root,
 * and must be caught in open_ctree().
 */
if (btrfs_file_extent_compression(leaf, fi) > BTRFS_COMPRESS_TYPES) {
-   CORRUPT("invalid file extent compression", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid compression for file extent, have %u expect 
range [0, %u]",
+   btrfs_file_extent_compression(leaf, fi),
+   BTRFS_COMPRESS_TYPES);
return -EUCLEAN;
}
if (btrfs_file_extent_encryption(leaf, fi)) {
-   CORRUPT("invalid file extent encryption", leaf, root, slot);
+   file_extent_err(root, leaf, slot,
+   "invalid encryption for file extent, have %u expect 0",
+   btrfs_file_extent_encryption(leaf, fi));
return -EUCLEAN;
}
if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) {
/* Inline extent must have 0 as key offset */

[PATCH v4 0/5] Enhance tree block validation checker

2017-10-08 Thread Qu Wenruo
The patchset can be fetched from github:
https://github.com/adam900710/linux/tree/checker_enhance

It's based on David's misc-next branch, with following commit as base:
a5e50b4b444c ("btrfs: Add checker for EXTENT_CSUM")

According to David's suggestion, enhance the output format of tree block
validation checker.

And move them into separate files: tree-checker.[ch].

Also added a output format rule to try to make all output message
follow the same format.

Some example output using btrfs-progs fsck-test images looks like:

For unagliend file extent member:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=7 
ino=257 file_offset=0, invalid disk_bytenr for file extent, have 755944791, 
should be aligned to 4096
---

For bad leaf holes:
---
BTRFS critical (device loop0): corrupt leaf: root=1 block=29360128 slot=28, 
discontinious item end, have 9387 expect 15018
---

Changelog:
v2:
  Unify the error string format, so it should be easier to grep them
  from dmesg. Thanks Nikolay for pointing this out.
  Remove unused CORRUPT() macro.
v3:
  Replace EIO with EUCLEAN in 2nd patch. Thanks Nikolay for pointing
  this out.
  Correct "btrfs-progs:" to "btrfs:" for 1st patch.
v4:
  Code style change suggested by David.
  Use more easy-to-understand error message for NULL node pointer,
  suggested by Nikolay.
  Helper macro enhancement, including naming change and argument
  protection suggested by David.
  Separate tree-checker.h suggested by David.


Qu Wenruo (5):
  btrfs: Move leaf and node validation checker to tree-checker.ch
  btrfs: tree-checker: Enhance btrfs_check_node output
  btrfs: tree-checker: Enhance output for btrfs_check_leaf
  btrfs: tree-checker: Enhance output for check_csum_item
  btrfs: tree-checker: Enhance output for check_extent_data_item

 fs/btrfs/Makefile   |   2 +-
 fs/btrfs/disk-io.c  | 285 +---
 fs/btrfs/tree-checker.c | 429 
 fs/btrfs/tree-checker.h |  23 +++
 4 files changed, 457 insertions(+), 282 deletions(-)
 create mode 100644 fs/btrfs/tree-checker.c
 create mode 100644 fs/btrfs/tree-checker.h

-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/5] btrfs: tree-checker: Enhance output for btrfs_check_leaf

2017-10-08 Thread Qu Wenruo
Enhance the output to print:
1) Reason
2) Bad value
   If reason can't explain enough
3) Good value (range)

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index b4ced8d3ce2a..7bba195ecc8b 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -233,8 +233,9 @@ int btrfs_check_leaf(struct btrfs_root *root, struct 
extent_buffer *leaf)
eb = btrfs_root_node(check_root);
/* if leaf is the root, then it's fine */
if (leaf != eb) {
-   CORRUPT("non-root leaf's nritems is 0",
-   leaf, check_root, 0);
+   generic_err(check_root, leaf, 0,
+   "invalid nritems, have %u shouldn't be 
0 for non-root leaf",
+   nritems);
free_extent_buffer(eb);
return -EUCLEAN;
}
@@ -265,7 +266,11 @@ int btrfs_check_leaf(struct btrfs_root *root, struct 
extent_buffer *leaf)
 
/* Make sure the keys are in the right order */
if (btrfs_comp_cpu_keys(_key, ) >= 0) {
-   CORRUPT("bad key order", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "bad key order, prev key (%llu %u %llu) current 
key (%llu %u %llu)",
+   prev_key.objectid, prev_key.type,
+   prev_key.offset, key.objectid, key.type,
+   key.offset);
return -EUCLEAN;
}
 
@@ -280,7 +285,10 @@ int btrfs_check_leaf(struct btrfs_root *root, struct 
extent_buffer *leaf)
item_end_expected = btrfs_item_offset_nr(leaf,
 slot - 1);
if (btrfs_item_end_nr(leaf, slot) != item_end_expected) {
-   CORRUPT("slot offset bad", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "discontinious item end, have %u expect %u",
+   btrfs_item_end_nr(leaf, slot),
+   item_end_expected);
return -EUCLEAN;
}
 
@@ -291,14 +299,21 @@ int btrfs_check_leaf(struct btrfs_root *root, struct 
extent_buffer *leaf)
 */
if (btrfs_item_end_nr(leaf, slot) >
BTRFS_LEAF_DATA_SIZE(fs_info)) {
-   CORRUPT("slot end outside of leaf", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot end outside of leaf, have %u expect range 
[0, %u]",
+   btrfs_item_end_nr(leaf, slot),
+   BTRFS_LEAF_DATA_SIZE(fs_info));
return -EUCLEAN;
}
 
/* Also check if the item pointer overlaps with btrfs item. */
if (btrfs_item_nr_offset(slot) + sizeof(struct btrfs_item) >
btrfs_item_ptr_offset(leaf, slot)) {
-   CORRUPT("slot overlap with its data", leaf, root, slot);
+   generic_err(root, leaf, slot,
+   "slot overlap with its data, item end %lu data 
start %lu",
+   btrfs_item_nr_offset(slot) +
+   sizeof(struct btrfs_item),
+   btrfs_item_ptr_offset(leaf, slot));
return -EUCLEAN;
}
 
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/5] btrfs: tree-checker: Enhance btrfs_check_node output

2017-10-08 Thread Qu Wenruo
Use inline function to replace macro since we don't need
stringification.
(Macro still exist until all caller get updated)

And add more info about the error, and replace EIO with EUCLEAN.

For nr_items error, report if it's too large or too small, and output
valid value range.

For node block pointer, added a new alignment checker.

For key order, also output the next key to make the problem more
obvious.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/tree-checker.c | 70 -
 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 63307e8426a6..b4ced8d3ce2a 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -38,6 +38,48 @@
   btrfs_header_level(eb) == 0 ? "leaf" : "node",   \
   reason, btrfs_header_bytenr(eb), root->objectid, slot)
 
+/*
+ * Error message should follow the format below:
+ * corrupt : , [, ]
+ *
+ * @type:  Either leaf or node
+ * @identifier:The necessary info to locate the leaf/node.
+ * It's recommened to decode key.objecitd/offset if it's
+ * meaningful.
+ * @reason:What's wrong
+ * @bad_value: Optional, it's recommened to output bad value and its
+ * expected value (range).
+ *
+ * Since comma is used to separate the components, only SPACE is allowed
+ * inside each component.
+ */
+
+/*
+ * Append the generic "corrupt leaf/node root=%llu block=%llu slot=%d: " to
+ * @fmt.
+ * Allowing user to customize their output.
+ */
+__printf(4, 5)
+static void generic_err(const struct btrfs_root *root,
+   const struct extent_buffer *eb,
+   int slot, const char *fmt, ...)
+{
+   struct va_format vaf;
+   va_list args;
+
+   va_start(args, fmt);
+
+   vaf.fmt = fmt;
+   vaf.va = 
+
+   btrfs_crit(root->fs_info,
+   "corrupt %s: root=%llu block=%llu slot=%d, %pV",
+   btrfs_header_level(eb) == 0 ? "leaf" : "node",
+   root->objectid, btrfs_header_bytenr(eb), slot,
+   );
+   va_end(args);
+}
+
 static int check_extent_data_item(struct btrfs_root *root,
  struct extent_buffer *leaf,
  struct btrfs_key *key, int slot)
@@ -283,9 +325,11 @@ int btrfs_check_node(struct btrfs_root *root, struct 
extent_buffer *node)
 
if (nr == 0 || nr > BTRFS_NODEPTRS_PER_BLOCK(root->fs_info)) {
btrfs_crit(root->fs_info,
-  "corrupt node: block %llu root %llu nritems %lu",
-  node->start, root->objectid, nr);
-   return -EIO;
+  "corrupt node: root=%llu block=%llu, nritems too %s, 
have %lu expect range [1,%u]",
+  root->objectid, node->start,
+  nr == 0 ? "small" : "large", nr,
+  BTRFS_NODEPTRS_PER_BLOCK(root->fs_info));
+   return -EUCLEAN;
}
 
for (slot = 0; slot < nr - 1; slot++) {
@@ -294,14 +338,26 @@ int btrfs_check_node(struct btrfs_root *root, struct 
extent_buffer *node)
btrfs_node_key_to_cpu(node, _key, slot + 1);
 
if (!bytenr) {
-   CORRUPT("invalid item slot", node, root, slot);
-   ret = -EIO;
+   generic_err(root, node, slot,
+   "invalid NULL node pointer");
+   ret = -EUCLEAN;
+   goto out;
+   }
+   if (!IS_ALIGNED(bytenr, root->fs_info->sectorsize)) {
+   generic_err(root, node, slot,
+   "unaligned pointer, have %llu should be aligned 
to %u",
+   bytenr, root->fs_info->sectorsize);
+   ret = -EUCLEAN;
goto out;
}
 
if (btrfs_comp_cpu_keys(, _key) >= 0) {
-   CORRUPT("bad key order", node, root, slot);
-   ret = -EIO;
+   generic_err(root, node, slot,
+   "bad key order, current key (%llu %u %llu) next 
key (%llu %u %llu)",
+   key.objectid, key.type, key.offset,
+   next_key.objectid, next_key.type,
+   next_key.offset);
+   ret = -EUCLEAN;
goto out;
}
}
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

2017-10-08 Thread Dave Chinner
On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> Hello,
> 
> One thing that comes up a lot every LSF is the fact that we have no general 
> way
> that we do performance testing.  Every fs developer has a set of scripts or
> things that they run with varying degrees of consistency, but nothing central
> that we all use.  I for one am getting tired of finding regressions when we 
> are
> deploying new kernels internally, so I wired this thing up to try and address
> this need.
> 
> We all hate convoluted setups, the more brain power we have to put in to 
> setting
> something up the less likely we are to use it, so I took the xfstests approach
> of making it relatively simple to get running and relatively easy to add new
> tests.  For right now the only thing this framework does is run fio scripts.  
> I
> chose fio because it already gathers loads of performance data about it's 
> runs.
> We have everything we need there, latency, bandwidth, cpu time, and all broken
> down by reads, writes, and trims.  I figure most of us are familiar enough 
> with
> fio and how it works to make it relatively easy to add new tests to the
> framework.
> 
> I've posted my code up on github, you can get it here
> 
> https://github.com/josefbacik/fsperf
> 
> All (well most) of the results from fio are stored in a local sqlite database.
> Right now the comparison stuff is very crude, it simply checks against the
> previous run and it only checks a few of the keys by default.  You can check
> latency if you want, but while writing this stuff up it seemed that latency 
> was
> too variable from run to run to be useful in a "did my thing regress or 
> improve"
> sort of way.
> 
> The configuration is brain dead simple, the README has examples.  All you need
> to do is make your local.cfg, run ./setup and then run ./fsperf and you are 
> good
> to go.

Why re-invent the test infrastructure? Why not just make it a
tests/perf subdir in fstests?

> The plan is to add lots of workloads as we discover regressions and such.  We
> don't want anything that takes too long to run otherwise people won't run 
> this,
> so the existing tests don't take much longer than a few minutes each.  I will 
> be
> adding some more comparison options so you can compare against averages of all
> previous runs and such.

Yup, that fits exactly into what fstests is for... :P

Integrating into fstests means it will be immediately available to
all fs developers, it'll run on everything that everyone already has
setup for filesystem testing, and it will have familiar mkfs/mount
option setup behaviour so there's no new hoops for everyone to jump
through to run it...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "BTRFS error (device vda1): couldn't get super buffer head for bytenr x"

2017-10-08 Thread Nick Gilmour
> I don't see a 'btrfs filesystem resize' command in your sequence. Did
> you actually resize the file system before you resized the underlying
> (virtual) block device?

OK. I guess, this is it. I didn't do any 'btrfs filesystem resize' .
The guides I was following didn't mention something like that. I was
assuming that if it works for other FS like this it should also work
for BTRFS. So I have to run this one:

# btrfs filesystem resize -350g /home ?

Should it be before or after I shrink the .img?

> Is this before or after the resize?

 It's after the supposed resize. I shoud have seen that BTRFS didn't
notice the change...

On Sun, Oct 8, 2017 at 11:03 PM, Chris Murphy  wrote:
> On Sun, Oct 8, 2017 at 4:39 PM, Nick Gilmour  wrote:
>> Thanks for the reply!
>>
>> For conversion I used this command:
>> $ vboxmanage internalcommands converttoraw mydisk.vdi mydisk.img
>>
>> and for resizing this one:
>> $ qemu-img resize mydisk.img 150G
>>
>> Is there something I can do to fix this or another way to do the
>> resizing without getting an error?
>
> I don't see a 'btrfs filesystem resize' command in your sequence. Did
> you actually resize the file system before you resized the underlying
> (virtual) block device?
>
>
>
 btrfs fi show
 Label: none uuid: x
Total devices 1 FS bytes used 473.68GiB
 devid 1 size 492.00 GiB used 492.00GiB path /dev/sda1
>
> Is this before or after the resize?
>
>
>
>
> --
> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "BTRFS error (device vda1): couldn't get super buffer head for bytenr x"

2017-10-08 Thread Chris Murphy
On Sun, Oct 8, 2017 at 4:39 PM, Nick Gilmour  wrote:
> Thanks for the reply!
>
> For conversion I used this command:
> $ vboxmanage internalcommands converttoraw mydisk.vdi mydisk.img
>
> and for resizing this one:
> $ qemu-img resize mydisk.img 150G
>
> Is there something I can do to fix this or another way to do the
> resizing without getting an error?

I don't see a 'btrfs filesystem resize' command in your sequence. Did
you actually resize the file system before you resized the underlying
(virtual) block device?



>>> btrfs fi show
>>> Label: none uuid: x
>>>Total devices 1 FS bytes used 473.68GiB
>>> devid 1 size 492.00 GiB used 492.00GiB path /dev/sda1

Is this before or after the resize?




-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs: prefix sysfs attribute struct names

2017-10-08 Thread Hans van Kranenburg
Currently struct names for sysfs are generated only based on the
attribute names. This means that attribute names cannot be reused in
multiple places throughout the complete btrfs sysfs hierarchy.

E.g. allocation/data/total_bytes and allocation/data/single/total_bytes
result in the same struct name btrfs_attr_total_bytes. A workaround for
this case was made in the past by ad hoc creating an extra macro
wrapper, BTRFS_RAID_ATTR, that inserts some extra text in the struct
name.

Instead of polluting sysfs.h with such kind of extra macro definitions,
and only doing so when there are collisions, use a prefix which gets
inserted in the struct name, so we keep everything nicely grouped
together by default.

Current collections of attributes are:
* (the toplevel, empty prefix)
* allocation
* space_info
* raid
* features

Signed-off-by: Hans van Kranenburg 
---
 fs/btrfs/sysfs.c | 64 +---
 fs/btrfs/sysfs.h | 26 ++-
 2 files changed, 44 insertions(+), 46 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 883881b16c86..37ad89f9facb 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -247,7 +247,7 @@ static ssize_t global_rsv_size_show(struct kobject *kobj,
struct btrfs_block_rsv *block_rsv = _info->global_block_rsv;
return btrfs_show_u64(_rsv->size, _rsv->lock, buf);
 }
-BTRFS_ATTR(global_rsv_size, global_rsv_size_show);
+BTRFS_ATTR(allocation, global_rsv_size, global_rsv_size_show);
 
 static ssize_t global_rsv_reserved_show(struct kobject *kobj,
struct kobj_attribute *a, char *buf)
@@ -256,15 +256,15 @@ static ssize_t global_rsv_reserved_show(struct kobject 
*kobj,
struct btrfs_block_rsv *block_rsv = _info->global_block_rsv;
return btrfs_show_u64(_rsv->reserved, _rsv->lock, buf);
 }
-BTRFS_ATTR(global_rsv_reserved, global_rsv_reserved_show);
+BTRFS_ATTR(allocation, global_rsv_reserved, global_rsv_reserved_show);
 
 #define to_space_info(_kobj) container_of(_kobj, struct btrfs_space_info, kobj)
 #define to_raid_kobj(_kobj) container_of(_kobj, struct raid_kobject, kobj)
 
 static ssize_t raid_bytes_show(struct kobject *kobj,
   struct kobj_attribute *attr, char *buf);
-BTRFS_RAID_ATTR(total_bytes, raid_bytes_show);
-BTRFS_RAID_ATTR(used_bytes, raid_bytes_show);
+BTRFS_ATTR(raid, total_bytes, raid_bytes_show);
+BTRFS_ATTR(raid, used_bytes, raid_bytes_show);
 
 static ssize_t raid_bytes_show(struct kobject *kobj,
   struct kobj_attribute *attr, char *buf)
@@ -277,7 +277,8 @@ static ssize_t raid_bytes_show(struct kobject *kobj,
 
down_read(>groups_sem);
list_for_each_entry(block_group, >block_groups[index], list) {
-   if (>attr == BTRFS_RAID_ATTR_PTR(total_bytes))
+   if (>attr == \
+   BTRFS_ATTR_PTR(raid, total_bytes))
val += block_group->key.offset;
else
val += btrfs_block_group_used(_group->item);
@@ -287,8 +288,8 @@ static ssize_t raid_bytes_show(struct kobject *kobj,
 }
 
 static struct attribute *raid_attributes[] = {
-   BTRFS_RAID_ATTR_PTR(total_bytes),
-   BTRFS_RAID_ATTR_PTR(used_bytes),
+   BTRFS_ATTR_PTR(raid, total_bytes),
+   BTRFS_ATTR_PTR(raid, used_bytes),
NULL
 };
 
@@ -311,7 +312,7 @@ static ssize_t btrfs_space_info_show_##field(struct kobject 
*kobj,  \
struct btrfs_space_info *sinfo = to_space_info(kobj);   \
return btrfs_show_u64(>field, >lock, buf);\
 }  \
-BTRFS_ATTR(field, btrfs_space_info_show_##field)
+BTRFS_ATTR(space_info, field, btrfs_space_info_show_##field)
 
 static ssize_t btrfs_space_info_show_total_bytes_pinned(struct kobject *kobj,
   struct kobj_attribute *a,
@@ -331,19 +332,20 @@ SPACE_INFO_ATTR(bytes_may_use);
 SPACE_INFO_ATTR(bytes_readonly);
 SPACE_INFO_ATTR(disk_used);
 SPACE_INFO_ATTR(disk_total);
-BTRFS_ATTR(total_bytes_pinned, btrfs_space_info_show_total_bytes_pinned);
+BTRFS_ATTR(space_info, total_bytes_pinned, \
+  btrfs_space_info_show_total_bytes_pinned);
 
 static struct attribute *space_info_attrs[] = {
-   BTRFS_ATTR_PTR(flags),
-   BTRFS_ATTR_PTR(total_bytes),
-   BTRFS_ATTR_PTR(bytes_used),
-   BTRFS_ATTR_PTR(bytes_pinned),
-   BTRFS_ATTR_PTR(bytes_reserved),
-   BTRFS_ATTR_PTR(bytes_may_use),
-   BTRFS_ATTR_PTR(bytes_readonly),
-   BTRFS_ATTR_PTR(disk_used),
-   BTRFS_ATTR_PTR(disk_total),
-   BTRFS_ATTR_PTR(total_bytes_pinned),
+   BTRFS_ATTR_PTR(space_info, flags),
+   BTRFS_ATTR_PTR(space_info, total_bytes),
+   BTRFS_ATTR_PTR(space_info, bytes_used),
+   BTRFS_ATTR_PTR(space_info, bytes_pinned),
+   BTRFS_ATTR_PTR(space_info, bytes_reserved),
+   

Re: "BTRFS error (device vda1): couldn't get super buffer head for bytenr x"

2017-10-08 Thread Nick Gilmour
Thanks for the reply!

For conversion I used this command:
$ vboxmanage internalcommands converttoraw mydisk.vdi mydisk.img

and for resizing this one:
$ qemu-img resize mydisk.img 150G

Is there something I can do to fix this or another way to do the
resizing without getting an error?

Regards,
Nick


On Sat, Oct 7, 2017 at 2:08 AM, Liu Bo  wrote:
> On Fri, Oct 06, 2017 at 12:25:17PM +0200, Nick Gilmour wrote:
>> Hi all,
>>
>> I have converted .vdi disk (BTRFS) into a .img disk, resized it from
>> 500GB to 150GB and then copied into a ZFS Volume. I've imported the VM
>> into VMM and it started normally but an upgrade failed. I've rebooted
>> and  got only a blue screen something like a BSOD on Windows. I've
>> changed into a terminal and now this error appears constantly:
>>
>>  "BTRFS error (device vda1): couldn't get super buffer head for bytenr x"
>
> The error implies that it failed to read the sectors which contains
> btrfs superblock, given you've shrinked the size, I guess btrfs
> doesn't know that somehow and is trying to issue writes to a >150GB
> position(could be 256G).
>
> You can check that 'bytenr x'.
>
> Which command did you use to resize it?
>
> thanks,
>
> -liubo
>
>>
>> I can stop it shortly with Ctrl-C and enter a command. With startx I
>> can see my desktop in blue  color with some icons in it and nothing
>> more.
>>
>>
>> uname -a
>> Linux VM-Ubuntu 4.4.0-83-generic
>>
>> btrfs --version
>> btrfs-progs v4.4
>>
>> btrfs fi show
>> Label: none uuid: x
>>Total devices 1 FS bytes used 473.68GiB
>> devid 1 size 492.00 GiB used 492.00GiB path /dev/sda1
>>
>> Label: 'extra' uuid: y
>>Total devices 1 FS bytes used 112.00KiB
>> devid 1 size 100.00 GiB used 2.02GiB path /dev/sdb1
>>
>> btrfs fi df /home
>> Data, single: total=106.40GiB, used=96.42GiB
>> System, DUP: total=8.00MiB, used=48.00KiB
>> Metadata, DUP: total=14.88GiB, used=11.18GiB
>> GlobalReserve, single: total=512.00 MiB, used=0.00B
>>
>> dmesg > dmesg.log
>> compiz: segfault at ... error 4 in libnux-graphics-4.0.so...
>>
>>
>> Any ideas?
>>
>> Regards,
>> Nick
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 2/2] btrfs: check device for critical errors and mark failed

2017-10-08 Thread Anand Jain



On 10/07/2017 07:56 AM, Liu Bo wrote:

On Thu, Oct 05, 2017 at 09:56:59PM +0800, Anand Jain wrote:



On 10/05/2017 04:11 AM, Liu Bo wrote:

On Tue, Oct 03, 2017 at 11:59:20PM +0800, Anand Jain wrote:

From: Anand Jain 

Write and flush errors are critical errors, upon which the device fd
must be closed and marked as failed.



Can we defer the job of closing device to umount?


  Originally I think it was designed to handle the bad disk
  same as missing device. as below [1]. This patch just does that.



  But I am curious to know if there is any issue to close failed
  device ?


   Anything on this ?



[1]
-
static int read_one_dev(struct btrfs_fs_info *fs_info,
 struct extent_buffer *leaf,
 struct btrfs_dev_item *dev_item)
::
 /*
  * this happens when a device that was properly setup
  * in the device info lists suddenly goes bad.
  * device->bdev is NULL, and so we have to set
  * device->missing to one here
  */
--

  So here I misuse the device missing scenario (device->bdev = NULL)
  to the failed device scenario. (code comment mentioned it).



This is OK to me, we can unify these stuff later, for now, ->missing
is to bypass this failed disk for r/w.


  Secondly as in the patch 1/2 commit log and also Austin mentioned it,
  if failed device close is deferred it will also defer the cleanup of
  the failed device at the block layer.

  But yes block layer can still clean up when btrfs closes the
  device at the time of replace, also where in the long run, pulling
  out of the failed disk would (planned) trigger a udev notification
  into the btrfs sysfs interface and it can close the device.

  Anything that I have missed ?



I'm more inclined to do cleanup by making udev call 'device remove',
which is supposed to do all the validation checks you need to done
here.


   Validation check like what ?

Thanks, Anand


  Further, jfyi closing of failed device isn't related to the
  looping for device failure though.


We can go mark the device failed and skip it while doing read/write,
and umount can do the cleanup work.


  In servers a need of umount is considered as downtime, things
  shouldn't wait for umount to cleanup.



OK.


That way we don't need a dedicated thread looping around to detect a
rare situation.


  I think its better to make it event based instead of thread looping,
  pls take a look..

[PATCH v8.1 2/2] btrfs: mark device failed for write and flush errors



Event based is good, for anything about failing disk, we would have to
follow what md is doing eventually, i.e. let udev handle it.

thanks,
-liubo


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Btrfs: heuristic add shannon entropy calculation

2017-10-08 Thread Timofey Titovets
Byte distribution check in heuristic will filter edge data
cases and some time fail to classificate input data

Let's fix that by adding Shannon entropy calculation,
that will cover classification of most other data types.

As Shannon entropy need log2 with some precision to work,
let's use ilog2(N) and for increase precision,
by do ilog2(pow(N, 4))

Shannon entropy slightly changed for avoiding signed numbers
and divisions.

Changes:
  v1 -> v2:
- Replace log2_lshift4 wiht ilog2
- To compensate accuracy errors of ilog2
  @ENTROPY_LVL_ACEPTABLE 70 -> 65
  @ENTROPY_LVL_HIGH  85 -> 80
- Drop usage of u64 from shannon calculation

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/compression.c | 83 +-
 1 file changed, 82 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 0060bc4ae5f2..8efbce5633b5 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -1224,6 +1225,60 @@ int btrfs_decompress_buf2page(const char *buf, unsigned 
long buf_start,
return 1;
 }

+/*
+ * Shannon Entropy calculation
+ *
+ * Pure byte distribution analyze fail to determine
+ * compressiability of data. Try calculate entropy to
+ * estimate the average minimum number of bits needed
+ * to encode a sample data.
+ *
+ * For comfort, use return of percentage of needed bit's,
+ * instead of bit's amaount directly.
+ *
+ * @ENTROPY_LVL_ACEPTABLE - below that threshold sample has low byte
+ * entropy and can be compressible with high probability
+ *
+ * @ENTROPY_LVL_HIGH - data are not compressible with high probability,
+ *
+ * Use of ilog2() decrease precision, so to see ~ correct answer
+ * LVL decreased by 5.
+ */
+
+#define ENTROPY_LVL_ACEPTABLE 65
+#define ENTROPY_LVL_HIGH 80
+
+/*
+ * For increase precision in shannon_entropy calculation
+ * Let's do pow(n, M) to save more digits after comma
+ *
+ * Max int bit lengh 64
+ * ilog2(MAX_SAMPLE_SIZE) -> 13
+ * 13*4 = 52 < 64 -> M = 4
+ * So use pow(n, 4)
+ */
+static inline u32 ilog2_w(u64 n)
+{
+   return ilog2(n * n * n * n);
+}
+
+static u32 shannon_entropy(struct heuristic_ws *ws)
+{
+   const u32 entropy_max = 8*ilog2_w(2);
+   u32 entropy_sum = 0;
+   u32 p, p_base, sz_base;
+   u32 i;
+
+   sz_base = ilog2_w(ws->sample_size);
+   for (i = 0; i < BUCKET_SIZE && ws->bucket[i].count > 0; i++) {
+   p = ws->bucket[i].count;
+   p_base = ilog2_w(p);
+   entropy_sum += p * (sz_base - p_base);
+   }
+
+   entropy_sum /= ws->sample_size;
+   return entropy_sum * 100 / entropy_max;
+}

 /* Compare buckets by size, ascending */
 static inline int bucket_comp_rev(const void *lv, const void *rv)
@@ -1404,7 +1459,7 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
struct heuristic_ws *ws;
u32 i;
u8 byte;
-   int ret = 1;
+   int ret = 0;

ws = list_entry(ws_list, struct heuristic_ws, list);

@@ -1439,6 +1494,32 @@ int btrfs_compress_heuristic(struct inode *inode, u64 
start, u64 end)
goto out;
}

+   i = shannon_entropy(ws);
+   if (i <= ENTROPY_LVL_ACEPTABLE) {
+   ret = 4;
+   goto out;
+   }
+
+   /*
+* At below entropy levels additional
+* analysis needed for show green light to compression
+* For now just assume that compression at that level are
+* not worth for resources, becuase:
+* 1. that possible to defrag data later
+* 2. Case where that will really show compressible data are rare
+*   ex. 150 byte types, every bucket have counter at level ~54
+*   Heuristic will be confused, that can happen when data
+*   have some internal repeated patterns abbacbbc..
+*   that can be detected only by analize sample for byte pairs
+*/
+   if (i < ENTROPY_LVL_HIGH) {
+   ret = 5;
+   goto out;
+   } else {
+   ret = 0;
+   goto out;
+   }
+
 out:
__free_workspace(0, ws_list, true);
return ret;
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


USB upgrade fun

2017-10-08 Thread Kai Hendry
Hi there,

My /mnt/raid1 suddenly became full somewhat expectedly, so I bought 2
new USB 4TB hard drives (one WD, one Seagate) to upgrade to.

After adding sde and sdd I started to see errors in dmesg [2].
https://s.natalian.org/2017-10-07/raid1-newdisks.txt
[2] https://s.natalian.org/2017-10-07/btrfs-errors.txt


I assumed it had to perhaps with the USB bus on my NUC5CPYB being maxed
out, and to expedite the sync, I tried to remove one of the older 2TB
sdc1.  However the load went crazy and my system went completely
unstable. I shutdown the machine and after an hour I hard powered it
down since it seemed to hang (it's headless).

Sidenote: I've since learnt that removing a drive actually deletes the
contents of the drive? I don't want that. I was hoping to put that drive
into cold storage. How do I remove a drive without losing data from a
RAID1 configuration?


After a reboot it failed, namely because "nofail" wasn't in my fstab and
systemd is pedantic by default. After managing to get it booting into my
system without /mnt/raid1 I faced these "open ctree failed" issues. 
After running btrfs check on all the drives and getting nowhere, I
decided to unplug the new drives and I discovered that when I take out
the new 4TB WD drive, I could mount it with -o degraded.

dmesg errors with the WD include "My Passport" Wrong diagnostic page;
asked for 1 got 8 "Failed to get diagnostic page 0xffea" which
raised my suspicions. The model number btw is WDBYFT0040BYI-WESN

Anyway, I'm back up and running with 2x2TB  (one of them didn't finish
removing, I don't know which) & 1x4TB.

[1] https://s.natalian.org/2017-10-08/usb-btrfs.txt

I've decided to send the WD back for a refund. I've decided I want keep
the 2x2TB and RAID1 with the new 1x4TB disk. So 4TB total. btrfs
complains now of "Some devices missing" [1]. How do I fix this
situation?

Any tips how to name this individual disks? hdparm -I isn't a lot to go
on.

[hendry@nuc ~]$ sudo hdparm -I /dev/sdb1 | grep Model
Model Number:   ST4000LM024-2AN17V
[hendry@nuc ~]$ sudo hdparm -I /dev/sdc1 | grep Model
Model Number:   ST2000LM003 HN-M201RAD
[hendry@nuc ~]$ sudo hdparm -I /dev/sdd1 | grep Model
Model Number:   ST2000LM005 HN-M201AAD


Ok, thanks. Hope you can guide me,
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html