date:20151021

BTRFS BUG at insert_inline_extent_backref+0xe3/0xf0 while rebalancing

2015-10-21 Thread Erkki Seppala

Hello,

Recently I added daily rebalancing to my cron.d (after finding myself in
the no-space situation), and not long after that, I found my PC had
crashed overnight. With no sign in the logs anywhere (not even over the
network, even though there should have been), I had nothing to go on. But
this night it crashed again after starting the rebalance, and this time
there was some information in the kernel log.

Kernel version: 4.2.3 (package linux-image-4.2.0-1-amd64 version 4.2.3-1
from Debian Unstable)

The dump is available at:

  http://www.modeemi.fi/~flux/btrfs/btrfs-BUG-2015-10-55.txt

The log is available as well (stripped some unrelated USB- and firewall
logging, showing that last evening there was some kernel task hung for
120 seconds; but it's in another btrfs filesystem and is another story):

  http://www.modeemi.fi/~flux/btrfs/btrfs-2015-10-55.txt

I'm not quite sure which of the btrfs balance commands caused the
issue, but here is my script:

#!/bin/sh
fs="$1"
if [ -z "$fs" ]; then
  echo usage: btrfs-balance / 0 1 5 10 20 50
  exit 1
fi
shift
for usage in d m; do
  for a in "$@"; do
    date
    /bin/btrfs balance start "$fs" -v -${usage}usage=$a
  done
done

And it was started at 07:30 with:

  /usr/local/sbin/btrfs-balance / 0 1 2 5 10 20 30 50 70
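
For reference, the sequence that this invocation expands to can be listed
without touching the filesystem. A minimal sketch: balance_plan is a made-up
helper that only echoes the command lines the script above would run, in the
same order.

```sh
#!/bin/sh
# Dry run only: echo the balance commands instead of executing them.
# balance_plan is a made-up name; the command line mirrors the script above.
balance_plan() {
    fs=$1
    shift
    for usage in d m; do
        for a in "$@"; do
            echo "/bin/btrfs balance start $fs -v -${usage}usage=$a"
        done
    done
}

balance_plan / 0 1 2 5 10 20 30 50 70
```

That lists 18 commands, nine -dusage passes followed by nine -musage passes.
Writing each line to a log before running it is one possible way to see
which pass was active when a crash happens.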

I should add that the filesystem in question is backed by MD RAID10 and
that is backed by four SSDs, so it's reasonably fast in IO, if that
affects anything. There should not have been much competing IO at the
time of the occurrence.

Before Duncan asks ;-), I only have a moderate number of subvolumes and
snapshots, ie. one subvolume for each of /, /var/log/journal and /home,
24 snapshots of / and /home plus <10 snapshots of /.

Before that balance there was another balance on another BTRFS RAID10,
but given the time stamp I think I can easily say it wasn't the cause.

I don't really have other 'solutions' than disabling the rebalancing for
the time being and only using it as needed, as I had done earlier.

Cheers,

-- 
  _
 / __// /__   __   http://www.modeemi.fi/~flux/\   \
/ /_ / // // /\ \/ /\  /
   /_/  /_/ \___/ /_/\_\@modeemi.fi  \/

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Qu Wenruo




Anand Jain wrote on 2015/10/22 11:09 +0800:
> Hi,
>
>   more details about this RFC as below..
>
>> So a user with old kernel won't be able to create a filesystem with
>> newer feature forever.
>
>   That's not true. Here below is an example of the problem and fix
>   in action.
>
>> Maybe the user are just making btrfs for his or her newer kernel?
>
>   Yes. That's still possible as below.
>
> [[current problem]]
>
> I am on 3.8 kernel which does not support the skinny-metadata feature.
> ---
> # uname -r
> 3.8.13-98.4.1.el6uek.x86_64
> ---
>
> Let's say customer upgraded the latest btrfs-progs as they wanted better
> cli interface, reporting and document which is only available in the
> latest btrfs-progs.
>
> As btrfs-progs is backward kernel compatible, they don't have to
> upgrade the kernel.
>
> But as skinny-metadata is part of the "mkfs default feature", it gets
> enabled by default during mkfs.


For a real customer case, that's really a problem.
The customer is god, right?

But on the other hand, a support team normally only provides support
for their own product.
Here "product" may be their distribution or other software, but never
any software the customer built from source by themselves.

If customers want to upgrade btrfs-progs, they should use the
distribution's repo, not compile it by themselves.

And it will be the responsibility of the backport team to backport the
needed features for kernel/btrfs-progs or change the default features.

But unfortunately (or fortunately?), we are not the backport team.




[PATCH] btrfs-progs: fix missing init dev list head for dev_list

2015-10-21 Thread Anand Jain

---
 utils.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/utils.c b/utils.c
index f1e3248..b754686 100644
--- a/utils.c
+++ b/utils.c
@@ -763,6 +763,7 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 	if (!device->name)
 		goto err_nomem;
 
+	INIT_LIST_HEAD(&device->dev_list);
 	ret = btrfs_add_device(trans, root, device);
 	BUG_ON(ret);
 
-- 
2.4.1


[PATCH 1/1] btrfs-progs: fix uninitialized copy of btrfs_fs_devices list

2015-10-21 Thread Anand Jain

Noticed that at print_one_uuid() some of the members of btrfs_fs_devices
contained junk values. It took a while to dig into this, and I found
that we make a local copy of the btrfs_fs_devices list at
search_umounted_fs_uuids() which wasn't initialized properly.

Fixed by using kzalloc instead of malloc.

Signed-off-by: Anand Jain 
---
 cmds-filesystem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 3663734..4d3a9a4 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -685,7 +685,7 @@ static int search_umounted_fs_uuids(struct list_head *all_uuids,
 		if (is_seen_fsid(cur_fs->fsid))
 			continue;
 
-		fs_copy = malloc(sizeof(*fs_copy));
+		fs_copy = kzalloc(sizeof(*fs_copy), GFP_NOFS);
 		if (!fs_copy) {
 			ret = -ENOMEM;
 			goto out;
-- 
2.4.1


Re: btrfs-balance causes system-freeze on full disk

2015-10-21 Thread Duncan

Kyle Manna posted on Wed, 21 Oct 2015 13:51:22 -0700 as excerpted:

> The issue I encountered is described @
> https://bugzilla.kernel.org/show_bug.cgi?id=105681

FWIW...

I won't try to deal with the issue reported there, but I can help clear 
something up that's mentioned on the bug[1].

The question (comment 5 and 6) refers to btrfs device usage output, for a 
three-device btrfs raid1, both data/metadata.  The question was why only 
two of the three devices listed a system chunk.

Btrfs raid1, unlike say mdraid1, is strictly pair-mirror, exactly two 
copies of the chunk, one each on two different devices.  More devices 
add to the space available, not to the number of redundant copies.

As it happens, the two devices that got a copy of the system chunk were 
sdb and sdd, sdc didn't get a copy, as there are only two copies to 
distribute, no matter the number of devices in the raid1.
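
The two-copies rule can be shown with a toy model. This is only a sketch,
not the kernel allocator: place_raid1_chunk is a made-up helper, the
free-space figures are invented, and it assumes the two copies go to the
devices with the most unallocated space.

```sh
#!/bin/sh
# Toy model of btrfs raid1 chunk placement (NOT the kernel allocator).
# Reads "device free_space" pairs on stdin and prints the two devices that
# would receive the two copies, assuming the devices with the most
# unallocated space are chosen. place_raid1_chunk is a made-up name.
place_raid1_chunk() {
    sort -k2,2nr | head -n 2 |
        awk '{ printf "%s%s", sep, $1; sep = " " } END { print "" }'
}

# Invented numbers: three devices, still only two copies.
printf '%s\n' 'sdb 900' 'sdc 500' 'sdd 800' | place_raid1_chunk
```

However many devices are fed in, only two names come out, which is the
behaviour seen in the device usage output discussed on the bug.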


And as it happens, I've been personally interested in and thus following 
the roadmapped btrfs N-way-mirroring, the feature that would put a copy 
on all three devices, this being my most hotly anticipated btrfs feature 
since 3-way-mirroring is about the perfect balance between cost and 
reliability due to device redundancy, for me.

For quite some time now, a new N-way-mirroring feature has been on the 
roadmap, to be worked on after raid56 mode, as the planned implementation 
was to use some of the same code.  Raid56 mode is complete now, tho it 
took far longer than initially expected, so hopefully n-way-mirroring is 
already in development.  However, given the time raid56 took, 2-3 years 
of development, it's likely to be some time before n-way-mirroring 
actually appears.  And again, if it follows the pattern of other btrfs 
features, it'll take a couple kernel cycles after initial release to 
stabilize to actual usability, and a full year (five cycles) to stabilize 
to approximately the same maturity/stability as the rest of btrfs in 
general.

For raid56, nominally code-complete in 3.19, the last critical bug was in 
the early 4.1 code, fixed by 4.1 release.  But my recommendation has been 
to wait another couple cycles just to be sure nothing else "interesting" 
comes up, basically a full year, five kernel cycles, after nominal code-
complete release.  That would be 4.4...

Back to N-way-mirroring, assuming the work doesn't get delayed by 
something else, I'd EWAG (educated WAG) an 18 month to 2 year development 
time to nominally complete.  That would put initial release around 
4.7-4.9, actual usability at 4.9-4.11, and year-on stability at 4.12-4.14.

So altho we're nearing a year since raid56 nominal-completion, I don't 
expect N-way-mirroring code release for another year or so yet, don't 
expect it to be really usable for another five months (two kernel cycles) 
after that, and even then, wouldn't expect it to be as stable as the rest 
of btrfs for another further three kernels or so, thus putting actual 
reasonable stability (compared to the already stable 2-way raid1 code) 
two years out...

So it's coming, and at least now it's close enough there's /some/ 
estimate of when it might be available, but it's going to be some time 
yet before I'd expect even nominal code-completion release, and some time 
after that before it reaches the stability benefit that I'm actually 
hotly anticipating the feature for.  Very roughly two years from now, tho 
I'd not be surprised at all to see that slide another six months to a 
year, and that's assuming nothing else shoves it out of the way, priority-
wise.

---
[1] I do have a kernel-bugs login but didn't want to bother logging in 
just to add the comment there, when I had just clicked a link here to get 
there, and could simply reply here instead.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


Re: [RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Anand Jain



Hi,

 more details about this RFC as below..

> So a user with old kernel won't be able to create a filesystem with
> newer feature forever.

 That's not true. Here below is an example of the problem and fix
 in action.

> Maybe the user are just making btrfs for his or her newer kernel?

 Yes. That's still possible as below.



[[current problem]]

I am on 3.8 kernel which does not support the skinny-metadata feature.
-
# uname -r
3.8.13-98.4.1.el6uek.x86_64
-

Let's say a customer upgraded to the latest btrfs-progs because they
wanted the better CLI, reporting and documentation that are only
available in the latest btrfs-progs.

As btrfs-progs is backward compatible with older kernels, they don't
have to upgrade the kernel.

But as skinny-metadata is part of the "mkfs default features", it gets
enabled by default during mkfs.

--
# mkfs.btrfs -f /dev/sdc
btrfs-progs v4.2.2-7-g03cf344
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               bb85b156-35eb-49a5-be5a-512fec1abab4
Node size:          16384
Sector size:        4096
Filesystem size:    3.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             161.56MiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata <-
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     3.00GiB  /dev/sdc
---

mount on the running kernel 3.8 fails, as there is no kernel support
for skinny-metadata
---
# mount /dev/sdc /btrfs
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

BTRFS: couldn't mount because of unsupported optional features (100).
btrfs: open_ctree failed
--

Customers need to upgrade the kernel also to mount this FS.
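
The (100) in the kernel message is a hexadecimal mask of the incompat
feature bits the running kernel did not recognize. A small sketch to decode
it follows. decode_incompat is a made-up helper, and the bit-to-name table
is my reading of the kernel's BTRFS_FEATURE_INCOMPAT_* flags from this
period, so treat it as an assumption and check it against your own headers.

```sh
#!/bin/sh
# Decode the hex mask from "unsupported optional features (100)".
# decode_incompat is a made-up helper; the table below is an assumption
# based on the BTRFS_FEATURE_INCOMPAT_* bit values (decimal:name).
decode_incompat() {
    mask=$(( 0x$1 ))
    out=""
    for entry in \
        1:mixed-backref 2:default-subvol 4:mixed-bg 8:compress-lzo \
        16:compress-lzov2 32:big-metadata 64:extref 128:raid56 \
        256:skinny-metadata 512:no-holes
    do
        bit=${entry%%:*}
        name=${entry#*:}
        # Append the name when this bit is set in the mask.
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out, }$name"
        fi
    done
    echo "${out:-none}"
}

decode_incompat 100    # the mask from the failed mount above
```

For the mask above this names skinny-metadata only, which matches the mkfs
output: extref is also set on disk, but the 3.8 kernel already understands
it, so it does not show up among the unsupported bits.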


[[current solution]]

User must know that their running kernel does not support
skinny metadata and disable features accordingly as below.


# mkfs.btrfs -f -O ^skinny-metadata /dev/sdc > /dev/null && mount /dev/sdc /btrfs

# echo $?
0



[[problem with the current solution]]

btrfs-progs is backward kernel compatible.

But the default feature as set during mkfs is very btrfs-progs centric,
and is not in sync with the current running kernel.


[[new proposed, with the patch]]

- the default features that are not supported by the running kernel
 are masked.

 So now the default mkfs.btrfs and mount work, as progs finds that the
 kernel version is too old to support skinny-metadata and removes it
 from the default feature list.
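
The masking idea can be sketched in shell. This is not the patch's code:
mkfs_default_features and version_ge are hypothetical helpers, only
major.minor is compared, and the minimum versions used here (extref since
3.7, skinny-metadata since 3.10) are assumptions to verify against the
btrfs-progs feature table.

```sh
#!/bin/sh
# Sketch only, not the code from the patch. Helper names are hypothetical.

# version_ge A B: succeeds when kernel version A >= B (major.minor only).
version_ge() {
    a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%[!0-9]*}
    b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%[!0-9]*}
    [ "$a_major" -gt "$b_major" ] ||
        { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]; }
}

# mkfs_default_features KERNEL_RELEASE: print the default features that
# the given kernel is assumed to support, dropping the rest.
mkfs_default_features() {
    kernel=$1
    out=""
    for entry in extref:3.7 skinny-metadata:3.10; do
        name=${entry%%:*}
        since=${entry#*:}
        if version_ge "$kernel" "$since"; then
            out="${out:+$out, }$name"
        fi
    done
    echo "${out:-none}"
}

mkfs_default_features "$(uname -r)"
```

Fed the 3.8.13-98.4.1.el6uek.x86_64 release string from this example, the
sketch keeps extref and drops skinny-metadata, the same result the patched
mkfs.btrfs shows below.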

---
# uname -r
3.8.13-98.4.1.el6uek.x86_64


# mkfs.btrfs -f /dev/sdc
btrfs-progs v4.2.2-7-g03cf344
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               f2513ff0-6d94-4b6a-8bbf-ff53d343fa62
Node size:          16384
Sector size:        4096
Filesystem size:    3.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             161.56MiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref  <-
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     3.00GiB  /dev/sdc

# mount /dev/sdc /btrfs
---


However, as Qu pointed out, a user may want to create a FS with a feature
that is not supported by the running kernel. They still have the choice
to enable it using the -O option, as below. However, they won't be able
to mount it unless the kernel is upgraded as well.

---
# mkfs.btrfs -f -O skinny-metadata /dev/sdc
btrfs-progs v4.2.2-7-g03cf344
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               30c018ab-3d2a-4acc-8287-3f28c5324fca
Node size:          16384
Sector size:        4096
Filesystem size:    3.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             161.56MiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata   <-
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     3.00GiB  /dev/sdc

# mount /dev/sdc /btrfs
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so
#
-

In a nutshell, this patch makes _mkfs and btrfs-convert default features_
dynamically aligned with the running kernel.

Thanks, Anand

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread János Tóth F .

I tried that after every possible combination of RO mount options failed. I
used it in the past for a USB-attached drive where a USB-SATA adapter had some
issues (I plugged it into a standard USB2 port even though it expected USB3
power current, so a high-current port or several standard USB2 ports should
have been used to prevent an emergency shutdown during file copy). It worked
flawlessly then, but in this case it threw the same errors as "btrfs check".
It couldn't even find and list alternative roots. Every tool seemed to give up
really fast and easily, sometimes with segfaults and tracing messages in the
kernel log.

On Oct 21, 2015 22:26, Chris Murphy wrote:
>
> https://btrfs.wiki.kernel.org/index.php/Restore 
>
> This should still be possible with even a degraded/unmounted raid5. It 
> is a bit tedious to figure out how to use it but if you've got some 
> things you want off the volume, it's not so difficult to prevent 
> trying it. 
>
>
> Chris Murphy

Re: btrfs: check unsupported filters in balance arguments

2015-10-21 Thread Chris Mason

On Thu, Oct 22, 2015 at 12:52:37AM +0200, David Sterba wrote:
> On Wed, Oct 21, 2015 at 11:55:00PM +0300, Dan Carpenter wrote:
> > Hello David Sterba,
> > 
> > The patch 8eb934591f8b: "btrfs: check unsupported filters in balance
> > arguments" from Oct 12, 2015, leads to the following static checker
> > warning:
> > 
> > fs/btrfs/ioctl.c:4673 btrfs_ioctl_balance()
> > warn: possible memory leak of 'bctl'
> 
> Thanks for the report, the fix is on the way:
> https://patchwork.kernel.org/patch/7453231/

Thanks, I've picked this up.

-chris

[PATCH v3 RESENT 2/2] btrfs: qgroup: Don't copy extent buffer to do qgroup rescan

2015-10-21 Thread Qu Wenruo

Ancient qgroup code calls memcpy() on an extent buffer and uses the copy
for leaf iteration.

As an extent buffer contains a lock and pointers to pages, it's never
sane to do such a copy.

The following bug may be caused by this insane operation:
[92098.841309] general protection fault:  [#1] SMP
[92098.841338] Modules linked in: ...
[92098.841814] CPU: 1 PID: 24655 Comm: kworker/u4:12 Not tainted 4.3.0-rc1 #1
[92098.841868] Workqueue: btrfs-qgroup-rescan btrfs_qgroup_rescan_helper [btrfs]
[92098.842261] Call Trace:
[92098.842277]  [] ? read_extent_buffer+0xb8/0x110 [btrfs]
[92098.842304]  [] ? btrfs_find_all_roots+0x60/0x70 [btrfs]
[92098.842329]  [] btrfs_qgroup_rescan_worker+0x28d/0x5a0 [btrfs]

Where btrfs_qgroup_rescan_worker+0x28d is btrfs_disk_key_to_cpu(),
called when reading a key from the memcpied extent_buffer.

This patch reads the whole leaf into memory and uses the newly
introduced stack function to do the qgroup rescan.

Reported-by: Stephane Lesimple 
Signed-off-by: Qu Wenruo 
---
v2:
  Follow the parameter change in previous patch.
v3:
  None
---
 fs/btrfs/qgroup.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index e9ace09..6a83a40 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2183,11 +2183,11 @@ void assert_qgroups_uptodate(struct btrfs_trans_handle *trans)
  */
 static int
 qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
-		   struct btrfs_trans_handle *trans,
-		   struct extent_buffer *scratch_leaf)
+		   struct btrfs_trans_handle *trans, char *stack_leaf)
 {
 	struct btrfs_key found;
 	struct ulist *roots = NULL;
+	struct btrfs_header *header;
 	struct seq_list tree_mod_seq_elem = SEQ_LIST_INIT(tree_mod_seq_elem);
 	u64 num_bytes;
 	int slot;
@@ -2224,13 +2224,15 @@ qgroup_rescan_leaf(struct btrfs_fs_info *fs_info, struct btrfs_path *path,
 	fs_info->qgroup_rescan_progress.objectid = found.objectid + 1;
 
 	btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem);
-	memcpy(scratch_leaf, path->nodes[0], sizeof(*scratch_leaf));
+	read_extent_buffer(path->nodes[0], stack_leaf, 0,
+			   fs_info->extent_root->nodesize);
+	header = (struct btrfs_header *)stack_leaf;
 	slot = path->slots[0];
 	btrfs_release_path(path);
 	mutex_unlock(&fs_info->qgroup_rescan_lock);
 
-	for (; slot < btrfs_header_nritems(scratch_leaf); ++slot) {
-		btrfs_item_key_to_cpu(scratch_leaf, &found, slot);
+	for (; slot < btrfs_stack_header_nritems(header); ++slot) {
+		btrfs_stack_item_key_to_cpu(header, &found, slot);
 		if (found.type != BTRFS_EXTENT_ITEM_KEY &&
 		    found.type != BTRFS_METADATA_ITEM_KEY)
 			continue;
@@ -2261,15 +2263,15 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 						     qgroup_rescan_work);
 	struct btrfs_path *path;
 	struct btrfs_trans_handle *trans = NULL;
-	struct extent_buffer *scratch_leaf = NULL;
+	char *stack_leaf = NULL;
 	int err = -ENOMEM;
 	int ret = 0;
 
 	path = btrfs_alloc_path();
 	if (!path)
 		goto out;
-	scratch_leaf = kmalloc(sizeof(*scratch_leaf), GFP_NOFS);
-	if (!scratch_leaf)
+	stack_leaf = kmalloc(fs_info->extent_root->nodesize, GFP_NOFS);
+	if (!stack_leaf)
 		goto out;
 
 	err = 0;
@@ -2283,7 +2285,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 			err = -EINTR;
 		} else {
 			err = qgroup_rescan_leaf(fs_info, path, trans,
-						 scratch_leaf);
+						 stack_leaf);
 		}
 		if (err > 0)
 			btrfs_commit_transaction(trans, fs_info->fs_root);
@@ -2292,7 +2294,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	}
 
 out:
-	kfree(scratch_leaf);
+	kfree(stack_leaf);
 	btrfs_free_path(path);
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-- 
2.6.1


[PATCH v3 RESENT 1/2] btrfs: Add support to do stack item key operation

2015-10-21 Thread Qu Wenruo

The normal btrfs_item_key_to_cpu() needs an extent buffer to work, and
there is no stack version to handle an in-memory leaf.

Add the btrfs_stack_item_key_to_cpu() function for such operations, which
will provide the basis for the later qgroup fix.

Signed-off-by: Qu Wenruo 
---
v2:
  Change the char* parameter to struct btrfs_header *, as a leaf always
  has a header.
v3:
  Fix a bug caused in type change of stack_leaf.
---
 fs/btrfs/ctree.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 938efe3..b824fe2 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2683,6 +2683,17 @@ static inline void btrfs_item_key(struct extent_buffer *eb,
 	read_eb_member(eb, item, struct btrfs_item, key, disk_key);
 }
 
+static inline void btrfs_stack_item_key(struct btrfs_header *stack_leaf,
+					struct btrfs_disk_key *disk_key,
+					int nr)
+{
+	unsigned long item_offset = btrfs_item_nr_offset(nr);
+	struct btrfs_item *item;
+
+	item = (struct btrfs_item *)((char *)(stack_leaf) + item_offset);
+	memcpy(disk_key, &item->key, sizeof(*disk_key));
+}
+
 static inline void btrfs_set_item_key(struct extent_buffer *eb,
 				      struct btrfs_disk_key *disk_key, int nr)
 {
@@ -2785,6 +2796,15 @@ static inline void btrfs_item_key_to_cpu(struct extent_buffer *eb,
 	btrfs_disk_key_to_cpu(key, &disk_key);
 }
 
+static inline void btrfs_stack_item_key_to_cpu(struct btrfs_header *stack_leaf,
+					       struct btrfs_key *key,
+					       int nr)
+{
+	struct btrfs_disk_key disk_key;
+	btrfs_stack_item_key(stack_leaf, &disk_key, nr);
+	btrfs_disk_key_to_cpu(key, &disk_key);
+}
+
 static inline void btrfs_dir_item_key_to_cpu(struct extent_buffer *eb,
 					     struct btrfs_dir_item *item,
 					     struct btrfs_key *key)
-- 
2.6.1


Re: [PATCH] Btrfs: add a check of whether fs_info->fs_root is NULL in btrfs_async_reclaim_metadata_space()

2015-10-21 Thread Tsutomu Itoh


On 2015/10/21 20:27, David Sterba wrote:
> On Wed, Oct 21, 2015 at 04:20:00PM +0900, Tsutomu Itoh wrote:
>> Kernel panic occurred due to NULL pointer reference in can_overcommit().
>> Because btrfs_async_reclaim_metadata_space() passed NULL pointer to
>> btrfs_calc_reclaim_metadata_size().
>>
>> fs_info->fs_root is referred in btrfs_async_reclaim_metadata_space()
>> when mount kicked kworker(btrfs_async_reclaim_metadata_space).
>>
>> But at this time, fs_info->fs_root had not been initialized yet,
>> so NULL pointer passed to btrfs_calc_reclaim_metadata_size().
>
> I don't think it's the right fix, the initialization sequence should
> take care of such situations. The fs_tree must exist at the time we
> reach the point where it crashed, the code expects it.

OK. I will try to change initialization sequence.

Thanks,
Tsutomu



overflow in inode.c, file.c

2015-10-21 Thread Victor


Hello. While using linux-4.2.3 (btrfs-progs v4.2.2) with the latest
grsec patch to date, the size overflow checker in the grsec patchset
(made by emese) seems to have found some bugs in the btrfs code itself
(this is not caused by grsec).

First bug: fs/btrfs/inode.c:5759

For example -->

***
Oct 18 16:09:18 TestMachine kernel: [8.449128] PAX: size overflow detected in function btrfs_real_readdir fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0; context: dir_context;
Oct 18 16:09:18 TestMachine kernel: [8.449132] CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1
Oct 18 16:09:18 TestMachine kernel: [8.449134] Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015
Oct 18 16:09:18 TestMachine kernel: [8.449135]  81901608  819015e6 c90004973d48
Oct 18 16:09:18 TestMachine kernel: [8.449139]  81742f0f 0007 81901608 c90004973d78
Oct 18 16:09:18 TestMachine kernel: [8.449141]  811cb706  8800d47359e0 c90004973ed8
Oct 18 16:09:18 TestMachine kernel: [8.449144] Call Trace:
Oct 18 16:09:18 TestMachine kernel: [8.449151] [] dump_stack+0x4c/0x7f
Oct 18 16:09:18 TestMachine kernel: [8.449154] [] report_size_overflow+0x36/0x40
Oct 18 16:09:18 TestMachine kernel: [8.449158] [] btrfs_real_readdir+0x69c/0x6d0
Oct 18 16:09:18 TestMachine kernel: [8.449160] [] iterate_dir+0xa8/0x150
Oct 18 16:09:18 TestMachine kernel: [8.449164] [] ? __fget_light+0x2d/0x70
Oct 18 16:09:18 TestMachine kernel: [8.449166] [] SyS_getdents+0xba/0x1c0
Oct 18 16:09:18 TestMachine kernel: [8.449169] [] ? iterate_dir+0x150/0x150
Oct 18 16:09:18 TestMachine kernel: [8.449173] [] entry_SYSCALL_64_fastpath+0x12/0x83
Oct 18 16:09:18 TestMachine kernel: [8.449230] Overflow: 7fff

***

Second bug: fs/btrfs/file.c:1871

Example -->


Oct 18 16:09:20 TestMachine kernel: [   10.526375] PAX: size overflow detected in function btrfs_sync_file fs/btrfs/file.c:1871 cicus.679_107 max, count: 289, decl: btrfs_wait_ordered_range; num: 3; context: fndecl;
Oct 18 16:09:20 TestMachine kernel: [   10.526380] CPU: 1 PID: 3160 Comm: mysqld Not tainted 4.2.3-grsec #1
Oct 18 16:09:20 TestMachine kernel: [   10.526382] Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015
Oct 18 16:09:20 TestMachine kernel: [   10.526384]  819019e5  81901924 c90004d8bd98
Oct 18 16:09:20 TestMachine kernel: [   10.526387]  81742f0f 88021f28ddc0 819019e5 c90004d8bdc8
Oct 18 16:09:20 TestMachine kernel: [   10.526390]  811cb706 880202e9e270  8000
Oct 18 16:09:20 TestMachine kernel: [   10.526392] Call Trace:
Oct 18 16:09:20 TestMachine kernel: [   10.526399] [] dump_stack+0x4c/0x7f
Oct 18 16:09:20 TestMachine kernel: [   10.526402] [] report_size_overflow+0x36/0x40
Oct 18 16:09:20 TestMachine kernel: [   10.526404] [] btrfs_sync_file+0x90/0x490
Oct 18 16:09:20 TestMachine kernel: [   10.526407] [] vfs_fsync_range+0x59/0xc0
Oct 18 16:09:20 TestMachine kernel: [   10.526410] [] ? __fget_light+0x2d/0x70
Oct 18 16:09:20 TestMachine kernel: [   10.526411] [] do_fsync+0x3c/0x70
Oct 18 16:09:20 TestMachine kernel: [   10.526413] [] SyS_fsync+0x15/0x30
Oct 18 16:09:20 TestMachine kernel: [   10.526415] [] entry_SYSCALL_64_fastpath+0x12/0x83
***


len = end - start + 1

vfs_fsync calls vfs_fsync_range with 0 and LLONG_MAX for start and end.
In btrfs_sync_file the above expression causes a signed overflow
(undefined behaviour) with these values.
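
The arithmetic is easy to check: with start = 0 and end = LLONG_MAX,
end - start + 1 is LLONG_MAX + 1, which does not fit in a signed 64-bit
value. Below is a sketch of a guarded length calculation, written in shell
arithmetic purely for illustration. range_len is a made-up helper; it
assumes 0 <= start <= end and a shell with 64-bit integers, and it is not
the fix that went into btrfs.

```sh
#!/bin/sh
# LLONG_MAX for a 64-bit signed integer, the 'end' value vfs_fsync() uses.
LLONG_MAX=9223372036854775807

# range_len START END
# Prints END - START + 1, or fails when the length is not representable
# as a signed 64-bit integer. Assumes 0 <= START <= END, so the
# subtraction itself cannot overflow. range_len is a made-up helper.
range_len() {
    span=$(( $2 - $1 ))
    if [ $(( $span == $LLONG_MAX )) -eq 1 ]; then
        echo "range_len: length does not fit in 64 bits" >&2
        return 1
    fi
    echo $(( $span + 1 ))
}

range_len 0 4095                                  # prints 4096
range_len 0 "$LLONG_MAX" || echo "whole-file range rejected"
```

The point of the guard is to test the span before adding 1, so the overflow
is detected rather than computed.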



This is the whole dmesg http://pastebin.com/S9gjYpYX , thanks


Re: btrfs: check unsupported filters in balance arguments

2015-10-21 Thread David Sterba

On Wed, Oct 21, 2015 at 11:55:00PM +0300, Dan Carpenter wrote:
> Hello David Sterba,
> 
> The patch 8eb934591f8b: "btrfs: check unsupported filters in balance
> arguments" from Oct 12, 2015, leads to the following static checker
> warning:
> 
>   fs/btrfs/ioctl.c:4673 btrfs_ioctl_balance()
>   warn: possible memory leak of 'bctl'

Thanks for the report, the fix is on the way:
https://patchwork.kernel.org/patch/7453231/

Re: btrfs-balance causes system-freeze on full disk

2015-10-21 Thread Jakob Schürz

Thx!

It seems to be the same bug.

On 2015-10-21 at 22:51, Kyle Manna wrote:
> I had a number of similar btrfs balance crashes in the past few days,
> but the disk wasn't full.  You should try tailing the system logs from
> a remote machine when it happens. You'll likely see some bug info
> before the system dies and becomes unusable.
> 
> The issue I encountered is described @
> https://bugzilla.kernel.org/show_bug.cgi?id=105681
> 
> On Wed, Oct 21, 2015 at 12:38 PM, Jakob Schürz
>  wrote:
>> Hi there!
>>
>> Is it possible, what i've recognized now. My system (debian) runs on
>> btrfs, and i have a lot of snapshots on my hard-disk.
>> Since some days my system freezes totally. I recognized, it always
>> happens during btrfs-balance.
>>
>> So i deleted some of the old snapshots and tried another balance-run.
>> Nothing happened... No system-freeze.
>>
>> System-freeze means: No Keyboard-action. The Mouse is frozen, the screen
>> is frozen, no magic-sysreq, no ssh-login.
>>
>> Can btrfs cause such a freeze??
>>
>> greez
>>
>> jakob
>> --
>> http://xundeenergie.at
>> http://verkehrsloesungen.wordpress.com/
>> http://cogitationum.wordpress.com/
>>
> 


-- 
http://xundeenergie.at
http://verkehrsloesungen.wordpress.com/
http://cogitationum.wordpress.com/


re: btrfs: check unsupported filters in balance arguments

2015-10-21 Thread Dan Carpenter

Hello David Sterba,

The patch 8eb934591f8b: "btrfs: check unsupported filters in balance
arguments" from Oct 12, 2015, leads to the following static checker
warning:

fs/btrfs/ioctl.c:4673 btrfs_ioctl_balance()
warn: possible memory leak of 'bctl'

fs/btrfs/ioctl.c
  4624          bctl = kzalloc(sizeof(*bctl), GFP_NOFS);
  4625          if (!bctl) {
  4626                  ret = -ENOMEM;
  4627                  goto out_bargs;
  4628          }
  4629
  4630          bctl->fs_info = fs_info;
  4631          if (arg) {
  4632                  memcpy(&bctl->data, &bargs->data, sizeof(bctl->data));
  4633                  memcpy(&bctl->meta, &bargs->meta, sizeof(bctl->meta));
  4634                  memcpy(&bctl->sys, &bargs->sys, sizeof(bctl->sys));
  4635
  4636                  bctl->flags = bargs->flags;
  4637          } else {
  4638                  /* balance everything - no filters */
  4639                  bctl->flags |= BTRFS_BALANCE_TYPE_MASK;
  4640          }
  4641
  4642          if (bctl->flags & ~(BTRFS_BALANCE_ARGS_MASK | BTRFS_BALANCE_TYPE_MASK)) {
  4643                  ret = -EINVAL;
  4644                  goto out_bargs;

Memory leak on this path.

  4645          }
  4646
  4647  do_balance:
  4648          /*
  4649           * Ownership of bctl and mutually_exclusive_operation_running
  4650           * goes to to btrfs_balance.  bctl is freed in __cancel_balance,
  4651           * or, if restriper was paused all the way until unmount, in
  4652           * free_fs_info.  mutually_exclusive_operation_running is
  4653           * cleared in __cancel_balance.
  4654           */
  4655          need_unlock = false;
  4656
  4657          ret = btrfs_balance(bctl, bargs);

We free bctl in btrfs_balance() most times.

  4658
  4659          if (arg) {
  4660                  if (copy_to_user(arg, bargs, sizeof(*bargs)))
  4661                          ret = -EFAULT;
  4662          }
  4663
  4664  out_bargs:
  4665          kfree(bargs);
  4666  out_unlock:
  4667          mutex_unlock(&fs_info->balance_mutex);
  4668          mutex_unlock(&fs_info->volume_mutex);
  4669          if (need_unlock)
  4670                  atomic_set(&fs_info->mutually_exclusive_operation_running, 0);
  4671  out:
  4672          mnt_drop_write_file(file);
  4673          return ret;
  4674  }

regards,
dan carpenter
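
The leak reported above is an instance of a common goto-cleanup mistake:
an error path jumps to a label that releases only the earlier allocation.
A minimal userspace sketch of the pattern, and one way to avoid it,
follows. It is not the kernel code and not necessarily the fix that was
applied; struct balance_args, struct balance_ctl, SUPPORTED_FLAGS,
tracked_alloc(), tracked_free() and start_balance() are all hypothetical
names made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the btrfs definitions. */
struct balance_args { unsigned long flags; };
struct balance_ctl  { unsigned long flags; };

#define SUPPORTED_FLAGS 0x7UL	/* assumed mask, for illustration only */

/* Number of allocations not yet freed, so that a leak is observable. */
static int live_allocations;

static void *tracked_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_allocations++;
	return p;
}

static void tracked_free(void *p)
{
	if (!p)
		return;
	assert(live_allocations > 0);
	live_allocations--;
	free(p);
}

/* Returns 0 on success or a negative errno value. */
static int start_balance(unsigned long user_flags)
{
	struct balance_args *bargs;
	struct balance_ctl *bctl;
	int ret = 0;

	bargs = tracked_alloc(sizeof(*bargs));
	if (!bargs)
		return -ENOMEM;
	bargs->flags = user_flags;

	bctl = tracked_alloc(sizeof(*bctl));
	if (!bctl) {
		ret = -ENOMEM;
		goto out_bargs;
	}
	bctl->flags = bargs->flags;

	if (bctl->flags & ~SUPPORTED_FLAGS) {
		ret = -EINVAL;
		/* Jumping to out_bargs here would leak bctl. */
		goto out_bctl;
	}

	/*
	 * The kernel would hand bctl over to the balance code at this
	 * point; this sketch has no such consumer, so it simply falls
	 * through and frees both objects on success as well.
	 */
out_bctl:
	tracked_free(bctl);
out_bargs:
	tracked_free(bargs);
	return ret;
}
```

With this layout, start_balance(0x100UL) returns -EINVAL and leaves
live_allocations at 0. In the real function the success path passes
ownership of bctl to btrfs_balance(), so only the error path may free it.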

Re: btrfs-balance causes system-freeze on full disk

2015-10-21 Thread Kyle Manna

I had a number of similar btrfs balance crashes in the past few days,
but the disk wasn't full.  You should try tailing the system logs from
a remote machine when it happens. You'll likely see some bug info
before the system dies and becomes unusable.

The issue I encountered is described @
https://bugzilla.kernel.org/show_bug.cgi?id=105681

On Wed, Oct 21, 2015 at 12:38 PM, Jakob Schürz wrote:
> Hi there!
>
> Is what I've noticed possible? My system (Debian) runs on
> btrfs, and I have a lot of snapshots on my hard disk.
> For the past few days my system has been freezing completely. I noticed
> that it always happens during btrfs-balance.
>
> So I deleted some of the old snapshots and tried another balance run.
> Nothing happened... no system freeze.
>
> System freeze means: no keyboard action. The mouse is frozen, the screen
> is frozen, no magic SysRq, no ssh login.
>
> Can btrfs cause such a freeze?
>
> greez
>
> jakob
> --
> http://xundeenergie.at
> http://verkehrsloesungen.wordpress.com/
> http://cogitationum.wordpress.com/
>

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Chris Murphy

https://btrfs.wiki.kernel.org/index.php/Restore

This should still be possible even with a degraded/unmounted raid5. It
is a bit tedious to figure out how to use it, but if you've got some
things you want off the volume, it's not so difficult that it should
keep you from trying it.


Chris Murphy

btrfs-balance causes system-freeze on full disk

2015-10-21 Thread Jakob Schürz

Hi there!

Is what I've noticed possible? My system (Debian) runs on
btrfs, and I have a lot of snapshots on my hard disk.
For the past few days my system has been freezing completely. I noticed
that it always happens during btrfs-balance.

So I deleted some of the old snapshots and tried another balance run.
Nothing happened... no system freeze.

System freeze means: no keyboard action. The mouse is frozen, the screen
is frozen, no magic SysRq, no ssh login.

Can btrfs cause such a freeze?

greez

jakob
-- 
http://xundeenergie.at
http://verkehrsloesungen.wordpress.com/
http://cogitationum.wordpress.com/


Re: How to remove missing device on RAID1?

2015-10-21 Thread Kyle Manna

Hi Henk,

This trick/hack worked great for me.  After the rebalance was
complete, a sparse file + loop device, `btrfs replace` and `btrfs
delete` worked as expected.  Thanks.

In other news, I did hit a btrfs bug 3 times while attempting to
balance.  I've added my comments @
https://bugzilla.kernel.org/show_bug.cgi?id=105681#c14

On Tue, Oct 20, 2015 at 3:46 PM, Henk Slager  wrote:
> copy-paste error corrected
> On Wed, Oct 21, 2015 at 12:40 AM, Henk Slager  wrote:
>> I had a similar issue some time ago, around the time kernel 4.1.6 was
>> just there.
>> In case you don't want to wait for new disk or decide to just run the
>> filesystem with 1 disk less or maybe later on replace 1 of the still
>> healthy disks with a double/bigger sized one and use current/older
>> kernel+tools, you could do this (assuming the filesystem is not too
>> full of course):
>> - mount degraded
> - btrfs balance start -f -v -sdevid=1 -mdevid=1 -mdevid=1 
>>   (where missing disk has devid 1)
>> After completion the (virtual/missing) device shall be fully unallocated
>> - create /dev/loopX with sparse file of same size as missing disk on
>> some other filesystem
>> - btrfs replace start 1 /dev/loopX 
>> - remove /dev/loopX from the filesystem
>> - remount filesystem without degraded
>> And remove /dev/loopX
>>
>>
>> On Tue, Oct 20, 2015 at 11:48 PM, Kyle Manna <2blu...@gmail.com> wrote:
>>> Thanks for the follow-up Duncan, that makes sense.  I assumed I was
>>> doing something wrong.
>>>
>>> I downloaded the devel branch of btrfs-progs and got it running
>>> before I saw the need for a kernel patch and decided to wait.
>>>
>>> For anyone following this later, I needed to use the following to get
>>> the missing device ID:
>>>
>>> btrfs device usage 
>>>
>>> On Tue, Oct 20, 2015 at 1:58 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>>>> Kyle Manna posted on Tue, 20 Oct 2015 10:24:48 -0700 as excerpted:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a collection of three (was 4) 1-2TB devices with data and
>>>>> metadata in a RAID1 mirror.  Last night I was struck by the Click of
>>>>> Death on an old Samsung drive.
>>>>>
>>>>> I removed the device from the system, rebooted and mounted the volume
>>>>> with `-o degraded` and the file system seems fine and usable.  I'm
>>>>> waiting on a replacement drive, but want to remove the old drive and
>>>>> re-balance in the meantime.
>>>>>
>>>>> How do I remove the missing device?  I tried the `btrfs device delete
>>>>> missing /mnt` but was greeted with "ERROR: missing is not a block
>>>>> device".  A quick look at that btrfs-progs git repo shows that
>>>>> `stat("missing")` is called, which of course fails since missing isn't a
>>>>> block device.  Nothing other than `btrfs replace` seemed intuitive and
>>>>> all the docs mention the older command.  What's the move?
>>>>>
>>>>> Thanks!
>>>>> - Kyle
>>>>>
>>>>> Versions:
>>>>> Kernel: 4.2.3-1-ARCH
>>>>> btrfs-progs: 4.2.2-1
>>>>
>>>> I believe the current advice given here (that you were likely trying to
>>>> follow, wrapped link)...
>>>>
>>>> https://btrfs.wiki.kernel.org/index.php/
>>>> Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices
>>>>
>>>> ... is dated and no longer works due to a code change some time in the past.
>>>>
>>>> There's a set of (very) recent patches, to the kernel and userspace both
>>>> (I just updated userspace and it's in the git devel-branch
>>>> v4.2.3-49-g4db87a1 I just built; kernelspace, I don't see it in
>>>> linus-mainline yet, so I'd guess it's in the btrfs-integration patches,
>>>> to land in the v4.4 commit window if not in 4.3, as it's getting late in
>>>> the cycle for that).
>>>>
>>>> btrfs fi show
>>>>
>>>> That will list the btrfs component devices together with their devids.
>>>>
>>>> Then use the appropriate devid like so:
>>>>
>>>> btrfs dev del
>>>>
>>>> The -progs commit is d462081f, by Anand Jain, titled:
>>>>
>>>> btrfs-progs: Introduce device delete by devid
>>>>
>>>> According to it, the required kernel commit (title only listed) is
>>>> similar:
>>>>
>>>> Btrfs: Introduce device delete by devid
>>>>
>>>> You can probably find them on-list if you wish to cherry-pick them into a
>>>> current version.
>>>>
>>>> --
>>>> Duncan - List replies preferred.   No HTML msgs.
>>>> "Every nonfree program has a lord, a master --
>>>> and if you use the program, he is your master."  Richard Stallman


Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.

I am afraid the filesystem is really damaged by now, regardless of
its state right after the unexpected cable failure, because I tried some
dangerous options after the read-only restore/recovery methods all failed
(including zero-log, followed by init-csum-tree and even
chunk-recovery). All of them just spat out several kinds of errors,
which suggests they probably didn't even write anything to the disks
before deciding they had failed; but if they did write something, they
caused more harm than good.

Actually, I almost got rid of this data intentionally when my
new set of drives arrived. I was considering whether I should simply start
from scratch (maybe reviewing and possibly saving the "scratchpad"
portion of the data, but nothing really irreplaceable and/or valuable),
but I thought it would be a good idea to test the "device replace" function
in real life.

Even though the replace operation seemed to be successful, I am
beginning to wonder whether it really was.

On Wed, Oct 21, 2015 at 7:42 PM, ronnie sahlberg
 wrote:
> Maybe hold off erasing the drives a little in case someone wants to
> collect some extra data for diagnosing how/why the filesystem got into
> this unrecoverable state.
>
> A single device having issues should not cause the whole filesystem to
> become unrecoverable.
>
> On Wed, Oct 21, 2015 at 9:09 AM, Janos Toth F.  wrote:
>> I went through all the recovery options I could find (starting from
>> read-only to "extraordinarily dangerous"). Nothing seemed to work.
>>
>> A Windows-based proprietary recovery tool (ReclaiMe) could scratch
>> the surface but only that (it showed me the whole original folder
>> structure after a few minutes of scanning and the "preview" of some
>> plaintext files was promising but most of the bigger files seemed
>> to be broken).
>>
>> I used this as bulk storage for backups and all the things I didn't
>> care to keep in more than one copy, but that includes my
>> "scratchpad", so I cared enough to use RAID5 mode and to try restoring
>> some things.
>>
>> Any last ideas before I "ata secure erase" and sell/repurpose the disks?

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.

I tried several things, including the degraded mount option. One example:

# mount /dev/sdb /data -o ro,degraded,nodatasum,notreelog
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

# cat /proc/kmsg
<6>[  262.616929] BTRFS info (device sdd): allowing degraded mounts
<6>[  262.616943] BTRFS info (device sdd): setting nodatasum
<6>[  262.616949] BTRFS info (device sdd): disk space caching is enabled
<6>[  262.616953] BTRFS: has skinny extents
<6>[  262.652671] BTRFS: bdev (null) errs: wr 858, rd 8057, flush 280,
corrupt 0, gen 0
<3>[  262.697162] BTRFS (device sdd): parent transid verify failed on
38719488 wanted 101765 found 101223
<3>[  262.697633] BTRFS (device sdd): parent transid verify failed on
38719488 wanted 101765 found 101223
<3>[  262.697660] BTRFS: Failed to read block groups: -5
<3>[  262.709885] BTRFS: open_ctree failed
<6>[  267.197365] BTRFS info (device sdd): allowing degraded mounts
<6>[  267.197385] BTRFS info (device sdd): setting nodatasum
<6>[  267.197397] BTRFS info (device sdd): disabling tree log
<6>[  267.197406] BTRFS info (device sdd): disk space caching is enabled
<6>[  267.197412] BTRFS: has skinny extents
<6>[  267.232809] BTRFS: bdev (null) errs: wr 858, rd 8057, flush 280,
corrupt 0, gen 0
<3>[  267.246167] BTRFS (device sdd): parent transid verify failed on
38719488 wanted 101765 found 101223
<3>[  267.246706] BTRFS (device sdd): parent transid verify failed on
38719488 wanted 101765 found 101223
<3>[  267.246727] BTRFS: Failed to read block groups: -5
<3>[  267.261392] BTRFS: open_ctree failed

On Wed, Oct 21, 2015 at 6:09 PM, Janos Toth F.  wrote:
> I went through all the recovery options I could find (starting from
> read-only to "extraordinarily dangerous"). Nothing seemed to work.
>
> A Windows-based proprietary recovery tool (ReclaiMe) could scratch
> the surface but only that (it showed me the whole original folder
> structure after a few minutes of scanning and the "preview" of some
> plaintext files was promising but most of the bigger files seemed
> to be broken).
>
> I used this as bulk storage for backups and all the things I didn't
> care to keep in more than one copy, but that includes my
> "scratchpad", so I cared enough to use RAID5 mode and to try restoring
> some things.
>
> Any last ideas before I "ata secure erase" and sell/repurpose the disks?

Re: Lockup in BTRFS_IOC_CLONE/Kernel 4.2.0-rc5

2015-10-21 Thread Filipe Manana

On Fri, Aug 7, 2015 at 5:01 AM, Liu Bo  wrote:
> Hi,
>
> On Wed, Aug 05, 2015 at 10:28:05AM +0200, Elias Probst wrote:
>> I can reproduce a hard btrfs lockup (process issuing the ioctl() is in
>> D-state, same goes for btrfs-transacti process) on Kernel 4.2.0-rc5.
>>
>> I had the same issue on 4.1, so it's unlikely a regression introduced in
>> 4.2.
>>
>> ## With the following steps, I can reproduce the problem:
>>
>> 1. Create a new clean btrfs volume for /var/lib/machines
>> machinectl set-limit 6G
>>
>> 2. Paste this to /tmp/yum.conf
>> [main]
>> reposdir=/dev/null
>> gpgcheck=0
>> logfile=/var/log/yum.log
>> installroot=/var/lib/machines/centos7.1-base
>> assumeyes=1
>>
>> [base]
>> name=CentOS 7.1.1503 - x86_64
>> baseurl=http://mirror.centos.org/centos/7.1.1503/os/x86_64/
>> enabled=1
>>
>> 3. Bootstrap a CentOS 7.1 base image
>> /usr/bin/yum -c /tmp/yum.conf groupinstall Base
>>
>> 4. Start an ephemeral systemd-nspawn container based on 'centos7.1-base'
>> strace -o /tmp/systemd-nspawn.out -s 500 -f systemd-nspawn -xbD
>> /var/lib/machines/centos7.1-base/
>>
>>
>> `systemd-nspawn` will now just hang forever.
>> I couldn't come up yet with a shorter/more low-level way to reproduce
>> this as I lack quite a bit of btrfs experience.
>
> Thank you for reporting this.
>
> Could you do 'echo w > /proc/sysrq-trigger' to gather the whole hang
> call stack?
>
> Here's a quick patch that may address your problem, can you give it a
> shot after getting sysrq-w output?

Hi Liu,

Can you please send the patch git formatted, in a dedicated thread,
with a change log, signed-off-by tag, etc, so that it hopefully gets
in to 4.4? You can also add my Reviewed-by: Filipe Manana


Thanks

>
> Thanks,
>
> -liubo
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 0770c91..b52bd66 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -3478,6 +3478,22 @@ process_slot:
> drop_start = new_key.offset;
>
> /*
> +* We need to look up the roots that point at
> +* this bytenr and see if the new root does.  If
> +* it does not we need to make sure we update
> +* quotas appropriately.
> +*/
> +   if (disko && root != BTRFS_I(src)->root &&
> +   disko != last_disko) {
> +   no_quota = check_ref(trans, root,
> +disko);
> +   if (no_quota < 0) {
> +   ret = no_quota;
> +   goto out;
> +   }
> +   }
> +
> +   /*
>  * 1 - adjusting old extent (we may have to
>  * split it)
>  * 1 - add new extent
>  * 1 - inode update
> @@ -3544,27 +3560,6 @@ process_slot:
> btrfs_set_file_extent_num_bytes(leaf, extent,
> datal);
>
> -   /*
> -* We need to look up the roots that point at
> -* this bytenr and see if the new root does.  If
> -* it does not we need to make sure we update
> -* quotas appropriately.
> -*/
> -   if (disko && root != BTRFS_I(src)->root &&
> -   disko != last_disko) {
> -   no_quota = check_ref(trans, root,
> -disko);
> -   if (no_quota < 0) {
> -   btrfs_abort_transaction(trans,
> -   root,
> -   ret);
> -   btrfs_end_transaction(trans,
> - root);
> -   ret = no_quota;
> -   goto out;
> -   }
> -   }
> -
> if (disko) {
> inode_add_bytes(inode, datal);
> ret = btrfs_inc_extent_ref(trans, root,
>
>
>>
>> ## Results:
>>
>> - Last 'strace' lines
>> 6095  fchown(16, 0, 0)  = 0
>> 6095  fchmod(16, 0755)  = 0
>> 6095  utimensat(16, NULL, {{1402362275, 0}, {1438761285, 819041906}}, 0) = 0
>> 6095  flistxattr(15, "", 100)   = 0

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread ronnie sahlberg

Maybe hold off erasing the drives a little in case someone wants to
collect some extra data for diagnosing how/why the filesystem got into
this unrecoverable state.

A single device having issues should not cause the whole filesystem to
become unrecoverable.

On Wed, Oct 21, 2015 at 9:09 AM, Janos Toth F.  wrote:
> I went through all the recovery options I could find (starting from
> read-only to "extraordinarily dangerous"). Nothing seemed to work.
>
> A Windows-based proprietary recovery tool (ReclaiMe) could scratch
> the surface but only that (it showed me the whole original folder
> structure after a few minutes of scanning and the "preview" of some
> plaintext files was promising but most of the bigger files seemed
> to be broken).
>
> I used this as bulk storage for backups and all the things I didn't
> care to keep in more than one copy, but that includes my
> "scratchpad", so I cared enough to use RAID5 mode and to try restoring
> some things.
>
> Any last ideas before I "ata secure erase" and sell/repurpose the disks?

Re: Expected behavior of bad sectors on one drive in a RAID1

2015-10-21 Thread Austin S Hemmelgarn


On 2015-10-21 12:01, Chris Murphy wrote:
> On Wed, Oct 21, 2015 at 2:07 PM, Austin S Hemmelgarn wrote:
>> And I realize of course right after sending this that my other reply didn't
>> get through because GMail refuses to send mail in plain text, no matter how
>> hard I beat it over the head...
>
> In the web browser version, to the right of the trash can for an email
> being written, there is an arrow with a drop down menu that includes
> "plain text mode" option which will work. This is often sticky, but
> randomly with the btrfs list the replies won't have this option
> checked and then they bounce. It's annoying. And then both the Gmail
> and Inbox Android apps have no such option so it's not possible to reply
> to list emails from a mobile device short of changing mail clients
> just for this purpose.

I actually didn't know about the option in the drop down menu in the
Web-UI, although that wouldn't have been particularly relevant in this
case as I was replying from my phone.  What's really annoying in that
case is that the 'Reply Inline' option makes things _look_ like they're
plain text, but they really aren't.


I've considered getting a different mail app, but for some reason the 
only one I can find for Android that supports plain text e-mail is K-9 
Mail, and I'm not too fond of the UI for that, and it takes way more 
effort to set up than I'm willing to put in for something I almost never 
use anyway (that and it doesn't (AFAICT) support S/MIME or Hashcash, 
although GMail doesn't either, so that one's not a show stopper).






Re: [PATCH] Btrfs: Improve FL_KEEP_SIZE handling in fallocate.

2015-10-21 Thread Davide Italiano

On Wed, Jul 22, 2015 at 10:45 AM, Davide Italiano  wrote:
> On Fri, Jun 26, 2015 at 7:08 AM, David Sterba  wrote:
>> On Mon, Apr 06, 2015 at 10:09:15PM -0700, Davide Italiano wrote:
>>> - We call inode_size_ok() only if FL_KEEP_SIZE isn't specified.
>>> - As an optimisation we can skip the call if (off + len)
>>>   isn't greater than the current size of the file. This operation
>>>   is called under the lock so the less work we do, the better.
>>> - If we call inode_size_ok() pass to it the correct value rather
>>>   than a more conservative estimation.
>>>
>>> Signed-off-by: Davide Italiano 
>>
>> Reviewed-by: David Sterba 
>
> Hi Chris, this has been around for a while and it's been reviewed by
> multiple people. Any chances you can pull in your branch?
>
> Thanks,
>
> --
> Davide

Any chance to get this in?
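
The three points of the commit message quoted above can be condensed into
a single decision. A rough sketch, assuming nothing about the actual patch
beyond that description: size_to_check() and SKETCH_KEEP_SIZE are
hypothetical names, and real code would pass the result to the kernel's
size-limit check rather than return it.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the keep-size fallocate flag; the value is illustrative. */
#define SKETCH_KEEP_SIZE 0x01

/*
 * Decide whether an fallocate-style request needs a file-size limit check.
 * Returns 0 when no check is needed, otherwise the exact new file size
 * (offset + len) that should be checked.
 */
static uint64_t size_to_check(int mode, uint64_t offset, uint64_t len,
			      uint64_t cur_size)
{
	/* Precondition: the end of the range must not overflow. */
	assert(len <= UINT64_MAX - offset);

	/* With keep-size the file size never changes: nothing to check. */
	if (mode & SKETCH_KEEP_SIZE)
		return 0;

	/* The range ends inside the current file: the size does not grow. */
	if (offset + len <= cur_size)
		return 0;

	/* The file grows: check the real new size, not a looser estimate. */
	return offset + len;
}
```

For a 100-byte file, a request for offset 60 and length 50 yields 110,
while the same request with the keep-size flag set yields 0, so the check
is skipped entirely.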

[PULL] Btrfs cleanups for 4.4 (collected)

2015-10-21 Thread David Sterba

Hi,

a bunch of cleanup patches from the mailing list, no functional changes.
Reviewed and compile-tested, merge with current integration is without
conflicts. Please pull, thanks.


The following changes since commit 25cb62b76430a91cc6195f902e61c2cb84ade622:

  Linux 4.3-rc5 (2015-10-11 11:09:45 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git cleanups/for-4.4

for you to fetch changes up to ddd664f4478a4aaf0ac67a4297e33f1992bc9be2:

  btrfs: reada: Fix returned errno code (2015-10-21 18:29:50 +0200)


Alexandru Moise (8):
  btrfs: Fixed dsize and last_off declarations
  btrfs: Fixed declaration of old_len
  btrfs: trimming some start_transaction() code away
  btrfs: memset cur_trans->delayed_refs to zero
  btrfs: use a single if() statement for one outcome in get_block_rsv()
  btrfs: cleanup btrfs_balance profile validity checks
  btrfs: change num_items type from u64 to unsigned int
  btrfs: declare rsv_count as unsigned int instead of int

Byongho Lee (4):
  btrfs: replace unnecessary list_for_each_entry_safe to list_for_each_entry
  btrfs: remove unnecessary list_del
  btrfs: cleanup iterating over prop_handlers array
  btrfs: compress: put variables defined per compress type in struct to make cache friendly

Chandan Rajendra (1):
  Btrfs: btrfs_submit_bio_hook: Use btrfs_wq_endio_type values instead of integer constants

Geliang Tang (1):
  btrfs: fix a comment typo

Luis de Bethencourt (2):
  btrfs: check-integrity: Fix returned errno codes
  btrfs: reada: Fix returned errno code

Shan Hai (1):
  btrfs/file.c: remove an unsed varialbe first_index

Zhao Lei (3):
  btrfs: Move btrfs_raid_array to public
  btrfs: use btrfs_raid_array for btrfs_get_num_tolerated_disk_barrier_failures()
  btrfs: use btrfs_raid_array in btrfs_reduce_alloc_profile

 fs/btrfs/check-integrity.c  |   4 +-
 fs/btrfs/compression.c      |  94 +-
 fs/btrfs/ctree.c            |   4 +-
 fs/btrfs/disk-io.c          |  35 ++
 fs/btrfs/extent-tree.c      |  58 +++-
 fs/btrfs/file.c             |   5 +-
 fs/btrfs/free-space-cache.c |   2 +-
 fs/btrfs/inode.c            |   5 +-
 fs/btrfs/ioctl.c            |   3 +-
 fs/btrfs/props.c            |  13 ++--
 fs/btrfs/reada.c            |   8 ++-
 fs/btrfs/root-tree.c        |   2 +-
 fs/btrfs/transaction.c      |  32 +++--
 fs/btrfs/transaction.h      |   5 +-
 fs/btrfs/volumes.c          | 156 +---
 fs/btrfs/volumes.h          |   5 ++
 16 files changed, 220 insertions(+), 211 deletions(-)

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread ronnie sahlberg

If it is for mostly archival storage, I would suggest you take a look
at snapraid.


On Wed, Oct 21, 2015 at 9:09 AM, Janos Toth F.  wrote:
> I went through all the recovery options I could find (starting from
> read-only to "extraordinarily dangerous"). Nothing seemed to work.
>
> A Windows-based proprietary recovery tool (ReclaiMe) could scratch
> the surface but only that (it showed me the whole original folder
> structure after a few minutes of scanning and the "preview" of some
> plaintext files was promising but most of the bigger files seemed
> to be broken).
>
> I used this as bulk storage for backups and all the things I didn't
> care to keep in more than one copy, but that includes my
> "scratchpad", so I cared enough to use RAID5 mode and to try restoring
> some things.
>
> Any last ideas before I "ata secure erase" and sell/repurpose the disks?

Re: How to remove missing device on RAID1?

2015-10-21 Thread Henk Slager

On Wed, Oct 21, 2015 at 11:54 AM, Dmitry Katsubo
 wrote:
> On 2015-10-21 00:40, Henk Slager wrote:
>> I had a similar issue some time ago, around the time kernel 4.1.6 was
>> just there.
>> In case you don't want to wait for new disk or decide to just run the
>> filesystem with 1 disk less or maybe later on replace 1 of the still
>> healthy disks with a double/bigger sized one and use current/older
>> kernel+tools, you could do this (assuming the filesystem is not too
>> full of course):
>> - mount degraded
>> - btrfs balance start -f -v -sdevid=1 -mdevid=1 -ddevid=1 
>>   (where missing disk has devid 1)
>
> Am I right that one can "btrfs dev delete 1" after balance succeeded?
If the patch that Duncan describes is applied, then yes, this is then
the way to remove (delete) the missing device from the filesystem.

>> After completion the (virtual/missing) device shall be fully unallocated
>> - create /dev/loopX with sparse file of same size as missing disk on
>> some other filesystem
>> - btrfs replace start 1 /dev/loopX 
>> - remove /dev/loopX from the filesystem
>> - remount filesystem without degraded
>> And remove /dev/loopX
>
> It would be nice if btrfs allowed deleting a device and performing a rebalance
> automatically (provided that the remaining devices still have enough space to
> satisfy the raidX prerequisite).
This would indeed be nice, but I guess it should be working already,
looking at older (3.x) btrfs-tools documentation. It's just that it
could not be used for some time. Anyhow, once this patch is more
commonly available, people can test/use it and see if still a manual
extra specific balance is needed in order to get rid of or work-around
issues. It is actually the functionality of the '-r' option of 'btrfs
replace' that is then applied.

Re: Btrfs/RAID5 became unmountable after SATA cable fault

2015-10-21 Thread Janos Toth F.

I went through all the recovery options I could find (starting from
read-only to "extraordinarily dangerous"). Nothing seemed to work.

A Windows-based proprietary recovery tool (ReclaiMe) could scratch
the surface but only that (it showed me the whole original folder
structure after a few minutes of scanning and the "preview" of some
plaintext files was promising but most of the bigger files seemed
to be broken).

I used this as bulk storage for backups and all the things I didn't
care to keep in more than one copy, but that includes my
"scratchpad", so I cared enough to use RAID5 mode and to try restoring
some things.

Any last ideas before I "ata secure erase" and sell/repurpose the disks?

Re: Expected behavior of bad sectors on one drive in a RAID1

2015-10-21 Thread Chris Murphy

On Wed, Oct 21, 2015 at 2:07 PM, Austin S Hemmelgarn
 wrote:
> And I realize of course right after sending this that my other reply didn't
> get through because GMail refuses to send mail in plain text, no matter how
> hard I beat it over the head...

In the web browser version, to the right of the trash can for an email
being written, there is an arrow with a drop down menu that includes
"plain text mode" option which will work. This is often sticky, but
randomly with the btrfs list the replies won't have this option
checked and then they bounce. It's annoying. And then both the Gmail
and Inbox Android apps have no such option so it's not possible to reply
to list emails from a mobile device short of changing mail clients
just for this purpose.

The smarter thing to do is server side conversion of HTML to plain
text, stripping superfluous formatting. Bouncing mails is just as bad
a UX as Google not providing a plain text option in their mobile apps.

-- 
Chris Murphy

Re: [RFC PATCH 1/3] btrfs-progs: introduce framework to check kernel supported features

2015-10-21 Thread Eric Sandeen

On 10/21/15 9:49 AM, Anand Jain wrote:
> 
> 
> On 10/21/2015 10:41 PM, Eric Sandeen wrote:
>> On 10/21/15 4:09 AM, Qu Wenruo wrote:
>>>> +static int get_kernel_code()
>>>> +{
>>>> +int ret;
>>>> +struct utsname utsbuf;
>>>> +char *version;
>>>> +
>>>> +ret = uname(&utsbuf);
>>>> +if (ret)
>>>> +return -ret;
>>>> +
>>>> +version = strtok(utsbuf.release, "-");
>>>> +
>>>> +return version_to_code(version);
>>>> +}
>>>
>>> The only problem is, kernel version is never reliable.
>>> If someone wants, uname output may even contain no numeric value.
>>
>> yep, I agree.  This will be misery for any custom kernel.
> 
>  How if we apply this only when kernel version is available ?

The problem is "kernel version" may not match btrfs version.
Distros backport and update subsystems without changing the
kernel version.

>  Otherwise progs will assume all features are supported as in
>  the current design.
> 
> Thanks, Anand

This is only a concern for kernels prior to 3.14, right?
v3.13 was released Sun Jan 19 18:40:23 2014, almost 2 years ago.

What has raised the current concern about these old kernels?
Why does this need fixing in upstream code?

-Eric


Re: [PATCH 1/1] btrfs/file.c: remove an unsed varialbe first_index

2015-10-21 Thread David Sterba

On Mon, Sep 21, 2015 at 11:40:57AM +0800, Shan Hai wrote:
> From: Shan Hai 
> 
> The commit b37392ea86761 ("Btrfs: cleanup unnecessary parameter
> and variant of prepare_pages()") makes it redundant.
> 
> Signed-off-by: Shan Hai 

Reviewed-by: David Sterba 

Re: [PATCH v2] btrfs: cleanup btrfs_balance profile validity checks

2015-10-21 Thread David Sterba

On Tue, Sep 22, 2015 at 08:02:25PM +, Alexandru Moise wrote:
> Improve readability by generalizing the profile validity checks.
> 
> Signed-off-by: Alexandru Moise <00moses.alexande...@gmail.com>

Reviewed-by: David Sterba 

Re: [RFC PATCH 1/3] btrfs-progs: introduce framework to check kernel supported features

2015-10-21 Thread Anand Jain




On 10/21/2015 10:41 PM, Eric Sandeen wrote:

On 10/21/15 4:09 AM, Qu Wenruo wrote:

+static int get_kernel_code()
+{
+int ret;
+struct utsname utsbuf;
+char *version;
+
+ret = uname(&utsbuf);
+if (ret)
+return -ret;
+
+version = strtok(utsbuf.release, "-");
+
+return version_to_code(version);
+}


The only problem is, kernel version is never reliable.
If someone wants, uname output may even contain no numeric value.


yep, I agree.  This will be misery for any custom kernel.


 How if we apply this only when kernel version is available ?
 Otherwise progs will assume all features are supported as in
 the current design.

Thanks, Anand



IIRC, I suggest to maintain similar feature matrix in fstests, but Dave pointed 
out the above problem.

So I'm not fan of reading kernel version and generate supported features for 
that.

IMHO, just use /sys/fs/btrfs/features is good enough.


*nod*


And if there is no such file, just ignore it, user is responsible for
such case.


Yep, 3.14 was over a year and a half ago, I don't see much point in
hardcoding kernel versions for such old kernels in the current
upstream codebase.

The only kernels that old still running are likely distro kernels, and
they can solve this problem by backporting the /sys/fs/btrfs/features
patch.

-Eric


Thanks,
Qu



Re: [RFC PATCH 1/3] btrfs-progs: introduce framework to check kernel supported features

2015-10-21 Thread Eric Sandeen

On 10/21/15 4:09 AM, Qu Wenruo wrote:
>> +static int get_kernel_code()
>> +{
>> +int ret;
>> +struct utsname utsbuf;
>> +char *version;
>> +
>> +ret = uname(&utsbuf);
>> +if (ret)
>> +return -ret;
>> +
>> +version = strtok(utsbuf.release, "-");
>> +
>> +return version_to_code(version);
>> +}
> 
> The only problem is, kernel version is never reliable.
> If someone wants, uname output may even contain no numeric value.

yep, I agree.  This will be misery for any custom kernel.
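
To make the concern concrete: a release string is free-form, so any parser has to be able to answer "unknown" rather than assume "major.minor" is present. The following is a minimal sketch under that assumption. `release_to_code()` and `KVER_CODE()` are hypothetical names that are not part of the patch, and the sketch does nothing about the separate problem of distro backports.

```c
#include <stdio.h>

/* Same layout as KERNEL_VERSION() in <linux/version.h>; redefined here
 * only so the sketch is self-contained. */
#define KVER_CODE(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: turn a uname release string such as
 * "4.2.0-1-amd64" into a version code.  Returns -1 when the string
 * does not start with "major.minor", so the caller can treat the
 * kernel version as unknown instead of guessing.
 */
int release_to_code(const char *release)
{
	int major = 0, minor = 0, patch = 0;
	int fields;

	if (!release)
		return -1;

	/* sscanf stops at the first character that does not fit the format */
	fields = sscanf(release, "%d.%d.%d", &major, &minor, &patch);
	if (fields < 2 || major < 0 || minor < 0)
		return -1;
	if (fields < 3 || patch < 0)
		patch = 0;

	return KVER_CODE(major, minor, patch);
}
```

A caller would pass `utsbuf.release` from `uname()` and skip any version-based decision when the result is -1.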

> IIRC, I suggest to maintain similar feature matrix in fstests, but Dave 
> pointed out the above problem.
> 
> So I'm not fan of reading kernel version and generate supported features for 
> that.
> 
> IMHO, just use /sys/fs/btrfs/features is good enough.

*nod*

> And if there is no such file, just ignore it, user is responsible for
> such case.

Yep, 3.14 was over a year and a half ago, I don't see much point in
hardcoding kernel versions for such old kernels in the current
upstream codebase.

The only kernels that old still running are likely distro kernels, and
they can solve this problem by backporting the /sys/fs/btrfs/features
patch.

-Eric
 
> Thanks,
> Qu


Re: [RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Qu Wenruo




On 2015-10-21 22:15, Anand Jain wrote:


Thanks for the comments.. more below.

On 10/21/2015 05:12 PM, Qu Wenruo wrote:



Anand Jain wrote on 2015/10/21 16:45 +0800:

mkfs from latest btrfs-progs will enable latest default features,
and if the kernel is down-rev and does not support a latest default
feature then mount fails, as expected.

This patch disables default features based on the running kernel.

Signed-off-by: Anand Jain 
---
  mkfs.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mkfs.c b/mkfs.c
index a5802f7..2b9d734 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1357,10 +1357,13 @@ int main(int ac, char **av)
  int dev_cnt = 0;
  int saved_optind;
  char fs_uuid[BTRFS_UUID_UNPARSED_SIZE] = { 0 };
-u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+u64 features;
  struct mkfs_allocation allocation = { 0 };
  struct btrfs_mkfs_config mkfs_cfg;

+features = btrfs_features_allowed_by_kernel();
+features &= BTRFS_MKFS_DEFAULT_FEATURES;
+


Despite the problem with btrfs_features_allowed_by_kernel() that I mentioned
in the previous mail, the behavior is a little too aggressive for me.

A user with an old kernel would never be able to create a filesystem with
newer features.
Maybe the user is just making a btrfs filesystem for a newer kernel?


   I don't understand the complete picture here; is there an
example you can give?

Thanks, Anand



Sorry, my fault.
Your features are set before parse_fs_features(),
so it was just my misunderstanding.

However, I still prefer a warning rather than just changing the default
features.

Thanks,
Qu


IMHO, it's better to output a warning rather than silently changing the
features.

Thanks,
Qu


  while(1) {
  int c;
  static const struct option long_options[] = {



Re: [RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Anand Jain



Thanks for the comments.. more below.

On 10/21/2015 05:12 PM, Qu Wenruo wrote:



Anand Jain wrote on 2015/10/21 16:45 +0800:

mkfs from latest btrfs-progs will enable latest default features,
and if the kernel is down-rev and does not support a latest default
feature then mount fails, as expected.

This patch disables default features based on the running kernel.

Signed-off-by: Anand Jain 
---
  mkfs.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mkfs.c b/mkfs.c
index a5802f7..2b9d734 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1357,10 +1357,13 @@ int main(int ac, char **av)
  int dev_cnt = 0;
  int saved_optind;
  char fs_uuid[BTRFS_UUID_UNPARSED_SIZE] = { 0 };
-u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+u64 features;
  struct mkfs_allocation allocation = { 0 };
  struct btrfs_mkfs_config mkfs_cfg;

+features = btrfs_features_allowed_by_kernel();
+features &= BTRFS_MKFS_DEFAULT_FEATURES;
+


Despite the problem with btrfs_features_allowed_by_kernel() that I mentioned
in the previous mail, the behavior is a little too aggressive for me.

A user with an old kernel would never be able to create a filesystem with
newer features.
Maybe the user is just making a btrfs filesystem for a newer kernel?


  I don't understand the complete picture here; is there an
example you can give?


Thanks, Anand


IMHO, it's better to output a warning rather than silently changing the
features.

Thanks,
Qu


  while(1) {
  int c;
  static const struct option long_options[] = {



Re: [PATCH] btrfs-progs: Fix a typo which causes super_copy corrupted

2015-10-21 Thread David Sterba

On Tue, Oct 20, 2015 at 10:20:44AM +0800, Qu Wenruo wrote:
> In btrfs_read_dev_super(), a wrong parameter for memcpy will copy wrong
> address of a stack pointer to super_copy.
> Which will cause a lot of unpredictable error.
> 
> What a stupid typo from myself!
> 
> Signed-off-by: Qu Wenruo 

Thanks.  I'd rather fold the change into the original patch, as that patch
seems to break a few things. Leaving the breakage in the history would make
any future bisecting hard, with failing tests etc. I'm glad you found it
quickly.

Re: Expected behavior of bad sectors on one drive in a RAID1

2015-10-21 Thread Austin S Hemmelgarn


On 2015-10-21 07:51, Austin S Hemmelgarn wrote:

On 2015-10-20 15:59, Austin S Hemmelgarn wrote:

On 2015-10-20 15:20, Duncan wrote:

Yes, there's some small but not infinitesimal chance the checksum may be
wrong, but if there's two copies of the data and the checksum on one is
wrong while the checksum on the other verifies... yes, there's still
that
small chance that the one that verifies is wrong too, but that it's any
worse than the one that does not verify?  /That's/ getting close to
infinitesimal, or at least close enough for the purposes of a mailing-
list claim without links to supporting evidence by someone who has
already characterized it as not mathematically rigorous... and for me,
personally.  I'm not spending any serious time thinking about getting
hit
by lightning, either, tho by the same token I don't go out flying kites
or waving long metal rods around in lightning storms, either.

With a 32-bit checksum and a 4k block (the math is easier with smaller
numbers), that's 4128 bits, which means that a random single bit error
will have an approximately 0.24% chance of occurring in a given bit,
which translates to an approximately 7.75% chance that it will occur in
one of the checksum bits.  For a 16k block it's smaller of course
(around 1.8% I think, but that's just a guess), but it's still
sufficiently statistically likely that it should be considered.

As mentioned in my other reply to this, I did the math wrong (bit of a
difference between kilobit and kilobyte)

And I realize of course right after sending this that my other reply
didn't get through because GMail refuses to send mail in plain text, no
matter how hard I beat it over the head...






Re: [devel branch]Btrfs-convert assert at volumes.c:1846

2015-10-21 Thread David Sterba

On Tue, Oct 20, 2015 at 10:33:37AM +0800, Qu Wenruo wrote:
> Finally, the bug is found.

Great, thanks!
> 
> A SUPER AWKWARD TYPO

Happens ... :)

> See the patch just submitted:
> [PATCH] btrfs-progs: Fix a typo which causes super_copy corrupted
> 
> And, I think that's also the reason why PPC64 report errors.
> But unfortunately, there is no PPC64 machine here for me to test.

For now we can use convert and mkfs to test it a bit. Convert crashes on
4k sectors, and with "mkfs.btrfs -n 64k -s 64k --rootdir " we can
partially simulate the ppc64 parameters.

Re: Expected behavior of bad sectors on one drive in a RAID1

2015-10-21 Thread Austin S Hemmelgarn


On 2015-10-20 15:59, Austin S Hemmelgarn wrote:

On 2015-10-20 15:20, Duncan wrote:

Yes, there's some small but not infinitesimal chance the checksum may be
wrong, but if there's two copies of the data and the checksum on one is
wrong while the checksum on the other verifies... yes, there's still that
small chance that the one that verifies is wrong too, but that it's any
worse than the one that does not verify?  /That's/ getting close to
infinitesimal, or at least close enough for the purposes of a mailing-
list claim without links to supporting evidence by someone who has
already characterized it as not mathematically rigorous... and for me,
personally.  I'm not spending any serious time thinking about getting hit
by lightning, either, tho by the same token I don't go out flying kites
or waving long metal rods around in lightning storms, either.

With a 32-bit checksum and a 4k block (the math is easier with smaller
numbers), that's 4128 bits, which means that a random single bit error
will have an approximately 0.24% chance of occurring in a given bit,
which translates to an approximately 7.75% chance that it will occur in
one of the checksum bits.  For a 16k block it's smaller of course
(around 1.8% I think, but that's just a guess), but it's still
sufficiently statistically likely that it should be considered.

As mentioned in my other reply to this, I did the math wrong (bit of a
difference between kilobit and kilobyte), so here's a (hopefully)
correct and more thorough analysis:


For 4 KiB blocks (32768 bits):
There are a total of 32800 bits when including a 32-bit checksum stored
outside the block, so a single-bit error has a ~0.003% (1 in 32800)
chance of landing on any given bit.  This in turn means an approximately
0.098% (32 in 32800) chance that a single-bit error lands in the
checksum.


For 16 KiB blocks (131072 bits):
There are a total of 131104 bits when including a 32-bit checksum stored
outside the block, so a single-bit error has a ~0.0008% (1 in 131104)
chance of landing on any given bit.  This in turn means an approximately
0.024% (32 in 131104) chance that a single-bit error lands in the
checksum.
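
The proportions are easy to recompute. This is a minimal sketch, assuming a single bit flip is equally likely to hit any stored bit; `checksum_hit_ratio()` is a hypothetical name used only for illustration.

```c
/*
 * Fraction of single-bit errors that land in the checksum rather than
 * in the data it protects, assuming every stored bit is equally likely
 * to be hit.  Both sizes are given in bits.
 */
double checksum_hit_ratio(unsigned long data_bits, unsigned long csum_bits)
{
	unsigned long total_bits = data_bits + csum_bits;

	if (total_bits == 0)
		return 0.0;

	return (double)csum_bits / (double)total_bits;
}
```

For a 4 KiB block with a 32-bit checksum this evaluates 32 / 32800, just under 0.1%; for a 16 KiB block it evaluates 32 / 131104, roughly 0.024%.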


This all of course assumes a naive interpretation of how modern block
storage devices work.  All modern hard drives and SSDs include, at a
minimum, the ability to correct single-bit errors per byte and detect
double-bit errors per byte, which means that we need a triple-bit error
in the same byte to get bad data back.  That in turn makes the numbers
small enough that it's impractical to represent them without scientific
notation (on the order of 10^-5).


That in turn assumes zero correlation beyond what's required to get bad
data back from the storage.  However, if there is enough correlation for
that to happen, it's statistically likely that there will be other
errors very close by.  This in turn means that it's more likely that the
checksum is either correct or absolutely completely wrong, which
increases the chances that the resultant metadata block containing the
checksum will not appear to have an incorrect checksum itself (because
checksums are good at detecting proportionately small errors, but only
mediocre at detecting very big errors).


The approximate proportionate chances of an error in the data versus the
checksum are still roughly the same, however, irrespective of how
small the chances of getting any error are.  Based on this, the ratio of
the size of the checksum to the size of the data is a tradeoff that
needs to be considered: the closer the ratio is to 1, the higher the
chance of having an error in the checksum, but the less data you need to
correct/verify when there is an error.





Re: [PATCH 05/10] btrfs-progs: mute coverity warnings about deadcode

2015-10-21 Thread David Sterba

On Mon, Oct 19, 2015 at 07:37:55PM +0800, Eryu Guan wrote:
> Coverity reports execution cannot reach this statements. So put WARN_ON
> in if-else conditions.
> 
> Signed-off-by: Eryu Guan 

Applied, thanks.

Re: [PATCH] Btrfs: add a check of whether fs_info->fs_root is NULL in btrfs_async_reclaim_metadata_space()

2015-10-21 Thread David Sterba

On Wed, Oct 21, 2015 at 04:20:00PM +0900, Tsutomu Itoh wrote:
> Kernel panic occurred due to NULL pointer reference in can_overcommit().
> Because btrfs_async_reclaim_metadata_space() passed NULL pointer to
> btrfs_calc_reclaim_metadata_size().

> fs_info->fs_root is referred in btrfs_async_reclaim_metadata_space()
> when mount kicked kworker(btrfs_async_reclaim_metadata_space).
> 
> But at this time, fs_info->fs_root had not been initialized yet,
> so NULL pointer passed to btrfs_calc_reclaim_metadata_size().

I don't think it's the right fix, the initialization sequence should
take care of such situations. The fs_tree must exist at the time we
reach the point where it crashed, the code expects it.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: How to remove missing device on RAID1?

2015-10-21 Thread Dmitry Katsubo

On 2015-10-21 00:40, Henk Slager wrote:
> I had a similar issue some time ago, around the time kernel 4.1.6 was
> just there.
> In case you don't want to wait for new disk or decide to just run the
> filesystem with 1 disk less or maybe later on replace 1 of the still
> healthy disks with a double/bigger sized one and use current/older
> kernel+tools, you could do this (assuming the filesystem is not too
> full of course):
> - mount degraded
> - btrfs balance start -f -v -sdevid=1 -sdevid=1 -sdevid=1 
>   (where missing disk has devid 1)

Am I right that one can "btrfs dev delete 1" after balance succeeded?

> After completion the (virtual/missing) device shall be fully unallocated
> - create /dev/loopX with sparse file of same size as missing disk on
> some other filesystem
> - btrfs replace start 1 /dev/loopX 
> - remove /dev/loopX from the filesystem
> - remount filesystem without degraded
> And remove /dev/loopX

It would be nice if btrfs allowed deleting a device and performed the
rebalance automatically (provided that the remaining devices still have
enough space to satisfy the raidX prerequisites).

-- 
With best regards,
Dmitry

Re: [PATCH] btrfs: fix possible leak in btrfs_ioctl_balance()

2015-10-21 Thread David Sterba

On Wed, Oct 21, 2015 at 12:50:06AM +0200, Christian Engelmayer wrote:
> Commit 8eb934591f8b ("btrfs: check unsupported filters in balance
> arguments") adds a jump to exit label out_bargs in case the argument
> check fails. At this point in addition to the bargs memory, the
> memory for struct btrfs_balance_control has already been allocated.
> Ownership of bctl is passed to btrfs_balance() in the good case,
> thus the memory is not freed due to the introduced jump. Make sure
> that the memory gets freed in any case as necessary. Detected by
> Coverity CID 1328378.
> 
> Signed-off-by: Christian Engelmayer 
Reviewed-by: David Sterba 

Thanks for fixing it.
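
The class of leak described in the commit message can be sketched in user space. This is not the actual kernel change. All names below are hypothetical, `malloc()` stands in for the kernel allocators, and `run_and_consume()` stands in for a callee that takes ownership of the control structure the way btrfs_balance() is described as doing.

```c
#include <stdlib.h>

/* Allocation counter used only to show that nothing leaks. */
int live_allocs;

static void *tracked_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Stand-in for a callee that takes ownership of ctl and frees it. */
static int run_and_consume(void *ctl)
{
	tracked_free(ctl);
	return 0;
}

int balance_ioctl_sketch(int args_valid)
{
	void *bargs, *bctl;
	int ret;

	bargs = tracked_alloc(64);
	if (!bargs)
		return -1;

	bctl = tracked_alloc(128);
	if (!bctl) {
		ret = -1;
		goto out_bargs;
	}

	if (!args_valid) {
		ret = -2;
		goto out_bctl;	/* jumping to out_bargs here would leak bctl */
	}

	ret = run_and_consume(bctl);	/* ownership handed over */
	goto out_bargs;

out_bctl:
	tracked_free(bctl);
out_bargs:
	tracked_free(bargs);
	return ret;
}
```

The point of the separate out_bctl label is that every exit taken after the second allocation, and before ownership is handed over, passes through the matching free.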

Re: [PATCH 02/10] btrfs-progs: save error number correctly in check_chunks_and_extents

2015-10-21 Thread David Sterba

On Tue, Oct 20, 2015 at 06:28:00PM +0800, Eryu Guan wrote:
> On Mon, Oct 19, 2015 at 03:41:04PM +0200, David Sterba wrote:
> > On Mon, Oct 19, 2015 at 07:37:52PM +0800, Eryu Guan wrote:
> > > Coverity reports assigning value from "err" to "ret", but that stored
> > > value is overwritten by check_extent_refs() before it can be used.
> > 
> > If you fix a coverity issue, please add a tag and the id, like
> > 
> > Resolves-coverity-id: 1234
> 
> I was looking hard for CID but without luck.. I will ask around and see
> where I can find the CIDs.

Ok never mind then, I'll pick the patch as-is.

Re: [RFC PATCH 3/3] btrfs-progs: kernel based default features for btrfs-convert

2015-10-21 Thread Qu Wenruo




Anand Jain wrote on 2015/10/21 16:45 +0800:

btrfs-convert convert FS with latest default features enabled, and
if the kernel is down-rev and does not support a latest feature then
mount fails, as expected.

This patch disables default features based on the running kernel.

Signed-off-by: Anand Jain 
---
  btrfs-convert.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index cb92020..edec09e 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -2890,7 +2890,10 @@ int main(int argc, char *argv[])
int progress = 1;
char *file;
char fslabel[BTRFS_LABEL_SIZE];
-   u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+   u64 features;
+
+   features = btrfs_features_allowed_by_kernel();
+   features &= BTRFS_MKFS_DEFAULT_FEATURES;

Same with previous patch.

Thanks,
Qu


while(1) {
enum { GETOPT_VAL_NO_PROGRESS = 256 };



Re: [RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Qu Wenruo




Anand Jain wrote on 2015/10/21 16:45 +0800:

mkfs from latest btrfs-progs will enable latest default features,
and if the kernel is down-rev and does not support a latest default
feature then mount fails, as expected.

This patch disables default features based on the running kernel.

Signed-off-by: Anand Jain 
---
  mkfs.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mkfs.c b/mkfs.c
index a5802f7..2b9d734 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1357,10 +1357,13 @@ int main(int ac, char **av)
int dev_cnt = 0;
int saved_optind;
char fs_uuid[BTRFS_UUID_UNPARSED_SIZE] = { 0 };
-   u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+   u64 features;
struct mkfs_allocation allocation = { 0 };
struct btrfs_mkfs_config mkfs_cfg;

+   features = btrfs_features_allowed_by_kernel();
+   features &= BTRFS_MKFS_DEFAULT_FEATURES;
+


Despite the problem with btrfs_features_allowed_by_kernel() that I
mentioned in the previous mail, the behavior is a little too aggressive
for me.

A user with an old kernel would never be able to create a filesystem
with newer features.

Maybe the user is just making a btrfs filesystem for a newer kernel?

IMHO, it's better to output a warning rather than silently changing the
features.
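
A minimal sketch of the warn-only alternative, assuming the caller already has the two feature masks. `warn_unsupported_defaults()` is a hypothetical name, and the message wording is illustrative only.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: report default features that the running kernel
 * is believed not to support, without modifying the defaults.
 * Returns the mask of unsupported bits (0 if there are none).
 * Passing out == NULL suppresses the message.
 */
uint64_t warn_unsupported_defaults(uint64_t defaults, uint64_t kernel_allowed,
				   FILE *out)
{
	uint64_t unsupported = defaults & ~kernel_allowed;

	if (unsupported && out)
		fprintf(out,
			"WARNING: default features 0x%llx may not be mountable on the running kernel\n",
			(unsigned long long)unsupported);

	return unsupported;
}
```

With this shape the defaults stay as they are, so a filesystem meant for a newer kernel can still be created, and the user is told why a mount on the current kernel might fail.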


Thanks,
Qu


while(1) {
int c;
static const struct option long_options[] = {



Re: [RFC PATCH 1/3] btrfs-progs: introduce framework to check kernel supported features

2015-10-21 Thread Qu Wenruo


Hi Anand,

This feature seems quite good; comments are inline below.

Anand Jain wrote on 2015/10/21 16:45 +0800:

In the newer kernel, supported kernel features can be known from
   /sys/fs/btrfs/features
however this interface was introduced only after 3.14, and most the
incompatible FS features were introduce before 3.14.

This patch proposes to maintain kernel version against the feature list,
and so that will be the minimum kernel version needed to use the feature.

Further, for features supported later than 3.14 this list can still be
updated, so it serves as a repository which can be displayed for easy
reference.

Signed-off-by: Anand Jain 
---
  utils.c | 69 -
  utils.h |  1 +
  2 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/utils.c b/utils.c
index b754686..cd5a626 100644
--- a/utils.c
+++ b/utils.c
@@ -32,10 +32,12 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
  #include 
+#include 
  #include 

  #include "kerncompat.h"
@@ -567,21 +569,28 @@ out:
return ret;
  }

+/*
+ * min_ker_ver: update with minimum kernel version at which the feature
+ * was integrated into the mainline. For the transit period, that is
+ * feature not yet in mainline but in mailing list and for testing,
+ * please use "0.0" to indicate the same.
+ */
  static const struct btrfs_fs_feature {
const char *name;
u64 flag;
const char *desc;
+   const char *min_ker_ver;
  } mkfs_features[] = {
{ "mixed-bg", BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS,
-   "mixed data and metadata block groups" },
+   "mixed data and metadata block groups", "2.7.31"},
{ "extref", BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF,
-   "increased hardlink limit per file to 65536" },
+   "increased hardlink limit per file to 65536", "3.7"},
{ "raid56", BTRFS_FEATURE_INCOMPAT_RAID56,
-   "raid56 extended format" },
+   "raid56 extended format", "3.9"},
{ "skinny-metadata", BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA,
-   "reduced-size metadata extent refs" },
+   "reduced-size metadata extent refs", "3.10"},
{ "no-holes", BTRFS_FEATURE_INCOMPAT_NO_HOLES,
-   "no explicit hole extents for files" },
+   "no explicit hole extents for files", "3.14"},
/* Keep this one last */
{ "list-all", BTRFS_FEATURE_LIST_ALL, NULL }
  };
@@ -3077,3 +3086,53 @@ unsigned int get_unit_mode_from_arg(int *argc, char 
*argv[], int df_mode)

return unit_mode;
  }
+
+static int version_to_code(char *v)
+{
+   int i = 0;
+   char *b[3] = {NULL};
+   char *save_b = NULL;
+
+   for (b[i] = strtok_r(v, ".", &save_b);
+   b[i] != NULL;
+   b[i] = strtok_r(NULL, ".", &save_b))
+   i++;
+
+   if (b[2] == NULL)
+   return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), 0);
+   else
+   return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), atoi(b[2]));
+
+}
+
+static int get_kernel_code()
+{
+   int ret;
+   struct utsname utsbuf;
+   char *version;
+
+   ret = uname(&utsbuf);
+   if (ret)
+   return -ret;
+
+   version = strtok(utsbuf.release, "-");
+
+   return version_to_code(version);
+}


The only problem is that the kernel version is never reliable.
If someone wants, the uname output may contain no numeric value at all.

IIRC, I suggested maintaining a similar feature matrix in fstests, but Dave
pointed out the above problem.


So I'm not a fan of reading the kernel version and generating the supported
features from it.


IMHO, just using /sys/fs/btrfs/features is good enough.
If it does not exist, just ignore it; the user is responsible in that
case.
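
A minimal sketch of that approach, assuming each supported feature shows up as an entry under the features path. `sysfs_feature_state()` is a hypothetical helper, and the entry name in the usage note below is an assumption rather than something taken from this thread.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns 1 if <features_dir>/<name> exists,
 * 0 if the directory exists but the entry does not, and -1 if the
 * directory itself is missing or the path does not fit the buffer
 * (support unknown, e.g. a kernel without the sysfs interface).
 */
int sysfs_feature_state(const char *features_dir, const char *name)
{
	char path[4096];
	int len;

	if (access(features_dir, F_OK) != 0)
		return -1;

	len = snprintf(path, sizeof(path), "%s/%s", features_dir, name);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;

	return access(path, F_OK) == 0 ? 1 : 0;
}
```

A caller might use `sysfs_feature_state("/sys/fs/btrfs/features", "no_holes")` and treat -1 as "do nothing", which matches the suggestion to ignore the case where the interface is absent.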

Thanks,
Qu


+
+u64 btrfs_features_allowed_by_kernel(void)
+{
+   int i;
+   int local_kernel_code = get_kernel_code();
+   u64 features = 0;
+
+   for (i = 0; i < ARRAY_SIZE(mkfs_features) - 1; i++) {
+   char *ver = strdup(mkfs_features[i].min_ker_ver);
+
+   if (local_kernel_code >= version_to_code(ver))
+   features |= mkfs_features[i].flag;
+
+   free(ver);
+   }
+   return (features);
+}
diff --git a/utils.h b/utils.h
index 192f3d1..9044643 100644
--- a/utils.h
+++ b/utils.h
@@ -104,6 +104,7 @@ void btrfs_list_all_fs_features(u64 mask_disallowed);
  char* btrfs_parse_fs_features(char *namelist, u64 *flags);
  void btrfs_process_fs_features(u64 flags);
  void btrfs_parse_features_to_string(char *buf, u64 flags);
+u64 btrfs_features_allowed_by_kernel(void);

  struct btrfs_mkfs_config {
char *label;



Re: [RFC PATCH V2] btrfs/ioctl.c: extent_same - Use inode as src which, close to disk beginning

2015-10-21 Thread Timofey Titovets

This patch has a LOT of errors; sorry, please ignore it.

2015-10-21 4:11 GMT+03:00 Timofey Titovets :
> It's just a proof of concept, and i hope to see feedback/ideas/review about
> it.
> ---
> While deduplication,
> Btrfs produce extent and file fragmentation
> But it's can be optimized by compute - which inode data placed a closest to
> beginning of hdd
> It's allow to:
> 1. Performance boost on hdd (beginning of disk faster then end)
> 2. Make sparse only on tail of fs, what can give boost later for
> balancing and resizing operations
>
> New function:
> static u64 btrfs_avg_disko(struct inode *inode,
> const u64 off, const u64 olen_aligned);
>
> It normalize offsets with data lengths, by represent it like offsets of
> blocks
> It return average data offset of all "pagesized" blocks in given range for
> inode
> Function cloned from btrfs_clone()
>
> Changes from V1:
> Added new function which compute "normal" offset
>
> Signed-off-by: Timofey Titovets 
> ---
>  fs/btrfs/ioctl.c | 147
> ++-
>  1 file changed, 145 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 3e3e613..17e5313 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -86,6 +86,9 @@ struct btrfs_ioctl_received_subvol_args_32 {
>  #endif
>
>
> +static u64 btrfs_avg_disko(struct inode *inode,
> +const u64 off, const u64 olen_aligned);
> +
>  static int btrfs_clone(struct inode *src, struct inode *inode,
> u64 off, u64 olen, u64 olen_aligned, u64 destoff,
> int no_time_update);
> @@ -3074,8 +3077,20 @@ static int btrfs_extent_same(struct inode *src, u64
> loff, u64 olen,
>
>  /* pass original length for comparison so we stay within i_size */
>  ret = btrfs_cmp_data(src, loff, dst, dst_loff, olen, &cmp);
> -if (ret == 0)
> -ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> +if (ret == 0) {
> +/* prefer inode with lowest offset as source for clone*/
> +u64 src_weight;
> +u64 dst_weight;
> +src_weight = btrfs_avg_disko(src, off, olen);
> +dst_weight = btrfs_avg_disko(dest, dst_loff, olen);
> +/* if one of weight == 0 -> fallback */
> +if (dest_weight == 0)
> +src_weight = 0;
> +if (src_weight > dest_weight)
> +ret = btrfs_clone(dst, src, dst_loff, olen, len, loff, 1);
> +else
> +ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> +}
>
>  if (same_inode)
>  unlock_extent(&BTRFS_I(src)->io_tree, same_lock_start,
> @@ -3329,6 +3344,134 @@ static void clone_update_extent_map(struct inode
> *inode,
>  }
>
>  /**
> + * btrfs_avg_disko() - return avg data offset weight for inode
> + *
> + * @inode: Inode
> + * @off: Offset for computing
> + * @olen_aligned: Block-aligned len of data
> + *
> + * Computing avg address place of data, allow to heuristically
> + * determine where on the disk placed most fragment of data
> + */
> +static u64 btrfs_avg_disko(struct inode *inode,
> +const u64 off, const u64 olen_aligned)
> +{
> +struct btrfs_root *root = BTRFS_I(inode)->root;
> +struct btrfs_path *path = NULL;
> +struct extent_buffer *leaf;
> +char *buf = NULL;
> +struct btrfs_key key;
> +u32 nritems;
> +int slot;
> +int no_quota;
> +double sum = 0;
> +u64 ret = 0;
> +u64 counter = 0;
> +
> +buf = vmalloc(root->nodesize);
> +if (!buf)
> +return ret;
> +
> +path = btrfs_alloc_path();
> +if (!path) {
> +vfree(buf);
> +return ret;
> +}
> +
> +path->reada = 2;
> +/* clone data */
> +key.objectid = btrfs_ino(inode);
> +key.type = BTRFS_EXTENT_DATA_KEY;
> +key.offset = off;
> +
> +while (1) {
> +u64 next_key_min_offset = key.offset + 1;
> +
> +/*
> + * note the key will change type as we walk through the
> + * tree.
> + */
> +path->leave_spinning = 1;
> +ret = btrfs_search_slot(NULL, BTRFS_I(inode)->root, &key, path,
> 0, 0);
> +if (ret < 0)
> +goto out;
> +/*
> + * First search, if no extent item that starts at offset off was
> + * found but the previous item is an extent item, it's possible
> + * it might overlap our target range, therefore process it.
> + */
> +if (key.offset == off && ret > 0 && path->slots[0] > 0) {
> +btrfs_item_key_to_cpu(path->nodes[0], &key,
> +  path->slots[0] - 1);
> +if (key.type == BTRFS_EXTENT_DATA_KEY)
> +path->slots[0]--;
> +}
> +
> +nritems = btrfs_header_nritems(path->nodes[0]);
> +process_slot:
> +no_quota = 1;
> +if (path->slots[0] >= nritems) {
> +ret = btrfs_next_leaf(BTRFS_I(inode)->root, path);
> +if (ret <
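
The averaging idea described in the quoted commit message (treat each extent as a run of fixed-size blocks and average the on-disk position of those blocks) can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: `struct extent_span` and `avg_block_offset()` are hypothetical names, extents are assumed to be block-aligned, and overflow of the running sum is not handled. It uses integer arithmetic only, since floating point is generally not available in kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one extent: where it starts on disk and how long it is. */
struct extent_span {
	uint64_t disk_start;	/* byte offset on disk, assumed block-aligned */
	uint64_t len;		/* length in bytes, assumed a multiple of block_size */
};

/*
 * Return the average on-disk byte offset of all blocks covered by the
 * given extents, rounded down to a block boundary. Returns 0 when there
 * is nothing to average.
 */
uint64_t avg_block_offset(const struct extent_span *ext, size_t n,
			  uint64_t block_size)
{
	uint64_t sum = 0;	/* sum of block indexes, not of byte offsets */
	uint64_t blocks = 0;	/* total number of blocks seen */
	size_t i;

	if (block_size == 0)
		return 0;

	for (i = 0; i < n; i++) {
		uint64_t first = ext[i].disk_start / block_size;
		uint64_t count = ext[i].len / block_size;

		if (count == 0)
			continue;
		/* first + (first + 1) + ... + (first + count - 1) */
		sum += count * first + count * (count - 1) / 2;
		blocks += count;
	}

	if (blocks == 0)
		return 0;

	return (sum / blocks) * block_size;
}
```

A caller comparing two inodes would compute this value for each range and prefer the one with the lower result as the clone source, keeping in mind that 0 doubles as the "nothing found" value here.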

[RFC PATCH 1/3] btrfs-progs: introduce framework to check kernel supported features

2015-10-21 Thread Anand Jain

In newer kernels, the supported kernel features can be read from
  /sys/fs/btrfs/features
however, this interface was introduced only after 3.14, and most of the
incompatible FS features were introduced before 3.14.

This patch proposes recording, for each entry in the feature list, the
minimum kernel version needed to use that feature.

Further, for features supported later than 3.14 this list can still be
updated, so it serves as a repository which can be displayed for easy
reference.

Signed-off-by: Anand Jain 
---
 utils.c | 69 -
 utils.h |  1 +
 2 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/utils.c b/utils.c
index b754686..cd5a626 100644
--- a/utils.c
+++ b/utils.c
@@ -32,10 +32,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "kerncompat.h"
@@ -567,21 +569,28 @@ out:
return ret;
 }
 
+/*
+ * min_ker_ver: update with minimum kernel version at which the feature
+ * was integrated into the mainline. For the transit period, that is
+ * feature not yet in mainline but in mailing list and for testing,
+ * please use "0.0" to indicate the same.
+ */
 static const struct btrfs_fs_feature {
const char *name;
u64 flag;
const char *desc;
+   const char *min_ker_ver;
 } mkfs_features[] = {
{ "mixed-bg", BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS,
-   "mixed data and metadata block groups" },
+   "mixed data and metadata block groups", "2.7.31"},
{ "extref", BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF,
-   "increased hardlink limit per file to 65536" },
+   "increased hardlink limit per file to 65536", "3.7"},
{ "raid56", BTRFS_FEATURE_INCOMPAT_RAID56,
-   "raid56 extended format" },
+   "raid56 extended format", "3.9"},
{ "skinny-metadata", BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA,
-   "reduced-size metadata extent refs" },
+   "reduced-size metadata extent refs", "3.10"},
{ "no-holes", BTRFS_FEATURE_INCOMPAT_NO_HOLES,
-   "no explicit hole extents for files" },
+   "no explicit hole extents for files", "3.14"},
/* Keep this one last */
{ "list-all", BTRFS_FEATURE_LIST_ALL, NULL }
 };
@@ -3077,3 +3086,53 @@ unsigned int get_unit_mode_from_arg(int *argc, char 
*argv[], int df_mode)
 
return unit_mode;
 }
+
+static int version_to_code(char *v)
+{
+   int i = 0;
+   char *b[3] = {NULL};
+   char *save_b = NULL;
+
+   for (b[i] = strtok_r(v, ".", &save_b);
+   b[i] != NULL;
+   b[i] = strtok_r(NULL, ".", &save_b))
+   i++;
+
+   if (b[2] == NULL)
+   return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), 0);
+   else
+   return KERNEL_VERSION(atoi(b[0]), atoi(b[1]), atoi(b[2]));
+
+}
+
+static int get_kernel_code()
+{
+   int ret;
+   struct utsname utsbuf;
+   char *version;
+
+   ret = uname(&utsbuf);
+   if (ret)
+   return -ret;
+
+   version = strtok(utsbuf.release, "-");
+
+   return version_to_code(version);
+}
+
+u64 btrfs_features_allowed_by_kernel(void)
+{
+   int i;
+   int local_kernel_code = get_kernel_code();
+   u64 features = 0;
+
+   for (i = 0; i < ARRAY_SIZE(mkfs_features) - 1; i++) {
+   char *ver = strdup(mkfs_features[i].min_ker_ver);
+
+   if (local_kernel_code >= version_to_code(ver))
+   features |= mkfs_features[i].flag;
+
+   free(ver);
+   }
+   return (features);
+}
diff --git a/utils.h b/utils.h
index 192f3d1..9044643 100644
--- a/utils.h
+++ b/utils.h
@@ -104,6 +104,7 @@ void btrfs_list_all_fs_features(u64 mask_disallowed);
 char* btrfs_parse_fs_features(char *namelist, u64 *flags);
 void btrfs_process_fs_features(u64 flags);
 void btrfs_parse_features_to_string(char *buf, u64 flags);
+u64 btrfs_features_allowed_by_kernel(void);
 
 struct btrfs_mkfs_config {
char *label;
-- 
2.4.1
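
The version comparison this patch relies on can be sketched as a small standalone program fragment. This is a hedged illustration, not the patch's code: it parses with sscanf() instead of strtok_r(), rejects strings with fewer than two components instead of dereferencing a missing token, and defines its own `VERSION_CODE()` macro with the same packing as the kernel's KERNEL_VERSION(). Treating "0.0" as "never supported" is an assumption about the intent of the comment in the patch; all function names below are hypothetical.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Same packing as the kernel's KERNEL_VERSION() macro. */
#define VERSION_CODE(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Convert "major.minor[.patch][-suffix]" into a comparable integer.
 * Returns -1 if fewer than two numeric components are present.
 */
int version_to_code(const char *v)
{
	int major = 0, minor = 0, patch = 0;

	/* sscanf stops at the first character that does not match,
	 * so "4.2.0-1-amd64" is parsed as 4.2.0. */
	if (!v || sscanf(v, "%d.%d.%d", &major, &minor, &patch) < 2)
		return -1;

	return VERSION_CODE(major, minor, patch);
}

/* Version code of the running kernel, or -1 on failure. */
int running_kernel_code(void)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;

	return version_to_code(u.release);
}

/*
 * Non-zero if a kernel with version code kernel_code is new enough for
 * a feature whose minimum version string is min_ver. "0.0" is treated
 * as "not in mainline yet", i.e. never supported (an assumption).
 */
int kernel_supports(int kernel_code, const char *min_ver)
{
	int need = version_to_code(min_ver);

	if (kernel_code < 0 || need <= 0)
		return 0;

	return kernel_code >= need;
}
```

For example, kernel_supports(running_kernel_code(), "3.10") would report whether the running kernel is at least 3.10. Because each component is packed into its own bit field, "3.9" compares lower than "3.10", which a plain string comparison would get wrong.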


[RFC PATCH 3/3] btrfs-progs: kernel based default features for btrfs-convert

2015-10-21 Thread Anand Jain

btrfs-convert converts the FS with the latest default features enabled;
if the running kernel is older and does not support one of those
features, the mount fails, as expected.

This patch disables the default features that the running kernel does not
support.

Signed-off-by: Anand Jain 
---
 btrfs-convert.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index cb92020..edec09e 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -2890,7 +2890,10 @@ int main(int argc, char *argv[])
int progress = 1;
char *file;
char fslabel[BTRFS_LABEL_SIZE];
-   u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+   u64 features;
+
+   features = btrfs_features_allowed_by_kernel();
+   features &= BTRFS_MKFS_DEFAULT_FEATURES;
 
while(1) {
enum { GETOPT_VAL_NO_PROGRESS = 256 };
-- 
2.4.1


[RFC PATCH 2/3] btrfs-progs: kernel based default features for mkfs

2015-10-21 Thread Anand Jain

mkfs from the latest btrfs-progs enables the latest default features;
if the running kernel is older and does not support one of those default
features, the mount fails, as expected.

This patch disables the default features that the running kernel does not
support.

Signed-off-by: Anand Jain 
---
 mkfs.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mkfs.c b/mkfs.c
index a5802f7..2b9d734 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1357,10 +1357,13 @@ int main(int ac, char **av)
int dev_cnt = 0;
int saved_optind;
char fs_uuid[BTRFS_UUID_UNPARSED_SIZE] = { 0 };
-   u64 features = BTRFS_MKFS_DEFAULT_FEATURES;
+   u64 features;
struct mkfs_allocation allocation = { 0 };
struct btrfs_mkfs_config mkfs_cfg;
 
+   features = btrfs_features_allowed_by_kernel();
+   features &= BTRFS_MKFS_DEFAULT_FEATURES;
+
while(1) {
int c;
static const struct option long_options[] = {
-- 
2.4.1
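
The effect of intersecting the kernel-allowed set with the defaults can be shown with a small sketch. It is illustrative only: the `FEAT_*` bit values are made up and are not the real BTRFS_FEATURE_INCOMPAT_* constants, the table holds pre-packed version codes rather than strings, and `allowed_by_kernel()` and `effective_defaults()` are hypothetical names. The minimum versions are the ones listed in patch 1/3.

```c
#include <stddef.h>
#include <stdint.h>

/* Same packing as the kernel's KERNEL_VERSION() macro. */
#define VER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Made-up flag bits for illustration only. */
#define FEAT_EXTREF	(1ULL << 0)
#define FEAT_SKINNY	(1ULL << 1)
#define FEAT_NO_HOLES	(1ULL << 2)

struct feature {
	uint64_t flag;
	int min_code;	/* minimum kernel version code for this feature */
};

static const struct feature feature_table[] = {
	{ FEAT_EXTREF,   VER(3, 7, 0) },
	{ FEAT_SKINNY,   VER(3, 10, 0) },
	{ FEAT_NO_HOLES, VER(3, 14, 0) },
};

/* Collect every feature whose minimum version the kernel satisfies. */
uint64_t allowed_by_kernel(int kernel_code)
{
	uint64_t allowed = 0;
	size_t i;

	for (i = 0; i < sizeof(feature_table) / sizeof(feature_table[0]); i++) {
		if (kernel_code >= feature_table[i].min_code)
			allowed |= feature_table[i].flag;
	}

	return allowed;
}

/* Keep only the defaults that the kernel can actually mount. */
uint64_t effective_defaults(int kernel_code, uint64_t defaults)
{
	return allowed_by_kernel(kernel_code) & defaults;
}
```

With defaults of FEAT_EXTREF | FEAT_SKINNY, a 3.8 kernel keeps only FEAT_EXTREF, while a 4.2 kernel keeps both. Features outside the default set stay off even on a new kernel, because the mask can only clear bits.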


[PATCH] Btrfs: add a check of whether fs_info->fs_root is NULL in btrfs_async_reclaim_metadata_space()

2015-10-21 Thread Tsutomu Itoh

A kernel panic occurred due to a NULL pointer dereference in
can_overcommit(), because btrfs_async_reclaim_metadata_space() passed a
NULL pointer to btrfs_calc_reclaim_metadata_size().


[ 3756.152833] BUG: unable to handle kernel NULL pointer dereference at 
01f0
[ 3756.152882] IP: [] can_overcommit+0x21/0xf0 [btrfs]
[ 3756.152936] PGD 0
[ 3756.152949] Oops:  [#1] SMP
[ 3756.152969] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 
xt_conntrack ebtable_filter ebtable_broute bridge stp llc ebtable_nat 
ebtables ip6table_mangle ip6table_raw ip6table_security ip6table_nat 
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_filter ip6_tables 
iptable_mangle iptable_raw iptable_security iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack coretemp kvm_intel kvm crc32
_pclmul iTCO_wdt iTCO_vendor_support microcode ipmi_si lpc_ich mfd_core pcspkr 
acpi_power_meter ipmi_msghandler i2c_i801 i7core_edac shpchp edac_core 
nfsd acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel btrfs xor 
raid6_pq usb_storage mgag200 drm_kms_helper syscopyarea sysfillrect 
sysimgblt fb_sys_fops ttm drm igb ptp ata_generic pps_core pata_acpi 
crc32c_intel
[ 3756.153397]  dca megaraid_sas i2c_algo_bit ata_piix i2c_core
[ 3756.153433] CPU: 3 PID: 3004 Comm: kworker/u25:4 Tainted: G  I 
4.3.0-rc6 #1
[ 3756.153469] Hardware name: FUJITSU-SV   PRIMERGY RX300 
S6 /D2619, BIOS 6.00 Rev. 1.09.2619.N1   12/13/2010
[ 3756.153537] Workqueue: events_unbound btrfs_async_reclaim_metadata_space 
[btrfs]
[ 3756.153571] task: 88023581a400 ti: 880234648000 task.ti: 
880234648000
[ 3756.153604] RIP: 0010:[]  [] 
can_overcommit+0x21/0xf0 [btrfs]
[ 3756.153655] RSP: 0018:88023464bda8  EFLAGS: 00010282
[ 3756.153679] RAX: 0100 RBX: 880431f68c00 RCX: 0002
[ 3756.153711] RDX: 00c0 RSI:  RDI: 
[ 3756.153742] RBP: 88023464bde0 R08: 0101 R09: 000c
[ 3756.153773] R10: 81d10060 R11: 81d10050 R12: 880431f68c00
[ 3756.153804] R13:  R14: 880035f67070 R15: 00c0
[ 3756.153836] FS:  () GS:880237cc() 
knlGS:
[ 3756.153871] CS:  0010 DS:  ES:  CR0: 8005003b
[ 3756.153897] CR2: 01f0 CR3: 01c08000 CR4: 06e0
[ 3756.153929] Stack:
[ 3756.153940]  8802 880237cd2940 880431f68c00 

[ 3756.153979]  00c0 880035f67070  
88023464be20
[ 3756.154016]  a01e5404 880431f68c80 880234482240 
8802378a1800
[ 3756.154054] Call Trace:
[ 3756.154081]  [] 
btrfs_async_reclaim_metadata_space+0xb4/0x210 [btrfs]
[ 3756.154119]  [] process_one_work+0x19e/0x3d0
[ 3756.154146]  [] worker_thread+0x4e/0x450
[ 3756.154174]  [] ? __schedule+0x2b9/0x930
[ 3756.154199]  [] ? process_one_work+0x3d0/0x3d0
[ 3756.154227]  [] ? process_one_work+0x3d0/0x3d0
[ 3756.154255]  [] kthread+0xc9/0xe0
[ 3756.154279]  [] ? kthread_worker_fn+0x160/0x160
[ 3756.154307]  [] ret_from_fork+0x3f/0x70
[ 3756.154333]  [] ? kthread_worker_fn+0x160/0x160
[ 3756.154361] Code: a5 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 
41 57 41 56 41 55 41 54 49 89 f4 53 31 f6 49 89 fd 49 89 d7 48 83 ec 10 
<4c> 8b b7 f0 01 00 00 89 4d cc 49 3b 7e 30 40 0f 95 c6 48 8d 74
[ 3756.156802] RIP  [] can_overcommit+0x21/0xf0 [btrfs]
[ 3756.157995]  RSP 
[ 3756.159162] CR2: 01f0


fs_info->fs_root is dereferenced in btrfs_async_reclaim_metadata_space()
when mount kicks the kworker (btrfs_async_reclaim_metadata_space).

But at that time, fs_info->fs_root has not been initialized yet,
so a NULL pointer is passed to btrfs_calc_reclaim_metadata_size().
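
The kind of guard named in the subject line can be modelled in plain userspace C. This is a simplified sketch and not the actual patch, whose diff is not shown in this message: the structures and the functions `calc_reclaim_size()` and `async_reclaim_work()` are hypothetical stand-ins for the kernel ones, and returning 0 for "nothing to do yet" is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures. */
struct root {
	uint64_t reclaim_hint;
};

struct fs_info {
	struct root *fs_root;	/* NULL until mount has set it up */
};

/* Stand-in for a size calculation that dereferences root unconditionally. */
static uint64_t calc_reclaim_size(const struct root *root)
{
	return root->reclaim_hint;
}

/*
 * Worker entry point: bail out early if the root has not been
 * initialized yet instead of passing NULL further down.
 */
uint64_t async_reclaim_work(const struct fs_info *info)
{
	if (!info || !info->fs_root)
		return 0;	/* nothing to reclaim yet */

	return calc_reclaim_size(info->fs_root);
}
```

The point of the early return is that the worker can be scheduled before mount has finished setting up fs_root, so the check has to sit in the worker itself rather than in its callers.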


PID: 3045   TASK: 8800bb06b000  CPU: 2   COMMAND: "mount"
[exception RIP: queued_spin_lock_slowpath+350]
RIP: 810be2de  RSP: 8800b9fdb738  RFLAGS: 0202
RAX: 0101  RBX: 880431f68c00  RCX: 0001
RDX: 0101  RSI: 0001  RDI: 880431f68c00
RBP: 8800b9fdb738   R8: 0101   R9: 
R10: 4000  R11: 00018e58  R12: 0001
R13: 8800b9fdb7c0  R14: 8800bb06b000  R15: 0001
CS: 0010  SS: 0018
 #0 [8800b9fdb740] _raw_spin_lock at 81694ff0
 #1 [8800b9fdb750] reserve_metadata_bytes at a01e55cc [btrfs]
 #2 [8800b9fdb800] btrfs_block_rsv_add at a01e5a93 [btrfs]
 #3 [8800b9fdb828] btrfs_truncate_inode_items at a0202779 [btrfs]
 #4 [8800b9fdb920] btrfs_evict_inode at a02040ec [btrfs]
 #5 [8800b9fdb990] evict at 811ed6ea
 #6 [880
