Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Duncan
Simon King posted on Sat, 31 Oct 2015 18:31:45 +0100 as excerpted:

> I know that "df" is different from "btrfs fi df". However, I see that df
> shows significantly more free space after balancing. Also, when my
> computer became unusable, the problem disappeared by balancing and
> defragmentation (deleting the old snapshots was not enough).
> 
> Unfortunately, df also shows significantly less free space after
> UNSUCCESSFUL balancing.

On a btrfs, df is hardly relevant at all, except to the extent that if
you're trying to copy a 100 MB file and df says there's only 50 MB of
room, there are obviously going to be problems.

Btrfs actually has two-stage space allocation.

At the first stage, entirely unallocated space is taken in largish
chunks, allocated separately for data and metadata: nominally 1 GiB per
chunk for data (tho larger or smaller is possible, depending on the size
of the filesystem and how close to fully chunk-allocated it is), and
256 MiB for metadata -- but metadata chunks are normally allocated and
used in dup mode, two at a time, on a single-device btrfs, so 512 MiB at
a time.

At the second stage, space is used from already allocated chunks as needed
for files (data) or metadata.
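
You can watch both stages with the standard tools (a sketch; substitute
your own mountpoint for /mnt):

  btrfs fi show /mnt   # "used" per device = space claimed by chunks (stage one)
  btrfs fi df /mnt     # per-type "total" = chunk-allocated, "used" = filled (stage two)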

And particularly on older kernels, this is where the problem arises.
Over time, as files are created and deleted, all unallocated space tends
to be allocated as data chunks, such that when the existing metadata
chunks get full, there's no unallocated space left from which to allocate
more metadata chunks -- it's all tied up in data chunks, many of which
might be mostly or entirely empty, as the files they once contained have
since been deleted or moved elsewhere (due to btrfs copy-on-write).

On newer kernels, entirely empty chunks are automatically deleted,
significantly easing the problem, tho it can still happen if there's a lot
of mostly but not entirely empty data chunks.

Which is why df isn't always particularly reliable on btrfs, because it
doesn't know about all this chunk preallocation stuff, and will (again,
at least on older kernels, AFAIK newer ones have improved this to some
extent but it's still not ideal) happily report all that empty data-chunk
space as available for files, not knowing it's out of space to store
metadata.  Often, if you were to have one big file take all the space df
reports, that would work, because tracking a single file uses only a
relatively small bit of metadata space.  But try to use only a tenth of
the space with a thousand much smaller files, and the remaining metadata
space may well be exhausted, allowing no more file creation, even tho df
is still saying there's lots of room left, because it's all in data
chunks!
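
To see the mismatch directly, compare the generic view with the
btrfs-aware one (a sketch; /mnt is a placeholder, and btrfs fi usage
needs reasonably recent btrfs-progs):

  df -h /mnt           # generic view; knows nothing about chunk allocation
  btrfs fi usage /mnt  # reports unallocated space and per-type allocation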

Which is where balance comes in: in rewriting the chunks it consolidates
them, eliminating chunks when, say, three chunks that are each 2/3 full
combine into two full chunks, returning the freed space to unallocated,
so it can once again be allocated for either data or metadata as needed.

As for getting out of the tight spot you're in ATM, with all would-be
unallocated space apparently (you didn't post btrfs fi show and df output,
but this is what the symptoms suggest) gone, tied up in mostly empty data
chunks, without even enough space to easily balance those data chunks to
free up more space by consolidating them...

There's some discussion on the btrfs wiki, in the free-space questions on
the faq, and similarly in the problem-faq (watch the link wrap):

FAQ:

https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_I_ran_out_of_disk_space.21

Also see FAQ sections 4.6-4.9, discussing free space, and 4.12,
discussing balance.

Problem-FAQ:

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_get_.22No_space_left_on_device.22_errors.2C_but_df_says_I.27ve_got_lots_of_space


Basically, if filtered balance won't let you do it, you can try deleting
large files -- assuming they're not also referenced by still-existing
snapshots.  That might empty a data chunk or two, allowing a balance
-dusage=0 to eliminate them, giving you enough room to try a higher
dusage number, perhaps 5% or 10%, then 20 and 50.  (Above 50% the time
taken goes up while the possible payback goes down, and it shouldn't be
necessary until the filesystem gets really close to actually full, tho on
my ssd, speeds are fast enough I'll sometimes try up to 70% or so.)
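
As a sketch of that escalation (replace / with your mountpoint, and stop
once btrfs fi show reports device used well below size again):

  btrfs balance start -dusage=0 /    # remove completely empty data chunks
  btrfs balance start -dusage=5 /
  btrfs balance start -dusage=10 /
  btrfs balance start -dusage=20 /
  btrfs balance start -dusage=50 /   # rarely worth it unless nearly full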

If it's too tight for that, or everything's referenced by snapshots you
don't want to or can't delete, you can try adding (btrfs device add) a
device temporarily.  The device should be several gigs in size, minimum;
even a few-GiB USB thumbdrive or the like can work, tho access can be
slow.  That should give you enough additional space to do the balance
-dusage= thing, which, assuming it does consolidate nearly empty data
chunks, freeing the extra space they took, should free up enough newly
unallocated space on the original device to do a btrfs device delete of
the temporarily added device, returning everything that was on it to the
original device.
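
Sketched out, with /dev/sdX standing in for whatever the temporary
device appears as:

  btrfs device add /dev/sdX /        # temporary extra space
  btrfs balance start -dusage=20 /   # now there's room to consolidate chunks
  btrfs device delete /dev/sdX /     # migrates everything back before detaching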

Re: Crash during mount -o degraded, kernel BUG at fs/btrfs/extent_io.c:2044

2015-10-31 Thread Duncan
Philip Seeger posted on Sun, 01 Nov 2015 00:36:42 +0100 as excerpted:

> On 10/31/2015 08:18 PM, Philip Seeger wrote:

>> But it looks like there are still some "invisible" errors on this (now
>> empty) filesystem; after rebooting and mounting it, this one error is
>> logged:
>> BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 199313, gen 0
> 
> However, this "invisible" error shows up even with this kernel version.
> 
> So I'm still wondering why this error is happening even after a 
> successful scrub.

That's NOTABUG and not an error, functioning as designed.

Btrfs device error counts are retained until manually reset, and do not 
restart at zero at reboot or with a umount/mount or btrfs module remove/
insert cycle.  This highlights problem devices over time as their error 
counts increase.

So what btrfs is logging to dmesg on mount here are the historical error
counts, in this case expected as they were deliberately caused during
your test (nearly 200K of them), not one or more new errors.

To have btrfs report these at the CLI, use btrfs device stats.  To zero 
them out, use its -z option.  Then mounting should once again report 0 
corrupt in dmesg... until some other error happens, of course. =:^)
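
For example (mountpoint is a placeholder):

  btrfs device stats /mnt      # print the per-device error counters
  btrfs device stats -z /mnt   # print them and reset them to zero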

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Unclear error message when running btrfs check on a mountpoint

2015-10-31 Thread Martin Steigerwald
Hi!

With kernel 4.3-rc7 and btrfs-progs 4.2.2 I get:

merkaba:~> btrfs check /daten
Superblock bytenr is larger than device size
Couldn't open file system


It took me a moment to see that I used a mountpoint and that this may be the
reason for the error message.

Maybe check for a device file as argument and give a clearer error message
in this case?
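
For reference, a way to resolve a mountpoint to its backing device first
(a sketch; findmnt comes from util-linux, and check still wants the
filesystem unmounted):

  findmnt -n -o SOURCE /daten          # prints e.g. /dev/mapper/msata-daten
  umount /daten
  btrfs check /dev/mapper/msata-daten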

Thanks,
-- 
Martin


Re: behavior of BTRFS in relation to inodes when moving/copying files between filesystems

2015-10-31 Thread Martin Steigerwald
On Tuesday, 13 October 2015 at 12:39:12 CET, you wrote:
> Hi!
> 
> With BTRFS as source and XFS/Ext4 as target, the inode number of the target
> file stays the same in both the cp and mv case (/mnt/zeit is a freshly
> created XFS in this example):
> 
> merkaba:~> ls -li foo /mnt/zeit/moo
> 6609270  foo
>  99  /mnt/zeit/moo
> merkaba:~> cp foo /mnt/zeit/moo
> merkaba:~> ls -li foo /mnt/zeit/moo
> 6609270 8 foo
>  99  /mnt/zeit/moo
> merkaba:~> cp -p foo /mnt/zeit/moo  
> merkaba:~> ls -li foo /mnt/zeit/moo
> 6609270 foo
>  99 /mnt/zeit/moo
> merkaba:~> mv foo /mnt/zeit/moo
> merkaba:~> ls -lid /mnt/zeit/moo
> 99 -rw-r--r-- 1 root root 6 Okt 13 12:28 /mnt/zeit/moo
> 
> 
> With BTRFS as target filesystem however in the mv case I get a new inode:
> 
> merkaba:~> ls -li foo /home/moo
>  6609289 -rw-r--r-- 1 root root 6 Okt 13 12:34 foo
> 16476276 -rw-r--r-- 1 root root 6 Okt 13 12:34 /home/moo
> merkaba:~> cp foo /home/moo
> merkaba:~> ls -li foo /home/moo
>  6609289 -rw-r--r-- 1 root root 6 Okt 13 12:34 foo
> 16476276 -rw-r--r-- 1 root root 6 Okt 13 12:34 /home/moo
> merkaba:~> cp -p foo /home/moo 
> merkaba:~> ls -li foo /home/moo
>  6609289 -rw-r--r-- 1 root root 6 Okt 13 12:34 foo
> 16476276 -rw-r--r-- 1 root root 6 Okt 13 12:34 /home/moo
> merkaba:~> mv foo /home/moo
> merkaba:~> ls -li /home/moo 
> 16476280 -rw-r--r-- 1 root root 6 Okt 13 12:34 /home/moo
> 
> 
> Is this intentional and/or somehow related to the copy on write specifics of 
> the filesystem?
> 
> I think even with COW it can just overwrite the existing file instead of 
> removing the old one and creating a new one – but it wouldn't give much of a 
> benefit unless the target file is nocow.
> 
> (Also I thought only certain other utilities had supercow powers, but well 
> BTRFS seems to have them as well :)

Anyone any idea?

Thanks,
-- 
Martin


Re: [4.3-rc4] scrubbing aborts before finishing

2015-10-31 Thread Martin Steigerwald
On Thursday, 22 October 2015 at 10:41:15 CET, Martin Steigerwald wrote:
> I get this:
> 
> merkaba:~> btrfs scrub status -d /   
> scrub status for […]
> scrub device /dev/mapper/sata-debian (id 1) history
> scrub started at Thu Oct 22 10:05:49 2015 and was aborted after 
> 00:00:00
> total bytes scrubbed: 0.00B with 0 errors
> scrub device /dev/dm-2 (id 2) history
> scrub started at Thu Oct 22 10:05:49 2015 and was aborted after 
> 00:01:30
> total bytes scrubbed: 23.81GiB with 0 errors
> 
> For /, the scrub aborts for the sata SSD immediately.
> 
> For /home, the scrub aborts for both SSDs after some time.
> 
> merkaba:~> btrfs scrub status -d /home
> scrub status for […]
> scrub device /dev/mapper/msata-home (id 1) history
> scrub started at Thu Oct 22 10:09:37 2015 and was aborted after 
> 00:01:31
> total bytes scrubbed: 22.03GiB with 0 errors
> scrub device /dev/dm-3 (id 2) history
> scrub started at Thu Oct 22 10:09:37 2015 and was aborted after 
> 00:03:34
> total bytes scrubbed: 53.30GiB with 0 errors
> 
> Also single volume BTRFS is affected:
> 
> merkaba:~> btrfs scrub status /daten
> scrub status for […]
> scrub started at Thu Oct 22 10:36:38 2015 and was aborted after 
> 00:00:00
> total bytes scrubbed: 0.00B with 0 errors
> 
> 
> No errors in dmesg, btrfs device stat or smartctl -a.
> 
> Any known issue?

I am still seeing this in 4.3-rc7. What happens is that on one SSD BTRFS
doesn't even start scrubbing, but in the end it aborts the scrub anyway.

I do not see any other issue so far. But I would really like to be able to
scrub my BTRFS filesystems completely again. Any hints? Any further
information needed? 

merkaba:~> btrfs scrub status -d /
scrub status for […]
scrub device /dev/dm-5 (id 1) history
scrub started at Sat Oct 31 11:58:45 2015, running for 00:00:00
total bytes scrubbed: 0.00B with 0 errors
scrub device /dev/mapper/msata-debian (id 2) status
scrub started at Sat Oct 31 11:58:45 2015, running for 00:00:20
total bytes scrubbed: 5.27GiB with 0 errors
merkaba:~> btrfs scrub status -d /
scrub status for […]
scrub device /dev/dm-5 (id 1) history
scrub started at Sat Oct 31 11:58:45 2015, running for 00:00:00
total bytes scrubbed: 0.00B with 0 errors
scrub device /dev/mapper/msata-debian (id 2) status
scrub started at Sat Oct 31 11:58:45 2015, running for 00:00:25
total bytes scrubbed: 6.59GiB with 0 errors
merkaba:~> btrfs scrub status -d /
scrub status for […]
scrub device /dev/dm-5 (id 1) history
scrub started at Sat Oct 31 11:58:45 2015, running for 00:00:00
total bytes scrubbed: 0.00B with 0 errors
scrub device /dev/mapper/msata-debian (id 2) status
scrub started at Sat Oct 31 11:58:45 2015, running for 00:01:25
total bytes scrubbed: 21.97GiB with 0 errors
merkaba:~> btrfs scrub status -d /
scrub status for […]
scrub device /dev/dm-5 (id 1) history
scrub started at Sat Oct 31 11:58:45 2015 and was aborted after 00:00:00
total bytes scrubbed: 0.00B with 0 errors
scrub device /dev/mapper/msata-debian (id 2) history
scrub started at Sat Oct 31 11:58:45 2015 and was aborted after 00:01:32
total bytes scrubbed: 23.63GiB with 0 errors


For the sake of it, I am going to btrfs check one of the filesystems
where BTRFS aborts scrubbing (which is all of the laptop filesystems, not
only the RAID 1 one).

I will use the /daten filesystem as I can unmount it during laptop runtime
easily. There scrubbing aborts immediately:

merkaba:~> btrfs scrub start /daten 
scrub started on /daten, fsid […] (pid=13861)
merkaba:~> btrfs scrub status /daten
scrub status for […]
scrub started at Sat Oct 31 12:04:25 2015 and was aborted after 00:00:00
total bytes scrubbed: 0.00B with 0 errors

It is single device:

merkaba:~> btrfs fi sh /daten
Label: 'daten'  uuid: […]
Total devices 1 FS bytes used 227.23GiB
devid1 size 230.00GiB used 230.00GiB path /dev/mapper/msata-daten

btrfs-progs v4.2.2
merkaba:~> btrfs fi df /daten
Data, single: total=228.99GiB, used=226.79GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=1.01GiB, used=449.50MiB
GlobalReserve, single: total=160.00MiB, used=0.00B


I do not see any output in btrfs check that points to any issue:

merkaba:~> btrfs check /dev/msata/daten
Checking filesystem on /dev/msata/daten
UUID: 7918274f-e2ec-4983-bbb0-aa93ef95fcf7
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 243936530607 bytes used err is 0
total csum bytes: 237758932
total tree bytes: 471384064
total fs tree bytes: 116473856
total extent tree bytes: 78544896
btree space waste bytes: 57523323
file data blocks allocated: 422700576768
 referenced 243803443200
btrfs-progs v4.2.2

Thanks,
-- 
Martin

Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Simon King
Hi Hugo,

On 31.10.2015 at 17:41, Hugo Mills wrote:
>> linux-va3e:~ # uname -a
>> Linux linux-va3e.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23
>> 00:46:04 UTC 2015 (6be6a97) x86_64 x86_64 x86_64 GNU/Linux
> 
>OK, that's a bit old -- you would probably do well to upgrade this
> anyway, regardless of the issues you're having. (I'd recommend 4.1 at
> the moment; there's a bug in 4.2 that affects balancing.)

The latest version of openSuse is Tumbleweed; in a few days there will
be openSuse Leap. I am not sure what kernel it would give me.

>I thought snapper could automatically delete old snapshots. (I've
> never used it, though, so I'm not sure). Worth looking at the snapper
> config to see if you can tell it how many to keep.

Probably it can, right.

>You're telling it to move the first two chunks with less than 10%
> usage. If all the other chunks are full, and there are two chunks (one
> data, one metadata) with less than 10% usage, then they'll be moved to
> two new chunks... with less than 10% usage. So it's perfectly possible
> that the same command will show the same output.

Do I understand correctly: In that situation, balancing would have no
benefit, as two old chunks are moved to two new chunks? Then why are
they moved at all?

>Incidentally, I would suggest using -dlimit=2 on its own, rather
> than both limit and usage.

I combined the two, since -dlimit on its own won't work:

linux-va3e:~ # btrfs balance start -dlimit=2 /
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail

>"btrfs balance start /" should rebalance the whole filesystem --

linux-va3e:~ # btrfs balance start /
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
linux-va3e:~ # dmesg | tail
[ 9814.499013] BTRFS info (device sda2): found 8153 extents
[ 9815.254270] BTRFS info (device sda2): relocating block group
820062978048 flags 36
[ 9826.335122] BTRFS info (device sda2): found 8182 extents
[ 9826.858482] BTRFS info (device sda2): relocating block group
805064146944 flags 36
[ 9839.444820] BTRFS info (device sda2): found 8184 extents
[ 9839.822108] BTRFS info (device sda2): relocating block group
794595164160 flags 36
[ 9850.456697] BTRFS info (device sda2): found 8143 extents
[ 9850.778264] BTRFS info (device sda2): relocating block group
794460946432 flags 36
[ 9862.546336] BTRFS info (device sda2): found 8140 extents
[ 9862.890330] BTRFS info (device sda2): 12 enospc errors during balance


> not that you'd need to for purposes of dealing with space usage
> issues.

I know that "df" is different from "btrfs fi df". However, I see that df
shows significantly more free space after balancing. Also, when my
computer became unusable, the problem disappeared by balancing and
defragmentation (deleting the old snapshots was not enough).

Unfortunately, df also shows significantly less free space after
UNSUCCESSFUL balancing.

>You may have more success using mkfs.btrfs --mixed when you create
> the FS, which puts data and metadata in the same chunks.

Can I do this in the running system? Or would that only be an option
during upgrade of openSuse Harlequin to Tumbleweed/Leap? Or even worse:
Only an option after nuking the old installation and installing a new
one from scratch?

Best regards,
Simon


Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Hugo Mills
On Sat, Oct 31, 2015 at 06:31:45PM +0100, Simon King wrote:
> Hi Hugo,
> 
> On 31.10.2015 at 17:41, Hugo Mills wrote:
> >> linux-va3e:~ # uname -a
> >> Linux linux-va3e.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23
> >> 00:46:04 UTC 2015 (6be6a97) x86_64 x86_64 x86_64 GNU/Linux
> > 
> >OK, that's a bit old -- you would probably do well to upgrade this
> > anyway, regardless of the issues you're having. (I'd recommend 4.1 at
> > the moment; there's a bug in 4.2 that affects balancing.)
> 
> The latest version of openSuse is Tumbleweed; in a few days there will
> be openSuse Leap. I am not sure what kernel it would give me.
> 
> >I thought snapper could automatically delete old snapshots. (I've
> > never used it, though, so I'm not sure). Worth looking at the snapper
> > config to see if you can tell it how many to keep.
> 
> Probably it can, right.
> 
> >You're telling it to move the first two chunks with less than 10%
> > usage. If all the other chunks are full, and there are two chunks (one
> > data, one metadata) with less than 10% usage, then they'll be moved to
> > two new chunks... with less than 10% usage. So it's perfectly possible
> > that the same command will show the same output.
> 
> Do I understand correctly: In that situation, balancing would have no
> benefit, as two old chunks are moved to two new chunks? Then why are
> they moved at all?

   Because that's what balance does -- it's a fairly blunt tool for
one thing (evening out the usage across multiple devices) that happens
to have some useful side-effects (compacting space usage into smaller
numbers of block groups and freeing up the empty ones).

> >Incidentally, I would suggest using -dlimit=2 on its own, rather
> > than both limit and usage.
> 
> I combined the two, since -dlimit on its own won't work:
> 
> linux-va3e:~ # btrfs balance start -dlimit=2 /
> ERROR: error during balancing '/' - No space left on device
> There may be more info in syslog - try dmesg | tail

   And this is with a filesystem that's not fully allocated?
(i.e. btrfs fi show indicates that used and total are different for
each device). If that's the case, then you may have hit a known but
unfixed bug to do with space allocation.

> >"btrfs balance start /" should rebalance the whole filesystem --
> 
> linux-va3e:~ # btrfs balance start /
> ERROR: error during balancing '/' - No space left on device
> There may be more info in syslog - try dmesg | tail
> linux-va3e:~ # dmesg | tail
> [ 9814.499013] BTRFS info (device sda2): found 8153 extents
> [ 9815.254270] BTRFS info (device sda2): relocating block group
> 820062978048 flags 36
> [ 9826.335122] BTRFS info (device sda2): found 8182 extents
> [ 9826.858482] BTRFS info (device sda2): relocating block group
> 805064146944 flags 36
> [ 9839.444820] BTRFS info (device sda2): found 8184 extents
> [ 9839.822108] BTRFS info (device sda2): relocating block group
> 794595164160 flags 36
> [ 9850.456697] BTRFS info (device sda2): found 8143 extents
> [ 9850.778264] BTRFS info (device sda2): relocating block group
> 794460946432 flags 36
> [ 9862.546336] BTRFS info (device sda2): found 8140 extents
> [ 9862.890330] BTRFS info (device sda2): 12 enospc errors during balance
> 
> 
> > not that you'd need to for purposes of dealing with space usage
> > issues.
> 
> I know that "df" is different from "btrfs fi df". However, I see that df
> shows significantly more free space after balancing. Also, when my
> computer became unusable, the problem disappeared by balancing and
> defragmentation (deleting the old snapshots was not enough).
> 
> Unfortunately, df also shows significantly less free space after
> UNSUCCESSFUL balancing.
> 
> >You may have more success using mkfs.btrfs --mixed when you create
> > the FS, which puts data and metadata in the same chunks.
> 
> Can I do this in the running system? Or would that only be an option
> during upgrade of openSuse Harlequin to Tumbleweed/Leap? Or even worse:
> Only an option after nuking the old installation and installing a new
> one from scratch?

   You'd have to recreate the FS, so it's a matter of a reinstall, or
nuking it and restoring from your backups.

   Hugo.

-- 
Hugo Mills | Le Corbusier's plan for improving Paris involved the
hugo@... carfax.org.uk | assassination of the city, and its rebirth as tower
http://carfax.org.uk/  | blocks.
PGP: E2AB1DE4  |   Robert Hughes, The Shock of the New




Re: Crash during mount -o degraded, kernel BUG at fs/btrfs/extent_io.c:2044

2015-10-31 Thread Philip Seeger

On 10/23/2015 01:13 AM, Erik Berg wrote:

So I intentionally broke this small raid6 fs on a VM to learn recovery
strategies for another much bigger raid6 I have running (which also
suffered a drive failure).

Basically I zeroed out one of the drives (vdd) from under the running
vm. Then ran an md5sum on a file on the fs to trigger some detection of
data inconsistency. I ran a scrub, which completed "ok". Then rebooted.

Now trying to mount the filesystem in degraded mode leads to a kernel
crash.


I've tried this on a system running kernel 4.2.5 and got slightly 
different results.


Created a raid6 array with 4 drives and put some stuff on it. Zeroed out 
the second drive (sdc) and checked the md5 sums of said stuff (all OK, 
good) which caused errors to be logged (dmesg) complaining about 
checksum errors on the 4th drive (sde):
BTRFS warning (device sde): csum failed ino 259 off 1071054848 csum 
2566472073 expected csum 3870060223
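
(A repro sketch of the above test, with device names and paths as
placeholders:

  mkfs.btrfs -f -d raid6 -m raid6 /dev/sdb /dev/sdc /dev/sdd /dev/sde
  mount /dev/sdb /mnt && cp -a /some/stuff /mnt && sync
  dd if=/dev/zero of=/dev/sdc bs=4M    # wipe the second drive in place
  md5sum /mnt/stuff/*                  # read back; csums get verified
)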


This is misleading: these error messages might make one think that the 
4th drive is bad and has to be replaced, which would reduce the 
redundancy to the minimum, because it's the second drive that's actually bad.


I started a scrub and this time, the checksum errors mentioned the right 
drive:

BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
BTRFS: checksum error at logical 38469632 on dev /dev/sdc, sector 19072: 
metadata leaf (level 0) in tree 7

...

This error mentions a file which is still correct:
BTRFS: checksum error at logical 2396721152 on dev /dev/sdc, sector 
2322056, root 5, inode 257, offset 142282752, length 4096, links 1 
(path: test1)


However, the scrub found uncorrectable errors, which shouldn't happen in 
a raid6 array with only 1 bad drive:

total bytes scrubbed: 3.00GiB with 199314 errors
error details: read=1 super=2 csum=199311
corrected errors: 199306, uncorrectable errors: 6, unverified errors: 0

ERROR: There are uncorrectable errors.

So wiping one drive in a btrfs raid6 array turned it into a bad state 
with uncorrectable errors, which should not happen. But at least it's 
still mountable without using the degraded option.


Removing all the files on this filesystem (which were not corrupted) 
fixed the aforementioned uncorrectable errors, another scrub found no 
more errors:
scrub started at Sat Oct 31 19:12:25 2015 and finished after 
00:01:15

total bytes scrubbed: 1.60GiB with 0 errors

But it looks like there are still some "invisible" errors on this (now 
empty) filesystem; after rebooting and mounting it, this one error is 
logged:

BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 199313, gen 0

Not sure if this might be because I wiped that drive from the very 
beginning, effectively overwriting everything including the MBR and other 
metadata. But whatever happened, a single bad drive (returning 
corrupted data) should not lead to fatal errors in a raid6 array.


Next, I recreated this raid6 array (same drives) and filled it with one 
file (dd if=/dev/urandom of=test bs=4M). I wiped the 2nd *and* 3rd drive 
(sdc and sdd) this time. I unmounted it and tried mounting it, which 
failed (again, sde is fine):

BTRFS (device sde): bad tree block start 0 63651840
BTRFS (device sde): bad tree block start 65536 63651840
BTRFS (device sde): bad tree block start 2360238080 63651840
BTRFS: open_ctree failed

After rebooting, these errors mentioned sdb instead of sde, which is the 
other good drive.


Is it possible to recover from this type of 2-drive failure?

What is that "invisible error" in the first test (empty fs after reboot)?




Philip


Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Simon King
Hi!

On 31.10.2015 at 19:33, Hugo Mills wrote:
>> I combined the two, since -dlimit on its own won't work:
>>
>> linux-va3e:~ # btrfs balance start -dlimit=2 /
>> ERROR: error during balancing '/' - No space left on device
>> There may be more info in syslog - try dmesg | tail
> 
>And this is with a filesystem that's not fully allocated?
> (i.e. btrfs fi show indicates that used and total are different for
> each device). If that's the case, then you may have hit a known but
> unfixed bug to do with space allocation.

linux-va3e:~ # btrfs fi show
Label: none  uuid: 656dc65f-240b-4137-a490-0175717dd7fa
Total devices 1 FS bytes used 13.71GiB
devid1 size 20.00GiB used 16.88GiB path /dev/sda2

btrfs-progs v4.0+20150429

Is there a manual work-around?

>> Can I do this in the running system? Or would that only be an option
>> during upgrade of openSuse Harlequin to Tumbleweed/Leap? Or even worse:
>> Only an option after nuking the old installation and installing a new
>> one from scratch?
> 
>You'd have to recreate the FS, so it's a matter of a reinstall, or
> nuking it and restoring from your backups.

OK. So, I'll try to find out whether it is better to move on to
Tumbleweed or to Leap (btw, I found out that the latter is based on the
4.1 kernel).

Best regards,
Simon



[PATCH] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state

2015-10-31 Thread Chandan Rajendra
When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
and nodesize set to 64k), the following call trace is observed,

WARNING: at /root/repos/linux/fs/btrfs/locking.c:253
Modules linked in:
CPU: 2 PID: 8353 Comm: umount Not tainted 4.3.0-rc5-13676-ga5e681d #54
task: c000f2b1f560 ti: c000f6008000 task.ti: c000f6008000
NIP: c0520c88 LR: c04a3b34 CTR: 
REGS: c000f600a820 TRAP: 0700   Not tainted  (4.3.0-rc5-13676-ga5e681d)
MSR: 800102029032   CR: 2884  XER: 
CFAR: c04a3b30 SOFTE: 1
GPR00: c04a3b34 c000f600aaa0 c108ac00 c000f5a808c0
GPR04:  c000f600ae60  0005
GPR08: 20a1 0001 c000f2b1f560 0030
GPR12: 84842882 cfdc0900 c000f600ae60 c000f070b800
GPR16:  c000f3c8a000  0049
GPR20: 0001 0001 c000f5aa01f8 
GPR24: 0f83e0f83e0f83e1 c000f5a808c0 c000f3c8d000 c000
GPR28: c000f600ae74 0001 c000f3c8d000 c000f5a808c0
NIP [c0520c88] .btrfs_tree_lock+0x48/0x2a0
LR [c04a3b34] .btrfs_lock_root_node+0x44/0x80
Call Trace:
[c000f600aaa0] [c000f600ab80] 0xc000f600ab80 (unreliable)
[c000f600ab80] [c04a3b34] .btrfs_lock_root_node+0x44/0x80
[c000f600ac00] [c04a99dc] .btrfs_search_slot+0xa8c/0xc00
[c000f600ad40] [c04ab878] .btrfs_insert_empty_items+0x98/0x120
[c000f600adf0] [c050da44] .btrfs_finish_chunk_alloc+0x1d4/0x620
[c000f600af20] [c04be854] 
.btrfs_create_pending_block_groups+0x1d4/0x2c0
[c000f600b020] [c04bf188] .do_chunk_alloc+0x3c8/0x420
[c000f600b100] [c04c27cc] .find_free_extent+0xbfc/0x1030
[c000f600b260] [c04c2ce8] .btrfs_reserve_extent+0xe8/0x250
[c000f600b330] [c04c2f90] .btrfs_alloc_tree_block+0x140/0x590
[c000f600b440] [c04a47b4] .__btrfs_cow_block+0x124/0x780
[c000f600b530] [c04a4fc0] .btrfs_cow_block+0xf0/0x250
[c000f600b5e0] [c04a917c] .btrfs_search_slot+0x22c/0xc00
[c000f600b720] [c050aa40] .btrfs_remove_chunk+0x1b0/0x9f0
[c000f600b850] [c04c4e04] .btrfs_delete_unused_bgs+0x434/0x570
[c000f600b950] [c04d3cb8] .close_ctree+0x2e8/0x3b0
[c000f600ba20] [c049d178] .btrfs_put_super+0x18/0x30
[c000f600ba90] [c0243cd4] .generic_shutdown_super+0xa4/0x1a0
[c000f600bb10] [c02441d8] .kill_anon_super+0x18/0x30
[c000f600bb90] [c049c898] .btrfs_kill_super+0x18/0xc0
[c000f600bc10] [c02444f8] .deactivate_locked_super+0x98/0xe0
[c000f600bc90] [c0269f94] .cleanup_mnt+0x54/0xa0
[c000f600bd10] [c00bd744] .task_work_run+0xc4/0x100
[c000f600bdb0] [c0016334] .do_notify_resume+0x74/0x80
[c000f600be30] [c00098b8] .ret_from_except_lite+0x64/0x68
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 7c791b78 f8010010 f821ff21 e94d0290 81030040
812a04e8 7d094a78 7d290034 5529d97e <0b09> 3b40 3be30050 3bc3004c

The above call trace is seen even on x86_64, albeit very rarely, and only
with nodesize set to 64k and the nospace_cache mount option being used.
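
(For reference, a rough sketch of the reproducer, assuming an xfstests
checkout pointed at a btrfs test device formatted with 64k sector/node
sizes:

  mkfs.btrfs -f -s 65536 -n 65536 $TEST_DEV
  while ./check generic/001; do :; done
)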

The reason for the above call trace is,
btrfs_remove_chunk
  check_system_chunk
Allocate chunk if required
  For each physical stripe on underlying device,
btrfs_free_dev_extent
...
  Take lock on Device tree's root node
  btrfs_cow_block("dev tree's root node");
btrfs_reserve_extent
  find_free_extent
index = BTRFS_RAID_DUP;
have_caching_bg = false;

When in LOOP_CACHING_NOWAIT state, assume we find a block group
which is being cached; hence have_caching_bg is set to true.

When repeating the search for the next RAID index, we set
have_caching_bg to false.

Hence right after completing the LOOP_CACHING_NOWAIT state, we incorrectly
skip LOOP_CACHING_WAIT state and move to LOOP_ALLOC_CHUNK state where we
allocate a chunk and try to add entries corresponding to the chunk's physical
stripe into the device tree. When doing so the task deadlocks itself waiting
for the blocking lock on the root node of the device tree.

This commit fixes the issue by introducing a new local variable to
indicate whether a block group of any RAID type is being cached.

Signed-off-by: Chandan Rajendra 
---
 fs/btrfs/extent-tree.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 92fdbc6..cac9da8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7027,6 +7027,7 @@ static noinline int find_free_extent(struct btrfs_root 
*orig_root,
bool failed_alloc = false;
bool 

How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Simon King
Hi!

From the messages I see in this forum, I got the impression that it is a
developer forum and not a help forum. I seek help. So, please point me
to the right place if I shouldn't ask my questions here.

Since I am new, first my data:

linux-va3e:~ # uname -a
Linux linux-va3e.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23
00:46:04 UTC 2015 (6be6a97) x86_64 x86_64 x86_64 GNU/Linux

linux-va3e:~ # btrfs --version
btrfs-progs v4.0+20150429

linux-va3e:~ # btrfs fi show
Label: none  uuid: 656dc65f-240b-4137-a490-0175717dd7fa
Total devices 1 FS bytes used 13.49GiB
devid1 size 20.00GiB used 16.19GiB path /dev/sda2

btrfs-progs v4.0+20150429

linux-va3e:~ # btrfs fi df /
Data, single: total=12.62GiB, used=12.14GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.75GiB, used=1.35GiB
GlobalReserve, single: total=304.00MiB, used=0.00B

dmesg > dmesg.log is attached.


When I installed openSuse 13.2 on my computer, I used the default:
- btrfs is used for a root partition of 20GB
- A program called "snapper" creates snapshots of the system whenever a
change is done (i.e., whenever a new program is installed or an old
program is upgraded).

After a while, the root partition was full because of btrfs's metadata.
In an openSuse forum, people told me I should regularly delete the
snapshots, and should "btrfs balance" and "btrfs fi defragment" in order
to keep the metadata under control.

It was effective in the sense that I could use my computer again. It is
annoying that the user has to manually delete snapshots and has to
remember to regularly do balancing, though.

Now to my current problem: It seems that my root partition gradually
fills up. So, I am afraid that in a few weeks my computer will be broken
again. And that happens although I keep deleting old snapshots and try
balancing.

My questions:
1. There is one old snapshot that I can not delete:
  # snapper -c root delete 318
  Fehler beim Löschen des Schnappschusses. [Error deleting the snapshot.]
The error message does not hint at *why* the snapshot cannot be deleted.
Here is one observation that may indicate what goes wrong:
  # find / -name 318
  /.snapshots/318
  find: File system loop detected; ‘/.snapshots/318/snapshot’ is part of
the same file system loop as ‘/’.

So, what can I do to delete the snapshot to hopefully free some space?

2. When I simply do "btrfs balance start /", then it says the device is
full. So, I tried
  linux-va3e:~ # btrfs balance start -mlimit=1 -dlimit=2  -dusage=10 /
  Done, had to relocate 2 out of 28 chunks
Fine. But when I try it again, it still says that it had to relocate 2
out of 28 chunks! Shouldn't it be the case that the work has already
been done, so that no further relocation is needed with the same parameters?

3. When I increase the parameters, I always come to the point that there
is no space left on device. So, how can I achieve full balance of the
system?

One last remark: On the openSuse forum, I was advised to re-install the
system, and either reserve at least 50GB for the root partition, or drop
btrfs and use ext4 for the root partition. I would like to avoid such
trouble and hope that you can tell me how to sanitise my root partition,
and *keep* it sane.

Best regards,
Simon
[dmesg.log attachment: standard boot log (kernel 3.16.7-29-desktop, boot command line, BIOS e820 memory map); truncated in the digest]

Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Hugo Mills
On Sat, Oct 31, 2015 at 04:26:11PM +0100, Simon King wrote:
> Hi!
> 
> From the messages I see in this forum, I got the impression that it is a
> developer forum and not a help forum. I seek help. So, please point me
> to the right place if I shouldn't ask my questions here.

   No, you're good here. We do support as well as development. :)

> Since I am new, first my data:
> 
> linux-va3e:~ # uname -a
> Linux linux-va3e.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23
> 00:46:04 UTC 2015 (6be6a97) x86_64 x86_64 x86_64 GNU/Linux

   OK, that's a bit old -- you would probably do well to upgrade this
anyway, regardless of the issues you're having. (I'd recommend 4.1 at
the moment; there's a bug in 4.2 that affects balancing.)

> linux-va3e:~ # btrfs --version
> btrfs-progs v4.0+20150429
> 
> linux-va3e:~ # btrfs fi show
> Label: none  uuid: 656dc65f-240b-4137-a490-0175717dd7fa
> Total devices 1 FS bytes used 13.49GiB
> devid1 size 20.00GiB used 16.19GiB path /dev/sda2
> 
> btrfs-progs v4.0+20150429
> 
> linux-va3e:~ # btrfs fi df /
> Data, single: total=12.62GiB, used=12.14GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=1.75GiB, used=1.35GiB
> GlobalReserve, single: total=304.00MiB, used=0.00B
> 
> dmesg > dmesg.log is attached.
> 
> 
> When I installed openSuse 13.2 on my computer, I used the default:
> - btrfs is used for a root partition of 20GB
> - A program called "snapper" creates snapshots of the system whenever a
> change is done (i.e., whenever a new program is installed or an old
> program is upgraded).
> 
> After a while, the root partition was full because of btrfs's metadata.
> In an openSuse forum, people told me I should regularly delete the
> snapshots, and should "btrfs balance" and "btrfs fi defragment" in order
> to keep the metadata under control.
> 
> It was effective in the sense that I could use my computer again. It is
> annoying that the user has to manually delete snapshots and has to
> remember to regularly do balancing, though.

   I thought snapper could automatically delete old snapshots. (I've
never used it, though, so I'm not sure). Worth looking at the snapper
config to see if you can tell it how many to keep.

> Now to my current problem: It seems that my root partition gradually
> fills up. So, I am afraid that in a few weeks my computer will be broken
> again. And that happens although I keep deleting old snapshots and try
> balancing.
> 
> My questions:
> 1. There is one old snapshot that I can not delete:
>   # snapper -c root delete 318
>   Fehler beim Löschen des Schnappschusses. [Error deleting the snapshot.]
> The error message does not hint at *why* the snapshot cannot be deleted.
> Here is one observation that may indicate what goes wrong:
>   # find / -name 318
>   /.snapshots/318
>   find: File system loop detected; ‘/.snapshots/318/snapshot’ is part of
> the same file system loop as ‘/’.
> 
> So, what can I do to delete the snapshot to hopefully free some space?

   That one, I don't know what's happening, I'm afraid.

> 2. When I simply do "btrfs balance start /", then it says the device is
> full. So, I tried
>   linux-va3e:~ # btrfs balance start -mlimit=1 -dlimit=2  -dusage=10 /
>   Done, had to relocate 2 out of 28 chunks
> Fine. But when I try it again, it still says that it had to relocate 2
> out of 28 chunks! Shouldn't it be the case that the work has already
> been done, so that no further relocation is needed with the same parameters?

   You're telling it to move the first two chunks with less than 10%
usage. If all the other chunks are full, and there are two chunks (one
data, one metadata) with less than 10% usage, then they'll be moved to
two new chunks... with less than 10% usage. So it's perfectly possible
that the same command will show the same output.

   Incidentally, I would suggest using -dlimit=2 on its own, rather
than both limit and usage. You shouldn't normally need to run it on
the metadata: The usual case of early ENOSPC is that all of the block
groups in the filesystem are allocated (i.e. the space is reserved for
either data or metadata, not necessarily used), and then the metadata
fills up, while there is still lots of space for data allocated but
unused.  The balance simply moves some data around so that one or more
of the data block groups can be freed up, giving the FS some more
space that it can allocate to metadata.
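
As a concrete sketch of the difference between the two filters
(mountpoint is a placeholder):

  btrfs balance start -dlimit=2 /    # relocate at most 2 data chunks, whatever their usage
  btrfs balance start -dusage=10 /   # relocate every data chunk at most 10% used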

> 3. When I increase the parameters, I always come to the point that there
> is no space left on device. So, how can I achieve full balance of the
> system?

   "btrfs balance start /" should rebalance the whole filesystem --
not that you'd need to for purposes of dealing with space usage
issues.

> One last remark: On the openSuse forum, I was advised to re-install the
> system, and either reserve at least 50GB for the root partition, or drop
> btrfs and use ext4 for the root partition. I would like to avoid such
> trouble and hope that you can tell me how to sanitise my root partition,

Re: Crash during mount -o degraded, kernel BUG at fs/btrfs/extent_io.c:2044

2015-10-31 Thread Philip Seeger

On 10/31/2015 08:18 PM, Philip Seeger wrote:

On 10/23/2015 01:13 AM, Erik Berg wrote:

So I intentionally broke this small raid6 fs on a VM to learn recovery
strategies for another much bigger raid6 I have running (which also
suffered a drive failure).

Basically I zeroed out one of the drives (vdd) from under the running
vm. Then ran an md5sum on a file on the fs to trigger some detection of
data inconsistency. I ran a scrub, which completed "ok". Then rebooted.

Now trying to mount the filesystem in degraded mode leads to a kernel
crash.


I've tried this on a system running kernel 4.2.5 and got slightly
different results.


And I've now tried it with kernel 4.3-rc7 and got similar results.


Created a raid6 array with 4 drives and put some stuff on it. Zeroed out
the second drive (sdc) and checked the md5 sums of said stuff (all OK,
good) which caused errors to be logged (dmesg) complaining about
checksum errors on the 4th drive (sde):
BTRFS warning (device sde): csum failed ino 259 off 1071054848 csum
2566472073 expected csum 3870060223


Same issue, this time sdd. The error message appears to choose a random 
device.



This error mentions a file which is still correct:


Same issue.


However, the scrub found uncorrectable errors, which shouldn't happen in
a raid6 array with only 1 bad drive:


This did not happen, the scrub fixed errors and found no uncorrectable 
errors.



But it looks like there are still some "invisible" errors on this (now
empty) filesystem; after rebooting and mounting it, this one error is
logged:
BTRFS: bdev /dev/sdc errs: wr 0, rd 0, flush 0, corrupt 199313, gen 0


However, this "invisible" error shows up even with this kernel version.

So I'm still wondering why this error is happening even after a 
successful scrub.




Philip


Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Henk Slager
Hi Simon,

>>> linux-va3e:~ # btrfs balance start -dlimit=2 /
>>> ERROR: error during balancing '/' - No space left on device
>>> There may be more info in syslog - try dmesg | tail
>>
>>And this is with a filesystem that's not fully allocated?
>> (i.e. btrfs fi show indicates that used and total are different for
>> each device). If that's the case, then you may have hit a known but
>> unfixed bug to do with space allocation.
>
> linux-va3e:~ # btrfs fi show
> Label: none  uuid: 656dc65f-240b-4137-a490-0175717dd7fa
> Total devices 1 FS bytes used 13.71GiB
> devid1 size 20.00GiB used 16.88GiB path /dev/sda2
>
> btrfs-progs v4.0+20150429
>
> Is there a manual work-around?
For 'No space left on device', a trick I once saw is to run:

btrfs balance start -dusage=0 -musage=0 /

Under certain circumstances (I don't remember which kernel, tools
versions etc.), this enables you to create files again on the
filesystem.
Looking at the btrfs fi df / output, I don't see a real need for
balancing; the numbers would have to differ much more before balance
becomes useful.

The 318 snapshot is more of a problem, and you should get rid of
(some/unneeded/all) snapshots first. The default openSuse snapshot
retention is long (months and years), so maybe you want to edit the
configs
/etc/snapper/configs/<config>
/etc/sysconfig/snapper

to keep snapshots for just 1 or 2 days or so, but it really depends on
how you use the notebook and the subvolumes on the filesystem. Or
maybe you just disable snapper snapshotting completely, as 20GB will
quite easily get too full with the default snapper config. A cron task
will automatically delete snapshots that are too old, based on the
snapper config.
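
A sketch of the kind of edit meant here, e.g. in
/etc/snapper/configs/root (variable names follow snapper's documented
timeline settings; exact names and defaults may differ per version):

  TIMELINE_CREATE="yes"
  TIMELINE_LIMIT_HOURLY="0"
  TIMELINE_LIMIT_DAILY="2"
  TIMELINE_LIMIT_WEEKLY="0"
  TIMELINE_LIMIT_MONTHLY="0"
  TIMELINE_LIMIT_YEARLY="0"

With settings like these, the cleanup cron task keeps only a couple of
days of timeline snapshots.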

If the command (with 318, or whatever the highest snapshot number is)
snapper -c root delete 1-318

does not work, or the cron task fails, try
btrfs sub del /.snapshots/<nr>/snapshot

The 318-related error might also be fixed/worked around by a newer
tools/kernel. Maybe get the newer (4.2.3) btrfs tools from this repo
http://download.opensuse.org/repositories/filesystems/openSUSE_13.2/
and find some 4.1 kernel rpm for openSuse 13.2 (or compile your own
from kernel.org).

You could also start with a Leap/Tumbleweed liveDVD and mount your
/dev/sda2  somewhere and run the commands suggested above.

You should probably install/enable the btrfsmaintenance package (and
tune its config) so that defrag and balancing run as cron tasks.

And one important thing: a btrfs fi defragment with many snapshots still
around and a 20GB rootfs will make the situation worse...

/Henk


Re: How to delete this snapshot, and how to succeed with balancing?

2015-10-31 Thread Hugo Mills
On Sat, Oct 31, 2015 at 11:45:15PM +0100, Henk Slager wrote:
> Hi Simon,
> 
> >>> linux-va3e:~ # btrfs balance start -dlimit=2 /
> >>> ERROR: error during balancing '/' - No space left on device
> >>> There may be more info in syslog - try dmesg | tail
> >>
> >>And this is with a filesystem that's not fully allocated?
> >> (i.e. btrfs fi show indicates that used and total are different for
> >> each device). If that's the case, then you may have hit a known but
> >> unfixed bug to do with space allocation.
> >
> > linux-va3e:~ # btrfs fi show
> > Label: none  uuid: 656dc65f-240b-4137-a490-0175717dd7fa
> > Total devices 1 FS bytes used 13.71GiB
> > devid1 size 20.00GiB used 16.88GiB path /dev/sda2
> >
> > btrfs-progs v4.0+20150429
> >
> > Is there a manual work-around?

   Not that we've found in the last year or so of poking at it, I'm
afraid.

> For 'No space left on device', a trick I once saw is to run:
> 
> btrfs balance start -dusage=0 -musage=0 /
> 
> Under certain circumstances (i don't remember which kernel, tools
> versions etc), this enables you to create files again on the
> filesystem.

   That's basically what Simon's been doing (although a little more
aggressively). It's not going to help here.

> Looking at the   btrfs fi df /  output, I don't see a real need for
> balancing, the numbers can be much different and then balance might be
> usefull.

   If you're hitting ENOSPC with unallocated space, then that's a
bug. In fact, it's a known bug that hasn't been fixed yet.

   Hugo.

-- 
Hugo Mills | That's not rain, that's a lake with slots in it.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |

