Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

2016-01-01 Thread cheater00 .
here is the info requested, if that helps anyone.

# uname -a
Linux SX20S 4.3.0-040300rc7-generic #201510260712 SMP Mon Oct 26
11:27:59 UTC 2015 i686 i686 i686 GNU/Linux
# aptitude show btrfs-tools
Package: btrfs-tools
State: installed
Automatically installed: no
Version: 4.2.1+ppa1-1~ubuntu15.10.1
# btrfs --version
btrfs-progs v4.2.1
# btrfs fi show Media
Label: 'Media'  uuid: b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8
Total devices 1 FS bytes used 1.92TiB
devid 1 size 5.46TiB used 1.93TiB path /dev/sdd1

btrfs-progs v4.2.1
# btrfs fi usage Media
Overall:
Device size:   5.46TiB
Device allocated:   1.93TiB
Device unallocated:   3.52TiB
Device missing: 0.00B
Used:   1.93TiB
Free (estimated):   3.53TiB (min: 1.76TiB)
Data ratio:  1.00
Metadata ratio:  2.00
Global reserve: 512.00MiB (used: 0.00B)

Data,single: Size:1.92TiB, Used:1.92TiB
   /dev/sdd1   1.92TiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdd1   8.00MiB

Metadata,DUP: Size:5.00GiB, Used:3.32GiB
   /dev/sdd1  10.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdd1   4.00MiB

System,DUP: Size:8.00MiB, Used:224.00KiB
   /dev/sdd1  16.00MiB

Unallocated:
   /dev/sdd1   3.52TiB
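As a sanity check on the output above, here is a rough sketch of how the two "Free (estimated)" figures relate to the rest of the numbers. This is back-of-envelope arithmetic, not btrfs's exact internal calculation:

```python
# Rough reconstruction of the "Free (estimated)" figures above (a sketch,
# not btrfs's exact algorithm).
unallocated = 3.52        # TiB, "Device unallocated"
data_size = 1.92          # TiB, Data,single chunk size
data_used = 1.92          # TiB, Data,single used
worst_ratio = 2.0         # worst-case allocation ratio (DUP metadata)

free_estimated = unallocated + (data_size - data_used)
free_min = unallocated / worst_ratio

print(f"estimated ~{free_estimated:.2f}TiB")  # cf. reported 3.53TiB (rounding)
print(f"min ~{free_min:.2f}TiB")              # cf. reported 1.76TiB
```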



# btrfs-show-super /dev/sdd1
superblock: bytenr=65536, device=/dev/sdd1
-
csum 0xae174f16 [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8
label Media
generation 11983
root 34340864
sys_array_size 226
chunk_root_generation 11982
root_level 1
chunk_root 21135360
chunk_root_level 1
log_root 0
log_root_transid 0
log_root_level 0
total_bytes 6001173463040
bytes_used 2115339448320
sectorsize 4096
nodesize 16384
leafsize 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x61
( MIXED_BACKREF |
 BIG_METADATA |
 EXTENDED_IREF )
csum_type 0
csum_size 4
cache_generation 11983
uuid_tree_generation 11983
dev_item.uuid 819e1c8a-5e55-4992-81d3-f22fdd088dc9
dev_item.fsid b397b7ef-6754-4ba4-8b1a-fbf235aa1cf8 [match]
dev_item.type 0
dev_item.total_bytes 6001173463040
dev_item.bytes_used 2124972818432
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0
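A quick cross-check of the raw superblock byte counters against the human-readable `btrfs fi show` summary earlier in this message:

```python
# Convert the superblock byte counters to TiB; they should line up with
# the 'btrfs fi show' output above.
TiB = 2 ** 40
total_bytes = 6001173463040      # total_bytes
bytes_used = 2115339448320       # bytes_used (logical data+metadata)
dev_bytes_used = 2124972818432   # dev_item.bytes_used (chunk-allocated)

print(round(total_bytes / TiB, 2))     # cf. "size 5.46TiB"
print(round(bytes_used / TiB, 2))      # cf. "FS bytes used 1.92TiB"
print(round(dev_bytes_used / TiB, 2))  # cf. "used 1.93TiB"
```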

I did mount Media -o enospc_debug and now mount shows:
/dev/sdd1 on /media/cheater/Media type btrfs
(rw,nosuid,nodev,enospc_debug,_netdev)


On Wed, Dec 30, 2015 at 11:13 PM, Chris Murphy  wrote:
> kernel and btrfs-progs versions
> and output from:
> 'btrfs fi show '
> 'btrfs fi usage '
> 'btrfs-show-super '
> 'df -h'
>
> Then umount the volume, and mount with option enospc_debug, and try to
> reproduce the problem, then include everything from dmesg from the
> time the volume was mounted.
>
> --
> Chris Murphy

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 6TB partition, Data only 2TB - aka When you haven't hit the "usual" problem

2016-01-01 Thread cheater00 .
I have been unable to reproduce so far.


Re: btrfs scrub failing

2016-01-01 Thread John Center
Last thing for tonight, I tried to run btrfs-debug-tree & direct the
output to a file.  It crashed during the run with the following
errors:

john@mariposa:~$ sudo btrfs-debug-tree /dev/md125p2 >>
/media/data1/btrfs-debug-tree-01012016.txt
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
parent transid verify failed on 519273742336 wanted 1036426 found 1036428
Ignoring transid failure
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
parent transid verify failed on 519271792640 wanted 1036425 found 1036428
Ignoring transid failure
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
parent transid verify failed on 519274119168 wanted 1036426 found 1036428
Ignoring transid failure
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
parent transid verify failed on 519274135552 wanted 1036426 found 1036428
Ignoring transid failure
print-tree.c:1108: btrfs_print_tree: Assertion failed.
btrfs-debug-tree[0x418d99]
btrfs-debug-tree(btrfs_print_tree+0x2c0)[0x41ad4c]
btrfs-debug-tree(btrfs_print_tree+0x2dc)[0x41ad68]
btrfs-debug-tree(main+0x9a5)[0x432589]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7ff0629ccec5]
btrfs-debug-tree[0x4070e9]
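For readers unfamiliar with these messages, a sketch of the invariant the "parent transid verify failed" lines report on (a simplified model, not the kernel code):

```python
# A parent tree node records the transaction id (generation) it expects
# its child block to carry. If the block read from disk carries a
# different generation, the check fails; "found" newer than "wanted"
# means the block was rewritten by a later transaction than the parent
# pointer records.
def parent_transid_ok(wanted: int, found: int) -> bool:
    return wanted == found

wanted, found = 1036426, 1036428   # values from the log above
print(parent_transid_ok(wanted, found), found > wanted)
```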

Hopefully this will help the developers.

-John


On Fri, Jan 1, 2016 at 11:04 PM, John Center  wrote:
> Hi Duncan,
>
> Doing some more digging, I ran btrfs-image & found the following
> errors.  I'm not sure how useful this is, or what this means in terms
> of the other btrfs-tools messages.  Maybe more clues?
>
> Thanks.
>
> -John
>
> john@mariposa:~$ sudo btrfs-image -c9 -t4 /dev/md125p2
> /media/data1/btrfs.image-01012016
> WARNING: The device is mounted. Make sure the filesystem is quiescent.
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> parent transid verify failed on 337676075008 wanted 1036368 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> parent transid verify failed on 337674846208 wanted 1036370 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> parent transid verify failed on 337675403264 wanted 1036370 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> parent transid verify failed on 337681907712 wanted 1036375 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> parent transid verify failed on 337646354432 wanted 1036368 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> parent transid verify failed on 337679597568 wanted 1036373 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> parent transid verify failed on 337679613952 wanted 1036373 found 1036377
> Ignoring transid failure
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found 1036377
> parent transid verify failed on 337679745024 wanted 1036372 found 

[PATCH] BTRFS: Adds the files and options needed for Hybrid Storage

2016-01-01 Thread Sanidhya Solanki
This patch adds the file required for Hybrid Storage. It contains
the memory, time and size limits for the cache and the statistics that
will be provided while the cache is operating.
It also adds the Makefile changes needed to add the Hybrid Storage.

Signed-off-by: Sanidhya Solanki 
---
 fs/btrfs/Makefile |  2 +-
 fs/btrfs/cache.c  | 58 +++
 2 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/cache.c

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 6d1d0b9..dc56ae4 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -9,7 +9,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
   export.o tree-log.o free-space-cache.o zlib.o lzo.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
-  uuid-tree.o props.o hash.o
+  uuid-tree.o props.o hash.o cache.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/cache.c b/fs/btrfs/cache.c
new file mode 100644
index 000..0ece7a1
--- /dev/null
+++ b/fs/btrfs/cache.c
@@ -0,0 +1,58 @@
+/*
+ * (c) Sanidhya Solanki, 2016
+ *
+ * Licensed under the GNU General Public License v2 or later.
+ */
+#include 
+
+/* Cache size configuration (in MiB). */
+#define MAX_CACHE_SIZE 1
+#define MIN_CACHE_SIZE 10
+
+/* Time (in seconds) before retrying to increase the cache size. */
+#define CACHE_RETRY 10
+
+/* Space required to be free (in MiB) before increasing the size of the
+ * cache. If cache size is less than CACHE_GROW_LIMIT, a block will be freed
+ * from the cache to allow the cache to continue growing.
+ */
+#define CACHE_GROW_LIMIT 100
+
+/* Size required to be free (in MiB) after we shrink the cache, so that it
+ * does not grow in size immediately.
+ */
+#define CACHE_SHRINK_FREE_SPACE_LIMIT 100
+
+/* Age (in seconds) of oldest and newest block in the cache. */
+#define MAX_AGE_LIMIT 300  /* Five Minute Rule recommendation;
+                            * optimum size depends on size of
+                            * data blocks.
+                            */
+#define MIN_AGE_LIMIT 15   /* In case of cache stampede. */
+
+/* Memory constraints (in percentage) before we stop caching. */
+#define MIN_MEM_FREE 10
+
+/* Cache statistics. */
+struct cache_stats {
+   u64 cache_size;
+   u64 maximum_cache_size_attained;
+   int cache_hit_rate;
+   int cache_miss_rate;
+   u64 cache_evicted;
+   u64 duplicate_read;
+   u64 duplicate_write;
+   int stats_update_interval;
+};
+
+#define cache_size CACHE_SIZE         /* Current cache size. */
+#define max_cache_size MAX_SIZE       /* Max cache limit. */
+#define min_cache_size MIN_SIZE       /* Min cache limit. */
+#define cache_time MAX_TIME           /* Maximum time to keep data in cache. */
+#define evicted_csum EVICTED_CSUM     /* Checksum of the evicted data
+                                       * (to avoid repeatedly caching
+                                       * data that was just evicted).
+                                       */
+#define read_csum READ_CSUM           /* Checksum of the read data. */
+#define write_csum WRITE_CSUM         /* Checksum of the written data. */
+#define evict_interval EVICT_INTERVAL /* Time to keep data before eviction. */
-- 
2.5.0



Re: Add big device, remove small device, read-only

2016-01-01 Thread Chris Murphy
On Fri, Jan 1, 2016 at 4:47 AM, Rasmus Abrahamsen  wrote:
> Happy New Year!
>
> I have a raid with a 1TB, .5TB, 1.5TB and recently added a 4TB and want to
> remove the 1.5TB. When saying btrfs dev delete it turned read-only. I am
> on 4.2.5-1-ARCH and btrfs-progs v4.3.1; what can I do?

btrfs fi show
lsblk -f
btrfs fi usage 


he remove, so I did --since=yesterday
> I am looking at the log now, please stnad by.
> This is my log
> http://pastebin.com/mCPi3y9r

What's this?

Dec 31 15:03:56 rasmusahome systemd-udevd[6340]: inotify_add_watch(9,
/dev/sdd1, 10) failed: No space left on device
Dec 31 15:04:01 rasmusahome kernel:  sdd: sdd1
Dec 31 15:04:01 rasmusahome systemd-udevd[6341]: inotify_add_watch(9,
/dev/sdd1, 10) failed: No space left on device
Dec 31 15:05:43 rasmusahome kernel: BTRFS info (device sdb1): disk
added /dev/sdd1

Why is udev first complaining about no space left on sdd1, but then
it's being added to the btrfs volume?
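One hedged observation (an assumption about this particular system, but consistent with inotify semantics generally): `inotify_add_watch()` returns ENOSPC when the calling user's inotify watch limit is exhausted, not when the device is out of disk space, so the udev complaint and the successful device add are not necessarily contradictory. The limit is readable from procfs:

```python
# Read the per-user inotify watch limit; ENOSPC from inotify_add_watch(2)
# signals this limit was hit, not that a filesystem is full. It can be
# raised with e.g.: sysctl fs.inotify.max_user_watches=<n>
with open("/proc/sys/fs/inotify/max_user_watches") as f:
    print(int(f.read().strip()))
```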



-- 
Chris Murphy


btrfs send fail and check hang

2016-01-01 Thread Alistair Grant
Hi,

When trying to send a snapshot I'm now getting errors such as:

ERROR: failed to open backups/xps13/@home/@home.20151229_13:43:09/alistair/.mozilla/firefox/yu3bxg7y.default/cookies.sqlite. No such file or directory

and

ERROR: could not find parent subvolume

This script has been running without a problem for several weeks.

I can reboot the system and the filesystem mounts without a problem.  I
can also navigate through the existing snapshots and access files
without any problem (these are all read-only snapshots, so I'm not
attempting to write anything).  There are no obvious errors in the
system log (I checked the log manually, and also have Marc Merlin's sec
script running to monitor for errors).

I tried running a read-only btrfs check, however it is hanging while
checking fs roots:

> sudo umount /srv/d2root
> sudo btrfs check /dev/sda
checking filesystem on /dev/sda
UUID: d8daaa62-afa2-4654-b7de-22fdc8456e03
checking extents
checking free space cache
checking fs roots
^C

Disk IO was several MB/s during the initial part of the check and
dropped to 0 on checking fs roots.  I left it for about 10 minutes
before interrupting.

The same happens for /dev/sdb.

General system information:

uname -a
Linux alarmpi 4.1.15-1-ARCH #1 SMP Tue Dec 15 18:39:32 MST 2015 armv7l
GNU/Linux


btrfs --version
btrfs-progs v4.3.1



mount | grep btrfs
/dev/sda on /srv/d2root type btrfs (rw,noatime,compress-force=zlib,space_cache)


> sudo btrfs fi show /srv/d2root
Label: 'data2'  uuid: d8daaa62-afa2-4654-b7de-22fdc8456e03
Total devices 2 FS bytes used 117.34GiB
devid 1 size 1.82TiB used 118.03GiB path /dev/sda
devid 2 size 1.82TiB used 118.03GiB path /dev/sdb



> sudo btrfs fi df /srv/d2root
Data, RAID1: total=117.00GiB, used=116.76GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=595.36MiB
GlobalReserve, single: total=208.00MiB, used=0.00B


> sudo btrfs fi usage /srv/d2root
Overall:
Device size:   3.64TiB
Device allocated:236.06GiB
Device unallocated:3.41TiB
Device missing:  0.00B
Used:234.68GiB
Free (estimated):  1.70TiB  (min: 1.70TiB)
Data ratio:   2.00
Metadata ratio:   2.00
Global reserve:  208.00MiB  (used: 0.00B)

Data,RAID1: Size:117.00GiB, Used:116.76GiB
   /dev/sda  117.00GiB
   /dev/sdb  117.00GiB

Metadata,RAID1: Size:1.00GiB, Used:595.36MiB
   /dev/sda1.00GiB
   /dev/sdb1.00GiB

System,RAID1: Size:32.00MiB, Used:48.00KiB
   /dev/sda   32.00MiB
   /dev/sdb   32.00MiB

Unallocated:
   /dev/sda1.70TiB
   /dev/sdb1.70TiB
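Rough arithmetic for the RAID1 numbers above (a sketch): with every extent stored twice, usable free space is about half of the unallocated bytes across both devices.

```python
# RAID1 keeps two copies, so divide total unallocated space by the data
# ratio to approximate the usable free space reported above.
unallocated_total = 1.70 + 1.70   # TiB, per-device "Unallocated" lines
data_ratio = 2.0                  # from "Data ratio: 2.00"
free_estimated = unallocated_total / data_ratio
print(f"~{free_estimated:.2f}TiB")   # cf. "Free (estimated): 1.70TiB"
```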


Help!

Many Thanks,
Alistair



Re: btrfs scrub failing

2016-01-01 Thread Martin Steigerwald
Am Freitag, 1. Januar 2016, 13:20:49 CET schrieb John Center:
> > On Jan 1, 2016, at 12:41 PM, Martin Steigerwald 
> > wrote:
> > Am Freitag, 1. Januar 2016, 11:41:20 CET schrieb John Center:
[…]
> >>> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> >>> 
> >>> A couple months ago, which would have made it around the 4.2 kernel
> >>> you're running (with 4.3 being current and 4.4 nearly out), there were a
> >>> number of similar scrub aborted reports on the list.
> >> 
> >> I must have missed that, I'll check the list again to try & understand
> >> the issue better.
> > 
> > I had repeatedly failing scrubs as mentioned in another thread here, until
> > I used the 4.4 kernel. With the 4.3 kernel scrub also didn't work. I didn't
> > use the debug options you used above, and I am not sure whether I had this
> > scrub issue with 4.2 already, so I am not sure it has been the same
> > issue. But you may need to run a 4.4 kernel in order to get scrub working
> > again.
> > 
> > See my thread "[4.3-rc4] scrubbing aborts before finishing" for details.
> 
> I was afraid of this. I just read your thread. I generally try to stay away
> from kernels so new, but I may have to try it. Was there any reason you
didn't go to 4.1 instead?  (I run win8.1 in VirtualBox 5.0.12 when I need
to run some things under Windows. I'd have to wait until 4.4 is released &
> supported to do that.)

So far 4.4-rc6 is pretty stable for me. And I think it's almost at release,
as rc7 is out already.

Reason for not going with 4.1? Ey, that would be downgrading, wouldn't it? But
sure, it is also an option.

Virtualbox 5.0.12-dfsg-2 as packaged by Debian runs fine here with 4.4-rc6.

Thanks,
-- 
Martin


Re: Add big device, remove small device, read-only

2016-01-01 Thread Rasmus Abrahamsen
I accidentally sent my messages directly to Duncan, so I am copying them
in here.

Hello Duncan,

Thank you for the amazing response. Wow, you are awesome.

Here is the output of fi show, fi df and mount, sorry for not providing
them to begin with:

http://pastebin.com/DpiuDvRy


> On 01 Jan 2016, at 17:39, Duncan <1i5t5.dun...@cox.net> wrote:
> 
> Rasmus Abrahamsen posted on Fri, 01 Jan 2016 12:47:08 +0100 as excerpted:
> 
>> Happy New Year!
>> 
>> I have a raid with a 1TB, .5TB, 1.5TB and recently added a 4TB and want
>> to remove the 1.5TB. When saying btrfs dev delete it turned into
>> readonly. I am on 4.2.5-1-ARCH and btrfs-progs v4.3.1 what can I do?
> 
> This isn't going to help with the specific problem, and doesn't apply to 
> your case now anyway as the 4 TB device has already been added so all 
> you're doing now is deleting the old one, but FWIW...
> 
> There's a fairly new command, btrfs replace, that can be used to directly 
> replace an old device with a new one, instead of doing btrfs device add, 
> followed by btrfs device delete/remove.
> 
>> On top of that, my linux is on this same raid, so perhaps btrfs is
>> writing some temp files in the filesystem but cannot?
>> . /dev/sdc1 on / type btrfs
>> (ro,relatime,space_cache,subvolid=1187,subvol=/linux)
> 
> Your wording leaves me somewhat confused.  You say your Linux, presumably 
> your root filesystem, is on the same raid as the filesystem that is 
> having problems.  That would imply that it's a different filesystem, 
> which in turn would imply that the raid is below the filesystem level, 
> say mdraid, dmraid, or hardware raid, with both your btrfs root 
> filesystem, and the separate btrfs with the problems, on the same raid-
> based device, presumably partitioned so you can put multiple filesystems 
> on the same device.
> 
> Which of course would generally mean the two btrfs themselves aren't 
> raid, unless of course you are using at least one non-btrfs raid as one 
> device under a btrfs raid.  But while implied, that's not really 
> supported by what you said, which suggests a single btrfs raid 
> filesystem, instead.  In which case, perhaps you meant that this 
> filesystem contains your root filesystem as well, not just that the raid 
> contains it.
> 
> Of course, if your post had included the usual btrfs fi show and btrfs fi 
> df (and btrfs fi usage would be good as well) that the wiki recommends be 
> posted with such reports, that might make things clearer, but it doesn't, 
> so we're left guessing...
> 
> But I'm assuming you meant a single multi-device btrfs, not multiple 
> btrfs that happen to be on the same non-btrfs raid.

My root / is a sub volume of my filesystem. I will only be talking about
the filesystem named “Fortune”, the one named “Glassbox” is not relevant
to this problem.

> 
> Another question the show and df would answer is what btrfs raid mode 
> you're running.  The default for multiple device btrfs is of course raid1 
> metadata and single mode data, but you might well have set it up with 
> data and metadata in the same mode, and/or with raid0/5/6/10 for one or 
> both data and metadata.  You didn't say and didn't provide the btrfs 
> command output that would show it, so...
> 
>>  Ralle: did you do balance before removing?
>> 
>> I did not, but I have experience with it balancing itself upon doing so.
>> Upon removing a device, that is.
>> I am just not sure how to proceed now that everything is read-only.
> 
> You were correct in that regard.  btrfs device remove (or btrfs replace) 
> triggers a balance as part of the process, and balancing after adding a 
> device, only to have balance trigger again with a delete/remove, is 
> needless.
> 
> Actually, I suspect the remove-triggered balance ran across a problem it 
> didn't know how to handle when attempting to move one of the chunks from 
> the existing device, and that's what put the filesystem in read-only 
> mode.  That's usually what happens when btrfs device remove triggers 
> problems and people report it, anyway.  A balance before the remove would 
> have simply triggered it then, anyway.
> 
> But what the specific problem is, and what to do about it, remains to be 
> seen.  Having that btrfs fi show and btrfs fi df would be a good start, 
> letting us know at least what raid type we're dealing with, etc.
> 
>>  I hope that you have backups?
>> 
>> I do have backups, but it's on Crashplan, so I would prefer not to have
>> to go there.
> 
> That's wise, both him asking and you replying you already have them, but 
> just want to avoid using them if possible.  Waaayyy too many folks 
> posting here find out the hard way about the admin's first rule of 
> backups, in simplified form, that if you don't have backups, you are 
> declaring by your actions that the data not backed up is worth less to 
> you than the time, resources and hassle required to do those backups, 
> despite any after-the-fact protests to the contrary.  Not being in that 
> 

Re: Unrecoverable fs corruption?

2016-01-01 Thread Christoph Anton Mitterer
On Fri, 2016-01-01 at 08:13 +, Duncan wrote:
> you can also try a read-only scrub
OT: I just wondered, would a balance include everything a scrub
includes (i.e. read+verify all data and rebuild an errors on different
devices / block copies)... of course in addition to also copying all
"good" data... and perhaps with the difference, that you don't get that
detailed information as in scrub but only the kernel log messages about
errors?

> In this case, you'll need to recover from the degraded-mount working
> device as if the second one had entirely failed.
> 
> What I'd do in this case, if you haven't done so already, is that
> read-only btrfs scrub, just to see where you are in terms of corruption
> on the remaining device.
I don't think that this is the best order of steps - at least not
when it's about precious data.

Doing a scrub at this phase would just read all the data and tell you the
status,... but first you should try to copy off as much as possible (just
in case the remaining good drive fails as well) and *then* do the scrub
to see what's actually good or not.


Alternatively the first step could be backing up to another drive in
the sense of dd-copy (beware of the problem of UUID collisions in
btrfs: you MUST make sure here that the kernel doesn't see[0] devices
with the same IDs, which is of course the case with dd, unless you
write to e.g. an image file and not a device)

This has advantages and disadvantages:
- a btrfs rebuild would only rebuild those blocks that are actually
used... so you need fewer reads from a possibly soon-to-be-dying
device
- OTOH, you only copy the blocks which btrfs thinks are actually
used,... and if it later turns out that there are filesystem
corruptions in these, you don't have any other areas (with possibly
older data) where you could try some last-resort recoveries...
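The dd-to-image approach above can be sketched like this (SRC and DST are placeholders, not paths from this thread; the defaults only exist so the sketch is runnable as a no-op):

```shell
# Image the ailing device to a *file*, never to a second raw device, so
# the kernel cannot see two btrfs devices carrying the same UUID.
SRC="${SRC:-/dev/null}"                       # placeholder, e.g. /dev/sdb1
DST="${DST:-$(mktemp /tmp/disk-img.XXXXXX)}"  # image file, not a device
dd if="$SRC" of="$DST" bs=4M conv=noerror,sync
# later, inspect read-only via a loop device:
#   mount -o loop,ro "$DST" /mnt/rescue
```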



Cheers,
Chris.



Re: Btrfs Check - "type mismatch with chunk"

2016-01-01 Thread Christoph Anton Mitterer
On Fri, 2015-12-25 at 08:06 +, Duncan wrote:
> I wasn't personally sure if 4.1 itself was affected or not, but the wiki
> says don't use 4.1.1 as it's broken with this bug, with the quick-fix in
> 4.1.2, so I /think/ 4.1 itself is fine.  A scan with a current btrfs
> check should tell you for sure.  But if you meant 4.1.1 and only typed
> 4.1, then yes, better redo.
What exactly was that bug in 4.1.1 mkfs and how would one notice that
one suffers from it?
I created a number of personal filesystems that I use "productively"
and I'm not 100% sure during which version I've created them... :/

Is there some easy way to find out, like a fs creation timestamp?


> Unfortunately, the btrfs-convert bug isn't as nailed down, but
> btrfs-convert has a warning up on the wiki anyway, as currently being
> buggy and not reliable.
I hope I don't step on anyone's toes who puts effort into this, but with
all due respect,... I think the whole convert thing is at least in
parts a waste of manpower - or perhaps better said: it would be nice to
have, but given the lack of manpower in btrfs development and the
numerous areas[0] that need urgent and probably lots of
care,... having a convert from another fs to btrfs seems like a luxury
that isn't really needed.

People don't choose btrfs because they can easily convert, I'd guess.
Either they'll choose it (at some time in the future) because it's the
default in distros then,... or they choose it already nowadays because
it's already pretty great and has awesome features.

The conversion is always a thing which at best works and doesn't make
things worse[1],... in practice though it's rather likely to fail than
to work, because the convert tools would need to keep up with
developments on both sides (btrfs and e.g. ext) forever.

If people want to change their fs, they should simply copy the data
from one to the other.
Saves a lot of time for the devs :-)



Cheers,
Chris.


[0] From the really awful things like the UUID collision->corruption
issues,... over the pretty serious things like all the missing RAID
functionality (just look at the recent reports at the list where even
RAID1 seems to be far from production ready, not to talk about 5/6),...
and many other bugs (like all the recent error reports about non
working scrubs, etc.)... to the really strongly desired wishlist features

[1] It's kinda like the situation when many photographers think it
makes sense to convert their XYZ RAW format to DNG, which is IMHO
inherently stupid. At best all information from XYZ would be preserved
(which is however unlikely); at worst you lose information.
And since there are good readers (dcraw) for basically all RAW formats
there's really not much need to convert it to DNG.
(Which doesn't mean I wouldn't like DNG, but it only makes sense if the
camera does it natively.)



Re: btrfs scrub failing

2016-01-01 Thread John Center
Hi Duncan,

On Fri, Jan 1, 2016 at 12:05 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted:
>
>> If this doesn't resolve the problem, what would you recommend my next
>> steps should be?  I've been hesitant to run too many of the btrfs-tools,
>> mainly because I don't want to accidentally screw things up & I don't
>> always know how to interpret the results. (I ran btrfs-debug-tree,
>> hoping something obvious would show up.  Big mistake. )
>
> LOLed at that debug-tree remark.  Been there (with other tools) myself.
>
> Well, I'm hoping someone who had the problem can confirm whether it's
> fixed in current kernels (scrub is one of those userspace commands that's
> mostly just a front-end to the kernel code which does the real work, so
> kernel version is the important thing for scrub).  I'm guessing so, and
> that you'll find the problem gone in 4.3.
>
> We'll cross the not-gone bridge if we get to it, but again, if the other
> people who had the similar problem can confirm whether it disappeared for
> them with the new kernel, it would help a lot, as there were enough such
> reports that if it's the same problem and still there for everyone (which
> I doubt as I expect there'd still be way more posts about it if so, but
> confirmation's always good), nothing to do but wait for a fix, while if
> not, and you still have your problem, then it's a different issue and the
> devs will need to work with you on a fix specific to your problem.
>
Ok, I'm at the next bridge. :-(  I upgraded the kernel to 4.4-rc7 from
the Ubuntu Mainline archive & I just ran the scrub:

john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md125p2
ERROR: scrubbing /dev/md125p2 failed for device id 1: ret=-1, errno=5
(Input/output error)
scrub device /dev/md125p2 (id 1) canceled
scrub started at Fri Jan  1 19:38:21 2016 and was aborted after 00:02:34
data_extents_scrubbed: 111031
tree_extents_scrubbed: 104061
data_bytes_scrubbed: 2549907456
tree_bytes_scrubbed: 1704935424
read_errors: 0
csum_errors: 0
verify_errors: 0
no_csum: 1573
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 0
unverified_errors: 0
corrected_errors: 0
last_physical: 4729667584

I checked dmesg & this appeared:

[11428.983355] BTRFS error (device md125p2): parent transid verify
failed on 241287168 wanted 33554449 found 17
[11431.028399] BTRFS error (device md125p2): parent transid verify
failed on 241287168 wanted 33554449 found 17

Where do I go from here?

Thanks for your help.

-John


Re: btrfs scrub failing

2016-01-01 Thread John Center
Hi Duncan,

Doing some more digging, I ran btrfs-image & found the following
errors.  I'm not sure how useful this is, or what this means in terms
of the other btrfs-tools messages.  Maybe more clues?

Thanks.

-John

john@mariposa:~$ sudo btrfs-image -c9 -t4 /dev/md125p2
/media/data1/btrfs.image-01012016
WARNING: The device is mounted. Make sure the filesystem is quiescent.
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
parent transid verify failed on 337676075008 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
parent transid verify failed on 337674846208 wanted 1036370 found 1036377
Ignoring transid failure
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
parent transid verify failed on 337675403264 wanted 1036370 found 1036377
Ignoring transid failure
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
parent transid verify failed on 337681907712 wanted 1036375 found 1036377
Ignoring transid failure
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
parent transid verify failed on 337646354432 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
parent transid verify failed on 337679597568 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
parent transid verify failed on 337679613952 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
parent transid verify failed on 337679745024 wanted 1036372 found 1036377
Ignoring transid failure
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
parent transid verify failed on 337682022400 wanted 1036369 found 1036377
Ignoring transid failure
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
parent transid verify failed on 337679712256 wanted 1036373 found 1036377
Ignoring transid failure
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
parent transid verify failed on 337647927296 wanted 1036368 found 1036377
Ignoring transid failure
parent transid verify failed on 337683251200 wanted 1036372 found 1036377
parent transid verify failed on 337683251200 wanted 1036372 found 1036377
parent transid verify failed on 337683251200 wanted 1036372 found 1036377
parent transid verify failed on 337683251200 wanted 1036372 found 1036377
Ignoring transid failure
parent transid verify failed on 337683398656 wanted 1036372 found 1036377
parent transid verify failed on 337683398656 wanted 1036372 found 1036377
parent transid verify failed on 337683398656 wanted 1036372 found 1036377
parent transid verify failed on 337683398656 wanted 1036372 found 1036377
Ignoring transid failure
parent transid verify failed on 337682432000 wanted 1036369 found 1036377
parent transid verify failed on 337682432000 wanted 1036369 found 1036377
parent transid verify failed on 337682432000 wanted 1036369 found 1036377
parent transid verify failed on 337682432000 wanted 1036369 found 1036377
Ignoring transid failure
parent transid verify failed on 337633558528 wanted 1036368 found 1036377
parent transid verify failed on 337633558528 

Re: Btrfs send / receive freeze system?

2016-01-01 Thread Henk Slager
On Fri, Jan 1, 2016 at 4:51 PM, fugazzi®  wrote:
> Hi everyone.
> It's a few weeks that I converted my root partition into btrfs with three sub-
> volumes named boot,root,home. I'm booting with /boot on subvol.
>
> I'm using btrfs send and receive to make backup of the three snapshotted
> subvolumes on a second btrfs formatted drive with three commands like this:
>
> btrfs send /btrfs-root/snap-root/ | gzip > $BKFOLDER/root.dump.gz
>
> Sometimes, let say once a week, the system completely freeze (mouse keyboard)
> during the send, only solution was the reset button. The freeze happened on
> different place every time. Last time happened at 80% of the home send for
> example.
>
> The freeze also happened during a send/receive from an external e-sata drive
> (to copy some mp3 using send instead of rsync) to the same internal drive
> where the backup are also made.
>
> The system always run and was stable with XFS/xfsdump.
>
> Kernel is 4.3.3, btrfs progs are 4.3.1, system is Arch Linux 64 bit, Ram 8Gb
> Mainboard Asus striker extreme Nvidia 680i, 8 years old.
>
> After the crash nothing is shown in the systemd log, it simply freeze.

You could try this (maybe you have already) and see if and where the
problem is in btrfs:
https://en.m.wikipedia.org/wiki/Magic_SysRq_key

Instead of piping to gzip you could do:
| btrfs receive -vv 
and trace back which file the freeze happens on. Maybe that tells something
about the source filesystem.
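One way to capture that is to tee the receive log to a file and then pull out the last file operation recorded before the freeze. A sketch (the operation names mkfile/mkdir/rename/write/utimes are assumed from receive -vv output; the log below is simulated, not from a real run):

```shell
# Print the last file operation recorded in a saved `btrfs receive -vv` log,
# e.g. from: btrfs send "$SRC" | btrfs receive -vv "$DST" 2>&1 | tee receive.log
last_received() {
    grep -E '^(mkfile|mkdir|rename|write|utimes) ' "$1" | tail -n 1
}

# Simulated log tail, as if the machine froze right after this rename:
cat > /tmp/receive.log <<'EOF'
mkdir music
mkfile o257-11-0
rename o257-11-0 -> music/track02.mp3
EOF

last_received /tmp/receive.log    # -> rename o257-11-0 -> music/track02.mp3
```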


Re: btrfs send clone use case

2016-01-01 Thread Duncan
Chris Murphy posted on Thu, 31 Dec 2015 13:54:42 -0700 as excerpted:

> I haven't previously heard of this use case for -c option.
> It seems to work (no errors or fs weirdness afterward).
> 
> The gist:
> send a snapshot from drive 1 to drive 2;
> rw snapshot of the drive 2 copy,
> and then make changes to it,
> then make an ro snapshot;
> now send it back to drive 1 *as an incremental* send.
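Spelled out as commands, the gist looks something like the following (mount points /d1, /d2 and snapshot names are hypothetical, and the commands are echoed rather than executed here):

```shell
# Dry-run sketch of the -c round trip; replace run() with real execution.
run() { echo "+ $*"; }

run "btrfs send /d1/snapA | btrfs receive /d2"               # full send to drive 2
run "btrfs subvolume snapshot /d2/snapA /d2/work"            # rw snapshot on drive 2
run "touch /d2/work/some-change"                             # ...make changes...
run "btrfs subvolume snapshot -r /d2/work /d2/snapB"         # ro snapshot of the result
run "btrfs send -c /d2/snapA /d2/snapB | btrfs receive /d1"  # incremental back to drive 1
```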

While as you likely know my own use-case doesn't use send/receive, based 
on previous on-list discussion, I considered this the obvious workaround 
to the problem of the current send stream format not including enough 
inheritance metadata to allow send/receive to properly handle a /reverse/ 
send -p.

Where -p works, it's the most efficient method, but due to this lack of 
send-stream inheritance metadata, it apparently can't work in the reverse 
case, where the usual receive end is now the send end.

But doing -c clones, while not /quite/ as efficient as -p because more
metadata is sent, is still far more efficient than doing a full send, and
it can work in this reverse case, where the original send side is now the
receive side, because -c is not as strict as -p: it is more metadata-verbose
in place of that strictness, where the -p option would fail due to that
strictness and the lack of appropriate inheritance metadata in the stream
format.

That -p mode missing inheritance metadata, being effectively just one 
more item, would still be much more efficient than using -c clones, as 
the clone format is generally more metadata-verbose in order to 
properly identify per-extent clones, but it's simply not there in the 
current format.  When the send format is eventually version-bumped, this 
additional metadata item should be included, making send -p work in these 
reverse-send cases, but they ideally want to do just one more "final"
send-stream format bump including all changes they've found to be needed, 
so they're holding off on the format bump for the moment, so as to be 
able to include anything else they've overlooked when they do finally do 
it.

That's as I understand the state of send/receive, anyway, being 
interested in it on-list, but not being a current user.  But this usage 
of -c being almost precisely that "reverse-send" usage, only with an 
additional change thrown in at the normal receive side before the send, 
I'd actually have been surprised if it /didn't/ work as you outlined. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



still kworker at 100% cpu in all of device size allocated with chunks situations with write load

2016-01-01 Thread Martin Steigerwald
First: Happy New Year to you!

Second: Take your time. I know its holidays for many. For me it means I easily 
have time to follow-up on this.

Am Mittwoch, 16. Dezember 2015, 09:20:45 CET schrieb Qu Wenruo:
> Chris Mason wrote on 2015/12/15 16:59 -0500:
> > On Mon, Dec 14, 2015 at 10:08:16AM +0800, Qu Wenruo wrote:
> >> Martin Steigerwald wrote on 2015/12/13 23:35 +0100:
> >>> Hi!
> >>> 
> >>> For me it is still not production ready.
> >> 
> >> Yes, this is the *FACT*, and no one has a good reason to deny it.
> >> 
> >>> Again I ran into:
> >>> 
> >>> btrfs kworker thread uses up 100% of a Sandybridge core for minutes on
> >>> random write into big file
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=90401
> >> 
> >> Not sure about the guidelines for other filesystems, but it will attract
> >> more devs' attention if it is posted to the mailing list.
> >> 
> >>> No matter whether SLES 12 uses it as default for root, no matter whether
> >>> Fujitsu and Facebook use it: I will not let this onto any customer
> >>> machine
> >>> without lots and lots of underprovisioning and rigorous free space
> >>> monitoring. Actually I will renew my recommendations in my trainings to
> >>> be careful with BTRFS.
> >>> 
> >>>  From my experience the monitoring would check for:
> >>> merkaba:~> btrfs fi show /home
> >>> Label: 'home'  uuid: […]
> >>> 
> >>>  Total devices 2 FS bytes used 156.31GiB
> >>>  devid1 size 170.00GiB used 164.13GiB path
> >>>  /dev/mapper/msata-home
> >>>  devid2 size 170.00GiB used 164.13GiB path
> >>>  /dev/mapper/sata-home
> >>> 
> >>> If "used" is the same as "size" then raise a big fat alarm. That
> >>> condition alone is not sufficient for the problem to happen. It can run
> >>> for quite some time just fine without any issues, but I have never seen
> >>> a kworker thread using 100% of one core for an extended period of time,
> >>> blocking everything else on the fs, without this condition being met.
> >> 
> >> And some specific advice on device size from myself:
> >> Don't use devices over 100G but less than 500G.
> >> Over 100G leads btrfs to use big chunks, where data chunks can be at
> >> most 10G and metadata chunks 1G.
> >> 
> >> I have seen a lot of users with 100~200G devices hit unbalanced chunk
> >> allocation (a 10G data chunk easily takes the last available space,
> >> leaving later metadata nowhere to be stored).
> > 
> > Maybe we should tune things so the size of the chunk is based on the
> > space remaining instead of the total space?
> 
> I submitted such a patch before.
> David pointed out that such behavior would cause a lot of small
> fragmented chunks in the last several GB,
> which may make balance behavior less predictable than before.
> 
> 
> At least, we could change the current 10% chunk size limit to 5% to
> make the problem harder to trigger.
> It's a simple and easy solution.
> 
> Another cause of the problem is that we underestimated the chunk-size
> change for filesystems at the big-chunk borderline.
> 
> For a 99G filesystem, the chunk size limit is 1G, and it needs 99 data
> chunks to fully cover the fs.
> But a 100G filesystem only needs 10 chunks to cover the fs,
> and it would need to be 990G to match that chunk count again.
> 
> The sudden drop in chunk count is the root cause.
> 
> So we'd better reconsider both the big-chunk size limit and the chunk
> size limit to find a balanced solution.
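Qu's arithmetic can be sketched as a toy model (assuming, per the discussion, 1GiB data chunks below the 100GiB big-chunk threshold and 10GiB chunks at or above it; the real allocator policy has more cases):

```shell
# Data chunks needed to fully cover a filesystem of the given size.
chunks_to_fill() {
    local total_gib=$1 chunk
    if [ "$total_gib" -ge 100 ]; then chunk=10; else chunk=1; fi
    echo $(( (total_gib + chunk - 1) / chunk ))    # ceiling division
}

echo "99G  -> $(chunks_to_fill 99) chunks"    # 99
echo "100G -> $(chunks_to_fill 100) chunks"   # 10: the sudden drop
echo "990G -> $(chunks_to_fill 990) chunks"   # 99 again
```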

Did you come to any conclusion here? Is there anything I can change with my 
home BTRFS filesystem to try to find out what works? The challenge here is that 
it doesn't happen under defined circumstances. So far I only know the required 
condition, but not the sufficient condition, for it to happen.

Another user ran into the issue and reported his findings in the bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=90401#c14
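The "used == size" alarm described above can be sketched as a small awk check over `btrfs fi show` output (field positions assumed from the output format quoted earlier; the sample below is pasted in rather than taken from a live command):

```shell
# Flag any device whose allocated ("used") space equals its size.
check_full() {
    awk '/devid/ && $4 == $6 { print "ALARM: " $NF " fully allocated (" $4 ")" }'
}

# In real monitoring: btrfs fi show /home | check_full
check_full <<'EOF'
Label: 'home'  uuid: 0000
 Total devices 2 FS bytes used 156.31GiB
 devid 1 size 170.00GiB used 170.00GiB path /dev/mapper/msata-home
 devid 2 size 170.00GiB used 164.13GiB path /dev/mapper/sata-home
EOF
```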

Thanks,
-- 
Martin


Add big device, remove small device, read-only

2016-01-01 Thread Rasmus Abrahamsen
Happy New Year!

I have a raid with a 1TB, a .5TB and a 1.5TB drive; I recently added a 4TB and 
want to remove the 1.5TB. When I ran btrfs dev delete, the filesystem turned 
read-only. I am on 4.2.5-1-ARCH and btrfs-progs v4.3.1; what can I do?
On top of that, my Linux install is on this same raid, so perhaps btrfs needs 
to write some temp files to the filesystem but cannot?
. /dev/sdc1 on / type btrfs 
(ro,relatime,space_cache,subvolid=1187,subvol=/linux)

 Ralle: did you do balance before removing?

I did not, but I have experience with it balancing itself upon doing so.
Upon removing a device, that is.
I am just not sure how to proceed now that everything is read-only.

 I dunno, but generally with things that can crumble you should make it 
step-by-step
crumble?
 which, in combination arch + newest bleeding edge kernel should be 
mandatory
 I hope that you have backups?

I do have backups, but it's on Crashplan, so I would prefer not to have to go 
there.

 and do you have any logs?

Where would those be?
I never understood journalctl

 journalctl --since=today

Hmm, it was actually yesterday that I started the remove, so I did 
--since=yesterday
I am looking at the log now, please stand by.
This is my log
http://pastebin.com/mCPi3y9r
But I fear that it became read-only before actually writing the error to the 
filesystem





Re: Unrecoverable fs corruption?

2016-01-01 Thread Duncan
Chris Murphy posted on Thu, 31 Dec 2015 18:22:09 -0700 as excerpted:

> On Thu, Dec 31, 2015 at 4:36 PM, Alexander Duscheleit
>  wrote:
>> Hello,
>>
>> I had a power fail today at my home server and after the reboot the
>> btrfs RAID1 won't come back up.
>>
>> When trying to mount one of the 2 disks of the array I get the
>> following error:
>> [ 4126.316396] BTRFS info (device sdb2): disk space caching is enabled
>> [ 4126.316402] BTRFS: has skinny extents
>> [ 4126.337324] BTRFS: failed to read chunk tree on sdb2
>> [ 4126.353027] BTRFS: open_ctree failed
> 
> 
> Why are you trying to mount only one? What mount options did you use
> when you did this?

Yes, please.

>> btrfs restore -viD seems to find most of the files accessible but since
>> I don't have a spare hdd of sufficient size I would have to break the
>> array and reformat and use one of the disk as restore target. I'm not
>> prepared to do this before I know there is no other way to fix the
>> drives since I'm essentially destroying one more chance at saving the
>> data.

> Anyway, in the meantime, my advice is do not mount either device rw
> (together or separately). The less changes you make right now the
> better.
> 
> What kernel and btrfs-progs version are you using?

Unless you've already tried it (hard to say without the mount options you 
used above), I'd first try a different tack than C Murphy suggests, 
falling back to what he suggests if it doesn't work.  I suppose he 
assumes you've already tried this...

But first things first, as C Murphy suggests, when you post problems like 
this, *PLEASE* post kernel and progs userspace versions.  Given the rate 
at which btrfs is still changing, that's pretty critical information.  
Also, if you're not running the latest or second latest kernel or LTS 
kernel series and a similar or newer userspace, be prepared to be asked 
to try a newer version.  With the almost released 4.4 set to be an LTS, 
that means it if you want to try it, or the LTS kernel series 4.1 and 
3.18, or the current or previous current kernel series 4.3 or 4.2 (tho 
with 4.2 not being an LTS updates are ended or close to it, so people on 
it should be either upgrading to 4.3 or downgrading to 4.1 LTS anyway).  
And for userspace, a good rule of thumb is whatever the kernel series, a 
corresponding or newer userspace as well.

With that covered...

This is a good place to bring in something else CM recommended, but in a 
slightly different context.  If you've read many of my previous posts 
you're likely to know what I'm about to say.  The admin's first rule of 
backups says, in simplest form[1], that if you don't have a backup, by 
your actions you're defining the data that would be backed up as not 
worth the hassle and resources to do that backup.  If in that case you 
lose the data, be happy, as you still saved what you defined by your 
actions as of /true/ value regardless of any claims to the contrary, the 
hassle and resources you would have spent making that backup.  =:^)

While the rule of backups applies in general, for btrfs it applies even 
more, because btrfs is still under heavy development and while btrfs is 
"stabilizING", it's not yet fully stable and mature, so the risk of 
actually needing to use that backup remains correspondingly higher than 
it'd ordinarily be.

But, you didn't mention having backups, and did mention that you didn't 
have a spare hdd so would have to break the array to have a place to do a 
btrfs restore to, which reads very much like you don't have ANY BACKUPS 
AT ALL!!

Of course, in the context of the above backups rule, I guess you 
understand the implications, that you consider the value of that data 
essentially throw-away, particularly since you still don't have a backup, 
despite running a not entirely stable filesystem that puts the data at 
greater risk than would a fully stable filesystem.

Which means no big deal.  You've obviously saved the time, hassle and 
resources necessary to make that backup, which is obviously of more value 
to you than the data that's not backed up, so the data is obviously of 
low enough value you can simply blow away the filesystem with a fresh 
mkfs and start over. =:^)

Except... were that the case, you probably wouldn't be posting.

Which brings entirely new urgency to what CM said about getting that 
spare hdd, so you can actually create that backup, and count yourself 
very lucky if you don't lose your data before you have it backed up, 
since your previous actions were unfortunately not in accordance with the 
value you seem to be claiming for the data.

OK, the rest of this post is written with the assumption that your claims 
and your actions regarding the value of the data in question, agree, and 
that since you're still trying to recover the data, you don't consider it 
just throw-away, which means you now have someplace to put that backup, 
should you actually be lucky enough to get the chance to make 

how btrfs uses devid?

2016-01-01 Thread UGlee
Dear all:

If a btrfs device is missing, the command-line tool tells the user the
devid of the missing device.

I understand that each device (disk) in a btrfs volume is assigned a
uuid (the UUID_SUB field in udevadm info output). If the device is
missing, it's hard to ask the user to type such a uuid string on the
command line, so the devid is a convenience.

In our product, we want to record all of a volume's disk information in
a file. A disk may be missing not because it's broken, but because the
user has so many disks that in some cases they may put back the wrong
one. In this scenario, we can show the disk information (such as the
serial number) to the user and help them check whether they did
something wrong.

My question is: is the devid just an alias for the sub-uuid? For a given
disk device, is it never changed by any btrfs operation, including add,
remove, balance and replace? Or may it be changed, and when?

One more question, just out of curiosity. I briefly checked the source
code of btrfs-progs. There seems to be no data structure in the
superblock recording all sub-uuids or all devids for the volume, so how
does btrfs figure out which devid is missing? They are not always
sequential integers; for example, after one device is removed, its devid
is simply gone and the devids of the other devices are not renumbered.

matianfu


cannot repair filesystem

2016-01-01 Thread Jan Koester
Hi,

If I try to repair the filesystem I get an assert. I use RAID6.

Linux dibsi 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt4-3~bpo70+1 
(2015-02-12) x86_64 GNU/Linux

root@dibsi:~/btrfs-progs# btrfs fi show
Label: none  uuid: 6a2f3936-d0ef-43c0-9815-41e24f2bc21a
Total devices 1 FS bytes used 26.63GiB
devid1 size 111.70GiB used 49.04GiB path /dev/sdf2

Label: none  uuid: 73d4dc77-6ff3-412f-9b0a-0d11458faf32
Total devices 5 FS bytes used 1.17TiB
devid1 size 931.51GiB used 420.78GiB path /dev/sdb
devid2 size 931.51GiB used 420.78GiB path /dev/sdc
devid3 size 931.51GiB used 420.78GiB path /dev/sdd
devid4 size 931.51GiB used 420.78GiB path /dev/sde
devid5 size 931.51GiB used 420.78GiB path /dev/sda

root@dibsi:~/btrfs-progs# btrfs check --repair 
/dev/disk/by-uuid/73d4dc77-6ff3-412f-9b0a-0d11458faf32 
enabling repair mode
parent transid verify failed on 2280450637824 wanted 861168 found 860380
parent transid verify failed on 2280450637824 wanted 861168 found 860380
checksum verify failed on 2280450637824 found BF5F5D16 wanted AE725F92
checksum verify failed on 2280450637824 found BF5F5D16 wanted AE725F92
bytenr mismatch, want=2280450637824, have=15938376490240
repair mode will force to clear out log tree, Are you sure? [y/N]: y
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223
bytenr mismatch, want=2280260939776, have=15937230354176
parent transid verify failed on 2280260939776 wanted 861166 found 860368
parent transid verify failed on 2280260939776 wanted 861166 found 860368
checksum verify failed on 2280260939776 found 816E966C wanted CB60A223

Re: how btrfs uses devid?

2016-01-01 Thread Hugo Mills
On Fri, Jan 01, 2016 at 08:16:28PM +0800, UGlee wrote:
> Dear all:
> 
> If a btrfs device is missing, the command tool tells user the devid of
> the missing devices.
> 
> I understand that each device (disk) in a btrfs volume has been
> assigned a uuid (UUID_SUB field in udevadm info output). If the device
> is missing, it's hard to tell user to input such uuid string in
> command line. So devid is for convenience.

> In our product, we want to record all disk information of a volume in
> a file. If a disk is missing, not because it's broken, but because the
> user has so many disks and in some cases they may put back the wrong
> one. In this scenario, we can provide the disk information (such as
> serial number) to user and help them to check if they did something
> wrong.
> 
> My question is: is the devid just an alias to sub uuid? for a given
> disk device, it is never changed during any btrfs operation, including
> add, remove, balance and replace? or it may be changed, and when?

   Actually, devid is the ID that the FS uses internally in the device
tree to identify them. It's not just a convenience -- it's the
"official" identifier for the device within the filesystem.
 
> One more question just for curiosity. I checked the source code of
> btrfs-progs briefly. It seems that there is no data structure in
> superblock recording all sub-uuids or all devids for the volume, so
> how does btrfs figure out the missing devid? since they are not always
> sequential integers, for example, after one device is removed, the
> devid is simply removed and the devid of other device is not
> re-numbered.

   The devices that should be there (identified by devid) are listed
in the device tree. If one of those doesn't match up with a
currently-known device for that filesystem (as determined by btrfs dev
scan), then it's missing.

   Hugo.

-- 
Hugo Mills | I gave up smoking, drinking and sex once. It was the
hugo@... carfax.org.uk | scariest 20 minutes of my life.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Btrfs send / receive freeze system?

2016-01-01 Thread fugazzi®
Hi everyone.
A few weeks ago I converted my root partition to btrfs, with three
subvolumes named boot, root and home. I'm booting with /boot on a subvolume.

I'm using btrfs send and receive to make backups of the three snapshotted
subvolumes on a second btrfs-formatted drive, with three commands like this:

btrfs send /btrfs-root/snap-root/ | gzip > $BKFOLDER/root.dump.gz

Sometimes, let's say once a week, the system completely freezes (mouse and
keyboard) during the send; the only solution is the reset button. The freeze
happens in a different place every time. Last time it happened at 80% of the
home send, for example.

The freeze also happened during a send/receive from an external e-SATA drive
(copying some mp3s using send instead of rsync) to the same internal drive
where the backups are also made.

The system always ran and was stable with XFS/xfsdump.

Kernel is 4.3.3, btrfs-progs are 4.3.1, system is Arch Linux 64-bit, RAM 8GB,
mainboard Asus Striker Extreme (Nvidia 680i), 8 years old.

After the crash nothing is shown in the systemd log; it simply froze.

Thanks, regards,
Mario




Btrfs send / receive freeze system - Addendum

2016-01-01 Thread fugazzi®
Sorry, I forgot to mention that this freeze started happening after I converted
the backup drive to btrfs; before, it was XFS. So sending to the XFS drive
didn't cause the freeze, while sending with the same script to the
btrfs-formatted drive freezes the system. Kernel and btrfs-progs were the same.

Regards.


Re: btrfs scrub failing

2016-01-01 Thread John Center
Hi Duncan,

> On Jan 1, 2016, at 12:05 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> 
> John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted:
> 
>> Ok, I'll upgrade to 4.3 & see if that resolves the problem with
>> scrubbing.
>> I was wondering when I compiled the btrfs-tools if there would be a
>> problem with them not being in sync with the major kernel version.
> 
FWIW, newer (or older, as long as it's not too old, you don't need any 
features missing from the older version, and you're not trying to fix 
problems only the newer one can handle) versions of btrfs-progs should be 
> fine.  As a rule of thumb I recommend staying at least current to kernel 
> version, but that's a rule of thumb, primarily to prevent getting /too/ 
> old, only.  Both the btrfs-progs userspace and the kernel itself are 
> normally designed to be able to work with both older and newer versions 
> of the other one.
> 
> So userspace not being in sync with the kernel version shouldn't be a 
> problem.
> 
Ok, good to know. 

> Well, I'm hoping someone who had the problem can confirm whether it's 
> fixed in current kernels (scrub is one of those userspace commands that's 
> mostly just a front-end to the kernel code which does the real work, so 
> kernel version is the important thing for scrub).  I'm guessing so, and 
> that you'll find the problem gone in 4.3.
> 
I wasn't aware of this. Good to know. 

> We'll cross the not-gone bridge if we get to it, but again, if the other 
> people who had the similar problem can confirm whether it disappeared for 
> them with the new kernel, it would help a lot, as there were enough such 
> reports that if it's the same problem and still there for everyone (which 
> I doubt as I expect there'd still be way more posts about it if so, but 
> confirmation's always good), nothing to do but wait for a fix, while if 
> not, and you still have your problem, then it's a different issue and the 
> devs will need to work with you on a fix specific to your problem.
> 
Ok, understood. 

Thanks & Happy New Year!

-John


Re: Add big device, remove small device, read-only

2016-01-01 Thread Duncan
Rasmus Abrahamsen posted on Fri, 01 Jan 2016 12:47:08 +0100 as excerpted:

> Happy New Year!
> 
> I have a raid with a 1TB, a .5TB, and a 1.5TB drive; I recently added a 4TB
> and want to remove the 1.5TB. When I ran btrfs dev delete, the filesystem
> turned read-only. I am on 4.2.5-1-ARCH with btrfs-progs v4.3.1; what can I do?

This isn't going to help with the specific problem, and it doesn't apply to 
your case now anyway, since the 4 TB device has already been added and all 
you're doing now is deleting the old one, but FWIW...

There's a fairly new command, btrfs replace, that can be used to directly 
replace an old device with a new one, instead of doing btrfs device add, 
followed by btrfs device delete/remove.
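A sketch of that replace workflow follows. The device names and mount point are hypothetical, and the run() wrapper only prints each command rather than executing it, so this is a safe dry run; swap in `run() { "$@"; }` and run as root to do it for real:

```shell
# Hypothetical devices and mount point; adjust to the actual filesystem.
OLD_DEV=/dev/sdb    # device to retire
NEW_DEV=/dev/sde    # replacement device
MNT=/mnt/raid       # mount point of the btrfs filesystem

# Dry-run wrapper: prints the command instead of running it.
run() { echo "+ $*"; }

# One step instead of `btrfs device add` followed by `btrfs device delete`:
run btrfs replace start "$OLD_DEV" "$NEW_DEV" "$MNT"

# replace runs in the background; poll its progress with:
run btrfs replace status "$MNT"
```

The advantage is that replace copies (or, with redundant raid profiles, rebuilds) the old device's data directly onto the new device, instead of relocating every chunk via the balance that device delete triggers.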

> On top of that, my linux is on this same raid, so perhaps btrfs is
> writing some temp files in the filesystem but cannot?
> . /dev/sdc1 on / type btrfs
> (ro,relatime,space_cache,subvolid=1187,subvol=/linux)

Your wording leaves me somewhat confused.  You say your Linux, presumably 
your root filesystem, is on the same raid as the filesystem that is having 
problems.  That would imply that it's a different filesystem, which in turn 
would imply that the raid is below the filesystem level, say mdraid, dmraid, 
or hardware raid, with both your btrfs root filesystem and the separate, 
problematic btrfs on the same raid-based device, presumably partitioned so 
you can put multiple filesystems on it.

Which of course would generally mean the two btrfs themselves aren't 
raid, unless of course you are using at least one non-btrfs raid as one 
device under a btrfs raid.  But while implied, that's not really 
supported by what you said, which suggests a single btrfs raid 
filesystem, instead.  In which case, perhaps you meant that this 
filesystem contains your root filesystem as well, not just that the raid 
contains it.

Of course, if your post had included the usual btrfs fi show and btrfs fi 
df (and btrfs fi usage would be good as well) that the wiki recommends be 
posted with such reports, that might make things clearer, but it doesn't, 
so we're left guessing...

But I'm assuming you meant a single multi-device btrfs, not multiple 
btrfs that happen to be on the same non-btrfs raid.

Another question the show and df would answer is what btrfs raid mode 
you're running.  The default for multiple device btrfs is of course raid1 
metadata and single mode data, but you might well have set it up with 
data and metadata in the same mode, and/or with raid0/5/6/10 for one or 
both data and metadata.  You didn't say and didn't provide the btrfs 
command output that would show it, so...
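For reference, gathering that output can be sketched as below. The mount point is hypothetical, and the wrapper prints the commands instead of executing them, since the real ones need root and a mounted btrfs:

```shell
MNT=/mnt/raid            # hypothetical mount point of the filesystem in question

# Dry-run wrapper; use run() { "$@"; } as root to actually execute.
run() { echo "+ $*"; }

# The commands the wiki asks to include with problem reports:
run btrfs fi show "$MNT"     # devices, sizes, per-device allocation
run btrfs fi df "$MNT"       # data/metadata/system chunks and their raid profile
run btrfs fi usage "$MNT"    # combined allocation/usage overview
```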

>  Ralle: did you do balance before removing?
> 
> I did not, but I have experience with it balancing itself upon doing so.
> Upon removing a device, that is.
> I am just not sure how to proceed now that everything is read-only.

You were correct in that regard.  btrfs device remove (and btrfs replace) 
triggers a balance as part of the process, so balancing after adding a 
device, only to have a balance triggered again by the delete/remove, is 
needless.

Actually, I suspect the remove-triggered balance ran across a problem it 
didn't know how to handle when attempting to move one of the chunks from 
the existing device, and that's what put the filesystem in read-only 
mode.  That's usually what happens when btrfs device remove triggers 
problems and people report it, anyway.  A balance before the remove would 
have simply triggered it then, anyway.

But what the specific problem is, and what to do about it, remains to be 
seen.  Having that btrfs fi show and btrfs fi df would be a good start, 
letting us know at least what raid type we're dealing with, etc.

>  I hope that you have backups?
> 
> I do have backups, but it's on Crashplan, so I would prefer not to have
> to go there.

That's wise, both him asking and you replying you already have them, but 
just want to avoid using them if possible.  Waaayyy too many folks 
posting here find out the hard way about the admin's first rule of 
backups, in simplified form, that if you don't have backups, you are 
declaring by your actions that the data not backed up is worth less to 
you than the time, resources and hassle required to do those backups, 
despite any after-the-fact protests to the contrary.  Not being in that 
group already puts you well ahead of the game! =:^)

>  and do you have any logs?
> 
> Where would those be?
> I never understood journalctl
> 
>  journalctl --since=today
> 
> Hmm, it was actually yesterday that I started the remove, so I did
> --since=yesterday. I am looking at the log now, please stand by.
> This is my log: http://pastebin.com/mCPi3y9r  But I fear that it became
> read-only before actually writing the error to the filesystem.

Hmm...  Looks like my strategy of having both systemd's journald and 
syslog-ng might pay off.  I have journald configured to keep only temporary 
files, in /run/log/journal, with /run of course being tmpfs.  That 
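A journald setup like the one described (volatile journal kept only in /run/log/journal, with syslog-ng handling persistent logs) would look roughly like this in /etc/systemd/journald.conf; the exact option values here are an assumption, not necessarily Duncan's actual config:

```ini
# /etc/systemd/journald.conf (sketch)
[Journal]
# Keep the journal only in /run/log/journal (tmpfs), never on disk:
Storage=volatile
# Hand messages to a classic syslog daemon (here syslog-ng) for persistence:
ForwardToSyslog=yes
```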

Re: btrfs scrub failing

2016-01-01 Thread Martin Steigerwald
Am Freitag, 1. Januar 2016, 11:41:20 CET schrieb John Center:

Happy New Year!

> > On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> > 
> > John Center posted on Thu, 31 Dec 2015 11:20:28 -0500 as excerpted:
> > 
> >> I run a weekly scrub, using Marc Merlin's btrfs-scrub script.
> >> Usually, it completes without a problem, but this week it failed.  I ran
> >> the scrub manually & it stops shortly:
> >> 
> >> john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md124p2
> >> ERROR: scrubbing /dev/md124p2 failed for device id 1:
> >> ret=-1, errno=5 (Input/output error)
> >> scrub device /dev/md124p2 (id 1) canceled
> >> scrub started at Thu Dec 31 00:26:34 2015
> >> and was aborted after 00:01:29 [...]
> > 
> >> My Ubuntu 14.04 workstation is using the 4.2 kernel (Wily).
> >> I'm using btrfs-tools v4.3.1. [...]
> > 
> > A couple months ago, which would have made it around the 4.2 kernel 
> > you're running (with 4.3 being current and 4.4 nearly out), there were a 
> > number of similar scrub aborted reports on the list.
> 
> I must have missed that, I'll check the list again to try & understand the
> issue better.

I had repeatedly failing scrubs, as mentioned in another thread here, until I 
used the 4.4 kernel. With the 4.3 kernel scrub also didn't work. I didn't use 
the debug options you used above, and I am not sure whether I already had this 
scrub issue with 4.2, so I am not sure it was the same issue. But you may need 
to run a 4.4 kernel in order to get scrub working again.

See my thread "[4.3-rc4] scrubbing aborts before finishing" for details.

Thanks,
-- 
Martin


Re: btrfs scrub failing

2016-01-01 Thread John Center
Hi Martin,

Happy New Year!

> On Jan 1, 2016, at 12:41 PM, Martin Steigerwald  wrote:
> 
> Am Freitag, 1. Januar 2016, 11:41:20 CET schrieb John Center:
> 
> Happy New Year!
> 
>>> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>>> 
>>> A couple months ago, which would have made it around the 4.2 kernel 
>>> you're running (with 4.3 being current and 4.4 nearly out), there were a 
>>> number of similar scrub aborted reports on the list.
>> 
>> I must have missed that, I'll check the list again to try & understand the
>> issue better.
> 
> I had repeatedly failing scrubs as mentioned in another thread here, until I 
> used 4.4 kernel. With 4.3 kernel scrub also didn't work. I didn't use the 
> debug options you used above and I am not sure whether I had this scrub issue 
> with 4.2 already, so I am not sure it has been the same issue. But you may 
> need to run 4.4 kernel in order to get scrub working again.
> 
> See my thread "[4.3-rc4] scrubbing aborts before finishing" for details.
> 
I was afraid of this. I just read your thread. I generally try to stay away 
from kernels this new, but I may have to try it. Was there any reason you 
didn't go to 4.1 instead?  (I run Win8.1 in VirtualBox 5.0.12 when I need to 
run something under Windows. I'd have to wait until 4.4 is released & 
supported to do that.)

Thanks. 

-John



Re: btrfs scrub failing

2016-01-01 Thread John Center
Hi Duncan,

> On Jan 1, 2016, at 12:55 AM, Duncan <1i5t5.dun...@cox.net> wrote:
> 
> John Center posted on Thu, 31 Dec 2015 11:20:28 -0500 as excerpted:
> 
>> I run a weekly scrub, using Marc Merlin's btrfs-scrub script.
>> Usually, it completes without a problem, but this week it failed.  I ran
>> the scrub manually & it stops shortly:
>> 
>> john@mariposa:~$ sudo /sbin/btrfs scrub start -BdR /dev/md124p2
>> ERROR: scrubbing /dev/md124p2 failed for device id 1:
>> ret=-1, errno=5 (Input/output error)
>> scrub device /dev/md124p2 (id 1) canceled
>> scrub started at Thu Dec 31 00:26:34 2015
>> and was aborted after 00:01:29 [...]
> 
>> My Ubuntu 14.04 workstation is using the 4.2 kernel (Wily).
>> I'm using btrfs-tools v4.3.1. [...]
> 
> A couple months ago, which would have made it around the 4.2 kernel 
> you're running (with 4.3 being current and 4.4 nearly out), there were a 
> number of similar scrub aborted reports on the list.
> 
I must have missed that, I'll check the list again to try & understand the 
issue better. 

> I don't recall seeing any directly related patches, but the reports died 
> down, whether because everybody having them had reported already, or 
> because a newer kernel fixed the problem, I'm not sure, as I never had 
> the problem myself[1].
> 
> So I'd suggest upgrading to either the current 4.3 kernel or the latest 
> 4.4-rc, and hopefully the problem will be gone.  If I'd had the problem 
> myself I could tell you for sure whether it went away for me with 4.3, 
> but as I didn't...
> 
Ok, I'll upgrade to 4.3 & see if that resolves the problem with scrubbing. I 
was wondering when I compiled the btrfs-tools if there would be a problem with 
them not being in sync with the major kernel version. 

If this doesn't resolve the problem, what would you recommend my next steps 
be?  I've been hesitant to run too many of the btrfs-tools, mainly because I 
don't want to accidentally screw things up & I don't always know how to 
interpret the results. (I ran btrfs-debug-tree, hoping something obvious 
would show up.  Big mistake.)

Thanks for your help!

-John


Re: btrfs scrub failing

2016-01-01 Thread Duncan
John Center posted on Fri, 01 Jan 2016 11:41:20 -0500 as excerpted:

> Ok, I'll upgrade to 4.3 & see if that resolves the problem with
> scrubbing.
> I was wondering when I compiled the btrfs-tools if there would be a
> problem with them not being in sync with the major kernel version.

FWIW, newer (or older, as long as it's not too old, you don't need any of 
the features missing from the older version, and you're not trying to fix 
problems only the newer version can handle) versions of btrfs-progs should 
be fine.  As a rule of thumb I recommend staying at least current with the 
kernel version, but that's only a rule of thumb, primarily to keep from 
getting /too/ old.  Both the btrfs-progs userspace and the kernel itself 
are normally designed to work with both older and newer versions of the 
other.

So userspace not being in sync with the kernel version shouldn't be a 
problem.
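That major.minor rule of thumb can be sketched as a small POSIX shell check. The version strings below are hardcoded examples; on a live system they would come from `uname -r` and `btrfs --version` (with the leading "v" stripped):

```shell
# Illustrative check: is btrfs-progs at least as new as the running kernel,
# comparing major.minor only?  Version strings are hardcoded examples.
kernel_ver=4.2.5
progs_ver=4.3.1

# at_least A B: succeeds if version A >= version B on major.minor.
at_least() {
    a_maj=${1%%.*}; a_rest=${1#*.}; a_min=${a_rest%%.*}
    b_maj=${2%%.*}; b_rest=${2#*.}; b_min=${b_rest%%.*}
    [ "$a_maj" -gt "$b_maj" ] ||
        { [ "$a_maj" -eq "$b_maj" ] && [ "$a_min" -ge "$b_min" ]; }
}

if at_least "$progs_ver" "$kernel_ver"; then
    echo "btrfs-progs ($progs_ver) is current to the kernel ($kernel_ver): fine"
else
    echo "btrfs-progs ($progs_ver) is older than the kernel ($kernel_ver): consider updating"
fi
```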

> If this doesn't resolve the problem, what would you recommend my next
> steps should be?  I've been hesitant to run too many of the btrfs-tools,
> mainly because I don't want to accidentally screw things up & I don't
> always know how to interpret the results. (I ran btrfs-debug-tree,
> hoping something obvious would show up.  Big mistake. )

LOLed at that debug-tree remark.  Been there (with other tools) myself.

Well, I'm hoping someone who had the problem can confirm whether it's 
fixed in current kernels (scrub is one of those userspace commands that's 
mostly just a front-end to the kernel code which does the real work, so 
kernel version is the important thing for scrub).  I'm guessing so, and 
that you'll find the problem gone in 4.3.

We'll cross the not-gone bridge if we get to it.  But again, if the other 
people who had the similar problem can confirm whether it disappeared for 
them with the new kernel, it would help a lot, as there were enough such 
reports to matter.  If it's the same problem and still there for everyone 
(which I doubt, as I'd expect way more posts about it if so, but 
confirmation's always good), there's nothing to do but wait for a fix; if 
not, and you still have your problem, then it's a different issue and the 
devs will need to work with you on a fix specific to it.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
