[REGRESSION?] Used+avail gives more than size of device

2014-10-12 Thread Martin Steigerwald
Hi!

On 3.17, i.e. since the size reporting changes, I get:

merkaba:~ LANG=C df -hT -t btrfs
Filesystem  Type   Size  Used Avail Use% Mounted on
/dev/mapper/sata-debian btrfs   30G   19G   21G  48% /
/dev/mapper/sata-debian btrfs   30G   19G   21G  48% /mnt/debian-zeit
/dev/mapper/msata-daten btrfs  200G  185G   15G  93% /daten
/dev/mapper/msata-home  btrfs  160G  135G   48G  75% /mnt/home-zeit
/dev/mapper/msata-home  btrfs  160G  135G   48G  75% /home

I wonder why used and avail do not add up to the total size of the filesystem:
19+21 = 40 GiB instead of 30 GiB for /, and 135+48 = 183 GiB instead of 160 GiB
for /home. Only /daten seems to be correct.

/ and /home are RAID 1 spanning two SSDs. /daten is single.

I wondered whether compression is taken into account? They use compress=lzo.

While /daten also has compress=lzo, it contains mostly incompressible data
like jpeg images and mp3, ogg vorbis and probably some flac music files, as
well as the odd mp4 video file; compressed media data, that is.

Any explanation for the discrepancy? Can it just be due to the compression?

/home has a large maildir and lots of source files… which I expect to compress
well. A Debian installation may also contain quite an amount of compressible
data. But still… the ratio seems a bit off, as it would mean / stored 19 GiB of
data within 9 GiB of actual disk space, which would be a quite high compression
ratio for LZO.
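
Just to make that arithmetic explicit (a rough back-of-the-envelope check with
plain shell, using only the df numbers above):

  # 19 GiB of apparent data would have to fit into 30 - 21 = 9 GiB of real space
  echo 'scale=2; 19 / (30 - 21)' | bc
  2.11

A sustained ~2.1:1 ratio with LZO on a mixed / filesystem seems unlikely to me.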


merkaba:~ mount | grep btrfs
/dev/mapper/sata-debian on / type btrfs 
(rw,noatime,compress=lzo,ssd,space_cache)
/dev/mapper/sata-debian on /mnt/debian-zeit type btrfs 
(rw,noatime,compress=lzo,ssd,space_cache)
/dev/mapper/msata-daten on /daten type btrfs 
(rw,noatime,compress=lzo,ssd,space_cache)
/dev/mapper/msata-home on /mnt/home-zeit type btrfs 
(rw,noatime,compress=lzo,ssd,space_cache)
/dev/mapper/msata-home on /home type btrfs 
(rw,noatime,compress=lzo,ssd,space_cache)




merkaba:~ btrfs fi sh
Label: 'debian'  uuid: […]
Total devices 2 FS bytes used 18.47GiB
devid    1 size 30.00GiB used 30.00GiB path /dev/mapper/sata-debian
devid    2 size 30.00GiB used 30.00GiB path /dev/mapper/msata-debian

Label: 'daten'  uuid: […]
Total devices 1 FS bytes used 184.82GiB
devid    1 size 200.00GiB used 188.02GiB path /dev/mapper/msata-daten

Label: 'home'  uuid: […]
Total devices 2 FS bytes used 134.39GiB
devid    1 size 160.00GiB used 160.00GiB path /dev/mapper/msata-home
devid    2 size 160.00GiB used 160.00GiB path /dev/mapper/sata-home


merkaba:~ btrfs fi df / 
Data, RAID1: total=27.99GiB, used=17.84GiB
System, RAID1: total=8.00MiB, used=16.00KiB
Metadata, RAID1: total=2.00GiB, used=645.88MiB
unknown, single: total=224.00MiB, used=0.00
merkaba:~ btrfs fi df /home
Data, RAID1: total=154.97GiB, used=131.46GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=5.00GiB, used=2.93GiB
unknown, single: total=512.00MiB, used=0.00
merkaba:~ btrfs fi df /daten
Data, single: total=187.01GiB, used=184.53GiB
System, single: total=4.00MiB, used=48.00KiB
Metadata, single: total=1.01GiB, used=292.31MiB
unknown, single: total=112.00MiB, used=0.00


merkaba:~ LANG=C strace df -hT -t btrfs
execve("/bin/df", ["df", "-hT", "-t", "btrfs"], [/* 55 vars */]) = 0
brk(0)  = 0x13c2000
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f795e7de000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=250673, ...}) = 0
mmap(NULL, 250673, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f795e7a
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\0\1\0\0\0P\34\2\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1725888, ...}) = 0
mmap(NULL, 3832352, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x7f795e218000
mprotect(0x7f795e3b7000, 2093056, PROT_NONE) = 0
mmap(0x7f795e5b6000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19e000) = 0x7f795e5b6000
mmap(0x7f795e5bc000, 14880, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f795e5bc000
close(3)= 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f795e79f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f795e79e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f795e79d000
arch_prctl(ARCH_SET_FS, 0x7f795e79e700) = 0
mprotect(0x7f795e5b6000, 16384, PROT_READ) = 0
mprotect(0x616000, 4096, PROT_READ) = 0
mprotect(0x7f795e7e, 4096, PROT_READ) = 0
munmap(0x7f795e7a, 250673)  = 0
brk(0)

Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Thursday, 9 October 2014, 21:58:53 you wrote:
  * btrfs-zero-log
remove the log tree if log tree is corrupt
  * btrfs rescue
Recover a damaged btrfs filesystem
chunk-recover
super-recover
How does this relate to btrfs check?
  * btrfs check
repair a btrfs filesystem
--repair
--init-csum-tree
--init-extent-tree
How does this relate to btrfs rescue?
 
 These three translate into eight combinations of repairs, adding -o recovery
 there are 9 combinations. I think this is the main source of confusion,
 there are just too many options, but also it's completely non-obvious which
 one to use in which situation.
 
 My expectation is that eventually these get consolidated into just check and
 check --repair. As the repair code matures, it'd go into kernel
 autorecovery code. That's a guess on my part, but it's consistent with
 design goals.

Also I think these should at least all be under the btrfs command.

So include btrfs-zero-log in the btrfs command.

And how about making btrfs repair or btrfs check the top-level category and 
adding the various options as subcommands below it? Then there is at least one
command and one place in the manpage to learn about the various options.

But maybe some of them can be made automatic as well, or folded into btrfs 
check --repair. Ideally it would auto-detect which path to take on filesystem 
recovery.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Friday, 10 October 2014, 10:37:44 Chris Murphy wrote:
 On Oct 10, 2014, at 6:53 AM, Bob Marley bobmar...@shiftmail.org wrote:
  On 10/10/2014 03:58, Chris Murphy wrote:
  * mount -o recovery
  
Enable autorecovery attempts if a bad tree root is found at mount
time.
  
  I'm confused why it's not the default yet. Maybe it's continuing to
  evolve at a pace that suggests something could sneak in that makes
  things worse? It is almost an oxymoron in that I'm manually enabling an
  autorecovery
  
  If true, maybe the closest indication we'd get of btrfs stablity is the
  default enabling of autorecovery. 
  No way!
  I wouldn't want a default like that.
  
  If you think at distributed transactions: suppose a sync was issued on
  both sides of a distributed transaction, then power was lost on one side,
  than btrfs had corruption. When I remount it, definitely the worst thing
  that can happen is that it auto-rolls-back to a previous known-good
  state.
 For a general purpose file system, losing 30 seconds (or less) of
 questionably committed data, likely corrupt, is a file system that won't
 mount without user intervention, which requires a secret decoder ring to
 get it to mount at all. And may require the use of specialized tools to
 retrieve that data in any case.
 
 The fail safe behavior is to treat the known good tree root as the default
 tree root, and bypass the bad tree root if it cannot be repaired, so that
 the volume can be mounted with default mount options (i.e. the ones in
 fstab). Otherwise it's a filesystem that isn't well suited for general
 purpose use as rootfs let alone for boot.

To understand this a bit better:

What can be the reasons a recent tree gets corrupted?

I always thought that with a controller, device and driver combination that 
honors fsync, BTRFS would give you either the new state or the last known 
good state *anyway*. So where does the need to roll back arise from?

That said, all journalling filesystems have some sort of rollback as far as I 
understand: if the last journal entry is incomplete, they discard it on journal 
replay. So even there you lose the last few seconds of write activity.

But once fsync() returns, the data needs to be safe on disk. I always 
thought BTRFS honors this under *any* circumstance. If some proposed 
auto-rollback breaks this guarantee, I think something is broken elsewhere.

And fsync is an fsync is an fsync. Its semantics are clear as crystal. There 
is nothing, absolutely nothing to discuss about it.

An fsync completes once the device itself reported "Yeah, I have the data on 
disk, all safe and cool to go". Anything else is a bug IMO.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Martin Steigerwald
On Wednesday, 8 October 2014, 14:11:51 Eric Sandeen wrote:
 I was looking at Marc's post:
 
 http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
 
 and it feels like there isn't exactly a cohesive, overarching vision for
 repair of a corrupted btrfs filesystem.
 
 In other words - I'm an admin cruising along, when the kernel throws some
 fs corruption error, or for whatever reason btrfs fails to mount.
 What should I do?
 
 Marc lays out several steps, but to me this highlights that there seem to
 be a lot of disjoint mechanisms out there to deal with these problems;
 mostly from Marc's blog, with some bits of my own:
 
 * btrfs scrub
   Errors are corrected along if possible (what *is* possible?)
 * mount -o recovery
   Enable autorecovery attempts if a bad tree root is found at mount 
 time.
 * mount -o degraded
   Allow mounts to continue with missing devices.
   (This isn't really a way to recover from corruption, right?)
 * btrfs-zero-log
   remove the log tree if log tree is corrupt
 * btrfs rescue
   Recover a damaged btrfs filesystem
   chunk-recover
   super-recover
   How does this relate to btrfs check?
 * btrfs check
   repair a btrfs filesystem
   --repair
   --init-csum-tree
   --init-extent-tree
   How does this relate to btrfs rescue?
 * btrfs restore
   try to salvage files from a damaged filesystem
   (not really repair, it's disk-scraping)
 
 
 What's the vision for, say, scrub vs. check vs. rescue?  Should they repair
 the same errors, only online vs. offline?  If not, what class of errors
 does one fix vs. the other?  How would an admin know?  Can btrfs check
 recover a bad tree root in the same way that mount -o recovery does?  How
 would I know if I should use --init-*-tree, or chunk-recover, and what are
 the ramifications of using these options?
 
 It feels like recovery tools have been badly splintered, and if there's an
 overarching design or vision for btrfs fs repair, I can't tell what it is.
 Can anyone help me?

How about taking one step back:

What are the possible corruption cases these tools are meant to address? 
*Where* can BTRFS break and *why*?

How much of it can be folded into one command? Where can BTRFS be improved to 
either prevent a corruption from happening or correct it automatically? 
Which actions can be determined automatically by the repair tool? What needs to 
be an option for the user to choose from? And what guidance would the user need 
to decide?

I.e. really go back to what diagnosing and repairing BTRFS actually 
involves, and then well… work out a vision of how this all can fit together, as 
you suggested.

As a minimum I suggest having all possible options under a main category in the 
btrfs command, with no external commands whatsoever; so if btrfs-zero-log is 
still needed, add it to the btrfs command.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: btrfs send and kernel 3.17

2014-10-12 Thread David Arendt
This weekend I finally had time to try btrfs send again on the newly
created fs. Now I am running into another problem:

btrfs send returns: ERROR: send ioctl failed with -12: Cannot allocate
memory

In dmesg I see only the following output:

parent transid verify failed on 21325004800 wanted 2620 found 8325


On 10/07/2014 10:46 PM, Chris Mason wrote:
 On Tue, Oct 7, 2014 at 4:45 PM, David Arendt ad...@prnet.org wrote:
 On 10/07/2014 03:19 PM, Chris Mason wrote:


  On Tue, Oct 7, 2014 at 1:25 AM, David Arendt ad...@prnet.org wrote:
  I did a revert of this commit. After creating a snapshot, the
  filesystem was no longer usable, even with kernel 3.16.3 (crashes 10
  seconds after mount without error message) . Maybe there was some
  previous damage that just appeared now. This evening, I will restore
  from backup and report back.

  On October 7, 2014 12:22:11 AM CEST, Chris Mason c...@fb.com wrote:
  On Mon, Oct 6, 2014 at 4:51 PM, David Arendt ad...@prnet.org
 wrote:
   I just tried downgrading to 3.16.3 again. In 3.16.3 btrfs send is
   working without any problem. Afterwards I upgraded again to
 3.17 and
   the
   problem reappeared. So the problem seems to be kernel version
  related.

  [ backref errors during btrfs-send ]

  Ok then, our list of suspects is pretty short.  Can you easily build
  test kernels?

  I'd like to try reverting this commit:

  51f395ad4058883e4273b02fdebe98072dbdc0d2

  Oh no!  Reverting this definitely should not have caused corruptions,
  so I think the problem was already there.  Do you still have the
  filesystem image?

  Please let us know if you're missing files off the backup, we'll help
  pull them out.

 Due to space constraints, it was not possible to take an image of the
 corrupted filesystem. As I do backups daily, and the problems occurred 5
 hours after backup, no file was lost. Thanks for offering your help. In
 4 days I will do some send tests on the newly created filesystem and
 report back.

 Ok, if you have the kernel messages from the panic, please send them
 along.

 -chris






Re: btrfs send and kernel 3.17

2014-10-12 Thread john terragon
Hi.

I just wanted to confirm David's story so to speak :)

-kernel 3.17-rc7 (didn't bother to compile 3.17 as there weren't any
btrfs fixes, I think)

-btrfs-progs 3.16.2 (also compiled from source, so no
distribution-specific patches)

-fresh fs

-I get the same two errors David got (first I got the I/O error one
and then the memory allocation one)

-plus now when I ls -la the fs top volume this is what I get

drwxrwsr-x 1 root staff  30 Sep 11 16:15 home
d? ? ??   ?? home-backup
drwxr-xr-x 1 root root  250 Oct 10 15:37 root
d? ? ??   ?? root-backup
drwxr-xr-x 1 root root   88 Sep 15 16:02 vms
drwxr-xr-x 1 root root   88 Sep 15 16:02 vms-backup

yes, the question marks on those two *-backup snapshots are really
there. I can't access the snapshots, I can't delete them, I can't do
anything with them.

-btrfs check segfaults

-the events that led to this situation are these:
 1) btrfs su snap -r root root-backup
 2) send | receive (the entire root-backup, not an incremental send)
 immediate I/O error
 3) move on to home: btrfs su snap -r home home-backup
 4) send|receive (again not an incremental send)
 everything goes well (!)
 5) retry with root: btrfs su snap -r root root-backup
 6) send|receive
 and it goes seemingly well
 7) apt-get dist-upgrade just to modify root and try an incremental send
 8) reboot after the dist-upgrade
 9) ls -la the fs top volume: first I get the memory allocation error, and
    after that any ls -la gives the output I pasted above. (Notice that
    besides the ls -la, the two snapshots were not touched in any way since
    the two send|receive.)

A few final notes: I haven't tried send/receive in a while (they were
unreliable), so I can't tell which was the last version that worked for
me (well, no version actually :) ).
I've never had any problem with just snapshots. I make them regularly,
I use them, I modify them and I've never had one problem (with 3.17
too, it's just send/receive that murders them).

Best regards

John


[PATCH] Test Commit

2014-10-12 Thread neo
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 18eb944..7cc7783 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ CC = gcc
 LN = ln
 AR = ar
 AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -DBTRFS_FLAT_INCLUDES -fno-strict-aliasing -fPIC
-CFLAGS = -g -O1 -fno-strict-aliasing
+CFLAGS = -g -O1 -fno-strict-aliasing -rdynamic
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
  extent-cache.o extent_io.o volumes.o utils.o repair.o \
-- 
1.9.1



TEST PING EOM

2014-10-12 Thread royy walls



[PATCH] [PATCH]Btrfs-prog: uniform error handling for utils.c

2014-10-12 Thread neo
---
 Makefile |   4 +-
 btrfs-syscalls.c | 180 +
 btrfs-syscalls.h |  55 +++
 kerncompat.h |   5 +-
 utils.c  | 200 +++
 5 files changed, 337 insertions(+), 107 deletions(-)
 create mode 100644 btrfs-syscalls.c
 create mode 100644 btrfs-syscalls.h

diff --git a/Makefile b/Makefile
index 18eb944..d738f20 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ CC = gcc
 LN = ln
 AR = ar
 AM_CFLAGS = -Wall -D_FILE_OFFSET_BITS=64 -DBTRFS_FLAT_INCLUDES -fno-strict-aliasing -fPIC
-CFLAGS = -g -O1 -fno-strict-aliasing
+CFLAGS = -g -O1 -fno-strict-aliasing -rdynamic
 objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
  extent-cache.o extent_io.o volumes.o utils.o repair.o \
@@ -17,7 +17,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
   cmds-property.o
 libbtrfs_objects = send-stream.o send-utils.o rbtree.o btrfs-list.o crc32c.o \
-  uuid-tree.o utils-lib.o
+  uuid-tree.o utils-lib.o btrfs-syscalls.o
 libbtrfs_headers = send-stream.h send-utils.h send.h rbtree.h btrfs-list.h \
   crc32c.h list.h kerncompat.h radix-tree.h extent-cache.h \
   extent_io.h ioctl.h ctree.h btrfsck.h version.h
diff --git a/btrfs-syscalls.c b/btrfs-syscalls.c
new file mode 100644
index 000..b4d791b
--- /dev/null
+++ b/btrfs-syscalls.c
@@ -0,0 +1,180 @@
+/***
+ *  File Name   :   btrfs-syscalls.c
+ *  Description :   This file contains system call wrapper functions with
+ *                  uniform error handling.
+ **/
+#include "btrfs-syscalls.h"
+
+#define BKTRACE_BUFFER_SIZE 1024
+
+int err_verbose = 0;
+static void *buf[BKTRACE_BUFFER_SIZE];
+
+void
+btrfs_backtrace(void)
+{
+        int i;
+        int nptrs;
+        char **entries;
+
+        fprintf(stderr, "Call trace:\n");
+        nptrs = backtrace(buf, BKTRACE_BUFFER_SIZE);
+        entries = backtrace_symbols(buf, nptrs);
+        if (entries == NULL) {
+                fprintf(stderr, "ERROR: backtrace_symbols\n");
+                exit(EXIT_FAILURE);
+        }
+        for (i = 0; i < nptrs; i++) {
+                if (strstr(entries[i], "btrfs_backtrace") == NULL)
+                        fprintf(stderr, "\t%s\n", entries[i]);
+        }
+        free(entries);
+}
+
+int
+btrfs_open(const char *pathname, int flags)
+{
+        int ret;
+
+        if ((ret = open(pathname, flags)) < 0)
+                SYS_ERROR("open : %s", pathname);
+
+        return ret;
+}
+
+int
+btrfs_close(int fd)
+{
+        int ret;
+
+        if ((ret = close(fd)) < 0)
+                SYS_ERROR("close :");
+
+        return ret;
+}
+
+int
+btrfs_stat(const char *path, struct stat *buf)
+{
+        int ret;
+
+        if ((ret = stat(path, buf)) < 0)
+                SYS_ERROR("stat : %s", path);
+
+        return ret;
+}
+
+int
+btrfs_lstat(const char *path, struct stat *buf)
+{
+        int ret;
+
+        if ((ret = lstat(path, buf)) < 0) {
+                SYS_ERROR("lstat : %s", path);
+        }
+
+        return ret;
+}
+
+int
+btrfs_fstat(int fd, struct stat *buf)
+{
+        int ret;
+
+        if ((ret = fstat(fd, buf)) < 0)
+                SYS_ERROR("fstat :");
+
+        return ret;
+}
+
+void*
+btrfs_malloc(size_t size)
+{
+        void *p;
+
+        if ((p = malloc(size)) == NULL) {
+                if (size != 0)
+                        SYS_ERROR("malloc :");
+        }
+
+        return p;
+}
+
+void*
+btrfs_calloc(size_t nmemb, size_t size)
+{
+        void *p;
+
+        if ((p = calloc(nmemb, size)) == NULL) {
+                if (size != 0)
+                        SYS_ERROR("calloc :");
+        }
+
+        return p;
+}
+
+FILE*
+btrfs_fopen(const char *path, const char *mode)
+{
+        FILE *f;
+
+        if ((f = fopen(path, mode)) == NULL)
+                SYS_ERROR("fopen : %s", path);
+
+        return f;
+}
+
+DIR*
+btrfs_opendir(const char *name)
+{
+        DIR *d;
+
+        if ((d = opendir(name)) == NULL)
+                SYS_ERROR("opendir :");
+
+        return d;
+}
+
+int
+btrfs_dirfd(DIR *dirp)
+{
+        int fd;
+
+        if ((fd = dirfd(dirp)) < 0)
+                SYS_ERROR("dirfd :");
+
+        return fd;
+}
+
+int
+btrfs_closedir(DIR *dirp)
+{
+        int ret;
+
+        if ((ret = closedir(dirp)) < 0)
+                SYS_ERROR("closedir :");
+
+        return ret;
+}
+
+ssize_t
+btrfs_pwrite(int fd, const void *buf, size_t count, off_t offset)
+{
+        ssize_t ret;
+
+        if ((ret = pwrite(fd, buf, count, offset)) < 0)
+                SYS_ERROR("pwrite :");
+
+        return ret;
+}
+
+ssize_t
+btrfs_pread(int fd, const void *buf, size_t count, off_t offset)
+{
+        ssize_t ret;
+
+        if ((ret = pread(fd, buf, count, offset)) < 0)
+                SYS_ERROR("pread :");
+
+        return ret;
+}
diff --git a/btrfs-syscalls.h b/btrfs-syscalls.h
new file mode 100644
index 000..2c717bf
--- /dev/null
+++ b/btrfs-syscalls.h
@@ -0,0 +1,55 @@
+#ifndef __BTRFS_SYSCALLS_H__

Re: [PATCH] [PATCH]Btrfs-prog: uniform error handling for utils.c

2014-10-12 Thread royy walls
Hi,

The following patch implements uniform error handling for utils.c
in btrfs-progs.

On Sun, Oct 12, 2014 at 2:01 PM, neo ckn...@gmail.com wrote:
 ---
  Makefile |   4 +-
  btrfs-syscalls.c | 180 +
  btrfs-syscalls.h |  55 +++
  kerncompat.h |   5 +-
  utils.c  | 200 
 +++
  5 files changed, 337 insertions(+), 107 deletions(-)
  create mode 100644 btrfs-syscalls.c
  create mode 100644 btrfs-syscalls.h


Re: btrfs send and kernel 3.17

2014-10-12 Thread David Arendt
Just to let you know, I just tried an ls -l on 2 machines running kernel
3.17 and btrfs-progs 3.16.2.

Here is my ls -l output:

Machine 1:
ls: cannot access root.20141009.000503.backup: Cannot allocate memory
total 0
d? ? ?  ? ?? root.20141009.000503.backup
drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141012.095526.backup
drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141012.000503.backup
drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141011.000502.backup
drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141010.000502.backup

root.20141009.000503.backup is not deletable.

Machine 2:
ls: cannot access root.20141006.003239.backup: Cannot allocate memory
ls: cannot access root.20141007.001616.backup: Cannot allocate memory
ls: cannot access root.20141008.000501.backup: Cannot allocate memory
ls: cannot access root.20141009.052436.backup: Cannot allocate memory
total 0
d? ? ??  ?? root.20141009.052436.backup
d? ? ??  ?? root.20141008.000501.backup
d? ? ??  ?? root.20141007.001616.backup
d? ? ??  ?? root.20141006.003239.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140925.001125.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140924.001017.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140923.001008.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140922.001836.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140921.001029.backup
drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140920.001020.backup

The ? ones are also not deletable.

Both machines are giving transid verify failed errors.

I checked my logfiles and this problem never appeared with previous
kernel versions. On machine 1 it is also certain that there was no
pre-existing corruption, as this filesystem was created with
btrfs-progs 3.16.2 under kernel 3.17.

On 10/12/2014 05:24 PM, john terragon wrote:
 Hi.

 I just wanted to confirm David's story so to speak :)

 -kernel 3.17-rc7 (didn't bother to compile 3.17 as there weren't any
 btrfs fixes, I think)

 -btrfs-progs 3.16.2 (also compiled from source, so no
 distribution-specific patches)

 -fresh fs

 -I get the same two errors David got (first I got the I/O error one
 and then the memory allocation one)

 -plus now when I ls -la the fs top volume this is what I get

 drwxrwsr-x 1 root staff  30 Sep 11 16:15 home
 d? ? ??   ?? home-backup
 drwxr-xr-x 1 root root  250 Oct 10 15:37 root
 d? ? ??   ?? root-backup
 drwxr-xr-x 1 root root   88 Sep 15 16:02 vms
 drwxr-xr-x 1 root root   88 Sep 15 16:02 vms-backup

 yes, the question marks on those two *-backup snapshots are really
 there. I can't access the snapshots, I can't delete them, I can't do
 anything with them.

 -btrfs check segfaults

 -the events that led to this situation are these:
  1) btrfs su snap -r root root-backup
  2) send |receive (the entire root-backup, not and incremental send)
  immediate I/O error
  3) move on to home: btrfs su snap -r home home-backup
  4) send|receive (again not an incremental send)
  everything goes well (!)
  5) retry with root: btrfs su snap -r root root-backup
  6) send|receive
  and it goes seemingly well
  7) apt-get dist-upgrade just to modify root and try an incremental send
  8) reboot after the dist-upgrade
  9) ls -la the fs top volume: first I get the memory allocation error
 and after that
any ls -la gives the output I pasted above. (notice that beside
 the ls -la, the
two snapshots were not touched in any way since the two send|receive)

 Few final notes. I haven't tried send/receive in a while (they were
 unreliable) so I can't tell which is the last version they worked for
 me (well, no version actually :) ).
 I've never had any problem with just snapshots. I make them regularly,
 I use them, I modify them and I've never had one problem (with 3.17
 too, it's just send/receive that murders them).

 Best regards

 John



Re: [REGRESSION?] Used+avail gives more than size of device

2014-10-12 Thread Duncan
Martin Steigerwald posted on Sun, 12 Oct 2014 11:56:51 +0200 as excerpted:


 On 3.17, i.e. since the size reporting changes, I get:
 
 merkaba:~ LANG=C df -hT -t btrfs
 Filesystem  Type   Size  Used Avail Use% Mounted on
 /dev/mapper/sata-debian btrfs   30G   19G   21G  48% /
 /dev/mapper/sata-debian btrfs   30G   19G   21G  48% /mnt/debian-zeit
 /dev/mapper/msata-daten btrfs  200G  185G   15G  93% /daten
 /dev/mapper/msata-home  btrfs  160G  135G   48G  75% /mnt/home-zeit
 /dev/mapper/msata-home  btrfs  160G  135G   48G  75% /home
 
 I wonder about used and avail not adding up to total size of filesystem:
 19+21 = 40 GiB instead of 30 GiB for / and 135+48 = 183 GiB for /home.
 Only /daten seems to be correct.

That's standard df, not btrfs fi df.

Due to the way btrfs works and the constraints of the printing format 
that standard df uses, it cannot and will not present a full picture of 
filesystem usage.  Some compromises must be made in the choice of which 
available filesystem stats to present and the manner in which they are 
presented within the limited df format, and no matter which compromises 
are chosen, standard df output will always look a bit screwy for /some/ 
btrfs filesystem layouts.

 / and /home are RAID 1 spanning two SSDs. /daten is single.
 
 I wondered about compression taken into account? They use compress=lzo.
 [...]  Any explaination for the discrepancy, can it just be due to the
 compression?

It's not compression, but FWIW, I believe I know what's going on...

 merkaba:~ mount | grep btrfs
 /dev/mapper/sata-debian on / type btrfs
 (rw,noatime,compress=lzo,ssd,space_cache)
 /dev/mapper/sata-debian on /mnt/debian-zeit type btrfs
 (rw,noatime,compress=lzo,ssd,space_cache)
 /dev/mapper/msata-daten on /daten type btrfs
 (rw,noatime,compress=lzo,ssd,space_cache)
 /dev/mapper/msata-home on /mnt/home-zeit type btrfs
 (rw,noatime,compress=lzo,ssd,space_cache)
 /dev/mapper/msata-home on /home type btrfs
 (rw,noatime,compress=lzo,ssd,space_cache)
 
 
 merkaba:~ btrfs fi sh
 Label: 'debian'  uuid: […]
Total devices 2 FS bytes used 18.47GiB
devid   1 size 30.00GiB used 30.00GiB path /dev/mapper/sata-debian
devid   2 size 30.00GiB used 30.00GiB path /dev/mapper/msata-debian
 
 Label: 'daten'  uuid: […]
Total devices 1 FS bytes used 184.82GiB
devid   1 size 200.00GiB used 188.02GiB path /dev/mapper/msata-daten
 
 Label: 'home'  uuid: […]
Total devices 2 FS bytes used 134.39GiB
devid   1 size 160.00GiB used 160.00GiB path /dev/mapper/msata-home
devid   2 size 160.00GiB used 160.00GiB path /dev/mapper/sata-home
 
 
 merkaba:~ btrfs fi df /
 Data, RAID1: total=27.99GiB, used=17.84GiB
 System, RAID1: total=8.00MiB, used=16.00KiB
 Metadata, RAID1: total=2.00GiB, used=645.88MiB
 unknown, single: total=224.00MiB, used=0.00

 merkaba:~ btrfs fi df /home
 Data, RAID1: total=154.97GiB, used=131.46GiB
 System, RAID1: total=32.00MiB, used=48.00KiB
 Metadata, RAID1: total=5.00GiB, used=2.93GiB
 unknown, single: total=512.00MiB, used=0.00

 merkaba:~ btrfs fi df /daten
 Data, single: total=187.01GiB, used=184.53GiB
 System, single: total=4.00MiB, used=48.00KiB
 Metadata, single: total=1.01GiB, used=292.31MiB
 unknown, single: total=112.00MiB, used=0.00


Side observation: it doesn't look like you have btrfs-progs 3.16.1 yet, 
since btrfs fi df is still reporting "unknown" for that last chunk-type 
instead of "global reserve".


While I didn't follow the (standard) df information presentation change 
discussion closely enough to know what the resolution was, looking at the 
numbers above I believe I know what's going on with df.

First, focus on used, using / as an example.

df (standard) used: 19 G
btrfs fi show (total line) used:    18.47 GiB
btrfs fi df (sum all types) used:   17.84 GiB + 646 MiB ~= 18.5 GiB

So the displayed usage for all three reports agrees, roughly 19 G used.

Compression?  Only actual (used) data/metadata can compress, the left 
over free space won't; it's left over.  So any effects of compression 
would be seen in the above used numbers.  The numbers above are close 
enough to each other that compression can't be playing a part.[1]

OK, so what's deal with (standard) df, then?  If the discrepancy isn't 
coming from used, where's it coming from?

Simple enough.  What's the big difference between the filesystem that 
appears correct and the other two?  Big hint, take a look at the second 
field of the btrfs fi df output.  Hint #2, btrfs fi show, count the 
number of devices.

Back to standard df, available: 

/       21 GiB  /2  10.5 GiB   10.5 (avail) + 19 (used)  ~  30 GiB
/daten  15 GiB  --  15 GiB     15 (avail) + 185 (used)   ~ 200 GiB
/home   48 GiB  /2  24 GiB     24 (avail) + 135 (used)   ~ 160 GiB

It's the raid-factor.  =:^)
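
(A quick shell sanity check of the same idea, using the / numbers and assuming 
RAID1 data/metadata on both devices, so df's avail has to be halved:

  avail_raw=21; used=19
  echo $(( avail_raw / 2 + used ))    # ~29, i.e. roughly the 30 GiB device size

Integer shell arithmetic, so the half-GiB gets rounded away, but the point 
stands.)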

Btrfs in the kernel is apparently accounting for raid-factor in used 
space in whatever function standard df is using, but not in available 
space, even where 

what is the best way to monitor raid1 drive failures?

2014-10-12 Thread Suman C
Hi,

I am testing some disk failure scenarios in a 2 drive raid1 mirror.
They are 4GB each, virtual SATA drives inside virtualbox.

To simulate the failure, I detached one of the drives from the system.
After that, I see no sign of a problem except for these errors:

Oct 12 15:37:14 rock-dev kernel: btrfs: bdev /dev/sdb errs: wr 0, rd
0, flush 1, corrupt 0, gen 0
Oct 12 15:37:14 rock-dev kernel: lost page write due to I/O error on /dev/sdb

/dev/sdb is gone from the system, but btrfs fi show still lists it.

Label: raid1pool  uuid: 4e5d8b43-1d34-4672-8057-99c51649b7c6
Total devices 2 FS bytes used 1.46GiB
devid    1 size 4.00GiB used 2.45GiB path /dev/sdb
devid    2 size 4.00GiB used 2.43GiB path /dev/sdc

I am able to read and write just fine, but do see the above errors in dmesg.

What is the best way to find out that one of the drives has gone bad?

Suman


Re: What is the vision for btrfs fs repair?

2014-10-12 Thread Duncan
Martin Steigerwald posted on Sun, 12 Oct 2014 12:14:01 +0200 as excerpted:

 I always thought with a controller and device and driver combination
 that honors fsync with BTRFS it would either be the new state of the
 last known good state *anyway*. So where does the need to rollback arise
 from?

My understanding here is...

With btrfs a full-tree commit is atomic.  You should get either the old 
tree or the new tree.  However, due to the cascading nature of updates on 
cow-based structures, these full-tree commits are done by default 
(there's a mount-option to adjust it) every 30 seconds.  Between these 
atomic commits partial updates may have occurred.  The btrfs log (the one 
that btrfs-zero-log kills) is limited to between-commit updates, and thus 
to the upto 30 seconds (default) worth of changes since the last full-
tree atomic commit.
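
(If memory serves, the knob for that interval is the commit= mount option, so 
something like

  mount -o remount,commit=120 /

would stretch the window to 120 seconds; treat that as an illustration, not a 
recommendation.)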

In addition to that, there's a history of tree-root commits kept (with 
the superblocks pointing to the last one).  Btrfs-find-tree-root can be 
used to list this history.  The recovery mount option simply allows btrfs 
to fall back to this history, should the current root be corrupted.  
Btrfs restore can be used to list tree roots as well, and can be pointed 
at an appropriate one if necessary.
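
(Roughly, and from memory, since the exact tool names and flags vary a bit 
between btrfs-progs versions, the manual path looks something like:

  btrfs-find-root /dev/sdXn                        # list candidate tree roots / generations
  mount -o recovery /dev/sdXn /mnt                 # let the kernel fall back on its own
  btrfs restore -t BYTENR /dev/sdXn /some/dest     # scrape files using a specific tree root

with BYTENR taken from the find-root output and /dev/sdXn standing in for the 
real device.)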

Fsync forces the file and its corresponding metadata update to the log 
and barring hardware or software bugs should not return until it's safely 
in the log, but I'm not sure whether it forces a full-tree commit.  
Either way the guarantees should be the same.  If the log can be replayed 
or a full-tree commit has occurred since the fsync, the new copy should 
appear.  If it can't, the rollback to the last atomic tree commit should 
return an intact copy of the file from that point.  If the recovery mount 
option is used and a further rollback to an earlier full-tree commit is 
forced, provided it existed at the point of that full-tree commit, the 
intact file at that point should appear.

So if the current tree root is a good one, the log will replay the last 
upto 30 seconds of activity on top of that last atomic tree root.  If the 
current root tree itself is corrupt, the recovery mount option will let 
an earlier one be used.  Obviously in that case the log will be discarded 
since it applies to a later root tree that itself has been discarded.

The debate is whether recovery should be automated so the admin doesn't 
have to care about it, or whether having to manually add that option 
serves as a necessary notifier to the admin that something /did/ go 
wrong, and that an earlier root is being used instead, so more than a few 
seconds worth of data may have disappeared.


As someone else has already suggested, I'd argue that as long as btrfs 
continues to be under the sort of development it's in now, keeping 
recovery as a non-default option is desired.  Once it's optimized and 
considered stable, arguably recovery should be made the default, perhaps 
with a no-recovery option for those who prefer that in-the-face 
notification in the form of a mount error, if btrfs would otherwise fall 
back to an earlier tree root commit.

What worries me, however, is that IMO the recent warning stripping was 
premature.  Btrfs is certainly NOT fully stable or optimized for normal 
use at this point.  We're still using the even/odd PID balancing scheme 
for raid1 reads, for instance, and multi-device writes are still 
serialized when they could be parallelized to a much larger degree (tho 
keeping some serialization is arguably good for data safety).  Arguably 
optimizing that now would be premature optimization since the code itself 
is still subject to change, so I'm not complaining, but by that very same 
token, it *IS* still subject to change, which by definition means it's 
*NOT* stable, so why are we removing all the warnings and giving the 
impression that it IS stable?

The decision wasn't mine to make and I don't know, but while a nice 
suggestion, making recovery-by-default a measure of when btrfs goes 
stable simply won't work, because surely, the same folks behind the 
warning stripping would then ensure this indicator too, said btrfs was 
stable, while the state of the code itself continues to say otherwise. 

Meanwhile, if your distributed transactions scenario doesn't account for 
crash and loss of data on one side with real-time backup/redundancy, such 
that loss of a few seconds worth of transactions on a single local 
filesystem is going to kill the entire scenario, I don't think too much 
of that scenario in the first place, and regardless, btrfs, certainly in 
its current state, is definitely NOT an appropriate base for it.  Use 
appropriate tools for the task.  Btrfs at least at this point is simply 
not an appropriate tool for that task.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: TEST PING EOM

2014-10-12 Thread cwillu
On Sun, Oct 12, 2014 at 2:45 PM, royy walls ckn...@gmail.com wrote:

 --

http://www.tux.org/lkml/#s3

Test messages are very, very inappropriate on the lkml or any other
list, for that matter. If you want to know whether the subscribe
succeeded, wait for a couple of hours after you get a reply from the
mailing list software saying it did. You'll undoubtedly get a number
of list messages. If you want to know whether you can post, you must
have something important to say, right? After you have read the
following paragraphs, compose a real letter, not a test message, in an
editor, saving the body of the letter in the off chance your post
doesn't succeed. Then post your letter to lkml. Please remember that
there are quite a number of subscribers, and it will take a while for
your letter to be reflected back to you. An hour is not too long to
wait.


Re: TEST PING EOM

2014-10-12 Thread royy walls
I apologize for this; I'm new to this and was not sure whether it was
working or not.

On Sun, Oct 12, 2014 at 5:14 PM, cwillu cwi...@cwillu.com wrote:
 On Sun, Oct 12, 2014 at 2:45 PM, royy walls ckn...@gmail.com wrote:

 --

 http://www.tux.org/lkml/#s3

 Test messages are very, very inappropriate on the lkml or any other
 list, for that matter. If you want to know whether the subscribe
 succeeded, wait for a couple of hours after you get a reply from the
 mailing list software saying it did. You'll undoubtedly get a number
 of list messages. If you want to know whether you can post, you must
 have something important to say, right? After you have read the
 following paragraphs, compose a real letter, not a test message, in an
 editor, saving the body of the letter in the off chance your post
 doesn't succeed. Then post your letter to lkml. Please remember that
 there are quite a number of subscribers, and it will take a while for
 your letter to be reflected back to you. An hour is not too long to
 wait.


Re: [PATCH] Btrfs: return failure if btrfs_dev_replace_finishing() failed

2014-10-12 Thread Miao Xie
Guan

On Sat, 11 Oct 2014 14:45:29 +0800, Eryu Guan wrote:
 device replace could fail due to another running scrub process, but this
 failure doesn't get returned to userspace.

 The following steps could reproduce this issue

mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
mount /dev/sdb1 /mnt/btrfs
while true; do
btrfs scrub start -B /mnt/btrfs > /dev/null 2>&1
done &
btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
# if this replace succeeded, do the following and repeat until
# you see this log in dmesg
# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
#btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

# once you see the error log in dmesg, check return value of
# replace
echo $?

 Also only WARN_ON if the return code is not -EINPROGRESS.

 Signed-off-by: Eryu Guan guane...@gmail.com

 Ping, any comments on this patch?

 Thanks,
 Eryu
 ---
  fs/btrfs/dev-replace.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

 diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
 index eea26e1..44d32ab 100644
 --- a/fs/btrfs/dev-replace.c
 +++ b/fs/btrfs/dev-replace.c
 @@ -418,9 +418,11 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
   &dev_replace->scrub_progress, 0, 1);
  
  ret = btrfs_dev_replace_finishing(root->fs_info, ret);
 -  WARN_ON(ret);
 +  /* don't warn if EINPROGRESS, someone else might be running scrub */
 +  if (ret != -EINPROGRESS)
 +  WARN_ON(ret);

 picky comment

 I prefer WARN_ON(ret && ret != -EINPROGRESS).
 
 Yes, this is simpler :)

  
 -  return 0;
 +  return ret;

 here we will return -EINPROGRESS if scrub is running; I think it is better that
 we assign some special number to args->result and then return 0, just like
 the case where a device replace is already running.
 
 Seems that requires a new result type, say,
 
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS  3
 
 and assign this result to args->result if btrfs_scrub_dev() returned 
 -EINPROGRESS
 
 But I don't think returning 0 unconditionally is a good idea, since
 btrfs_dev_replace_finishing() could return other errors too, that way
 these errors will be lost, and userspace still won't catch the
 errors ($? is 0)

Of course.
Maybe my explanation above was not so clear. In fact, I was only talking about
the EINPROGRESS case; for the other cases, returning the error code is better.

 What I'm thinking about is something like:
 
   ret = btrfs_scrub_dev(...);
    ret = btrfs_dev_replace_finishing(root->fs_info, ret);
   if (ret == -EINPROGRESS) {
    args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
   ret = 0;
   } else {
   WARN_ON(ret);
   }
 
   return ret;
 
 What do you think? If no objection I'll work on v2.

I like it.

Thanks
Miao

 Thanks for your review!
 
 Eryu

 Thanks
 Miao

  
  leave:
  dev_replace->srcdev = NULL;
 @@ -538,7 +540,7 @@ static int btrfs_dev_replace_finishing(struct 
 btrfs_fs_info *fs_info,
btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
  mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
  
 -  return 0;
 +  return scrub_ret;
}
  
printk_in_rcu(KERN_INFO
 -- 
 1.8.3.1



 .
 



Re: btrfs: kernel BUG at fs/btrfs/extent_io.c:676!

2014-10-12 Thread Sasha Levin
Ping?

This BUG_ON()ing due to GFP_ATOMIC allocation failure is really silly :(

On 03/23/2014 09:26 PM, Sasha Levin wrote:
 Hi all,
 
 While fuzzing with trinity inside KVM tools guest running latest -next kernel
 I've stumbled on the following spew.
 
 This is a result of a failed allocation in alloc_extent_state_atomic() which
 triggers a BUG_ON when the return value is NULL. It's a bit weird that it
 BUGs on failed allocations, since it's obviously not a critical failure.
 
 [  447.705167] kernel BUG at fs/btrfs/extent_io.c:676!
 [  447.706201] invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC
 [  447.707732] Dumping ftrace buffer:
 [  447.708473](ftrace buffer empty)
 [  447.709684] Modules linked in:
 [  447.710246] CPU: 17 PID: 4195 Comm: kswapd17 Tainted: GW 
 3.14.0-rc7-next-20140321-sasha-00018-g0516fe6-dirty #265
 [  447.710253] task: 88066be9b000 ti: 88066be82000 task.ti: 
 88066be82000
 [  447.710253] RIP:  clear_extent_bit (fs/btrfs/extent_io.c:676)
 [  447.710253] RSP: :88066be83768  EFLAGS: 00010246
 [  447.710253] RAX:  RBX: 00d00fff RCX: 
 0006
 [  447.710253] RDX: 58e0 RSI: 88066be9bd60 RDI: 
 0286
 [  447.710253] RBP: 88066be837e8 R08:  R09: 
 
 [  447.710253] R10: 0001 R11: 454a4e495f544c55 R12: 
 01ff
 [  447.710253] R13:  R14: 88007b89fd08 R15: 
 00d0
 [  447.710253] FS:  () GS:8804acc0() 
 knlGS:
 [  447.710253] CS:  0010 DS:  ES:  CR0: 8005003b
 [  447.710253] CR2: 02aec968 CR3: 05e29000 CR4: 
 06a0
 [  447.710253] DR0: 00698000 DR1: 00698000 DR2: 
 
 [  447.710253] DR3:  DR6: 0ff0 DR7: 
 0600
 [  447.710253] Stack:
 [  447.710253]  88066be83788 844fc4d5  
 8804ab4800e8
 [  447.710253]   0001 8804ab4800c8 
 fbf7
 [  447.710253]  88066be837c8  0006 
 ea0007aaf340
 [  447.710253] Call Trace:
 [  447.710253]  ? _raw_spin_unlock (arch/x86/include/asm/preempt.h:98 
 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
 [  447.710253]  try_release_extent_mapping (fs/btrfs/extent_io.c:3998 
 fs/btrfs/extent_io.c:4058)
 [  447.710253]  __btrfs_releasepage (fs/btrfs/inode.c:7521)
 [  447.710253]  btrfs_releasepage (fs/btrfs/inode.c:7534)
 [  447.710253]  try_to_release_page (mm/filemap.c:2984)
 [  447.710253]  invalidate_inode_page (mm/truncate.c:165 mm/truncate.c:215)
 [  447.710253]  invalidate_mapping_pages (mm/truncate.c:517)
 [  447.710253]  inode_lru_isolate (arch/x86/include/asm/current.h:14 
 include/linux/swap.h:33 fs/inode.c:724)
 [  447.710253]  ? insert_inode_locked (fs/inode.c:687)
 [  447.710253]  list_lru_walk_node (mm/list_lru.c:89)
 [  447.710253]  prune_icache_sb (fs/inode.c:759)
 [  447.710253]  super_cache_scan (fs/super.c:96)
 [  447.710253]  shrink_slab_node (mm/vmscan.c:306)
 [  447.710253]  shrink_slab (mm/vmscan.c:381)
 [  447.710253]  kswapd_shrink_zone (mm/vmscan.c:2909)
 [  447.710253]  kswapd (mm/vmscan.c:3090 mm/vmscan.c:3296)
 [  447.710253]  ? mem_cgroup_shrink_node_zone (mm/vmscan.c:3213)
 [  447.710253]  kthread (kernel/kthread.c:219)
 [  447.710253]  ? __tick_nohz_task_switch 
 (arch/x86/include/asm/paravirt.h:809 kernel/time/tick-sched.c:272)
 [  447.710253]  ? kthread_create_on_node (kernel/kthread.c:185)
 [  447.710253]  ret_from_fork (arch/x86/kernel/entry_64.S:555)
 [  447.710253]  ? kthread_create_on_node (kernel/kthread.c:185)
 [  447.710253] Code: e9 a9 00 00 00 0f 1f 00 48 39 c3 0f 82 87 00 00 00 4c 39 
 e3 0f 83 7e 00 00 00 48 8b 7d a0 e8 45 ef ff ff 48 85 c0 49 89 c5 75 05 0f 
 0b 0f 1f 00 48 8b 7d b0 48 8d 4b 01 48 89 c2 4c 89 f6 e8 c5
 [  447.710253] RIP  clear_extent_bit (fs/btrfs/extent_io.c:676)
 [  447.710253]  RSP 88066be83768
 
 
 Thanks,
 Sasha



Re: what is the best way to monitor raid1 drive failures?

2014-10-12 Thread Anand Jain


Suman,

 To simulate the failure, I detached one of the drives from the system.
 After that, I see no sign of a problem except for these errors:

 Are you physically pulling out the device? I wonder if lsblk or blkid
 shows the error? The logic for reporting a missing device is in the progs (so
 make sure you have the latest), and it works provided user-space tools such as
 blkid/lsblk also report the problem. Or, for soft-detach tests, you could use
 devmgt at http://github.com/anajain/devmgt

 Also, I am trying to get a device management framework into btrfs
 with better device management and reporting.
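
 In the meantime, something along these lines can be polled from cron,
 assuming your btrfs-progs is new enough to have device stats:

  btrfs device stats /mnt          # per-device write/read/flush/corruption/generation error counters
  btrfs fi show /mnt               # look for a device reported as missing
  dmesg | grep -i 'btrfs.*errs'    # the same counters as they show up in the kernel log

 Non-zero counters, or a device that has disappeared from lsblk, are the
 signals to act on (paths here are placeholders for your mount point).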

Thanks,  Anand


On 10/13/14 07:50, Suman C wrote:

Hi,

I am testing some disk failure scenarios in a 2 drive raid1 mirror.
They are 4GB each, virtual SATA drives inside virtualbox.

To simulate the failure, I detached one of the drives from the system.
After that, I see no sign of a problem except for these errors:

Oct 12 15:37:14 rock-dev kernel: btrfs: bdev /dev/sdb errs: wr 0, rd
0, flush 1, corrupt 0, gen 0
Oct 12 15:37:14 rock-dev kernel: lost page write due to I/O error on /dev/sdb

/dev/sdb is gone from the system, but btrfs fi show still lists it.

Label: raid1pool  uuid: 4e5d8b43-1d34-4672-8057-99c51649b7c6
 Total devices 2 FS bytes used 1.46GiB
 devid    1 size 4.00GiB used 2.45GiB path /dev/sdb
 devid    2 size 4.00GiB used 2.43GiB path /dev/sdc

I am able to read and write just fine, but do see the above errors in dmesg.

What is the best way to find out that one of the drives has gone bad?

Suman




Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

2014-10-12 Thread Qu Wenruo


 Original Message 
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

From: Filipe David Manana fdman...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-10-10 16:08

On Fri, Oct 10, 2014 at 3:39 AM, Qu Wenruo quwen...@cn.fujitsu.com wrote:

 Original Message 
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Filipe David Manana fdman...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-10-09 18:27

On Thu, Oct 9, 2014 at 1:28 AM, Qu Wenruo quwen...@cn.fujitsu.com wrote:

 Original Message 
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Filipe David Manana fdman...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-10-08 20:08

On Fri, Sep 19, 2014 at 1:31 AM, Qu Wenruo quwen...@cn.fujitsu.com wrote:

 Original Message 
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Filipe David Manana fdman...@gmail.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-09-18 21:16

On Wed, Sep 17, 2014 at 4:53 AM, Qu Wenruo quwen...@cn.fujitsu.com wrote:

The following commit enhanced merge_extent_mapping() to reduce
fragmentation in the extent map tree, but it can't handle the case where the
existing extent map lies before map_start:
51f39 btrfs: Use right extent length when inserting overlap extent map.

[BUG]
When the existing extent map's start is before map_start,
em->len will go negative, which will corrupt the extent map and make it fail
to insert the new extent map.
This can happen when someone gets a large extent map, but by the time it is
going to be inserted into the extent map tree, someone else has already
committed some writes and split the huge extent into small parts.
This sounds very deterministic to me.
Any reason not to add tests to the sanity tests that exercise
this/these case/cases?

Yes, thanks for letting me know.
Will add the test case for it soon.

Hi Qu,

Any progress on the test?

This is a very important one IMHO, not only because of the bad
consequences of the bug (extent map corruption, leading to all sorts
of chaos), but also because this problem was not found by the full
xfstests suite on several developer machines.

thanks

Still trying to reproduce it under xfstest framework.

That's the problem: apparently it wasn't reproducible (or at least
detectable) by anyone with xfstests.

I'll try to build a C program that behaves the same as filebench and see if
it works.
At least with filebench, it can be triggered within 60s and reproduces
100% of the time.

But even following the filebench randomrw behavior (1 thread doing random
reads and 1 thread doing random writes on preallocated space),
I still failed to reproduce it.

Still investigating how to reproduce it.
Worst case, I may add a new C program into src/ of xfstests?

How about the sanity tests (fs/btrfs/tests/*.c)? Create an empty map
tree, add some extent maps, then try to merge some new extent maps
that used to fail before this fix. Seems simple, no?
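(A rough sketch of the kind of sanity test suggested here, assuming the extent_map helpers from fs/btrfs/extent_map.h and a hypothetical test_merge_extent_mapping() hook, since merge_extent_mapping() itself is static in inode.c:)

#include "../ctree.h"
#include "../extent_map.h"
#include "btrfs-tests.h"

int btrfs_test_extent_map_merge(void)
{
	struct extent_map_tree em_tree;
	struct extent_map *existing, *em;
	int ret;

	extent_map_tree_init(&em_tree);

	/* An existing map that starts *before* the range we will ask for. */
	existing = alloc_extent_map();
	if (!existing)
		return -ENOMEM;
	existing->start = 0;
	existing->len = 16384;
	existing->block_start = 0;
	existing->block_len = 16384;

	write_lock(&em_tree.lock);
	ret = add_extent_mapping(&em_tree, existing, 0);
	write_unlock(&em_tree.lock);
	if (ret)
		goto out_existing;

	/* A big map covering [0, 64k), as btrfs_get_extent() might build it. */
	em = alloc_extent_map();
	if (!em) {
		ret = -ENOMEM;
		goto out_existing;
	}
	em->start = 0;
	em->len = 65536;
	em->block_start = 0;
	em->block_len = 65536;

	/*
	 * Hypothetical hook wrapping the static merge_extent_mapping():
	 * merge em against the tree for map_start = 32768 (so that
	 * existing->start < map_start) and check that em->start/em->len
	 * come out sane - no underflowed length, no overlap.
	 */
	ret = test_merge_extent_mapping(&em_tree, existing, em, 32768);

	free_extent_map(em);
out_existing:
	free_extent_map(existing);
	return ret;
}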

thanks Qu

It needs concurrent reads and writes (commits) to trigger it; I am not sure
it can be reproduced in the sanity tests,
since they don't seem to commit anything and lack multithreading facilities.

Hum?
Why does concurrency or persistence matter?

Let's review the problem.
So you fixed the function inode.c:merge_extent_mapping(). That
function merges a new extent map (not in the extent map tree) with an
existing extent map (which is in the tree).
The issue was that the merge was incorrect for some cases - producing
a bad extent map (compared to the rest of the existing extent maps)
that either overlaps existing ones or introduces incorrect gaps, etc. -
the exact reason doesn't really matter.
Now, this function is run while holding the write lock of the inode's
extent map tree.
So why does concurrency (or persistence) matter here?

It is true that the patch only fixed the above merge problem.

But the bug involves more.
1) The direct cause.
The existing extent map's start is smaller than map_start in
merge_extent_mapping().


2) The root cause.
As described in my V2 patch, there is a window between
btrfs_release_path() and write_lock(&em_tree->lock)
in btrfs_get_extent(). Under concurrency, one may get a big extent
map converted from an on-disk file extent,
and during that window, a commit happens and the original extent map is
split into several small ones.


So 1) will happen and cause the bug.

At least the reporter's filebench reproducer can be explained as above,
and that's why concurrency is needed to trigger the bug under such
circumstances.

Why can't we have a sanity test that simply reproduces a scenario
where immediately after attempting to merge extent maps, we get an
(in-memory) extent map that is incorrect?

There are other situations that trigger the bug (like the mail below),
but the above known circumstance needs concurrency to let the commit happen.
Re: btrfs send and kernel 3.17

2014-10-12 Thread David Arendt
Some more info I thought of. For me, the corruption problem seems not
to be send related but snapshot creation related. On machine 2, send was
never used. However, both filesystems are stored on SSDs (of different
brands). Another filesystem stored on a normal HDD didn't experience the
problem. Maybe this is pure coincidence and has nothing to do with the
fact that it is on SSD or HDD. Another thing I noticed is that for me,
the problem only seems to occur for root subvolumes with many small
files. I have no root subvolumes on HDD, so it might not be SSD related.

On 10/12/2014 11:35 PM, David Arendt wrote:
 Just to let you know, I just tried an ls -l on 2 machines running kernel
 3.17 and btrfs-progs 3.16.2.

 Here is my ls -l output:

 Machine 1:
 ls: cannot access root.20141009.000503.backup: Cannot allocate memory
 total 0
 d? ? ?  ? ?? root.20141009.000503.backup
 drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141012.095526.backup
 drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141012.000503.backup
 drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141011.000502.backup
 drwxr-xr-x 1 root   root182 Oct  7 20:35 root.20141010.000502.backup

 root.20141009.000503.backup is not deletable.

 Machine 2:
 ls: cannot access root.20141006.003239.backup: Cannot allocate memory
 ls: cannot access root.20141007.001616.backup: Cannot allocate memory
 ls: cannot access root.20141008.000501.backup: Cannot allocate memory
 ls: cannot access root.20141009.052436.backup: Cannot allocate memory
 total 0
 d? ? ??  ?? root.20141009.052436.backup
 d? ? ??  ?? root.20141008.000501.backup
 d? ? ??  ?? root.20141007.001616.backup
 d? ? ??  ?? root.20141006.003239.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140925.001125.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140924.001017.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140923.001008.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140922.001836.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140921.001029.backup
 drwxr-xr-x 1 root root 232 Aug  3 15:00 root.20140920.001020.backup

 The ? ones are also not deletable.

 Both machines are giving transid verify failed errors.

 I verified my logfiles, and this problem was never there using previous
 kernel versions. On machine 1, it is also certain that it was not any
 previous corruption, as this filesystem was also created with
 btrfs-progs 3.16.2 using kernel 3.17.

 On 10/12/2014 05:24 PM, john terragon wrote:
 Hi.

 I just wanted to confirm David's story so to speak :)

 -kernel 3.17-rc7 (didn't bother to compile 3.17 as there weren't any
 btrfs fixes, I think)

 -btrfs-progs 3.16.2 (also compiled from source, so no
 distribution-specific patches)

 -fresh fs

 -I get the same two errors David got (first I got the I/O error one
 and then the memory allocation one)

 -plus now when I ls -la the fs top volume this is what I get

 drwxrwsr-x 1 root staff  30 Sep 11 16:15 home
 d? ? ??   ?? home-backup
 drwxr-xr-x 1 root root  250 Oct 10 15:37 root
 d? ? ??   ?? root-backup
 drwxr-xr-x 1 root root   88 Sep 15 16:02 vms
 drwxr-xr-x 1 root root   88 Sep 15 16:02 vms-backup

 yes, the question marks on those two *-backup snapshots are really
 there. I can't access the snapshots, I can't delete them, I can't do
 anything with them.

 -btrfs check segfaults

 -the events that led to this situation are these:
  1) btrfs su snap -r root root-backup
  2) send | receive (the entire root-backup, not an incremental send)
  immediate I/O error
  3) move on to home: btrfs su snap -r home home-backup
  4) send|receive (again not an incremental send)
  everything goes well (!)
  5) retry with root: btrfs su snap -r root root-backup
  6) send|receive
  and it goes seemingly well
  7) apt-get dist-upgrade just to modify root and try an incremental send
  8) reboot after the dist-upgrade
  9) ls -la the fs top volume: first I get the memory allocation error
 and after that
   any ls -la gives the output I pasted above. (Notice that besides
 the ls -la, the two snapshots were not touched in any way since the two
 send|receive.)

 A few final notes. I haven't tried send/receive in a while (they were
 unreliable), so I can't tell which was the last version in which they worked
 for me (well, no version actually :) ).
 I've never had any problem with just snapshots. I make them regularly,
 I use them, I modify them and I've never had one problem (with 3.17
 too, it's just send/receive that murders them).

 Best regards

 John

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] Btrfs: return failure if btrfs_dev_replace_finishing() failed

2014-10-12 Thread Eryu Guan
device replace could fail due to another running scrub process or any
other error btrfs_scrub_dev() may hit, but this failure doesn't get
returned to userspace.

The following steps could reproduce this issue

mkfs -t btrfs -f /dev/sdb1 /dev/sdb2
mount /dev/sdb1 /mnt/btrfs
while true; do btrfs scrub start -B /mnt/btrfs >/dev/null 2>&1; done &
btrfs replace start -Bf /dev/sdb2 /dev/sdb3 /mnt/btrfs
# if this replace succeeded, do the following and repeat until
# you see this log in dmesg
# BTRFS: btrfs_scrub_dev(/dev/sdb2, 2, /dev/sdb3) failed -115
#btrfs replace start -Bf /dev/sdb3 /dev/sdb2 /mnt/btrfs

# once you see the error log in dmesg, check return value of
# replace
echo $?

Introduce a new dev replace result

BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS

to catch -EINPROGRESS explicitly and return other errors directly to
userspace.

Signed-off-by: Eryu Guan guane...@gmail.com
---

v2:
- set result to SCRUB_INPROGRESS if btrfs_scrub_dev returned -EINPROGRESS
  and return 0 as Miao Xie suggested

 fs/btrfs/dev-replace.c     | 12 +++++++++---
 include/uapi/linux/btrfs.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index eea26e1..a141f8b 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -418,9 +418,15 @@ int btrfs_dev_replace_start(struct btrfs_root *root,
   &dev_replace->scrub_progress, 0, 1);
 
	ret = btrfs_dev_replace_finishing(root->fs_info, ret);
-   WARN_ON(ret);
+   /* don't warn if EINPROGRESS, someone else might be running scrub */
+   if (ret == -EINPROGRESS) {
+   args->result = BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS;
+   ret = 0;
+   } else {
+   WARN_ON(ret);
+   }
 
-   return 0;
+   return ret;
 
 leave:
	dev_replace->srcdev = NULL;
@@ -538,7 +544,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
btrfs_destroy_dev_replace_tgtdev(fs_info, tgt_device);
	mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 
-   return 0;
+   return scrub_ret;
}
 
printk_in_rcu(KERN_INFO
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 2f47824..611e1c5 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -157,6 +157,7 @@ struct btrfs_ioctl_dev_replace_status_params {
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR	0
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED 1
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED 2
+#define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS	3
 struct btrfs_ioctl_dev_replace_args {
__u64 cmd;  /* in */
__u64 result;   /* out */
-- 
1.8.3.1
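(For reference, a minimal userspace sketch of how a caller might observe the new result code after this change; it assumes BTRFS_IOC_DEV_REPLACE and the dev-replace structs/constants from include/uapi/linux/btrfs.h, and the mount point and device paths are placeholders.)

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

int main(int argc, char **argv)
{
	struct btrfs_ioctl_dev_replace_args args;
	int fd;

	if (argc < 4) {
		fprintf(stderr, "usage: %s <mnt> <srcdev> <tgtdev>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(&args, 0, sizeof(args));
	args.cmd = BTRFS_IOCTL_DEV_REPLACE_CMD_START;
	strncpy((char *)args.start.srcdev_name, argv[2], BTRFS_DEVICE_PATH_NAME_MAX);
	strncpy((char *)args.start.tgtdev_name, argv[3], BTRFS_DEVICE_PATH_NAME_MAX);

	/* With this patch, real scrub/replace failures make the ioctl itself fail... */
	if (ioctl(fd, BTRFS_IOC_DEV_REPLACE, &args) < 0) {
		perror("BTRFS_IOC_DEV_REPLACE");
		return 1;
	}
	/* ...while a concurrently running scrub is reported via args.result. */
	if (args.result == BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS)
		printf("scrub is in progress, replace not started\n");
	return 0;
}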

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs-progs: add new dev replace result

2014-10-12 Thread Eryu Guan
A new dev replace result was introduced by kernel commit

Btrfs: return failure if btrfs_dev_replace_finishing() failed

Make the userspace know about the new result too.

Signed-off-by: Eryu Guan guane...@gmail.com
---
 cmds-replace.c | 2 ++
 ioctl.h        | 1 +
 2 files changed, 3 insertions(+)

diff --git a/cmds-replace.c b/cmds-replace.c
index 9fe7ad8..7a45cef 100644
--- a/cmds-replace.c
+++ b/cmds-replace.c
@@ -53,6 +53,8 @@ static const char *replace_dev_result2string(__u64 result)
	return "not started";
	case BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED:
	return "already started";
+	case BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS:
+	return "scrub is in progress";
	default:
	return "illegal result value";
}
diff --git a/ioctl.h b/ioctl.h
index f0fc060..0e02fae 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -144,6 +144,7 @@ struct btrfs_ioctl_dev_replace_status_params {
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NO_ERROR	0
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_NOT_STARTED 1
 #define BTRFS_IOCTL_DEV_REPLACE_RESULT_ALREADY_STARTED 2
+#define BTRFS_IOCTL_DEV_REPLACE_RESULT_SCRUB_INPROGRESS	3
 struct btrfs_ioctl_dev_replace_args {
__u64 cmd;  /* in */
__u64 result;   /* out */
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html