Re: [BUG] bogus out of space reported when mounted raid1 degraded

2014-07-24 Thread Duncan
Chris Murphy posted on Wed, 23 Jul 2014 19:13:10 -0600 as excerpted:


 On Jul 22, 2014, at 11:24 PM, Duncan 1i5t5.dun...@cox.net wrote:
 
 ** ON BTRFS RAID1, TWO DEVICES MUST BE PRESENT IN ORDER TO ALLOCATE
 NEW CHUNKS.  MOUNTING DEGRADED WITH A SINGLE DEVICE MEANS NO NEW CHUNK
 ALLOCATION, WHICH MEANS YOU'RE LIMITED TO FILLING UP EXISTING CHUNKS **
 
 I can confirm this behavior (see below).

And...

If I'm reading it correctly, patch 9/10 in Miao Xie's new 10-part patch 
series should fix that.

From: Miao Xie mi...@cn.fujitsu.com
Subject: [PATCH 09/10] Btrfs: don't consider the missing device when
 allocating new chunks
Date: Thu, 24 Jul 2014 11:37:14 +0800
Message-ID: 1406173035-29478-9-git-send-email-mi...@cn.fujitsu.com

While I don't claim to be a dev, based on the comments and my reading of 
the patch (and assuming no other code path blocks it and needs 
patching as well), that should allow new, effectively single-mode chunks 
to be allocated when a btrfs multi-device raid1 filesystem is 
mounted degraded with just a single device.  That should allow normal 
writes to continue, although in single mode, and a balance with 
-dconvert=raid1 -mconvert=raid1 can be used later to convert back to 
raid1 once a second device is added back in.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs-progs: remove author and copyright info from man page

2014-07-24 Thread Wang Shilong
From:
http://man7.org/linux/man-pages/man7/man-pages.7.html
...
AUTHORS lists authors of the documentation or program.  Use of
an AUTHORS section is strongly discouraged. Generally,
it is better not to clutter every page with a list of
(over time potentially numerous) authors; if you write
or significantly amend a page, add a copyright notice
as a comment in the source file.  If you are the author
of a device driver and want to include an address for
reporting bugs, place this under the BUGS section.
...

Suggested-by: Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com
Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
---
 Documentation/btrfs-convert.txt | 12 
 Documentation/btrfs-debug-tree.txt  | 12 
 Documentation/btrfs-find-root.txt   | 12 
 Documentation/btrfs-image.txt   | 12 
 Documentation/btrfs-map-logical.txt | 12 
 Documentation/btrfs-show-super.txt  | 12 
 Documentation/btrfs-zero-log.txt| 12 
 Documentation/btrfstune.txt | 12 
 8 files changed, 96 deletions(-)

diff --git a/Documentation/btrfs-convert.txt b/Documentation/btrfs-convert.txt
index 1eff0bf..11d6044 100644
--- a/Documentation/btrfs-convert.txt
+++ b/Documentation/btrfs-convert.txt
@@ -31,18 +31,6 @@ EXIT STATUS
 *btrfs-convert* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 http://gnu.org/licenses/gpl.html.
-
-This is free software: you are free  to  change  and  redistribute  it. There is NO WARRANTY, to the extent permitted by law.
-
 SEE ALSO
 
 `mkfs.btrfs`(8)
diff --git a/Documentation/btrfs-debug-tree.txt b/Documentation/btrfs-debug-tree.txt
index bfc8aa4..23fc115 100644
--- a/Documentation/btrfs-debug-tree.txt
+++ b/Documentation/btrfs-debug-tree.txt
@@ -33,18 +33,6 @@ EXIT STATUS
 *btrfs-debug-tree* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 http://gnu.org/licenses/gpl.html.
-
-This is free software: you are free  to  change  and  redistribute  it. There is NO WARRANTY, to the extent permitted by law.
-
 SEE ALSO
 
 `mkfs.btrfs`(8)
diff --git a/Documentation/btrfs-find-root.txt b/Documentation/btrfs-find-root.txt
index a360f8f..c934b4c 100644
--- a/Documentation/btrfs-find-root.txt
+++ b/Documentation/btrfs-find-root.txt
@@ -28,18 +28,6 @@ EXIT STATUS
 *btrfs-find-root* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 http://gnu.org/licenses/gpl.html.
-
-This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
-
 SEE ALSO
 
 `mkfs.btrfs`(8)
diff --git a/Documentation/btrfs-image.txt b/Documentation/btrfs-image.txt
index 155194a..b7751f9 100644
--- a/Documentation/btrfs-image.txt
+++ b/Documentation/btrfs-image.txt
@@ -56,18 +56,6 @@ EXIT STATUS
 *btrfs-image* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 http://gnu.org/licenses/gpl.html.
-
-This is free software: you are free  to  change  and  redistribute  it. There is NO WARRANTY, to the extent permitted by law.
-
 SEE ALSO
 
 `mkfs.btrfs`(8)
diff --git a/Documentation/btrfs-map-logical.txt b/Documentation/btrfs-map-logical.txt
index a8710bf..a3d110c 100644
--- a/Documentation/btrfs-map-logical.txt
+++ b/Documentation/btrfs-map-logical.txt
@@ -32,18 +32,6 @@ EXIT STATUS
 *btrfs-map-logical* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 http://gnu.org/licenses/gpl.html.
-
-This is free software: you are free  to  change  and  redistribute  it. There is NO WARRANTY, to the extent permitted by law.
-
 SEE ALSO
 
 `mkfs.btrfs`(8)
diff --git a/Documentation/btrfs-show-super.txt b/Documentation/btrfs-show-super.txt
index 6fee0f1..1646be3 100644
--- a/Documentation/btrfs-show-super.txt
+++ b/Documentation/btrfs-show-super.txt
@@ -45,18 +45,6 @@ EXIT STATUS
 *btrfs-show-super* will return 0 if no error happened.
 If any problems happened, 1 will be returned.
 
-AUTHOR
---
-Written by Shilong Wang and Wenruo Qu.
-
-COPYRIGHT
--
-Copyright (C) 2013 FUJITSU LIMITED.
-
-License GPLv2: GNU GPL version 2 

Re: feature request: consider rw subvols ro for send when volume is mounted ro

2014-07-24 Thread David Sterba
On Wed, Jul 23, 2014 at 01:47:36PM -0700, Zach Brown wrote:
 On Wed, Jul 23, 2014 at 02:10:29PM -0600, Chris Murphy wrote:
  The use case is when it's possible to mount a Btrfs volume ro, but not rw. 
  Example, a situation where
  
  # mount -o degraded /dev/sdb /mnt
  [   71.064352] BTRFS info (device sdb): allowing degraded mounts
  [   71.064812] BTRFS info (device sdb): enabling auto recovery
  [   71.065210] BTRFS info (device sdb): disk space caching is enabled
  [   71.072068] BTRFS warning (device sdb): devid 2 missing
  [   71.097320] BTRFS: too many missing devices, writeable mount is not allowed
  [   71.116616] BTRFS: open_ctree failed
  
  Yet this works:
  # mount -o degraded,ro /dev/sdb /mnt
  
  It would be great if it were possible to send/receive subvolumes to a
  different btrfs volume. Currently it's not possible because those
  subvols aren't ro, and because the mount is ro I can't make ro
  snapshots first.
 
 I wonder if that's as easy as the following totally untested hack.  I
 have no idea if a read-only mount would still allow background
 modification that might violate the send code's assumptions.

RO mount tries hard not to do any writes (e.g. from the background
threads); however, a remount to RW during send would succeed and any
writes to the sent subvolume may (and most probably will) cause lots of
fun.

This could use protection similar to what the subvolumes have; the use case
'allow to send any subvolume on a RO mount' seems valid to me. The failure of
remount,rw is not silent and the user is able to decide what to do next
(stop send, or postpone remount). Remount may fail for other reasons so
I think we're not adding any unexpected surprises.


Re: [PATCH 1/6] Btrfs: fix wrong skipping compression for an inode

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:09AM +0800, Wang Shilong wrote:
 If a file's compression ratio is bad, we set the NOCOMPRESS
 flag for it, and compression is skipped for that inode next time.
 
 However, if we remount the fs with COMPRESS_FORCE, it should still try
 to compress pages for that inode; this patch fixes the wrong
 check for this problem.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz

The documented compression precedence
https://btrfs.wiki.kernel.org/index.php/Compression#What.27s_the_precedence_of_all_the_options_affecting_compression.3F

matches the way you've fixed it.
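
The precedence described above can be sketched in userspace C. This is a
rough model only, not the kernel's code: the struct, its field names and
should_try_compress() are made up for illustration, and the relative
order of the two per-inode flags is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the mount options and per-inode flags. */
struct compress_state {
	bool mount_compress;        /* mounted with compress */
	bool mount_compress_force;  /* mounted with compress-force */
	bool inode_nocompress;      /* NOCOMPRESS, set after a bad ratio */
	bool inode_compress;        /* per-inode request to compress */
};

/* Returns true when a write to the inode should attempt compression. */
static bool should_try_compress(const struct compress_state *s)
{
	/* Forced compression wins over everything, including NOCOMPRESS. */
	if (s->mount_compress_force)
		return true;
	/* An earlier bad ratio disables further attempts. */
	if (s->inode_nocompress)
		return false;
	return s->mount_compress || s->inode_compress;
}
```

With only compress set, an inode flagged NOCOMPRESS is skipped; once the
filesystem is remounted with forced compression, the same inode is tried
again.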


Re: [PATCH 2/6] Btrfs: fall into nocompression codes quickly if possible

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:10AM +0800, Wang Shilong wrote:
 If the NOCOMPRESS flag is set, which means a bad compression ratio,
 we can avoid calling cow_file_range_async() for this case earlier.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz


Re: [PATCH 3/6] Btrfs: fix off-by-one in cow_file_range_inline()

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:11AM +0800, Wang Shilong wrote:
 Btrfs can still inline file data if its size is the same as the
 page size, so don't skip the max value here.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz


Re: [PATCH 4/6] Btrfs: fix wrong max inline data size limit

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:12AM +0800, Wang Shilong wrote:
 Inline data is stored starting at the offset of @disk_bytenr in
 struct btrfs_file_extent_item. So subtracting the total
 size of struct btrfs_file_extent_item is wrong; fix it.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz

  #define BTRFS_MAX_INLINE_DATA_SIZE(r) (BTRFS_LEAF_DATA_SIZE(r) - \
   sizeof(struct btrfs_item) - \
 - sizeof(struct btrfs_file_extent_item))
 + offsetof(struct btrfs_file_extent_item, disk_bytenr))

This increases the limit of inline data by 24 bytes but fortunately
does not break existing filesystems because
BTRFS_MAX_INLINE_DATA_SIZE is used at the time the inlining is decided.
IOW it is a bit pessimistic, the rest of the code uses the offsetof
value.
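
The sizeof vs offsetof distinction can be illustrated with a userspace
mirror of the structure. The field list below is an assumption about the
packed on-disk layout, written from memory, and the helper names are
invented; check ctree.h before relying on any of the numbers. The
packed attribute is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of the packed on-disk struct btrfs_file_extent_item. */
struct file_extent_item_mirror {
	uint64_t generation;
	uint64_t ram_bytes;
	uint8_t  compression;
	uint8_t  encryption;
	uint16_t other_encoding;
	uint8_t  type;
	/* Inline data starts here; the fields below describe regular extents. */
	uint64_t disk_bytenr;
	uint64_t disk_num_bytes;
	uint64_t offset;
	uint64_t num_bytes;
} __attribute__((packed));

/* Header bytes that precede inline data (what the patched macro subtracts). */
static size_t inline_header_size(void)
{
	return offsetof(struct file_extent_item_mirror, disk_bytenr);
}

/* Extra bytes the old macro reserved by subtracting the whole struct. */
static size_t old_macro_overreserve(void)
{
	return sizeof(struct file_extent_item_mirror) - inline_header_size();
}
```

With this assumed layout the gap equals the four trailing 64-bit fields;
whether that matches the figure quoted in the review depends on the
mirror being accurate.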


Re: [PATCH 5/6] Btrfs: fix wrong write range for filemap_fdatawrite_range()

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:13AM +0800, Wang Shilong wrote:
 filemap_fdatawrite_range() expects the third arg to be @end,
 not @len; fix it.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz

Good catch.
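
For readers unfamiliar with the convention, here is a small sketch of
the end-vs-length mix-up, assuming the end offset is inclusive. The
struct and helper names are hypothetical and nothing here calls the
kernel function itself.

```c
#include <stdint.h>

/* A byte range expressed as first byte and inclusive last byte. */
struct byte_range {
	int64_t start;
	int64_t end;
};

/* Converts (start, len) to an inclusive range. len must be greater than 0. */
static struct byte_range range_from_len(int64_t start, int64_t len)
{
	struct byte_range r = { start, start + len - 1 };
	return r;
}

/* The buggy form: the length is passed where the end offset belongs. */
static struct byte_range range_with_len_as_end(int64_t start, int64_t len)
{
	struct byte_range r = { start, len };
	return r;
}
```

For start = 8192 and len = 4096 the correct range ends at 12287, while
the buggy form ends at 4096, which lies before the start, so nothing in
the intended range would be covered.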


Re: [PATCH 6/6] Btrfs: fix wrong extent mapping for DirectIO

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 11:44:14AM +0800, Wang Shilong wrote:
 btrfs_next_leaf() will use the current leaf's last key to search
 and then return a bigger one. So it may still return a file extent
 item that is smaller than the expected value and we will
 get an overflow here for @em->len.
 
 This is easy to reproduce with Btrfs direct writing; it did not
 cause any problem, because writing will re-insert the right mapping later.
 
 However, by hacking the code to make DIO support compression, the wrong extent
 mapping is kept and it encounters a merging failure (EEXIST) quickly.

So this cannot happen normally (because compression and DIO do not work
together)?

 Fix this problem by looping to find next file extent item that is bigger
 than @start or we could not find anything more.
 
 Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz


[PATCH] Btrfs: btrfs_put_tree_mod_seq: Don't delete an entry whose seq number is min_seq.

2014-07-24 Thread Chandan Rajendra
The current code allows a tree mod log entry whose seq number is equal to
min_seq to be deleted. Fix this.

Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com
---
 fs/btrfs/ctree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index aeab453..49a0df6 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -428,7 +428,7 @@ void btrfs_put_tree_mod_seq(struct btrfs_fs_info *fs_info,
 	for (node = rb_first(tm_root); node; node = next) {
 		next = rb_next(node);
 		tm = container_of(node, struct tree_mod_elem, node);
-		if (tm->seq > min_seq)
+		if (tm->seq >= min_seq)
 			continue;
 		rb_erase(node, tm_root);
 		kfree(tm);
-- 
1.8.3.1
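
The effect of the changed comparison can be shown with a plain array
instead of the rb-tree. This is only an illustration of the boundary
condition; prune_below_min_seq() is a made-up name and does not exist in
the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compacts seqs[] in place, dropping entries strictly below min_seq and
 * keeping entries equal to or above it. Returns the number kept.
 */
static size_t prune_below_min_seq(uint64_t *seqs, size_t n, uint64_t min_seq)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		/* Patched check: ">=" keeps the entry whose seq equals min_seq. */
		if (seqs[i] >= min_seq)
			seqs[kept++] = seqs[i];
	}
	return kept;
}
```

With entries {3, 5, 7} and min_seq = 5, the entry 5 survives; with the
old strict comparison it would have been dropped along with 3.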



Re: [PATCH] btrfs: Return right extent when fiemap gives unaligned offset and len.

2014-07-24 Thread David Sterba
On Fri, Jul 18, 2014 at 09:55:43AM +0800, Qu Wenruo wrote:
 When a page-aligned start and len are passed to extent_fiemap(), the result is
 good, but when start and len are not aligned, e.g. start = 1 and len =
 4095 are passed to extent_fiemap(), it returns no extent.
 
 The problem is that start and len are both rounded down, which causes the
 problem.

ALIGN rounds up, not down. So the wrong rounding uses an incorrect start
(4096) and finds no extents if there's e.g. only one, [0,4095].
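
The rounding in question can be reproduced in userspace. The helpers
below are local stand-ins for the kernel's ALIGN and round_down macros
(the names align_up, align_down and widen_to_alignment are mine), and
they assume the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds x down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Rounds x up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

struct aligned_range {
	uint64_t start;
	uint64_t len;
};

/* Widens [start, start + len) outward so both ends are aligned. */
static struct aligned_range widen_to_alignment(uint64_t start, uint64_t len,
					       uint64_t align)
{
	struct aligned_range r;

	r.start = align_down(start, align);
	r.len = align_up(start + len, align) - r.start;
	return r;
}
```

For start = 1, len = 4095 and a 4096-byte alignment, rounding the start
up gives 4096 and misses an extent at [0,4095], while widening outward
gives start 0 and length 4096.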

 This patch will round down start and round up (start + len) to
 return right extent.
 
 Reported-by: Chandan Rajendra chan...@linux.vnet.ibm.com
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz


Re: [PATCH 1/2] btrfs: Call mount_subtree() even 'subvolid=' mount option is given.

2014-07-24 Thread David Sterba
On Wed, Jul 16, 2014 at 12:07:10PM +0800, Qu Wenruo wrote:
 btrfs uses different routines to handle the 'subvolid=' and 'subvol=' mount
 options.
 Given the 'subvol=' mount option, btrfs will mount btrfs first and then call
 mount_subtree() to mount a subtree of btrfs, making vfs handle the path
 searching.
 This is good since the vfs layer knows exactly that a subtree mount is done
 and findmnt(8) knows which subtree is mounted.
 
 However, when using the 'subvolid=' mount option, btrfs will do all the
 internal subvolume objectid searching and checking, making VFS unaware
 of which subtree is mounted; as a result, findmnt(8) can't show any
 useful subtree mount info to end users.
 
 This patch will use the root backref to reverse search the subvolume
 path for a given subvolid, making findmnt(8) work again.
 
 Reported-by: Stefan G.Weichinger li...@xunil.at
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com

Ack for unifying the way subvol= and subvolid= are handled, but I don't
like some aspects of the implementation.

The kmalloc/krealloc makes it really complicated and is not imho
necessary. The mount options length is limited to PAGE_SIZE in the vfs
code. Do the same here, allocate a page, filter the options, do the
necessary processing and just check for overflows.
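
One way to read the fixed-buffer suggestion, as a userspace sketch
rather than a proposed implementation: rewrite_subvol_opts() and
OPTS_BUF_SIZE are invented names, the buffer size merely stands in for
PAGE_SIZE, and the replacement option is appended at the end instead of
being substituted in place as the patch does.

```c
#include <stdio.h>
#include <string.h>

#define OPTS_BUF_SIZE 4096 /* stand-in for PAGE_SIZE */

/*
 * Copies the comma-separated options in args into out, dropping any
 * "subvol=" or "subvolid=" option, then appends "subvolid=<id>".
 * Returns 0 on success or -1 if the result would not fit in out_size.
 */
static int rewrite_subvol_opts(const char *args, unsigned long long id,
			       char *out, size_t out_size)
{
	size_t used = 0;
	const char *p = args ? args : "";
	int n;

	while (*p) {
		size_t tok = strcspn(p, ",");

		if (tok > 0 && strncmp(p, "subvol=", 7) != 0 &&
		    strncmp(p, "subvolid=", 9) != 0) {
			/* Token plus trailing comma must leave room for '\0'. */
			if (used + tok + 1 >= out_size)
				return -1;
			memcpy(out + used, p, tok);
			used += tok;
			out[used++] = ',';
		}
		p += tok;
		if (*p == ',')
			p++;
	}

	n = snprintf(out + used, out_size - used, "subvolid=%llu", id);
	if (n < 0 || (size_t)n >= out_size - used)
		return -1;
	return 0;
}
```

Called with "rw,subvol=home,noatime" and id 0, this yields
"rw,noatime,subvolid=0". No length precomputation or reallocation is
needed; every write is checked against the fixed buffer size.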

You can drop u64_to_strlen.

 +#define CLEAR_SUBVOL 1
 +#define CLEAR_SUBVOLID   2

Though they're internal and local to the file, please add BTRFS_ prefix
at least.

  /*
 - * This will strip out the subvol=%s argument for an argument string and add
 - * subvolid=0 to make sure we get the actual tree root for path walking to the
 - * subvol we want.
 + * This will strip out the subvol=%s or subvolid=%s argument for an argument
 + * string and add subvolid=0 to make sure we get the actual tree root for path
 + * walking to the subvol we want.
   */
 -static char *setup_root_args(char *args)
 +static char *setup_root_args(char *args, int flags, u64 subvol_objectid)
  {
 - unsigned len = strlen(args) + 2 + 1;
 - char *src, *dst, *buf;
 + unsigned len;
 + char *src = NULL, *dst, *buf, *comma;

Please use the recommended style and put each on a separate line. I'm
not sure if you'll need all of them for the implementation without the
kmallocs, the comment applies generally.

 + char *subvol_string = "subvolid=";
 + int option_len = 0;
 +
 + if (!args) {
 + /* Case 1, not args, all default mounting
 +  * just return 'subvolid=FS_ROOT' */

Not the preferred style of comments.

 + len = strlen(subvol_string) +
 +   u64_to_strlen(subvol_objectid) + 1;
 + dst = kmalloc(len, GFP_NOFS);
 + if (!dst)
 + return NULL;
 + sprintf(dst, "%s%llu", subvol_string, subvol_objectid);
 + return dst;
 + }
  
 - /*
 -  * We need the same args as before, but with this substitution:
 -  * s!subvol=[^,]+!subvolid=0!
 -  *
 -  * Since the replacement string is up to 2 bytes longer than the
 -  * original, allocate strlen(args) + 2 + 1 bytes.
 -  */
 + switch (flags) {
 + case CLEAR_SUBVOL:
 + src = strstr(args, "subvol=");
 + break;
 + case CLEAR_SUBVOLID:
 + src = strstr(args, "subvolid=");
 + break;
 + }
  
 - src = strstr(args, "subvol=");
 - /* This shouldn't happen, but just in case.. */
 - if (!src)
 - return NULL;
 + if (!src) {
 + /* Case 2, some args, default subvolume mounting
 +  * just append ',subvolid=FS_ROOT' */
 +
 + /* 1 for ending '\0', 1 for leading ',' */
 + len = strlen(args) + strlen(subvol_string) +
 +   u64_to_strlen(subvol_objectid) + 2;
 + dst = kmalloc(len, GFP_NOFS);
 + if (!dst)
 + return NULL;
 + strcpy(dst, args);
 + sprintf(dst + strlen(args), ",%s%llu", subvol_string,
 + subvol_objectid);
 + return dst;
 + }
 +
 + /* Case 3, subvolid=/subvol=  mount
 +  * replace the 'subvolid/subvol' options to 'subvolid=FS_ROOT' */
 + comma = strchr(src, ',');
 + if (comma)
 + option_len = comma - src;
 + else
 + option_len = strlen(src);
 + len = strlen(args) - option_len  + strlen(subvol_string) +
 +   u64_to_strlen(subvol_objectid) + 1;
  
   buf = dst = kmalloc(len, GFP_NOFS);
   if (!buf)
 @@ -1154,28 +1208,126 @@ static char *setup_root_args(char *args)
   dst += strlen(args);
   }
  
 - strcpy(dst, "subvolid=0");
 - dst += strlen("subvolid=0");
 + len = sprintf(dst, "%s%llu", subvol_string, subvol_objectid);
 + dst += len;
  
   /*
* If there is a , after the original subvol=... string,
* copy that suffix into our buffer.  Otherwise, we're done.
*/
 - src = strchr(src, ',');
 - if (src)
 - 

Re: [PATCH] btrfs: Add show_path function for btrfs_super_ops.

2014-07-24 Thread David Sterba
On Mon, Jul 21, 2014 at 05:02:29PM +0800, Qu Wenruo wrote:
 The show_path() function in struct super_operations is used to output
 subtree mount info for mountinfo.
 Without an implementation of show_path(), the user cannot find where
 each subvolume is mounted when using the 'subvolid=' mount option.
 (When mounted with the 'subvol=' mount option, vfs is aware of the subtree
 mount and can do the path resolution by itself.)

Your previous patches unify both to call mount_subtree, then the default
vfs implementation of show_path will do the right thing, ie
seq_dentry(...), and the path will be resolved for free.

That means this patch is not needed, so I'll skip commenting on it.

 With this patch, end users will be able to use findmnt(8) or other
 programs reading mountinfo to find which btrfs subvolume is mounted.
 
 Though we use fs_info->subvol_sem to protect show_path() from subvolume
 destroying/creating, if the user renames/moves the parent non-subvolume
 dir of a subvolume, it is still possible that a race may happen and
 cause btrfs_search_slot() to fail to find the desired key.
 In that case, we just return -EBUSY and inform the user to try again, since
 extra locking like locking the whole subvolume tree is too expensive for
 such usage.

And the subvolume renames will be handled as well.


Re: [PATCH 01/10] Btrfs: Fix the problem that the replace destroys the seed filesystem

2014-07-24 Thread David Sterba
On Thu, Jul 24, 2014 at 11:37:06AM +0800, Miao Xie wrote:
 The seed filesystem was destroyed by the device replace, the reproduce
 method is:
  # mkfs.btrfs -f dev0
  # btrfstune -S 1 dev0
  # mount dev0 mnt
  # btrfs device add dev1 mnt
  # umount mnt
  # mount dev1 mnt
  # btrfs replace start -f dev0 dev2 mnt
  # umount mnt
  # mount dev0 mnt
 
 It is because we erase the super block on the seed device. It is wrong,
 we should not change anything on the seed device.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz


Re: [PATCH 03/10] Btrfs: fix wrong fsid check of scrub

2014-07-24 Thread David Sterba
On Thu, Jul 24, 2014 at 11:37:08AM +0800, Miao Xie wrote:
 All the metadata in the seed devices has the same fsid as the fsid
 of the seed filesystem which is on the seed device, so we should check
 them by the current filesystem. Fix it.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz

 ---
  fs/btrfs/scrub.c | 18 +-
  1 file changed, 13 insertions(+), 5 deletions(-)
 
 diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
 index 23d3f6e..9a81874e 100644
 --- a/fs/btrfs/scrub.c
 +++ b/fs/btrfs/scrub.c
 @@ -1361,6 +1361,16 @@ static void scrub_recheck_block(struct btrfs_fs_info *fs_info,
   return;
  }
  
 +static inline int scrub_check_fsid(u8 fsid[],

Please use 'const u8 *fsid' type.

 +struct scrub_page *spage)
 +{
 + struct btrfs_fs_devices *fs_devices = spage->dev->fs_devices;
 + int ret;
 +
 + ret = memcmp(fsid, fs_devices->fsid, BTRFS_UUID_SIZE);

ret is not necessary

 + return !ret;
 +}
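
Taking both review comments together, the helper could shrink to
something like the following userspace sketch. fsid_matches() and
UUID_SIZE are illustrative names (UUID_SIZE is assumed to match
BTRFS_UUID_SIZE), and the expected fsid is passed in directly instead of
being read from the scrub page's device.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE 16 /* assumed to match BTRFS_UUID_SIZE */

/* Returns true when the two filesystem ids are byte-for-byte identical. */
static bool fsid_matches(const uint8_t *fsid, const uint8_t *expected)
{
	/* No temporary needed: compare and return in one expression. */
	return memcmp(fsid, expected, UUID_SIZE) == 0;
}
```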


Re: [PATCH 04/10] Btrfs: fix wrong generation check of super block on a seed device

2014-07-24 Thread David Sterba
On Thu, Jul 24, 2014 at 11:37:09AM +0800, Miao Xie wrote:
 The super block generation of the seed devices is not the same as the
 filesystem which sprouted from them because we don't update the super
 block on the seed devices when we change that new filesystem. So we
 should not use the generation of that new filesystem to check the super
 block generation on the seed devices, Fix it.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com

Good catch.

Reviewed-by: David Sterba dste...@suse.cz


Re: [PATCH 02/10] Btrfs: don't write any data into a readonly device when scrub

2014-07-24 Thread David Sterba
On Thu, Jul 24, 2014 at 11:37:07AM +0800, Miao Xie wrote:
 We should not write data into a readonly device especially seed device when
 doing scrub, skip those devices.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz

One minor comment below.

 @@ -2904,6 +2904,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
   struct scrub_ctx *sctx;
   int ret;
   struct btrfs_device *dev;
 + struct rcu_string *name;
  
 + if (!is_dev_replace && !readonly && !dev->writeable) {

You can define 'name' within the block.

 + mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 + rcu_read_lock();
 + name = rcu_dereference(dev->name);
 + btrfs_err(fs_info, "scrub: device %s is not writable",
 +   name->str);
 + rcu_read_unlock();
 + return -EROFS;
 + }


Re: [PATCH 06/10] Btrfs: Fix the problem that the dirty flag of dev stats is cleared

2014-07-24 Thread David Sterba
On Thu, Jul 24, 2014 at 11:37:11AM +0800, Miao Xie wrote:
 An io error might happen while writing out the device stats, and the
 device stats information and dirty flag would be updated at that time,
 but the current code didn't consider this case and just cleared the dirty
 flag, which could cause us to forget to write out the new device stats
 information. Fix it.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com
 ---
  fs/btrfs/volumes.c |  7 +--
  fs/btrfs/volumes.h | 19 +++
  2 files changed, 20 insertions(+), 6 deletions(-)
 
 diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
 index 19188df..0d37746 100644
 --- a/fs/btrfs/volumes.c
 +++ b/fs/btrfs/volumes.c
 @@ -159,6 +159,7 @@ static struct btrfs_device *__alloc_device(void)
  
   spin_lock_init(&dev->reada_lock);
   atomic_set(&dev->reada_in_flight, 0);
 + atomic_set(&dev->dev_stats_ccnt, 0);
   INIT_RADIX_TREE(&dev->reada_zones, GFP_NOFS & ~__GFP_WAIT);
   INIT_RADIX_TREE(&dev->reada_extents, GFP_NOFS & ~__GFP_WAIT);
  
 @@ -6398,16 +6399,18 @@ int btrfs_run_dev_stats(struct btrfs_trans_handle *trans,
   struct btrfs_root *dev_root = fs_info->dev_root;
   struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
   struct btrfs_device *device;
 + int stats_cnt;
   int ret = 0;
  
   mutex_lock(&fs_devices->device_list_mutex);
   list_for_each_entry(device, &fs_devices->devices, dev_list) {
 - if (!device->dev_stats_valid || !device->dev_stats_dirty)
 + if (!device->dev_stats_valid || !btrfs_dev_stats_dirty(device))

The helper btrfs_dev_stats_dirty is used only once and IMHO not
necessary.

   continue;
  
 + stats_cnt = atomic_read(&device->dev_stats_ccnt);

Here it is opencoded anyway.

   ret = update_dev_stat_item(trans, dev_root, device);
   if (!ret)
 - device->dev_stats_dirty = 0;
 + atomic_sub(stats_cnt, &device->dev_stats_ccnt);
   }
   mutex_unlock(&fs_devices->device_list_mutex);
  
 diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
 index 6fcc8ea..0defd23 100644
 --- a/fs/btrfs/volumes.h
 +++ b/fs/btrfs/volumes.h
 @@ -110,7 +110,9 @@ struct btrfs_device {
   /* disk I/O failure stats. For detailed description refer to
* enum btrfs_dev_stat_values in ioctl.h */
   int dev_stats_valid;
 - int dev_stats_dirty; /* counters need to be written to disk */
 +
 + /* Counter to record the change of device stats */
 + atomic_t dev_stats_ccnt;

dev_stats_dirty is more descriptive, please keep it. The counter
semantics can be documented here.

   atomic_t dev_stat_values[BTRFS_DEV_STAT_VALUES_MAX];
  };
  
 @@ -359,11 +361,18 @@ unsigned long btrfs_full_stripe_len(struct btrfs_root *root,
  int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
   struct btrfs_root *extent_root,
   u64 chunk_offset, u64 chunk_size);
 +
 +static inline int btrfs_dev_stats_dirty(struct btrfs_device *dev)
 +{
 + return atomic_read(&dev->dev_stats_ccnt);

IMHO too trivial, not necessary.

 +}
 +
  static inline void btrfs_dev_stat_inc(struct btrfs_device *dev,
 int index)
  {
   atomic_inc(dev->dev_stat_values + index);
 - dev->dev_stats_dirty = 1;

 + smp_mb__before_atomic();
 + atomic_inc(&dev->dev_stats_ccnt);

Please put the two lines into a wrapper, 3 times repeating the same is
worth it.

 @@ -378,7 +387,8 @@ static inline int btrfs_dev_stat_read_and_reset(struct btrfs_device *dev,
   int ret;
  
   ret = atomic_xchg(dev->dev_stat_values + index, 0);
 - dev->dev_stats_dirty = 1;
 + smp_mb__before_atomic();
 + atomic_inc(&dev->dev_stats_ccnt);

 @@ -386,7 +396,8 @@ static inline void btrfs_dev_stat_set(struct btrfs_device *dev,
  int index, unsigned long val)
  {
   atomic_set(dev->dev_stat_values + index, val);
 - dev->dev_stats_dirty = 1;
 + smp_mb__before_atomic();
 + atomic_inc(&dev->dev_stats_ccnt);
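
The counter semantics and the suggested wrapper can be modelled with C11
atomics in userspace. This is a sketch under assumptions, not kernel
code: all names are hypothetical, the array size is arbitrary, and the
default sequentially consistent atomics stand in for the explicit
barrier plus atomic_inc pair.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_STATS 4 /* arbitrary size for the sketch */

struct dev_stats {
	atomic_int values[NUM_STATS];
	atomic_int change_cnt; /* non-zero means there are unwritten changes */
};

static void dev_stats_init(struct dev_stats *d)
{
	for (int i = 0; i < NUM_STATS; i++)
		atomic_init(&d->values[i], 0);
	atomic_init(&d->change_cnt, 0);
}

/* Single wrapper for the repeated "record that stats changed" step. */
static void dev_stats_mark_changed(struct dev_stats *d)
{
	atomic_fetch_add(&d->change_cnt, 1);
}

static void dev_stat_inc(struct dev_stats *d, int index)
{
	atomic_fetch_add(&d->values[index], 1);
	dev_stats_mark_changed(d);
}

/* Snapshot the change counter before writing the stats out. */
static int dev_stats_begin_flush(struct dev_stats *d)
{
	return atomic_load(&d->change_cnt);
}

/*
 * After a successful write, retire only the changes covered by the
 * snapshot. Returns true if newer changes arrived in the meantime.
 */
static bool dev_stats_end_flush(struct dev_stats *d, int snapshot)
{
	int before = atomic_fetch_sub(&d->change_cnt, snapshot);

	return before - snapshot != 0;
}
```

A change that lands between the snapshot and the end of the flush leaves
the counter non-zero, so the next pass writes the stats again instead of
losing the update as a plain dirty flag would.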
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: fix compressed write corruption on enospc

2014-07-24 Thread Liu Bo
When failing to allocate space for the whole compressed extent, we'll
fallback to uncompressed IO, but we've forgotten to redirty the pages
which belong to this compressed extent, and these 'clean' pages will
simply skip 'submit' part and go to endio directly, at last we got data
corruption as we write nothing.

Signed-off-by: Liu Bo bo.li@oracle.com
---
 fs/btrfs/inode.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..8ea7610 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -709,6 +709,18 @@ retry:
 				unlock_extent(io_tree, async_extent->start,
 					      async_extent->start +
 					      async_extent->ram_size - 1);
+
+				/*
+				 * we need to redirty the pages if we decide to
+				 * fallback to uncompressed IO, otherwise we
+				 * will not submit these pages down to lower
+				 * layers.
+				 */
+				extent_range_redirty_for_io(inode,
+						async_extent->start,
+						async_extent->start +
+						async_extent->ram_size - 1);
+
 				goto retry;
 			}
 			goto out_free;
-- 
1.8.1.4
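
The failure mode described in the changelog can be pictured with a toy
model. Everything below is invented for illustration (struct toy_page
and the three helpers); it only mimics the idea that a submit path which
skips clean pages writes nothing unless the pages are redirtied first.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	bool dirty;
	bool on_disk;
};

/* Starting the compressed write clears the dirty bit on every page. */
static void start_compressed_write(struct toy_page *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i].dirty = false;
}

/* What the fix adds before retrying with uncompressed IO. */
static void redirty_range(struct toy_page *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i].dirty = true;
}

/* Uncompressed submit only writes dirty pages. Returns pages written. */
static size_t submit_uncompressed(struct toy_page *p, size_t n)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		if (!p[i].dirty)
			continue; /* clean pages go straight to completion */
		p[i].dirty = false;
		p[i].on_disk = true;
		written++;
	}
	return written;
}
```

In this model, falling back without redirty_range() writes zero pages
even though none of the data ever reached disk; with it, all pages are
written.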



Re: [PATCH] Btrfs: fix compressed write corruption on enospc

2014-07-24 Thread Chris Mason
On 07/24/2014 10:48 AM, Liu Bo wrote:
 When failing to allocate space for the whole compressed extent, we'll
 fallback to uncompressed IO, but we've forgotten to redirty the pages
 which belong to this compressed extent, and these 'clean' pages will
 simply skip 'submit' part and go to endio directly, at last we got data
 corruption as we write nothing.

This fallback code was my #1 suspect for the hangs people have been
seeing since 3.15.  I changed things around to trigger the fallback
randomly and wasn't able to trigger problems, but I was looking for
hangs and not corruptions.

-chris




Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Chris Mason
On 07/23/2014 06:47 PM, Martin Steigerwald wrote:
 On Tuesday, 15 July 2014, 17:08:27, Martin Steigerwald wrote:
 On Tuesday, 15 July 2014, 09:21:40, Chris Mason wrote:
 On 07/14/2014 05:58 PM, Martin Steigerwald wrote:
 On Monday, 14 July 2014, 16:12:22, Chris Mason wrote:
 On 07/14/2014 11:10 AM, Martin Steigerwald wrote:
 On Monday, 14 July 2014, 17:04:22, you wrote:
 Hi!

 While with 3.16-rc3 and rc4 I didn´t have a BTRFS hang in several
 days
 of
 usage, with 3-16-rc5 I had a hang again. Less than a hour since
 booting
 it.

 Since the hang bug I and others had with 3.15 and upto 3.16-rc2
 usually
 didn´t happen that quickly after boot and since backtrace looks a bit
 different from what I have in memory, I post this in a new thread.
 See thread Blocked tasks on 3.15.1 for a discussion of previous
 hang
 issues.

 Probably good to add some basic information on the filesystem:
 Do you have compression enabled?  I wasn't able to nail down the 3.15.1
 hang before vacation attacked me, but I'm hoping to track it down
 today.

 Yes. I have.

 It just hung again while I was playing PlaneShift.

 Back to 3.16-rc4 as rc5 seems to be broke here.

 The btrfs hang you're hitting goes back to 3.15.  So 3.16-rc4 vs rc5
 shouldn't be a factor.  Are you hitting other problems with 3.16?

 So far for this day 3.16-rc4 behaves nicely. With 3.16-rc5 I had a BTRFS
 hang twice yesterday. 3.16-rc4 before also behaved nicely for several days
 or well about a week here.
 
 3.16-rc4 now hung as well…

Liu Bo has a promising patch:

https://patchwork.kernel.org/patch/4618421/

Please give it a shot.  There's a second deadlock reading the free space
cache, I'm still working on that one too.

-chris


[PATCH] btrfs-progs: add zero-log to rescue subcommand

2014-07-24 Thread David Sterba
Copy the functionality of standalone btrfs-zero-log to the main tool.
The standalone utility will be removed later.

Signed-off-by: David Sterba dste...@suse.cz
---
 cmds-rescue.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/cmds-rescue.c b/cmds-rescue.c
index f20a2068a16b..d76a879d543a 100644
--- a/cmds-rescue.c
+++ b/cmds-rescue.c
@@ -19,6 +19,9 @@
 #include "kerncompat.h"
 
 #include <getopt.h>
+#include "ctree.h"
+#include "transaction.h"
+#include "disk-io.h"
 #include "commands.h"
 #include "utils.h"
 
@@ -149,11 +152,55 @@ int cmd_super_recover(int argc, char **argv)
 	return ret;
 }
 
+const char * const cmd_rescue_zero_log_usage[] = {
+	"btrfs rescue zero-log <device>",
+	"Clear the tree log. Usable if it's corrupted and prevents mount.",
+	"",
+	NULL
+};
+
+int cmd_rescue_zero_log(int argc, char **argv)
+{
+	struct btrfs_root *root;
+	struct btrfs_trans_handle *trans;
+	char *devname;
+	int ret;
+
+	if (check_argc_exact(argc, 2))
+		usage(cmd_rescue_zero_log_usage);
+
+	devname = argv[optind];
+	ret = check_mounted(devname);
+	if (ret < 0) {
+		fprintf(stderr, "Could not check mount status: %s\n", strerror(-ret));
+		return 1;
+	} else if (ret) {
+		fprintf(stderr, "%s is currently mounted. Aborting.\n", devname);
+		return 1;
+	}
+
+	root = open_ctree(devname, 0, OPEN_CTREE_WRITES);
+	if (!root) {
+		fprintf(stderr, "Could not open ctree\n");
+		return 1;
+	}
+
+	printf("Clearing log on %s\n", devname);
+	trans = btrfs_start_transaction(root, 1);
+	btrfs_set_super_log_root(root->fs_info->super_copy, 0);
+	btrfs_set_super_log_root_level(root->fs_info->super_copy, 0);
+	btrfs_commit_transaction(trans, root);
+	close_ctree(root);
+
+	return 0;
+}
+
 const struct cmd_group rescue_cmd_group = {
 	rescue_cmd_group_usage, NULL, {
 		{ "chunk-recover", cmd_chunk_recover, cmd_chunk_recover_usage, NULL, 0},
 		{ "super-recover", cmd_super_recover, cmd_super_recover_usage, NULL, 0},
-		{ 0, 0, 0, 0, 0 }
+		{ "zero-log", cmd_rescue_zero_log, cmd_rescue_zero_log_usage, NULL, 0},
+		NULL_CMD_STRUCT
 	}
 };
 
-- 
1.9.0



Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Martin Steigerwald
On Thursday, 24 July 2014, 10:58:51, Chris Mason wrote:
 On 07/23/2014 06:47 PM, Martin Steigerwald wrote:
  On Tuesday, 15 July 2014, 17:08:27, Martin Steigerwald wrote:
  On Tuesday, 15 July 2014, 09:21:40, Chris Mason wrote:
  On 07/14/2014 05:58 PM, Martin Steigerwald wrote:
  On Monday, 14 July 2014, 16:12:22, Chris Mason wrote:
  On 07/14/2014 11:10 AM, Martin Steigerwald wrote:
  On Monday, 14 July 2014, 17:04:22, you wrote:
  Hi!
  
  While with 3.16-rc3 and rc4 I didn´t have a BTRFS hang in several
  days
  of
  usage, with 3-16-rc5 I had a hang again. Less than a hour since
  booting
  it.
  
  Since the hang bug I and others had with 3.15 and upto 3.16-rc2
  usually
  didn´t happen that quickly after boot and since backtrace looks a
  bit
  different from what I have in memory, I post this in a new thread.
  See thread Blocked tasks on 3.15.1 for a discussion of previous
  hang
  issues.
  
  Probably good to add some basic information on the filesystem:
  Do you have compression enabled?  I wasn't able to nail down the
  3.15.1
  hang before vacation attacked me, but I'm hoping to track it down
  today.
  
  Yes. I have.
  
  It just hung again while I was playing PlaneShift.
  
  Back to 3.16-rc4 as rc5 seems to be broke here.
  
  The btrfs hang you're hitting goes back to 3.15.  So 3.16-rc4 vs rc5
  shouldn't be a factor.  Are you hitting other problems with 3.16?
  
  So far for this day 3.16-rc4 behaves nicely. With 3.16-rc5 I had a BTRFS
  hang twice yesterday. 3.16-rc4 before also behaved nicely for several
  days
  or well about a week here.
  
  3.16-rc4 now hung as well…
 
 Liu Bo has a promising patch:
 
 https://patchwork.kernel.org/patch/4618421/
 
 Please give it a shot.  There's a second deadlock reading the free space
 cache, I'm still working on that one too.

Okay, I reverted your printk patch and applied this one on top of Linus' git tree.

Let's see how this works.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: 1 week to rebuid 4x 3TB raid10 is a long time!

2014-07-24 Thread Chris Murphy

On Jul 22, 2014, at 11:13 AM, Chris Murphy li...@colorremedies.com wrote:
 
 It's been a while since I did a rebuild on HDDs, 

So I did this yesterday and day before with an SSD and HDD in raid1, and made 
the HDD do the rebuild. 


Baseline for this hard drive:
hdparm -t
35.68 MB/sec

dd if=/dev/zero of=/dev/rdisk2s1 bs=256k
13508091392 bytes transferred in 521.244920 secs (25915056 bytes/sec)

I don't know why hdparm gets such good reads, and dd writes are 75% of that, 
but the 26MB/s write speed is realistic (this is a Firewire 400 external 
device) and what I typically get with long sequential writes. It's probable 
this is interface limited to mode S200, not a drive limitation since on SATA 
Rev 2 or 3 interface I get 100+MB/s transfers.
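
Editor's sketch of the arithmetic behind the dd figure above. The helper names are hypothetical. The comparison against hdparm assumes both tools mean 10^6 bytes per "MB"; if hdparm's figure is in units of 2^20 bytes, the ratio comes out a few points lower.

```c
/* Average throughput in bytes per second. */
static double rate_bytes_per_sec(double bytes, double seconds)
{
	return bytes / seconds;
}

/* Express part as a percentage of whole. */
static double percent_of(double part, double whole)
{
	return 100.0 * part / whole;
}

/* The dd run quoted above: 13508091392 bytes in 521.244920 seconds. */
static double dd_write_rate(void)
{
	return rate_bytes_per_sec(13508091392.0, 521.244920);
}

/* dd write rate as a percentage of the 35.68 MB/sec hdparm read rate. */
static double dd_vs_hdparm_percent(void)
{
	return percent_of(dd_write_rate() / 1e6, 35.68);
}
```

dd_write_rate() reproduces the roughly 25.9 million bytes/sec that dd printed, and under the unit assumption above the write rate is a bit under three quarters of the hdparm read rate.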

During the rebuild, iotop reports actual write averaging in the 24MB/s range, 
and the total data to restore divided by total time for the replace command 
comes out to 23MB/s. The source data is a Fedora 21 install with no meaningful 
user data (cache files and such), so mostly a bunch of libraries, programs, and 
documentation. Therefore it's not exclusively small files, yet the iotop rate 
was very stable throughout the 4 minute rebuild.

So I still think 5MB/s for a SATA-connected (?) drive is unexpectedly slow.

 
Chris Murphy


Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Martin Steigerwald
On Thursday, 24 July 2014, 10:58:51, Chris Mason wrote:
 On 07/23/2014 06:47 PM, Martin Steigerwald wrote:
  On Tuesday, 15 July 2014, 17:08:27, Martin Steigerwald wrote:
  On Tuesday, 15 July 2014, 09:21:40, Chris Mason wrote:
  On 07/14/2014 05:58 PM, Martin Steigerwald wrote:
  On Monday, 14 July 2014, 16:12:22, Chris Mason wrote:
  On 07/14/2014 11:10 AM, Martin Steigerwald wrote:
  On Monday, 14 July 2014, 17:04:22, you wrote:
  Hi!
  
  While with 3.16-rc3 and rc4 I didn´t have a BTRFS hang in several
  days
  of
  usage, with 3-16-rc5 I had a hang again. Less than a hour since
  booting
  it.
  
  Since the hang bug I and others had with 3.15 and upto 3.16-rc2
  usually
  didn´t happen that quickly after boot and since backtrace looks a
  bit
  different from what I have in memory, I post this in a new thread.
  See thread Blocked tasks on 3.15.1 for a discussion of previous
  hang
  issues.
  
  Probably good to add some basic information on the filesystem:
  Do you have compression enabled?  I wasn't able to nail down the
  3.15.1
  hang before vacation attacked me, but I'm hoping to track it down
  today.
  
  Yes. I have.
  
  It just hung again while I was playing PlaneShift.
  
  Back to 3.16-rc4 as rc5 seems to be broke here.
  
  The btrfs hang you're hitting goes back to 3.15.  So 3.16-rc4 vs rc5
  shouldn't be a factor.  Are you hitting other problems with 3.16?
  
  So far for this day 3.16-rc4 behaves nicely. With 3.16-rc5 I had a BTRFS
  hang twice yesterday. 3.16-rc4 before also behaved nicely for several
  days
  or well about a week here.
  
  3.16-rc4 now hung as well…
 
 Liu Bo has a promising patch:
 
 https://patchwork.kernel.org/patch/4618421/
 
 Please give it a shot.  There's a second deadlock reading the free space
 cache, I'm still working on that one too.

Now running 3.16-rc6 + current git + this patch.

It may take some time though, because BTRFS hung again while I was compiling
the kernel, which caused the loss of the KDE Baloo desktop search file index
and of parts of a mail I wrote in KMail.

Since the patch mentioned ENOSPC issues but the filesystem has enough free 
space according to df I shrunk the trees with

btrfs balance start -musage=50 /home
btrfs balance start -musage=50 /home


merkaba:~ btrfs fi sh /home   
Label: 'home'  uuid: […]
	Total devices 2 FS bytes used 124.05GiB
	devid    1 size 160.00GiB used 150.00GiB path /dev/mapper/msata-home
	devid    2 size 160.00GiB used 150.00GiB path /dev/dm-3


As I bet that the error is more likely to happen when trees occupy all space, 
it may take some time till it happens again.

Well, it's already growing slowly:

merkaba:~ btrfs fi df /home
Data, RAID1: total=146.97GiB, used=121.84GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=4.00GiB, used=2.62GiB
unknown, single: total=512.00MiB, used=0.00
merkaba:~ btrfs fi sh /home
Label: 'home'  uuid: […]
	Total devices 2 FS bytes used 124.46GiB
	devid    1 size 160.00GiB used 151.00GiB path /dev/dm-0
	devid    2 size 160.00GiB used 151.00GiB path /dev/mapper/sata-home

Btrfs v3.14.1


I wonder why ENOSPC conditions happen with that much free space inside the
trees. Were they just too fragmented?

To me

merkaba:~ LANG=C df -hT /home
Filesystem Type   Size  Used Avail Use% Mounted on
/dev/dm-0  btrfs  320G  249G   69G  79% /home

is a quite healthy free space margin.


Well, let's see how this goes.

I hope it can be fixed soon as it causes loss of recently saved data and 
generally locks up a machine running KDE desktop quite quickly on a BTRFS 
hang.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: [PATCH 2/4 v3] fiemap: add EXTENT_DATA_COMPRESSED flag

2014-07-24 Thread David Sterba
On Thu, Jul 17, 2014 at 12:07:57AM -0600, Andreas Dilger wrote:
 any progress on this patch series?

I'm sorry I got distracted at the end of year and did not finish the
series.

 I never saw an updated version of this patch series after the last round of
 reviews, but it would be great to move it forward.  I have filefrag patches
 in my e2fsprogs tree waiting for an updated version of your patch.
 
 I recall the main changes were:
 - add FIEMAP_EXTENT_PHYS_LENGTH flag to indicate if fe_phys_length was valid

fe_phys_length will always be valid, so the other flags are set only if it's
not equal to the logical length.

 - rename fe_length to fe_logi_length and #define fe_length fe_logi_length
 - always fill in fe_phys_length (= fe_logi_length for uncompressed files)
   and set FIEMAP_EXTENT_PHYS_LENGTH whether the extent is compressed or not

This is my understanding and contradicts the first point.

 - add WARN_ONCE() in fiemap_fill_next_extent() as described below

 I don't know if there was any clear statement about whether there should be
 separate FIEMAP_EXTENT_PHYS_LENGTH and FIEMAP_EXTENT_DATA_COMPRESSED flags,
 or if the latter should be implicit?  Probably makes sense to have separate
 flags.  It should be fine to use:

 #define FIEMAP_EXTENT_PHYS_LENGTH 0x0010
 
 since this flag was never used.

I've kept only FIEMAP_EXTENT_DATA_COMPRESSED, I don't see a need for
FIEMAP_EXTENT_PHYS_LENGTH and this would be yet another flag because the
FIEMAP_EXTENT_DATA_ENCODED is also implied.

I'll send V4, we can discuss the PHYS_LENGTH flag then.


Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Chris Mason


On 07/24/2014 02:49 PM, Martin Steigerwald wrote:
 On Thursday, 24 July 2014, 10:58:51, Chris Mason wrote:
 On 07/23/2014 06:47 PM, Martin Steigerwald wrote:
 On Tuesday, 15 July 2014, 17:08:27, Martin Steigerwald wrote:
 On Tuesday, 15 July 2014, 09:21:40, Chris Mason wrote:
 On 07/14/2014 05:58 PM, Martin Steigerwald wrote:
 On Monday, 14 July 2014, 16:12:22, Chris Mason wrote:
 On 07/14/2014 11:10 AM, Martin Steigerwald wrote:
 On Monday, 14 July 2014, 17:04:22, you wrote:
 Hi!

 While with 3.16-rc3 and rc4 I didn´t have a BTRFS hang in several
 days
 of
 usage, with 3-16-rc5 I had a hang again. Less than a hour since
 booting
 it.

 Since the hang bug I and others had with 3.15 and upto 3.16-rc2
 usually
 didn´t happen that quickly after boot and since backtrace looks a
 bit
 different from what I have in memory, I post this in a new thread.
 See thread Blocked tasks on 3.15.1 for a discussion of previous
 hang
 issues.

 Probably good to add some basic information on the filesystem:
 Do you have compression enabled?  I wasn't able to nail down the
 3.15.1
 hang before vacation attacked me, but I'm hoping to track it down
 today.

 Yes. I have.

 It just hung again while I was playing PlaneShift.

 Back to 3.16-rc4 as rc5 seems to be broke here.

 The btrfs hang you're hitting goes back to 3.15.  So 3.16-rc4 vs rc5
 shouldn't be a factor.  Are you hitting other problems with 3.16?

 So far for this day 3.16-rc4 behaves nicely. With 3.16-rc5 I had a BTRFS
 hang twice yesterday. 3.16-rc4 before also behaved nicely for several
 days
 or well about a week here.

 3.16-rc4 now hung as well…

 Liu Bo has a promising patch:

 https://patchwork.kernel.org/patch/4618421/

 Please give it a shot.  There's a second deadlock reading the free space
 cache, I'm still working on that one too.
 
 Now running 3.16-rc6 + current git + this patch.
 
 It may take some time tough cause during compiling the kernel BTRFS hung 
 again, which caused loss of KDE Baloo desktop search file index and parts of 
 a 
 mail I wrote in KMail.
 
 Since the patch mentioned ENOSPC issues but the filesystem has enough free 
 space according to df I shrunk the trees with

Thanks for giving it a try.  The ENOSPC mentioned here is looking for a
contiguous extent, so it's easily possible to trigger that enospc
without actually being full.

-chris
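
Editor's sketch of the point above: an allocation that needs one contiguous extent can fail with ENOSPC while plenty of space is free in total. This is a toy block map, not the btrfs allocator; every toy_* name is hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Count free blocks in a toy allocation map (true means "in use"). */
static size_t toy_count_free(const bool *used, size_t n)
{
	size_t free_blocks = 0;

	for (size_t i = 0; i < n; i++)
		if (!used[i])
			free_blocks++;
	return free_blocks;
}

/*
 * Find the first run of `want` consecutive free blocks (want > 0).
 * Returns the start index of the run, or -ENOSPC if no such run exists.
 */
static long toy_find_contiguous(const bool *used, size_t n, size_t want)
{
	size_t run = 0;

	for (size_t i = 0; i < n; i++) {
		run = used[i] ? 0 : run + 1;
		if (want > 0 && run == want)
			return (long)(i + 1 - want);
	}
	return -ENOSPC;
}
```

With every other block in use, half of the map is free, yet a request for two contiguous blocks already fails. That is the sense in which df can report a healthy margin while this particular ENOSPC still triggers.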


Re: [PATCH, RFC] btrfs: refactor open_ctree()

2014-07-24 Thread Chris Mason
On 06/25/2014 07:55 PM, Eric Sandeen wrote:
 First off: total RFC, don't merge this; it builds, but
 is totally untested.
 
 open_ctree() is almost 1000 lines long.  I've started trying
 to refactor it, primarily into helper functions, and also
 simplifying (?) things a bit at the beginning by removing the
 ret = func(); if (ret) { err = ret; goto ... } dance where it's not
 needed.
 
 Does this look like a reasonable thing to do?  Have I cut
 things into the right chunks?  Would you rather see it as
 as series of patches, moving one hunk of code at a time?

I do love this patch, either as a series or one big patch.  Whatever
makes it easiest for you to test is fine with me.

-chris


Re: [PATCH RFC] btrfs: Use backup superblocks if and only if the first superblock is valid but corrupted.

2014-07-24 Thread Chris Mason


On 06/26/2014 11:53 PM, Qu Wenruo wrote:
 Current btrfs will only use the first superblock, making the backup
 superblocks only useful for 'btrfs rescue super' command.
 
 The old problem is that if we use backup superblocks when the first
 superblock is not valid, we will be able to mount a none btrfs
 filesystem, which used to contains btrfs but other fs is made on it.
 
 The old problem can be solved related easily by checking the first
 superblock in a special way:
 1) If the magic number in the first superblock does not match:
This filesystem is not btrfs anymore, just exit.
If end-user consider it's really btrfs, then old 'btrfs rescue super'
method is still available.
 
 2) If the magic number in the first superblock matches but checksum does
not match:
This filesystem is btrfs but first superblock is corrupted, use
backup roots. Just continue searching remaining superblocks.

I do agree that in these cases we can trust that the backup superblock
comes from the same filesystem.

But, for right now I'd prefer the admin get involved in using the backup
supers.  I think silently using the backups is going to lead to surprises.

Thanks!

-chris
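
Editor's sketch of the two-case check proposed in the quoted RFC, reduced to a decision function. It illustrates the proposal only; the reply above prefers that the admin stays involved before backups are used. The enum and function names are hypothetical, and the two booleans stand for "magic number matches" and "checksum matches" on the first superblock.

```c
#include <stdbool.h>

/* What to do after inspecting the first superblock. */
enum toy_sb_action {
	TOY_SB_USE_PRIMARY,	/* magic and checksum both fine */
	TOY_SB_TRY_BACKUPS,	/* btrfs magic present, checksum bad */
	TOY_SB_NOT_BTRFS	/* magic mismatch: do not touch the backups */
};

static enum toy_sb_action toy_classify_primary_sb(bool magic_ok, bool csum_ok)
{
	/* Case 1: no btrfs magic, so treat the device as not btrfs. */
	if (!magic_ok)
		return TOY_SB_NOT_BTRFS;

	/* Case 2: btrfs, but the first superblock is corrupted. */
	if (!csum_ok)
		return TOY_SB_TRY_BACKUPS;

	return TOY_SB_USE_PRIMARY;
}
```

The ordering matters: the magic test comes first so that a device re-formatted with another filesystem never falls through to the backup superblocks, whatever its checksum state.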


Re: [PATCH 2/4 v3] fiemap: add EXTENT_DATA_COMPRESSED flag

2014-07-24 Thread Andreas Dilger

On Jul 24, 2014, at 1:22 PM, David Sterba dste...@suse.cz wrote:
 On Thu, Jul 17, 2014 at 12:07:57AM -0600, Andreas Dilger wrote:
 any progress on this patch series?
 
 I'm sorry I got distracted at the end of year and did not finish the
 series.
 
 I never saw an updated version of this patch series after the last round of
 reviews, but it would be great to move it forward.  I have filefrag patches
 in my e2fsprogs tree waiting for an updated version of your patch.
 
 I recall the main changes were:
 - add FIEMAP_EXTENT_PHYS_LENGTH flag to indicate if fe_phys_length was valid
 
 fe_phys_length will be always valid, so other the flags are set only if it's
 not equal to the logical length.
 
 - rename fe_length to fe_logi_length and #define fe_length fe_logi_length
 - always fill in fe_phys_length (= fe_logi_length for uncompressed files)
  and set FIEMAP_EXTENT_PHYS_LENGTH whether the extent is compressed or not
 
 This is my understanding and contradicts the first point.

I think Dave Chinner's former point was that having fe_phys_length validity
depend on FIEMAP_EXTENT_DATA_COMPRESSED is a non-intuitive interface.  It is
not true that fe_phys_length would always be valid, since that is not the
case for older kernels that currently always set this field to 0, so they
need some flag to indicate if fe_phys_length is valid.  Alternately,
userspace could do:

	if (ext->fe_phys_length == 0)
		ext->fe_phys_length = ext->fe_logi_length;

but that pre-supposes that fe_phys_length == 0 is never a valid value when
fe_logi_length is non-zero, and this might introduce errors in some cases.
I could imagine that some compression methods might not allocate any space
at all if it was all zeroes, and just store a bit in the blockpointer or
extent, so having a separate FIEMAP_EXTENT_PHYS_LENGTH is probably safer
in the long run.  That opens up the question of whether a written zero
filled space that gets compressed away is different from a hole, but I'd
prefer to just return whatever the file mapping is than interpret it.

Cheers, Andreas
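
Editor's sketch contrasting the two userspace strategies discussed above. The struct is a cut-down stand-in, not the real struct fiemap_extent, and the flag value is only the one floated in this thread; both are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical flag, using the value suggested in the thread. */
#define TOY_FIEMAP_EXTENT_PHYS_LENGTH 0x00000010u

/* Cut-down stand-in for the extent record under discussion. */
struct toy_fiemap_extent {
	uint64_t fe_logi_length;
	uint64_t fe_phys_length;
	uint32_t fe_flags;
};

/* Heuristic: treat fe_phys_length == 0 as "the kernel did not fill it in". */
static uint64_t toy_phys_len_heuristic(const struct toy_fiemap_extent *ext)
{
	return ext->fe_phys_length ? ext->fe_phys_length : ext->fe_logi_length;
}

/* Flag-based: trust fe_phys_length only when the kernel says it is valid. */
static uint64_t toy_phys_len_flagged(const struct toy_fiemap_extent *ext)
{
	if (ext->fe_flags & TOY_FIEMAP_EXTENT_PHYS_LENGTH)
		return ext->fe_phys_length;
	return ext->fe_logi_length;
}
```

For an extent from an older kernel (physical length 0, flag clear) both helpers fall back to the logical length. They differ for a zero-filled extent that was compressed away entirely (physical length 0, flag set): the heuristic reports the logical length, the flag-based helper reports 0.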

 - add WARN_ONCE() in fiemap_fill_next_extent() as described below
 
 I don't know if there was any clear statement about whether there should be
 separate FIEMAP_EXTENT_PHYS_LENGTH and FIEMAP_EXTENT_DATA_COMPRESSED flags,
 or if the latter should be implicit?  Probably makes sense to have separate
 flags.  It should be fine to use:
 
 #define FIEMAP_EXTENT_PHYS_LENGTH 0x0010
 
 since this flag was never used.
 
 I've kept only FIEMAP_EXTENT_DATA_COMPRESSED, I don't see a need for
 FIEMAP_EXTENT_PHYS_LENGTH and this would be yet another flag because the
 FIEMAP_EXTENT_DATA_ENCODED is also implied.
 
 I'll send V4, we can discuss the PHYS_LENGTH flag then.


Cheers, Andreas



Re: [PATCH, RFC] btrfs: refactor open_ctree()

2014-07-24 Thread Eric Sandeen
On 7/24/14, 4:25 PM, Chris Mason wrote:
 On 06/25/2014 07:55 PM, Eric Sandeen wrote:
 First off: total RFC, don't merge this; it builds, but
 is totally untested.

 open_ctree() is almost 1000 lines long.  I've started trying
 to refactor it, primarily into helper functions, and also
 simplifying (?) things a bit at the beginning by removing the
 ret = func(); if (ret) { err = ret; goto ... } dance where it's not
 needed.

 Does this look like a reasonable thing to do?  Have I cut
 things into the right chunks?  Would you rather see it as
 as series of patches, moving one hunk of code at a time?
 
 I do love this patch, either as a series or one big patch.  Whatever
 makes it easiest for you to test is fine with me.

Oh right!  I remember this thing!  Let me try to get back to it... ;)

Thanks,
-Eric



Re: [PATCH] btrfs: Add show_path function for btrfs_super_ops.

2014-07-24 Thread Qu Wenruo

Thanks for the comment.

 Original Message 
Subject: Re: [PATCH] btrfs: Add show_path function for btrfs_super_ops.
From: David Sterba dste...@suse.cz
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-07-24 21:09

On Mon, Jul 21, 2014 at 05:02:29PM +0800, Qu Wenruo wrote:

show_path() function in struct super_operations is used to output
subtree mount info for mountinfo.
Without an implementation of show_path(), users cannot find out where
each subvolume is mounted when the 'subvolid=' mount option is used.
(When mounted with the 'subvol=' mount option, vfs is aware of the subtree
mount and can do the path resolution itself.)

Your previous patches unify both to call mount_subtree, then the default
vfs implementation of show_path will do the right thing, ie
seq_dentry(...), and the path will be resolved for free.

Means this patch is not needed, so I'll skip commenting it.


I'm sorry, I forgot to mention that this patch is meant to replace the
previous patch (the mount_subtree method).

Since vfs provides the show_path() function for fs-specific subtree display,
I would like to use it rather than the previous mount_subtree() trick.

Also, as mentioned by Chandan Rajendra, the previous subtree patch can't
handle a subvolume behind a normal directory.

This show_path() patch is effectively v2 of the previous patch.



With this patch, end users will be able to use findmnt(8) or other
programs reading mountinfo to find which btrfs subvolume is mounted.

Though we use fs_info-subvol_sem to protect show_path() from subvolume
destroying/creating, if user renames/moves the parent non-subvolume
dir of a subvolume, it is still possible that concurrency may happen and
cause btrfs_search_slot() fails to find the desired key.
In that case, we just return -EBUSY and inform the user to try again since
extra locking like locking the whole subvolume tree is too expensive for
such usage.

And the subvolume renames will be handled as well.



Re: [PATCH] Btrfs: fix compressed write corruption on enospc

2014-07-24 Thread Liu Bo
On Thu, Jul 24, 2014 at 10:55:47AM -0400, Chris Mason wrote:
 On 07/24/2014 10:48 AM, Liu Bo wrote:
  When failing to allocate space for the whole compressed extent, we'll
  fallback to uncompressed IO, but we've forgotten to redirty the pages
  which belong to this compressed extent, and these 'clean' pages will
  simply skip 'submit' part and go to endio directly, at last we got data
  corruption as we write nothing.
 
 This fallback code was my #1 suspect for the hangs people have been
 seeing since 3.15.  I changed things around to trigger the fallback
 randomly and wasn't able to trigger problems, but I was looking for
 hangs and not corruptions.
 

So now you're able to trigger the hang without changing the fallback code?

I tried raid1 and raid0 with fsmark and rsync in different ways but still failed
to reproduce the hang :-(

The weirdest thing is who the hell holds the free space inode's page. Is it
possible to share pages with another inode? (My answer is NO, but I'm not sure
now...)

thanks,
-liubo

 -chris
 
 


Re: [PATCH 1/2] btrfs: Call mount_subtree() even 'subvolid=' mount option is given.

2014-07-24 Thread Qu Wenruo

Thanks for your comment.

I'm very sorry that this patch took up your review time, but the later
patch (the show_path one) should replace it.

As mentioned in that thread, this patch does not work completely.
In fact, the show_path() patch is v2 of this patch, but because the patch
name changed, I didn't add the v2 tag.

Thanks,
Qu

 Original Message 
Subject: Re: [PATCH 1/2] btrfs: Call mount_subtree() even 'subvolid=' 
mount option is given.

From: David Sterba dste...@suse.cz
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-07-24 20:48

On Wed, Jul 16, 2014 at 12:07:10PM +0800, Qu Wenruo wrote:

btrfs uses different routines to handle the 'subvolid=' and 'subvol=' mount
options.
Given the 'subvol=' mount option, btrfs will mount btrfs first and then call
mount_subtree() to mount a subtree of btrfs, making vfs handle the path
searching.
This is good since the vfs layer knows exactly that a subtree mount is done
and findmnt(8) knows which subtree is mounted.

However, when using the 'subvolid=' mount option, btrfs will do all the
internal subvolume objectid searching and checking, making VFS unaware
of which subtree is mounted. As a result, findmnt(8) can't show any
useful subtree mount info to end users.

This patch will use the root backref to reverse search the subvolume
path for a given subvolid, making findmnt(8) work again.

Reported-by: Stefan G.Weichinger li...@xunil.at
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com

Ack for unifying the way subvol= and subvolid= are handled, but I don't
like some aspects of the implementation.

The kmalloc/krealloc makes it really complicated and is not imho
necessary. The mount options length is limited to PAGE_SIZE in the vfs
code. Do the same here, allocate a page, filter the options, do the
necessary processing and just check for overflows.

You can drop u64_to_strlen.
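
Editor's sketch of the approach suggested above: filter the options into one fixed-size buffer and only check for overflow, instead of computing exact lengths for kmalloc/krealloc. This is plain userspace C with a hypothetical function name, not the kernel patch; a caller would pass a PAGE_SIZE buffer. Unlike the original in-place substitution, it appends the replacement option at the end.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the comma-separated options in args into buf, dropping any
 * "subvol=..." or "subvolid=..." option, then append
 * "subvolid=<objectid>". args may be NULL (no options given).
 * Returns 0 on success, -1 if the result does not fit into buf.
 */
static int toy_setup_root_args(const char *args, unsigned long long objectid,
			       char *buf, size_t size)
{
	size_t used = 0;
	int n;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	while (args && *args) {
		size_t optlen = strcspn(args, ",");

		/* Keep every non-empty option except subvol= and subvolid=. */
		if (optlen > 0 &&
		    strncmp(args, "subvol=", 7) != 0 &&
		    strncmp(args, "subvolid=", 9) != 0) {
			n = snprintf(buf + used, size - used, "%.*s,",
				     (int)optlen, args);
			if (n < 0 || (size_t)n >= size - used)
				return -1;	/* overflow: give up */
			used += (size_t)n;
		}

		args += optlen;
		if (*args == ',')
			args++;
	}

	n = snprintf(buf + used, size - used, "subvolid=%llu", objectid);
	if (n < 0 || (size_t)n >= size - used)
		return -1;
	return 0;
}
```

Because snprintf() reports the length it would have needed, the only size bookkeeping left is the running offset, which is why a helper like u64_to_strlen becomes unnecessary in this style.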


+#define CLEAR_SUBVOL   1
+#define CLEAR_SUBVOLID 2

Though they're internal and local to the file, please add BTRFS_ prefix
at least.


  /*
- * This will strip out the subvol=%s argument for an argument string and add
- * subvolid=0 to make sure we get the actual tree root for path walking to the
- * subvol we want.
+ * This will strip out the subvol=%s or subvolid=%s argument for an argumen
+ * string and add subvolid=0 to make sure we get the actual tree root for path
+ * walking to the subvol we want.
   */
-static char *setup_root_args(char *args)
+static char *setup_root_args(char *args, int flags, u64 subvol_objectid)
  {
-   unsigned len = strlen(args) + 2 + 1;
-   char *src, *dst, *buf;
+   unsigned len;
+   char *src = NULL, *dst, *buf, *comma;

Please use the recommended style and put each on a separate line. I'm
not sure if you'll need all of them for the implementation without the
kmallocs, the comment applies generally.


+   char *subvol_string = "subvolid=";
+   int option_len = 0;
+
+   if (!args) {
+   /* Case 1, not args, all default mounting
+* just return 'subvolid=FS_ROOT' */

Not the preferred style of comments.


+   len = strlen(subvol_string) +
+ u64_to_strlen(subvol_objectid) + 1;
+   dst = kmalloc(len, GFP_NOFS);
+   if (!dst)
+   return NULL;
+   sprintf(dst, "%s%llu", subvol_string, subvol_objectid);
+   return dst;
+   }
  
-	/*

-* We need the same args as before, but with this substitution:
-* s!subvol=[^,]+!subvolid=0!
-*
-* Since the replacement string is up to 2 bytes longer than the
-* original, allocate strlen(args) + 2 + 1 bytes.
-*/
+   switch (flags) {
+   case CLEAR_SUBVOL:
+   src = strstr(args, "subvol=");
+   break;
+   case CLEAR_SUBVOLID:
+   src = strstr(args, "subvolid=");
+   break;
+   }
  
-	src = strstr(args, "subvol=");

-   /* This shouldn't happen, but just in case.. */
-   if (!src)
-   return NULL;
+   if (!src) {
+   /* Case 2, some args, default subvolume mounting
+* just append ',subvolid=FS_ROOT' */
+
+   /* 1 for ending '\0', 1 for leading ',' */
+   len = strlen(args) + strlen(subvol_string) +
+ u64_to_strlen(subvol_objectid) + 2;
+   dst = kmalloc(len, GFP_NOFS);
+   if (!dst)
+   return NULL;
+   strcpy(dst, args);
+   sprintf(dst + strlen(args), ",%s%llu", subvol_string,
+   subvol_objectid);
+   return dst;
+   }
+
+   /* Case 3, subvolid=/subvol=  mount
+* repalce the 'subvolid/subvol' options to 'subvolid=FS_ROOT' */
+   comma = strchr(src, ',');
+   if (comma)
+   option_len = comma - src;
+   else
+   option_len = strlen(src);
+   len = strlen(args) - option_len  + 

Re: btrfs_qgroup_create unused parameter

2014-07-24 Thread Wang Shilong

Hi Kevin,

On 07/25/2014 07:23 AM, Kevin Brandstatter wrote:

I submitted a patch for this a week or two ago
(https://patchwork.kernel.org/patch/4486121/), but the latest for-linus
doesn't have it merged. Is it just being put off as minor, or is there a
problem with it?

I believe your patch will be picked up by Chris and sent to Linus
when the next merge window opens.

Since Chris is sometimes busy, patch merging can be
delayed for some time.

Thanks,
Wang


-Kevin

On 07/04/2014 09:09 PM, Wang Shilong wrote:

Hi

I think you are right,  @name here is unneeded..
You can give a patch for that.^_^

Wang

The code is pasted below for convenience of reference. The function that
creates a qgroup takes a 4th parameter (char *name). I assume this is the
name of the path to limit; however, I don't see it used anywhere in the
function.

-Kevin Brandstatter

int btrfs_create_qgroup(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info, u64 qgroupid, *char** 
*name**)*
{
struct btrfs_root *quota_root;
struct btrfs_qgroup *qgroup;
int ret = 0;

mutex_lock(fs_info-qgroup_ioctl_lock);
quota_root = fs_info-quota_root;
if (!quota_root) {
ret = -EINVAL;
goto out;
}
qgroup = find_qgroup_rb(fs_info, qgroupid);
if (qgroup) {
ret = -EEXIST;
goto out;
}

ret = add_qgroup_item(trans, quota_root, qgroupid);
if (ret)
goto out;

spin_lock(fs_info-qgroup_lock);
qgroup = add_qgroup_rb(fs_info, qgroupid);
spin_unlock(fs_info-qgroup_lock);

if (IS_ERR(qgroup))
ret = PTR_ERR(qgroup);
out:
mutex_unlock(fs_info-qgroup_ioctl_lock);
return ret;
}|

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [PATCH] btrfs: Return right extent when fiemap gives unaligned offset and len.

2014-07-24 Thread Qu Wenruo


 Original Message 
Subject: Re: [PATCH] btrfs: Return right extent when fiemap gives 
unaligned offset and len.

From: David Sterba dste...@suse.cz
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-07-24 20:17

On Fri, Jul 18, 2014 at 09:55:43AM +0800, Qu Wenruo wrote:

When a page-aligned start and len are passed to extent_fiemap(), the
result is good, but when start and len are not aligned, e.g. start = 1
and len = 4095, it returns no extent.

The problem is that start and len are both rounded down.

ALIGN rounds up, not down. So the wrong rounding uses an incorrect start
(4096) and finds no extents if there is, e.g., only one extent [0,4095].

Sorry for the wrong description in the patch.
Should I reword the patch and send a v2?

Thanks,
Qu



This patch will round down start and round up (start + len) to
return right extent.

Reported-by: Chandan Rajendra chan...@linux.vnet.ibm.com
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz




integration tree updated

2014-07-24 Thread Chris Mason
Hi everyone,

I've pushed out my current integration branch.  It does have a few of
Miao Xie's patches missing because there were some rejects.  I think
this was just because some things got pulled in out of order, and I'll
get it fixed up.

Also missing are Mark's quota snapshot deletion fixes.  They were
crashing during btrfs/011 with CONFIG_DEBUG_PAGE_ALLOC on.  We'll get
that nailed down.

integration is subject to rebasing, so please treat it more like a patch
queue.  It is very lightly tested, the goal is just to show which
patches are already applied and which ones are still pending.

Thanks!

-chris




[PATCH v2] btrfs: Return right extent when fiemap gives unaligned offset and len.

2014-07-24 Thread Qu Wenruo
When a page-aligned start and len are passed to extent_fiemap(), the
result is good, but when start and len are not aligned, e.g. start = 1
and len = 4095, it returns no extent.

The problem is that start and len are both rounded up. This patch rounds
start down and rounds (start + len) up, so that the right extent is
returned.

Reported-by: Chandan Rajendra chan...@linux.vnet.ibm.com
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
changelog:
v2: reword the description(ALIGN rounds up, not rounds down).
---
 fs/btrfs/extent_io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a389820..1c70cff 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4213,8 +4213,8 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		return -ENOMEM;
 	path->leave_spinning = 1;
 
-	start = ALIGN(start, BTRFS_I(inode)->root->sectorsize);
-	len = ALIGN(len, BTRFS_I(inode)->root->sectorsize);
+	start = round_down(start, BTRFS_I(inode)->root->sectorsize);
+	len = round_up(max, BTRFS_I(inode)->root->sectorsize) - start;
 
/*
 * lookup the last file extent.  We're not using i_size here
-- 
2.0.2
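
To make the rounding in the description above concrete, here is a small userspace sketch comparing the old and new calculations. The helpers align_up()/align_down() are local stand-ins for the kernel's ALIGN/round_up/round_down, struct byte_range and the widen_*() names are made up for this example, and it assumes that max in the patch holds start + len, as the description suggests.

```c
#include <stdint.h>

/* Local stand-ins for the kernel helpers; a must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical type used only in this sketch. */
struct byte_range {
	uint64_t start;
	uint64_t len;
};

/* Old behaviour: start and len are each rounded up on their own. */
static struct byte_range widen_old(uint64_t start, uint64_t len,
				   uint64_t sectorsize)
{
	struct byte_range r;

	r.start = align_up(start, sectorsize);
	r.len = align_up(len, sectorsize);
	return r;
}

/* Patched behaviour: round start down and the end (start + len) up. */
static struct byte_range widen_new(uint64_t start, uint64_t len,
				   uint64_t sectorsize)
{
	struct byte_range r;

	r.start = align_down(start, sectorsize);
	r.len = align_up(start + len, sectorsize) - r.start;
	return r;
}
```

For start = 1, len = 4095 and a 4096-byte sector size, widen_old() yields a range starting at 4096, which misses an extent covering [0,4095], while widen_new() yields start = 0 and len = 4096.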



Re: [PATCH] Btrfs: fix compressed write corruption on enospc

2014-07-24 Thread Wang Shilong

On 07/24/2014 10:48 PM, Liu Bo wrote:

When we fail to allocate space for the whole compressed extent, we
fall back to uncompressed IO, but we've forgotten to redirty the pages
which belong to this compressed extent. These 'clean' pages will
simply skip the 'submit' part and go to endio directly, so in the end we
get data corruption because nothing is written.

Signed-off-by: Liu Bo bo.li@oracle.com
---
  fs/btrfs/inode.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..8ea7610 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -709,6 +709,18 @@ retry:
unlock_extent(io_tree, async_extent->start,
  async_extent->start +
  async_extent->ram_size - 1);
+
+   /*
+* we need to redirty the pages if we decide to
+* fallback to uncompressed IO, otherwise we
+* will not submit these pages down to lower
+* layers.
+*/
+   extent_range_redirty_for_io(inode,
+   async_extent->start,
+   async_extent->start +
+   async_extent->ram_size - 1);
+
goto retry;

BTW, if such an ENOSPC happens, it means we could not reserve the compressed
space. So we retry with the no-compression code, which will try to reserve
even more space. Is there any reason we do this?

Thanks,
Wang

}
goto out_free;




Re: [PATCH] Btrfs: fix compressed write corruption on enospc

2014-07-24 Thread Wang Shilong

On 07/25/2014 10:08 AM, Liu Bo wrote:

On Fri, Jul 25, 2014 at 09:53:43AM +0800, Wang Shilong wrote:

On 07/24/2014 10:48 PM, Liu Bo wrote:

When we fail to allocate space for the whole compressed extent, we
fall back to uncompressed IO, but we've forgotten to redirty the pages
which belong to this compressed extent. These 'clean' pages will
simply skip the 'submit' part and go to endio directly, so in the end we
get data corruption because nothing is written.

Signed-off-by: Liu Bo bo.li@oracle.com
---
  fs/btrfs/inode.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..8ea7610 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -709,6 +709,18 @@ retry:
unlock_extent(io_tree, async_extent->start,
  async_extent->start +
  async_extent->ram_size - 1);
+
+   /*
+* we need to redirty the pages if we decide to
+* fallback to uncompressed IO, otherwise we
+* will not submit these pages down to lower
+* layers.
+*/
+   extent_range_redirty_for_io(inode,
+   async_extent->start,
+   async_extent->start +
+   async_extent->ram_size - 1);
+
goto retry;

BTW, if such an ENOSPC happens, it means we could not reserve the compressed
space. So we retry with the no-compression code, which will try to reserve
even more space. Is there any reason we do this?

Compressed extents need contiguous space, while uncompressed extents have
more choices.

Yeah, that is reasonable. Thanks for your answer.^_^
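
A toy illustration of the point in Liu Bo's answer: a request that must be contiguous can fail even when a larger request that may be split across free ranges succeeds. This is not how the btrfs allocator is implemented; the free-space list and both function names are invented for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model: free space is a list of free range lengths in bytes.
 * Returns 1 if a single free range can hold the whole request.
 */
static int can_alloc_contiguous(const uint64_t *free_lens, size_t n,
				uint64_t want)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (free_lens[i] >= want)
			return 1;
	return 0;
}

/* Returns 1 if the request fits when it may be split across ranges. */
static int can_alloc_split(const uint64_t *free_lens, size_t n,
			   uint64_t want)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		total += free_lens[i];
		if (total >= want)
			return 1;
	}
	return 0;
}
```

With free ranges of 4096, 8192 and 4096 bytes, a contiguous 12288-byte request fails, while a splittable 16384-byte request still fits.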



thanks,
-liubo


Thanks,
Wang

}
goto out_free;

.





Re: integration tree updated

2014-07-24 Thread Qu Wenruo

Hi chris,

It seems that two of my wrong patches got merged into the integration branch:
6068d17c8ab5bce946e9678ed2064e9f966cbe62 btrfs: Merge default subvolume mount codes into btrfs_mount_subvol().
8a2166332e332541f13b34b7248c0f14f575731e btrfs: Call mount_subtree() even 'subvolid=' mount option is given.


These two patches do not work completely, and the successor patch
(the show_path one) is still under review.


If it is OK for you, please remove these two patches.

Thanks,
Qu
 Original Message 
Subject: integration tree updated
From: Chris Mason c...@fb.com
To: linux-btrfs linux-btrfs@vger.kernel.org
Date: 2014-07-25 09:40

Hi everyone,

I've pushed out my current integration branch.  It does have a few of
Miao Xie's patches missing because there were some rejects.  I think
this was just because some things got pulled in out of order, and I'll
get it fixed up.

Also missing are Mark's quota snapshot deletion fixes.  They were
crashing during btrfs/011 with CONFIG_DEBUG_PAGE_ALLOC on.  We'll get
that nailed down.

integration is subject to rebasing, so please treat it more like a patch
queue.  It is very lightly tested, the goal is just to show which
patches are already applied and which ones are still pending.

Thanks!

-chris




Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Duncan
Martin Steigerwald posted on Thu, 24 Jul 2014 20:49:37 +0200 as excerpted:

 It may take some time though, because while compiling the kernel BTRFS
 hung again, which caused the loss of the KDE Baloo desktop search file
 index and parts of a mail I wrote in KMail.

Heh.  While I do run a kde(-lite) desktop, at least I don't have those 
problems to deal with.  As a gentooer I have the option to build kde 
without the semantic-desktop junk and I've taken that option, so no baloo 
or the like here.  And after the akonadified kmail lost one too many 
mails and I was going to need to reset the data store to retrieve them 
once again, I asked myself why I was putting up with it, after all, email 
is a decades old technology that should NOT be rocket-science any longer, 
and soon enough I was NOT putting up with it any longer, as I'd switched 
to claws-mail.  Actually, killing with fire kdepim and anything akonadi 
related was what allowed me to kill semantic-desktop as well instead of 
just run-time disabling it, since akonadi is part of the steaming pile.

And claws-mail has this nice option I didn't even know could be done on 
pop3 mail servers, too.  It downloads the mail for local reading, but 
keeps it on the server for a week (configurable) before final pop3 server-
side deletion, just in case you do crash after download and lose the 
local copy.  I've not actually had to take advantage of that feature yet, 
but it's sure nice to have, just in case. =:^)

Interesting this came up here just now, too, as there's a current xmodulo 
post about baloo and milou in kde4 and the carryover to kde-frameworks5 
and plasma5, too, with an ongoing discussion.

http://xmodulo.com/2014/07/kde-semantic-desktop-nepomuk-baloo.html

So fortunately, while I am a development version tester of both kde 
and btrfs, the akonadi and semantic-desktop steaming-pile-of- is not 
something I have to worry about the stability of (or more precisely the 
lack thereof), while also testing a not yet fully stable btrfs at the 
same time.  Hopefully that'll continue to be the case in the claimed more 
modular kde-frameworks-5 era, because there's more than one way to ensure 
that I don't have to deal with that pile, and just as I suddenly found 
some other option for mail after using kmail since the kde2 era when it 
semantic-desktop-integrated without option, so my kde/plasma desktop, 
also since the kde2 era, can find itself going the same route locally, 
should it insist on going the same route globally.

Tho just as I did the akonadified kmail, I'll likely keep an open enough 
mind to try it.  shrug  Maybe it'll actually work this time, without 
eating up gigs of indexing space that has to be reset frequently due to 
something going wrong, to do it.  Time will tell...

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Nick Krause
On Thu, Jul 24, 2014 at 10:32 PM, Duncan 1i5t5.dun...@cox.net wrote:
 Martin Steigerwald posted on Thu, 24 Jul 2014 20:49:37 +0200 as excerpted:

 It may take some time though, because while compiling the kernel BTRFS
 hung again, which caused the loss of the KDE Baloo desktop search file
 index and parts of a mail I wrote in KMail.

 Heh.  While I do run a kde(-lite) desktop, at least I don't have those
 problems to deal with.  As a gentooer I have the option to build kde
 without the semantic-desktop junk and I've taken that option, so no baloo
 or the like here.  And after the akonadified kmail lost one too many
 mails and I was going to need to reset the data store to retrieve them
 once again, I asked myself why I was putting up with it, after all, email
 is a decades old technology that should NOT be rocket-science any longer,
 and soon enough I was NOT putting up with it any longer, as I'd switched
 to claws-mail.  Actually, killing with fire kdepim and anything akonadi
 related was what allowed me to kill semantic-desktop as well instead of
 just run-time disabling it, since akonadi is part of the steaming pile.

 And claws-mail has this nice option I didn't even know could be done on
 pop3 mail servers, too.  It downloads the mail for local reading, but
 keeps it on the server for a week (configurable) before final pop3 server-
 side deletion, just in case you do crash after download and lose the
 local copy.  I've not actually had to take advantage of that feature yet,
 but it's sure nice to have, just in case. =:^)

 Interesting this came up here just now, too, as there's a current xmodulo
 post about baloo and milou in kde4 and the carryover to kde-frameworks5
 and plasma5, too, with an ongoing discussion.

 http://xmodulo.com/2014/07/kde-semantic-desktop-nepomuk-baloo.html

 So fortunately, while I am a development version tester of both kde
 and btrfs, the akonadi and semantic-desktop steaming-pile-of- is not
 something I have to worry about the stability of (or more precisely the
 lack thereof), while also testing a not yet fully stable btrfs at the
 same time.  Hopefully that'll continue to be the case in the claimed more
 modular kde-frameworks-5 era, because there's more than one way to ensure
 that I don't have to deal with that pile, and just as I suddenly found
 some other option for mail after using kmail since the kde2 era when it
 semantic-desktop-integrated without option, so my kde/plasma desktop,
 also since the kde2 era, can find itself going the same route locally,
 should it insist on going the same route globally.

 Tho just as I did the akonadified kmail, I'll likely keep an open enough
 mind to try it.  shrug  Maybe it'll actually work this time, without
 eating up gigs of indexing space that has to be reset frequently due to
 something going wrong, to do it.  Time will tell...

 --
 Duncan - List replies preferred.   No HTML msgs.
 Every nonfree program has a lord, a master --
 and if you use the program, he is your master.  Richard Stallman


Hey Duncan and others,
I have read this, and it seems to need some work.
If you want my help, please ask. I am new to the kernel,
so I may ask a dumb question or two, but if that's fine with
you I have no problem helping out here. I would like
a log of the printk statements leading up to the hang, if that's
not too much work, so that I can trace this back.
Cheers, Nick



[PATCH] mkfs.btrfs: round all device sizes to sectorsize

2014-07-24 Thread Eric Sandeen
make_btrfs() rounds down the first device size to a multiple of sectorsize:

num_bytes = (num_bytes / sectorsize) * sectorsize;

but subsequent device adds don't.

This seems a bit odd & inconsistent, and it makes xfstest btrfs/011
_notrun(), because it explicitly checks that devices are the same size.

I don't know that there is anything inherently wrong with having
a few device bytes extend past the last block, but to be consistent,
it seems like btrfs_add_to_fsid() should round the size in the same
way.

And now btrfs/011 runs more consistently; the test devices don't
have to be sectorsize multiples in order for all mkfs'd device
sizes to match.

Signed-off-by: Eric Sandeen sand...@redhat.com
---

ideally this might go into btrfs_device_size(), but we don't have
the chosen sector size anywhere near there...

diff --git a/utils.c b/utils.c
index e130849..4d7ee35 100644
--- a/utils.c
+++ b/utils.c
@@ -554,7 +554,7 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
device->sector_size = sectorsize;
device->fd = fd;
device->writeable = 1;
-   device->total_bytes = block_count;
+   device->total_bytes = (block_count / sectorsize) * sectorsize;
device->bytes_used = 0;
device->total_ios = 0;
device->dev_root = root->fs_info->dev_root;
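
As a quick illustration of why the rounding makes device sizes compare equal, here is a minimal userspace sketch of the same expression used in the patch. The function name round_to_sectorsize() and the sample sizes are made up for this example.

```c
#include <stdint.h>

/*
 * Round a device size down to a whole number of sectors, using the same
 * divide-then-multiply expression as the patch.  sectorsize must be
 * non-zero.
 */
static uint64_t round_to_sectorsize(uint64_t num_bytes, uint32_t sectorsize)
{
	return (num_bytes / sectorsize) * sectorsize;
}
```

Two devices of 1048576 + 512 and 1048576 + 100 bytes both round to 1048576 with a 4096-byte sector size, so a same-size check such as the one the commit message says btrfs/011 performs would see identical sizes.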



Re: BTRFS hang with 3.16-rc5 (and also with 3.16-rc4)

2014-07-24 Thread Torbjørn

On 07/24/2014 04:58 PM, Chris Mason wrote:

snip
Liu Bo has a promising patch:

https://patchwork.kernel.org/patch/4618421/

Please give it a shot.  There's a second deadlock reading the free space
cache, I'm still working on that one too.

-chris
As expected (my hang was with the free space cache), I still get hangs
with this applied on top of 3.16-rc6.

Looking forward to the free space cache patch.

I have not been able to trigger the same hang as I had with 3.15 on any 
of the 3.16-rc6 kernels so far.


--
Torbjørn
