[PATCH v2] lib: add size unit t/p/e to memparse

2014-04-01 Thread Gui Hecheng
For modern filesystems such as btrfs, operations on sizes at the
t/p/e level are common.
Add parsing of the t/p/e size units to memparse.

Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
changelog
v1->v2: replace kilobyte with kibibyte, and others
---
 lib/cmdline.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/cmdline.c b/lib/cmdline.c
index eb67911..7cf5cb8 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -119,11 +119,17 @@ char *get_options(const char *str, int nints, int *ints)
  * @retptr: (output) Optional pointer to next char after parse completes
  *
  * Parses a string into a number.  The number stored at @ptr is
- * potentially suffixed with %K (for kilobytes, or 1024 bytes),
- * %M (for megabytes, or 1048576 bytes), or %G (for gigabytes, or
- * 1073741824).  If the number is suffixed with K, M, or G, then
- * the return value is the number multiplied by one kilobyte, one
- * megabyte, or one gigabyte, respectively.
+ * potentially suffixed with
+ * %K (for kibibytes, or 1024 bytes),
+ * %M (for mebibytes, or 1048576 bytes),
+ * %G (for gibibytes, or 1073741824),
+ * %T (for tebibytes, or 1099511627776),
+ * %P (for pebibytes, or 1125899906842624 bytes),
+ * %E (for exbibytes, or 1152921504606846976 bytes).
+ * If the number is suffixed with K, M, G, T, P, E, then
+ * the return value is the number multiplied by one kibibyte, one
+ * mebibyte, one gibibyte, one tebibyte, one pebibyte, one exbibyte,
+ * respectively.
  */
 
 unsigned long long memparse(const char *ptr, char **retptr)
@@ -133,6 +139,15 @@ unsigned long long memparse(const char *ptr, char **retptr)
 	unsigned long long ret = simple_strtoull(ptr, &endptr, 0);
 
 	switch (*endptr) {
+	case 'E':
+	case 'e':
+		ret <<= 10;
+	case 'P':
+	case 'p':
+		ret <<= 10;
+	case 'T':
+	case 't':
+		ret <<= 10;
 	case 'G':
 	case 'g':
 		ret <<= 10;
-- 
1.8.1.4
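
The suffix handling in the patch above relies on switch fall-through: each
larger unit shifts by 10 bits and then falls into the next smaller one. A
minimal userspace sketch of that idea follows. It is not the kernel code:
parse_size() is a hypothetical name, and strtoull() stands in for the
kernel's simple_strtoull().

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace sketch of memparse-style suffix handling.
 * parse_size() is a hypothetical name; strtoull() is an assumed
 * stand-in for the kernel's simple_strtoull().
 */
unsigned long long parse_size(const char *ptr, char **retptr)
{
	char *endptr;
	unsigned long long ret;

	assert(ptr != NULL);
	ret = strtoull(ptr, &endptr, 0);

	/* Each case multiplies by 1024 and falls through to the next one. */
	switch (*endptr) {
	case 'E':
	case 'e':
		ret <<= 10;
		/* fall through */
	case 'P':
	case 'p':
		ret <<= 10;
		/* fall through */
	case 'T':
	case 't':
		ret <<= 10;
		/* fall through */
	case 'G':
	case 'g':
		ret <<= 10;
		/* fall through */
	case 'M':
	case 'm':
		ret <<= 10;
		/* fall through */
	case 'K':
	case 'k':
		ret <<= 10;
		endptr++;	/* consume the suffix character */
		/* fall through */
	default:
		break;
	}

	if (retptr)
		*retptr = endptr;
	return ret;
}
```

With this sketch, "2t" goes through four shifts (T, G, M, K) and yields
2 * 2^40 bytes, while a string without a suffix is returned unchanged.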

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/2] Btrfs: don't compress for a small write

2014-04-01 Thread Wang Shilong
Compressing a small file range (<= blocksize) that is not an
inline extent cannot save any disk space, so skipping it
saves some CPU time.

This patch also fixes wrongly setting the nocompression flag on an
inode. Take the case where @total_in is 4096 and we get
@total_compressed 52: because we align to the page cache size
first, we conclude that @total_in=@total_compressed
and therefore clear this inode's compression flag.

An exception comes from inserting inline extent failure but we
still have @total_compressed  @total_in,so we will still reset
inode's flag, this is ok, because we don't have good compression
effect.

Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
---
v1->v2:
make the comments and changelog clearer that we only skip
small data that cannot be inlined (pointed out by Chris).
---
 fs/btrfs/inode.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ec8766..5cb3075 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -394,6 +394,14 @@ static noinline int compress_file_range(struct inode *inode,
 	    (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
 		btrfs_add_inode_defrag(NULL, inode);
 
+	/*
+	 * skip compression for a small file range(<=blocksize) that
+	 * isn't an inline extent, since it doesn't save disk space at all.
+	 */
+	if ((end - start + 1) <= blocksize &&
+	    (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
+		goto cleanup_and_bail_uncompressed;
+
 	actual_end = min_t(u64, isize, end + 1);
 again:
 	will_compress = 0;
-- 
1.9.0
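
The arithmetic behind the changelog can be sketched in a few lines. This is
an illustration only, assuming a 4096-byte block size; align_up() and
compression_saves_space() are hypothetical helper names, not functions from
fs/btrfs/inode.c.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: returns 1 when storing total_compressed bytes
 * instead of total_in bytes frees at least one whole block on disk.
 * Both sizes are rounded up to the block size first, because a
 * non-inline extent always occupies whole blocks.
 */
static int compression_saves_space(uint64_t total_in,
				   uint64_t total_compressed,
				   uint64_t blocksize)
{
	return align_up(total_compressed, blocksize) <
	       align_up(total_in, blocksize);
}
```

For the case in the changelog, 52 compressed bytes round up to 4096, which
equals the 4096 input bytes, so nothing is saved. An 8192-byte range that
compresses to 52 bytes would still save one block.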



[PATCH v2] btrfs-progs: update manpage for btrfs resize support size unit t/p/e

2014-04-01 Thread Gui Hecheng
btrfs resize now supports parsing the size units k/m/g/t/p/e in kernel
space; adopt the change in the userspace manpage.

Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
changelog
v1->v2: replace kilobyte with kibibyte, and others
---
 cmds-filesystem.c | 3 ++-
 man/btrfs.8.in    | 9 +++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 7eb6e9d..b81768b 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -922,10 +922,11 @@ static int cmd_defrag(int argc, char **argv)
 }
 
 static const char * const cmd_resize_usage[] = {
-	"btrfs filesystem resize [devid:][+/-]<newsize>[gkm]|[devid:]max <path>",
+	"btrfs filesystem resize [devid:][+/-]<newsize>[kKmMgGtTpPeE]|[devid:]max <path>",
 	"Resize a filesystem",
 	"If 'max' is passed, the filesystem will occupy all available space",
 	"on the device 'devid'.",
+	"[kK] means KiB, which denotes 1KiB = 1024B, 1MiB = 1024KiB, etc.",
 	NULL
 };
 
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 7fbde82..9fad64a 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -31,7 +31,7 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem defragment\fP [\fIoptions\fP] \fIfile\fP|\fIdir\fP [\fIfile\fP|\fIdir...\fP]\fP
 .PP
-\fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fP
+\fBbtrfs\fP \fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[kKmMgGtTpPeE]|[\fIdevid\fP:]\fImax path\fP
 .PP
 \fBbtrfs\fP \fBfilesystem label\fP [\fIdevice\fP|\fImount_point\fP] [\fInewlabel\fP]
 .PP
@@ -343,7 +343,7 @@ copies with \fBcp --reflink\fP.
 .\"
 .\" Some wording are extracted by the resize2fs man page
 .\"
-\fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[gkm]|[\fIdevid\fP:]\fImax path\fR
+\fBfilesystem resize\fP [\fIdevid\fP:][+/\-]\fIsize\fP[kKmMgGtTpPeE]|[\fIdevid\fP:]\fImax path\fR
 Resize a filesystem identified by \fIpath\fR for the underlying device
 \fIdevid\fR.  The \fIdevid\fR can be found with \fBbtrfs filesystem show\fR and
 defaults to 1 if not specified.
@@ -352,8 +352,9 @@ If the prefix \fI+\fR or \fI\-\fR is present the size is increased or decreased
 by the quantity \fIsize\fR.
 If no units are specified, the unit of the \fIsize\fR parameter defaults to
 bytes. Optionally, the size parameter may be suffixed by one of the following
-units designators: 'K', 'M', or 'G', kilobytes, megabytes, or gigabytes,
-respectively.
+units designators: 'K', 'M', 'G', 'T', 'P', 'E', kibibytes, mebibytes, gibibytes,
+tebibytes, pebibytes, exbibytes respectively.
+K means KiB, which denotes 1KiB = 1024B, 1MiB = 1024KiB, etc..
 
 If 'max' is passed, the filesystem will occupy all available space on the
 device \fIdevid\fR.
-- 
1.8.1.4



[PATCH v3 2/2] Btrfs: scrub raid56 stripes in the right way

2014-04-01 Thread Wang Shilong
Steps to reproduce:
 # mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
 # mount /dev/sda8 /mnt
 # btrfs scrub start -BR /mnt
 # echo $? <-- unverified errors make the return value 3

This is because we don't set up the right mapping between physical
and logical addresses for raid56, which causes checksum mismatches.
Everything turns out to be fine later, when rechecking using
btrfs_map_block().

This patch fixes the problem by setting up the right mappings, and
we now only verify the checksums of data stripes.

Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
---
v2->v3: fix wrong logical offset set plus some clean ups.
v1->v2: on the right way to fix the problem.
---
 fs/btrfs/scrub.c | 108 +--
 1 file changed, 89 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index db21a13..aee909f 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2235,6 +2235,47 @@ behind_scrub_pages:
 	return 0;
 }
 
+/*
+ * Given a physical address, this will calculate its
+ * logical offset. if this is a parity stripe, it will return
+ * the most left data stripe's logical offset.
+ *
+ * return 0 if it is a data stripe, 1 means parity stripe.
+ */
+static int get_raid56_logic_offset(u64 physical, int num,
+				   struct map_lookup *map, u64 *offset)
+{
+	int i;
+	int j = 0;
+	u64 stripe_nr;
+	u64 last_offset;
+	int stripe_index;
+	int rot;
+
+	last_offset = (physical - map->stripes[num].physical) *
+		      nr_data_stripes(map);
+	*offset = last_offset;
+	for (i = 0; i < nr_data_stripes(map); i++) {
+		*offset = last_offset + i * map->stripe_len;
+
+		stripe_nr = *offset;
+		do_div(stripe_nr, map->stripe_len);
+		do_div(stripe_nr, nr_data_stripes(map));
+
+		/* Work out the disk rotation on this stripe-set */
+		rot = do_div(stripe_nr, map->num_stripes);
+		/* calculate which stripe this data locates */
+		rot += i;
+		stripe_index = do_div(rot, map->num_stripes);
+		if (stripe_index == num)
+			return 0;
+		if (stripe_index < num)
+			j++;
+	}
+	*offset = last_offset + j * map->stripe_len;
+	return 1;
+}
+
 static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 					   struct map_lookup *map,
 					   struct btrfs_device *scrub_dev,
@@ -2256,6 +2297,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	u64 physical;
 	u64 logical;
 	u64 logic_end;
+	u64 physical_end;
 	u64 generation;
 	int mirror_num;
 	struct reada_control *reada1;
@@ -2269,16 +2311,10 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	u64 extent_len;
 	struct btrfs_device *extent_dev;
 	int extent_mirror_num;
-	int stop_loop;
-
-	if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
-			 BTRFS_BLOCK_GROUP_RAID6)) {
-		if (num >= nr_data_stripes(map)) {
-			return 0;
-		}
-	}
+	int stop_loop = 0;
 
 	nstripes = length;
+	physical = map->stripes[num].physical;
 	offset = 0;
 	do_div(nstripes, map->stripe_len);
 	if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
@@ -2296,6 +2332,11 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	} else if (map->type & BTRFS_BLOCK_GROUP_DUP) {
 		increment = map->stripe_len;
 		mirror_num = num % map->num_stripes + 1;
+	} else if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
+				BTRFS_BLOCK_GROUP_RAID6)) {
+		get_raid56_logic_offset(physical, num, map, &offset);
+		increment = map->stripe_len * nr_data_stripes(map);
+		mirror_num = 1;
 	} else {
 		increment = map->stripe_len;
 		mirror_num = 1;
@@ -2319,7 +2360,15 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	 * to not hold off transaction commits
 	 */
 	logical = base + offset;
-
+	physical_end = physical + nstripes * map->stripe_len;
+	if (map->type & (BTRFS_BLOCK_GROUP_RAID5 |
+			 BTRFS_BLOCK_GROUP_RAID6)) {
+		get_raid56_logic_offset(physical_end, num,
+					map, &logic_end);
+		logic_end += base;
+	} else {
+		logic_end = logical + increment * nstripes;
+	}
 	wait_event(sctx->list_wait,
 		   atomic_read(&sctx->bios_in_flight) == 0);
 	scrub_blocked_if_needed(fs_info);
@@ -2328,7 +2377,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
 	key_start.objectid = logical;
key_start.type = 
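
The physical-to-logical mapping done by get_raid56_logic_offset() in the
patch above can be tried out in userspace. The following is a sketch under
assumptions, not the kernel code: struct raid_map is a hypothetical,
simplified stand-in for struct map_lookup, and plain division and modulo
replace do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct map_lookup. */
struct raid_map {
	int num_stripes;	/* devices in the chunk, parity included */
	int nr_parity;		/* assumed: 1 for RAID5, 2 for RAID6 */
	uint64_t stripe_len;	/* bytes per stripe element */
	uint64_t dev_start[8];	/* physical start of the chunk per device */
};

static int nr_data(const struct raid_map *map)
{
	return map->num_stripes - map->nr_parity;
}

/*
 * Returns 0 and stores the logical offset when (physical, num) holds
 * data; returns 1 when it holds parity.
 */
static int raid56_logic_offset(uint64_t physical, int num,
			       const struct raid_map *map, uint64_t *offset)
{
	uint64_t last_offset;
	int i, j = 0;

	assert(num >= 0 && num < map->num_stripes);
	last_offset = (physical - map->dev_start[num]) * nr_data(map);

	for (i = 0; i < nr_data(map); i++) {
		uint64_t stripe_nr;
		int rot, stripe_index;

		*offset = last_offset + i * map->stripe_len;
		/* Index of the full stripe (row) this offset belongs to. */
		stripe_nr = *offset / map->stripe_len / nr_data(map);
		/* Rotation of that row, then the device holding element i. */
		rot = (int)(stripe_nr % map->num_stripes);
		stripe_index = (rot + i) % map->num_stripes;
		if (stripe_index == num)
			return 0;
		if (stripe_index < num)
			j++;
	}
	*offset = last_offset + j * map->stripe_len;
	return 1;
}
```

For a three-device RAID5 chunk with a 64KiB stripe length and all devices
starting at physical offset 0, the first row places data on devices 0 and 1
and parity on device 2. In the second row (physical offset 65536) the
rotation moves parity to device 0, and devices 1 and 2 map to logical
offsets 131072 and 196608.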

BTRFS hangs - possibly NFS related?

2014-04-01 Thread kim-btrfs
Apologies if this is known, but I've been lurking a while on the list and
not seen anything similar - and I'm running out of ideas on what to do next
to debug it.

Small HP microserver box, running Debian, EXT4 system disk plus 4 disk BTRFS
array shared over NFS (nfs-kernel-server) and SMB - the disks recently moved
from a different box where they've been running faultlessly for months,
although that didn't use NFS.

Under reasonable combined NFS and SMB load with only a couple of clients,
the shares lock up, and the load average on server and clients goes high
(10-12) and stays there. It apparently isn't actual CPU load, and there's
little if any disk activity on the server.

Killing NFS and/or Samba sometimes helps, but the problem always returns when
the load comes back on. I chased round NFS and Samba options, then found that
when the clients hang, direct access to the disk on the server is
unresponsive too.

I noticed a btrfs-transacti process hung in D state, as are all the NFS
processes:

3779 ?S 0:00 [nfsd4]
3780 ?S 0:00 [nfsd4_callbacks]
3782 ?D  0:27 [nfsd]
3783 ?D  0:27 [nfsd]
3784 ?D  0:28 [nfsd]
3785 ?D  0:26 [nfsd]

sync instantly unsticks everything and it all works again for another
couple of minutes, when it locks up again, same symptoms. Nothing
apparently written to kern.log or dmesg, which has been the frustration all
through - I don't know where to find the culprit!

As a band-aid I've put

    btrfs filesystem sync /mnt/btrfs

in the crontab once a minute, which is actually working just fine and has
been all morning - every 5 minutes was not enough.

Any recommendations on where I can look next, or any known holes I've fallen
into?  Do I need to force NFS clients to sync in their mount options?


Background:
Kernel - 3.13-1-amd64 #1 SMP Debian 3.13.7-1 (2014-03-25)
AMD N54L with 10GB RAM.

##
Total devices 4 FS bytes used 848.88GiB
devid    2 size 465.76GiB used 319.03GiB path /dev/sdc
devid    4 size 465.76GiB used 319.00GiB path /dev/sda
devid    5 size 455.76GiB used 309.03GiB path /dev/sdb2
devid    6 size 931.51GiB used 785.00GiB path /dev/sdd

##
Data, RAID1: total=864.00GiB, used=847.86GiB
System, RAID1: total=32.00MiB, used=128.00KiB
Metadata, RAID1: total=2.00GiB, used=1009.93MiB

A scrub passes without finding any errors.  

There are a couple of VM images with light traffic which do fragment a
little but I manually defrag those every day so often and I haven't had any
problems there - it certainly isn't thrashing.



Cheers
Kim



[GIT PULL] Btrfs pull -- part 1

2014-04-01 Thread Chris Mason
Hi Linus,

We still have some corruption fixes and other patches coming in for the
merge window, but this batch is tested and ready to go.

Please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git master

This is a pretty long stream of bug fixes and performance fixes.

Qu Wenruo has replaced the btrfs async threads with regular kernel
workqueues.  We'll keep an eye out for performance differences, but it's
nice to be using more generic code for this.

Miao Xie (23) commits (+621/-237):
    Btrfs: use ACCESS_ONCE to prevent the optimize accesses to ->last_trans_log_full_commit (+10/-7)
    Btrfs: don't mix the ordered extents of all files together during logging the inodes (+49/-35)
    Btrfs: fix unprotected alloc list insertion during the finishing procedure of replace (+3/-0)
    Btrfs: just do dirty page flush for the inode with compression before direct IO (+7/-7)
    Btrfs: fix use-after-free in the finishing procedure of the device replace (+111/-15)
    Btrfs: use signed integer instead of unsigned long integer for log transid (+11/-11)
    Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock (+29/-24)
    Btrfs: fix possible empty list access when flushing the delalloc inodes (+8/-0)
    Btrfs: fix early enospc due to the race of the two ordered extent wait (+14/-3)
    Btrfs: introduce btrfs_{start, end}_nocow_write() for each subvolume (+133/-10)
    Btrfs: don't start the log transaction if the log tree init fails (+14/-12)
    Btrfs: fix the skipped transaction commit during the file sync (+16/-10)
    Btrfs: fix wrong lock range and write size in check_can_nocow() (+3/-2)
    Btrfs: remove the unnecessary flush when preparing the pages (+5/-8)
    Btrfs: remove unnecessary memory barrier in btrfs_sync_log() (+0/-3)
    Btrfs: remove unnecessary lock in may_commit_transaction() (+1/-8)
    Btrfs: stop joining the log transaction if sync log fails (+14/-2)
    Btrfs: just wait or commit our own log sub-transaction (+47/-23)
    Btrfs: fix skipped error handle when log sync failed (+111/-31)
    Btrfs: wake up the tasks that wait for the io earlier (+10/-4)
    Btrfs: reclaim delalloc metadata more aggressively (+1/-1)
    Btrfs: split the global ordered extents mutex (+7/-13)
    Btrfs: fix preallocate vs double nocow write (+17/-8)

Filipe Manana (21) commits (+482/-153):
    Btrfs: part 2, fix incremental send's decision to delay a dir move/rename (+66/-5)
    Btrfs: don't insert useless holes when punching beyond the inode's size (+17/-11)
    Btrfs: correctly determine if blocks are shared in btrfs_compare_trees (+10/-1)
    Btrfs: fix send issuing outdated paths for utimes, chown and chmod (+12/-19)
    Btrfs: fix incremental send's decision to delay a dir move/rename (+3/-3)
    Btrfs: make defrag not fragment files when using prealloc extents (+2/-1)
    Btrfs: correctly flush data on defrag when compression is enabled (+5/-1)
    Btrfs: remove unneeded field / smaller extent_map structure (+12/-13)
    Btrfs: avoid unnecessary utimes update in incremental send (+11/-10)
    Btrfs: fix send attempting to rmdir non-empty directories (+221/-26)
    Btrfs: send, don't send rmdir for same target multiple times (+4/-1)
    Btrfs: incremental send, fix invalid path after dir rename (+34/-7)
    Btrfs: less fs tree lock contention when using autodefrag (+10/-2)
    Btrfs: remove unnecessary inode generation lookup in send (+2/-7)
    Btrfs: add missing kfree in btrfs_destroy_workqueue (+1/-0)
    Btrfs: more efficient split extent state insertion (+8/-5)
    Btrfs: remove unnecessary ref heads rb tree search (+3/-4)
    Btrfs: fix race when updating existing ref head (+1/-1)
    Btrfs: more efficient btrfs_drop_extent_cache (+45/-14)
    Btrfs: cache extent states in defrag code path (+9/-4)
    Btrfs: cleanup delayed-ref.c:find_ref_head() (+6/-18)

Qu Wenruo (21) commits (+907/-1287):
    btrfs: Added btrfs_workqueue_struct implemented ordered execution based on kernel workqueue (+164/-0)
    btrfs: Replace fs_info->qgroup_rescan_worker workqueue with btrfs_workqueue. (+16/-15)
    btrfs: Replace fs_info->readahead_workers workqueue with btrfs_workqueue. (+11/-14)
    btrfs: Replace fs_info->delayed_workers workqueue with btrfs_workqueue. (+11/-13)
    btrfs: Replace fs_info->fixup_workers workqueue with btrfs_workqueue. (+10/-11)
    btrfs: Replace fs_info->cache_workers workqueue with btrfs_workqueue. (+11/-11)
    btrfs: Add high priority workqueue support for btrfs_workqueue_struct (+83/-13)
    btrfs: Replace fs_info->rmw_workers workqueue with btrfs_workqueue. (+21/-28)
    btrfs: Replace fs_info->scrub_* workqueue with btrfs_workqueue. (+55/-48)
    btrfs: Replace fs_info->endio_* workqueue with btrfs_workqueue. (+68/-81)
    btrfs: Replace fs_info->delalloc_workers with btrfs_workqueue (+14/-20)
    btrfs: Replace fs_info->submit_workers with btrfs_workqueue. (+18/-16)
    btrfs: Replace fs_info->flush_workers with

SELinux on btrfs

2014-04-01 Thread Michael Schuerig

I'm currently considering using SELinux on an existing system with
btrfs filesystems. This would be my first time with SELinux and I wouldn't
expect everything to go smoothly. I'm already aware that SELinux's
automatic labelling of files is not aware of subvolumes[*]. I already
have quite a few read-only snapshots that I don't want to forfeit;
however, I'm not at all sure how SELinux would interact with them.

Are there any other considerations I ought to be aware of?

Michael


[*] https://wiki.debian.org/SELinux/Setup#btrfs
-- 
Michael Schuerig
mailto:mich...@schuerig.de
http://www.schuerig.de/michael/



Re: [Help] Errors found in extent allocation tree or chunk allocation

2014-04-01 Thread Michael Witten
> I'd actually put more faith in the btrfs balance command...
> Btrfs scrub is another alternative

> I suppose, then, that my ultimate plan of action is thus:
>
>   * Back up important data.
>
>   * Give `btrfs scrub' a try.
>
>   * Give `btrfs balance' a try.
>
>   * Rebuild the file system if necessary.

The `btrfs scrub' completed successfully without error, but did not
solve my issues.

The `btrfs balance' completed successfully without error, and DID
solve my issues; it relocated every chunk, after which `btrfsck'
ran smoothly.

Thanks for the advice! You've put me at ease, and you've saved me
a lot of time and energy.


Re: [Help] Errors found in extent allocation tree or chunk allocation

2014-04-01 Thread Michael Witten
On Tue, 1 Apr 2014 07:15:47 +0800, Shilong Wang wrote:

> hello, recently, i was working on btrfsck repair. i post some
> patches to make btrfsck to rebuild extent tree with snapshots.
> you can pull code from david latest branch and have a try with
> btrfsck --init-extent-tree.

Fortunately, the `btrfs balance' command solved the issue for me:

  http://mid.gmane.org/8e4932bb29f743419660da1846c2363b-mfwit...@gmail.com

I did notice `--init-extent-tree' when trying to figure out what
I might do to fix my troubles, but I was uncertain about exactly
what it would do.

According to `man btrfsck':

  --init-csum-tree
   create a new CRC tree.

  --init-extent-tree
   create a new extent tree.

These sound potentially destructive, so I was afraid to use them;
what exactly does it mean to create a new tree? After your email,
I assume each action creates a new tree that is populated from the
data in the corresponding old tree.

Also, you wrote:

> rebuild extent tree with snapshots

Do you mean to say that your patches handle the previously unhandled
case in which both of the following are true?

  * There are snapshots in the file system.
  * The extent tree needs to be rebuilt.

Thanks!


[PATCH] Btrfs: implement inode_operations callback tmpfile

2014-04-01 Thread Filipe David Borba Manana
This implements the tmpfile callback of struct inode_operations, introduced
in the linux kernel 3.11 [1], and implemented already by some filesystems.

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/inode.c | 120 +--
 1 file changed, 100 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ec8766..1a38ec7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5522,6 +5522,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	struct btrfs_inode_ref *ref;
 	struct btrfs_key key[2];
 	u32 sizes[2];
+	int nitems = name ? 2 : 1;
 	unsigned long ptr;
 	int ret;
 
@@ -5541,7 +5542,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	 */
 	inode->i_ino = objectid;
 
-	if (dir) {
+	if (dir && name) {
 		trace_btrfs_inode_request(dir);
 
 		ret = btrfs_set_inode_index(dir, index);
@@ -5550,6 +5551,8 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 			iput(inode);
 			return ERR_PTR(ret);
 		}
+	} else if (dir) {
+		*index = 0;
 	}
 	/*
 	 * index_cnt is ignored for everything but a dir,
@@ -5574,21 +5577,24 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	btrfs_set_key_type(&key[0], BTRFS_INODE_ITEM_KEY);
 	key[0].offset = 0;
 
-	/*
-	 * Start new inodes with an inode_ref. This is slightly more
-	 * efficient for small numbers of hard links since they will
-	 * be packed into one item. Extended refs will kick in if we
-	 * add more hard links than can fit in the ref item.
-	 */
-	key[1].objectid = objectid;
-	btrfs_set_key_type(&key[1], BTRFS_INODE_REF_KEY);
-	key[1].offset = ref_objectid;
-
 	sizes[0] = sizeof(struct btrfs_inode_item);
-	sizes[1] = name_len + sizeof(*ref);
+
+	if (name) {
+		/*
+		 * Start new inodes with an inode_ref. This is slightly more
+		 * efficient for small numbers of hard links since they will
+		 * be packed into one item. Extended refs will kick in if we
+		 * add more hard links than can fit in the ref item.
+		 */
+		key[1].objectid = objectid;
+		btrfs_set_key_type(&key[1], BTRFS_INODE_REF_KEY);
+		key[1].offset = ref_objectid;
+
+		sizes[1] = name_len + sizeof(*ref);
+	}
 
 	path->leave_spinning = 1;
-	ret = btrfs_insert_empty_items(trans, root, path, key, sizes, 2);
+	ret = btrfs_insert_empty_items(trans, root, path, key, sizes, nitems);
 	if (ret != 0)
 		goto fail;
 
@@ -5601,12 +5607,14 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 				  sizeof(*inode_item));
 	fill_inode_item(trans, path->nodes[0], inode_item, inode);
 
-	ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
-			     struct btrfs_inode_ref);
-	btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
-	btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
-	ptr = (unsigned long)(ref + 1);
-	write_extent_buffer(path->nodes[0], name, ptr, name_len);
+	if (name) {
+		ref = btrfs_item_ptr(path->nodes[0], path->slots[0] + 1,
+				     struct btrfs_inode_ref);
+		btrfs_set_inode_ref_name_len(path->nodes[0], ref, name_len);
+		btrfs_set_inode_ref_index(path->nodes[0], ref, *index);
+		ptr = (unsigned long)(ref + 1);
+		write_extent_buffer(path->nodes[0], name, ptr, name_len);
+	}
 
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
@@ -5642,7 +5650,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 
 	return inode;
 fail:
-	if (dir)
+	if (dir && name)
 		BTRFS_I(dir)->index_cnt--;
 	btrfs_free_path(path);
 	iput(inode);
@@ -5927,6 +5935,15 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 		err = btrfs_update_inode(trans, root, inode);
 		if (err)
 			goto fail;
+		if (inode->i_nlink == 1) {
+			/*
+			 * If new hard link count is 1, it's a file created
+			 * with open(2) O_TMPFILE flag.
+			 */
+			err = btrfs_orphan_del(trans, inode);
+			if (err)
+				goto fail;
+		}
 		d_instantiate(dentry, inode);
 		btrfs_log_new_name(trans, inode, NULL, parent);
 	}
@@ -8858,6 +8875,68 @@ static int btrfs_permission(struct 

Btrfs lockdep with 3.15 merge window v3.14-751-g683b6c6f82a6

2014-04-01 Thread Josh Boyer
Hi Chris and Josef,

Below is a lockdep spew I have on a local VM running Linus' tree as of
this afternoon.  The specific git commit is v3.14-751-g683b6c6f82a6.

I don't believe any btrfs commits have been merged for 3.15 in this
kernel.  The only related commits seem to be the merge of Ingo's
core/locking branch, which might have found a new lockdep issue/false
positive.  I might have reported a similar lockdep spew in 3.14, but I
can't recall at the moment.  The VM is very much hung though at this
point.  Hopefully this is helpful to you.

josh

[  295.348848] 
[  295.349016] [ INFO: possible irq lock inversion dependency detected ]
[  295.349016] 3.15.0-.rc0.git2.1.fc21.x86_64 #1 Not tainted
[  295.349016] -
[  295.349016] kswapd0/47 just changed the state of lock:
[  295.349016]  (&delayed_node->mutex){+.+.-.}, at: [a019a90d] __btrf]
[  295.349016] but this lock took another, RECLAIM_FS-unsafe lock in the past:
 [ }

and interrupts could create inverse lock ordering between them.

[  295.349016]
[  295.349016] other info that might help us debug this:
[  295.349016]  Possible interrupt unsafe locking scenario:
[  295.349016]
[  295.349016]        CPU0                    CPU1
[  295.349016]        ----                    ----
[  295.349016]   lock(&found->groups_sem);
[  295.349016]                                local_irq_disable();
[  295.349016]                                lock(&delayed_node->mutex);
[  295.349016]                                lock(&found->groups_sem);
[  295.349016]   <Interrupt>
[  295.349016]     lock(&delayed_node->mutex);
[  295.349016]
[  295.349016]  *** DEADLOCK ***
[  295.349016]
[  295.349016] 2 locks held by kswapd0/47:
[  295.349016]  #0:  (shrinker_rwsem){..}, at: [811c62ef] shrink_0
[  295.349016]  #1:  (type-s_umount_key#25){..}, at: [81238c14]0
[  295.349016]
[  295.349016] the shortest dependencies between 2nd lock and 1st lock:
[  29.349016]  - (ound-groups_sem){+.} ops: 1498 {
[  295.49016] HARDIRQ-ON-W at:
  295.349016]  [fff810fabb0] __loc_acquire+0x670/0x1c0
[  25.349016]   [fff810fca12] lock_acquire+0xa2/0x1d0
[  295349016]  [817dbd3e] down_write+0x4e/0xc0
[  29.349016]   [fffa013187b] _link_block_group+0x3b/]
[  25.349016]   [a013375b] btrfs_read_block_group]
[  25.349016]  [a0142a31] open_ctree+0x17f1/0x21d]
[  295.349016]   [a0115c8e] btrfs_mount+0x66e/0x8]
[  295.349016]   [812395e9] mount_fs+0x39/0x1b0
[  295.349016]   [81259b6b] vfs_kern_mount+0x6b/00
[  25.349016]   [a01157a9] btrfs_mount+0x189/0x85]
[  295.349016]   [fff812395e9] mont_fs+0x39/0x1b0
[  295.349016]   [81259bb] vfs_kern_mount+0x6b/0x0
[  295.34901]   [fff8125ca7e] do_mount+0x24e/0xbb0
[  295349016]  [fff8125d726] SyS_mount0x96/0xf0
[  25.349016]  [817e7be9] system_call_fastpath+0xb
[  29.349016] HARDIRQ-ON-R at:
[  295.349016]   [810faa9a] __lock_acquire+0x55a/0
[  295.34901]  [fff810fca12] lock_acquire+0xa2/0x1d0
[  295.349016]   [817bca1] down_read+0x51/0xa0
[  29.349016]   [a013e00d] btrfs_calc_num_tolerat]
[  295.349016] [a142a4f] open_ctree+0x180f/0x21d0]
[  295.349016]   [a0115c8e] btrfs_mount+0x66e/0x8]
[  295.349016  [12395e9] mount_fs+0x39/0x1b0
[  295.349016]  [81259b6b] vfs_kern_mount+0x6b/0x0
[  295.349016]   [812395e9] mount_fs+0x39/0x1b050]
[  295349016]   [fff81259b6b] vfs_ken_mount+0x6b/0x150
[  295.34901]   [8125ca7e] do_mount+0x24e/0xbb0
[  295.349016]  [8125d726] SyS_mout+0x96/0xf0
[  295.349016]   [817e7be9] syste_call_fastpath+0b
[  295.349016] SOFTIRQON-W at:
[  25.349016]  [810facb] __lock_acquire+0x58b/0x10
[  295349016]   [fff810fca12] lock_acquire+0xa2/0x1d0
[  29.349016]   [fff817dbd3e] downwrite+0x4e/0xc0
[  295.39016]   [fffa013187b] __link_block_group+0x3b]
[  295.349016]  [a013375b] btrfs_read_block_groups
[  295.349016   a0142a31] open_ctree+0x17f1/0x21d]
[  295.39016]   [a0115c8e] btrfs_mount+0x66e/0x85]
[  295.349016]  

Re: [GIT PULL] Btrfs pull -- part 1

2014-04-01 Thread Chris Mason



On 04/01/2014 10:25 AM, Chris Mason wrote:

> Hi Linus,
>
> We still have some corruption fixes and other patches coming in for the
> merge window, but this batch is tested and ready to go.
>
> Please pull from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git master


Tsutomu noticed I put the wrong branch name in.  I've pushed my 
for-linus over to my master branch so they all match the shortlog and 
diffstat below.


Thanks,
Chris



Re: [Help] Errors found in extent allocation tree or chunk allocation

2014-04-01 Thread Wang Shilong

Hello Michael,

On 04/02/2014 03:26 AM, Michael Witten wrote:

> On Tue, 1 Apr 2014 07:15:47 +0800, Shilong Wang wrote:
>
>> hello, recently, i was working on btrfsck repair. i post some
>> patches to make btrfsck to rebuild extent tree with snapshots.
>> you can pull code from david latest branch and have a try with
>> btrfsck --init-extent-tree.
>
> Fortunately, the `btrfs balance' command solved the issue for me:
>
>    http://mid.gmane.org/8e4932bb29f743419660da1846c2363b-mfwit...@gmail.com
>
> However, I did notice `--init-extent-tree' when trying to figure
> out what I might do to fix my troubles; however, I was uncertain
> about exactly what it would do.
>
> According to `man btrfsck':
>
>    --init-csum-tree
>        create a new CRC tree.
>
>    --init-extent-tree
>        create a new extent tree.
>
> These sound potentially destructive, so I was afraid to use them;
> what exactly does it mean to create a new tree? After your email,
> I assume each action creates a new tree that is populated from the
> data in the corresponding old tree.


The extent tree may be totally broken, in which case you could not even
mount your filesystem anymore.

'--init-extent-tree' creates a new tree and then re-inserts
everything according to the other fs/file trees.




> Also, you wrote:
>
>> rebuild extent tree with snapshots
>
> Do you mean to say that your patches handle the previously unhandled
> case in which both of the following are true?


Previously, '--init-extent-tree' did not handle it properly if there were
snapshots in the filesystem.


Thanks,
Wang


>    * There are snapshots in the file system.
>    * The extent tree needs to be rebuilt.
>
> Thanks!