[PATCH] Btrfs-progs: fix compile failure

2015-02-17 Thread Anand Jain
make
::
[CC] btrfs-search-metadata.o
btrfs-search-metadata.c: In function ‘print_usage’:
btrfs-search-metadata.c:40: error: ‘BTRFS_BUILD_VERSION’ undeclared (first use in this function)
btrfs-search-metadata.c:40: error: (Each undeclared identifier is reported only once
btrfs-search-metadata.c:40: error: for each function it appears in.)
make: *** [btrfs-search-metadata.o] Error 1

btrfs-search-metadata.c:fprintf(stderr, "%s\n", BTRFS_BUILD_VERSION);

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-search-metadata.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/btrfs-search-metadata.c b/btrfs-search-metadata.c
index 80dc326..34c6f39 100644
--- a/btrfs-search-metadata.c
+++ b/btrfs-search-metadata.c
@@ -37,7 +37,6 @@ static int print_usage(void)
 	fprintf(stderr, "\t-t tree-id: search for given tree\n");
 	fprintf(stderr, "\t-l level: search for node level (0=leaf)\n");
 	fprintf(stderr, "\t-L: print full listing of matching leaf/node contents\n");
-	fprintf(stderr, "%s\n", BTRFS_BUILD_VERSION);
 	exit(1);
 }
 
-- 
2.0.0.153.g79d

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs raid-1 uuid-fstab

2015-02-17 Thread Duncan
Kai Krakow posted on Tue, 17 Feb 2015 00:15:50 +0100 as excerpted:

> Long story short: I managed to strip dracut down to
> too few modules and it lost its ability to mount anything and even could
> not spawn a shell. *gnarf

Ouch!

FWIW, that's why I use a kernel built-in initramfs.  If I upgrade dracut 
or change its config and it fails to work, just as if the new kernel the 
initramfs is appended to fails to work, I simply boot an older kernel... 
with a known-working dracut-created initramfs.

Tho I /did/ have trouble with an older dracut locking to a particular 
default-root UUID at one point, so it would boot any root= I pointed it 
at, but *ONLY* as long as that particular UUID continued to exist!

Which is pretty hard to test for, since until you actually mkfs the 
existing default-root, its UUID will continue to exist, and you'll never 
know that your boot to the backup root using root= is working now, but 
will fail as soon as the default-root ceases to exist, until you're 
actually in the situation and can't boot, using any kernel/dracut 
combination!

That did drop me to the dracut/initramfs shell, but I was new enough with 
dracut at the time that I didn't really know how to fix it from there, 
nor could I properly edit a file or even view an entire file (cat worked, 
but that only let me see the last N lines and I didn't have a pager in 
the initramfs), to try to read documentation and fix the issue.

What I finally did to get out of that hole was manually ln -s the
/dev/disk/by-uuid/* symlink that the dracut/initramfs scripts were looking
for based on the error, pointing it at an existing /dev/sdXN.  It didn't have 
to point at the root device, it could point at any device-block file, as 
long as that device-block file actually existed.

I didn't originally file a bug on that as the host-only option 
documentation warned about it being host-specific, so I figured it was 
/designed/ to do that.  Only later, when host-only was being discussed as 
the gentoo-recommended default on gentoo-dev and I explained that it 
wasn't always suitable as it broke if/when you blew away your default-
root and recreated it with a new UUID, and the gentoo dracut maintainer 
asked why I hadn't filed a bug, did I figure out it /was/ a bug, not a 
confusingly documented feature.  So I filed a bug and the gentoo 
maintainer filed one upstream as well, and it was apparently fixed.  But 
of course by then I had long since worked around the problem with more 
specific dracut-module include and exclude statements in the config, 
instead of using host-only, and that was working and continues to work, 
so I've never had reason to go back and test the more loosely specified 
host-only mode, and thus have never confirmed whether the bug was 
actually fixed or not, since I don't use that mode any more.

> And when that wasn't fun enough, my BIOS decided to no longer initialize
> USB so I could neither get into BIOS nor into Grub shell. I don't know
> when that problem happened. Probably been that for a while and I never
> noticed. Just that it went a lot slower through BIOS after I managed to
> convince it to initialize USB again (by opening the case and shorting
> the reset jumper).

Ouch.  FWIW my mobo has dual-bios, which is nice, but I've been down the 
bios-reset road before, several times.

I even had a BIOS update go bad once (due to bad RAM), screwed up the 
last-ditch bios-rescue it offered as I didn't know what I was doing, and 
had to use my netbook to setup a webmail account (didn't have the 
passwords to my normal email as I don't normally keep anything private on 
the netbook at all, in case I lose it, and couldn't access my other disks 
without a device to convert them to external/USB) and order a new BIOS 
shipped to me.

That is of course the big reason my new machine is dual-bios! =:^)  Tho 
it's not an absolute cure-all, as once it successfully boots from the 
main BIOS it auto-overwrites the second one, if different.  I'd actually 
rather make the auto-overwrite bit manual, so I could update it only when 
I was sufficiently sure it worked _reliably_, but oh, well, better than 
not having a backup BIOS at all, as I learned from experience!

> The next fun part was: My backup was incomplete in a special way: It had
> no directories dev, proc, run, sys and friends... Don't ask me how I
> solved that, probably by init=/bin/bash.

init=/bin/bash is indeed a very handy tool to have as a sysadmin. =:^)

I think I mentioned that setting that (via grub var) is actually one of 
my grub2 menu options, in the backup menu, FWIW.

> It happens, because I used
> rsync with the option to exclude those dirs. But well: In the end my
> backup was tested bootable. :-)
>
> I fixed my dracut setup and in the same procedure also fixed a
> long-standing issue with btrfs check telling me nlink errors. Luckily,
> this newer version could tell me the paths and I just delete those files
> in the chrome profile and var/lib/bluetooth directory. I 

Re: [PATCH 23/24] Btrfs: sysfs: support seed devices in the sysfs layout

2015-02-17 Thread Anand Jain


[guihc.f...@cn.fujitsu.com bounced; removing the email id from the cc-list].

Dave,

here is the patch list for Oct.
  https://patchwork.kernel.org/project/linux-btrfs/list/?page=7
v2 isn't there, so I am confused.

Anyway, if btrfs-progs integration-20150213 contains v2, that is what I am
running now, and I still see the problem.

--
btrfs fi show -- on a nested seed fs
warning devid 1 not found already  ---
warning devid 2 not found already  ---
bytenr mismatch, want=4194304, have=0 
Label: none  uuid: fce49239-b392-4e4d-b775-57dca7f2426b
	Total devices 1 FS bytes used 28.00KiB
	devid    1 size 967.86MiB used 12.00MiB path /dev/sdb

Label: none  uuid: bccc7c86-82ff-4f2c-805a-4d384642f5e6
	Total devices 2 FS bytes used 92.00KiB
	devid    1 size 967.86MiB used 8.00MiB path /dev/sdb
	devid    2 size 967.87MiB used 144.00MiB path /dev/sdc

Label: none  uuid: 6ea26ac0-b4e9-4a56-9079-67d25a57ac27
	Total devices 3 FS bytes used 156.00KiB
	devid    3 size 1.52GiB used 224.00MiB path /dev/sdd
	*** Some devices missing

btrfs-progs v3.19-rc2-68-gd4bf1cc
--

Can you please revert this patch for now?


Thanks, Anand



On 02/14/2015 01:51 AM, David Sterba wrote:
> On Thu, Feb 12, 2015 at 02:25:32PM +0800, Anand Jain wrote:
>> Since we are on this topic: btrfs-progs shouldn't have had this patch:
>>  git log -p 2513077
>> -
>> commit 2513077f2f830b4bc83d528bfb6979eb461918bd
>> Author: Gui Hecheng guihc.f...@cn.fujitsu.com
>> Date:   Mon Oct 6 18:16:46 2014 +0800
>>
>>   btrfs-progs: fix device missing of btrfs fi show with seed devices
>> -
>>
>> it doesn't work with nested seed as I commented
>> http://marc.info/?l=linux-btrfs&m=141102300324251&w=2
>> -
>> btrfs fi show -d
>> warning devid 1 not found already
>> warning devid 2 not found already
>> Check tree block failed, want=29425664, have=0
>> read block failed check_tree_block
>> Couldn't setup csum tree
>> Check tree block failed, want=29360128, have=0
>> read block failed check_tree_block
>> -
>>
>> I haven't seen the next version of this patch from Gui. (Gui ?, copied)
>
> The fixed version was [PATCH v2 3/3] btrfs-progs: fix device missing of
> btrfs fi show with seed devices
>
> http://article.gmane.org/gmane.comp.file-systems.btrfs/39186/
>
> and that's what I have merged.


Re: Forever blocked in bit_wait with kernel 3.19

2015-02-17 Thread Omar Sandoval
On Thu, Feb 12, 2015 at 11:12:25AM +0000, Steven Schlansker wrote:
 [ Please CC me on replies, I'm not on the list ]
 [ This is a followup to 
 http://www.spinics.net/lists/linux-btrfs/msg41496.html ]
 
 Hello linux-btrfs,
 I've been having troubles keeping my Apache Mesos / Docker slave nodes 
 stable.  After some period of load, tasks begin to hang.  Once this happens 
 task after task ends up waiting at the same point, never to return.  The 
 system quickly becomes unusable and must be terminated.
 
 After the previous issues, I was encouraged to upgrade and retry.  I am now 
 running
 
 Linux 3.19.0 #1 SMP Mon Feb 9 09:43:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
 Btrfs v3.18.2 (and this version was also used to mkfs)
 
 root@ip-10-30-38-86:~# btrfs fi show
 Label: none  uuid: 0e8c3f1d-b07b-4643-9834-a41dafb80257
   Total devices 2 FS bytes used 3.92GiB
   devid1 size 74.99GiB used 4.01GiB path /dev/xvdc
   devid2 size 74.99GiB used 4.01GiB path /dev/xvdd
 
 Btrfs v3.18.2
 
 Data, RAID0: total=6.00GiB, used=3.69GiB
 System, RAID0: total=16.00MiB, used=16.00KiB
 Metadata, RAID0: total=2.00GiB, used=229.30MiB
 GlobalReserve, single: total=80.00MiB, used=0.00B
 
 This is the first hung task:
 
 [146280.252086] INFO: task java:28252 blocked for more than 120 seconds.
 [146280.252096]   Tainted: GE  3.19.0 #1
 [146280.252098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [146280.252102] javaD 8805584df528 0 28252   1400 
 0x
 [146280.252106]  8805584df528 880756a24aa0 00014100 
 8805584dffd8
 [146280.252108]  00014100 8807567c31c0 880756a24aa0 
 8805584df5d0
 [146280.252109]  88075a314a00 8805584df5d0 88077c3f8ce8 
 0002
 [146280.252111] Call Trace:
 [146280.252120]  [8194efa0] ? bit_wait+0x50/0x50
 [146280.252122]  [8194e770] io_schedule+0xa0/0x130
 [146280.252125]  [8194efcc] bit_wait_io+0x2c/0x50
 [146280.252127]  [8194ec05] __wait_on_bit+0x65/0x90
 [146280.252131]  [81169ad7] wait_on_page_bit+0xc7/0xd0
 [146280.252134]  [810b0840] ? autoremove_wake_function+0x40/0x40
 [146280.252137]  [8117d9ed] shrink_page_list+0x2fd/0xa90
 [146280.252139]  [8117e7ad] shrink_inactive_list+0x1cd/0x590
 [146280.252141]  [8117f5b5] shrink_lruvec+0x5f5/0x810
 [146280.252144]  [81086d01] ? pwq_activate_delayed_work+0x31/0x90
 [146280.252146]  [8117f867] shrink_zone+0x97/0x240
 [146280.252148]  [8117fd75] do_try_to_free_pages+0x155/0x440
 [146280.252150]  [81180257] try_to_free_mem_cgroup_pages+0xa7/0x130
 [146280.252154]  [811d2931] try_charge+0x151/0x620
 [146280.252158]  [81815a05] ? tcp_schedule_loss_probe+0x145/0x1e0
 [146280.252160]  [811d6f48] mem_cgroup_try_charge+0x98/0x110
 [146280.252164]  [8170957e] ? __alloc_skb+0x7e/0x2b0
 [146280.252166]  [8116accf] __add_to_page_cache_locked+0x7f/0x290
 [146280.252169]  [8116af28] add_to_page_cache_lru+0x28/0x80
 [146280.252171]  [8116b00f] pagecache_get_page+0x8f/0x1c0
 [146280.252173]  [81952570] ? _raw_spin_unlock_bh+0x20/0x40
 [146280.252189]  [a0045935] prepare_pages.isra.19+0xc5/0x180 [btrfs]
 [146280.252199]  [a00464ec] __btrfs_buffered_write+0x1cc/0x590 
 [btrfs]
 [146280.252208]  [a0049c07] btrfs_file_write_iter+0x287/0x510 
 [btrfs]
 [146280.252211]  [813f7076] ? aa_path_perm+0xd6/0x170
 [146280.252214]  [811dfd91] new_sync_write+0x81/0xb0
 [146280.252216]  [811e0537] vfs_write+0xb7/0x1f0
 [146280.252217]  [81950636] ? mutex_lock+0x16/0x37
 [146280.252219]  [811e1146] SyS_write+0x46/0xb0
 [146280.252221]  [819529ed] system_call_fastpath+0x16/0x1b
 
 Here is a slightly different stacktrace:
 
 [158880.240245] INFO: task kworker/u16:6:13974 blocked for more than 120 
 seconds.
 [158880.240249]   Tainted: GE  3.19.0 #1
 [158880.240252] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 [158880.240254] kworker/u16:6   D 88064e7b76c8 0 13974  2 
 0x
 [158880.240259] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
 [158880.240260]  88064e7b76c8 88066f0c18e0 00014100 
 88064e7b7fd8
 [158880.240262]  00014100 8201e4a0 88066f0c18e0 
 88077c3e06e8
 [158880.240264]  88075a214a00 88077c3e06e8 88064e7b7770 
 0002
 [158880.240266] Call Trace:
 [158880.240268]  [8194efa0] ? bit_wait+0x50/0x50
 [158880.240270]  [8194e770] io_schedule+0xa0/0x130
 [158880.240273]  [8194efcc] bit_wait_io+0x2c/0x50
 [158880.240275]  [8194ed9b] __wait_on_bit_lock+0x4b/0xb0
 [158880.240277]  [81169f2e] __lock_page+0xae/0xb0
 [158880.240279]  [810b0840] ? autoremove_wake_function+0x40/0x40
 [158880.240289]  [a00501bd] 

[PATCH] generic/325: Fix test case to work on 64K page size.

2015-02-17 Thread Chandan Rajendra
The test case passes 32K as the offset value to msync. This fails on machines
with 64K page size. Fix this by creating a larger file and passing offset
values which are multiples of 64K.

Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com
---
 tests/generic/325     | 10 +++++-----
 tests/generic/325.out | 10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/generic/325 b/tests/generic/325
index c47e372..e62ac95 100755
--- a/tests/generic/325
+++ b/tests/generic/325
@@ -64,7 +64,7 @@ _init_flakey
 _mount_flakey
 
 # Create the file first.
-$XFS_IO_PROG -f -c "pwrite -S 0xff 0 64K" $SCRATCH_MNT/foo | _filter_xfs_io
+$XFS_IO_PROG -f -c "pwrite -S 0xff 0 256K" $SCRATCH_MNT/foo | _filter_xfs_io
 
 # Now sync the file data to disk using 'sync' and not an fsync. This is because
 # in btrfs the first fsync clears the btrfs inode full fsync flag, which must
@@ -80,11 +80,11 @@ sync
 # This second msync() used to be a no-op for that btrfs bug (and the first fsync
 # didn't log the last 4Kb extent as expected too).
 $XFS_IO_PROG \
-	-c "mmap -w 0 64K"       \
+	-c "mmap -w 0 256K"       \
 	-c "mwrite -S 0xaa 0 4K"   \
-	-c "mwrite -S 0xbb 60K 4K" \
-	-c "msync -s 0K 16K"   \
-	-c "msync -s 32K 32K"  \
+	-c "mwrite -S 0xbb 252K 4K" \
+	-c "msync -s 0K 64K"   \
+	-c "msync -s 192K 64K"  \
 	-c "munmap"            \
 	$SCRATCH_MNT/foo | _filter_xfs_io
 
diff --git a/tests/generic/325.out b/tests/generic/325.out
index 9a78c3e..9373e01 100644
--- a/tests/generic/325.out
+++ b/tests/generic/325.out
@@ -1,19 +1,19 @@
 QA output created by 325
-wrote 65536/65536 bytes at offset 0
+wrote 262144/262144 bytes at offset 0
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 File content before crash/reboot:
 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
 *
 0010000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 *
-0170000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
+0770000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
 *
-0200000
+1000000
 File content after crash/reboot and fs mount:
 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
 *
 0010000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 *
-0170000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
+0770000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
 *
-0200000
+1000000
-- 
2.1.0
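
As an illustration of why the old offsets in the generic/325 patch above fail on a 64K-page machine: on Linux, msync() returns EINVAL when the start address is not a multiple of the page size, so a 32K offset into the mapping only works while pages are no larger than 32K. A minimal userspace sketch of that check follows. The helper names are hypothetical and are not part of xfstests or xfs_io.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if off is a multiple of page_size. */
bool offset_is_page_aligned(uint64_t off, uint64_t page_size)
{
	assert(page_size != 0);
	return off % page_size == 0;
}

/* Hypothetical helper: round off down to the start of its page. */
uint64_t page_align_down(uint64_t off, uint64_t page_size)
{
	assert(page_size != 0);
	return off - (off % page_size);
}
```

With 4K pages both the old (32K) and new (192K) msync offsets are aligned; with 64K pages only the new one is. The last 4K block written at 252K sits in the 64K page that starts at 192K, which matches the "msync -s 192K 64K" range chosen in the patch.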



Re: [PATCH v3] Btrfs: fix BUG_ON in btrfs_orphan_add() when delete unused block group

2015-02-17 Thread Chris Mason
On Wed, Feb 11, 2015 at 1:24 AM, Forrest Liu <forre...@synology.com> wrote:
> Removing a large number of block groups in a transaction may hit the
> BUG_ON() in btrfs_orphan_add(). That is because
> btrfs_orphan_reserve_metadata() will grab the metadata reservation from
> the transaction handle, and btrfs_delete_unused_bgs() didn't reserve
> metadata for the transaction handle when deleting unused block groups.
>
> The problem can be reproduced by the following script:
>
> mntpath=/btrfs
> loopdev=/dev/loop0
> filepath=/home/forrest/image
>
> umount $mntpath
> losetup -d $loopdev
> truncate --size 1000g $filepath
> losetup $loopdev $filepath
> mkfs.btrfs -f $loopdev
> mount $loopdev $mntpath
>
> for j in `seq 1 1 1000`; do
> 	fallocate -l 1g $mntpath/$j
> done
> # wait for the cleaner thread to remove the unused block groups
> sleep 300
>
> The call trace that results from the BUG_ON() is:
>
> [  613.093084] ------------[ cut here ]------------
> [  613.097928] kernel BUG at fs/btrfs/inode.c:3142!


This should fix a hard to trigger bug we've seen in production here.  I 
was on a different (wrong) track for explaining it, so your timing is 
excellent.


Thanks.

-chris





Re: [PATCH 2/3] btrfs: handle race on ENOMEM in alloc_extent_buffer

2015-02-17 Thread Omar Sandoval
On Tue, Feb 17, 2015 at 02:51:08AM -0800, Omar Sandoval wrote:
> Consider the following interleaving of overlapping calls to
> alloc_extent_buffer:
>
> Call 1:
>
> - Successfully allocates a few pages with find_or_create_page
> - find_or_create_page fails, goto free_eb
> - Unlocks the allocated pages
>
> Call 2:
> - Calls find_or_create_page and gets a page in call 1's extent_buffer
> - Finds that the page is already associated with an extent_buffer
> - Grabs a reference to the half-written extent_buffer and calls
>   mark_extent_buffer_accessed on it
>
> mark_extent_buffer_accessed will then try to call mark_page_accessed on
> a null page and panic.
>
> The fix is to clear page->private of the half-written extent_buffer's
> pages all at once while holding mapping->private_lock.
>
> Signed-off-by: Omar Sandoval osan...@osandov.com
> ---
>  fs/btrfs/extent_io.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
 
[snip]

Actually, I just realized that there's a simpler fix. I can resend the
whole series for easier merging once I get some review, but for now,
here's what I'm talking about:


btrfs: handle race on ENOMEM in alloc_extent_buffer

Consider the following interleaving of overlapping calls to
alloc_extent_buffer:

Call 1:

- Successfully allocates a few pages with find_or_create_page
- find_or_create_page fails, goto free_eb
- Unlocks the allocated pages

Call 2:
- Calls find_or_create_page and gets a page in call 1's extent_buffer
- Finds that the page is already associated with an extent_buffer
- Grabs a reference to the half-written extent_buffer and calls
  mark_extent_buffer_accessed on it

mark_extent_buffer_accessed will then try to call mark_page_accessed on
a null page and panic.

The fix is to decrement the reference count on the half-written
extent_buffer before unlocking the pages so call 2 won't use it. We also
set exists = NULL in the case that we don't use exists to avoid
accidentally returning a freed extent_buffer in an error case.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent_io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 790dbae..6b3cd72 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4850,6 +4850,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 				mark_extent_buffer_accessed(exists, p);
 				goto free_eb;
 			}
+			exists = NULL;
 
 			/*
 			 * Do this so attach doesn't complain and we need to
@@ -4913,12 +4914,12 @@ again:
 	return eb;
 
 free_eb:
+	WARN_ON(!atomic_dec_and_test(&eb->refs));
 	for (i = 0; i < num_pages; i++) {
 		if (eb->pages[i])
 			unlock_page(eb->pages[i]);
 	}
 
-	WARN_ON(!atomic_dec_and_test(&eb->refs));
 	btrfs_release_extent_buffer(eb);
 	return exists;
 }

-- 
Omar
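
To illustrate why moving the decrement ahead of the unlock closes the race, here is a minimal userspace model of the refcount behaviour. It assumes the lookup in call 2 only takes a reference when the count is still nonzero (an atomic_inc_not_zero-style check; the hunk context suggests this, but it is not shown in the diff). The toy_* names are hypothetical and only the refcount is modeled, not pages or locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct extent_buffer; only the refcount is modeled. */
struct toy_eb {
	atomic_int refs;
};

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the object is still live (refs > 0). Returns true on success.
 */
bool toy_get_unless_zero(struct toy_eb *eb)
{
	int old = atomic_load(&eb->refs);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&eb->refs, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Userspace analogue of atomic_dec_and_test(): drop one reference and
 * report whether it was the last one.
 */
bool toy_put_and_test(struct toy_eb *eb)
{
	assert(atomic_load(&eb->refs) > 0);
	return atomic_fetch_sub(&eb->refs, 1) == 1;
}
```

In this model, once call 1 has run toy_put_and_test() on its half-built buffer, a later toy_get_unless_zero() from call 2 fails, so call 2 never touches the buffer even though its pages have been unlocked.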


[PULL] Btrfs, recent cleanup patches

2015-02-17 Thread David Sterba
Hi,

I've collected the cleanup patches from the last few months that are not in
the integration branch. I've reviewed them but tested them only lightly, due
to as-yet-unidentified problems in the current integration branch.

The target release is probably 3.21. I'm going to send more cleanup
series, so this can serve as a base for the next development cycle and
minimize conflicts with other changes.

Please pull, thanks.

The following changes since commit a742994aa2e271eb8cd8e043d276515ec858ed73:

  Btrfs: don't remove extents and xattrs when logging new names (2015-02-14 08:22:49 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git cleanups-post-3.19

for you to fetch changes up to a4f3d2c4efe2628329249b64fd5799468e025b9d:

  btrfs: cleanup, reduce temporary variables in btrfs_read_roots (2015-02-16 18:48:47 +0100)


Daniel Dressler (3):
  Btrfs: ctree: reduce args where only fs_info used
  Btrfs: delayed-inode: replace root args iff only fs_info used
  Btrfs: disk-io: replace root args iff only fs_info used

David Sterba (3):
  btrfs: constify structs with op functions or static definitions
  btrfs: use correct type for workqueue flags
  btrfs: cleanup, reduce temporary variables in btrfs_read_roots

Eric Sandeen (10):
  btrfs: remove unused fs_info arg from btrfs_close_extra_devices()
  btrfs: consistently use fs_info in close_ctree()
  btrfs: factor btrfs_init_scrub() out of open_ctree()
  btrfs: factor btrfs_init_balance() out of open_ctree()
  btrfs: factor btrfs_init_btree_inode() out of open_ctree()
  btrfs: factor btrfs_init_dev_replace_locks() out of open_ctree()
  btrfs: factor btrfs_init_qgroup() out of open_ctree()
  btrfs: factor btrfs_init_workqueues() out of open_ctree()
  btrfs: factor btrfs_replay_log() out of open_ctree()
  btrfs: factor btrfs_read_roots() out of open_ctree()

Fabian Frederick (1):
  btrfs: fix sizeof format specifier in btrfs_check_super_valid()

Wang Shilong (1):
  Btrfs: switch to kvfree() helper

Zhao Lei (3):
  btrfs: cleanup: remove no-used alloc_chunk in btrfs_check_data_free_space()
  btrfs: remove unused chunk_tree argument in several functions
  btrfs: cleanup: use for() loop in btrfs_map_bio()

 fs/btrfs/async-thread.c|   4 +-
 fs/btrfs/async-thread.h|   2 +-
 fs/btrfs/check-integrity.c |   5 +-
 fs/btrfs/compression.c |   2 +-
 fs/btrfs/compression.h |   4 +-
 fs/btrfs/ctree.c   |  53 +++--
 fs/btrfs/ctree.h   |   3 +-
 fs/btrfs/delayed-inode.c   |   9 +-
 fs/btrfs/disk-io.c | 558 -
 fs/btrfs/disk-io.h |   4 +-
 fs/btrfs/extent-tree.c |  10 +-
 fs/btrfs/extent_io.h   |   2 +-
 fs/btrfs/file-item.c   |   2 +-
 fs/btrfs/file.c|   8 +-
 fs/btrfs/lzo.c |   2 +-
 fs/btrfs/props.c   |   2 +
 fs/btrfs/qgroup.c  |   2 +-
 fs/btrfs/raid56.c  |  13 +-
 fs/btrfs/scrub.c   |   2 +-
 fs/btrfs/sysfs.c   |   2 +-
 fs/btrfs/transaction.c |   2 +-
 fs/btrfs/tree-log.c|   8 +-
 fs/btrfs/volumes.c |  34 +--
 fs/btrfs/volumes.h |   3 +-
 fs/btrfs/zlib.c|   2 +-
 25 files changed, 386 insertions(+), 352 deletions(-)



[PATCH 3/3] btrfs: check io_ctl_prepare_pages return in __btrfs_write_out_cache

2015-02-17 Thread Omar Sandoval
If io_ctl_prepare_pages fails, the pages in io_ctl.pages are not valid.
When we try to access them later, things will blow up in various ways.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/free-space-cache.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index d6c03f7..0460632 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1114,7 +1114,7 @@ cleanup_write_cache_enospc(struct inode *inode,
  *
  * This function writes out a free space cache struct to disk for quick recovery
  * on mount.  This will return 0 if it was successfull in writing the cache out,
- * and -1 if it was not.
+ * or an errno if it was not.
  */
 static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 				   struct btrfs_free_space_ctl *ctl,
@@ -1130,11 +1130,11 @@ static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 	int ret;
 
 	if (!i_size_read(inode))
-		return -1;
+		return -EIO;
 
 	ret = io_ctl_init(&io_ctl, inode, root, 1);
 	if (ret)
-		return -1;
+		return ret;
 
 	if (block_group && (block_group->flags & BTRFS_BLOCK_GROUP_DATA)) {
 		down_write(&block_group->data_rwsem);
@@ -1151,7 +1151,9 @@ static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 	}
 
 	/* Lock all pages first so we can lock the extent safely. */
-	io_ctl_prepare_pages(&io_ctl, inode, 0);
+	ret = io_ctl_prepare_pages(&io_ctl, inode, 0);
+	if (ret)
+		goto out;
 
 	lock_extent_bits(&BTRFS_I(inode)->io_tree, 0, i_size_read(inode) - 1,
 			 0, &cached_state);
-- 
2.3.0
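
The convention this patch moves to (return 0 or a negative errno, and check every step that can fail before using its results) can be sketched in plain userspace C. The toy_* names and the fail_with parameter are hypothetical, invented for illustration; this is not the btrfs code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a preparation step that can fail. */
int toy_prepare(int fail_with)
{
	return fail_with ? -fail_with : 0;
}

/* Returns 0 on success or a negative errno, never a bare -1. */
int toy_write_out(long size, int prepare_failure)
{
	int ret;

	assert(size >= 0);
	if (size == 0)
		return -EIO;

	ret = toy_prepare(prepare_failure);
	if (ret)
		return ret;	/* stop before touching invalid state */

	return 0;
}
```

Returning the callee's error code instead of collapsing everything to -1 lets the caller tell an empty cache file (-EIO here) apart from an allocation failure (-ENOMEM).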



[PATCH 2/3] btrfs: handle race on ENOMEM in alloc_extent_buffer

2015-02-17 Thread Omar Sandoval
Consider the following interleaving of overlapping calls to
alloc_extent_buffer:

Call 1:

- Successfully allocates a few pages with find_or_create_page
- find_or_create_page fails, goto free_eb
- Unlocks the allocated pages

Call 2:
- Calls find_or_create_page and gets a page in call 1's extent_buffer
- Finds that the page is already associated with an extent_buffer
- Grabs a reference to the half-written extent_buffer and calls
  mark_extent_buffer_accessed on it

mark_extent_buffer_accessed will then try to call mark_page_accessed on
a null page and panic.

The fix is to clear page->private of the half-written extent_buffer's
pages all at once while holding mapping->private_lock.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent_io.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c73df6a..6024db9 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4850,6 +4850,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 				mark_extent_buffer_accessed(exists, p);
 				goto free_eb;
 			}
+			exists = NULL;
 
 			/*
 			 * Do this so attach doesn't complain and we need to
@@ -4913,13 +4914,24 @@ again:
 	return eb;
 
 free_eb:
+	spin_lock(&mapping->private_lock);
 	for (i = 0; i < num_pages; i++) {
-		if (eb->pages[i])
-			unlock_page(eb->pages[i]);
-	}
+		struct page *page = eb->pages[i];
 
+		if (page) {
+			unlock_page(page);
+			ClearPagePrivate(page);
+			set_page_private(page, 0);
+			/* One for the page private */
+			page_cache_release(page);
+			/* One for when we alloced the page */
+			page_cache_release(page);
+		}
+	}
+	spin_unlock(&mapping->private_lock);
 	WARN_ON(!atomic_dec_and_test(&eb->refs));
-	btrfs_release_extent_buffer(eb);
+	__free_extent_buffer(eb);
+
 	return exists;
 }
 
-- 
2.3.0



[PATCH 1/3] btrfs: handle ENOMEM in btrfs_alloc_tree_block

2015-02-17 Thread Omar Sandoval
This is one of the first places to go when memory is tight. Handle it
properly rather than with a BUG_ON.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent-tree.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a684086..479df76 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7321,7 +7321,7 @@ static void unuse_block_rsv(struct btrfs_fs_info *fs_info,
  * returns the key for the extent through ins, and a tree buffer for
  * the first block of the extent through buf.
  *
- * returns the tree buffer or NULL.
+ * returns the tree buffer or an ERR_PTR on error.
  */
 struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 					struct btrfs_root *root,
@@ -7332,6 +7332,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 	struct btrfs_key ins;
 	struct btrfs_block_rsv *block_rsv;
 	struct extent_buffer *buf;
+	struct btrfs_delayed_extent_op *extent_op;
 	u64 flags = 0;
 	int ret;
 	u32 blocksize = root->nodesize;
@@ -7352,14 +7353,15 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 
 	ret = btrfs_reserve_extent(root, blocksize, blocksize,
 				   empty_size, hint, &ins, 0, 0);
-	if (ret) {
-		unuse_block_rsv(root->fs_info, block_rsv, blocksize);
-		return ERR_PTR(ret);
-	}
+	if (ret)
+		goto out_unuse;
 
 	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
 				    blocksize, level);
-	BUG_ON(IS_ERR(buf)); /* -ENOMEM */
+	if (IS_ERR(buf)) {
+		ret = PTR_ERR(buf);
+		goto out_free_reserved;
+	}
 
 	if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
 		if (parent == 0)
@@ -7369,9 +7371,11 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		BUG_ON(parent > 0);
 
 	if (root_objectid != BTRFS_TREE_LOG_OBJECTID) {
-		struct btrfs_delayed_extent_op *extent_op;
 		extent_op = btrfs_alloc_delayed_extent_op();
-		BUG_ON(!extent_op); /* -ENOMEM */
+		if (!extent_op) {
+			ret = -ENOMEM;
+			goto out_free_buf;
+		}
 		if (key)
 			memcpy(&extent_op->key, key, sizeof(extent_op->key));
 		else
@@ -7386,13 +7390,24 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		extent_op->level = level;
 
 		ret = btrfs_add_delayed_tree_ref(root->fs_info, trans,
-					ins.objectid,
-					ins.offset, parent, root_objectid,
-					level, BTRFS_ADD_DELAYED_EXTENT,
-					extent_op, 0);
-		BUG_ON(ret); /* -ENOMEM */
+						 ins.objectid, ins.offset,
+						 parent, root_objectid, level,
+						 BTRFS_ADD_DELAYED_EXTENT,
+						 extent_op, 0);
+		if (ret)
+			goto out_free_delayed;
 	}
 	return buf;
+
+out_free_delayed:
+	btrfs_free_delayed_extent_op(extent_op);
+out_free_buf:
+	free_extent_buffer(buf);
+out_free_reserved:
+	btrfs_free_reserved_extent(root, ins.objectid, ins.offset, 0);
+out_unuse:
+	unuse_block_rsv(root->fs_info, block_rsv, blocksize);
+	return ERR_PTR(ret);
 }
 
 struct walk_control {
-- 
2.3.0
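
The unwinding pattern used in this patch (one label per acquired resource, released in reverse order of acquisition) can be shown in a small userspace sketch. Everything here is hypothetical: toy_block, toy_alloc_block and the fail_at knob exist only so that each error path can be exercised, and malloc stands in for the kernel allocations.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy resource bundle; the fields are illustrative, not btrfs types. */
struct toy_block {
	char *buf;
	int *op;
};

/*
 * fail_at selects which step fails (0 = none), so every unwind path can
 * be exercised. Returns 0 or a negative errno; *out is valid on success.
 */
int toy_alloc_block(int fail_at, int *reserved, struct toy_block *out)
{
	int ret;

	assert(reserved && out);
	*reserved += 1;		/* step 1: take a reservation */

	/* step 2: allocate the buffer */
	out->buf = fail_at == 2 ? NULL : malloc(64);
	if (!out->buf) {
		ret = -ENOMEM;
		goto out_unreserve;
	}

	/* step 3: allocate the delayed operation */
	out->op = fail_at == 3 ? NULL : malloc(sizeof(*out->op));
	if (!out->op) {
		ret = -ENOMEM;
		goto out_free_buf;
	}
	return 0;

out_free_buf:
	free(out->buf);
	out->buf = NULL;
out_unreserve:
	*reserved -= 1;
	return ret;
}
```

Each label undoes exactly one earlier step and falls through to the labels below it, so a failure at step N releases steps N-1 down to 1 and nothing else. That is the same shape as out_free_delayed, out_free_buf, out_free_reserved and out_unuse in the diff.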



[PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more

2015-02-17 Thread Zhaolei
From: Zhao Lei zhao...@cn.fujitsu.com

If we have any chance to make a successful write, we should not give up.

This patch adjusts the commit-transaction condition from:
  pinned >= wanted
to
  left + pinned >= wanted

Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 414d533..4ffce64 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3751,7 +3751,8 @@ alloc:
 			 * don't bother committing the transaction.
 			 */
 			if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
-						   bytes) < 0)
+						   used + bytes -
+						   data_sinfo->total_bytes) < 0)
 				have_pinned_space = 0;
 			spin_unlock(&data_sinfo->lock);
 
-- 
1.8.5.1
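
The arithmetic behind the changed condition can be checked with a small standalone sketch. The function and parameter names are hypothetical, and the values are plain byte counts with no locking or per-CPU counters; "left" from the commit message is taken to mean total - used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old rule: only commit if pinned space alone covers the request. */
bool commit_helps_old(uint64_t pinned, uint64_t wanted)
{
	return pinned >= wanted;
}

/* Proposed rule: commit if the free space left plus pinned covers it. */
bool commit_helps_new(uint64_t total, uint64_t used,
		      uint64_t pinned, uint64_t wanted)
{
	assert(used <= total);
	return (total - used) + pinned >= wanted;
}
```

For example, with total = 100, used = 90, pinned = 6 and wanted = 12, the old rule gives up (6 < 12) while the new one commits (10 + 6 >= 12). Rearranged, left + pinned >= wanted is the same test as pinned >= used + wanted - total, which is the expression the diff passes to percpu_counter_compare().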



[PATCH] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more

2015-02-17 Thread Zhaolei
From: Zhao Lei <zhao...@cn.fujitsu.com>

If we have any chance to make a successful write, we should not give up.

This patch adjusts the commit-transaction condition from:
  pinned >= wanted
to
  left + pinned >= wanted

Signed-off-by: Zhao Lei <zhao...@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 414d533..4ffce64 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3751,7 +3751,8 @@ alloc:
 		 * don't bother committing the transaction.
 		 */
 		if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
-					   bytes) < 0)
+					   used + bytes -
+					   data_sinfo->total_bytes) < 0)
 			have_pinned_space = 0;
 		spin_unlock(&data_sinfo->lock);
 
-- 
1.8.5.1



RE: [PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more

2015-02-17 Thread Zhao Lei
Hi,

> From: Zhaolei [mailto:zhao...@cn.fujitsu.com]
> Subject: [PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid
> NO_SPACE more

Sorry for the title; this is a single patch, not part of a series.
I will resend it.

Thanks
Zhaolei

 
> From: Zhao Lei <zhao...@cn.fujitsu.com>
>
> If we have any chance to make a successful write, we should not give up.
>
> This patch adjusts the commit-transaction condition from:
>   pinned >= wanted
> to
>   left + pinned >= wanted
>
> Signed-off-by: Zhao Lei <zhao...@cn.fujitsu.com>
> ---
>  fs/btrfs/extent-tree.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index
> 414d533..4ffce64 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3751,7 +3751,8 @@ alloc:
>  		 * don't bother committing the transaction.
>  		 */
>  		if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
> -					   bytes) < 0)
> +					   used + bytes -
> +					   data_sinfo->total_bytes) < 0)
>  			have_pinned_space = 0;
>  		spin_unlock(&data_sinfo->lock);
>
> --
> 1.8.5.1





[PATCH 0/3] btrfs: ENOMEM bugfixes

2015-02-17 Thread Omar Sandoval
Hi,

As it turns out, running with low memory is a really easy way to shake
out undesirable behavior in Btrfs. This can be especially bad when
considering that a memory limit is really easy to hit in a container
(e.g., by using cgroup memory.limit_in_bytes). Here's a simple script
that can hit several problems:


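#!/bin/sh

# Create a memory cgroup and cap it at 64 MiB.
cgcreate -g memory:enomem
MEM=$((64 * 1024 * 1024))
echo $MEM > /sys/fs/cgroup/memory/enomem/memory.limit_in_bytes

# Run fsstress in the background inside the cgroup; stop it on interrupt.
cgexec -g memory:enomem ~/xfstests/ltp/fsstress -p128 -n9 -d /mnt/test &
trap "killall fsstress; exit 0" SIGINT SIGTERM

# Keep exhausting the cgroup's memory with a process that grows without bound.
while true; do
	cgexec -g memory:enomem python -c '
l = []
while True:
    l.append(0)'
done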


Ignoring for now the cases that drop the filesystem into read-only mode
with relatively little fuss, here are a few patches that fix some of the
low-hanging fruit. They apply to Linus' tree as of today.

Thanks!

Omar Sandoval (3):
  btrfs: handle ENOMEM in btrfs_alloc_tree_block
  btrfs: handle race on ENOMEM in alloc_extent_buffer
  btrfs: check io_ctl_prepare_pages return in __btrfs_write_out_cache

 fs/btrfs/extent-tree.c      | 41
 fs/btrfs/extent_io.c        | 20
 fs/btrfs/free-space-cache.c | 10
 3 files changed, 50 insertions(+), 21 deletions(-)

-- 
2.3.0
