[PATCH] Btrfs-progs: fix compile failure
make ::

    [CC]     btrfs-search-metadata.o
btrfs-search-metadata.c: In function ‘print_usage’:
btrfs-search-metadata.c:40: error: ‘BTRFS_BUILD_VERSION’ undeclared (first use in this function)
btrfs-search-metadata.c:40: error: (Each undeclared identifier is reported only once
btrfs-search-metadata.c:40: error: for each function it appears in.)
make: *** [btrfs-search-metadata.o] Error 1

btrfs-search-metadata.c:fprintf(stderr, "%s\n", BTRFS_BUILD_VERSION);

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 btrfs-search-metadata.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/btrfs-search-metadata.c b/btrfs-search-metadata.c
index 80dc326..34c6f39 100644
--- a/btrfs-search-metadata.c
+++ b/btrfs-search-metadata.c
@@ -37,7 +37,6 @@ static int print_usage(void)
 	fprintf(stderr, "\t-t tree-id: search for given tree\n");
 	fprintf(stderr, "\t-l level: search for node level (0=leaf)\n");
 	fprintf(stderr, "\t-L: print full listing of matching leaf/node contents\n");
-	fprintf(stderr, "%s\n", BTRFS_BUILD_VERSION);
 	exit(1);
 }
 
--
2.0.0.153.g79d

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs raid-1 uuid-fstab
Kai Krakow posted on Tue, 17 Feb 2015 00:15:50 +0100 as excerpted:

> Long story short: I managed to strip dracut down to too few modules and it lost its ability to mount anything and even could not spawn a shell. *gnarf

Ouch! FWIW, that's why I use a kernel built-in initramfs. If I upgrade dracut or change its config and it fails to work, just as if the new kernel the initramfs is appended to fails to work, I simply boot an older kernel... with a known-working dracut-created initramfs.

Tho I /did/ have trouble with an older dracut locking to a particular default-root UUID at one point, so it would boot any root= I pointed it at, but *ONLY* as long as that particular UUID continued to exist! Which is pretty hard to test for, since until you actually mkfs the existing default-root, its UUID will continue to exist, and you'll never know that your boot to the backup root using root= is working now, but will fail as soon as the default-root ceases to exist, until you're actually in the situation and can't boot, using any kernel/dracut combination!

That did drop me to the dracut/initramfs shell, but I was new enough with dracut at the time that I didn't really know how to fix it from there, nor could I properly edit a file or even view an entire file (cat worked, but that only let me see the last N lines and I didn't have a pager in the initramfs), to try to read documentation and fix the issue.

What I finally did to get out of that hole was manually ln -s the /dev/disk/by-uuid/* symlink that the dracut/initramfs scripts were looking for based on the error, pointing it at an existing /dev/sdXN. It didn't have to point at the root device, it could point at any device-block file, as long as that device-block file actually existed.

I didn't originally file a bug on that as the host-only option documentation warned about it being host-specific, so I figured it was /designed/ to do that. Only later, when host-only was being discussed as the gentoo-recommended default on gentoo-dev and I explained that it wasn't always suitable as it broke if/when you blew away your default-root and recreated it with a new UUID, and the gentoo dracut maintainer asked why I hadn't filed a bug, did I figure out it /was/ a bug, not a confusingly documented feature.

So I filed a bug and the gentoo maintainer filed one upstream as well, and it was apparently fixed. But of course by then I had long since worked around the problem with more specific dracut-module include and exclude statements in the config, instead of using host-only, and that was working and continues to work, so I've never had reason to go back and test the more loosely specified host-only mode, and thus have never confirmed whether the bug was actually fixed or not, since I don't use that mode any more.

> And when that wasn't fun enough, my BIOS decided to no longer initialize USB so I could neither get into BIOS nor into Grub shell. I don't know when that problem happened. Probably been that for a while and I never noticed. Just that it went a lot slower through BIOS after I managed to convince it to initialize USB again (by opening the case and shorting the reset jumper).

Ouch. FWIW my mobo has dual-bios, which is nice, but I've been down the bios-reset road before, several times. I even had a BIOS update go bad once (due to bad RAM), screwed up the last-ditch bios-rescue it offered as I didn't know what I was doing, and had to use my netbook to set up a webmail account (didn't have the passwords to my normal email as I don't normally keep anything private on the netbook at all, in case I lose it, and couldn't access my other disks without a device to convert them to external/USB) and order a new BIOS shipped to me.

That is of course the big reason my new machine is dual-bios! =:^)

Tho it's not an absolute cure-all, as once it successfully boots from the main BIOS it auto-overwrites the second one, if different. I'd actually rather make the auto-overwrite bit manual, so I could update it only when I was sufficiently sure it worked _reliably_, but oh, well, better than not having a backup BIOS at all, as I learned from experience!

> The next fun part was: My backup was incomplete in a special way: It had no directories dev, proc, run, sys and friends... Don't ask me how I solved that, probably by init=/bin/bash.

init=/bin/bash is indeed a very handy tool to have as a sysadmin. =:^) I think I mentioned that setting that (via grub var) is actually one of my grub2 menu options, in the backup menu, FWIW.

> It happens, because I used rsync with the option to exclude those dirs. But well: In the end my backup was tested bootable. :-) I fixed my dracut setup and in the same procedure also fixed a long-standing issue with btrfs check telling me nlink errors. Luckily, this newer version could tell me the paths and I just deleted those files in the chrome profile and var/lib/bluetooth directory.

I
Re: [PATCH 23/24] Btrfs: sysfs: support seed devices in the sysfs layout
[guihc.f...@cn.fujitsu.com bounced; removing the email id from the cc-list]

Dave,

Here is the patch list for Oct.
  https://patchwork.kernel.org/project/linux-btrfs/list/?page=7
v2 isn't there. I am confused.

Anyway, if btrfs-progs integration-20150213 contained v2, I am on it now, and I still see the problem.

--
btrfs fi show -- on a nested seed fs
warning devid 1 not found already ---
warning devid 2 not found already ---
bytenr mismatch, want=4194304, have=0
Label: none uuid: fce49239-b392-4e4d-b775-57dca7f2426b
	Total devices 1 FS bytes used 28.00KiB
	devid 1 size 967.86MiB used 12.00MiB path /dev/sdb

Label: none uuid: bccc7c86-82ff-4f2c-805a-4d384642f5e6
	Total devices 2 FS bytes used 92.00KiB
	devid 1 size 967.86MiB used 8.00MiB path /dev/sdb
	devid 2 size 967.87MiB used 144.00MiB path /dev/sdc

Label: none uuid: 6ea26ac0-b4e9-4a56-9079-67d25a57ac27
	Total devices 3 FS bytes used 156.00KiB
	devid 3 size 1.52GiB used 224.00MiB path /dev/sdd
	*** Some devices missing

btrfs-progs v3.19-rc2-68-gd4bf1cc
--

Can you pls revert this patch for now ?

Thanks, Anand

On 02/14/2015 01:51 AM, David Sterba wrote:
> On Thu, Feb 12, 2015 at 02:25:32PM +0800, Anand Jain wrote:
>> Since we are on this topic: btrfs-progs shouldn't have had this patch:
>>
>> git log -p 2513077
>> -
>> commit 2513077f2f830b4bc83d528bfb6979eb461918bd
>> Author: Gui Hecheng guihc.f...@cn.fujitsu.com
>> Date: Mon Oct 6 18:16:46 2014 +0800
>>
>>     btrfs-progs: fix device missing of btrfs fi show with seed devices
>> -
>>
>> it doesn't work with nested seed as I commented
>> http://marc.info/?l=linux-btrfs&m=141102300324251&w=2
>>
>> -
>> btrfs fi show -d
>> warning devid 1 not found already
>> warning devid 2 not found already
>> Check tree block failed, want=29425664, have=0
>> read block failed check_tree_block
>> Couldn't setup csum tree
>> Check tree block failed, want=29360128, have=0
>> read block failed check_tree_block
>> -
>>
>> I haven't seen the next version of this patch from Gui. (Gui ?, copied)
>
> The fixed version was
> [PATCH v2 3/3] btrfs-progs: fix device missing of btrfs fi show with seed devices
> http://article.gmane.org/gmane.comp.file-systems.btrfs/39186/
> and that's what I have merged.
Re: Forever blocked in bit_wait with kernel 3.19
On Thu, Feb 12, 2015 at 11:12:25AM +, Steven Schlansker wrote:
> [ Please CC me on replies, I'm not on the list ]
> [ This is a followup to http://www.spinics.net/lists/linux-btrfs/msg41496.html ]
>
> Hello linux-btrfs,
> I've been having troubles keeping my Apache Mesos / Docker slave nodes stable. After some period of load, tasks begin to hang. Once this happens task after task ends up waiting at the same point, never to return. The system quickly becomes unusable and must be terminated.
>
> After the previous issues, I was encouraged to upgrade and retry. I am now running
>
> Linux 3.19.0 #1 SMP Mon Feb 9 09:43:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Btrfs v3.18.2 (and this version was also used to mkfs)
>
> root@ip-10-30-38-86:~# btrfs fi show
> Label: none uuid: 0e8c3f1d-b07b-4643-9834-a41dafb80257
> 	Total devices 2 FS bytes used 3.92GiB
> 	devid 1 size 74.99GiB used 4.01GiB path /dev/xvdc
> 	devid 2 size 74.99GiB used 4.01GiB path /dev/xvdd
>
> Btrfs v3.18.2
>
> Data, RAID0: total=6.00GiB, used=3.69GiB
> System, RAID0: total=16.00MiB, used=16.00KiB
> Metadata, RAID0: total=2.00GiB, used=229.30MiB
> GlobalReserve, single: total=80.00MiB, used=0.00B
>
> This is the first hung task:
>
> [146280.252086] INFO: task java:28252 blocked for more than 120 seconds.
> [146280.252096] Tainted: G E 3.19.0 #1
> [146280.252098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [146280.252102] java D 8805584df528 0 28252 1400 0x
> [146280.252106] 8805584df528 880756a24aa0 00014100 8805584dffd8
> [146280.252108] 00014100 8807567c31c0 880756a24aa0 8805584df5d0
> [146280.252109] 88075a314a00 8805584df5d0 88077c3f8ce8 0002
> [146280.252111] Call Trace:
> [146280.252120] [8194efa0] ? bit_wait+0x50/0x50
> [146280.252122] [8194e770] io_schedule+0xa0/0x130
> [146280.252125] [8194efcc] bit_wait_io+0x2c/0x50
> [146280.252127] [8194ec05] __wait_on_bit+0x65/0x90
> [146280.252131] [81169ad7] wait_on_page_bit+0xc7/0xd0
> [146280.252134] [810b0840] ? autoremove_wake_function+0x40/0x40
> [146280.252137] [8117d9ed] shrink_page_list+0x2fd/0xa90
> [146280.252139] [8117e7ad] shrink_inactive_list+0x1cd/0x590
> [146280.252141] [8117f5b5] shrink_lruvec+0x5f5/0x810
> [146280.252144] [81086d01] ? pwq_activate_delayed_work+0x31/0x90
> [146280.252146] [8117f867] shrink_zone+0x97/0x240
> [146280.252148] [8117fd75] do_try_to_free_pages+0x155/0x440
> [146280.252150] [81180257] try_to_free_mem_cgroup_pages+0xa7/0x130
> [146280.252154] [811d2931] try_charge+0x151/0x620
> [146280.252158] [81815a05] ? tcp_schedule_loss_probe+0x145/0x1e0
> [146280.252160] [811d6f48] mem_cgroup_try_charge+0x98/0x110
> [146280.252164] [8170957e] ? __alloc_skb+0x7e/0x2b0
> [146280.252166] [8116accf] __add_to_page_cache_locked+0x7f/0x290
> [146280.252169] [8116af28] add_to_page_cache_lru+0x28/0x80
> [146280.252171] [8116b00f] pagecache_get_page+0x8f/0x1c0
> [146280.252173] [81952570] ? _raw_spin_unlock_bh+0x20/0x40
> [146280.252189] [a0045935] prepare_pages.isra.19+0xc5/0x180 [btrfs]
> [146280.252199] [a00464ec] __btrfs_buffered_write+0x1cc/0x590 [btrfs]
> [146280.252208] [a0049c07] btrfs_file_write_iter+0x287/0x510 [btrfs]
> [146280.252211] [813f7076] ? aa_path_perm+0xd6/0x170
> [146280.252214] [811dfd91] new_sync_write+0x81/0xb0
> [146280.252216] [811e0537] vfs_write+0xb7/0x1f0
> [146280.252217] [81950636] ? mutex_lock+0x16/0x37
> [146280.252219] [811e1146] SyS_write+0x46/0xb0
> [146280.252221] [819529ed] system_call_fastpath+0x16/0x1b
>
> Here is a slightly different stacktrace:
>
> [158880.240245] INFO: task kworker/u16:6:13974 blocked for more than 120 seconds.
> [158880.240249] Tainted: G E 3.19.0 #1
> [158880.240252] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [158880.240254] kworker/u16:6 D 88064e7b76c8 0 13974 2 0x
> [158880.240259] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
> [158880.240260] 88064e7b76c8 88066f0c18e0 00014100 88064e7b7fd8
> [158880.240262] 00014100 8201e4a0 88066f0c18e0 88077c3e06e8
> [158880.240264] 88075a214a00 88077c3e06e8 88064e7b7770 0002
> [158880.240266] Call Trace:
> [158880.240268] [8194efa0] ? bit_wait+0x50/0x50
> [158880.240270] [8194e770] io_schedule+0xa0/0x130
> [158880.240273] [8194efcc] bit_wait_io+0x2c/0x50
> [158880.240275] [8194ed9b] __wait_on_bit_lock+0x4b/0xb0
> [158880.240277] [81169f2e] __lock_page+0xae/0xb0
> [158880.240279] [810b0840] ? autoremove_wake_function+0x40/0x40
> [158880.240289] [a00501bd]
[PATCH] generic/325: Fix test case to work on 64K page size.
The test case passes 32K as the offset value to msync. This fails on
machines with 64K page size. Fix this by creating a larger file and passing
offset values which are multiples of 64K.

Signed-off-by: Chandan Rajendra chan...@linux.vnet.ibm.com
---
 tests/generic/325     | 10 +++++-----
 tests/generic/325.out | 10 +++++-----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tests/generic/325 b/tests/generic/325
index c47e372..e62ac95 100755
--- a/tests/generic/325
+++ b/tests/generic/325
@@ -64,7 +64,7 @@ _init_flakey
 _mount_flakey
 
 # Create the file first.
-$XFS_IO_PROG -f -c "pwrite -S 0xff 0 64K" $SCRATCH_MNT/foo | _filter_xfs_io
+$XFS_IO_PROG -f -c "pwrite -S 0xff 0 256K" $SCRATCH_MNT/foo | _filter_xfs_io
 
 # Now sync the file data to disk using 'sync' and not an fsync. This is because
 # in btrfs the first fsync clears the btrfs inode full fsync flag, which must
@@ -80,11 +80,11 @@ sync
 # This second msync() used to be a no-op for that btrfs bug (and the first fsync
 # didn't log the last 4Kb extent as expected too).
 $XFS_IO_PROG \
-	-c "mmap -w 0 64K"          \
+	-c "mmap -w 0 256K"         \
 	-c "mwrite -S 0xaa 0 4K"    \
-	-c "mwrite -S 0xbb 60K 4K"  \
-	-c "msync -s 0K 16K"        \
-	-c "msync -s 32K 32K"       \
+	-c "mwrite -S 0xbb 252K 4K" \
+	-c "msync -s 0K 64K"        \
+	-c "msync -s 192K 64K"      \
 	-c "munmap"                 \
 	$SCRATCH_MNT/foo | _filter_xfs_io
 
diff --git a/tests/generic/325.out b/tests/generic/325.out
index 9a78c3e..9373e01 100644
--- a/tests/generic/325.out
+++ b/tests/generic/325.out
@@ -1,19 +1,19 @@
 QA output created by 325
-wrote 65536/65536 bytes at offset 0
+wrote 262144/262144 bytes at offset 0
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 File content before crash/reboot:
 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
 *
 0010000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 *
-0170000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
+0770000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
 *
-0200000
+1000000
 File content after crash/reboot and fs mount:
 0000000 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
 *
 0010000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 *
-0170000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
+0770000 bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb
 *
-0200000
+1000000
--
2.1.0
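[Editor's sketch, not part of the patch] The failure mode described above comes down to an alignment check: an msync offset that is a multiple of 4K need not be a multiple of 64K. The helper names below are hypothetical and exist only for illustration; the second helper assumes a POSIX system where sysconf(_SC_PAGESIZE) reports the page size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* True when offset is a whole multiple of page_size. */
bool is_page_aligned(size_t offset, size_t page_size)
{
	return page_size != 0 && offset % page_size == 0;
}

/* Same check against the page size of the machine running the code. */
bool is_aligned_here(size_t offset)
{
	long page_size = sysconf(_SC_PAGESIZE);

	return page_size > 0 && is_page_aligned(offset, (size_t)page_size);
}
```

With these helpers, the old 32K offset is aligned for a 4096-byte page but not for a 65536-byte page, while the new 192K offset is aligned for both, which is why the test was moved to multiples of 64K.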
Re: [PATCH v3] Btrfs: fix BUG_ON in btrfs_orphan_add() when delete unused block group
On Wed, Feb 11, 2015 at 1:24 AM, Forrest Liu forre...@synology.com wrote:
> Removing a large number of block groups in one transaction may hit the
> BUG_ON() in btrfs_orphan_add(). That is because
> btrfs_orphan_reserve_metadata() will grab the metadata reservation from the
> transaction handle, and btrfs_delete_unused_bgs() didn't reserve metadata
> for the transaction handle when deleting unused block groups.
>
> The problem can be reproduced by the following script:
>
> mntpath=/btrfs
> loopdev=/dev/loop0
> filepath=/home/forrest/image
>
> umount $mntpath
> losetup -d $loopdev
> truncate --size 1000g $filepath
> losetup $loopdev $filepath
> mkfs.btrfs -f $loopdev
> mount $loopdev $mntpath
>
> for j in `seq 1 1 1000`; do
>     fallocate -l 1g $mntpath/$j
> done
> # wait cleaner thread remove unused block group
> sleep 300
>
> The call trace that results from the BUG_ON() is:
>
> [ 613.093084] ------------[ cut here ]------------
> [ 613.097928] kernel BUG at fs/btrfs/inode.c:3142!

This should fix a hard-to-trigger bug we've seen in production here. I was on a different (wrong) track for explaining it, so your timing is excellent.

Thanks.

-chris
Re: [PATCH 2/3] btrfs: handle race on ENOMEM in alloc_extent_buffer
On Tue, Feb 17, 2015 at 02:51:08AM -0800, Omar Sandoval wrote:
> Consider the following interleaving of overlapping calls to
> alloc_extent_buffer:
>
> Call 1:
> - Successfully allocates a few pages with find_or_create_page
> - find_or_create_page fails, goto free_eb
> - Unlocks the allocated pages
>
> Call 2:
> - Calls find_or_create_page and gets a page in call 1's extent_buffer
> - Finds that the page is already associated with an extent_buffer
> - Grabs a reference to the half-written extent_buffer and calls
>   mark_extent_buffer_accessed on it
>
> mark_extent_buffer_accessed will then try to call mark_page_accessed on
> a null page and panic.
>
> The fix is to clear page->private of the half-written extent_buffer's
> pages all at once while holding mapping->private_lock.
>
> Signed-off-by: Omar Sandoval osan...@osandov.com
> ---
>  fs/btrfs/extent_io.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
[snip]

Actually, I just realized that there's a simpler fix. I can resend the whole series for easier merging once I get some review, but for now, here's what I'm talking about:

btrfs: handle race on ENOMEM in alloc_extent_buffer

Consider the following interleaving of overlapping calls to
alloc_extent_buffer:

Call 1:
- Successfully allocates a few pages with find_or_create_page
- find_or_create_page fails, goto free_eb
- Unlocks the allocated pages

Call 2:
- Calls find_or_create_page and gets a page in call 1's extent_buffer
- Finds that the page is already associated with an extent_buffer
- Grabs a reference to the half-written extent_buffer and calls
  mark_extent_buffer_accessed on it

mark_extent_buffer_accessed will then try to call mark_page_accessed on
a null page and panic.

The fix is to decrement the reference count on the half-written
extent_buffer before unlocking the pages so call 2 won't use it. We also
set exists = NULL in the case that we don't use exists to avoid
accidentally returning a freed extent_buffer in an error case.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent_io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 790dbae..6b3cd72 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4850,6 +4850,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 				mark_extent_buffer_accessed(exists, p);
 				goto free_eb;
 			}
+			exists = NULL;
 
 			/*
 			 * Do this so attach doesn't complain and we need to
@@ -4913,12 +4914,12 @@ again:
 	return eb;
 
 free_eb:
+	WARN_ON(!atomic_dec_and_test(&eb->refs));
 	for (i = 0; i < num_pages; i++) {
 		if (eb->pages[i])
 			unlock_page(eb->pages[i]);
 	}
 
-	WARN_ON(!atomic_dec_and_test(&eb->refs));
 	btrfs_release_extent_buffer(eb);
 	return exists;
 }
--
Omar
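[Editor's sketch, not part of the patch] The ordering argument in the simpler fix can be illustrated in userspace with C11 atomics. This assumes that call 2 takes its reference with an increment-unless-zero operation, as the kernel's atomic_inc_not_zero() does; the two functions below are simplified stand-ins written for this example, not the kernel implementations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for atomic_dec_and_test(): true if the count reached zero. */
bool ref_dec_and_test(atomic_int *refs)
{
	return atomic_fetch_sub(refs, 1) == 1;
}

/* Stand-in for atomic_inc_not_zero(): take a reference unless count is 0. */
bool ref_inc_not_zero(atomic_int *refs)
{
	int old = atomic_load(refs);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refs, &old, old + 1))
			return true;
	}
	return false;
}
```

If call 1 drops the last reference before it unlocks the pages, a later increment-unless-zero by call 2 sees a count of zero and refuses the half-written buffer. If the pages are unlocked first, call 2 can still take a reference in the window before the decrement, which is the race the patch closes.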
[PULL] Btrfs, recent cleanup patches
Hi,

I've collected the cleanup patches that are not in the integration branch; the time span is the last few months. I've reviewed them but tested only lightly due to yet unknown problems in the current integration branch.

The target release is probably 3.21. I'm going to send more cleanup series so that this can be a base for the next development cycle and minimize conflicts with other changes.

Please pull, thanks.

The following changes since commit a742994aa2e271eb8cd8e043d276515ec858ed73:

  Btrfs: don't remove extents and xattrs when logging new names (2015-02-14 08:22:49 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git cleanups-post-3.19

for you to fetch changes up to a4f3d2c4efe2628329249b64fd5799468e025b9d:

  btrfs: cleanup, reduce temporary variables in btrfs_read_roots (2015-02-16 18:48:47 +0100)

Daniel Dressler (3):
      Btrfs: ctree: reduce args where only fs_info used
      Btrfs: delayed-inode: replace root args iff only fs_info used
      Btrfs: disk-io: replace root args iff only fs_info used

David Sterba (3):
      btrfs: constify structs with op functions or static definitions
      btrfs: use correct type for workqueue flags
      btrfs: cleanup, reduce temporary variables in btrfs_read_roots

Eric Sandeen (10):
      btrfs: remove unused fs_info arg from btrfs_close_extra_devices()
      btrfs: consistently use fs_info in close_ctree()
      btrfs: factor btrfs_init_scrub() out of open_ctree()
      btrfs: factor btrfs_init_balance() out of open_ctree()
      btrfs: factor btrfs_init_btree_inode() out of open_ctree()
      btrfs: factor btrfs_init_dev_replace_locks() out of open_ctree()
      btrfs: factor btrfs_init_qgroup() out of open_ctree()
      btrfs: factor btrfs_init_workqueues() out of open_ctree()
      btrfs: factor btrfs_replay_log() out of open_ctree()
      btrfs: factor btrfs_read_roots() out of open_ctree()

Fabian Frederick (1):
      btrfs: fix sizeof format specifier in btrfs_check_super_valid()

Wang Shilong (1):
      Btrfs: switch to kvfree() helper

Zhao Lei (3):
      btrfs: cleanup: remove no-used alloc_chunk in btrfs_check_data_free_space()
      btrfs: remove unused chunk_tree argument in several functions
      btrfs: cleanup: use for() loop in btrfs_map_bio()

 fs/btrfs/async-thread.c    |   4 +-
 fs/btrfs/async-thread.h    |   2 +-
 fs/btrfs/check-integrity.c |   5 +-
 fs/btrfs/compression.c     |   2 +-
 fs/btrfs/compression.h     |   4 +-
 fs/btrfs/ctree.c           |  53 +++--
 fs/btrfs/ctree.h           |   3 +-
 fs/btrfs/delayed-inode.c   |   9 +-
 fs/btrfs/disk-io.c         | 558 -
 fs/btrfs/disk-io.h         |   4 +-
 fs/btrfs/extent-tree.c     |  10 +-
 fs/btrfs/extent_io.h       |   2 +-
 fs/btrfs/file-item.c       |   2 +-
 fs/btrfs/file.c            |   8 +-
 fs/btrfs/lzo.c             |   2 +-
 fs/btrfs/props.c           |   2 +
 fs/btrfs/qgroup.c          |   2 +-
 fs/btrfs/raid56.c          |  13 +-
 fs/btrfs/scrub.c           |   2 +-
 fs/btrfs/sysfs.c           |   2 +-
 fs/btrfs/transaction.c     |   2 +-
 fs/btrfs/tree-log.c        |   8 +-
 fs/btrfs/volumes.c         |  34 +--
 fs/btrfs/volumes.h         |   3 +-
 fs/btrfs/zlib.c            |   2 +-
 25 files changed, 386 insertions(+), 352 deletions(-)
[PATCH 3/3] btrfs: check io_ctl_prepare_pages return in __btrfs_write_out_cache
If io_ctl_prepare_pages fails, the pages in io_ctl.pages are not valid.
When we try to access them later, things will blow up in various ways.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/free-space-cache.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index d6c03f7..0460632 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1114,7 +1114,7 @@ cleanup_write_cache_enospc(struct inode *inode,
  *
  * This function writes out a free space cache struct to disk for quick recovery
  * on mount.  This will return 0 if it was successfull in writing the cache out,
- * and -1 if it was not.
+ * or an errno if it was not.
  */
 static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 				   struct btrfs_free_space_ctl *ctl,
@@ -1130,11 +1130,11 @@ static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 	int ret;
 
 	if (!i_size_read(inode))
-		return -1;
+		return -EIO;
 
 	ret = io_ctl_init(&io_ctl, inode, root, 1);
 	if (ret)
-		return -1;
+		return ret;
 
 	if (block_group && (block_group->flags & BTRFS_BLOCK_GROUP_DATA)) {
 		down_write(&block_group->data_rwsem);
@@ -1151,7 +1151,9 @@ static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 	}
 
 	/* Lock all pages first so we can lock the extent safely. */
-	io_ctl_prepare_pages(&io_ctl, inode, 0);
+	ret = io_ctl_prepare_pages(&io_ctl, inode, 0);
+	if (ret)
+		goto out;
 
 	lock_extent_bits(&BTRFS_I(inode)->io_tree, 0, i_size_read(inode) - 1,
 			 0, &cached_state);
--
2.3.0
[PATCH 2/3] btrfs: handle race on ENOMEM in alloc_extent_buffer
Consider the following interleaving of overlapping calls to
alloc_extent_buffer:

Call 1:
- Successfully allocates a few pages with find_or_create_page
- find_or_create_page fails, goto free_eb
- Unlocks the allocated pages

Call 2:
- Calls find_or_create_page and gets a page in call 1's extent_buffer
- Finds that the page is already associated with an extent_buffer
- Grabs a reference to the half-written extent_buffer and calls
  mark_extent_buffer_accessed on it

mark_extent_buffer_accessed will then try to call mark_page_accessed on
a null page and panic.

The fix is to clear page->private of the half-written extent_buffer's
pages all at once while holding mapping->private_lock.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent_io.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c73df6a..6024db9 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4850,6 +4850,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 				mark_extent_buffer_accessed(exists, p);
 				goto free_eb;
 			}
+			exists = NULL;
 
 			/*
 			 * Do this so attach doesn't complain and we need to
@@ -4913,13 +4914,24 @@ again:
 	return eb;
 
 free_eb:
+	spin_lock(&mapping->private_lock);
 	for (i = 0; i < num_pages; i++) {
-		if (eb->pages[i])
-			unlock_page(eb->pages[i]);
-	}
+		struct page *page = eb->pages[i];
 
+		if (page) {
+			unlock_page(page);
+			ClearPagePrivate(page);
+			set_page_private(page, 0);
+			/* One for the page private */
+			page_cache_release(page);
+			/* One for when we alloced the page */
+			page_cache_release(page);
+		}
+	}
+	spin_unlock(&mapping->private_lock);
 	WARN_ON(!atomic_dec_and_test(&eb->refs));
-	btrfs_release_extent_buffer(eb);
+	__free_extent_buffer(eb);
+
 	return exists;
 }
 
--
2.3.0
[PATCH 1/3] btrfs: handle ENOMEM in btrfs_alloc_tree_block
This is one of the first places to go when memory is tight. Handle it
properly rather than with a BUG_ON.

Signed-off-by: Omar Sandoval osan...@osandov.com
---
 fs/btrfs/extent-tree.c | 41 ++++++++++++++++++++++++++++-------------
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a684086..479df76 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7321,7 +7321,7 @@ static void unuse_block_rsv(struct btrfs_fs_info *fs_info,
  * returns the key for the extent through ins, and a tree buffer for
  * the first block of the extent through buf.
  *
- * returns the tree buffer or NULL.
+ * returns the tree buffer or an ERR_PTR on error.
  */
 struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 					struct btrfs_root *root,
@@ -7332,6 +7332,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 	struct btrfs_key ins;
 	struct btrfs_block_rsv *block_rsv;
 	struct extent_buffer *buf;
+	struct btrfs_delayed_extent_op *extent_op;
 	u64 flags = 0;
 	int ret;
 	u32 blocksize = root->nodesize;
@@ -7352,14 +7353,15 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 
 	ret = btrfs_reserve_extent(root, blocksize, blocksize,
 				   empty_size, hint, &ins, 0, 0);
-	if (ret) {
-		unuse_block_rsv(root->fs_info, block_rsv, blocksize);
-		return ERR_PTR(ret);
-	}
+	if (ret)
+		goto out_unuse;
 
 	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
 				    blocksize, level);
-	BUG_ON(IS_ERR(buf)); /* -ENOMEM */
+	if (IS_ERR(buf)) {
+		ret = PTR_ERR(buf);
+		goto out_free_reserved;
+	}
 
 	if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
 		if (parent == 0)
@@ -7369,9 +7371,11 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		BUG_ON(parent > 0);
 
 	if (root_objectid != BTRFS_TREE_LOG_OBJECTID) {
-		struct btrfs_delayed_extent_op *extent_op;
 		extent_op = btrfs_alloc_delayed_extent_op();
-		BUG_ON(!extent_op); /* -ENOMEM */
+		if (!extent_op) {
+			ret = -ENOMEM;
+			goto out_free_buf;
+		}
 		if (key)
 			memcpy(&extent_op->key, key, sizeof(extent_op->key));
 		else
@@ -7386,13 +7390,24 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 		extent_op->level = level;
 
 		ret = btrfs_add_delayed_tree_ref(root->fs_info, trans,
-					ins.objectid,
-					ins.offset, parent, root_objectid,
-					level, BTRFS_ADD_DELAYED_EXTENT,
-					extent_op, 0);
-		BUG_ON(ret); /* -ENOMEM */
+						 ins.objectid, ins.offset,
+						 parent, root_objectid, level,
+						 BTRFS_ADD_DELAYED_EXTENT,
+						 extent_op, 0);
+		if (ret)
+			goto out_free_delayed;
 	}
 	return buf;
+
+out_free_delayed:
+	btrfs_free_delayed_extent_op(extent_op);
+out_free_buf:
+	free_extent_buffer(buf);
+out_free_reserved:
+	btrfs_free_reserved_extent(root, ins.objectid, ins.offset, 0);
+out_unuse:
+	unuse_block_rsv(root->fs_info, block_rsv, blocksize);
+	return ERR_PTR(ret);
 }
 
 struct walk_control {
--
2.3.0
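[Editor's sketch, not part of the patch] The patch replaces BUG_ON() with a ladder of cleanup labels, where each label undoes one earlier step and falls through to the next. A minimal userspace sketch of that pattern follows. Everything in it is hypothetical and simplified for illustration: err_ptr(), ptr_err() and is_err() are stand-ins for the kernel's ERR_PTR helpers, struct resources merely counts what is currently held, and fail_step injects a failure at a chosen step.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(). */
#define MAX_ERRNO 4095

void *err_ptr(long err) { return (void *)err; }
long ptr_err(const void *p) { return (long)p; }
int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

/* Counters for what is currently held, so leaks are easy to spot. */
struct resources {
	int reserved;
	int buffers;
};

/*
 * fail_step: 0 = succeed, 1 = reservation fails,
 *            2 = buffer allocation fails, 3 = op allocation fails.
 */
void *alloc_block(struct resources *r, int fail_step)
{
	void *buf, *op;
	int ret;

	if (fail_step == 1) {
		ret = -ENOSPC;
		goto out;
	}
	r->reserved++;

	buf = (fail_step == 2) ? NULL : malloc(64);
	if (!buf) {
		ret = -ENOMEM;
		goto out_free_reserved;
	}
	r->buffers++;

	op = (fail_step == 3) ? NULL : malloc(16);
	if (!op) {
		ret = -ENOMEM;
		goto out_free_buf;
	}
	free(op);	/* the sketch consumes the op immediately */
	return buf;

	/* Unwind in reverse order of acquisition. */
out_free_buf:
	free(buf);
	r->buffers--;
out_free_reserved:
	r->reserved--;
out:
	return err_ptr(ret);
}
```

The point of the ordering is that a failure at any step releases exactly the resources acquired before it, and the caller gets an error pointer it can test instead of a crash.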
[PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more
From: Zhao Lei zhao...@cn.fujitsu.com

If we have any chance to make a successful write, we should not give up.

This patch adjusts the commit-transaction condition from:
  pinned >= wanted
to
  left + pinned >= wanted

Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 414d533..4ffce64 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3751,7 +3751,8 @@ alloc:
 			 * don't bother committing the transaction.
 			 */
 			if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
-						   bytes) < 0)
+						   used + bytes -
+						   data_sinfo->total_bytes) < 0)
 				have_pinned_space = 0;
 			spin_unlock(&data_sinfo->lock);
 
--
1.8.5.1
[PATCH] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more
From: Zhao Lei zhao...@cn.fujitsu.com

If we have any chance to make a successful write, we should not give up.

This patch adjusts the commit-transaction condition from:
  pinned >= wanted
to
  left + pinned >= wanted

Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 414d533..4ffce64 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3751,7 +3751,8 @@ alloc:
 			 * don't bother committing the transaction.
 			 */
 			if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
-						   bytes) < 0)
+						   used + bytes -
+						   data_sinfo->total_bytes) < 0)
 				have_pinned_space = 0;
 			spin_unlock(&data_sinfo->lock);
 
--
1.8.5.1
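[Editor's sketch, not part of the patch] The changed condition is plain arithmetic: comparing pinned against used + bytes - total_bytes is the same as asking whether left + pinned >= wanted, where left = total_bytes - used. The function names and the sample numbers below are hypothetical, and the sketch ignores the per-CPU counter and locking the kernel code uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check: a commit only helps if pinned bytes alone cover the request. */
bool old_should_commit(uint64_t pinned, uint64_t wanted)
{
	return pinned >= wanted;
}

/* New check: space still left in the space_info counts as well. */
bool new_should_commit(uint64_t total, uint64_t used,
		       uint64_t pinned, uint64_t wanted)
{
	uint64_t left = total > used ? total - used : 0;

	return left + pinned >= wanted;
}
```

For example, with total = 100, used = 70, pinned = 30 and a request of 50, the old check gives up because 30 < 50, while the new check commits the transaction because 30 left plus 30 pinned covers the 50 wanted.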
RE: [PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more
Hi,

> From: Zhaolei [mailto:zhao...@cn.fujitsu.com]
> Subject: [PATCH 01/10] btrfs: Adjust commit-transaction condition to avoid NO_SPACE more

Sorry for the title, it is only one patch. Will resend.

Thanks
Zhaolei

> From: Zhao Lei zhao...@cn.fujitsu.com
>
> If we have any chance to make a successful write, we should not give up.
>
> This patch adjusts the commit-transaction condition from:
>   pinned >= wanted
> to
>   left + pinned >= wanted
>
> Signed-off-by: Zhao Lei zhao...@cn.fujitsu.com
> ---
>  fs/btrfs/extent-tree.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 414d533..4ffce64 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3751,7 +3751,8 @@ alloc:
>  			 * don't bother committing the transaction.
>  			 */
>  			if (percpu_counter_compare(&data_sinfo->total_bytes_pinned,
> -						   bytes) < 0)
> +						   used + bytes -
> +						   data_sinfo->total_bytes) < 0)
>  				have_pinned_space = 0;
>  			spin_unlock(&data_sinfo->lock);
>
> --
> 1.8.5.1
[PATCH 0/3] btrfs: ENOMEM bugfixes
Hi,

As it turns out, running with low memory is a really easy way to shake out undesirable behavior in Btrfs. This can be especially bad when considering that a memory limit is really easy to hit in a container (e.g., by using cgroup memory.limit_in_bytes). Here's a simple script that can hit several problems:

#!/bin/sh

cgcreate -g memory:enomem
MEM=$((64 * 1024 * 1024))
echo $MEM > /sys/fs/cgroup/memory/enomem/memory.limit_in_bytes

cgexec -g memory:enomem ~/xfstests/ltp/fsstress -p128 -n9 -d /mnt/test &

trap "killall fsstress; exit 0" SIGINT SIGTERM

while true; do
	cgexec -g memory:enomem python -c '
l = []
while True:
    l.append(0)'
done

Ignoring for now the cases that drop the filesystem into read-only mode with relatively little fuss, here are a few patches that fix some of the low-hanging fruit. They apply to Linus' tree as of today.

Thanks!

Omar Sandoval (3):
  btrfs: handle ENOMEM in btrfs_alloc_tree_block
  btrfs: handle race on ENOMEM in alloc_extent_buffer
  btrfs: check io_ctl_prepare_pages return in __btrfs_write_out_cache

 fs/btrfs/extent-tree.c      | 41 ++++++++++++++++++++++++++++-------------
 fs/btrfs/extent_io.c        | 20 ++++++++++++++++----
 fs/btrfs/free-space-cache.c | 10 ++++++----
 3 files changed, 50 insertions(+), 21 deletions(-)

--
2.3.0