Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi,

Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang:
> OK, I see.
> But given that you often run into enospc errors, can you work out a
> reproduce script according to your workload? That will give us great help.

I tried hard to reproduce it, but I can't get it to reproduce with a test
script. Any ideas?

Stefan

> Regards,
> Xiaoguang Wang

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[josef-btrfs:master 4/17] arch/tile/mm/pgtable.c:47:2: error: implicit declaration of function 'glboal_node_page_state'
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next.git master
head:   7a27194d3aaad0547b4f3fccdaab7dc01e03f6de
commit: 1277e49c31f1694ba0d9f9b3871144832c66ac0a [4/17] writeback: allow for dirty metadata accounting
config: tile-tilegx_defconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 4.6.2
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 1277e49c31f1694ba0d9f9b3871144832c66ac0a
        # save the attached .config to linux build tree
        make.cross ARCH=tile

All error/warnings (new ones prefixed by >>):

   arch/tile/mm/pgtable.c: In function 'show_mem':
>> arch/tile/mm/pgtable.c:47:2: error: implicit declaration of function 'glboal_node_page_state'
>> arch/tile/mm/pgtable.c:47:2: warning: format '%lu' expects argument of type 'long unsigned int', but argument 7 has type 'int'
   cc1: some warnings being treated as errors

vim +/glboal_node_page_state +47 arch/tile/mm/pgtable.c

    41   * of processors and often four NUMA zones each with high and lowmem.
    42   */
    43  void show_mem(unsigned int filter)
    44  {
    45          struct zone *zone;
    46
  > 47          pr_err("Active:%lu inactive:%lu dirty:%lu metadata_dirty:%lu writeback:%lu metadata_writeback:%lu unstable:%lu free:%lu\n slab:%lu mapped:%lu pagetables:%lu bounce:%lu pagecache:%lu swap:%lu\n",
    48                 (global_node_page_state(NR_ACTIVE_ANON) +
    49                  global_node_page_state(NR_ACTIVE_FILE)),
    50                 (global_node_page_state(NR_INACTIVE_ANON) +

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

.config.gz (application/gzip) attached
Re: 4.8rc8 & OOM panic
On 09/28/16 20:46, E V wrote:
> I just booted my backup box with 4.8rc8 and started an rsync onto
> btrfs and it panic'd with OOM a couple hours later. I thought the OOM
> problems from 4.7 were supposed to be fixed in 4.8, or did I get that
> wrong? No users or anything else on the system.

Keep in mind that those problems are generic; other filesystems also
suffer: http://www.spinics.net/lists/linux-mm/msg114123.html

I don't see any recent compaction-related patches in Linus' tree at
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/mm/
..so it seems they have not been merged yet. Unless they get merged at
the last minute, it looks like 4.8 will be DOA.

-h
4.8rc8 & OOM panic
I just booted my backup box with 4.8rc8 and started an rsync onto btrfs,
and it panic'd with OOM a couple of hours later. I thought the OOM
problems from 4.7 were supposed to be fixed in 4.8, or did I get that
wrong? No users or anything else on the system.
Re: [PATCH][RFC] btrfs rare silent data corruption with kernel data leak (updated, preliminary patch)
On Thu, Sep 22, 2016 at 04:42:06PM -0400, Chris Mason wrote:
> On 09/21/2016 07:14 AM, Paul Jones wrote:
> >> -----Original Message-----
> >> From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-ow...@vger.kernel.org] On Behalf Of Zygo Blaxell
> >> Sent: Wednesday, 21 September 2016 2:56 PM
> >> To: linux-btrfs@vger.kernel.org
> >> Subject: btrfs rare silent data corruption with kernel data leak
> >>
> >> Summary:
> >>
> >> There seem to be two btrfs bugs here: one loses data on writes, and the
> >> other leaks data from the kernel to replace it on reads. It all happens
> >> after checksums are verified, so the corruption is entirely silent--no
> >> EIO errors, kernel messages, or device event statistics.
> >>
> >> Compressed extents are corrupted with a kernel data leak. Uncompressed
> >> extents may be corrupted by deterministically replacing data bytes with
> >> zero, or may not be corrupted at all. No preconditions for corruption
> >> are known. Less than one file per hundred thousand seems to be affected.
> >> Only specific parts of any file can be affected.
> >> Kernels v4.0..v4.5.7 tested; all have the issue.
>
> Zygo, could you please bounce me your original email? Somehow exchange
> ate it.
>
> If you're seeing this on databases that use fsync, it could be related
> to the fsync fix I put into the last RC. On my boxes it caused crashes,
> but memory corruptions aren't impossible.

The corruption pattern doesn't look like generic memory corruption.
Data in the inline extents is never wrong. Only the data after the end
of the inline extent is affected, and the correct data at those file
offsets is always zero.

> Any chance you can do a controlled experiment to rule out compression?

I get uncompressed inline extents, but so far I haven't found any of
those that read corrupted data.

I've tested 4.7.5 and it has the same corruption problem (among some
others that make it hard to use for testing).
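The invariant described above -- bytes between the end of the inline
extent data and the end of the 4096-byte page should always read as
zero -- can be sketched in a few lines. This is an illustrative model
of the expected read behavior, not btrfs code; the function name is
made up:

```python
PAGE_SIZE = 4096

def read_first_page(inline_data: bytes) -> bytes:
    """Model of a correct read of a page backed by an inline extent:
    the extent's bytes, then zeros up to the page boundary."""
    assert len(inline_data) < PAGE_SIZE  # inline extents are always < 4096 bytes
    return inline_data + b"\x00" * (PAGE_SIZE - len(inline_data))

# The reported bug: on the compressed read path the gap after the
# inline data is left uninitialized instead of zero-filled, so
# whatever happened to be in the page leaks to userspace.
page = read_first_page(b"hello")
assert len(page) == PAGE_SIZE
assert page[5:] == b"\x00" * (PAGE_SIZE - 5)
```

This also explains why the corruption is so rarely observed: when the
leaked kernel memory happens to be zero-filled, the read is correct by
accident.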
The trigger seems to be the '-S' option to rsync, which causes a lot of
short writes with seeks between them. When there is a seek from within
the first 4096 bytes to outside of the first 4096 bytes, an inline
extent _can_ occur--but does not most of the time. Normally, the inline
extent disappears in this sequence of operations:

        # head -c 4000 /usr/share/doc/ssh/copyright > f
        # filefrag -v f
        Filesystem type is: 9123683e
        File size of f is 4000 (1 block of 4096 bytes)
         ext: logical_offset: physical_offset: length: expected: flags:
           0:    0..    4095:    0..    4095:   4096:           last,not_aligned,inline,eof
        f: 1 extent found
        # head -c 4000 /usr/share/doc/ssh/copyright | dd conv=notrunc seek=1 bs=4k of=f
        0+1 records in
        0+1 records out
        4000 bytes (4.0 kB) copied, 0.00770182 s, 519 kB/s
        # filefrag -v f
        Filesystem type is: 9123683e
        File size of f is 8096 (2 blocks of 4096 bytes)
         ext: logical_offset: physical_offset: length: expected: flags:
           0:    0..    4095:    0..    4095:   4096:           not_aligned,inline
           1:    1..       1:    0..       0:      1:        1: last,unknown_loc,delalloc,eof
        f: 2 extents found
        # sync
        # filefrag -v f
        Filesystem type is: 9123683e
        File size of f is 8096 (2 blocks of 4096 bytes)
         ext: logical_offset: physical_offset: length: expected: flags:
           0:    0..       1: 1368948.. 1368949:     2:          last,encoded,eof
        f: 1 extent found
        # head -c 4000 /usr/share/doc/ssh/copyright > f

but very rarely (p = 0.1), the inline extent doesn't go away, and we get
an inline extent followed by more extents (see filefrag example below).

The inline extents appear with and without compression; however, I have
not been able to find cases where corruption occurs without compression
so far. Probing a little deeper shows that the inline extent is always
shorter than 4096 bytes, and corruption always happens in the gap
between the end of the inline extent data and the 4096th byte in the
following page.

It looks like the data is OK on disk. It is just some part of the read
path for compressed extents that injects uninitialized data on read.
Since kernel memory is often filled with zeros, the data is read
correctly much of the time by sheer chance. Existing data could be read
correctly with a kernel patch.

This reproducer will create corrupted extents in a kvm instance (4GB
memory, 16GB of btrfs filesystem, kernel 4.5.7) in under an hour:

        # mkdir /tmp/eee
        # cd /tmp/eee
        # y=/usr; for x in $(seq 0 9); do rsync -avxHSPW "$y/." "$x"; y="$x"; done &
        # mkdir /tmp/fff
        # cd /tmp/fff
        # y=/usr; for x in $(seq 0 9); do rsync -avxHSPW "$y/." "$x"; y="$x"; done &

This is how to find the inline extents where the corruption can oc
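One way to spot candidate files is to look for the layout Zygo
associates with the corruption: an inline extent followed by further
extents. A sketch of that check over `filefrag -v` output (illustrative
Python; the column layout it parses is assumed from the transcripts
above):

```python
def has_inline_plus_more(filefrag_output: str) -> bool:
    """Flag files whose `filefrag -v` listing shows an inline extent
    that is followed by at least one more extent."""
    flags_per_extent = []
    for line in filefrag_output.splitlines():
        fields = line.split()
        # extent rows start with an extent index like "0:" and end
        # with a comma-separated flag list
        if fields and fields[0].rstrip(":").isdigit():
            flags_per_extent.append(fields[-1].split(","))
    return len(flags_per_extent) > 1 and "inline" in flags_per_extent[0]

# The two-extent listing from the transcript above:
sample = """\
Filesystem type is: 9123683e
File size of f is 8096 (2 blocks of 4096 bytes)
 ext: logical_offset: physical_offset: length: expected: flags:
   0:    0..    4095:    0..    4095:   4096:           not_aligned,inline
   1:    1..       1:    0..       0:      1:        1: last,unknown_loc,delalloc,eof
f: 2 extents found
"""
assert has_inline_plus_more(sample)
```

In practice this would be fed the output of `filefrag -v <file>` for
each file in the affected tree.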
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Am 28.09.2016 um 15:44 schrieb Holger Hoffstätte:
>> Good idea but it does not. I hope i can reproduce this with my already
>> existing testscript which i've now bumped to use a 37TB partition and
>> big files rather than a 15GB part and small files. If i can reproduce it
>> i can also check whether disabling compression fixes this.
>
> Great. Remember to undo the compression on existing files, or create
> them from scratch.

I create files from scratch - but currently I can't trigger the problem
with my testscript. Even in production load it's not that easy: I need
to process 60-120 files before the error is triggered.

>> No that's not the case. No rsync nor inplace is involved. I'm dumping
>> differences directly from ceph and put them on top of a base image but
>> only for 7 days. So it's not endless fragmenting the file. After 7 days
>> a clean whole image is dumped.
>
> That sounds sane but it's also not at all how you described things to me
> previously ;) But OK.

I'm sorry. Maybe my english is just bad, you got me wrong, or I was
drunk *joke*. It never changed.

> How do you "dump differences directly from
> Ceph"? I'd assume the VM images are RBDs, but it sounds like you're
> somehow using overlayfs.

You can use rbd diff to export differences between two snapshots. So no
overlayfs is involved.

> Anyway..something is off and you successfully cause it while other
> people apparently do not.

Sure - I know that. But I still don't want to switch to zfs.

> Do you still use those nonstandard mount
> options with extremely long transaction flush times?

No, I removed commit=300 just to be sure they do not cause this issue.

Stefan
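For context, the workflow Stefan describes (a base image plus per-day
snapshot deltas replayed on top, with a fresh full image every 7 days)
can be modeled abstractly. This is an illustrative sketch; the
(offset, data) tuples are an assumption for illustration, not the
actual format produced by rbd's export-diff:

```python
def apply_diffs(base: bytes, diffs) -> bytes:
    """Replay ordered (offset, data) deltas on top of a base image.
    Illustrative model of snapshot-diff layering only."""
    img = bytearray(base)
    for offset, data in diffs:
        end = offset + len(data)
        if end > len(img):                    # a write may grow the image
            img.extend(b"\x00" * (end - len(img)))
        img[offset:end] = data
    return bytes(img)

# Day 1 and day 2 deltas land on top of the previous state; after
# day 7 a clean whole image replaces the stack.
base = b"A" * 16
img = apply_diffs(apply_diffs(base, [(4, b"BB")]), [(14, b"CCCC")])
assert img == b"AAAABBAAAAAAAACCCC"
```

The point of the 7-day cap is visible in the model: the number of
layered deltas per image stays bounded, so the backing file is not
fragmented without limit.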
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
On 09/28/16 15:06, Stefan Priebe - Profihost AG wrote:
>
> Yes this is 4.4.22 and no i don't have qgroups enabled so it can't help.
>
> # btrfs qgroup show /path/
> ERROR: can't perform the search - No such file or directory
> ERROR: can't list qgroups: No such file or director
>
> This is the same output on all backup machines.

OK, that is really good to know (your other mails arrived just after I
sent mine). The fact that you see this problem with all kernels - even
with 4.8rc - *and* on all machines is good (in a way) because it means I
haven't messed up anything, and we're not chasing ghosts caused by
broken backport patches.

>> would be unfortunate, but you could try to disable compression for a
>> while and see what happens, assuming the space requirements allow this
>> experiment.
> Good idea but it does not. I hope i can reproduce this with my already
> existing testscript which i've now bumped to use a 37TB partition and
> big files rather than a 15GB part and small files. If i can reproduce it
> i can also check whether disabling compression fixes this.

Great. Remember to undo the compression on existing files, or create
them from scratch.

> No that's not the case. No rsync nor inplace is involved. I'm dumping
> differences directly from ceph and put them on top of a base image but
> only for 7 days. So it's not endless fragmenting the file. After 7 days
> a clean whole image is dumped.

That sounds sane but it's also not at all how you described things to me
previously ;) But OK. How do you "dump differences directly from Ceph"?
I'd assume the VM images are RBDs, but it sounds like you're somehow
using overlayfs.

> yes and no - this is not ideal and even very slow if your customers need
> backups on a daily basis. So you must be able to mount a specific backup
> very fast. And stacking on demand is mostly too slow - but this is far
> away from the topic in this thread.
I understand the desire to mount & immediately access backups - it's
what I do here at home too (every machine can access its own last #n
backups via NFS) and it's very useful.

Anyway..something is off and you successfully cause it while other
people apparently do not. Do you still use those nonstandard mount
options with extremely long transaction flush times?

-h
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear Holger,

first, thanks for your long e-mail.

Am 28.09.2016 um 14:47 schrieb Holger Hoffstätte:
> On 09/28/16 13:35, Wang Xiaoguang wrote:
>> hello,
>>
>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> is there any chance anybody wants to work with me on the following
>>> issue?
>> Though I'm also somewhat new to btrfs, I'd like to.
>>
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>>> reserved=0, may_use=1808490201088, readonly=0
>>>
>>> i get this nearly every day.
>>>
>>> Here are some msgs collected from today and yesterday from different
>>> servers:
>>> BTRFS: space_info 4 has 18446742182612910080 free, is not full
>>> BTRFS: space_info 4 has 18446742254739439616 free, is not full
>>> BTRFS: space_info 4 has 18446743980225085440 free, is not full
>>> BTRFS: space_info 4 has 18446743619906420736 free, is not full
>>> BTRFS: space_info 4 has 18446743647369576448 free, is not full
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>
>>> What i tried so far without success:
>>> - use vanilla 4.8-rc8 kernel
>>> - use latest vanilla 4.4 kernel
>>> - use latest 4.4 kernel + patches from holger hoffstaette
>
> Was that 4.4.22? It contains a patch by Goldwyn Rodrigues called
> "Prevent qgroup->reserved from going subzero" which should prevent
> this from happening. This should only affect filesystems with enabled
> quota; you said you didn't have quota enabled, yet some quota-only
> patches caused problems on your system (despite being scheduled for
> 4.9 and apparently working fine everywhere else, even when I
> specifically tested them *with* quota enabled).

Yes, this is 4.4.22, and no, I don't have qgroups enabled, so it can't
help.
# btrfs qgroup show /path/
ERROR: can't perform the search - No such file or directory
ERROR: can't list qgroups: No such file or director

This is the same output on all backup machines.

> It means either:
> - you tried my patchset for 4.4.21 (i.e. *without* the above patch)
>   and should bump to .22 right away

No, it's 4.4.22.

> - you _do_ have qgroups enabled for some reason (systemd?)

No, see above - but yes, I use systemd.

> - your fs is corrupted and needs nuking

If this is the case, all FS on 5 servers must be corrupted, and all of
them were installed at a different date / year. The newest one just 5
months ago with kernel 4.1, the others with 3.18. Also a lot of other
systems with just 100-900GB of space are working fine.

> - you did something else entirely

No idea what this could be.

> There is also the chance that your use of compress-force (or rather
> compression in general) causes leakage; compression runs asynchronously
> and I wouldn't be surprised if that is still full of racy races..which
> would be unfortunate, but you could try to disable compression for a
> while and see what happens, assuming the space requirements allow this
> experiment.

Good idea, but it does not. I hope I can reproduce this with my already
existing testscript, which I've now bumped to use a 37TB partition and
big files rather than a 15GB part and small files. If I can reproduce it,
I can also check whether disabling compression fixes this.

What speaks against this is that I also have a MariaDB server which has
run fine for two years with compress-force, but it uses only < 100GB of
files and also does not create and remove them on a daily basis.

> You have also not told us whether this happens only on one (potentially
> corrupted/confused) fs or on every one - my impression was that you have
> several sharded backup filesystems/machines; not sure if that is still
> the case. If it happens only on one specific fs chances are it's hosed.

It happens on all of them - sorry if I missed this.
>> I also met enospc error in 4.8-rc6 when doing big file create and
>> delete tests; for my cases, I have written some patches to fix it.
>> Would you please apply my patches to have a try:
>> btrfs: try to satisfy metadata requests when every flush_space() returns
>> btrfs: try to write enough delalloc bytes when reclaiming metadata space
>> btrfs: make shrink_delalloc() try harder to reclaim metadata space
>
> These are all in my series for 4.4.22 and seem to work fine, however
> Stefan's workload has nothing directly to do with big files; instead
> it's the worst case scenario in terms of fragmentation (of huge files)
> and a huge number of extents: incremental backups of VMs via rsync
> --inplace with forced compression.

No, that's not the case. No rsync nor inplace is involved. I'm dumping
differences directly from ceph and put them on top of a base image, but
only for 7 days. So it's not endlessly fragmenting the file. After 7
days a clean whole image is dumped.

> IMHO this way of making backups is suboptimal in basically every possible
> way, despite its convenience appeal. With such huge space requirements
> it would be more effectiv
Re: [PATCH v2 0/6] Btrfs: free space tree and sanity test fixes
On Thursday, September 22, 2016 05:22:31 PM Omar Sandoval wrote:
> From: Omar Sandoval
>
> This is v2 of my earlier series "Btrfs: fix free space tree
> bitmaps+tests on big-endian systems" [1]. Patches 1, 4, and 5 are the
> same as patches 1, 2, and 3 from the original series. I've added patch 2
> to fix another bug I noticed (an xfstest went out earlier). Patch 3 is
> the result of the earlier discussion here [2]. Finally, patch 6 was
> necessary to get the sanity tests to run on my MIPS emulator.
>
> This series applies to v4.8-rc7. The sanity tests pass on both x86-64
> and MIPS, and there are no xfstests regressions. Chandan and Anatoly,
> could you test these out as well?

Hello Omar,

I have executed xfstests on a big endian ppc64 guest with the
'MOUNT_OPTIONS="-o space_cache=v2"' config option. I have also executed
generic/127 on a filesystem created using "fragment-free-space-tree.py"
that you had provided some time ago. I did not notice any regressions
during the test runs.

Tested-by: Chandan Rajendra

> I'm working on the btrfs-progs follow up, but these patches are safe
> without that -- the new FREE_SPACE_TREE_VALID bit will stop all versions
> of btrfs-progs from mounting read-write.
>
> Thanks!
> 1: http://marc.info/?l=linux-btrfs&m=146853909905570&w=2
> 2: http://marc.info/?l=linux-btrfs&m=147448992301110&w=2
>
> Cc: Chandan Rajendra
> Cc: Anatoly Pugachev
>
> Omar Sandoval (6):
>   Btrfs: fix free space tree bitmaps on big-endian systems
>   Btrfs: fix mount -o clear_cache,space_cache=v2
>   Btrfs: catch invalid free space trees
>   Btrfs: fix extent buffer bitmap tests on big-endian systems
>   Btrfs: expand free space tree sanity tests to catch endianness bug
>   Btrfs: use less memory for delalloc sanity tests
>
>  fs/btrfs/ctree.h                       |   3 +-
>  fs/btrfs/disk-io.c                     |  33 ---
>  fs/btrfs/extent_io.c                   |  64 +
>  fs/btrfs/extent_io.h                   |  22 +
>  fs/btrfs/free-space-tree.c             |  19 ++--
>  fs/btrfs/tests/extent-io-tests.c       |  95 +++
>  fs/btrfs/tests/free-space-tree-tests.c | 164 +++--
>  include/uapi/linux/btrfs.h             |  10 +-
>  8 files changed, 261 insertions(+), 149 deletions(-)

--
chandan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
On 09/28/16 13:35, Wang Xiaoguang wrote:
> hello,
>
> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> is there any chance anybody wants to work with me on the following
>> issue?
> Though I'm also somewhat new to btrfs, I'd like to.
>
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>> reserved=0, may_use=1808490201088, readonly=0
>>
>> i get this nearly every day.
>>
>> Here are some msgs collected from today and yesterday from different
>> servers:
>> BTRFS: space_info 4 has 18446742182612910080 free, is not full
>> BTRFS: space_info 4 has 18446742254739439616 free, is not full
>> BTRFS: space_info 4 has 18446743980225085440 free, is not full
>> BTRFS: space_info 4 has 18446743619906420736 free, is not full
>> BTRFS: space_info 4 has 18446743647369576448 free, is not full
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>
>> What i tried so far without success:
>> - use vanilla 4.8-rc8 kernel
>> - use latest vanilla 4.4 kernel
>> - use latest 4.4 kernel + patches from holger hoffstaette

Was that 4.4.22? It contains a patch by Goldwyn Rodrigues called
"Prevent qgroup->reserved from going subzero" which should prevent
this from happening. This should only affect filesystems with enabled
quota; you said you didn't have quota enabled, yet some quota-only
patches caused problems on your system (despite being scheduled for
4.9 and apparently working fine everywhere else, even when I
specifically tested them *with* quota enabled).

So, long story short: something doesn't add up. It means either:

- you tried my patchset for 4.4.21 (i.e. *without* the above patch)
  and should bump to .22 right away
- you _do_ have qgroups enabled for some reason (systemd?)
- your fs is corrupted and needs nuking
- you did something else entirely
- unknown unknowns, aka.
¯\_(ツ)_/¯

There is also the chance that your use of compress-force (or rather
compression in general) causes leakage; compression runs asynchronously
and I wouldn't be surprised if that is still full of racy races..which
would be unfortunate, but you could try to disable compression for a
while and see what happens, assuming the space requirements allow this
experiment.

You have also not told us whether this happens only on one (potentially
corrupted/confused) fs or on every one - my impression was that you have
several sharded backup filesystems/machines; not sure if that is still
the case. If it happens only on one specific fs, chances are it's hosed.

> I also met enospc error in 4.8-rc6 when doing big file create and
> delete tests; for my cases, I have written some patches to fix it.
> Would you please apply my patches to have a try:
> btrfs: try to satisfy metadata requests when every flush_space() returns
> btrfs: try to write enough delalloc bytes when reclaiming metadata space
> btrfs: make shrink_delalloc() try harder to reclaim metadata space

These are all in my series for 4.4.22 and seem to work fine; however,
Stefan's workload has nothing directly to do with big files. Instead
it's the worst case scenario in terms of fragmentation (of huge files)
and a huge number of extents: incremental backups of VMs via rsync
--inplace with forced compression.

IMHO this way of making backups is suboptimal in basically every
possible way, despite its convenience appeal. With such huge space
requirements it would be more effective to have a "current backup" to
rsync into and then take a snapshot (for fs consistency), pack the
snapshot into a tar.gz (massively better compression than with btrfs),
dump them into your Ceph cluster as objects with expiry (preferably a
separate EC pool), and then immediately delete the snapshot from the
local fs. That should relieve the landing fs from getting overloaded by
COWing and too many snapshots (approx. #VMs * #versions).
The obvious downside is that restoring an archived snapshot would
require some creative efforts. Other alternatives exist, but are
probably even more (too) expensive.

-h
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Am 28.09.2016 um 14:10 schrieb Wang Xiaoguang:
> hello,
>
> On 09/28/2016 08:02 PM, Stefan Priebe - Profihost AG wrote:
>> Hi Xiaoguang Wang,
>>
>> Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang:
>>> hello,
>>>
>>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>>>> Dear list,
>>>>
>>>> is there any chance anybody wants to work with me on the following
>>>> issue?
>>> Though I'm also somewhat new to btrfs, I'd like to.
>>>
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>>>> reserved=0, may_use=1808490201088, readonly=0
>>>>
>>>> i get this nearly every day.
>>>>
>>>> Here are some msgs collected from today and yesterday from different
>>>> servers:
>>>> BTRFS: space_info 4 has 18446742182612910080 free, is not full
>>>> BTRFS: space_info 4 has 18446742254739439616 free, is not full
>>>> BTRFS: space_info 4 has 18446743980225085440 free, is not full
>>>> BTRFS: space_info 4 has 18446743619906420736 free, is not full
>>>> BTRFS: space_info 4 has 18446743647369576448 free, is not full
>>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>>
>>>> What i tried so far without success:
>>>> - use vanilla 4.8-rc8 kernel
>>>> - use latest vanilla 4.4 kernel
>>>> - use latest 4.4 kernel + patches from holger hoffstaette
>>>> - use clear_cache,space_cache=v2
>>>> - use clear_cache,space_cache=v1
>>>>
>>>> But all tries result in ENOSPC after a short period of time doing
>>>> backups.
>>> I also met enospc error in 4.8-rc6 when doing big file create and
>>> delete tests; for my cases, I have written some patches to fix it.
>>> Would you please apply my patches to have a try:
>>> btrfs: try to satisfy metadata requests when every flush_space() returns
>>> btrfs: try to write enough delalloc bytes when reclaiming metadata space
>>> btrfs: make shrink_delalloc() try harder to reclaim metadata space
>>> You can find them in btrfs mail list.
>> those are already in the patchset from holger:
>>
>> So i have these in my testing patchset (latest 4.4 kernel + patches
>> from holger hoffstaette):
>>
>> btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch
>> btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch
>> btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch
>
> OK, I see.
> But given that you often run into enospc errors, can you work out a
> reproduce script according to your workload? That will give us great help.

I already tried that, but it wasn't working. It seems I need a test
device with +20TB and I need to create files that big in the tests. But
that isn't easy. Currently I've no test hardware that big. Maybe I
should try that on a production server.

Stefan

> Regards,
> Xiaoguang Wang
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
hello,

On 09/28/2016 08:02 PM, Stefan Priebe - Profihost AG wrote:
> Hi Xiaoguang Wang,
>
> Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang:
>> hello,
>>
>> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>>> Dear list,
>>>
>>> is there any chance anybody wants to work with me on the following
>>> issue?
>> Though I'm also somewhat new to btrfs, I'd like to.
>>
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>>> reserved=0, may_use=1808490201088, readonly=0
>>>
>>> i get this nearly every day.
>>>
>>> Here are some msgs collected from today and yesterday from different
>>> servers:
>>> BTRFS: space_info 4 has 18446742182612910080 free, is not full
>>> BTRFS: space_info 4 has 18446742254739439616 free, is not full
>>> BTRFS: space_info 4 has 18446743980225085440 free, is not full
>>> BTRFS: space_info 4 has 18446743619906420736 free, is not full
>>> BTRFS: space_info 4 has 18446743647369576448 free, is not full
>>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>>
>>> What i tried so far without success:
>>> - use vanilla 4.8-rc8 kernel
>>> - use latest vanilla 4.4 kernel
>>> - use latest 4.4 kernel + patches from holger hoffstaette
>>> - use clear_cache,space_cache=v2
>>> - use clear_cache,space_cache=v1
>>>
>>> But all tries result in ENOSPC after a short period of time doing
>>> backups.
>> I also met enospc error in 4.8-rc6 when doing big file create and
>> delete tests; for my cases, I have written some patches to fix it.
>> Would you please apply my patches to have a try:
>> btrfs: try to satisfy metadata requests when every flush_space() returns
>> btrfs: try to write enough delalloc bytes when reclaiming metadata space
>> btrfs: make shrink_delalloc() try harder to reclaim metadata space
>> You can find them in btrfs mail list.
> those are already in the patchset from holger:
>
> So i have these in my testing patchset (latest 4.4 kernel + patches
> from holger hoffstaette):
>
> btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch
> btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch
> btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch

OK, I see.
But given that you often run into enospc errors, can you work out a
reproduce script according to your workload? That will give us great help.

Regards,
Xiaoguang Wang

> Greets,
> Stefan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
Hi Xiaoguang Wang,

Am 28.09.2016 um 13:35 schrieb Wang Xiaoguang:
> hello,
>
> On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
>> Dear list,
>>
>> is there any chance anybody wants to work with me on the following
>> issue?
> Though I'm also somewhat new to btrfs, I'd like to.
>
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
>> reserved=0, may_use=1808490201088, readonly=0
>>
>> i get this nearly every day.
>>
>> Here are some msgs collected from today and yesterday from different
>> servers:
>> BTRFS: space_info 4 has 18446742182612910080 free, is not full
>> BTRFS: space_info 4 has 18446742254739439616 free, is not full
>> BTRFS: space_info 4 has 18446743980225085440 free, is not full
>> BTRFS: space_info 4 has 18446743619906420736 free, is not full
>> BTRFS: space_info 4 has 18446743647369576448 free, is not full
>> BTRFS: space_info 4 has 18446742286429913088 free, is not full
>>
>> What i tried so far without success:
>> - use vanilla 4.8-rc8 kernel
>> - use latest vanilla 4.4 kernel
>> - use latest 4.4 kernel + patches from holger hoffstaette
>> - use clear_cache,space_cache=v2
>> - use clear_cache,space_cache=v1
>>
>> But all tries result in ENOSPC after a short period of time doing
>> backups.
> I also met enospc error in 4.8-rc6 when doing big file create and
> delete tests; for my cases, I have written some patches to fix it.
> Would you please apply my patches to have a try:
> btrfs: try to satisfy metadata requests when every flush_space() returns
> btrfs: try to write enough delalloc bytes when reclaiming metadata space
> btrfs: make shrink_delalloc() try harder to reclaim metadata space
> You can find them in btrfs mail list.
those are already in the patchset from holger:

So I have these in my testing patchset (latest 4.4 kernel + patches from
holger hoffstaette):

btrfs-20160921-try-to-satisfy-metadata-requests-when-every-flush_space()-returns.patch
btrfs-20160921-try-to-write-enough-delalloc-bytes-when-reclaiming-metadata-space.patch
btrfs-20160922-make-shrink_delalloc()-try-harder-to-reclaim-metadata-space.patch

Greets,
Stefan

> Regards,
> Xiaoguang Wang
>
>> Greets,
>> Stefan
Re: BTRFS: space_info 4 has 18446742286429913088 free, is not full
hello,

On 09/28/2016 07:15 PM, Stefan Priebe - Profihost AG wrote:
> Dear list,
>
> is there any chance anybody wants to work with me on the following issue?
Though I'm also somewhat new to btrfs, I'd like to.

> BTRFS: space_info 4 has 18446742286429913088 free, is not full
> BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
> reserved=0, may_use=1808490201088, readonly=0
>
> I get this nearly every day.
>
> Here are some messages collected from today and yesterday from different
> servers:
> | BTRFS: space_info 4 has 18446742182612910080 free, is not full |
> | BTRFS: space_info 4 has 18446742254739439616 free, is not full |
> | BTRFS: space_info 4 has 18446743980225085440 free, is not full |
> | BTRFS: space_info 4 has 18446743619906420736 free, is not full |
> | BTRFS: space_info 4 has 18446743647369576448 free, is not full |
> | BTRFS: space_info 4 has 18446742286429913088 free, is not full
>
> What I tried so far without success:
> - use vanilla 4.8-rc8 kernel
> - use latest vanilla 4.4 kernel
> - use latest 4.4 kernel + patches from holger hoffstaette
> - use clear_cache,space_cache=v2
> - use clear_cache,space_cache=v1
>
> But all tries result in ENOSPC after a short period of time doing backups.
I also hit ENOSPC errors in 4.8-rc6 when running big-file create and delete
tests; for my cases, I have written some patches to fix them.
Would you please apply my patches and have a try:
btrfs: try to satisfy metadata requests when every flush_space() returns
btrfs: try to write enough delalloc bytes when reclaiming metadata space
btrfs: make shrink_delalloc() try harder to reclaim metadata space
You can find them on the btrfs mailing list.
Regards,
Xiaoguang Wang

> Greets,
> Stefan
BTRFS: space_info 4 has 18446742286429913088 free, is not full
Dear list,

is there any chance anybody wants to work with me on the following issue?

BTRFS: space_info 4 has 18446742286429913088 free, is not full
BTRFS: space_info total=98247376896, used=77036814336, pinned=0,
reserved=0, may_use=1808490201088, readonly=0

I get this nearly every day.

Here are some messages collected from today and yesterday from different
servers:

| BTRFS: space_info 4 has 18446742182612910080 free, is not full |
| BTRFS: space_info 4 has 18446742254739439616 free, is not full |
| BTRFS: space_info 4 has 18446743980225085440 free, is not full |
| BTRFS: space_info 4 has 18446743619906420736 free, is not full |
| BTRFS: space_info 4 has 18446743647369576448 free, is not full |
| BTRFS: space_info 4 has 18446742286429913088 free, is not full

What I tried so far without success:
- use vanilla 4.8-rc8 kernel
- use latest vanilla 4.4 kernel
- use latest 4.4 kernel + patches from holger hoffstaette
- use clear_cache,space_cache=v2
- use clear_cache,space_cache=v1

But all tries result in ENOSPC after a short period of time doing backups.

Greets,
Stefan
[PATCH 2/2] btrfs-progs: Remove unnecessary parameter to clear_extent_uptodate
Signed-off-by: Qu Wenruo
---
 disk-io.c   | 4 ++--
 extent_io.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/disk-io.c b/disk-io.c
index 854c285..08d3f79 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -241,7 +241,7 @@ static int verify_parent_transid(struct extent_io_tree *io_tree,
 		ret = 1;
 out:
-	clear_extent_buffer_uptodate(io_tree, eb);
+	clear_extent_buffer_uptodate(eb);
 	return ret;
 }
@@ -976,7 +976,7 @@ static int setup_root_or_create_block(struct btrfs_fs_info *fs_info,
 			btrfs_find_create_tree_block(fs_info, 0, nodesize);
 		if (!info_root->node)
 			return -ENOMEM;
-		clear_extent_buffer_uptodate(NULL, info_root->node);
+		clear_extent_buffer_uptodate(info_root->node);
 	}
 	return 0;
diff --git a/extent_io.h b/extent_io.h
index 208c4fe..bd6cf9e 100644
--- a/extent_io.h
+++ b/extent_io.h
@@ -125,8 +125,7 @@ static inline int set_extent_buffer_uptodate(struct extent_buffer *eb)
 	return 0;
 }
 
-static inline int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
-					       struct extent_buffer *eb)
+static inline int clear_extent_buffer_uptodate(struct extent_buffer *eb)
 {
 	eb->flags &= ~EXTENT_UPTODATE;
 	return 0;
-- 
2.10.0
[PATCH 1/2] btrfs-progs: raid56: Add support for raid5 to calculate any stripe
Add a new function raid5_gen_result() to calculate raid5 parity or recover
a data stripe.

Since raid6.c now handles both raid5 and raid6, rename it to raid56.c.

Signed-off-by: Qu Wenruo
---
 Makefile.in         |  2 +-
 disk-io.h           |  3 ++-
 raid6.c => raid56.c | 45 +
 volumes.c           | 36 +++-
 4 files changed, 63 insertions(+), 23 deletions(-)
 rename raid6.c => raid56.c (71%)

diff --git a/Makefile.in b/Makefile.in
index 20b740a..0ae2cd5 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -90,7 +90,7 @@ CHECKER_FLAGS := -include $(check_defs) -D__CHECKER__ \
 objects = ctree.o disk-io.o kernel-lib/radix-tree.o extent-tree.o print-tree.o \
 	  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
 	  extent-cache.o extent_io.o volumes.o utils.o repair.o \
-	  qgroup.o raid6.o free-space-cache.o kernel-lib/list_sort.o props.o \
+	  qgroup.o raid56.o free-space-cache.o kernel-lib/list_sort.o props.o \
 	  ulist.o qgroup-verify.o backref.o string-table.o task-utils.o \
 	  inode.o file.o find-root.o free-space-tree.o help.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
diff --git a/disk-io.h b/disk-io.h
index 1080fc1..9fc7e92 100644
--- a/disk-io.h
+++ b/disk-io.h
@@ -190,7 +190,8 @@ int write_tree_block(struct btrfs_trans_handle *trans,
 int write_and_map_eb(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		     struct extent_buffer *eb);
 
-/* raid6.c */
+/* raid56.c */
 void raid6_gen_syndrome(int disks, size_t bytes, void **ptrs);
+int raid5_gen_result(int nr_devs, size_t stripe_len, int dest, void **data);
 
 #endif
diff --git a/raid6.c b/raid56.c
similarity index 71%
rename from raid6.c
rename to raid56.c
index 833df5f..2953d61 100644
--- a/raid6.c
+++ b/raid56.c
@@ -26,6 +26,7 @@
 #include "kerncompat.h"
 #include "ctree.h"
 #include "disk-io.h"
+#include "utils.h"
 
 /*
  * This is the C data type to use
@@ -107,3 +108,47 @@ void raid6_gen_syndrome(int disks, size_t bytes, void **ptrs)
 	}
 }
 
+static void xor_range(void *src, void *dst, size_t size)
+{
+	while (size) {
+		*(unsigned long *)dst ^= *(unsigned long *)src;
+		src += sizeof(unsigned long);
+		dst += sizeof(unsigned long);
+		size -= sizeof(unsigned long);
+	}
+}
+
+/*
+ * Generate desired data/parity for RAID5
+ *
+ * @nr_devs:	Total number of devices, including parity
+ * @stripe_len:	Stripe length
+ * @data:	Data, with special layout:
+ *		data[0]:	 Data stripe 0
+ *		data[nr_devs-2]: Last data stripe
+ *		data[nr_devs-1]: RAID5 parity
+ * @dest:	Which data to generate. Should follow the above data layout
+ */
+int raid5_gen_result(int nr_devs, size_t stripe_len, int dest, void **data)
+{
+	int i;
+	char *buf = data[dest];
+
+	if (dest >= nr_devs || nr_devs < 2) {
+		error("invalid parameter for %s", __func__);
+		return -EINVAL;
+	}
+	/* Quick hack, 2 devs RAID5 is just RAID1, no need to calculate */
+	if (nr_devs == 2) {
+		memcpy(data[dest], data[1 - dest], stripe_len);
+		return 0;
+	}
+	/* Just in case */
+	memset(buf, 0, stripe_len);
+	for (i = 0; i < nr_devs; i++) {
+		if (i == dest)
+			continue;
+		xor_range(data[i], buf, stripe_len);
+	}
+	return 0;
+}
diff --git a/volumes.c b/volumes.c
index da79751..718e67c 100644
--- a/volumes.c
+++ b/volumes.c
@@ -2108,12 +2108,14 @@ int write_raid56_with_parity(struct btrfs_fs_info *info,
 {
 	struct extent_buffer **ebs, *p_eb = NULL, *q_eb = NULL;
 	int i;
-	int j;
 	int ret;
 	int alloc_size = eb->len;
+	void **pointers;
 
-	ebs = kmalloc(sizeof(*ebs) * multi->num_stripes, GFP_NOFS);
-	BUG_ON(!ebs);
+	ebs = malloc(sizeof(*ebs) * multi->num_stripes);
+	pointers = malloc(sizeof(void *) * multi->num_stripes);
+	if (!ebs || !pointers)
+		return -ENOMEM;
 
 	if (stripe_len > alloc_size)
 		alloc_size = stripe_len;
@@ -2143,12 +2145,6 @@ int write_raid56_with_parity(struct btrfs_fs_info *info,
 		q_eb = new_eb;
 	}
 	if (q_eb) {
-		void **pointers;
-
-		pointers = kmalloc(sizeof(*pointers) * multi->num_stripes,
-				   GFP_NOFS);
-		BUG_ON(!pointers);
-
 		ebs[multi->num_stripes - 2] = p_eb;
 		ebs[multi->num_stripes - 1] = q_eb;
 
@@ -2159,17 +2155,14 @@ int write_raid56_with_parity(struct btrfs_fs_info *info,
 		kfree(pointers);
 	} else {
 		ebs[multi->num_stripes - 1] = p_eb;
-		memcpy(p_eb->data, ebs[0]->data, stripe_len);
-		fo