Re: Move data and mount point to subvolume
On Sunday, 16 September 2018 at 14:50:04 CEST, Hans van Kranenburg wrote:
> The last example, where you make a subvolume and move everything into
> it, will not do what you want. Since a subvolume is a separate new
> directory/file hierarchy, mv will turn into a cp and rm operation
> (without warning you), probably destroying information about data shared
> between files.

I thought that wasn't true anymore. The NEWS file of coreutils contains this
(for version 8.24):

  mv will try a reflink before falling back to a standard copy, which is
  more efficient when moving files across BTRFS subvolume boundaries.

-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup

signature.asc
Description: This is a digitally signed message part.
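[Editor's note] One way to check which behavior your coreutils actually exhibits is to compare physical extent offsets before and after the move. A minimal sketch, assuming a btrfs filesystem mounted at /mnt, root privileges, and coreutils >= 8.24 (all paths are examples, not from the thread):

```shell
# Create a subvolume and move a file across the subvolume boundary.
btrfs subvolume create /mnt/newvol
filefrag -v /mnt/somefile            # note the physical_offset column
mv /mnt/somefile /mnt/newvol/        # cross-subvolume: mv copies, then unlinks
filefrag -v /mnt/newvol/somefile     # same physical offsets => reflink, not a full data copy
```

If the physical offsets match, mv used a reflink and the data was shared rather than rewritten.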
Re: File permissions lost during send/receive?
On Tuesday, 24 July 2018 at 21:46:14 CEST, Duncan wrote:
> Andrei Borzenkov posted on Tue, 24 Jul 2018 20:53:15 +0300 as excerpted:
> > 24.07.2018 15:16, Marc Joliet wrote:
> >> Hi list,
> >>
> >> (Preemptive note: this was with btrfs-progs 4.15.1, I have since
> >> upgraded to 4.17. My kernel version is 4.14.52-gentoo.)
> >>
> >> I recently had to restore the root FS of my desktop from backup (extent
> >> tree corruption; not sure how, possibly a loose SATA cable?).
> >> Everything was fine, even if restoring was slower than expected.
> >> However, I encountered two files with permission problems, namely:
> >>
> >> - /bin/ping, which caused running ping as a normal user to fail due to
> >>   missing permissions, and
> >>
> >> - /sbin/unix_chkpwd (part of PAM), which prevented me from unlocking
> >>   the KDE Plasma lock screen; I needed to log into a TTY and run
> >>   "loginctl unlock-session".
> >>
> >> Both were easily fixed by reinstalling the affected packages (iputils
> >> and pam), but I wonder why this happened after restoring from backup.
> >>
> >> I originally thought it was related to the SUID bit not being set,
> >> because of the explanation in the ping(8) man page (section
> >> "SECURITY"), but cannot find evidence of that -- that is, after
> >> reinstallation, "ls -lh" does not show the sticky bit being set, or any
> >> other special permission bits, for that matter:
> >>
> >> % ls -lh /bin/ping /sbin/unix_chkpwd
> >> -rwx--x--x 1 root root 60K 22. Jul 14:47 /bin/ping*
> >> -rwx--x--x 1 root root 31K 23. Jul 00:21 /sbin/unix_chkpwd*
> >>
> >> (Note: no ACLs are set, either.)
> >
> > What does "getcap /bin/ping" say? You may need to install the package
> > providing getcap (libcap-progs here on openSUSE).
>
> sys-libs/libcap on gentoo.
> Here's what I get:
>
> $ getcap /bin/ping
> /bin/ping = cap_net_raw+ep

On my system I get:

% sudo getcap /bin/ping /sbin/unix_chkpwd
/bin/ping = cap_net_raw+ep
/sbin/unix_chkpwd = cap_dac_override+ep

> (getcap on unix_chkpwd returns nothing, but while I use kde/plasma I
> don't normally use the lockscreen at all, so for all I know that's broken
> here too.)
>
> As hinted, it's almost certainly a problem with filecaps. While I'll
> freely admit to not fully understanding how file-caps work, and my
> use-case doesn't use send/receive, I do recall filecaps are what ping uses
> these days instead of SUID/SGID (on gentoo it'd be iputils' filecaps and
> possibly caps USE flags controlling this for ping), and also that btrfs
> send/receive did have a recent bugfix related to the extended attributes
> normally used to record filecaps, so the symptoms match the bug and
> that's probably what you were seeing.

Ah, thanks, that looks like it was it! I didn't think about extended
attributes, but including "xattr" in my search yielded the following patches
from April this year (this turns out to be the vaguely remembered
patch/discussion that I mentioned):

[PATCH] btrfs: add chattr support for send/receive
[PATCH] btrfs: add verify chattr support for send/receive test

However, IIUC those changes are going to be merged along with other changes
into a v2 of the send protocol, so until that gets finalized, this is
something to be aware of for those like me who use send/receive for backups.

Anyway, thanks for pointing me in the right direction! At least now I
understand what happened.

Greetings
-- 
Marc Joliet
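[Editor's note] Given the getcap output above, the lost capabilities could also have been restored directly with setcap instead of reinstalling the packages. A sketch, assuming root privileges and sys-libs/libcap installed (the capability values are taken verbatim from the thread):

```shell
# Re-apply the file capabilities that the restore lost.
setcap cap_net_raw+ep /bin/ping
setcap cap_dac_override+ep /sbin/unix_chkpwd
getcap /bin/ping /sbin/unix_chkpwd    # verify both are set again
```

Reinstalling the packages, as done in the thread, is the safer route on Gentoo since the package manager knows the intended capability set.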
File permissions lost during send/receive?
Hi list,

(Preemptive note: this was with btrfs-progs 4.15.1, I have since upgraded to
4.17. My kernel version is 4.14.52-gentoo.)

I recently had to restore the root FS of my desktop from backup (extent tree
corruption; not sure how, possibly a loose SATA cable?). Everything was fine,
even if restoring was slower than expected. However, I encountered two files
with permission problems, namely:

- /bin/ping, which caused running ping as a normal user to fail due to
  missing permissions, and

- /sbin/unix_chkpwd (part of PAM), which prevented me from unlocking the KDE
  Plasma lock screen; I needed to log into a TTY and run "loginctl
  unlock-session".

Both were easily fixed by reinstalling the affected packages (iputils and
pam), but I wonder why this happened after restoring from backup.

I originally thought it was related to the SUID bit not being set, because of
the explanation in the ping(8) man page (section "SECURITY"), but cannot find
evidence of that -- that is, after reinstallation, "ls -lh" does not show the
sticky bit being set, or any other special permission bits, for that matter:

% ls -lh /bin/ping /sbin/unix_chkpwd
-rwx--x--x 1 root root 60K 22. Jul 14:47 /bin/ping*
-rwx--x--x 1 root root 31K 23. Jul 00:21 /sbin/unix_chkpwd*

(Note: no ACLs are set, either.)

I do remember the qcheck program (a Gentoo-specific program that checks the
integrity of installed packages) complaining about wrong file permissions,
but I didn't save its output, and there's a chance it *might* have been
because I ran qcheck without root permissions :-/ .

I vaguely remember some patches and/or discussion regarding permission
transfer issues with send/receive on this ML, but didn't find anything after
searching through my email archive, so I might be misremembering.

Does anybody have any idea what possibly went wrong, or any similar
experience to speak of?
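[Editor's note] As the rest of the thread establishes, file capabilities are stored in the security.capability extended attribute, so problems like the above can be checked for directly after a receive. A sketch, assuming getfattr from the attr package; the backup mount point is an example, not from the thread:

```shell
# Compare the capability xattr on the live file and its received copy.
getfattr -n security.capability --absolute-names /bin/ping
getfattr -n security.capability --absolute-names /mnt/backup/bin/ping
# If the first prints a value and the second reports "No such attribute",
# the capability was dropped somewhere in the send/receive round trip.
```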
Greetings
-- 
Marc Joliet
Re: defragmenting best practice?
On Friday, 22 September 2017 at 13:22:52 CEST, Austin S. Hemmelgarn wrote:
> > I'm not sure where Firefox puts its cache, I only use it on very rare
> > occasions. But I think it's going to .cache/mozilla, last time I looked
> > at it.
>
> I'm pretty sure that is correct.

FWIW, on my system Firefox's cache looks like this:

% du -hsc (find .cache/mozilla/firefox/ -type f) | wc -l
9008
% du -hsc (find .cache/mozilla/firefox/ -type f) | sort -h | tail
5,4M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/83CEC8ADA08D9A9658458AB872BE107A216E71C6
5,5M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/C60061B33D3BB91ED45951C922BAA1BB40022CB7
5,7M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/0900D9EA8E3222EB8690348C2482C69308B15A20
5,7M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/F8E90D121B884360E36BCB1735CC5A8B1B7A744B
5,8M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/903C4CD01ABD74E353C7484C6E21A053AAC5DCC2
6,1M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/3A0D4193B009700155811D14A28DBE38C37C0067
6,1M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/startupCache/scriptCache-current.bin
6,5M    .cache/mozilla/firefox/cb236e4s.default-1464421886682/cache2/entries/304405168662C3624D57AF98A74345464F32A0DB
8,8M    .cache/mozilla/firefox/ik7qsfwb.Temp/cache2/entries/BD7CA4125B3AA87D6B16C987741F33C65DBFFFDD
427M    insgesamt

So lots of files, many of which are (I suppose) relatively large, but it does
not look like an "everything in one database" situation to me. (This is with
Firefox 55.0.2.)

-- 
Marc Joliet
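[Editor's note] The commands above are fish shell, where `(...)` is command substitution; with ~9000 files the equivalent bash `$(...)` form can overflow the kernel's argument-length limit. Feeding du NUL-separated names avoids that. A sketch demonstrated on a throwaway directory; for the real thing, point `dir` at ~/.cache/mozilla/firefox (GNU du/sort assumed):

```shell
# Build a small demonstration directory (stand-in for the Firefox cache).
dir=$(mktemp -d)
head -c 2048 /dev/zero > "$dir/big"
head -c 100  /dev/zero > "$dir/small"

# Per-file sizes plus a grand total, largest last -- robust to any file count.
find "$dir" -type f -print0 | du -hc --files0-from=- | sort -h | tail

rm -rf "$dir"
```

The last line of the pipeline's output is the grand total (printed by du's `-c` option).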
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Monday, 6 March 2017 at 00:53:40 CET, Marc Joliet wrote:
> On Wednesday, 1 March 2017 at 19:14:07 CET, you wrote:
> > In any case, I started btrfs-check on the device itself.
>
> *Sigh*, I had to restart it, because I forgot to redirect to a file and
> quite frankly wasn't expecting this flood of output, but here's a summary
> of the output after about 2 days:
> [snip old output]

OK, it finished last night. Here's the summary again:

% wc -l btrfs_check_output_20170303.log
3028222 btrfs_check_output_20170303.log
% grep -v "backref lost" btrfs_check_output_20170303.log | grep -v "check \
(leaf\|node\) failed" | grep -v "lost its parent" | grep -v "referencer count"
checking extents
ERROR: block group[3879328546816 1073741824] used 1072840704 but extent items used 1129164800
ERROR: block group[4163870130176 1073741824] used 1072259072 but extent items used 0
ERROR: block group[4223999672320 1073741824] used 1073664000 but extent items used 1074188288
ERROR: block group[4278760505344 1073741824] used 1073377280 but extent items used 1073901568
ERROR: block group[4406535782400 1073741824] used 1073627136 but extent items used 0
ERROR: extent [3830028140544 4096] referencer bytenr mismatch, wanted: 3830028140544, have: 3826183847936
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Checking filesystem on /dev/sdb2
UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38
found 892572778496 bytes used
err is -5
total csum bytes: 860790216
total tree bytes: 36906336256
total fs tree bytes: 34551476224
total extent tree bytes: 1230610432
btree space waste bytes: 7446885892
file data blocks allocated: 16359581663232
 referenced 2358137831424

> That's right, slowly approaching 1.5 million lines of btrfs-check output!
> That's *way* more than the last time I ran it, when this error happened a
> few weeks ago.

As can be seen above, that ballooned to over 3 million lines.
Since the output is 4.2 MB even after XZ compression, I put it up on my
Dropbox, just in case it's interesting to anybody:

https://www.dropbox.com/s/h6ftqpygfr4vsks/btrfs_check_output_20170303.log.xz?dl=0

Since this is my backup drive, and this is the second time within a month
that it has had problems like this, *and* I've got both the btrfs-image dump
and the btrfs-check output, I'm going to go ahead and reformat, so that my
three computers are finally backed up again.

Oh, and for what it's worth, I did test against a 4.8 kernel, and pretty much
immediately got the "forced RO" error, just like with 4.9. I didn't try
anything older (or newer).

As a last step, I'll probably collect all the information I have and post it
to bugzilla when I have a chance, since others might hit it, too (Kai did
before me, after all).

Greetings
-- 
Marc Joliet
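[Editor's note] The four chained `grep -v` calls used above to summarize the log can be collapsed into a single extended regexp. A sketch, demonstrated on a few sample lines since the real input would be btrfs_check_output_20170303.log from the thread:

```shell
# One ERE filter equivalent to the four chained "grep -v" (BRE) calls.
filter='backref lost|check (leaf|node) failed|lost its parent|referencer count'

printf '%s\n' \
  'ERROR: data extent[3924538445824 4096] backref lost' \
  'checking extents' \
  'ERROR: extent[3878247383040, 8192] referencer count mismatch wanted: 2, have: 3' \
| grep -Ev "$filter"
# prints: checking extents
```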
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Wednesday, 1 March 2017 at 19:14:07 CET, you wrote:
> In any case, I started btrfs-check on the device itself.

*Sigh*, I had to restart it, because I forgot to redirect to a file and quite
frankly wasn't expecting this flood of output, but here's a summary of the
output after about 2 days:

% wc -l btrfs_check_output_20170303.log
1360252 btrfs_check_output_20170303.log
% grep -v "backref lost" btrfs_check_output_20170303.log | grep -v "check \
(leaf\|node\) failed" | grep -v "lost its parent" | grep -v "referencer count"
checking extents
ERROR: block group[3879328546816 1073741824] used 1072840704 but extent items used 1129164800
ERROR: block group[4163870130176 1073741824] used 1072259072 but extent items used 0
ERROR: block group[4223999672320 1073741824] used 1073664000 but extent items used 1074188288
ERROR: block group[4278760505344 1073741824] used 1073377280 but extent items used 1073901568
ERROR: block group[4406535782400 1073741824] used 1073627136 but extent items used 0
ERROR: extent [3830028140544 4096] referencer bytenr mismatch, wanted: 3830028140544, have: 3826183847936
% tail btrfs_check_output_20170303.log
ERROR: data extent[3924538445824 4096] backref lost
ERROR: data extent[3933464903680 4096] backref lost
ERROR: data extent[3924538531840 4096] backref lost
ERROR: data extent[3839131480064 4096] backref lost
ERROR: data extent[3834701750272 4096] backref lost
ERROR: data extent[3873087918080 4096] backref lost
ERROR: data extent[3873072283648 4096] backref lost
ERROR: data extent[3873088090112 8192] backref lost
ERROR: data extent[3873072287744 4096] backref lost
ERROR: data extent[3856294449152 4096] backref lost

That's right, slowly approaching 1.5 million lines of btrfs-check output!
That's *way* more than the last time I ran it, when this error happened a few
weeks ago.
Greetings
-- 
Marc Joliet
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Friday, 3 March 2017 at 09:00:10 CET, Qu Wenruo wrote:
> > FWIW, as per my later messages, after mounting with clear_cache and
> > letting btrfs-cleaner finish, btrfs-check did *not* print out those
> > errors after running again. It's now about two weeks later that the
> > file system is showing problems again.
>
> If btrfs-check didn't print out *any* error, then it should be mostly
> fine. (Unless there is some case we don't expose yet)
>
> The problem should be caused by the kernel AFAIK.

So you think it could be a regression in 4.9? Should I try 4.10? Or is it
more likely just an undiscovered bug?

> > Oh, and just in case it's relevant, the file system was created with
> > btrfs-convert (a long time ago, maybe 1.5 years, though; it was
> > originally ext4).
>
> Not sure if it's related.
> But at least for such an old convert, its chunk layout is somewhat rare
> and sometimes even bug-prone.
>
> Did you balance the btrfs after the convert? If so, it should be more
> like a traditional btrfs then.

Yes, I'm fairly certain I did that, as that is what the btrfs wiki
recommends.

> Personally speaking, I don't think it is related to your bug; it looks
> much like the normal extent tree corruption seen on the mailing list.

OK, so is there anything else I can do?

Greetings
-- 
Marc Joliet
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Friday 03 March 2017 09:09:57 Qu Wenruo wrote: > At 03/02/2017 05:44 PM, Marc Joliet wrote: > > On Wednesday 01 March 2017 19:14:07 Marc Joliet wrote: > >> In any > >> case, I started btrfs-check on the device itself. > > > > OK, it's still running, but the output so far is: > > > > # btrfs check --mode=lowmem --progress /dev/sdb2 > > Checking filesystem on /dev/sdb2 > > UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 > > ERROR: shared extent[3826242740224 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3826442825728 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3826744471552 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827106349056 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827141001216 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827150958592 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827251724288 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827433795584 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827536166912 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827536183296 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3827621646336 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3828179406848 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3828267970560 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3828284530688 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3828714246144 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3828794187776 4096] lost its parent (parent: > > 3827251183616, 
level: 0) > > ERROR: shared extent[3829161340928 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3829373693952 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3830252130304 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3830421159936 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3830439141376 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3830441398272 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3830785138688 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831099297792 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831128768512 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831371513856 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831535570944 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831591952384 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831799398400 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831829250048 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3831829512192 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832011440128 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832011767808 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832023920640 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832024678400 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832027316224 4096] lost its parent (parent: > > 3827251183616, level: 0) > > 
ERROR: shared extent[3832028762112 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832030236672 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832030330880 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832161079296 4096] lost its parent (parent: > > 3827251183616, level: 0) > > ERROR: shared extent[3832164904960 4096] lost its parent (
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Wednesday, 1 March 2017 at 19:14:07 CET, Marc Joliet wrote:
> In any case, I started btrfs-check on the device itself.

OK, it's still running, but the output so far is:

# btrfs check --mode=lowmem --progress /dev/sdb2
Checking filesystem on /dev/sdb2
UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38
ERROR: shared extent[3826242740224 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3826442825728 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3826744471552 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827106349056 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827141001216 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827150958592 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827251724288 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827433795584 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827536166912 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827536183296 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3827621646336 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3828179406848 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3828267970560 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3828284530688 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3828714246144 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3828794187776 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3829161340928 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3829373693952 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3830252130304 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3830421159936 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3830439141376 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3830441398272 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3830785138688 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831099297792 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831128768512 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831371513856 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831535570944 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831591952384 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831799398400 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831829250048 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3831829512192 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832011440128 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832011767808 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832023920640 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832024678400 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832027316224 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832028762112 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832030236672 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832030330880 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832161079296 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832164904960 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832164945920 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3832613765120 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3833727565824 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3833914073088 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3833929310208 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: shared extent[3833930141696 4096] lost its parent (parent: 3827251183616, level: 0)
ERROR: extent[3837768077312, 24576] referencer count mismatch (root: 33174, owner: 1277577, offset: 4767744) wanted: 1, have: 0
[snip many more referencer count mismatches]
ERROR: extent[3878247383040, 8192] referencer count mismatch (root: 33495, owner: 2688918, offset: 3874816) wanted: 2, have: 3
ERROR: block group[3879328546816 1073741824] used 1072840704 but extent
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Thursday, 2 March 2017 at 08:43:53 CET, Qu Wenruo wrote:
> At 02/02/2017 08:01 PM, Marc Joliet wrote:
> > On Sunday, 28 August 2016 at 15:29:08 CEST, Kai Krakow wrote:
> >> Hello list!
> >
> > Hi list
>
> [kernel message snipped]
>
> >> Btrfs --repair refused to repair the filesystem, telling me something
> >> about compressed extents and an unsupported case, wanting me to take an
> >> image and send it to the devs. *sigh*
> >
> > I haven't tried a repair yet; it's a big file system, and btrfs-check is
> > still running:
> >
> > # btrfs check -p /dev/sdd2
> > Checking filesystem on /dev/sdd2
> > UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38
> > parent transid verify failed on 3829276291072 wanted 224274 found 283858
> > parent transid verify failed on 3829276291072 wanted 224274 found 283858
> > parent transid verify failed on 3829276291072 wanted 224274 found 283858
> > parent transid verify failed on 3829276291072 wanted 224274 found 283858
>
> A normal transid error; I can't say much about whether it's harmless, but
> at least something went wrong.
>
> > Ignoring transid failure
> > leaf parent key incorrect 3829276291072
> > bad block 3829276291072
>
> That's somewhat of a big problem for that tree block.
>
> If this tree block is an extent tree block, it's no wonder the kernel
> printed a warning and aborted the transaction.
>
> You could try "btrfs-debug-tree -b 3829276291072 <device>" to show the
> content of the tree block.
# btrfs-debug-tree -b 3829276291072 /dev/sdb2 btrfs-progs v4.9 node 3829276291072 level 1 items 70 free 51 generation 292525 owner 2 fs uuid f97b3cda-15e8-418b-bb9b-235391ef2a38 chunk uuid 1cee580c-3442-4717-9300-8514dd8ff297 key (3828594696192 METADATA_ITEM 0) block 3828933423104 (934798199) gen 292523 key (3828594925568 METADATA_ITEM 0) block 3829427818496 (934918901) gen 292525 key (3828595109888 METADATA_ITEM 0) block 3828895723520 (934788995) gen 292523 key (3828595232768 METADATA_ITEM 0) block 3829202751488 (934863953) gen 292524 key (3828595412992 METADATA_ITEM 0) block 3829097209856 (934838186) gen 292523 key (3828595572736 TREE_BLOCK_REF 33178) block 3829235073024 (934871844) gen 292524 key (3828595744768 METADATA_ITEM 0) block 3829128351744 (934845789) gen 292524 key (3828595982336 METADATA_ITEM 0) block 3829146484736 (934850216) gen 292524 key (3828596187136 METADATA_ITEM 1) block 3829097234432 (934838192) gen 292523 key (3828596387840 TREE_BLOCK_REF 33527) block 3829301653504 (934888099) gen 292525 key (3828596617216 METADATA_ITEM 0) block 3828885737472 (934786557) gen 292523 key (3828596838400 METADATA_ITEM 0) block 3828885741568 (934786558) gen 292523 key (3828597047296 METADATA_ITEM 0) block 3829320552448 (934892713) gen 292525 key (3828597231616 METADATA_ITEM 0) block 3828945653760 (934801185) gen 292523 key (3828597383168 METADATA_ITEM 0) block 3829276299264 (934881909) gen 292525 key (3828597641216 METADATA_ITEM 1) block 3829349351424 (934899744) gen 292525 key (3828597866496 METADATA_ITEM 0) block 3829364776960 (934903510) gen 292525 key (3828598067200 METADATA_ITEM 0) block 3828598321152 (934716387) gen 292522 key (3828598259712 METADATA_ITEM 0) block 3829422968832 (934917717) gen 292525 key (3828598415360 TREE_BLOCK_REF 33252) block 3828885803008 (934786573) gen 292523 key (3828598665216 METADATA_ITEM 0) block 3828937863168 (934799283) gen 292523 key (3828598829056 METADATA_ITEM 0) block 3828885811200 (934786575) gen 292523 key (3828599054336 
METADATA_ITEM 0) block 3829363744768 (934903258) gen 292525 key (3828599246848 METADATA_ITEM 0) block 3828915838976 (934793906) gen 292523 key (3828599504896 METADATA_ITEM 0) block 3829436194816 (934920946) gen 292525 key (3828599672832 METADATA_ITEM 0) block 3828905140224 (934791294) gen 292523 key (3828599771136 METADATA_ITEM 0) block 382923776 (934895831) gen 292525 key (3828599988224 METADATA_ITEM 0) block 3829087199232 (934835742) gen 292523 key (3828600135680 METADATA_ITEM 0) block 3828885827584 (934786579) gen 292523 key (3828600389632 METADATA_ITEM 0) block 3829436284928 (934920968) gen 292525 key (3828600528896 METADATA_ITEM 0) block 3829316214784 (934891654) gen 292525 key (3828600729600 METADATA_ITEM 0) block 3828885905408 (934786598) gen 292523 key (3828600934400 METADATA_ITEM 0) block 3829384486912 (934908322) gen 292525 key (3828601143296 METADATA_ITEM 0) block 3829423611904 (934917874) gen 292525 key (3828601356288 METADATA_ITEM 0) block 3829113688064 (934842209) gen 292524 key (3828601556992 METADATA_ITEM 0) block 3829134540
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Wednesday, 1 March 2017 at 19:14:07 CET, Marc Joliet wrote:
> > > Also, the image is complete, so I only need to find somewhere where I
> > > can upload a 9.4 GB file.
> >
> > Is it a compressed dump? Dumped with btrfs-image -c9?
>
> It was created with:
>
> btrfs-image -s -w /dev/sdb2 - | xz -9 --stdout > ./btrfs_backup_drive_2.img.xz
>
> (Mainly because I felt more comfortable using a separate compression
> utility, not for any rational reason. Although if you really meant
> "image" above, I have the feeling I'll regret this decision.)

Ah, never mind, it's not as big as I was worried it would be:

% xz -l btrfs_backup_drive.img.xz
Str.  Blöcke  Kompr.       Unkompr.  Verh.   Check  Dateiname
   1       1  9.589,6 MiB  81,4 GiB  0,115   CRC64  btrfs_backup_drive.img.xz

So 81.4 GB uncompressed. I have the space to uncompress it if need be.

-- 
Marc Joliet
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Wednesday, 1 March 2017 at 19:14:07 CET, Marc Joliet wrote:
> > And btrfs check --mode=lowmem is also recommended, as in some rare
> > cases low mem mode can detect bugs which the original mode doesn't.
>
> I did see differences in output the last time around (again, see my
> previous messages in this thread), so I'll run with lowmem.

OK, just to prevent confusion: it turns out I never posted the output of
btrfs-check with --mode=lowmem from the first time I was seeing these errors,
but I remember the output being (slightly) different from the original mode.

-- 
Marc Joliet
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Wednesday, 1 March 2017 at 17:32:35 CET, Qu Wenruo wrote:
> At 03/01/2017 04:23 PM, Marc Joliet wrote:
> > On Tuesday, 28 February 2017 at 23:14:54 CET, Marc Joliet wrote:
> >> I think I'm at that point now myself, unless anybody has any other
> >> ideas.
> >
> > For example, could the --init-extent-tree option to btrfs-check help,
> > given that I needed to pass -w to btrfs-image?

(First off, thanks for taking the time to respond!)

> --init-extent-tree should be avoided normally, unless you're 100% sure
> that only the extent tree is corrupted, and you have enough faith that
> --init-extent-tree can finish.
>
> Otherwise it will mostly make things worse.

OK, that's why I asked :) .

> Before trying any RW btrfs check operations, did you run btrfs check on
> the image?

If by image you mean the device in question, not since the last time (see my
previous messages). Or can I really run btrfs check on the image? In any
case, I started btrfs-check on the device itself.

> And btrfs check --mode=lowmem is also recommended, as in some rare cases
> low mem mode can detect bugs which the original mode doesn't.

I did see differences in output the last time around (again, see my previous
messages in this thread), so I'll run with lowmem. It won't be done until
tomorrow, though.

> (And it also helps us to enhance lowmem mode)

OK

> > Also, the image is complete, so I only need to find somewhere where I
> > can upload a 9.4 GB file.
>
> Is it a compressed dump? Dumped with btrfs-image -c9?

It was created with:

btrfs-image -s -w /dev/sdb2 - | xz -9 --stdout > ./btrfs_backup_drive_2.img.xz

(Mainly because I felt more comfortable using a separate compression utility,
not for any rational reason. Although if you really meant "image" above, I
have the feeling I'll regret this decision.)

> If so, such a large one won't help much unless there is some developer
> really interested in inspecting the image.
When I started without compression, it was something like 46 GB, after only 1-2 hours, so I expect the uncompressed size to be... very large :-/ .

> Thanks,
> Qu

Greetings
--
Marc Joliet
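For comparison, btrfs-image's built-in compression avoids the external pipe entirely. A sketch of both variants, using the device path from this thread (the output file names are illustrative):

```shell
# Built-in zlib compression at level 9, as Qu suggested (-s sanitizes
# file names, -w walks all trees instead of just the extent tree):
btrfs-image -c9 -s -w /dev/sdb2 btrfs_backup_drive_2.img

# Equivalent external compression, as actually used in the thread:
btrfs-image -s -w /dev/sdb2 - | xz -9 --stdout > btrfs_backup_drive_2.img.xz
```

Either way the dump contains only metadata, not file data, which is why even a large filesystem compresses down to a few GB.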
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Tuesday 28 February 2017 23:14:54 Marc Joliet wrote:
> I think I'm at that point now myself, unless anybody has any other ideas.

For example, could the --init-extent-tree option to btrfs-check help, given that I needed to pass -w to btrfs-image?

Also, the image is complete, so I only need to find somewhere where I can upload a 9.4 GB file.

Greetings
--
Marc Joliet
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
Hi again,

So, it seems that I've solved the problem: after having to umount/mount the FS several times to get btrfs-cleaner to finish, I thought of the "failed to load free space [...], rebuilding it now" type errors and decided to try the clear_cache mount option. Since then, my home server has been running its backups regularly again. Furthermore, I was able to back up my desktop again via send/recv (the rsync-based backup is still running, but I expect it to succeed). The kernel log has also stayed clean.

Kai, I'd be curious whether clear_cache will help in your case, too, if you haven't tried it already.

Greetings
--
Marc Joliet
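For anyone wanting to try the same fix: clear_cache only needs to be passed on one mount, after which the free space cache is rebuilt lazily as block groups are used. A minimal sketch (device and mount point are placeholders, not from this thread):

```shell
# One-time mount with clear_cache to invalidate and rebuild the
# (v1) free space cache; subsequent mounts can omit the option.
umount /mnt/backup
mount -o clear_cache /dev/sdd2 /mnt/backup

# The rebuild happens in the background as block groups are touched,
# so leave the filesystem mounted and in use for a while afterwards.
```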
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
On Saturday 11 February 2017 03:01:39 Kai Krakow wrote:
> Am Fri, 10 Feb 2017 23:15:03 +0100 schrieb Marc Joliet <mar...@gmx.de>:
> > # btrfs filesystem df /media/MARCEC_BACKUP/
> > Data, single: total=851.00GiB, used=831.36GiB
> > System, DUP: total=64.00MiB, used=120.00KiB
> > Metadata, DUP: total=13.00GiB, used=10.38GiB
> > Metadata, single: total=1.12GiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> >
> > Hmm, I take it that the single metadata is a leftover from running
> > --repair?
>
> It's more probably a remnant of an incomplete balance operation or an
> older mkfs version. I'd simply rebalance metadata to fix this.
>
> I don't think that btrfs-repair would migrate missing metadata
> duplicates back to single profile, it would more likely trigger
> recreating the missing duplicates. But I'm not sure.

I'm fairly certain it's from the repair operation, the device didn't have any single metadata before (and it still didn't really, notice that used=0.00B). So my best guess is that it was allocated during the --repair.

> If it is a result of the repair operation, that could be an
> interesting clue. Could it explain "error -17" from your logs? But that
> would mean the duplicates were already missing before the repair
> operation and triggered that problem. So the question is, why are those
> duplicates missing in the first place as a result of normal operation?
> From your logs:

I certainly didn't create them myself, the device has always had metadata=dup (you can check previous threads of mine, which also contain "btrfs fi df" output). So I don't think it has anything to do with the failures.

> ---8<--- snip
> Feb 02 22:49:14 thetick kernel: BTRFS: device label MARCEC_BACKUP devid
> 1 transid 283903 /dev/sdd2
> Feb 02 22:49:19 thetick kernel: EXT4-fs (sdd1): mounted filesystem with
> ordered data mode.
> Opts: (null)
> Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): use zlib compression
> Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): disk space caching is enabled
> Feb 03 00:18:52 thetick kernel: BTRFS info (device sdd2): has skinny extents
> Feb 03 00:20:09 thetick kernel: BTRFS info (device sdd2): The free space cache file (3967375376384) is invalid. skip it
> Feb 03 01:05:58 thetick kernel: [ cut here ]
> Feb 03 01:05:58 thetick kernel: WARNING: CPU: 1 PID: 26544 at fs/btrfs/extent-tree.c:2967 btrfs_run_delayed_refs+0x26c/0x290
> Feb 03 01:05:58 thetick kernel: BTRFS: Transaction aborted (error -17)
> --->8--- snap
>
> "error -17" being "object already exists". My only theory would be this
> has a direct connection to you finding the single metadata profile.
> Like in "the kernel thinks the objects already exists when it really
> didn't, and as a result the object is there only once now" aka "single
> metadata".
>
> But I'm no dev and no expert on the internals.

Again, I don't think it has anything to do with the single metadata. A "btrfs balance start -mprofile=single" got rid of the (empty) single metadata blocks.

What I *can* say is that the error seems to be transient. For example, I ran an rsync last night that failed with the same error as before. After unmounting and mounting the FS again, I could run it to completion (and verified some files).

Anyway, here's the whole log since this morning.
I can't correlate individual stack traces to specific actions any more, but the rough order of actions on my part (not counting unmount/mount cycles, those are easy enough to find):

- backups from the home server run, then
- btrfs-cleaner drops old snapshots, after which
- I run rsync (which fails), then
- I run rsync again (which succeeds), then
- I run "balance start -mprofile=single", which succeeds (I might have run this before rsync, I'm not sure), after which
- I start backups on my laptop (which fails after I finally went to bed, at about 3:58).

After that I just try "btrbk -r run" a few times, which managed to transfer the first two snapshots before failing.

Feb 11 00:00:06 diefledermaus kernel: usb 1-1: new high-speed USB device number 4 using ehci-pci
Feb 11 00:00:06 diefledermaus kernel: usb 1-1: New USB device found, idVendor=0480, idProduct=d010
Feb 11 00:00:06 diefledermaus kernel: usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Feb 11 00:00:06 diefledermaus kernel: usb 1-1: Product: External USB 3.0
Feb 11 00:00:06 diefledermaus kernel: usb 1-1: Manufacturer: Toshiba
Feb 11 00:00:06 diefledermaus kernel: usb 1-1: SerialNumber: 20130421020612
Feb 11 00:00:07 diefledermaus kernel: usb-storage 1-1:1.0: USB Mass Storage device detected
Feb 11 00:00:07 diefledermaus kernel: usb-storage
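The metadata cleanup mentioned in this thread (getting rid of leftover, empty single-profile metadata chunks on a filesystem whose metadata is otherwise DUP) can be sketched like this; note that current btrfs-progs spell the balance filter "profiles=" (plural), and the mount point is taken from the thread:

```shell
# Inspect chunk allocation: look for a "Metadata, single" line with
# used=0.00B alongside the normal "Metadata, DUP" line.
btrfs filesystem df /media/MARCEC_BACKUP/

# Balance away only the single-profile metadata chunks. Since they
# are empty, this completes quickly and frees the stray allocation.
btrfs balance start -mprofiles=single /media/MARCEC_BACKUP/
```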
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
Sorry for the late reply, see below for why :) . On Friday 03 February 2017 23:44:10 Kai Krakow wrote: > Am Thu, 02 Feb 2017 13:01:03 +0100 > > schrieb Marc Joliet <mar...@gmx.de>: [...] > > > Btrfs --repair refused to repair the filesystem telling me something > > > about compressed extents and an unsupported case, wanting me to > > > take an image and send it to the devs. *sigh* > > > > I haven't tried a repair yet; it's a big file system, and btrfs-check > > is still running: > > > > # btrfs check -p /dev/sdd2 > > Checking filesystem on /dev/sdd2 > > UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 > > parent transid verify failed on 3829276291072 wanted 224274 found > > 283858 parent transid verify failed on 3829276291072 wanted 224274 > > found 283858 parent transid verify failed on 3829276291072 wanted > > 224274 found 283858 parent transid verify failed on 3829276291072 > > wanted 224274 found 283858 Ignoring transid failure > > leaf parent key incorrect 3829276291072 > > bad block 3829276291072 > > > > ERROR: errors found in extent allocation tree or chunk allocation > > block group 4722282987520 has wrong amount of free space > > failed to load free space cache for block group 4722282987520 > > checking free space cache [O] > > root 32018 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32089 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32091 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32092 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32107 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32189 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 
> > > > root 32190 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32191 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32265 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32266 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32409 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32410 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32411 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32412 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32413 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32631 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32632 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32633 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32634 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32635 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32636 inode 95066 errors 100, file extent discount > > > > Found file extent holes: > > start: 413696, len: 4096 > > > > root 32718 inode 95066 errors 100, file extent discount > > > > Found file exte
Re: [4.7.2] btrfs_run_delayed_refs:2963: errno=-17 Object already exists
e same time). If "usebackuproot" (formerly called "recovery"?) fails, then I'll just wipe the FS and start the backups from scratch.

Since I would like to have that done by Saturday: is there any information I can provide that might help fix whatever bug(s) caused this? Should I file a bug if one doesn't exist yet (I haven't checked yet, sorry)?

Greetings
--
Marc Joliet
Re: system hangs due to qgroups
On Tuesday 06 December 2016 00:22:39 Marc Joliet wrote:
> On Monday 05 December 2016 11:16:35 Marc Joliet wrote:
> [...]
> > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz
> > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz
>
> BTW, since my problem appears to have been known, does anybody still care
> about these?

I'll remove these files from Dropbox tomorrow around noon unless somebody says they still need them.

Greetings
--
Marc Joliet
Re: btrfs-check finds file extent holes
On Saturday 17 December 2016 00:18:13 Marc Joliet wrote: > The initial results with btrfs-check's low-memory mode found > reference count mismatches, but that seems to have been a false positive, > since btrfs-check's normal mode does not find them. FWIW, just in case this wasn't known yet (I know lowmem mode is still considered experimental), here is the lowmem output from three runs, in order. (I don't have the exact dates, but the FS was mounted in-between each run over the course of about 24 hours.) # btrfs check --mode lowmem /dev/sdb2 Checking filesystem on /dev/sdb2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents ERROR: extent[3839195598848, 20480] referencer count mismatch (root: 31589, owner: 2688918, offset: 4927488) wanted: 3, have: 4 ERROR: extent[3839195598848, 20480] referencer count mismatch (root: 31614, owner: 2688918, offset: 4927488) wanted: 3, have: 4 ERROR: extent[3839195598848, 20480] referencer count mismatch (root: 31422, owner: 2688918, offset: 4927488) wanted: 2, have: 4 ERROR: extent[3840794644480, 655360] referencer count mismatch (root: 31422, owner: 494683, offset: 734683136) wanted: 6, have: 11 ERROR: extent[3840794644480, 655360] referencer count mismatch (root: 31579, owner: 494683, offset: 734683136) wanted: 2, have: 7 ^C # btrfs check --mode lowmem /dev/sdb2 [6/1151] Checking filesystem on /dev/sdb2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents ERROR: extent[3839195598848, 20480] referencer count mismatch (root: 31614, owner: 2688918, offset: 4927488) wanted: 3, have: 4 ERROR: extent[3839195598848, 20480] referencer count mismatch (root: 31422, owner: 2688918, offset: 4927488) wanted: 2, have: 4 ERROR: extent[3840794644480, 655360] referencer count mismatch (root: 31422, owner: 494683, offset: 734683136) wanted: 6, have: 11 ERROR: extent[3840794644480, 655360] referencer count mismatch (root: 31579, owner: 494683, offset: 734683136) wanted: 2, have: 7 ERROR: extent[3853952757760, 4096] referencer count 
mismatch (root: 31614, owner: 2688918, offset: 4935680) wanted: 1, have: 2 ERROR: extent[3855013117952, 32768] referencer count mismatch (root: 30188, owner: 494681, offset: 32219136) wanted: 1, have: 2 ERROR: extent[3855013117952, 32768] referencer count mismatch (root: 30819, owner: 494681, offset: 32219136) wanted: 1, have: 2 ERROR: extent[3855013117952, 32768] referencer count mismatch (root: 28250, owner: 494681, offset: 32219136) wanted: 1, have: 2 ERROR: extent[3855013117952, 32768] referencer count mismatch (root: 29175, owner: 494681, offset: 32219136) wanted: 1, have: 2 ERROR: extent[3855015211008, 57344] referencer count mismatch (root: 31614, owner: 494681, offset: 32366592) wanted: 1, have: 2 ERROR: extent[3855015211008, 57344] referencer count mismatch (root: 31538, owner: 494681, offset: 32366592) wanted: 1, have: 2 ERROR: extent[3855015211008, 57344] referencer count mismatch (root: 31579, owner: 494681, offset: 32366592) wanted: 1, have: 2 ERROR: extent[3855015211008, 57344] referencer count mismatch (root: 31466, owner: 494681, offset: 32366592) wanted: 1, have: 2 ERROR: extent[3855015211008, 57344] referencer count mismatch (root: 31540, owner: 494681, offset: 32366592) wanted: 1, have: 2 ERROR: extent[3856459112448, 770048] referencer count mismatch (root: 30819, owner: 494683, offset: 1430528000) wanted: 1, have: 10 ERROR: extent[3856582701056, 4096] referencer count mismatch (root: 31572, owner: 195432, offset: 83841024) wanted: 1, have: 0 ERROR: extent[3857416364032, 98304] referencer count mismatch (root: 31614, owner: 494683, offset: 735207424) wanted: 2, have: 3 ERROR: extent[3857416364032, 98304] referencer count mismatch (root: 31624, owner: 494683, offset: 735207424) wanted: 1, have: 3 ERROR: extent[3857658413056, 65536] referencer count mismatch (root: 29175, owner: 2261987, offset: 474120192) wanted: 1, have: 2 ERROR: extent[3861689466880, 40960] referencer count mismatch (root: 31383, owner: 1277577, offset: 12189696) wanted: 1, 
have: 2 ERROR: extent[3861689995264, 40960] referencer count mismatch (root: 31383, owner: 1277577, offset: 12320768) wanted: 1, have: 2 ERROR: extent[3861694464000, 32768] referencer count mismatch (root: 30904, owner: 1277577, offset: 4980736) wanted: 3, have: 10 ERROR: extent[3861694464000, 32768] referencer count mismatch (root: 30937, owner: 1277577, offset: 4980736) wanted: 3, have: 10 ERROR: extent[3861694464000, 32768] referencer count mismatch (root: 31570, owner: 1277577, offset: 4980736) wanted: 3, have: 10 ERRO
Re: btrfs-check finds file extent holes
On Saturday 17 December 2016 00:18:13 Marc Joliet wrote:
> Is this something that btrfs-check can safely repair, or that is perhaps
> even harmless?

Never mind, I just found that this has been repairable since btrfs-progs 3.19.

Greetings
--
Marc Joliet
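For the record, such a repair runs offline, i.e. with the filesystem unmounted, and since --repair rewrites metadata it is prudent to capture a metadata dump first. A sketch using the device from this thread:

```shell
# check --repair must not run on a mounted filesystem.
umount /dev/sdd2

# Optionally capture a compressed metadata image first, in case the
# repair makes things worse or the developers want to inspect it.
btrfs-image -c9 /dev/sdd2 /tmp/metadata.img

# Attempt to repair the reported file extent holes.
btrfs check --repair /dev/sdd2
```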
Re: btrfs-check finds file extent holes
OK, btrfs-check finished about an hour after I sent this, here's the complete output: # btrfs check /dev/sdd2 Checking filesystem on /dev/sdd2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents checking free space cache checking fs roots root 30634 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30635 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30636 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30657 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30746 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30747 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30764 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30834 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30835 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30915 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30916 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30942 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31038 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31053 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31366 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31367 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31368 inode 95066 errors 100, file extent discount 
Found file extent holes: start: 413696, len: 4096 root 31385 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31425 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31473 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31499 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31554 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31572 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31606 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31653 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31680 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 found 904425616176 bytes used err is 1 total csum bytes: 873691128 total tree bytes: 11120295936 total fs tree bytes: 8620965888 total extent tree bytes: 1368756224 btree space waste bytes: 2415249740 file data blocks allocated: 19427350777856 referenced 1003936649216

Greetings
--
Marc Joliet
btrfs-check finds file extent holes
Hello, After my backup drive displayed a weird issue (programs accessing it suddenly started zombifying, but it worked fine after a reboot), I decided to check the file system. The initial results with btrfs-check's low-memory mode found reference count mismatches, but that seems to have been a false positive, since btrfs-check's normal mode does not find them. Instead, it complains about several file extent holes: # btrfs check /dev/sdd2 Checking filesystem on /dev/sdd2 UUID: f97b3cda-15e8-418b-bb9b-235391ef2a38 checking extents checking free space cache checking fs roots root 30634 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30635 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30636 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30657 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30746 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30747 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30764 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30834 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30835 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30915 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30916 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 30942 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31038 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31053 inode 95066 errors 100, file 
extent discount Found file extent holes: start: 413696, len: 4096 root 31366 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31367 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31368 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31385 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31425 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31473 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31499 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31554 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31572 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31606 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31653 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 root 31680 inode 95066 errors 100, file extent discount Found file extent holes: start: 413696, len: 4096 (The check is still not done, it's been running for about 24 hours now.) Is this something that btrfs-check can safely repair, or that is perhaps even harmless? % uname -a Linux thetick 4.8.14-gentoo #1 SMP PREEMPT Sun Dec 11 17:09:09 CET 2016 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux % /sbin/btrfs --version btrfs-progs v4.8.5 I can't show any other output because btrfs-check is still running. I can only say that the file system is 1TB large and about 88% full (fuller than normal, which is about 85%). 
Greetings
--
Marc Joliet
Re: out-of-band dedup status?
On Thursday 08 December 2016 13:41:36 Chris Murphy wrote:
> Pretty sure it will not dedupe extents that are referenced in a read
> only subvolume.

I've used duperemove to de-duplicate files in read-only snapshots (of different systems) on my backup drive, so unless you're referencing some specific issue, I'm pretty sure you're wrong about that. Maybe you're thinking of the occasionally mentioned old dedup kernel implementation?

--
Marc Joliet
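As a concrete illustration of the workflow described above (the paths and hashfile location are hypothetical examples, not the poster's actual invocation): duperemove hashes file contents and then issues deduplication ioctls, which the kernel accepts even for extents referenced by read-only snapshots.

```shell
# Deduplicate across several read-only snapshots on a backup volume.
# -r recurses into directories; -d actually submits the dedupe
# requests (without it, duperemove only reports duplicates).
# --hashfile persists checksums so repeated runs are incremental.
duperemove -dr --hashfile=/var/tmp/dedupe.hash \
    /mnt/backup/snapshots/desktop.2016-12-01 \
    /mnt/backup/snapshots/laptop.2016-12-01
```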
Re: [SOLVED] Re: system hangs due to qgroups
On Tuesday 06 December 2016 11:12:12 Marc Joliet wrote:
> I have disabled quotas already (first thing I did after
> mounting). However, there were definitely 20-30, maybe more (enough for
> 2, maybe 3, console pages -- I don't know how many lines the initramfs
> rescue shell has, but based on that, you could estimate the number of
> qgroups).

Of course, you can probably check the sanitized images I posted for more information.

--
Marc Joliet
Re: [SOLVED] Re: system hangs due to qgroups
On Tuesday 06 December 2016 08:29:48 Qu Wenruo wrote: > At 12/05/2016 10:43 PM, Marc Joliet wrote: > > On Monday 05 December 2016 12:01:28 Marc Joliet wrote: > >>> This seems to be a NULL pointer bug in qgroup relocation fix. > >>> > >>> > >>> > >>> The latest fix (not merged yet) should address it. > >>> > >>> > >>> > >>> You could try the for-next-20161125 branch from David to fix it: > >>> https://github.com/kdave/btrfs-devel/tree/for-next-20161125 > >> > >> OK, I'll try that, thanks! I just have to wait for it to finish > >> cloning... > > > > [...] > > > >>> And for your recovery, I'd suggest to install an Archlinux into a USB > >>> HDD or USB stick, and compile David's branch and install it into the USB > >>> HDD. > >>> > >>> > >>> > >>> Then use the USB storage as rescue tool to mount the fs, which should do > >>> RW mount with or without skip_balance mount option. > >>> So you could disable quota then. > >> > >> OK, I'll try that, thanks! > > > > Excellent, thank you, that worked! My laptop is working normally again. > > I'll keep an eye on it, but so far two balance operations ran normally > > (that is, they completed within a few minutes and without hanging the > > system). > > > > (Specifically, since I didn't find out how to get a different kernel onto > > the Arch USB stick, I simply installed the kernel on my desktop, then did > > everything from an initramfs emergency shell, then moved the SSD back > > into the laptop.) > > > > Thanks, everyone! > > Glad that helped. > > I just forgot that you're using gentoo, not archlinux, and kernel > install script won't work for archlinux. > > Anyway, I'm glad that works for you. > > BTW, if you haven't yet disable quota, would you please give a report on > how many qgroup you have? I have disabled quotas already (first thing I did after mounting). 
However, there were definitely 20-30, maybe more (enough for 2, maybe 3, console pages -- I don't know how many lines the initramfs rescue shell has, but based on that, you could estimate the number of qgroups).

> And how CPU is spinning for balancing with quota enabled?

All I can say is, based on past observations, that I would see a single process (usually btrfs-transaction, but often a user-space process, such as baloo_file_extractor) using a single CPU at 100% and blocking (almost) everything else, and either finish after a while if it was quick enough, or there would be intermittent time frames where other processes weren't blocked. With balancing the behaviour was the latter, only it was the btrfs process using 100% CPU. Furthermore, metadata balances were worse than data balances.

> This would help us to evaluate how qgroup slows down the process if
> there are too many snapshots.

Again, sorry that I was so quick to disable quotas, but I was only willing to do so much debugging with this laptop.

> Thanks,
> Qu

Greetings
--
Marc Joliet
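For reference, the qgroup count Qu asked about, and the quota cleanup described in this thread, boil down to two commands; the mount point is a placeholder:

```shell
# Count qgroups (header line included) -- only works while quotas
# are still enabled.
btrfs qgroup show /mnt/root | wc -l

# Disable quotas entirely, removing the qgroup accounting overhead
# that was stalling balance and snapshot deletion.
btrfs quota disable /mnt/root
```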
Re: system hangs due to qgroups
On Monday 05 December 2016 11:16:35 Marc Joliet wrote:
[...]
> https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz
> https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz

BTW, since my problem appears to have been known, does anybody still care about these?

--
Marc Joliet
[SOLVED] Re: system hangs due to qgroups
On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
> > This seems to be a NULL pointer bug in qgroup relocation fix.
> >
> > The latest fix (not merged yet) should address it.
> >
> > You could try the for-next-20161125 branch from David to fix it:
> > https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>
> OK, I'll try that, thanks! I just have to wait for it to finish cloning...
[...]
> > And for your recovery, I'd suggest to install an Archlinux into a USB
> > HDD or USB stick, and compile David's branch and install it into the USB
> > HDD.
> >
> > Then use the USB storage as rescue tool to mount the fs, which should do
> > RW mount with or without skip_balance mount option.
> > So you could disable quota then.
>
> OK, I'll try that, thanks!

Excellent, thank you, that worked! My laptop is working normally again. I'll keep an eye on it, but so far two balance operations ran normally (that is, they completed within a few minutes and without hanging the system).

(Specifically, since I didn't find out how to get a different kernel onto the Arch USB stick, I simply installed the kernel on my desktop, then did everything from an initramfs emergency shell, then moved the SSD back into the laptop.)

Thanks, everyone!
--
Marc Joliet
Re: system hangs due to qgroups
On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
> > You could try the for-next-20161125 branch from David to fix it:
> > https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>
> OK, I'll try that, thanks! I just have to wait for it to finish cloning...

FWIW, I get this warning (translated from the German locale output):

  CC fs/btrfs/inode.o
fs/btrfs/inode.c: In function 'run_delalloc_range':
fs/btrfs/inode.c:1219:9: warning: 'cur_end' may be used uninitialized in this function [-Wmaybe-uninitialized]
  start = cur_end + 1;
        ^
fs/btrfs/inode.c:1172:6: note: 'cur_end' was declared here

Should I be worried about that? At a cursory glance, it looks like a false alarm, but I just want to be sure (and even so, false alarms are annoying).

Greetings
--
Marc Joliet
Re: system hangs due to qgroups
On Monday 05 December 2016 10:00:13 Marc Joliet wrote: > OK, I'll post the URLs once the images are uploaded. (I had Dropbox public > URLs right before my desktop crashed -- see below -- but now dropbox-cli > doesn't want to create them.) Alright, here you go: https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz (FYI, "dropbox-cli puburl" appears to have broken recently, so I had to use the Dropbox web interface to get these URLs.) Greetings -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
Re: system hangs due to qgroups
On Sunday 04 December 2016 11:52:40 Chris Murphy wrote: > On Sun, Dec 4, 2016 at 9:02 AM, Marc Joliet <mar...@gmx.de> wrote: > > Also, now the file system fails with the BUG I mentioned, see here: > > > > [Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at > > fe10 > > [Sun Dec 4 12:27:07 2016] IP: [] > > qgroup_fix_relocated_data_extents+0x1f/0x2a0 > > [Sun Dec 4 12:27:07 2016] PGD 1c07067 PUD 1c09067 PMD 0 > > [Sun Dec 4 12:27:07 2016] Oops: [#1] PREEMPT SMP > > [Sun Dec 4 12:27:07 2016] Modules linked in: crc32c_intel serio_raw > > [Sun Dec 4 12:27:07 2016] CPU: 0 PID: 370 Comm: mount Not tainted 4.8.11- > > gentoo #1 > > [Sun Dec 4 12:27:07 2016] Hardware name: FUJITSU LIFEBOOK A530/FJNBB06, > > BIOS Version 1.19 08/15/2011 > > [Sun Dec 4 12:27:07 2016] task: 8801b1d9 task.stack: > > 8801b1268000 [Sun Dec 4 12:27:07 2016] RIP: > > 0010:[] > > [] qgroup_fix_relocated_data_extents+0x1f/0x2a0 > > [Sun Dec 4 12:27:07 2016] RSP: 0018:8801b126bcd8 EFLAGS: 00010246 > > [Sun Dec 4 12:27:07 2016] RAX: RBX: 8801b10b3150 > > RCX: > > > > [Sun Dec 4 12:27:07 2016] RDX: 8801b20f24f0 RSI: 8801b2790800 > > RDI: > > 8801b20f2460 > > [Sun Dec 4 12:27:07 2016] RBP: 8801b10bc000 R08: 00020340 > > R09: > > 8801b20f2460 > > [Sun Dec 4 12:27:07 2016] R10: 8801b48b7300 R11: ea0005dd0ac0 > > R12: > > 8801b126bd70 > > [Sun Dec 4 12:27:07 2016] R13: R14: 8801b2790800 > > R15: > > b20f2460 > > [Sun Dec 4 12:27:07 2016] FS: 7f97a7846780() > > GS:8801bbc0() knlGS: > > [Sun Dec 4 12:27:07 2016] CS: 0010 DS: ES: CR0: > > 80050033 [Sun Dec 4 12:27:07 2016] CR2: fe10 CR3: > > 0001b12ae000 CR4: 06f0 > > [Sun Dec 4 12:27:07 2016] Stack: > > [Sun Dec 4 12:27:07 2016] 0801 0801 > > 8801b20f2460 8801b4aaa000 > > [Sun Dec 4 12:27:07 2016] 0801 8801b20f2460 > > 812c23ed 8801b1d9 > > [Sun Dec 4 12:27:07 2016] 00ff8801b126bd18 > > 8801b10b3150 8801b4aa9800 > > [Sun Dec 4 12:27:07 2016] Call Trace: > > [Sun Dec 4 12:27:07 2016] [] ? 
> > start_transaction+0x8d/0x4e0 > > [Sun Dec 4 12:27:07 2016] [] ? > > btrfs_recover_relocation+0x3b3/0x440 > > [Sun Dec 4 12:27:07 2016] [] ? > > btrfs_remount+0x3ca/0x560 [Sun Dec 4 12:27:07 2016] > > [] ? shrink_dcache_sb+0x54/0x70 [Sun Dec 4 12:27:07 > > 2016] [] ? do_remount_sb+0x63/0x1d0 [Sun Dec 4 > > 12:27:07 2016] [] ? do_mount+0x6f3/0xbe0 [Sun Dec 4 > > 12:27:07 2016] [] ? > > copy_mount_options+0xbf/0x170 > > [Sun Dec 4 12:27:07 2016] [] ? SyS_mount+0x61/0xa0 > > [Sun Dec 4 12:27:07 2016] [] ? > > entry_SYSCALL_64_fastpath+0x13/0x8f > > [Sun Dec 4 12:27:07 2016] Code: 66 90 66 2e 0f 1f 84 00 00 00 00 00 41 57 > > 41 56 41 55 41 54 55 53 48 83 ec 50 48 8b 46 08 4c 8b 6e 10 48 8b a8 f0 > > 01 00 00 31 c0 <4d> 8b a5 10 fe ff ff f6 85 80 0c 00 00 01 74 09 80 be b0 > > 05 00 [Sun Dec 4 12:27:07 2016] RIP [] > > qgroup_fix_relocated_data_extents+0x1f/0x2a0 > > [Sun Dec 4 12:27:07 2016] RSP > > [Sun Dec 4 12:27:07 2016] CR2: fe10 > > [Sun Dec 4 12:27:07 2016] ---[ end trace bd51bbcfd10492f7 ]--- > > I can't parse this. Maybe someone else can. Do you get the same thing, > or a different thing, if you do a normal mount rather than a remount? The call trace is of course a bit different, but in both cases the RIP line is almost identical (if that even matters?). 
Compare the line from my first message: "RIP [] qgroup_fix_relocated_data_extents+0x1f/0x2a8" with the newest line: "RIP [] qgroup_fix_relocated_data_extents+0x1f/0x2a0" But I just remembered, I have one from trying to mount the top-level subvolume on my desktop: [Sun Dec 4 18:45:19 2016] BUG: unable to handle kernel paging request at fe10 [Sun Dec 4 18:45:19 2016] IP: [] qgroup_fix_relocated_data_extents+0x33/0x2e0 [Sun Dec 4 18:45:19 2016] PGD 1a07067 PUD 1a09067 PMD 0 [Sun Dec 4 18:45:19 2016] Oops: [#1] PREEMPT SMP [Sun Dec 4 18:45:19 2016] Modules linked in: joydev dummy iptable_filter ip_tables x_tables hid_logitech_hidpp hid_logitech_dj snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_codec_generic uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev snd_usb_audio snd_hwdep snd_usbmidi_lib radeon i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea
Re: system hangs due to qgroups
On Sunday 04 December 2016 18:24:08 Duncan wrote: > Marc Joliet posted on Sun, 04 Dec 2016 17:02:48 +0100 as excerpted: > > That's a good idea, although I'll probably start with sysrescuecd (Linux > > 4.8.5 and btrfs-progs 4.7.3), as I already have experience with it. > > > > [After trying it] > > > > Well, crap, I was able to get images of the file system (one sanitized), > > but mounting always fails with "device or resource busy" (with no > > corresponding dmesg output). (Also, that drive's partitions weren't > > discovered on bootup, I had to run partprobe first.) I never see that > > in the initramfs, so I'm not sure what's causing that. > > If I understand correctly what you're doing, that part is easily enough > explained. > > Remember that btrfs, unlike most filesystems, is multi-device capable. > The way it tracks which devices belong to which filesystems is by UUID, > universally *UNIQUE* ID. If you image a device via dd or similar, you of > course image its UUID as well, destroying the "unique" assumption in UUID > and confusing btrfs, which will consider it part of the existing > filesystem if the original devices with that filesystem UUID remain > hooked up. > > So if you did what I believe you did, try to mount the image while the > original filesystem devices remain attached and mounted, btrfs is simply > saying that filesystem (which btrfs identifies by UUID) is already > mounted: "device or resource busy". [...] Nope, sorry if I wasn't clear, I didn't mean that I tried to mount the image (can you even mount images created with btrfs-image?). Plus the images are xz-compressed. -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
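[Editor's note] Duncan's duplicate-UUID point is easy to verify directly. A sketch of how one might confirm the collision and work around it; the device paths are placeholders, and `btrfstune -u` requires a reasonably recent btrfs-progs and an unmounted filesystem:

```shell
# Placeholder device names; adjust to your setup.
# Identical output for both devices means btrfs sees one filesystem:
blkid -s UUID -o value /dev/sdb1 /dev/sdc1

# btrfs identifies filesystems by UUID, so a dd copy cannot be mounted
# while the original is attached.  Giving the copy a fresh random UUID
# removes the collision:
btrfstune -u /dev/sdc1
```

(This applies to dd-style block copies; btrfs-image dumps, as used here, are metadata dumps and are not mountable in the first place.)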
Re: system hangs due to qgroups
On Sunday 04 December 2016 03:10:19 Adam Borowski wrote: > On Sat, Dec 03, 2016 at 10:46:40PM +0100, Marc Joliet wrote: > > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied > > by mount. So I can't tell if there are dmesg entries, however, when this > > happens during a normal running system, I never saw any dmesg entries. > > You can use "open" (might be named "openvt") to spawn a shell on > tty2/tty3/etc. And if you have "screen" installed, Ctrl-a c spawns new > terminals (Ctrl-a n/p/0-9 to switch). I was actually considering adding tmux to the list of programs in the initramfs after this experience :) . > > The output of sysrq+t is too big to capture all of it (i.e., I can't > > scroll > > back to the beginning) > > You may use netconsole to log everything kernel says to another machine. I > can't provide you with the incantations from the top of my head (got working > serial (far more reliable) on all my dev boxes, and it doesn't work with > bridging ie containers on production), but as your rescue shell has no > network sharing, assuming your network card driver supports a feature > netconsole needs _and_ stars were aligned right when your network card was > manufactured, netconsole is a valuable aid. > > The system might be not dead enough to stop userland network logging from > getting through, too. OK, I'll look up netconsole. > Meow! Thanks -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
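[Editor's note] The netconsole incantation Adam couldn't recall off-hand looks roughly like this. This is a sketch: every address, port, interface name and MAC below is a placeholder for your own network:

```shell
# On the machine being debugged: send kernel messages as UDP packets.
# Format: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@[tgt-ip]/[tgt-mac]
modprobe netconsole netconsole=6665@192.168.0.10/eth0,6666@192.168.0.20/aa:bb:cc:dd:ee:ff
dmesg -n 8                         # log everything, including debug level

# On the receiving machine: listen for the messages.
nc -u -l 6666 | tee netconsole.log
```

The same parameter can also be passed on the kernel command line as netconsole=... if the driver is built in, which captures messages from earlier in boot.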
Re: system hangs due to qgroups
OK, so I tried a few things, to no avail, more below. On Saturday 03 December 2016 15:56:45 Chris Murphy wrote: > On Sat, Dec 3, 2016 at 2:46 PM, Marc Joliet <mar...@gmx.de> wrote: > > On Saturday 03 December 2016 13:42:42 Chris Murphy wrote: > >> On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <mar...@gmx.de> wrote: > >> > Hello all, > >> > > >> > I'm having some trouble with btrfs on a laptop, possibly due to > >> > qgroups. > >> > Specifically, some file system activities (e.g., snapshot creation, > >> > baloo_file_extractor from KDE Plasma) cause the system to hang for up > >> > to > >> > about 40 minutes, maybe more. > >> > >> Do you get any blocked tasks kernel messages? If so, issue sysrq+w > >> during the hang, and then check the system log (dmesg may not contain > >> everything if the command fills the message buffer). If it's a hang > >> without any kernel messages, then issue sysrq+t. > >> > >> https://www.kernel.org/doc/Documentation/sysrq.txt > > > > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied > > by mount. So I can't tell if there are dmesg entries, however, when this > > happens during a normal running system, I never saw any dmesg entries. > > Anyway, I ran both. > > OK so this is root fs? I would try to work on it from another volume. > An advantage of openSUSE Tumbleweed is they claim to fully support > qgroups, where upstream uses much more guarded language about its > stability. > > Whereas last night's Fedora Rawhide has kernel 4.9-rc7 and btrfs-progs > 4.8.5. > https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20161203. > n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-netinst-x86_64-Rawhide > -20161203.n.0.iso > > You can use dd to write the ISO to a USB stick, it supports BIOS and > UEFI and Secure Boot. 
> > Troubleshooting > Rescue a Fedora system > option 3 to get to a shell > The sysrq+t and sysrq+w can be written out in their entirety with > monotonic time using 'journalctl -b -k -o short-monotonic > > kernelmessages.log' > > Unfortunately this is not a live system, so you can't (as far as I > know) install script to more easily capture everything to a single > file; 'btrfs check > btrfscheck.log' should capture most of the > output, but it misses a few early lines for some reason. > > And then scp those files to another system, or mount another stick and > copy locally. That's a good idea, although I'll probably start with sysrescuecd (Linux 4.8.5 and btrfs-progs 4.7.3), as I already have experience with it. [After trying it] Well, crap, I was able to get images of the file system (one sanitized), but mounting always fails with "device or resource busy" (with no corresponding dmesg output). (Also, that drive's partitions weren't discovered on bootup, I had to run partprobe first.) I never see that in the initramfs, so I'm not sure what's causing that. 
Also, now the file system fails with the BUG I mentioned, see here: [Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at fe10 [Sun Dec 4 12:27:07 2016] IP: [] qgroup_fix_relocated_data_extents+0x1f/0x2a0 [Sun Dec 4 12:27:07 2016] PGD 1c07067 PUD 1c09067 PMD 0 [Sun Dec 4 12:27:07 2016] Oops: [#1] PREEMPT SMP [Sun Dec 4 12:27:07 2016] Modules linked in: crc32c_intel serio_raw [Sun Dec 4 12:27:07 2016] CPU: 0 PID: 370 Comm: mount Not tainted 4.8.11- gentoo #1 [Sun Dec 4 12:27:07 2016] Hardware name: FUJITSU LIFEBOOK A530/FJNBB06, BIOS Version 1.19 08/15/2011 [Sun Dec 4 12:27:07 2016] task: 8801b1d9 task.stack: 8801b1268000 [Sun Dec 4 12:27:07 2016] RIP: 0010:[] [] qgroup_fix_relocated_data_extents+0x1f/0x2a0 [Sun Dec 4 12:27:07 2016] RSP: 0018:8801b126bcd8 EFLAGS: 00010246 [Sun Dec 4 12:27:07 2016] RAX: RBX: 8801b10b3150 RCX: [Sun Dec 4 12:27:07 2016] RDX: 8801b20f24f0 RSI: 8801b2790800 RDI: 8801b20f2460 [Sun Dec 4 12:27:07 2016] RBP: 8801b10bc000 R08: 00020340 R09: 8801b20f2460 [Sun Dec 4 12:27:07 2016] R10: 8801b48b7300 R11: ea0005dd0ac0 R12: 8801b126bd70 [Sun Dec 4 12:27:07 2016] R13: R14: 8801b2790800 R15: b20f2460 [Sun Dec 4 12:27:07 2016] FS: 7f97a7846780() GS:8801bbc0() knlGS: [Sun Dec 4 12:27:07 2016] CS: 0010 DS: ES: CR0: 80050033 [Sun Dec 4 12:27:07 2016] CR2: fe10 CR3: 0001b12ae000 CR4: 06f0 [Sun Dec 4 12:27:07 2016] Stack: [Sun Dec 4 12:27:07 2016] 0801 0801 8801b20f2460 8801b4aaa000 [Sun Dec 4 12:27:07 2016] 00
Re: system hangs due to qgroups
On Saturday 03 December 2016 13:42:42 Chris Murphy wrote: > On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <mar...@gmx.de> wrote: > > Hello all, > > > > I'm having some trouble with btrfs on a laptop, possibly due to qgroups. > > Specifically, some file system activities (e.g., snapshot creation, > > baloo_file_extractor from KDE Plasma) cause the system to hang for up to > > about 40 minutes, maybe more. > > Do you get any blocked tasks kernel messages? If so, issue sysrq+w > during the hang, and then check the system log (dmesg may not contain > everything if the command fills the message buffer). If it's a hang > without any kernel messages, then issue sysrq+t. > > https://www.kernel.org/doc/Documentation/sysrq.txt As it's a rescue shell, I have only the one shell AFAIK, and it's occupied by mount. So I can't tell if there are dmesg entries, however, when this happens during a normal running system, I never saw any dmesg entries. Anyway, I ran both. The output of sysrq+w mentions two tasks: "btrfs-transaction" with btrfs_scrub_pause+0xbe/0xd0 as the top-most entry in the call trace, and "mount" with its top-most entry at schedule+0x33/0x90 (it looks like it's still in the "early" processing, since there's also "btrfs_parse_early_options+0190/0x190" in the call trace). The output of sysrq+t is too big to capture all of it (i.e., I can't scroll back to the beginning), but just looking at the task names that I *can* see, there are: btrfs-fixup, various btrfs-endio*, btrfs-rmw, btrfs-freespace, btrfs-delayed-m (cut off), btrfs-readahead, btrfs-qgroup-re (cut off), btrfs- extent-re (cut off), btrfs-cleaner, and btrfs-transaction. Oh, and a bunch of kworkers. Should I take photos? That'll be annoying to do with all the scrolling, but I can do that if need be. > > After I next turned on the laptop, the balance resumed, causing bootup to > > fail, after which I remembered about the skip_balance mount option, which > > I > > tried in a rescue shell from an initramfs. 
> > The file system is the root filesystem? If so, skip_balance may not be > happening soon enough. Use kernel parameter rootflags=skip_balance > which will apply this mount option at the very first moment the file > system is mounted during boot. Yes, it's the root file system (there's that plus a swap partition). I believe I tried rootflags, but I think it also failed, which is why I'm using a rescue shell now. I can try it again, though, if anybody thinks that there's no point in waiting, especially if btrfs_scrub_pause in the btrfs- transaction call trace is significant. > > Since I couldn't use skip_balance, and logically can't destroy qgroups on > > a > > read-only file system, I decided to wait for a regular mount to finish. > > That has been running since Tuesday, and I am slowly growing impatient. > Haha, no kidding! I think that's very patient. Heh :) . I've still got my main desktop (as ancient as it may be), so I'm content with waiting for now, but I don't want to wait forever, especially if there might not even be a point. > > Thus I arrive at my question(s): is there anything else I can try, short > > of > > reformatting and restoring from backup? Can I use btrfs-check here, or > > any > > other tool? Or...? > > Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's > something wrong it should find it and if not that's a bug of its own. The initramfs has 4.8.4, but it looks like 4.8.5 was "only" an urgent bug fix, with no changes to qgroups handling, so I can use that, too. Can it repair qgroups problems, too? > > Also, should I be able to avoid reformatting: how do I properly disable > > quota support? > > 'btrfs quota disable' is the only command that applies to this and it > requires rw mount; there's no 'noquota' mount option. OK, thanks. So what should I try next? I'm sick at home, so I can spend more time on this than usual. 
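[Editor's note] Concretely, the rootflags approach Chris describes amounts to editing the boot entry, e.g. in GRUB. A sketch; the kernel path and root device are placeholders:

```shell
# At the GRUB menu, press 'e' and append rootflags= to the linux line,
# so skip_balance is active from the very first (initramfs) root mount:
linux /boot/vmlinuz-4.8.10 root=/dev/sda2 rootflags=skip_balance rw

# Once the system is up with the balance skipped, cancel it so it can
# never resume, and only then deal with the qgroups:
btrfs balance cancel /
```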
system hangs due to qgroups
Hello all, I'm having some trouble with btrfs on a laptop, possibly due to qgroups. Specifically, some file system activities (e.g., snapshot creation, baloo_file_extractor from KDE Plasma) cause the system to hang for up to about 40 minutes, maybe more. It always causes (most of) my desktop to hang, (although I can usually navigate between pre-existing Konsole tabs) and prevents new programs from starting. I've seen the system load go up to >30 before the laptop suddenly resumes normal operation. I've been seeing this since Linux 4.7, maybe already 4.6. Now, I thought that maybe this was (indirectly) due to an overly full file system (~90% full), so I deleted some things I didn't need to get it up to 15% free. (For the record, I also tried mounting with ssd_spread.) After that, I ran a balance with -dusage=50, which started out promising, but then went back to the "bad" behaviour. *But* it seemed better than before overall, so I started a balance with -musage=10, then -musage=50. That turned out to be a mistake. Since I had to transport the laptop, and couldn't wait for "balance cancel" to return (IIUC it only returns after the next block (group?) is freed), I forced the laptop off. After I next turned on the laptop, the balance resumed, causing bootup to fail, after which I remembered about the skip_balance mount option, which I tried in a rescue shell from an initramfs. But wait, that failed, too! Specifically, the stack trace I get whenever I try it includes as one of the last lines: "RIP [] qgroup_fix_relocated_data_extents+0x1f/0x2a8" (I can take photos of the full stack trace if requested.) So then I ran "btrfs qgroup show /sysroot/", which showed many quota groups, much to my surprise. On the upside, at least now I discovered the likely reason for the performance problems. 
(I actually think I know why I'm seeing qgroups: at one point I was trying out various snapshot/backup tools for btrfs, and one (I forgot which) unconditionally activated quota support, which infuriated me, but I promptly deactivated it, or so I thought. Is quota support automatically enabled when qgroups are discovered, or did I perhaps not disable quota support properly?) Since I couldn't use skip_balance, and logically can't destroy qgroups on a read-only file system, I decided to wait for a regular mount to finish. That has been running since Tuesday, and I am slowly growing impatient. Thus I arrive at my question(s): is there anything else I can try, short of reformatting and restoring from backup? Can I use btrfs-check here, or any other tool? Or...? Also, should I be able to avoid reformatting: how do I properly disable quota support? (BTW, searching for qgroup_fix_relocated_data_extents turned up the ML thread "[PATCH] Btrfs: fix endless loop in balancing block groups", could that be related?) The laptop is currently running Gentoo with Linux 4.8.10 and btrfs-progs 4.8.4. Greetings -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
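[Editor's note] For later readers, the answer to the "how do I properly disable quota support" question, once the filesystem can be mounted read-write again, is a short sequence. A sketch; the device and mount point are placeholders:

```shell
mount -o skip_balance /dev/sdXn /mnt   # rw mount with balance resume suppressed
btrfs balance cancel /mnt              # make sure the balance never resumes
btrfs quota disable /mnt               # drops qgroup tracking for the whole fs
btrfs qgroup show /mnt                 # should no longer list any qgroups
```

There is no noquota mount option; 'btrfs quota disable' on a writable mount is the only supported way.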
Re: Input/output error on newly created file
On Friday 13 May 2016, Duncan <1i5t5.dun...@cox.net> wrote: >Szalma László posted on Thu, 12 May 2016 20:28:24 +0200 as excerpted: >> The files that rarely become unreadable (I/O error but no error in dmesg >> or anywhere) are mysql MyIsam database files, and they are always small. >> Like 16kbyte for example, or smaller. Sometimes dropping the fs cache >> fixes the problem, sometimes not. Umount / mount always fixes the >> problem. Scrub says the filesystem is OK (when the file is unreadable). > >Is it possible the files are always under 4 KiB? For the record, I was seeing a similar error with dovecot *.index.log files (see the ML thread started by Szalma László) . In my case they are *not* all under 4 KiB. Looking at some of the affected files, one of them is 25K, and another is 6.6K. However, perhaps they compress to under 4K? But compressing the 25K one with lzop only goes down to 5.6K with -9 :-/ . >While there's a few variables as to max size, with 4 KiB being the >practical max, btrfs will inline really small files into their metadata >node instead of writing them out as 4 KiB data block extents. Since >that's obviously a different code-path, if it's only 4 KiB and smaller >files, it's possible there's a race in that code-path that when lost, >results in the file "disappearing" without a dmesg trace, only to >reappear after reboot. > >Actually, now that I think about it, if the files are all OVER say 2 KiB >but still quite small, say under 16 KiB, and if they had just recently >grown, it's possible that the race is in the transition from the inline >metadata state to the multiples of 4 KiB block data extent state. > >And if all the files had just shrunk, say from compaction (if done in- >place, not with a copy and rename), perhaps it's the reverse, the >transition from written data blocks to inline metadata state. I'm glad somebody is (publicly) thinking about this :-) ! 
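[Editor's note] Whether a given small file actually sits inline in metadata can be checked directly, which would help narrow Duncan's hypothesis down. A sketch; the paths are examples:

```shell
# Create a file safely below the inline limit and flush it out:
echo "tiny test file" > /mnt/btrfs/tiny.txt
sync

# filefrag -v prints the extent layout via FIEMAP; an inlined file shows
# a single extent carrying the "inline" flag, while files above the
# inline limit show normal block-aligned extents instead:
filefrag -v /mnt/btrfs/tiny.txt
```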
Greetings
Re: random i/o error without error in dmesg
On Saturday 07 May 2016, Marc Joliet <mar...@gmx.de> wrote: >I'm thinking of filing >a bug report with dovecot; perhaps none of their devs test with btrfs? So I did this, and got a little bit of feedback. Quoting from the reply I got: "*.index.log files are always appended to using O_APPEND flag. Maybe this is relevant. Also when a new .log file is created it's opened without the O_APPEND flag and the O_APPEND is added later. This was causing a bug recently in unionfs, which ignored the flag change and caused log file corruption." The other bit of advice was to stress test dovecot using imaptest, but I'll have to do that when I have more time. Greetings
Re: compression disk space saving - what are your results?
On Sunday 06 December 2015 04:21:30 Duncan wrote: >Marc Joliet posted on Sat, 05 Dec 2015 15:11:51 +0100 as excerpted: >> I do think it's interesting that compression (even with LZO) seems to >> have offset the extra space wastage caused by autodefrag. > >I've seen (I think) you mention that twice now. Perhaps I'm missing >something... How does autodefrag trigger space wastage? > >What autodefrag does is watch for seriously fragmented files and queue >them up for later defrag by a worker thread. How would that waste space? > >Unless of course you're talking about breaking reflinks to existing >snapshots or other (possibly partial) copies of the file. That is in fact what I was referring to. >But I'd call >that wasting space due to the snapshots storing old copies, not due to >autodefrag keeping the current copy defragmented. And reflinks are >saving space by effectively storing parts of two files in the same >extent, not autodefrag wasting it, as the default on a normal filesystem >would be separate copies, so that's the zero-point base, Of course, the default on a normal file system is to not have any snapshots between which to reflink ;-) . Also, autodefrag is not a default mount option, so the default on BTRFS is to save space via reflinks, which is undone by defragmenting, hence why I see it as autodefrag triggering the waste of space. >and reflinks >save from it, with autodefrag therefore not changing things from the zero- >point base. No snapshots, no reflinks, autodefrag no longer "wastes" >space, so it's not autodefrag's wastage in the first place, it's the >other mechanisms' saving space. To my mind it is the keeping of snapshots and the breaking of reflinks via autodefrag that together cause space wastage. This is coming from the perspective that snapshots are *useful* and hence by themselves do not constitute wasted space. >From my viewpoint, anyway. I'd not ordinarily quibble over it one way or >the other if that's what you're referring to. 
But just in case you had >something else in mind that I'm not aware of, I'm posting the question. And the above is my viewpoint :-) . -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
Re: compression disk space saving - what are your results?
On Wednesday 02 December 2015 18:46:30 Tomasz Chmielewski wrote: >What are your disk space savings when using btrfs with compression? > >I have a 200 GB btrfs filesystem which uses compress=zlib, only stores >text files (logs), mostly multi-gigabyte files. > > >It's a "single" filesystem, so "df" output matches "btrfs fi df": > ># df -h >Filesystem Size Used Avail Use% Mounted on >(...) >/dev/xvdb 200G 124G 76G 62% /var/log/remote > > ># du -sh /var/log/remote/ >153G /var/log/remote/ > > > From these numbers (124 GB used where data size is 153 GB), it appears >that we save around 20% with zlib compression enabled. >Is 20% reasonable saving for zlib? Typically text compresses much better >with that algorithm, although I understand that we have several >limitations when applying that on a filesystem level. > > >Tomasz Chmielewski >http://wpkg.org I have a total of three file systems that use compression, on a desktop and a laptop. / on both uses compress=lzo, and my backup drive uses compress=zlib (my RAID1 FS does not use compression). My desktop looks like this: % df -h Filesystem Size Used Avail Use% Mounted on /dev/sda1 108G 79G 26G 76% / [...] For / I get a total of about 8G or at least 9% space saving: # du -hsc /mnt/rootfs/* 71G /mnt/rootfs/home 14G /mnt/rootfs/rootfs 2.3G /mnt/rootfs/var 87G total I write "at least" because this does not include snapshots. On my laptop the difference is merely 1 GB (83 vs. 84 GB), but it was using the autodefrag mount option until yesterday (when I migrated it to an SSD using dd), which probably accounts for a significant amount of wasted space. I'll see how it develops over the next two weeks, but I expect the ratio to become similar to my desktop (probably less, since there is also a lot of music on there). I would love to answer the question for my backup drive, but du took too long (> 1 h) so I stopped it :-( . I might try it again later, but no promises! 
Greetings
Re: compression disk space saving - what are your results?
On Saturday 05 December 2015 14:37:05 Marc Joliet wrote: >My desktop looks like this: > >% df -h >Filesystem Size Used Avail Use% Mounted on >/dev/sda1 108G 79G 26G 76% / >[...] > >For / I get a total of about 8G or at least 9% space saving: > ># du -hsc /mnt/rootfs/* >71G /mnt/rootfs/home >14G /mnt/rootfs/rootfs >2.3G /mnt/rootfs/var >87G total > >I write "at least" because this does not include snapshots. Just to be explicit, in case it was not clear, but I of course meant that the *du output* does not account for extra space used by snapshots. >On my laptop >the difference is merely 1 GB (83 vs. 84 GB), And here I also want to clarify that the df output was 84 GB, and the du output was 83 GB. Again, the du output does not account for snapshots, which go back farther on the laptop: 2 weeks of daily snapshots (with autodefrag!) instead of up to 2 days of bi-hourly snapshots. I do think it's interesting that compression (even with LZO) seems to have offset the extra space wastage caused by autodefrag. Greetings
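[Editor's note] Since du-vs-df comparisons are confounded by snapshots and reflinks, a more direct measurement exists: the compsize tool walks a btrfs tree and reports compressed versus uncompressed bytes per algorithm. A sketch; the path is an example, and the tool is packaged as btrfs-compsize on some distributions:

```shell
# Per-algorithm compression statistics for everything under the path:
compsize /mnt/rootfs
# The output lists, for each algorithm (none/zlib/lzo), the on-disk
# ("Disk Usage") size against the uncompressed ("Uncompressed") size,
# plus an overall ratio -- unaffected by how many snapshots share the data.
```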
Re: Using Btrfs on single drives
On Sunday 15 November 2015 04:01:57 Duncan wrote: >audio muze posted on Sun, 15 Nov 2015 05:27:00 +0200 as excerpted: >> I've gone ahead and created a single drive Btrfs filesystem on a 3TB >> drive and started copying content from a raid5 array to the Btrfs >> volume. Initially copy speeds were very good sustained at ~145MB/s and >> I left it to run overnight. This morning I ran btrfs fi usage >> /mnt/btrfs and it reported around 700GB free. I selected another folder >> containing 204GB and started a copy operation, again from the raid5 >> array to the Btrfs volume. Copying is now materially slower and slowing >> further...it started at ~105MB/s and after 141GB has slowed to around >> 97MB/s. Is this to be expected with Btrfs of have I come across a bug >> of some sort? > >That looks to /me/ like native drive limitations. > [Snip nice explanation] I'll just add that I see this with my 3TB USB3 HDD, too, but also with my internal HDDs. Old drives (the oldest I had were about 10 years old) also had this problem, only scaled appropriately (the worst was something like 40/60 GB/s min./max.). You can also see this very nicely with scrub runs (I use dstat for this): they start out at the max., but gradually slow down as they progress. HTH -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup signature.asc Description: This is a digitally signed message part.
Re: random i/o error without error in dmesg
On Wednesday 28 October 2015 05:21:13 Duncan wrote: >Marc Joliet posted on Tue, 27 Oct 2015 21:54:40 +0100 as excerpted: >>>IOW, does it take a full reboot to clear the problem, or is a simple >>>ro/rw mount cycle enough, or an unmount/remount? >>> >> Seems that a full reboot is needed, but I would expect that it would >> have the same effect if I were to pivot back into the initramfs, unmount >> / from there, >> then boot back into the system. Because quite frankly, I can't think of >> any reason why a power cycle to the SSD should make a difference here. >> I vaguely remember that systemd can do that, so I'll see if I can find >> out how. > >Agree with both the systemd returning to the initr* point (which I >actually had in mind while writing the above but don't remember the >details either, so chose to omit in the interest of limiting the size of >the reply and research necessary to generate it), and the ssd power-cycle >point. I haven't found any single command that lets you do that, but I can try one of the special targets as detailed in bootup(7) (e.g., initrd.target) when I have a chance. >>>Finally, assuming root itself isn't btrfs, if you have btrfs configured >>>as a module, you could try unmounting all btrfs and then unloading the >>>module, then reloading and remounting. That should entirely clear all >>>in-memory btrfs state, so if that doesn't solve the problem, while >>>rebooting does, then the problem's very possibly outside of btrfs scope. >>> >>> Of course if root is btrfs, you can't really check that. >> >> Nope, btrfs is built-in (though it doesn't have to be, what with me >> using an initramfs). > >Same here, also gentoo as I guess you know from previous exchanges. 
>But unfortunately, if your initr* is anything like mine, and your kernel
>monolithic as mine, making btrfs a module with a btrfs root isn't the easy
>thing it might seem to those who run ordinary distro-supplied binary kernels
>with pretty much everything modularized, as doing so involves a whole new set
>of research on how to get that module properly included in the initr* and
>loaded there, as well as installing and building the whole module-handling
>infrastructure (modprobe and friends) again, as it's not actually installed
>on the system at all at this point, because with the kernel entirely
>monolithic, module-handling tools are unnecessary and thus just another
>unnecessary package to have to keep building updates for, if they remain
>installed.

My kernel is fairly modular, and I use dracut to make my initramfs, so I wouldn't be surprised if it works. For me, personally, I just don't see any point in making btrfs a module. (And yes, of course I know you run Gentoo ;-) .)

>So I definitely sympathize with the feeling that such a stone is better left
>unturned, if overturning it is at all a possibility that can be avoided, as
>it is here, this whole exercise being simply one of better pinning the bug
>down, not yet actually trying to solve it. And given that unturned stone,
>there are certainly easier ways.
>
>And one of those easier ways is investigating that whole systemd return to
>initr* idea, since we both remember reading something about it, but aren't
>familiar with the details. In addition to addressing the problem head-on if
>anyone offers a way to do so, that's the path I'd be looking at right now.

Like I said above, I'll try it out when I have a moment where I have a more "steady hand", so to speak.
[snip deleted files stuff]

>app-admin/lib_users

[snip the rest of the deleted files stuff]

I use that to find processes that need restarting after upgrades, though I'll sometimes check to see if it's really a library that's causing it to show up, since often a process is listed because of stuff like the font cache, or, in the case of the FISH shell, its own history file.

But yeah, I didn't think of running that; then again, in rescue mode there were at most a dozen processes running, so there's not much to choose from, anyway. I did have to kill two remaining user processes first (pulseaudio and... I forgot the other one). I didn't try the same with / and /var because I was eager to get back to a normally running system ;-) .

>Of course, if lib_users reports nothing further still holding references
>to deleted files, and a remount read-only STILL fails, that's a major
>note of trouble and an important finding in itself.

I don't expect that, but I'll make note of it if I encounter it.

>Meanwhile, as explained in the systemd docs (specifically the systemd for
>administrators series, IIRC), systemd dropping back to the initr* is
>actually its way of automatically doing effectively the same thing we
>were using l
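For anyone without lib_users installed, the core of what such a tool does, scanning /proc for processes that still map files deleted on disk, can be approximated in a few lines. This is an illustrative sketch of my own, not the real lib_users, which filters much more carefully (e.g. to skip exactly the font-cache and history-file cases mentioned above):

```python
"""List processes still mapping '(deleted)' files, lib_users-style sketch."""

import os

def deleted_mappings():
    """Return {pid: [paths]} of '(deleted)' file mappings per process."""
    if not os.path.isdir("/proc"):
        return {}  # non-Linux or /proc not mounted
    result = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/maps") as maps:
                # maps fields: address perms offset dev inode pathname
                paths = sorted({line.split(None, 5)[5].strip()
                                for line in maps
                                if line.rstrip().endswith("(deleted)")})
        except (PermissionError, FileNotFoundError, ProcessLookupError):
            continue  # process went away, or not ours to inspect
        if paths:
            result[int(pid)] = paths
    return result

if __name__ == "__main__":
    for pid, paths in sorted(deleted_mappings().items()):
        print(pid, *paths, sep="\n    ")
```

Run as root to see all processes; unprivileged it only covers your own.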
Re: random i/o error without error in dmesg
On Tuesday 27 October 2015 06:23:06 Duncan wrote: >Marc Joliet posted on Mon, 26 Oct 2015 15:23:39 +0100 as excerpted: >> Occasionally they go away by themselves, but usually I have to reboot to >> make them go away. This happens when getmail attempts to fetch mail, >> which fails due to the above error. After the reboot getmail succeeds >> again. > >Just out of curiosity, does a remount,ro, followed by a remount,rw, solve >the problem? > >The ro/rw cycle should force anything in memory to device, so if that >eliminates the problem, it could well be some sort of sync issue. If it >doesn't, then it's more likely an in-memory filesystem state issue, >that's cleared by the reboot. > >And if the ro/rw cycle doesn't clear the problem, what about a full >unmount/mount cycle, at least of that subvolume? > >If you're running multiple subvolumes with root being one of them, you >can't of course unmount the entire filesystem, but you could go down to >emergency mode (systemctl emergency), try unmounting everything but /, >mounting / ro, and then switching back to normal mode (from emergency >mode, simply exiting should return you to normal multi-user or gui >target, remounting filesystems as necessary, etc). > >IOW, does it take a full reboot to clear the problem, or is a simple ro/rw >mount cycle enough, or an unmount/remount? > >Finally, assuming root itself isn't btrfs, if you have btrfs configured >as a module, you could try unmounting all btrfs and then unloading the >module, then reloading and remounting. That should entirely clear all in- >memory btrfs state, so if that doesn't solve the problem, while rebooting >does, then the problem's very possibly outside of btrfs scope. Of course >if root is btrfs, you can't really check that. Thanks for the hints. I just upgraded to gentoo-sources 4.1.11 and will see if that changes anything. If not, I'll have to try unmounting /home from emergency mode (it's a subvolume mount). 
-- 
Marc Joliet
Re: random i/o error without error in dmesg
Hi

FWIW, this sounds like what I've been seeing with dovecot. In case it's relevant, I'll try to explain.

After some uptime, I'll see log messages like this:

Okt 26 12:05:46 thetick dovecot[467]: imap(marcec): Error: pread() failed with file /home/marcec/.mdbox/mailboxes/BTRFS/dbox-Mails/dovecot.index.log: Input/output error

Occasionally they go away by themselves, but usually I have to reboot to make them go away. This happens when getmail attempts to fetch mail, which fails due to the above error. After the reboot getmail succeeds again.

As in Szalma's case, btrfs-scrub never reports anything wrong.

I use LZO compression on the relevant file system, so I wanted to wait until kernel 4.1.11 before reporting this, but that hasn't hit Gentoo yet (and neither has 4.1.10, for some reason). I don't use quotas.

According to what I see in the systemd journal, the errors started on 2015-06-01 with kernel 3.19.8. Note that, strangely enough, I had been using that same version since 2015-05-23, so for more than a week before the error cropped up.
I checked whether I made any changes to the configuration, and found this:

diff --git a/kernels/kernel-config-3.19.8-gentoo b/kernels/kernel-config-3.19.8-gentoo
index b061b31..8cf8eba 100644
--- a/kernels/kernel-config-3.19.8-gentoo
+++ b/kernels/kernel-config-3.19.8-gentoo
@@ -64,7 +64,7 @@ CONFIG_INIT_ENV_ARG_LIMIT=32
 CONFIG_CROSS_COMPILE=""
 # CONFIG_COMPILE_TEST is not set
 CONFIG_LOCALVERSION=""
-CONFIG_LOCALVERSION_AUTO=y
+# CONFIG_LOCALVERSION_AUTO is not set
 CONFIG_HAVE_KERNEL_GZIP=y
 CONFIG_HAVE_KERNEL_BZIP2=y
 CONFIG_HAVE_KERNEL_LZMA=y
@@ -73,8 +73,8 @@ CONFIG_HAVE_KERNEL_LZO=y
 CONFIG_HAVE_KERNEL_LZ4=y
 # CONFIG_KERNEL_GZIP is not set
 # CONFIG_KERNEL_BZIP2 is not set
-CONFIG_KERNEL_LZMA=y
-# CONFIG_KERNEL_XZ is not set
+# CONFIG_KERNEL_LZMA is not set
+CONFIG_KERNEL_XZ=y
 # CONFIG_KERNEL_LZO is not set
 # CONFIG_KERNEL_LZ4 is not set
 CONFIG_DEFAULT_HOSTNAME="(none)"
@@ -132,7 +132,7 @@ CONFIG_TICK_CPU_ACCOUNTING=y
 # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
 # CONFIG_IRQ_TIME_ACCOUNTING is not set
 CONFIG_BSD_PROCESS_ACCT=y
-# CONFIG_BSD_PROCESS_ACCT_V3 is not set
+CONFIG_BSD_PROCESS_ACCT_V3=y
 CONFIG_TASKSTATS=y
 CONFIG_TASK_DELAY_ACCT=y
 CONFIG_TASK_XACCT=y

The only change I can think of that might affect anything is CONFIG_BSD_PROCESS_ACCT_V3=y (I don't remember why exactly I set it). I can try without it set, but maybe the kernel configuration is a red herring?
Anyway, the current state of the system is:

# uname -r
4.1.9-gentoo-r1

# btrfs filesystem show /
Label: 'MARCEC_ROOT'  uuid: 0267d8b3-a074-460a-832d-5d5fd36bae64
        Total devices 1 FS bytes used 74.40GiB
        devid    1 size 107.79GiB used 105.97GiB path /dev/sda1

btrfs-progs v4.2.2

# btrfs filesystem df /
Data, single: total=98.94GiB, used=72.30GiB
System, single: total=32.00MiB, used=20.00KiB
Metadata, single: total=7.00GiB, used=2.10GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

The filesystem is mounted as (leaving out subvolume mounts, which use the same mount options):

/dev/sda1 on / type btrfs (rw,noatime,compress=lzo,ssd,discard,space_cache)

Greetings,
-- 
Marc Joliet
[SOLVED] Re: Deleted files cause btrfs-send to fail
On Fri, 14 Aug 2015 23:37:37 +0200, Marc Joliet mar...@gmx.de wrote:

> If I notice anything amiss, I'll report back.

I haven't noticed anything amiss, so I'm marking this thread as SOLVED.

-- 
Marc Joliet
Re: Deleted files cause btrfs-send to fail
On Sat, 15 Aug 2015 05:10:57 + (UTC), Duncan 1i5t5.dun...@cox.net wrote:

Marc Joliet posted on Fri, 14 Aug 2015 23:37:37 +0200 as excerpted:

(One other thing I found interesting was that btrfs scrub didn't care about the link count errors.)

A lot of people are confused about exactly what btrfs scrub does, and expect it to detect and possibly fix stuff it has nothing to do with. It's *not* an fsck.

Scrub does one very useful, but limited, thing. It systematically verifies that the computed checksums for all data and metadata covered by checksums match the corresponding recorded checksums. For dup/raid1/raid10 modes, if there's a match failure, it will look up the other copy and see if it matches, replacing the invalid block with a new copy of the other one, assuming it's valid. For raid56 modes, it attempts to compute the valid copy from parity and, again assuming a match after doing so, does the replace. If a valid copy cannot be found or computed, either because it's damaged too or because there's no second copy or parity to fall back on (single and raid0 modes), then scrub will detect but cannot correct the error.

In routine usage, btrfs automatically does the same thing if it happens to come across checksum errors in its normal IO stream, but it has to come across them first. Scrub's benefit is that it systematically verifies (and corrects errors where it can) checksums on the entire filesystem, not just the parts that happen to appear in the normal IO stream.

I know all that, I just thought it was interesting and wanted to remark as such. After thinking about it a bit, of course, it makes perfect sense and is not very interesting at all: scrub will just verify that the checksums match, no matter whether the underlying (meta)data is valid or not.

Such checksum errors can be for a few reasons... I have one ssd that's gradually failing and returns checksum errors fairly regularly. Were I using a normal filesystem I'd have had to replace it some time ago.
But with btrfs in raid1 mode and regular scrubs (and backups, should they be needed; sometimes I let them get a bit stale, but I do have them and am prepared to live with the stale restored data if I have to), I've been able to keep using the failing device. When the scrubs hit errors and btrfs does the rewrite from the good copy, a block relocation on the failing device is triggered as well, with the bad block taken out of service and a new one from the set of spares all modern devices have taking its place.

Currently, smartctl -A reports 904 reallocated sectors raw value, with a standardized value of 92. Before the first reallocated sector, the standardized value was 253, perfect. With the first reallocated sector, it immediately dropped to 100, apparently the rounded percentage of spare sectors left. It has gradually dropped since then to its current 92, with a threshold value of 36. So while it's gradually failing, there's still plenty of spare sectors left. Normally I would have replaced the device even so, but I've never actually had the opportunity to watch a slow failure continue to get worse over time, and now that I do I'm a bit curious how things will go, so I'm just letting it happen, tho I do have a replacement device already purchased and ready, when the time comes.

I'm curious how that will pan out. My experience with HDDs is that at some point the sector reallocations start picking up at a somewhat constant (maybe even accelerating) rate. I wonder how SSDs behave in this regard.

So real media failure, bitrot, is one reason for bad checksums. The data read back from the device simply isn't the same data that was stored to it, and the checksum fails as a result. Of course bad connector cables or storage chipset firmware or hardware is another hardware cause.
Sudden reboot or power loss, with data being actively written and one copy either already updated or not yet touched, while the other is actually being written at the time of the crash so the write isn't completed, is yet another reason for checksum failure. This one is actually why a scrub can appear to do so much more than it does, because where there's a second copy (or parity) of the data available, scrub can use it to recover the partially written copy (which being partially written fails its checksum verification) to either the completed write state, if the other copy was already written, or the pre-write state, if the other copy hadn't been written at all, yet. In this way the result is often the same one an fsck would normally produce, detecting and fixing the error, but the mechanism is entirely different -- it only detected and fixed the error because the checksum was bad and it had a good copy it could replace it with, not because it had any smarts about how the filesystem actually worked, and could
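Duncan's description of scrub's verify-and-repair loop can be condensed into a toy model. This is purely illustrative (my own sketch, not btrfs code): btrfs actually uses crc32c over its own on-disk structures, while plain crc32 stands in here, and "mirror" plays the role of the dup/raid1 second copy.

```python
"""Toy model of scrub for dup/raid1 data: verify each block's checksum
against the recorded one; on mismatch, repair from the mirror copy if
that copy verifies, otherwise count the error as uncorrectable."""

import zlib

def scrub(blocks, checksums, mirror):
    """blocks/mirror: mutable lists of byte blocks; checksums: recorded
    checksum per block. Returns (corrected, uncorrectable) counts."""
    corrected, uncorrectable = 0, 0
    for i, block in enumerate(blocks):
        if zlib.crc32(block) == checksums[i]:
            continue                   # checksum matches: nothing to do
        if zlib.crc32(mirror[i]) == checksums[i]:
            blocks[i] = mirror[i]      # rewrite from the good copy
            corrected += 1
        else:
            uncorrectable += 1         # both copies bad: detect only
    return corrected, uncorrectable
```

Note that, exactly as described above, this "repair" knows nothing about filesystem semantics: a block whose checksum matches is accepted even if its contents are semantically wrong, which is why scrub ignored the link-count errors.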
Re: trim not working and irreparable errors from btrfsck
On Fri, 14 Aug 2015 10:05:55 +0200, Marc Joliet mar...@gmx.de wrote:

> (I mean, that's part of being a user of btrfs at this stage)

I meant *being prepared* to file a bug report, not that one constantly has to file bug reports :) .

-- 
Marc Joliet
Re: trim not working and irreparable errors from btrfsck
On Thu, 13 Aug 2015 17:14:36 -0600, Chris Murphy li...@colorremedies.com wrote:

On Thu, Aug 13, 2015 at 3:23 AM, Marc Joliet mar...@gmx.de wrote:

Speaking as a user, since fstrim -av still always outputs 0 bytes trimmed on my system: what's the status of this? Did anybody ever file a bug report?

Since I'm not having this problem with my SSD, I'm not in a position to provide any meaningful information for such a report. The bug report should state whether this problem is reproducible with ext4 and XFS on the same device, and the complete details of the stacking (if this is not the full device or a partition of it; e.g. if LVM, md, or encryption is between fs and physical device). And also the bug should include a full dmesg as an attachment, and an strace of the fstrim command that results in 0 bytes trimmed. And probably separate bugs for each make/model of SSD, with the bug including make/model and firmware version.

Right now I think there's no status because a.) no bug report and b.) not enough information.

I was mainly asking because apparently there *is* a patch that helps some people affected by this, but nobody ever commented on it. Perhaps there's a reason for that, but I found it curious. (I see now that it was submitted in early January, in the thread "[PATCH V2] Btrfs: really fix trim 0 bytes after a device delete".)

I can open a bug (I mean, that's part of being a user of btrfs at this stage), I'm just surprised that nobody else has.

BTW, is there a way to tell if the discard mount option does anything? I'm curious about whether it could behave differently.

-- 
Marc Joliet
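One prerequisite that can at least be checked from userspace, regardless of filesystem: whether the device (and everything stacked on top of it) advertises discard support at all. The kernel exposes this in sysfs; a zero discard_max_bytes means no discard requests will get through. This is an illustrative sketch of my own, not a substitute for tracing fstrim itself:

```python
"""Check sysfs discard parameters for block devices. A device can only
be trimmed if discard_max_bytes is nonzero; zero means the device (or
an intermediate layer) does not pass discards through."""

from pathlib import Path

def discard_support(dev):
    """dev is a bare block device name like 'sda'. Returns the sysfs
    discard parameters; raises FileNotFoundError if they are absent."""
    q = Path("/sys/block") / dev / "queue"
    return {name: int((q / name).read_text())
            for name in ("discard_granularity", "discard_max_bytes")}

if __name__ == "__main__" and Path("/sys/block").is_dir():
    for dev in sorted(p.name for p in Path("/sys/block").iterdir()):
        try:
            info = discard_support(dev)
        except (FileNotFoundError, PermissionError):
            continue
        state = "supports" if info["discard_max_bytes"] > 0 else "no"
        print(f"{dev}: {state} discard {info}")
```

If the device itself reports support but fstrim still trims 0 bytes, the problem is above the block layer, which points back at the btrfs fstrim bug discussed in this thread.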
Re: Deleted files cause btrfs-send to fail
Am Thu, 13 Aug 2015 10:54:58 +0200 schrieb Marc Joliet mar...@gmx.de: Am Thu, 13 Aug 2015 08:29:19 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Marc Joliet posted on Thu, 13 Aug 2015 09:05:41 +0200 as excerpted: Here's the actual output now, obtained via btrfs-progs 4.0.1 from an initramfs emergency shell: checking extents checking free space cache checking fs roots root 5 inode 8338813 errors 2000, link count wrong unresolved ref dir 26699 index 50500 namelen 4 name root filetype 0 errors 3, no dir item, no dir index root 5 inode 8338814 errors 2000, link count wrong unresolved ref dir 26699 index 50502 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index root 5 inode 8338815 errors 2000, link count wrong unresolved ref dir 26699 index 50504 namelen 6 name systab filetype 0 errors 3, no dir item, no dir index root 5 inode 8710030 errors 2000, link count wrong unresolved ref dir 26699 index 59588 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index root 5 inode 8710031 errors 2000, link count wrong unresolved ref dir 26699 index 59590 namelen 4 name root filetype 0 errors 3, no dir item, no dir index Checking filesystem on /dev/sda1 UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64 found 63467610172 bytes used err is 1 total csum bytes: 59475016 total tree bytes: 1903411200 total fs tree bytes: 1691504640 total extent tree bytes: 130322432 btree space waste bytes: 442495212 file data blocks allocated: 555097092096 referenced 72887840768 btrfs-progs v4.0.1 Again: is this fixable? FWIW, root 5 (which you asked about upthread) is the main filesystem root. So all these appear to be on the main filesystem, not on snapshots/ subvolumes. [...] But if it's critical, you may wish to wait and have someone else confirm that before acting on it, just in case I have it wrong. I can wait until tonight, at least. The FS still mounts, and it's just the root subvolume that's affected; running btrfs-send on the /home subvolume still works. 
Well, I got impatient, and just went ahead and did it (I have backups, after all). It looks like it worked: the affected files were moved to /lost+found/, where I deleted them again, and btrfs-send works again.

The output of btrfs check after --repair:

checking extents
checking free space cache
checking fs roots
checking csums
There are no extents for csum range 0-69632
Csum exists for 0-69632 but there is no extent record
Checking filesystem on /dev/sda1
UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64
block group 274307481600 has wrong amount of free space
failed to load free space cache for block group 274307481600
found 60980420666 bytes used err is 1
total csum bytes: 57521732
total tree bytes: 199680
total fs tree bytes: 1791721472
total extent tree bytes: 127942656
btree space waste bytes: 460072661
file data blocks allocated: 478650343424
 referenced 73326161920

btrfs-progs v4.1.2

If I notice anything amiss, I'll report back.

(One other thing I found interesting was that btrfs scrub didn't care about the link count errors.)

Greetings.
-- 
Marc Joliet
Re: Deleted files cause btrfs-send to fail
On Thu, 13 Aug 2015 00:34:19 +0200, Marc Joliet mar...@gmx.de wrote:

[...]

Since this is the root file system, I haven't gotten a copy of the actual output of btrfs check, though I have run it from an initramfs rescue shell. The output I saw there was much like the following (taken from an Email by Roman Mamedov from 2014-12-28):

root 22730 inode 6236418 errors 2000, link count wrong
        unresolved ref dir 105512 index 586340 namelen 48 name [redacted].dat.bak filetype 0 errors 3, no dir item, no dir index

Only in my case, it's root 5 and root 4 (I think), and the file names (and other file system specifics) are of course different. I definitely saw "errors 2000" (I take it that's supposed to be an error code?).

[...]

Here's the actual output now, obtained via btrfs-progs 4.0.1 from an initramfs emergency shell:

checking extents
checking free space cache
checking fs roots
root 5 inode 8338813 errors 2000, link count wrong
        unresolved ref dir 26699 index 50500 namelen 4 name root filetype 0 errors 3, no dir item, no dir index
root 5 inode 8338814 errors 2000, link count wrong
        unresolved ref dir 26699 index 50502 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index
root 5 inode 8338815 errors 2000, link count wrong
        unresolved ref dir 26699 index 50504 namelen 6 name systab filetype 0 errors 3, no dir item, no dir index
root 5 inode 8710030 errors 2000, link count wrong
        unresolved ref dir 26699 index 59588 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index
root 5 inode 8710031 errors 2000, link count wrong
        unresolved ref dir 26699 index 59590 namelen 4 name root filetype 0 errors 3, no dir item, no dir index
Checking filesystem on /dev/sda1
UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64
found 63467610172 bytes used err is 1
total csum bytes: 59475016
total tree bytes: 1903411200
total fs tree bytes: 1691504640
total extent tree bytes: 130322432
btree space waste bytes: 442495212
file data blocks allocated: 555097092096
 referenced 72887840768
btrfs-progs v4.0.1

Again: is this fixable?

-- 
Marc Joliet
Re: Deleted files cause btrfs-send to fail
Am Thu, 13 Aug 2015 08:29:19 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Marc Joliet posted on Thu, 13 Aug 2015 09:05:41 +0200 as excerpted: Here's the actual output now, obtained via btrfs-progs 4.0.1 from an initramfs emergency shell: checking extents checking free space cache checking fs roots root 5 inode 8338813 errors 2000, link count wrong unresolved ref dir 26699 index 50500 namelen 4 name root filetype 0 errors 3, no dir item, no dir index root 5 inode 8338814 errors 2000, link count wrong unresolved ref dir 26699 index 50502 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index root 5 inode 8338815 errors 2000, link count wrong unresolved ref dir 26699 index 50504 namelen 6 name systab filetype 0 errors 3, no dir item, no dir index root 5 inode 8710030 errors 2000, link count wrong unresolved ref dir 26699 index 59588 namelen 6 name marcec filetype 0 errors 3, no dir item, no dir index root 5 inode 8710031 errors 2000, link count wrong unresolved ref dir 26699 index 59590 namelen 4 name root filetype 0 errors 3, no dir item, no dir index Checking filesystem on /dev/sda1 UUID: 0267d8b3-a074-460a-832d-5d5fd36bae64 found 63467610172 bytes used err is 1 total csum bytes: 59475016 total tree bytes: 1903411200 total fs tree bytes: 1691504640 total extent tree bytes: 130322432 btree space waste bytes: 442495212 file data blocks allocated: 555097092096 referenced 72887840768 btrfs-progs v4.0.1 Again: is this fixable? FWIW, root 5 (which you asked about upthread) is the main filesystem root. So all these appear to be on the main filesystem, not on snapshots/ subvolumes. OK As for the problem itself, noting that I'm not a dev, just a user/admin following the list, I believe... There was a recent bug (early 4.0 or 4.1, IDR which) that (as I recall understanding it) would fail to decrement link count and would thus leave unnamed inodes hanging around in directories with no way to delete them. That looks very much like what you're seeing. 
Now that you mention it, I think I remember seeing that patch (series?). The bug has indeed been fixed in current, and a current btrfs check should fix it, but I don't believe that v4.0.1 userspace from the initramfs is new enough to have that fix. The 4.1.2 userspace on your main system (from the first post) is current and should fix it, I believe, however. I have updated the initramfs in the meantime. (Funny: I *just* started using one, mainly to be able to use btrfstune on /, but now I have a genuine necessity for it.) But if it's critical, you may wish to wait and have someone else confirm that before acting on it, just in case I have it wrong. I can wait until tonight, at least. The FS still mounts, and it's just the root subvolume that's affected; running btrfs-send on the /home subvolume still works. Greetings -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup pgpXLNONJFRwt.pgp Description: Digitale Signatur von OpenPGP
Re: trim not working and irreparable errors from btrfsck
On Sun, 21 Jun 2015 07:21:03 +, Paul Jones p...@pauljones.id.au wrote:

-----Original Message-----
From: Lutz Euler [mailto:lutz.eu...@freenet.de]
Sent: Sunday, 21 June 2015 12:11 AM
To: Christian; Paul Jones; Austin S Hemmelgarn
Cc: linux-btrfs@vger.kernel.org
Subject: RE: trim not working and irreparable errors from btrfsck

Hi Christian, Paul and Austin,

Christian wrote:

However, fstrim still gives me 0 B (0 bytes) trimmed, so that may be another problem. Is there a way to check if trim works?

Paul wrote:

I've got the same problem. I've got 2 SSDs with 2 partitions in RAID1; fstrim always works on the 2nd partition but not the first. There are no errors on either filesystem that I know of, but the first one is root, so I can't take it offline to run btrfs check.

Austin wrote:

I'm seeing the same issue here, but with a Crucial brand SSD. Somewhat interestingly, I don't see any issues like this with BTRFS on top of LVM's thin-provisioning volumes, or with any other filesystems, so I think it has something to do with how BTRFS is reporting unused space or how it is submitting the discard requests.

Probably you all suffer from the same problem I had a few years ago. It is a bug in how btrfs implements fstrim. To check whether you are a victim of this bug, simply run:

# btrfs-debug-tree /dev/whatever | grep 'FIRST_CHUNK_TREE CHUNK_ITEM'

where /dev/whatever is a device of your filesystem, and interrupt after the first several output lines with C-c. (Officially the filesystem should be unmounted when running btrfs-debug-tree, but that is not necessary, as we only read from it and the relevant data doesn't change very often.)

You get something like:

item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 12947816448)
item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 14021558272)
...

(This output is from an old version of btrfs-progs. I understand newer versions are more verbose, but you should nevertheless easily be able to interpret the output.)
If the first number different from 0 (here, the 12947816448) is larger than the sum of the sizes of the devices the filesystem consists of, bingo.

This has been discussed already in the past and there is a patch. Please see for the patch:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg40618.html

and for the background:

http://comments.gmane.org/gmane.comp.file-systems.btrfs/15597

Kind regards,
Lutz Euler

I tried the test and the numbers I was getting seemed reasonable; however, I went ahead and applied the patch anyway. Trim now works correctly!

Thanks,
Paul.

Speaking as a user, since "fstrim -av" still always outputs "0 bytes trimmed" on my system: what's the status of this? Did anybody ever file a bug report? There was also that other thread, "fstrim not working on one of three BTRFS filesystems", that also never went anywhere.

I take it from this that my SSD has been running untrimmed for quite a while now? (FWIW, queued trim is blocked by my kernel (it's forced_unqueued), but fstrim should still start an unqueued trim, right?)
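Lutz's test can also be scripted. The sketch below is my own, assuming the old btrfs-debug-tree output format quoted in this thread (newer btrfs-progs would need an adjusted pattern); the condition is the one he describes: the first nonzero chunk offset lying beyond the summed device sizes.

```python
"""Detect the fstrim-trims-0-bytes bug condition from btrfs-debug-tree
output: lowest nonzero CHUNK_ITEM offset > total size of all devices."""

import re

def has_trim_bug(debug_tree_lines, total_device_bytes):
    """debug_tree_lines: iterable of btrfs-debug-tree output lines.
    Returns True if the first nonzero chunk offset lies beyond the
    total device size, i.e. the bug condition described above."""
    pat = re.compile(r"FIRST_CHUNK_TREE CHUNK_ITEM (\d+)")
    offsets = sorted(int(m.group(1))
                     for line in debug_tree_lines
                     for m in [pat.search(line)] if m)
    nonzero = [o for o in offsets if o > 0]
    return bool(nonzero) and nonzero[0] > total_device_bytes

# The offsets quoted in the thread, against a hypothetical device total:
sample = [
    "item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)",
    "item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 12947816448)",
    "item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 14021558272)",
]
```

For the quoted sample, a filesystem on a single 10 GB device would trip the check (12947816448 > 10^10), while a 2 TB total would not.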
# uname -a
Linux thetick 4.1.4-gentoo #1 SMP PREEMPT Tue Aug 4 21:58:41 CEST 2015 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux

# btrfs --version
btrfs-progs v4.1.2

# btrfs filesystem show
Label: 'MARCEC_ROOT'  uuid: 0267d8b3-a074-460a-832d-5d5fd36bae64
        Total devices 1 FS bytes used 56.59GiB
        devid    1 size 107.79GiB used 69.03GiB path /dev/sda1

Label: 'MARCEC_STORAGE'  uuid: 472c9290-3ff2-4096-9c47-0612d3a52cef
        Total devices 2 FS bytes used 597.75GiB
        devid    1 size 931.51GiB used 600.03GiB path /dev/sdc
        devid    2 size 931.51GiB used 600.03GiB path /dev/sdb

Label: 'MARCEC_BACKUP'  uuid: f97b3cda-15e8-418b-bb9b-235391ef2a38
        Total devices 1 FS bytes used 807.59GiB
        devid    1 size 976.56GiB used 837.06GiB path /dev/sdd2

btrfs-progs v4.1.2

# btrfs filesystem df /
Data, single: total=65.00GiB, used=54.83GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=4.00GiB, used=1.76GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Greetings
-- 
Marc Joliet
Deleted files cause btrfs-send to fail
Hi all,

Starting today I have an interesting problem: I deleted some files (old fcrontabs), which now persistently causes btrfs-send to fail. The error message I get is:

Aug 12 23:32:24 thetick make_backups.sh[1059]: ERROR: send ioctl failed with -2: No such file or directory
Aug 12 23:32:25 thetick make_backups.sh[1059]: ERROR: unexpected EOF in stream.

There is nothing in the dmesg output.

Since this is the root file system, I haven't gotten a copy of the actual output of btrfs check, though I have run it from an initramfs rescue shell. The output I saw there was much like the following (taken from an Email by Roman Mamedov from 2014-12-28):

root 22730 inode 6236418 errors 2000, link count wrong
        unresolved ref dir 105512 index 586340 namelen 48 name [redacted].dat.bak filetype 0 errors 3, no dir item, no dir index

Only in my case, it's root 5 and root 4 (I think), and the file names (and other file system specifics) are of course different. I definitely saw "errors 2000" (I take it that's supposed to be an error code?).

Is this something that btrfs check --repair (or something else) can safely fix?
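For reference, the "-2" in the send error above is a negated errno value, which is how kernel ioctls report failure; decoding it confirms it matches the "No such file or directory" text in the log:

```python
"""Decode the -2 returned by the send ioctl: negated errno == ENOENT."""

import errno
import os

ioctl_ret = -2                 # as logged by make_backups.sh
err = -ioctl_ret               # kernel ioctls return -errno on failure
assert err == errno.ENOENT     # errno 2
print(os.strerror(err))        # the same text as in the log message
```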
# uname -a
Linux thetick 4.1.4-gentoo #1 SMP PREEMPT Tue Aug 4 21:58:41 CEST 2015 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux

# btrfs --version
btrfs-progs v4.1.2

# btrfs filesystem show
Label: 'MARCEC_ROOT'  uuid: 0267d8b3-a074-460a-832d-5d5fd36bae64
        Total devices 1 FS bytes used 59.30GiB
        devid    1 size 107.79GiB used 74.03GiB path /dev/sda1

Label: 'MARCEC_STORAGE'  uuid: 472c9290-3ff2-4096-9c47-0612d3a52cef
        Total devices 2 FS bytes used 597.75GiB
        devid    1 size 931.51GiB used 600.03GiB path /dev/sdc
        devid    2 size 931.51GiB used 600.03GiB path /dev/sdb

Label: 'MARCEC_BACKUP'  uuid: f97b3cda-15e8-418b-bb9b-235391ef2a38
        Total devices 1 FS bytes used 810.35GiB
        devid    1 size 976.56GiB used 837.06GiB path /dev/sdd2

btrfs-progs v4.1.2

# btrfs filesystem df /
Data, single: total=70.00GiB, used=57.53GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=4.00GiB, used=1.77GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Greetings
-- 
Marc Joliet
Re: Wiki suggestions
On Mon, 13 Jul 2015 06:56:17 + (UTC), Duncan 1i5t5.dun...@cox.net wrote:

Marc Joliet posted on Sun, 12 Jul 2015 14:26:04 +0200 as excerpted:

I hope it's not out of place, but I have a few suggestions for the Wiki:

Just in case it wasn't obvious... The wiki is open to user editing. You can, if you like, get an account and make the changes yourself. =:^)

Of course, it's understandable if your reaction to web and wiki technologies is similar to mine; newsgroups and mailing lists (in my case via gmane.org's list2news service, so they too are presented as newsgroups) are your primary domain, and you tend to treat the web as read-only, so rarely reply on a web forum, let alone edit a wiki. I've never gotten a wiki account here for that reason, either, or I'd have probably gone ahead and made the suggested changes... But with a bit of luck someone with an existing (or even new) account will be along to make the changes...

It's partially a read-only habit, but it's also that I'm just not confident in deciding whether those actually *are* good suggestions, or put differently: it's the public face of btrfs, and I don't want to accidentally do something to ruin it (to use some hyperbole). However, if somebody gives me the go-ahead, I might just edit the wiki myself (though I don't know enough to be able to edit the kernel news entry ;-) ).

-- 
Marc Joliet
Re: Wiki suggestions
On Mon, 13 Jul 2015 19:21:54 +0200, Marc Joliet mar...@gmx.de wrote: OK, I'll make the changes then (sans kernel log). Just a heads up: I accepted the terms of service, but the link goes to a non-existent wiki page. -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: Wiki suggestions
Am Mon, 13 Jul 2015 18:30:09 +0200 schrieb David Sterba dste...@suse.com: On Mon, Jul 13, 2015 at 01:18:27PM +0200, Marc Joliet wrote: Am Mon, 13 Jul 2015 06:56:17 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Marc Joliet posted on Sun, 12 Jul 2015 14:26:04 +0200 as excerpted: I hope it's not out of place, but I have a few suggestions for the Wiki: Just in case it wasn't obvious... The wiki is open to user editing. You can, if you like, get an account and make the changes yourself. =:^) Of course, it's understandable if your reaction to web and wiki technologies is similar to mine, newsgroups and mailing lists (in my case via gmane.org's list2news service, so they too are presented as newsgroups) are your primary domain, and you tend to treat the web as read-only so rarely reply on a web forum, let alone edit a wiki. I've never gotten a wiki account here for that reason, either, or I'd have probably gone ahead and made the suggested changes... But with a bit of luck someone with an existing (or even new) account will be along to make the changes... It's partially a read-only habit, but it's also that I'm just not confident in deciding whether those actually *are* good suggestions, or put differently: it's the public face of btrfs, and I don't want to accidentally do something to ruin it (to use some hyperbole). All your suggesstions are good, adding more articles/videos/talks should be easy as there's a section for that already. The news section is mostly written by me but if you keep your entries consistent with the rest then it's ok. There are a few people who watch over new wiki edits and fix/enhance them if needed. You can't do too much damage unless you really want to. OK, I'll make the changes then (sans kernel log). -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup pgpNnh3EGP1Rh.pgp Description: Digitale Signatur von OpenPGP
Wiki suggestions
Hi,

I hope it's not out of place, but I have a few suggestions for the Wiki:

- The presentation "NYLUG Presents: Chris Mason on Btrfs (May 14th 2015)" at https://www.youtube.com/watch?v=W3QRWUfBua8 would make a nice addition to the "Articles, presentations, podcasts" section.
- The same goes for "Why you should consider using btrfs ... like Google does." at https://www.youtube.com/watch?v=6DplcPrQjvA.
- coreutils 8.24 was released early this month: https://lists.gnu.org/archive/html/coreutils-announce/2015-07/msg0.html.
- While I'm at it: "hilights" should be "highlights" in the btrfs-progs 4.1.1 news entry.
- The Linux v4.1 news entry is still TBD ;-) .

Greetings
--
Marc Joliet
--
People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: btrfs partition converted from ext4 becomes read-only minutes after booting: WARNING: CPU: 2 PID: 2777 at ../fs/btrfs/super.c:260 __btrfs_abort_transaction+0x4b/0x120
On Wed, 17 Jun 2015 10:46:30 -0700, Marc MERLIN m...@merlins.org wrote: I tried ext4 to btrfs once a year ago and it severely mangled my filesystem. I looked at it as a cool feature/hack that may have worked some time ago, but that no one really uses anymore, and that may not work right at this point. Just another data point: when I switched to btrfs in the middle of last year I used btrfs-convert on two file systems (an SSD and my backup partition on a USB 3.0 HDD), and it worked in both cases (i.e., no data loss). I did see some strange balance issues (see the ML archives), but IIRC nothing really serious. -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: btrfs convert running out of space
Am Fri, 23 Jan 2015 08:46:23 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Marc Joliet posted on Fri, 23 Jan 2015 08:54:41 +0100 as excerpted: Am Fri, 23 Jan 2015 04:34:19 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted: What are the chances that splitting all the large files up into sub gig pieces, finish convert, then recombine them all will work? [...] Option 2: Since new files should be created using the desired target mode (raid1 IIRC), you may actually be able to move them off and immediately back on, so they appear as new files and thus get created in the desired mode. With current coreutils, wouldn't that also work if he moves the files to another (temporary) subvolume? (And with future coreutils, by copying the files without using reflinks and then removing the originals.) If done correctly, yes. However, off the filesystem is far simpler to explain over email or the like, and is much less ambiguous in terms of OK, but did you do it 'correctly' if it doesn't end up helping. If it doesn't work, it doesn't work. If move to a different subvolume under specific conditions in terms of reflinking and the like doesn't work, there's always the question of whether it /really/ didn't work, or if somehow the instructions weren't clear enough and thus failure was simply the result of a failure to fully meet the technical requirements. Of course if I was doing it myself, and if I was absolutely sure of the technical details in terms of what command I had to use to be /sure/ it didn't simply reflink and thus defeat the whole exercise, I'd likely use the shortcut. But in reality, if it didn't work I'd be second-guessing myself and would probably move everything entirely off and back on to be sure, and knowing that, I'd probably do it the /sure/ way in the first place, avoiding the chance of having to redo it to prove to myself that I'd done it correctly. 
Of course, having demonstrated to myself that it worked, if I ever had the problem again, I might try the shortcut, just to demonstrate to my own satisfaction the full theory that the effect of the shortcut was the same as the effect of doing it the longer and more fool-proof way. But of course I'd rather not have the opportunity to try that second-half proof. =:^) Make sense? =:^) I was going to argue that my suggestion was hardly difficult to get right, but then I read that cp defaults to --reflink=always and that it is not possible to turn off reflinks (i.e., there is no --reflink=never). So then would have to consider alternatives like dd, and, well, you are right, I suppose :) . (Of course, with the *current* version of coreutils, the simple mv somefile tmp_subvol/; mv tmp_subvol/somefile . will still work.) -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup pgpo2SzLpOPXM.pgp Description: Digitale Signatur von OpenPGP
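Since the cp of that era offers no --reflink=never, the surest way to break reflinks is to route the data through a *different* filesystem, where mv must fall back to a genuine copy. A minimal sketch of that round trip (the staging path is hypothetical):

```shell
#!/bin/sh
# Rewrite a file's extents so they are allocated fresh on the btrfs side
# (e.g. in the new target RAID profile) instead of reflinking the old
# extents.  The staging directory must live on a DIFFERENT filesystem:
# a cross-filesystem mv cannot rename or reflink, so it performs a real
# copy followed by an unlink.
rewrite_file() {
    file=$1
    staging=$2                               # e.g. a tmpfs or another disk
    tmp="$staging/$(basename "$file").$$"
    mv -- "$file" "$tmp" || return 1         # real data copy off the fs
    mv -- "$tmp" "$file"                     # copy back: fresh extents
}

# Example (hypothetical paths; needs enough free space on both sides):
# rewrite_file /mnt/btrfs/big-vm-image.img /mnt/scratch
```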
Re: btrfs convert running out of space
Am Fri, 23 Jan 2015 04:34:19 + (UTC) schrieb Duncan 1i5t5.dun...@cox.net: Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted: What are the chances that splitting all the large files up into sub gig pieces, finish convert, then recombine them all will work? [...] Option 2: Since new files should be created using the desired target mode (raid1 IIRC), you may actually be able to move them off and immediately back on, so they appear as new files and thus get created in the desired mode. Of course the success here depends on how many you have to move vs. the amount of free space available that will be used when you do so, but with enough space, it should just work. Note that with this method, if the files are small enough to entirely fit one-at-a-time or a-few-at-a-time in memory (I have 16 gig RAM, for instance, and don't tend to use more than a gig or two for apps, so could in theory do 12-14 gig at a time for this), you can even use a tmpfs as the temporary storage before moving them back to the target filesystem. That should be pretty fast since the one side is all memory. With current coreutils, wouldn't that also work if he moves the files to another (temporary) subvolume? (And with future coreutils, by copying the files without using reflinks and then removing the originals.) [...] -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup pgpGGQS6VMEBQ.pgp Description: Digitale Signatur von OpenPGP
Re: btrfs scrub status reports not running when it is
On Wed, 14 Jan 2015 16:06:02 -0500, Sandy McArthur Jr sandy...@gmail.com wrote: Sometimes btrfs scrub status reports that it is not running when it still is. [...] # uname -a Linux mcplex 3.18.2-gentoo #1 SMP Mon Jan 12 10:24:25 EST 2015 x86_64 Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz GenuineIntel GNU/Linux # btrfs --version Btrfs v3.18.1 FWIW, I (and one other person) reported this in the thread titled 'btrfs scrub status misreports as interrupted' (starting on 22.11.2014). Too bad it's still there; I'm on kernel 3.17.8 and userspace 3.18.1, respectively, and didn't see this issue the last time I ran a scrub, so I was hoping it was gone by now. (On the upside, though, this isn't exactly the worst bug btrfs has ever had ;) .) Greetings -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: btrfs scrub status misreports as interrupted
On Wed, 10 Dec 2014 10:51:15 +0800, Anand Jain anand.j...@oracle.com wrote: Is there any relevant log in the dmesg? Not in my case; at least, nothing that made it into the syslog. -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: Two persistent problems
Am Fri, 14 Nov 2014 17:00:26 -0500 schrieb Josef Bacik jba...@fb.com: On 11/14/2014 04:51 PM, Hugo Mills wrote: Chris, Josef, anyone else who's interested, On IRC, I've been seeing reports of two persistent unsolved problems. Neither is showing up very often, but both have turned up often enough to indicate that there's something specific going on worthy of investigation. One of them is definitely a btrfs problem. The other may be btrfs, or something in the block layer, or just broken hardware; it's hard to tell from where I sit. Problem 1: ENOSPC on balance This has been going on since about March this year. I can reasonably certainly recall 8-10 cases, possibly a number more. When running a balance, the operation fails with ENOSPC when there's plenty of space remaining unallocated. This happens on full balance, filtered balance, and device delete. Other than the ENOSPC on balance, the FS seems to work OK. It seems to be more prevalent on filesystems converted from ext*. The first few or more reports of this didn't make it to bugzilla, but a few of them since then have gone in. Problem 2: Unexplained zeroes Failure to mount. Transid failure, expected xyz, have 0. Chris looked at an early one of these (for Ke, on IRC) back in September (the 27th -- sadly, the public IRC logs aren't there for it, but I can supply a copy of the private log). He rapidly came to the conclusion that it was something bad going on with TRIM, replacing some blocks with zeroes. Since then, I've seen a bunch of these coming past on IRC. It seems to be a 3.17 thing. I can successfully predict the presence of an SSD and -odiscard from the have 0. I've successfully persuaded several people to put this into bugzilla and capture btrfs-images. btrfs recover doesn't generally seem to be helpful in recovering data. I think Josef had problem 1 in his sights, but I don't know if additional images or reports are helpful at this point. 
For problem 2, there's obviously something bad going on, but there's not much else to go on -- and the inability to recover data isn't good. For each of these, what more information should I be trying to collect from any future reporters? So for #2 I've been looking at that the last two weeks. I'm always paranoid we're screwing up one of our data integrity sort of things, either not waiting on IO to complete properly or something like that. I've built a dm target to be as evil as possible and have been running it trying to make bad things happen. I got slightly sidetracked since my stress test exposed a bug in the tree log stuff and csums which I just fixed. Now that I've fixed that I'm going back to try and make the expected blah, have 0 type errors happen. Just a quick question from a user: does Filipe's patch "Btrfs: fix race between fs trimming and block group remove/allocation" fix this? Judging by the commit message, it looks like it. If so, can you say whether it will make it into 3.17.x? Maybe I'm being overly paranoid, but I stuck with 3.16.7 because of this. (I mean, I have backups, but there's no need to provoke a situation where I will need them ;-) .) -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
btrfs scrub status misreports as interrupted
Hi all, While I haven't gotten any scrub already running type errors any more, I do get one strange case of state misreport. When running scrub on /home (btrfs RAID10), after 3 of 4 drives have completed, the 4th drive (sdb) will report as interrupted, despite still running: # btrfs scrub status -d /home scrub status for 472c9290-3ff2-4096-9c47-0612d3a52cef scrub device /dev/sda (id 1) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3380 seconds total bytes scrubbed: 252.86GiB with 0 errors scrub device /dev/sdb (id 2) status scrub started at Sat Nov 22 11:57:34 2014, interrupted after 3698 seconds, not running total bytes scrubbed: 217.50GiB with 0 errors scrub device /dev/sdc (id 3) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3013 seconds total bytes scrubbed: 252.85GiB with 0 errors scrub device /dev/sdd (id 4) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 2994 seconds total bytes scrubbed: 252.85GiB with 0 errors The funny thing is, the time will still update as the scrub keeps going: # btrfs scrub status -d /home scrub status for 472c9290-3ff2-4096-9c47-0612d3a52cef scrub device /dev/sda (id 1) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3380 seconds total bytes scrubbed: 252.86GiB with 0 errors scrub device /dev/sdb (id 2) status scrub started at Sat Nov 22 11:57:34 2014, interrupted after 4136 seconds, not running total bytes scrubbed: 239.44GiB with 0 errors scrub device /dev/sdc (id 3) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3013 seconds total bytes scrubbed: 252.85GiB with 0 errors scrub device /dev/sdd (id 4) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 2994 seconds total bytes scrubbed: 252.85GiB with 0 errors This has happened a few times, and when sdb finally finishes, the status is then reported correctly as finished: # btrfs scrub status -d /home scrub status for 472c9290-3ff2-4096-9c47-0612d3a52cef scrub 
device /dev/sda (id 1) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3380 seconds total bytes scrubbed: 252.86GiB with 0 errors scrub device /dev/sdb (id 2) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 4426 seconds total bytes scrubbed: 252.88GiB with 0 errors scrub device /dev/sdc (id 3) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 3013 seconds total bytes scrubbed: 252.85GiB with 0 errors scrub device /dev/sdd (id 4) history scrub started at Sat Nov 22 11:57:34 2014 and finished after 2994 seconds total bytes scrubbed: 252.85GiB with 0 errors Kernel and btrfs-progs version: # uname -a Linux marcec 3.16.7-gentoo #1 SMP PREEMPT Fri Oct 31 22:45:54 CET 2014 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux # btrfs --version Btrfs v3.17.1 Should I open a report on bugzilla? -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
Re: btrfs send and an existing backup
Am Wed, 19 Nov 2014 16:58:16 +0100 schrieb Jakob Schürz wertsto...@nurfuerspam.de: Hi there! I'm new on btrfs, and I like it :) Me too :) . (I've been using it since May.) But i have a question. I have a existing backup on an external HDD. This was ext4 before i converted it to btrfs. And i installed my debian new on btrfs with some subvolumes. (f.e. home, var, multimedia/Video multimedia/Audio...) On my backup there are no subvolumes. Now i wrote a script to take local snapshots on my laptops HDD an mirror this snapshots with btrfs send/receive to the external HDD. Yeah, I also recently made the switch to btrfs send/receive, and I just love being able to do incremental full system backups in less than two minutes (it's also efficient enough that I backup my (borrowed) laptop over WLAN). So from me a big thanks to the btrfs devs :) ! But to get to the questions: An i don't know, how to do, to make the inital snapshot on the external HDD. I want to use the existing data there, so I don't have to transmit the whole bunch of data to the external drive, which exists there already... Yeah, I had that problem, too, with my old rsync based backups; see below. What happens, if i make the same structure on the external drive with creating subvolumes and »cp --reflink«, give this subvolumes the correct names, and fire a »btrfs send«? Do you mean cp --reflink from the original backup to the new structure? That won't help. Again, see below. Or is the best (ONLY???) way, to make an initial snapshot on the external drive and delete the old backup there? I couldn't think of any other way than doing an initial snapshot + send that transferred the entire subvolumes, then doing incremental sends from there. Here's my understanding as a complete non-expert: The problem is that you need a parent snapshot, which needs to be on *both* the source *and* target volumes, with which to be able to generate and then receive the incremental send. 
Currently, your source and target volumes are independent, so btrfs can't infer anything about any differences between them; that is, while the data may be related, the file systems themselves have independent histories, making it impossible to compare them via their data structures. This is why you need to make an initial send: to give both volumes a common frame of reference, so to speak. So I bit the bullet and went through with it, and am keeping the original backups until enough snapshots have accumulated in the new backup location (both of my backups are on the same file system in different subvolumes). greetings jakob HTH -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
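The scheme described above, one full send to seed a common parent followed by incremental sends, can be sketched like this (mount points and snapshot names are hypothetical, and both sides must be btrfs):

```shell
#!/bin/sh
# One-time seeding: transfer one full read-only snapshot so that source
# and destination share a common parent snapshot.
seed_backup() {
    src=$1; dst=$2                   # e.g. /home and /media/backup
    btrfs subvolume snapshot -r "$src" "$src/snap.0"
    btrfs send "$src/snap.0" | btrfs receive "$dst"
}

# Each later backup ships only the difference against the previous
# snapshot, which now exists on BOTH filesystems (-p names the parent).
incremental_backup() {
    src=$1; dst=$2; n=$3             # n = number of the new snapshot
    prev=$((n - 1))
    btrfs subvolume snapshot -r "$src" "$src/snap.$n"
    btrfs send -p "$src/snap.$prev" "$src/snap.$n" | btrfs receive "$dst"
}

# Usage (requires root):
# seed_backup /home /media/backup
# incremental_backup /home /media/backup 1
```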
Re: Poll: time to switch skinny-metadata on by default?
On Sat, 25 Oct 2014 14:35:33 -0600, Chris Murphy li...@colorremedies.com wrote: On Oct 25, 2014, at 2:33 PM, Chris Murphy li...@colorremedies.com wrote: On Oct 25, 2014, at 6:24 AM, Marc Joliet mar...@gmx.de wrote: First of all: does grub2 support booting from a btrfs file system with skinny-metadata, or is it irrelevant? Seems plausible if older kernels don't understand skinny-metadata, that GRUB2 won't either. So I just tested it with grub2-2.02-0.8.fc21 and it works. I'm surprised, actually. I don't understand the nature of the incompatibility with older kernels. Can they not mount a Btrfs volume even as ro? If so then I'd expect GRUB to have a problem, so I'm going to guess that maybe a 3.9 or older kernel could ro mount a Btrfs volume with skinny extents and the incompatibility is writing. That sounds plausible, though I hope for a definitive answer. (FWIW, I originally asked because I couldn't find any commits to grub2 related to skinny metadata; the updates to the btrfs driver were fairly sparse.) -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: Poll: time to switch skinny-metadata on by default?
Am Sat, 25 Oct 2014 21:58:08 +0200 schrieb Marc Joliet mar...@gmx.de: Am Sat, 25 Oct 2014 14:24:58 +0200 schrieb Marc Joliet mar...@gmx.de: I can still access files on MARCEC_BACKUP just fine, and the snapshots are still there (btrfs subvolume list succeeds). Just an update: that was true for a while, but at one point listing directories and accessing the file system in general stopped working (all processes that touched the FS hung/zombified). This necessitated a hard reboot, since reboot and halt (so... shutdown, really) didn't do anything other than spit out the usual the system is rebooting message. Interestingly enough, the file system was (apparently) fine after that (just as Petr Janecek wrote), other than an invalid space cache file: [ 65.477006] BTRFS info (device sdg2): The free space cache file (2466854731776) is invalid. skip it That is, running my backup routine worked just as before, and I can access files on the FS just fine. Oh, and apparently the rebalance continued successfully?! [ 342.540865] BTRFS info (device sdg2): continuing balance [ 342.51] BTRFS info (device sdg2): relocating block group 2502355320832 flags 34 [ 342.821608] BTRFS info (device sdg2): found 4 extents [ 343.056915] BTRFS info (device sdg2): relocating block group 2501818449920 flags 36 [ 437.932405] BTRFS info (device sdg2): found 25086 extents [ 438.727197] BTRFS info (device sdg2): relocating block group 2501281579008 flags 36 [ 557.319354] BTRFS info (device sdg2): found 83875 extents # btrfs balance status /media/MARCEC_BACKUP No balance found on '/media/MARCEC_BACKUP' No SEGFAULT anywhere. All I can say right now is huh. Although I'll try starting a balance -m again tomorrow, because the continued balance only took about 3-4 minutes (maybe it . Maybe it exploded, I don't know (sorry, clearly I didn't delete the entirety of that incomplete train of thought). Anyway, I did run a full balance -m again, and this time it finished successfully. 
Make of that what you will, but it appears that the bug is non-deterministic (makes me wonder if Petr Janecek or anybody else who hit the bug ever got a balance to finish). HTH -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
Re: Poll: time to switch skinny-metadata on by default?
: 880052559bd0 R08: 0400 R09: 03a5 [ 5844.149007] R10: 0006 R11: 03a4 R12: 880037378a68 [ 5844.149007] R13: 0002 R14: 8800048a5800 R15: 0002 [ 5844.149007] FS: 7f6eda85c8c0() GS:88011fc8() knlGS: [ 5844.149007] CS: 0010 DS: ES: CR0: 8005003b [ 5844.149007] CR2: 0262ddf0 CR3: 18ce4000 CR4: 07e0 [ 5844.149007] Stack: [ 5844.149007] 88010c4ba108 02a9 ff00 [ 5844.149007] 0001 0009 0001178e5528 0246752f5000 [ 5844.149007] 8800156d3854 b38a 880109252000 1000 [ 5844.149007] Call Trace: [ 5844.149007] [812387ea] ? walk_down_proc+0x1da/0x2c0 [ 5844.149007] [8123b4b3] ? walk_down_tree+0xb3/0xe0 [ 5844.149007] [8123f235] ? btrfs_drop_subtree+0x195/0x210 [ 5844.149007] [8129fa2f] ? do_relocation+0x36f/0x500 [ 5844.149007] [8129d985] ? calcu_metadata_size.isra.43.constprop.57+0x95/0xb0 [ 5844.149007] [8127284f] ? read_extent_buffer+0xaf/0x110 [ 5844.149007] [8129f50d] ? remove_backref_node+0xad/0x140 [ 5844.149007] [812a007d] ? relocate_tree_blocks+0x4bd/0x610 [ 5844.149007] [812a159b] ? relocate_block_group+0x3cb/0x660 [ 5844.149007] [812a19e8] ? btrfs_relocate_block_group+0x1b8/0x2e0 [ 5844.149007] [81276a46] ? btrfs_relocate_chunk.isra.62+0x56/0x740 [ 5844.149007] [81288e50] ? btrfs_set_lock_blocking_rw+0x60/0xa0 [ 5844.149007] [8127284f] ? read_extent_buffer+0xaf/0x110 [ 5844.149007] [81230d65] ? btrfs_previous_item+0x95/0x120 [ 5844.149007] [81268961] ? btrfs_get_token_64+0x61/0xf0 [ 5844.149007] [8127182f] ? release_extent_buffer+0x2f/0xd0 [ 5844.149007] [81279b68] ? btrfs_balance+0x858/0xf20 [ 5844.149007] [81148585] ? __sb_start_write+0x65/0x110 [ 5844.149007] [8128093e] ? btrfs_ioctl_balance+0x19e/0x500 [ 5844.149007] [8128688f] ? btrfs_ioctl+0xa8f/0x2940 [ 5844.149007] [8111d1e3] ? handle_mm_fault+0x873/0xba0 [ 5844.149007] [8103889a] ? __do_page_fault+0x2ba/0x570 [ 5844.149007] [81120359] ? vma_link+0xd9/0xe0 [ 5844.149007] [8113bb9a] ? kmem_cache_alloc+0x16a/0x170 [ 5844.149007] [81157c9e] ? do_vfs_ioctl+0x7e/0x500 [ 5844.149007] [811581b9] ? 
SyS_ioctl+0x99/0xb0 [ 5844.149007] [8156df82] ? page_fault+0x22/0x30 [ 5844.149007] [8156c612] ? system_call_fastpath+0x16/0x1b [ 5844.149007] Code: c8 0f 85 62 fe ff ff e9 75 fd ff ff b8 f4 ff ff ff e9 c1 fc ff ff 49 8b be f0 01 00 00 48 c7 c6 1b 90 74 81 31 c0 e8 84 7f fe ff 0f 0b 0f 0b 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 41 [ 5844.151353] RIP [8123b3ec] do_walk_down+0x54c/0x560 [ 5844.151353] RSP 8800156d3778 [ 5844.172535] ---[ end trace bf07dd9e2f7fb343 ]--- -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
Re: Poll: time to switch skinny-metadata on by default?
Am Sat, 25 Oct 2014 14:24:58 +0200 schrieb Marc Joliet mar...@gmx.de: I can still access files on MARCEC_BACKUP just fine, and the snapshots are still there (btrfs subvolume list succeeds). Just an update: that was true for a while, but at one point listing directories and accessing the file system in general stopped working (all processes that touched the FS hung/zombified). This necessitated a hard reboot, since reboot and halt (so... shutdown, really) didn't do anything other than spit out the usual the system is rebooting message. Interestingly enough, the file system was (apparently) fine after that (just as Petr Janecek wrote), other than an invalid space cache file: [ 65.477006] BTRFS info (device sdg2): The free space cache file (2466854731776) is invalid. skip it That is, running my backup routine worked just as before, and I can access files on the FS just fine. Oh, and apparently the rebalance continued successfully?! [ 342.540865] BTRFS info (device sdg2): continuing balance [ 342.51] BTRFS info (device sdg2): relocating block group 2502355320832 flags 34 [ 342.821608] BTRFS info (device sdg2): found 4 extents [ 343.056915] BTRFS info (device sdg2): relocating block group 2501818449920 flags 36 [ 437.932405] BTRFS info (device sdg2): found 25086 extents [ 438.727197] BTRFS info (device sdg2): relocating block group 2501281579008 flags 36 [ 557.319354] BTRFS info (device sdg2): found 83875 extents # btrfs balance status /media/MARCEC_BACKUP No balance found on '/media/MARCEC_BACKUP' No SEGFAULT anywhere. All I can say right now is huh. Although I'll try starting a balance -m again tomorrow, because the continued balance only took about 3-4 minutes (maybe it . HTH -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
Re: ENOSPC errors during balance
On Tue, 22 Jul 2014 03:26:39 + (UTC), Duncan 1i5t5.dun...@cox.net wrote: Marc Joliet posted on Tue, 22 Jul 2014 01:30:22 +0200 as excerpted: And now that the background deletion of the old snapshots is done, the file system ended up at: # btrfs filesystem df /run/media/marcec/MARCEC_BACKUP Data, single: total=219.00GiB, used=140.13GiB System, DUP: total=32.00MiB, used=36.00KiB Metadata, DUP: total=4.50GiB, used=2.40GiB unknown, single: total=512.00MiB, used=0.00 I don't know how reliable du is for this, but I used it to estimate how much used data I should expect, and I get 138 GiB. That means that the snapshots yield about 2 GiB overhead, which is very reasonable, I think. Obviously I'll be starting a full balance now. [snip total/used discussion] No, you misunderstand: read my email three steps above yours (from the 21st at 15:22). I am wondering about why the disk usage ballooned to 200 GiB in the first place. -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup
Re: ENOSPC errors during balance
Am Sun, 20 Jul 2014 21:44:40 +0200 schrieb Marc Joliet mar...@gmx.de: [...] What I did: - delete the single largest file on the file system, a 12 GB VM image, along with all subvolumes that contained it - rsync it over again [...] I want to point out at this point, though, that doing those two steps freed a disproportionate amount of space. The image file is only 12 GB, and it hadn't changed in any of the snapshots (I haven't used this VM since June), so that subvolume delete -c snapshots returned after a few seconds. Yet deleting it seems to have freed up twice as much. You can see this from the filesystem df output: before, used was at 229.04 GiB, and after deleting it and copying it back (and after a day's worth of backups) went down to 218 GiB. Does anyone have any idea how this happened? Actually, now I remember something that is probably related: when I first moved to my current backup scheme last week, I first copied the data from the last rsnapshot based backup with cp --reflink to the new backup location, but forgot to use -a. I interrupted it and ran cp -a -u --reflink, but it had already copied a lot, and I was too impatient to start over; after all, the data hadn't changed. Then, when rsync (with --inplace) ran for the first time, all of these files with wrong permissions and different time stamps were copied over, but for some reason, the space used increased *greatly*; *much* more than I would expect from changed metadata. The total size of the file system data should be around 142 GB (+ snapshots), but, well, it's more than 1.5 times as much. Perhaps cp --reflink treats hard links differently than expected? I would have expected the data pointed to by the hard link to have been referenced, but maybe something else happened? -- Marc Joliet -- People who think they know everything really annoy those of us who know we don't - Bjarne Stroustrup signature.asc Description: PGP signature
Re: ENOSPC errors during balance
Am Mon, 21 Jul 2014 15:22:16 +0200 schrieb Marc Joliet mar...@gmx.de: Am Sun, 20 Jul 2014 21:44:40 +0200 schrieb Marc Joliet mar...@gmx.de: [...] What I did: - delete the single largest file on the file system, a 12 GB VM image, along with all subvolumes that contained it - rsync it over again [...] I want to point out at this point, though, that doing those two steps freed a disproportionate amount of space. The image file is only 12 GB, and it hadn't changed in any of the snapshots (I haven't used this VM since June), so that subvolume delete -c snapshots returned after a few seconds. Yet deleting it seems to have freed up twice as much. You can see this from the filesystem df output: before, used was at 229.04 GiB, and after deleting it and copying it back (and after a day's worth of backups) went down to 218 GiB. Does anyone have any idea how this happened? Actually, now I remember something that is probably related: when I first moved to my current backup scheme last week, I first copied the data from the last rsnapshot based backup with cp --reflink to the new backup location, but forgot to use -a. I interrupted it and ran cp -a -u --reflink, but it had already copied a lot, and I was too impatient to start over; after all, the data hadn't changed. Then, when rsync (with --inplace) ran for the first time, all of these files with wrong permissions and different time stamps were copied over, but for some reason, the space used increased *greatly*; *much* more than I would expect from changed metadata. The total size of the file system data should be around 142 GB (+ snapshots), but, well, it's more than 1.5 times as much. Perhaps cp --reflink treats hard links differently than expected? I would have expected the data pointed to by the hard link to have been referenced, but maybe something else happened? 
Hah, OK: apparently when my daily backup removed the oldest daily snapshot, it freed up whatever was taking up so much space, so as of now the file system uses only 169.14 GiB (down from 218). Weird.

-- 
Marc Joliet
Re: ENOSPC errors during balance
On Tue, 22 Jul 2014 00:30:57 +0200, Marc Joliet mar...@gmx.de wrote:

> On Mon, 21 Jul 2014 15:22:16 +0200, Marc Joliet mar...@gmx.de wrote:
> > [...] deleting the 12 GB VM image and rsyncing it over again freed a
> > disproportionate amount of space; perhaps cp --reflink treats hard
> > links differently than expected? [...]
> Hah, OK: apparently when my daily backup removed the oldest daily
> snapshot, it freed up whatever was taking up so much space, so as of now
> the file system uses only 169.14 GiB (down from 218). Weird.

And now that the background deletion of the old snapshots is done, the file system ended up at:

# btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
Data, single: total=219.00GiB, used=140.13GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.50GiB, used=2.40GiB
unknown, single: total=512.00MiB, used=0.00

I don't know how reliable du is for this, but I used it to estimate how much used data I should expect, and I get 138 GiB. That means the snapshots add about 2 GiB of overhead, which I think is very reasonable. Obviously I'll be starting a full balance now.

I still think this whole thing is very odd; hopefully somebody can shed some light on it for me (maybe it's obvious, but I don't see it).

-- 
Marc Joliet
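The "about 2 GiB overhead" figure in the mail above follows directly from the two numbers quoted there (a trivial sketch; both inputs are the values from the mail):

```python
# Comparing btrfs' own accounting with the du estimate (figures in GiB
# from the mail above).
df_data_used = 140.13  # Data "used" from btrfs filesystem df
du_estimate = 138.00   # du's estimate of the live data, excluding snapshots

snapshot_overhead = df_data_used - du_estimate
print(f"snapshot overhead: {snapshot_overhead:.2f} GiB")
# ~2.13 GiB of space attributable to the snapshots, matching the
# "about 2 GiB" figure in the mail.
```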
Re: ENOSPC errors during balance
On Sat, 19 Jul 2014 19:11:00 -0600, Chris Murphy li...@colorremedies.com wrote:

> I'm seeing this also in the 2nd dmesg:
>
> [  249.893310] BTRFS error (device sdg2): free space inode generation (0) did not match free space cache generation (26286)
>
> So you could try umounting the volume and doing a one-time mount with the
> clear_cache mount option. Give it some time to rebuild the space cache.
> After that you could umount again, mount with enospc_debug, and try to
> reproduce the enospc with another balance, and see if dmesg contains more
> information this time.

OK, I did that, and the new dmesg is attached. Also, some outputs again; first "filesystem df" (that surge in the total at the end sure is consistent):

# btrfs filesystem df /mnt
Data, single: total=237.00GiB, used=229.67GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.50GiB, used=3.49GiB
unknown, single: total=512.00MiB, used=0.00

And here is what I described in my initial post, the output of "balance status" immediately after the error (it turns out my memory was correct):

# btrfs filesystem balance status /mnt
Balance on '/mnt' is running
0 out of about 0 chunks balanced (0 considered), -nan% left

(Also, this is with Gentoo kernel 3.15.6 now.)

-- 
Marc Joliet

dmesg4.log.xz
Description: application/xz
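To compare the "total" and "used" figures across the df outputs posted in this thread, a small parser helps. This is my own sketch (not part of btrfs-progs); it handles the exact line shape shown above, including the bare "used=0.00" with no unit suffix:

```python
import re

# Parse lines like "Data, single: total=237.00GiB, used=229.67GiB"
# from `btrfs filesystem df` output into GiB figures.
_LINE = re.compile(
    r"(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[0-9.]+)(?P<tu>[KMGT]iB)?, "
    r"used=(?P<used>[0-9.]+)(?P<uu>[KMGT]iB)?"
)

_TO_GIB = {"KiB": 1.0 / 1024 ** 2, "MiB": 1.0 / 1024, "GiB": 1.0,
           "TiB": 1024.0, None: 1.0}  # a bare "0.00" carries no unit suffix

def parse_df(text):
    """Map (kind, profile) -> (total_gib, used_gib) for each df line."""
    out = {}
    for m in _LINE.finditer(text):
        total = float(m["total"]) * _TO_GIB[m["tu"]]
        used = float(m["used"]) * _TO_GIB[m["uu"]]
        out[(m["kind"], m["profile"])] = (total, used)
    return out
```

For the output above, the Data slack (total minus used) comes out at 237.00 − 229.67 ≈ 7.33 GiB; tracking that difference across runs makes the surge in "total" easy to spot.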
Re: ENOSPC errors during balance
On Sat, 19 Jul 2014 18:53:03 -0600, Chris Murphy li...@colorremedies.com wrote:

> On Jul 19, 2014, at 2:58 PM, Marc Joliet mar...@gmx.de wrote:
> > On Sat, 19 Jul 2014 22:10:51 +0200, Marc Joliet mar...@gmx.de wrote:
> > > [...]
> > > Another random idea: the number of errors decreased the second time I
> > > ran balance (from 4 to 2); I could run another full balance and see
> > > if it keeps decreasing.
> >
> > Well, this time there were still 2 ENOSPC errors. [...]
>
> I think it's a bit weird. Two options:
>
> a. Keep using the file system, with judicious backups; if a dev wants
>    more info they'll reply to the thread.
> b. Migrate the data to a new file system. First capture the file system
>    with btrfs-image, in case a dev wants more info and you've since blown
>    away the file system, and then move the data to a new btrfs file
>    system. I'd use send/receive for this to preserve subvolumes and
>    snapshots.

OK, I'll keep that in mind. I'll keep running the file system for now, just in case it's a run-time error (i.e., a bug in the balance code, and not a problem with the file system itself). If it gets trashed on its own, or I move to a new file system, I'll be sure to follow the steps you outlined.

> Chris Murphy

Thanks
-- 
Marc Joliet
Re: ENOSPC errors during balance
[...] perhaps it's some other bug.

---
[1] Raid5/6 support not yet complete. Operational code is there, but recovery code is still incomplete.
[0] https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3

Thanks
-- 
Marc Joliet
Re: ENOSPC errors during balance
On Sun, 20 Jul 2014 12:22:33 +0200, Marc Joliet mar...@gmx.de wrote:

> [...]
> I'll try this and see, but I think I have more files >1 GB than would
> account for this error (which comes towards the end of the balance, when
> only a few chunks are left). I'll see what "find /mnt -type f -size +1G"
> finds :) .
>
> Now that I think about it, though, it sounds like it could explain the
> sudden surge in total data size: for one very big file, several
> chunks/extents are created, but the data cannot be copied from the
> original ext4 extent.

So far, the above find command has only found a handful of files (plus all the reflinks in the snapshots), much to my surprise. It still has one subvolume to go through, though.

And just for completeness: the same find command didn't find any files on /, which I also converted from ext4, and for which a full balance completed successfully. So maybe this is a step in the right direction, but I'll wait and see what Chris Murphy (or anyone else) might find in my latest dmesg output.

-- 
Marc Joliet
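For anyone wanting to script the same check, here is a rough Python analogue of `find /mnt -type f -size +1G` (my own sketch, not a btrfs tool; note that find's `-size +1G` rounds sizes up to whole units, which this simpler version does not):

```python
import os

# Walk a tree and report regular files whose apparent size exceeds
# a threshold, roughly like `find ROOT -type f -size +1G`.
def files_larger_than(root, threshold_bytes=1 << 30):
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # find -type f would skip symlinks, too
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size > threshold_bytes:
                hits.append((path, size))
    return hits
```

Called as `files_larger_than("/mnt")`, it returns (path, size) pairs for every file over 1 GiB, reflinked snapshot copies included, just as the find command reports them.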
Fw: ENOSPC errors during balance
Begin forwarded message:

Huh, it turns out the Reply-To went to Chris Murphy only, so here it is again for the whole list.

Date: Sat, 19 Jul 2014 20:34:34 +0200
From: Marc Joliet mar...@gmx.de
To: Chris Murphy li...@colorremedies.com
Subject: Re: ENOSPC errors during balance

On Sat, 19 Jul 2014 11:38:08 -0600, Chris Murphy li...@colorremedies.com wrote:

> The 2nd dmesg (didn't look at the 1st) has many instances like this:
>
> [96241.882138] ata2.00: exception Emask 0x1 SAct 0x7ffe0fff SErr 0x0 action 0x6 frozen
> [96241.882139] ata2.00: Ata error. fis:0x21
> [96241.882142] ata2.00: failed command: READ FPDMA QUEUED
> [96241.882148] ata2.00: cmd 60/08:00:68:0a:2d/00:00:18:00:00/40 tag 0 ncq 4096 in
>                res 41/00:58:40:5c:2c/00:00:18:00:00/40 Emask 0x1 (device error)
>
> I'm not sure what this error is; it acts like an unrecoverable read
> error, but I'm not seeing UNC reported. It looks like ata2.00 is sdb,
> which is a member of a btrfs raid10 volume. So this isn't related to your
> sdg2 and enospc error; it's a different problem.

Yeah, from what I remember reading it's related to nforce2 chipsets, but I never pursued it, since I never really noticed any consequences (this is an old computer that I originally built in 2006). IIRC one workaround is to switch to 1.5 Gbps instead of 3 Gbps (but then, that port already is at 1.5 Gbps, while none of the other ports are? Might be the hard drive; I *think* it's older than the others). Another workaround is related to irqbalance (which I had forgotten about; I've just switched it off and will see if the messages stop, but then again, my first dmesg didn't have any of those messages). Anyway, yes, it's unrelated to my problem :-) .

> I'm not sure of the reason for the "BTRFS info (device sdg2): 2 enospc
> errors during balance", but it seems informational rather than either a
> warning or a problem. I'd treat ext4-to-btrfs converted file systems as
> something of an odd duck, in that the conversion is uncommon and
> therefore isn't getting as much testing, so extra caution is a good idea.
> Make frequent backups.

Well, I *could* just recreate the file system. Since these are my only backups (no offsite backup as of yet), I wanted to keep the existing ones, so btrfs-convert was a convenient way to upgrade. But since I ended up deleting those backups anyway, I would only be losing my hourly and a few daily backups. Still, it's not as if the file system is otherwise misbehaving.

Another random idea: the number of errors decreased the second time I ran balance (from 4 to 2); I could run another full balance and see if it keeps decreasing.

-- 
Marc Joliet
Re: ENOSPC errors during balance
On Sat, 19 Jul 2014 22:10:51 +0200, Marc Joliet mar...@gmx.de wrote:

> [...]
> Another random idea: the number of errors decreased the second time I ran
> balance (from 4 to 2); I could run another full balance and see if it
> keeps decreasing.

Well, this time there were still 2 ENOSPC errors. But I can show the df output after such an ENOSPC error, to illustrate what I meant by the sudden surge in total usage:

# btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
Data, single: total=236.00GiB, used=229.04GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.00GiB, used=3.20GiB
unknown, single: total=512.00MiB, used=0.00

And then after running a balance and (almost) immediately cancelling it:

# btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
Data, single: total=230.00GiB, used=229.04GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.00GiB, used=3.20GiB
unknown, single: total=512.00MiB, used=0.00

-- 
Marc Joliet