Re: illegal snapshot, cannot be deleted
Hello,

I succeeded to delete illegal snapshot with command:
btrfs subvolume delete /.snapshots/741/snapshot
When I have done
btrfs balance / -dusage=0 -musage=0
increasing value up to 40 I did not have issues.
But on value 4- for -dusage= and -musage=
I got message that there is no space left on disk.
Do you have any advice how to manage that?

Vedran

On Thu, Nov 12, 2015 at 1:32 PM, Austin S Hemmelgarn wrote:
> On 2015-11-11 17:11, Vedran Vucic wrote:
>>
>> Hello,
>>
>> I use OpenSuse 13.2 on my Toshiba Satellite laptop. I noticed that I run
>> out of disk space, checked documentation and I realized that there were
>> many snapshots. I used Yast Snapper to delete snapshots.
>> I noticed that one snapshot with number 748 could not be deleted.
>> I entered terminal and after the command:
>> snapper -c root delete 748
>> I got message Illegal snapshot.
>
> This sounds like some sort of issue with snapper, not BTRFS itself, but see
> below for some suggestions.
>>
>> I would like to delete it since it is old one.
>> Please find details about my system as requested on your wiki page.
>> uname -a
>> Linux linux-jjcc.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23 00:46:04
>> UTC 2015 (6be6a97) i686 i686 i386 GNU/Linux
>>
>> btrfs --version
>> btrfs-progs v4.0+20150429
>>
>> btrfs fi show
>> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
>> Total devices 1 FS bytes used 10.98GiB
>> devid    1 size 18.71GiB used 18.71GiB path /dev/sda6
>>
>> btrfs fi df /
>> Data, single: total=15.19GiB, used=10.37GiB
>> System, DUP: total=8.00MiB, used=16.00KiB
>> System, single: total=4.00MiB, used=0.00B
>> Metadata, DUP: total=1.75GiB, used=622.53MiB
>> Metadata, single: total=8.00MiB, used=0.00B
>> GlobalReserve, single: total=208.00MiB, used=0.00B
>> Please find attached dmesg.log as requested.
>>
>> Please advise what have to do in order to delete snapshot that is reported
>> to be illegal.
>
> Have you tried running 'btrfs subvolume delete' on the snapshot? You'll
> have to find the full path to it first of course, but that shouldn't be too
> hard. Based on the lack of BTRFS error messages in the kernel log you
> posted, I'm pretty certain that this isn't an issue with the filesystem
> itself (although the filesystem doesn't look particularly healthy, see
> further below), so manually deleting the snapshot using the regular BTRFS
> commands should work just fine. That said, you may also want to look into
> changing the config for snapper, as it has a ridiculously aggressive
> retention policy for snapshots by default, which tends to lead to excessive
> space usage on filesystems smaller than about 250GB.
>
> You may also want to look at running a balance on the filesystem, the
> numbers from btrfs fi show and btrfs fi df look somewhat worrying, you've
> got all the space on the disk allocated as chunks by BTRFS, but have a
> significant amount of empty space in those chunks. Given that fact, ENOSPC
> issues are a very real possibility, and you'll probably have to run a series
> of partial balances to fix this (and it's important to do it before it
> becomes a visible issue also, because once you start getting ENOSPC errors,
> it is a lot harder to fix). Try running a balance with '-dusage=0
> -musage=0', then re-run repeatedly increasing the number for both arguments
> by 5 each time until you get to 50. If a run complains about 'ENOSPC errors
> during balance', re-run it with the same number for -dusage and -musage. If
> you end up re-running with the same value 3 times and keep getting the
> errors, then you're probably beyond the point of this being fixable, and
> should just recreate the filesystem (you do have backups, right?).
> Otherwise, after finishing the run with '-dusage=50 -musage=50'
> successfully, run a full balance without the dusage and musage options.
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

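The stepwise balance described in the quoted advice could be sketched as a small shell loop. This is only a sketch under stated assumptions: the function names, the `MNT`, `DRY_RUN` and `MAX_TRIES` variables, and reading "3 times" as a cap of three attempts per threshold are all illustrative choices rather than anything from the thread, and it assumes `btrfs balance start` exits non-zero when a run fails. By default it only prints the commands it would run.

```sh
#!/bin/sh
# Sketch only: stepwise balance from usage=0 to usage=50 in steps of 5.
# MNT, DRY_RUN and MAX_TRIES are hypothetical knobs for this example.
MNT="${MNT:-/}"
DRY_RUN="${DRY_RUN:-1}"      # 1 = print the commands, anything else = run them
MAX_TRIES="${MAX_TRIES:-3}"  # assumed reading of "re-running ... 3 times"

balance_step() {
    # $1 = usage threshold (percent) applied to both data and metadata chunks
    if [ "$DRY_RUN" = 1 ]; then
        echo "btrfs balance start -dusage=$1 -musage=$1 $MNT"
    else
        btrfs balance start -dusage="$1" -musage="$1" "$MNT"
    fi
}

stepwise_balance() {
    usage=0
    while [ "$usage" -le 50 ]; do
        attempt=1
        # Repeat the same threshold until it succeeds or the cap is reached.
        until balance_step "$usage"; do
            if [ "$attempt" -ge "$MAX_TRIES" ]; then
                echo "giving up at usage=$usage after $attempt attempts" >&2
                return 1
            fi
            attempt=$((attempt + 1))
        done
        usage=$((usage + 5))
    done
}
```

Calling `stepwise_balance` with the defaults lists the eleven balance commands without touching the filesystem; running it as root with `DRY_RUN=0` would execute them. The final full balance mentioned in the advice is deliberately left out of the sketch.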
Re: illegal snapshot, cannot be deleted
Hello,

Here are outputs of commands as you requested:
btrfs fi df /
Data, single: total=8.00GiB, used=7.71GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.12GiB, used=377.25MiB
GlobalReserve, single: total=128.00MiB, used=0.00B

btrfs fi show
Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
Total devices 1 FS bytes used 8.08GiB
devid    1 size 18.71GiB used 10.31GiB path /dev/sda6

btrfs-progs v4.0+20150429

Thanks,

vedran

On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
> On 2015-11-13 11:12, Vedran Vucic wrote:
>>
>> Hello,
>>
>> I succeeded to delete illegal snapshot with command:
>> btrfs subvolume delete /.snapshots/741/snapshot
>> When I have done
>> btrfs balance / -dusage=0 -musage=0
>> increasing value up to 40 I did not have issues.
>> But on value 4- for -dusage= and -musage=
>> I got message that there is no space left on disk.
>> Do you have any advice how to manage that?
>
>
> Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
> should have changed after the balance, and I'd need to see what it looks
> like now to be able to give any reasonable advice.
>
>

Re: illegal snapshot, cannot be deleted
On 2015-11-13 11:12, Vedran Vucic wrote:
> Hello,
>
> I succeeded to delete illegal snapshot with command:
> btrfs subvolume delete /.snapshots/741/snapshot
> When I have done
> btrfs balance / -dusage=0 -musage=0
> increasing value up to 40 I did not have issues.
> But on value 4- for -dusage= and -musage=
> I got message that there is no space left on disk.
> Do you have any advice how to manage that?

Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
should have changed after the balance, and I'd need to see what it looks
like now to be able to give any reasonable advice.

Re: illegal snapshot, cannot be deleted
On 2015-11-13 12:30, Vedran Vucic wrote:
> Hello,
>
> Here are outputs of commands as you requested:
> btrfs fi df /
> Data, single: total=8.00GiB, used=7.71GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=1.12GiB, used=377.25MiB
> GlobalReserve, single: total=128.00MiB, used=0.00B
>
> btrfs fi show
> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
> Total devices 1 FS bytes used 8.08GiB
> devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>
> btrfs-progs v4.0+20150429
>
Hmm, that's odd, based on these numbers, you should be having no
issue at all trying to run a balance. You might be hitting some
other bug in the kernel, however, but I don't remember if there were
any known bugs related to ENOSPC or balance in the version you're
running. You might see if trying to re-run the balance with
'-dusage=40 -musage=40' works correctly (I've seen cases where the
first run fails, but subsequent ones work because the first one made
some progress despite failing).

> On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
>> On 2015-11-13 11:12, Vedran Vucic wrote:
>>> Hello,
>>>
>>> I succeeded to delete illegal snapshot with command:
>>> btrfs subvolume delete /.snapshots/741/snapshot
>>> When I have done
>>> btrfs balance / -dusage=0 -musage=0
>>> increasing value up to 40 I did not have issues.
>>> But on value 4- for -dusage= and -musage=
>>> I got message that there is no space left on disk.
>>> Do you have any advice how to manage that?
>>
>> Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
>> should have changed after the balance, and I'd need to see what it looks
>> like now to be able to give any reasonable advice.

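The arithmetic behind "based on these numbers" can be spelled out with the figures posted in the thread. The sketch below simply subtracts the values reported by `btrfs fi show` and `btrfs fi df`; the helper name `gib_diff` is made up for this example, and the numbers are copied by hand rather than parsed from command output.

```sh
#!/bin/sh
# Sketch: how much space is unallocated, and how much slack sits inside
# the already-allocated data chunks, before and after the balance runs.
gib_diff() {
    # $1 and $2 are sizes in GiB; prints $1 - $2 with two decimals
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a - b }'
}

# Device size minus space allocated to chunks (from 'btrfs fi show').
echo "unallocated before balance: $(gib_diff 18.71 18.71) GiB"
echo "unallocated after balance:  $(gib_diff 18.71 10.31) GiB"

# Data chunk total minus data used (from 'btrfs fi df /').
echo "data slack before balance:  $(gib_diff 15.19 10.37) GiB"
echo "data slack after balance:   $(gib_diff 8.00 7.71) GiB"
```

Before the balance nothing was left unallocated while almost 5 GiB sat unused inside data chunks; afterwards roughly 8.4 GiB is unallocated, which is why an ENOSPC at this point is unexpected.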
Re: illegal snapshot, cannot be deleted
On 2015-11-13 13:42, Hugo Mills wrote:
> On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
>> On 2015-11-13 12:30, Vedran Vucic wrote:
>>> Hello,
>>>
>>> Here are outputs of commands as you requested:
>>> btrfs fi df /
>>> Data, single: total=8.00GiB, used=7.71GiB
>>> System, DUP: total=32.00MiB, used=16.00KiB
>>> Metadata, DUP: total=1.12GiB, used=377.25MiB
>>> GlobalReserve, single: total=128.00MiB, used=0.00B
>>>
>>> btrfs fi show
>>> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
>>> Total devices 1 FS bytes used 8.08GiB
>>> devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>>>
>>> btrfs-progs v4.0+20150429
>>>
>> Hmm, that's odd, based on these numbers, you should be having no
>> issue at all trying to run a balance. You might be hitting some
>> other bug in the kernel, however, but I don't remember if there were
>> any known bugs related to ENOSPC or balance in the version you're
>> running.
>
> There's one specific bug that shows up with ENOSPC exactly like
> this. It's in all versions of the kernel, there's no known solution,
> and no guaranteed mitigation strategy, I'm afraid. Various things like
> balancing, or adding, balancing, and removing a device again have been
> tried. Sometimes they seem to help; sometimes they just make the
> problem worse.
>
> We average maybe one report a week or so with this particular
> set of symptoms.

We should get this listed on the Wiki on the Gotcha's page ASAP,
especially considering that it's a pretty significant bug (not quite
as bad as data corruption, but pretty darn close).

Vedran, could you try running the balance with just '-dusage=40' and
then again with just '-musage=40'? If just one of those fails, it
could help narrow things down significantly.

Hugo, is there anything else known about this issue (I don't recall
seeing it mentioned before, and a quick web search didn't turn up
much)? In particular:
1. Is there any known way to reliably reproduce it (I would assume
not, as that would likely lead to a mitigation strategy. If someone
does find a reliable reproducer, please let me know, I've got some
significant spare processor time and storage space I could dedicate
to getting traces and filesystem images for debugging, and already
have most of the required infrastructure set up for something like
this)?
2. Is it contagious (that is, if I send a snapshot from a filesystem
that is affected by it, does the filesystem that receives the
snapshot become affected; if we could find a way to reproduce it, I
could easily answer this question within a couple of minutes of
reproducing it)?
3. Do we have any kind of statistics beyond the rate of reports (for
example, does it happen more often on bigger filesystems, or
possibly more frequently with certain chunk profiles)?

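The request to run the data and metadata filters separately might look like the following. It is a sketch, not a recipe from the thread: `filtered_balance`, `split_balance`, `MNT` and `DRY_RUN` are names invented here, and the exit-status reporting assumes `btrfs balance start` returns non-zero when a run fails. In the default dry-run mode it only prints the two commands.

```sh
#!/bin/sh
# Sketch: run a data-only and then a metadata-only balance at one
# threshold, so a failure can be attributed to one chunk type.
MNT="${MNT:-/}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

filtered_balance() {
    # $1 = a single balance filter, e.g. -dusage=40 or -musage=40
    if [ "$DRY_RUN" = 1 ]; then
        echo "btrfs balance start $1 $MNT"
    else
        btrfs balance start "$1" "$MNT"
    fi
}

split_balance() {
    # $1 = usage threshold in percent
    filtered_balance "-dusage=$1"; data_rc=$?
    filtered_balance "-musage=$1"; meta_rc=$?
    # Report both results on stderr so they are easy to tell apart.
    echo "data-only exit status: $data_rc" >&2
    echo "metadata-only exit status: $meta_rc" >&2
}
```

`split_balance 40` corresponds to the two runs asked for above; if only one of the two statuses is non-zero, that points at the chunk type involved.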
Re: illegal snapshot, cannot be deleted
On Fri, Nov 13, 2015 at 02:40:44PM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-13 13:42, Hugo Mills wrote:
> >On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
> >>On 2015-11-13 12:30, Vedran Vucic wrote:
> >>>Hello,
> >>>
> >>>Here are outputs of commands as you requested:
> >>> btrfs fi df /
> >>>Data, single: total=8.00GiB, used=7.71GiB
> >>>System, DUP: total=32.00MiB, used=16.00KiB
> >>>Metadata, DUP: total=1.12GiB, used=377.25MiB
> >>>GlobalReserve, single: total=128.00MiB, used=0.00B
> >>>
> >>>btrfs fi show
> >>>Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
> >>> Total devices 1 FS bytes used 8.08GiB
> >>> devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
> >>>
> >>>btrfs-progs v4.0+20150429
> >>>
> >>Hmm, that's odd, based on these numbers, you should be having no
> >>issue at all trying to run a balance. You might be hitting some
> >>other bug in the kernel, however, but I don't remember if there were
> >>any known bugs related to ENOSPC or balance in the version you're
> >>running.
> >
> >There's one specific bug that shows up with ENOSPC exactly like
> >this. It's in all versions of the kernel, there's no known solution,
> >and no guaranteed mitigation strategy, I'm afraid. Various things like
> >balancing, or adding, balancing, and removing a device again have been
> >tried. Sometimes they seem to help; sometimes they just make the
> >problem worse.
> >
> >We average maybe one report a week or so with this particular
> >set of symptoms.
> We should get this listed on the Wiki on the Gotcha's page ASAP,
> especially considering that it's a pretty significant bug (not quite
> as bad as data corruption, but pretty darn close).

It's certainly mentioned in the FAQ, in the main entry on
unexpected ENOSPC. The text takes you through identifying when there's
the "usual" problem, then goes on to say that if you've hit ENOSPC
with free space still to be unallocated, you've got this issue.

> Vedran, could you try running the balance with just '-dusage=40' and
> then again with just '-musage=40'? If just one of those fails, it
> could help narrow things down significantly.
>
> Hugo, is there anything else known about this issue (I don't recall
> seeing it mentioned before, and a quick web search didn't turn up
> much)?

I grumble about it regularly on IRC, where we get many more
reports of it than on the mailing list. There have been a couple on
here that I can recall, but not many.

> In particular:
> 1. Is there any known way to reliably reproduce it (I would assume
> not, as that would likely lead to a mitigation strategy. If someone
> does find a reliable reproducer, please let me know, I've got some
> significant spare processor time and storage space I could dedicate
> to getting traces and filesystem images for debugging, and already
> have most of the required infrastructure set up for something like
> this)?

None that I know of. I can start asking people for btrfs-image
dumps again, if you want to investigate. I did do that for a while, to
pass them to josef, but he said he didn't need any more of them after
a while. (He was always planning on investigating it, but kept getting
diverted by data corruption bugs, which have higher priority).

> 2. Is it contagious (that is, if I send a snapshot from a filesystem
> that is affected by it, does the filesystem that receives the
> snapshot become affected; if we could find a way to reproduce it, I
> could easily answer this question within a couple of minutes of
> reproducing it)?

No, as far as I know, it doesn't transfer via send/receive.
send/receive is largely equivalent to copying the data by other means
-- receive is implemented almost exclusively in userspace, with only a
couple of ioctls for mucking around with the UUIDs at the end.

> 3. Do we have any kind of statistics beyond the rate of reports (for
> example, does it happen more often on bigger filesystems, or
> possibly more frequently with certain chunk profiles)?

Not that I've noticed, no. We've had it on small and large,
single-device and many devices, HDD and SSD, converted and not
converted. At one point, a couple of years ago, I did think it was
down to converted filesystems, because we had a run of them, but that
seems not to be the case.

Hugo.

--
Hugo Mills             | The glass is neither half-full nor half-empty; it is
hugo@... carfax.org.uk | twice as large as it needs to be.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                  Dr Jon Whitehead

Re: illegal snapshot, cannot be deleted
Vedran,

I see 2 snapshot numbers (748 and 741), maybe a copy-paste error or a
typo, but can you confirm that the illegal one is deleted?

/Henk

On Fri, Nov 13, 2015 at 6:30 PM, Vedran Vucic wrote:
> Hello,
>
> Here are outputs of commands as you requested:
> btrfs fi df /
> Data, single: total=8.00GiB, used=7.71GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=1.12GiB, used=377.25MiB
> GlobalReserve, single: total=128.00MiB, used=0.00B
>
> btrfs fi show
> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
> Total devices 1 FS bytes used 8.08GiB
> devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>
> btrfs-progs v4.0+20150429
>
> Thanks,
>
> vedran
>
> On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
>> On 2015-11-13 11:12, Vedran Vucic wrote:
>>>
>>> Hello,
>>>
>>> I succeeded to delete illegal snapshot with command:
>>> btrfs subvolume delete /.snapshots/741/snapshot
>>> When I have done
>>> btrfs balance / -dusage=0 -musage=0
>>> increasing value up to 40 I did not have issues.
>>> But on value 4- for -dusage= and -musage=
>>> I got message that there is no space left on disk.
>>> Do you have any advice how to manage that?
>>
>>
>> Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
>> should have changed after the balance, and I'd need to see what it looks
>> like now to be able to give any reasonable advice.
>>
>>

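One way to answer the 748 versus 741 question is to look for both snapshot directories. The sketch below assumes the snapper layout implied by the path used earlier in the thread (`/.snapshots/<number>/snapshot`); the helper names are hypothetical, and a plain directory test is only a rough check, not proof that a subvolume exists.

```sh
#!/bin/sh
# Sketch: map a snapper snapshot number to the path layout seen in the
# thread and report whether something is still there.
snapshot_path() {
    # $1 = snapper snapshot number
    echo "/.snapshots/$1/snapshot"
}

check_snapshot() {
    path=$(snapshot_path "$1")
    if [ -d "$path" ]; then
        echo "$1: still present at $path"
    else
        echo "$1: no directory at $path"
    fi
}

check_snapshot 741
check_snapshot 748
```

For an authoritative answer, `btrfs subvolume list -s /` (run as root) lists the snapshots the filesystem itself knows about, and `snapper -c root list` shows snapper's own view.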
Re: illegal snapshot, cannot be deleted
On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-13 12:30, Vedran Vucic wrote:
> >Hello,
> >
> >Here are outputs of commands as you requested:
> > btrfs fi df /
> >Data, single: total=8.00GiB, used=7.71GiB
> >System, DUP: total=32.00MiB, used=16.00KiB
> >Metadata, DUP: total=1.12GiB, used=377.25MiB
> >GlobalReserve, single: total=128.00MiB, used=0.00B
> >
> >btrfs fi show
> >Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
> > Total devices 1 FS bytes used 8.08GiB
> > devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
> >
> >btrfs-progs v4.0+20150429
> >
> Hmm, that's odd, based on these numbers, you should be having no
> issue at all trying to run a balance. You might be hitting some
> other bug in the kernel, however, but I don't remember if there were
> any known bugs related to ENOSPC or balance in the version you're
> running.

There's one specific bug that shows up with ENOSPC exactly like
this. It's in all versions of the kernel, there's no known solution,
and no guaranteed mitigation strategy, I'm afraid. Various things like
balancing, or adding, balancing, and removing a device again have been
tried. Sometimes they seem to help; sometimes they just make the
problem worse.

We average maybe one report a week or so(*) with this particular
set of symptoms.

Hugo.

(*)

> You might see if trying to re-run the balance with
> '-dusage=40 -musage=40' works correctly (I've seen cases where the
> first run fails, but subsequent ones work because the first one made
> some progress despite failing).
> >On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
> >>On 2015-11-13 11:12, Vedran Vucic wrote:
> >>>
> >>>Hello,
> >>>
> >>>I succeeded to delete illegal snapshot with command:
> >>>btrfs subvolume delete /.snapshots/741/snapshot
> >>>When I have done
> >>>btrfs balance / -dusage=0 -musage=0
> >>>increasing value up to 40 I did not have issues.
> >>>But on value 4- for -dusage= and -musage=
> >>>I got message that there is no space left on disk.
> >>>Do you have any advice how to manage that?
> >>
> >>
> >>Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
> >>should have changed after the balance, and I'd need to see what it looks
> >>like now to be able to give any reasonable advice.
> >>
> >>
>

--
Hugo Mills             | You can play with your friends' privates, but you
hugo@... carfax.org.uk | can't play with your friends' childrens' privates.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                   C++ coding rule

Re: illegal snapshot, cannot be deleted
Hello,

My system is on a laptop, which does not see heavy duty the way servers do.
openSUSE 13.2 was installed approximately 2 months ago, so the issue did
not appear due to a long-term lack of administration or maintenance.
Please let me know if I can help in any other way.

Thanks,

vedran

On Fri, Nov 13, 2015 at 9:15 PM, Vedran Vucic wrote:
> Hello,
>
> I guess that it might be a bug in the kernel.
> I was successful with this:
> btrfs balance start / -dusage=50 -musage=35
>
> musage above 35 caused an ENOSPC message. Otherwise it was good.
> Thanks for the support,
>
> vedran
>
> On Fri, Nov 13, 2015 at 7:42 PM, Hugo Mills wrote:
>> On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
>>> On 2015-11-13 12:30, Vedran Vucic wrote:
>>> >Hello,
>>> >
>>> >Here are outputs of commands as you requested:
>>> > btrfs fi df /
>>> >Data, single: total=8.00GiB, used=7.71GiB
>>> >System, DUP: total=32.00MiB, used=16.00KiB
>>> >Metadata, DUP: total=1.12GiB, used=377.25MiB
>>> >GlobalReserve, single: total=128.00MiB, used=0.00B
>>> >
>>> >btrfs fi show
>>> >Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
>>> > Total devices 1 FS bytes used 8.08GiB
>>> > devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>>> >
>>> >btrfs-progs v4.0+20150429
>>> >
>>> Hmm, that's odd, based on these numbers, you should be having no
>>> issue at all trying to run a balance. You might be hitting some
>>> other bug in the kernel, however, but I don't remember if there were
>>> any known bugs related to ENOSPC or balance in the version you're
>>> running.
>>
>> There's one specific bug that shows up with ENOSPC exactly like
>> this. It's in all versions of the kernel, there's no known solution,
>> and no guaranteed mitigation strategy, I'm afraid. Various things like
>> balancing, or adding, balancing, and removing a device again have been
>> tried. Sometimes they seem to help; sometimes they just make the
>> problem worse.
>>
>> We average maybe one report a week or so(*) with this particular
>> set of symptoms.
>>
>> Hugo.
>>
>> (*)
>>
>>> You might see if trying to re-run the balance with
>>> '-dusage=40 -musage=40' works correctly (I've seen cases where the
>>> first run fails, but subsequent ones work because the first one made
>>> some progress despite failing).
>>> >On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
>>> >>On 2015-11-13 11:12, Vedran Vucic wrote:
>>> >>>
>>> >>>Hello,
>>> >>>
>>> >>>I succeeded to delete illegal snapshot with command:
>>> >>>btrfs subvolume delete /.snapshots/741/snapshot
>>> >>>When I have done
>>> >>>btrfs balance / -dusage=0 -musage=0
>>> >>>increasing value up to 40 I did not have issues.
>>> >>>But on value 4- for -dusage= and -musage=
>>> >>>I got message that there is no space left on disk.
>>> >>>Do you have any advice how to manage that?
>>> >>
>>> >>
>>> >>Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
>>> >>should have changed after the balance, and I'd need to see what it looks
>>> >>like now to be able to give any reasonable advice.
>>> >>
>>> >>
>>>
>>
>> --
>> Hugo Mills             | You can play with your friends' privates, but you
>> hugo@... carfax.org.uk | can't play with your friends' childrens' privates.
>> http://carfax.org.uk/  |
>> PGP: E2AB1DE4          |                                   C++ coding rule

Re: illegal snapshot, cannot be deleted
On 2015-11-13 14:55, Hugo Mills wrote:
> On Fri, Nov 13, 2015 at 02:40:44PM -0500, Austin S Hemmelgarn wrote:
>> On 2015-11-13 13:42, Hugo Mills wrote:
>>> On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
>>>> On 2015-11-13 12:30, Vedran Vucic wrote:
>>>>> Hello,
>>>>>
>>>>> Here are outputs of commands as you requested:
>>>>> btrfs fi df /
>>>>> Data, single: total=8.00GiB, used=7.71GiB
>>>>> System, DUP: total=32.00MiB, used=16.00KiB
>>>>> Metadata, DUP: total=1.12GiB, used=377.25MiB
>>>>> GlobalReserve, single: total=128.00MiB, used=0.00B
>>>>>
>>>>> btrfs fi show
>>>>> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
>>>>> Total devices 1 FS bytes used 8.08GiB
>>>>> devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>>>>>
>>>>> btrfs-progs v4.0+20150429
>>>>>
>>>> Hmm, that's odd, based on these numbers, you should be having no
>>>> issue at all trying to run a balance. You might be hitting some
>>>> other bug in the kernel, however, but I don't remember if there were
>>>> any known bugs related to ENOSPC or balance in the version you're
>>>> running.
>>>
>>> There's one specific bug that shows up with ENOSPC exactly like
>>> this. It's in all versions of the kernel, there's no known solution,
>>> and no guaranteed mitigation strategy, I'm afraid. Various things like
>>> balancing, or adding, balancing, and removing a device again have been
>>> tried. Sometimes they seem to help; sometimes they just make the
>>> problem worse.
>>>
>>> We average maybe one report a week or so with this particular
>>> set of symptoms.
>> We should get this listed on the Wiki on the Gotcha's page ASAP,
>> especially considering that it's a pretty significant bug (not quite
>> as bad as data corruption, but pretty darn close).
>
> It's certainly mentioned in the FAQ, in the main entry on
> unexpected ENOSPC. The text takes you through identifying when there's
> the "usual" problem, then goes on to say that if you've hit ENOSPC
> with free space still to be unallocated, you've got this issue.

It should still probably be on the Gotcha's page also, as it definitely
fits the general description of the stuff there.

>> Vedran, could you try running the balance with just '-dusage=40' and
>> then again with just '-musage=40'? If just one of those fails, it
>> could help narrow things down significantly.
>>
>> Hugo, is there anything else known about this issue (I don't recall
>> seeing it mentioned before, and a quick web search didn't turn up
>> much)?
>
> I grumble about it regularly on IRC, where we get many more
> reports of it than on the mailing list. There have been a couple on
> here that I can recall, but not many.

Ah, that would explain it, I'm almost never on IRC.

>> In particular:
>> 1. Is there any known way to reliably reproduce it (I would assume
>> not, as that would likely lead to a mitigation strategy. If someone
>> does find a reliable reproducer, please let me know, I've got some
>> significant spare processor time and storage space I could dedicate
>> to getting traces and filesystem images for debugging, and already
>> have most of the required infrastructure set up for something like
>> this)?
>
> None that I know of. I can start asking people for btrfs-image
> dumps again, if you want to investigate. I did do that for a while, to
> pass them to josef, but he said he didn't need any more of them after
> a while. (He was always planning on investigating it, but kept getting
> diverted by data corruption bugs, which have higher priority).

I don't have the experience to be able to properly debug it myself from
images (my expertise has always been finding bugs, not necessarily
fixing them), but was more offering to try and generate images (if we
could find some series of commands that reproduces this at least some of
the time, I have the resources to run a couple of VM's doing that over
and over again until it hits the bug). If I could get some, I might be
able to put some assertions into the kernel so that it panics when
there's an ENOSPC in the balance code, and get a stack trace, but the
more I think about it, the more likely it seems that that isn't going to
be too helpful.

>> 2. Is it contagious (that is, if I send a snapshot from a filesystem
>> that is affected by it, does the filesystem that receives the
>> snapshot become affected; if we could find a way to reproduce it, I
>> could easily answer this question within a couple of minutes of
>> reproducing it)?
>
> No, as far as I know, it doesn't transfer via send/receive.
> send/receive is largely equivalent to copying the data by other means
> -- receive is implemented almost exclusively in userspace, with only a
> couple of ioctls for mucking around with the UUIDs at the end.

I thought that might be the case, but wanted to ask just to be safe (I
do local backups on some systems using send/receive, largely because
this means if my regular root filesystem gets corrupted, I can directly
boot the backups, run a couple of commands, and then have a working
system again in about 5 or 10 minutes, but if this could spread through
send/receive, then that makes backups done this way less useful
(because this is
Re: illegal snapshot, cannot be deleted
Hugo Mills posted on Fri, 13 Nov 2015 19:55:20 +0000 as excerpted:

> receive is implemented almost exclusively in userspace, with only a
> couple of ioctls for mucking around with the UUIDs at the end.

I wasn't aware of that and had assumed kernel space. Apart from the
topic of discussion here, that has implications for the old "how old is
too old" versioning question regarding userspace, so thanks, Hugo. =:^)

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

Re: illegal snapshot, cannot be deleted
Hello,

I guess that it might be a bug in the kernel.
I was successful with this:
btrfs balance start / -dusage=50 -musage=35

musage above 35 caused an ENOSPC message. Otherwise it was good.
Thanks for the support,

vedran

On Fri, Nov 13, 2015 at 7:42 PM, Hugo Mills wrote:
> On Fri, Nov 13, 2015 at 01:10:12PM -0500, Austin S Hemmelgarn wrote:
>> On 2015-11-13 12:30, Vedran Vucic wrote:
>> >Hello,
>> >
>> >Here are outputs of commands as you requested:
>> > btrfs fi df /
>> >Data, single: total=8.00GiB, used=7.71GiB
>> >System, DUP: total=32.00MiB, used=16.00KiB
>> >Metadata, DUP: total=1.12GiB, used=377.25MiB
>> >GlobalReserve, single: total=128.00MiB, used=0.00B
>> >
>> >btrfs fi show
>> >Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
>> > Total devices 1 FS bytes used 8.08GiB
>> > devid    1 size 18.71GiB used 10.31GiB path /dev/sda6
>> >
>> >btrfs-progs v4.0+20150429
>> >
>> Hmm, that's odd, based on these numbers, you should be having no
>> issue at all trying to run a balance. You might be hitting some
>> other bug in the kernel, however, but I don't remember if there were
>> any known bugs related to ENOSPC or balance in the version you're
>> running.
>
> There's one specific bug that shows up with ENOSPC exactly like
> this. It's in all versions of the kernel, there's no known solution,
> and no guaranteed mitigation strategy, I'm afraid. Various things like
> balancing, or adding, balancing, and removing a device again have been
> tried. Sometimes they seem to help; sometimes they just make the
> problem worse.
>
> We average maybe one report a week or so(*) with this particular
> set of symptoms.
>
> Hugo.
>
> (*)
>
>> You might see if trying to re-run the balance with
>> '-dusage=40 -musage=40' works correctly (I've seen cases where the
>> first run fails, but subsequent ones work because the first one made
>> some progress despite failing).
>> >On Fri, Nov 13, 2015 at 5:30 PM, Austin S Hemmelgarn wrote:
>> >>On 2015-11-13 11:12, Vedran Vucic wrote:
>> >>>
>> >>>Hello,
>> >>>
>> >>>I succeeded to delete illegal snapshot with command:
>> >>>btrfs subvolume delete /.snapshots/741/snapshot
>> >>>When I have done
>> >>>btrfs balance / -dusage=0 -musage=0
>> >>>increasing value up to 40 I did not have issues.
>> >>>But on value 4- for -dusage= and -musage=
>> >>>I got message that there is no space left on disk.
>> >>>Do you have any advice how to manage that?
>> >>
>> >>
>> >>Can you post the output of 'btrfs fi df' and 'btrfs fi show' again? Both
>> >>should have changed after the balance, and I'd need to see what it looks
>> >>like now to be able to give any reasonable advice.
>> >>
>> >>
>>
>
> --
> Hugo Mills             | You can play with your friends' privates, but you
> hugo@... carfax.org.uk | can't play with your friends' childrens' privates.
> http://carfax.org.uk/  |
> PGP: E2AB1DE4          |                                   C++ coding rule

Re: illegal snapshot, cannot be deleted
On Fri, Nov 13, 2015 at 09:11:46PM +0000, Duncan wrote:
> Hugo Mills posted on Fri, 13 Nov 2015 19:55:20 +0000 as excerpted:
>
> > receive is implemented almost exclusively in userspace, with only a
> > couple of ioctls for mucking around with the UUIDs at the end.
>
> I wasn't aware of that and had assumed kernel space. Apart from the
> topic of discussion here, that has implications for the old "how old is
> too old" versioning question regarding userspace, so thanks, Hugo. =:^)

Note that send is still heavily kernel-side. It's only receive
that's all in userspace.

Hugo.

--
Hugo Mills             |
hugo@... carfax.org.uk |                                           __(_'>
http://carfax.org.uk/  |                                          Squeak!
PGP: E2AB1DE4          |

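For readers unfamiliar with the send/receive pair being discussed, a minimal sketch of one backup round follows. Everything named here is an assumption for illustration: the `send_snapshot` helper, the `DRY_RUN` switch, the date-based snapshot name and the example paths do not come from the thread. It relies only on the fact that `btrfs send` needs a read-only snapshot as its source and that `btrfs receive` writes into a directory on another mounted btrfs filesystem.

```sh
#!/bin/sh
# Sketch: take a read-only snapshot and stream it to a second btrfs
# filesystem. By default the commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

send_snapshot() {
    # $1 = source subvolume, $2 = directory that will hold the snapshot,
    # $3 = directory on the receiving btrfs filesystem
    snap="$2/backup-$(date +%Y%m%d)"
    if [ "$DRY_RUN" = 1 ]; then
        echo "btrfs subvolume snapshot -r $1 $snap"
        echo "btrfs send $snap | btrfs receive $3"
    else
        # The kernel generates the stream (send); the userspace tool
        # replays it on the target filesystem (receive).
        btrfs subvolume snapshot -r "$1" "$snap" &&
            btrfs send "$snap" | btrfs receive "$3"
    fi
}
```

`send_snapshot / /.snapshots /mnt/backup` would print the two commands for a hypothetical backup disk mounted at `/mnt/backup`.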
Re: illegal snapshot, cannot be deleted
Hugo Mills posted on Fri, 13 Nov 2015 21:13:41 + as excerpted:

> On Fri, Nov 13, 2015 at 09:11:46PM +, Duncan wrote:
>> Hugo Mills posted on Fri, 13 Nov 2015 19:55:20 + as excerpted:
>>
>> > receive is implemented almost exclusively in userspace, with only a
>> > couple of ioctls for mucking around with the UUIDs at the end.
>>
>> I wasn't aware of that and had assumed kernel space. Apart from the
>> topic of discussion here, that has implications for the old "how old is
>> too old" versioning question regarding userspace, so thanks, Hugo. =:^)
>
> Note that send is still heavily kernel-side. It's only receive
> that's all in userspace.

Being "runtime", send's kernel-side use would be expected. The more
general rule is that runtime operations are kernel side, so user side
versioning doesn't have the same importance, unless you're trying to use
userside tools such as check, rescue and recover, generally run on an
unmounted btrfs, to recover from damage, since in the general case that's
where userspace code really goes to work and thus where its version
becomes important.

That the receive side of the send/receive feature is an exception to this
general rule is the news, since now we have a runtime tool, normally run
on a mounted btrfs, where the userspace code is doing the work and
therefore the userspace version, newer versions having the latest fixes,
becomes important.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: illegal snapshot, cannot be deleted
On 2015-11-11 17:11, Vedran Vucic wrote:
> Hello,
>
> I use OpenSuse 13.2 on my Toshiba Satellite laptop. I noticed that I run
> out of disk space, checked documentation and I realized that there were
> many snapshots. I used Yast Snapper to delete snapshots.
> I noticed that one snapshot with number 748 could not be deleted.
> I entered terminal and after the command:
> snapper -c root delete 748
> I got message Illegal snapshot.

This sounds like some sort of issue with snapper, not BTRFS itself, but see
below for some suggestions.

> I woudl like to delete it since it is old one.
> Please find details about my system as requested on your wiki page.
> uname -a
> Linux linux-jjcc.site 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23 00:46:04
> UTC 2015 (6be6a97) i686 i686 i386 GNU/Linux
>
> btrfs --version
> btrfs-progs v4.0+20150429
>
> btrfs fi show
> Label: none uuid: d6934db3-3ac9-49d0-83db-287be7b995a5
> Total devices 1 FS bytes used 10.98GiB
> devid    1 size 18.71GiB used 18.71GiB path /dev/sda6
>
> btrfs fi df /
> Data, single: total=15.19GiB, used=10.37GiB
> System, DUP: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=1.75GiB, used=622.53MiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=208.00MiB, used=0.00B
> Please find attached dmesg.log as requested.
>
> Please advise what have to do in order to delete snapshot that is reported
> to be illegal.

Have you tried running 'btrfs subvolume delete' on the snapshot? You'll
have to find the full path to it first of course, but that shouldn't be too
hard. Based on the lack of BTRFS error messages in the kernel log you
posted, I'm pretty certain that this isn't an issue with the filesystem
itself (although the filesystem doesn't look particularly healthy, see
further below), so manually deleting the snapshot using the regular BTRFS
commands should work just fine.
That said, you may also want to look into changing the config for snapper,
as it has a ridiculously aggressive retention policy for snapshots by
default, which tends to lead to excessive space usage on filesystems
smaller than about 250GB.

You may also want to look at running a balance on the filesystem. The
numbers from 'btrfs fi show' and 'btrfs fi df' look somewhat worrying:
you've got all the space on the disk allocated as chunks by BTRFS, but have
a significant amount of empty space in those chunks. Given that, ENOSPC
issues are a very real possibility, and you'll probably have to run a
series of partial balances to fix this (it's important to do this before it
becomes a visible issue, because once you start getting ENOSPC errors, it
is a lot harder to fix).

Try running a balance with '-dusage=0 -musage=0', then re-run it
repeatedly, increasing the number for both arguments by 5 each time until
you get to 50. If a run complains about 'ENOSPC errors during balance',
re-run it with the same number for -dusage and -musage. If you end up
re-running with the same value 3 times and keep getting the errors, then
you're probably beyond the point of this being fixable, and should just
recreate the filesystem (you do have backups, right?). Otherwise, after
finishing the run with '-dusage=50 -musage=50' successfully, run a full
balance without the dusage and musage options.
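The stepped balance described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not a tested recovery tool: the function name stepped_balance and the BTRFS variable are made up for illustration, and any non-zero exit status from 'btrfs balance start' is treated as the ENOSPC case, which is a simplification.

```shell
#!/bin/sh
# Assumption: BTRFS names the btrfs binary; it is a variable only so that
# the command can be swapped out for a dry run.
BTRFS="${BTRFS:-btrfs}"

# stepped_balance MOUNTPOINT
# Runs partial balances with -dusage/-musage from 0 to 50 in steps of 5,
# re-running a failed step up to 3 times, then runs a full balance.
stepped_balance() {
    mnt="$1"
    usage=0
    while [ "$usage" -le 50 ]; do
        reruns=0
        # Assumption: any failure is handled like 'ENOSPC errors during balance'.
        until "$BTRFS" balance start "-dusage=$usage" "-musage=$usage" "$mnt"; do
            if [ "$reruns" -ge 3 ]; then
                echo "still failing at usage=$usage after 3 re-runs, giving up" >&2
                return 1
            fi
            reruns=$((reruns + 1))
        done
        usage=$((usage + 5))
    done
    # Final step: full balance without the usage filters.
    "$BTRFS" balance start "$mnt"
}
```

It would be called as root with the mount point, for example 'stepped_balance /'. The script only defines the function, so nothing runs until it is called explicitly.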