Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 31.07.2013 21:24, Matthew Miller wrote:
> On Wed, Jul 31, 2013 at 08:18:52PM +0200, Reindl Harald wrote:
>> you are aware how much 10% of 8 TB are?
>
> So why *not* keep more logs, at least while nothing else is using it?

To save space? Where I use "Thin Provisioning", full backups of the machines are mandatory, and you do not want to have hundreds of gigabytes of logs.

>> you need at least a lot of more fuzzy logic
>> * not more than XXX MB
>> * or vary the percentage depending on the drive size
>> * if /var/log is a dedicated partition *nothing* reserved
>
> That last, at least, seems reasonable

At least the second too. Nobody looks at terabytes of logs, and the few people who do are not the norm and can configure it.

-- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, Jul 31 2013 at 5:53pm -0400, Chris Murphy wrote:
> On Jul 31, 2013, at 12:38 PM, Eric Sandeen wrote:
> >
> > i.e. if you only want the efficient snapshots, a way to fully-provision
> > a "thinp" device. I'm still not sure if this is possible…?
>
> […]
>
> > I guess I'm pretty nervous about offering actual thin provisioned
> > storage to "average" Fedora users. I'm having nightmares about the "bug"
> > reports already, just based on the likelihood of most users misunderstanding
> > the feature and it's requirements & expected behavior…
>
> So possibly the installer should be conservative about how thin the
> provisioning is;

We (David Lehman, myself and others on our respective teams) decided some months ago that any thin LVs that anaconda establishes will _not_ oversubscribe the thin-pool. In fact, a reserve of free space will be kept in the thin-pool as well as in the parent VG.

> otherwise I'm imagining inadequately provisioned thinp LV, while also
> using the rollback feature [1].

Can you elaborate? Rollback with LVM thin provisioning doesn't require any additional space in the pool. It is a simple matter of swapping the internal device_ids that the thin-pool uses as an index to access the corresponding thin volumes. This is done when activating the thin volumes.

LVM2's support for thinp snapshot merge (aka rollback) is still pending, but RFC patches have been published via this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=957881

> [1] https://fedoraproject.org/wiki/Changes/Rollback

The Rollback project authors have been having periodic concalls with David Lehman, myself and others. So we are relatively coordinated ;)

Mike
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 31, 2013, at 12:38 PM, Eric Sandeen wrote:
>
> i.e. if you only want the efficient snapshots, a way to fully-provision
> a "thinp" device. I'm still not sure if this is possible…?

[…]

> I guess I'm pretty nervous about offering actual thin provisioned
> storage to "average" Fedora users. I'm having nightmares about the "bug"
> reports already, just based on the likelihood of most users misunderstanding
> the feature and it's requirements & expected behavior…

So possibly the installer should be conservative about how thin the provisioning is; otherwise I'm imagining inadequately provisioned thinp LV, while also using the rollback feature [1].

[1] https://fedoraproject.org/wiki/Changes/Rollback

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, 2013-07-31 at 13:38 -0500, Eric Sandeen wrote: > On 7/31/13 12:08 PM, Chris Murphy wrote: > > > > On Jul 31, 2013, at 8:32 AM, Mike Snitzer > > wrote: > > > >> But on the desktop the fedora developers need to provide sane > >> policy/defaults. > > > > Right. And the concern I have (other than a blatant bug), is the F20 > > feature for the installer to create thinp LVs; and to do that the > > installer needs to know what sane default parameters are. I think > > perhaps determining those defaults is non-obvious because of my > > experience in this bug: > > https://bugzilla.redhat.com/show_bug.cgi?id=984236 > > > > If I'm going to use thinP mostly for snapshots, then that suggests a > > smaller chunk size at thin pool creation time; whereas if I have no > > need for snapshots but a greater need for provisioning then a larger > > chunk size is better. And asking usage context in the installer, I > > think is a problem. > > Quite some time ago I had asked whether we could get the allocation-tracking > snapshot niceties from dm-thinp, without actually needing it to be thin. > > i.e. if you only want the efficient snapshots, a way to fully-provision > a "thinp" device. I'm still not sure if this is possible...? > > I guess I'm pretty nervous about offering actual thin provisioned > storage to "average" Fedora users. I'm having nightmares about the "bug" > reports already, just based on the likelihood of most users misunderstanding > the feature and it's requirements & expected behavior... There was some discussion in #anaconda yesterday about making it available only via custom partitioning rather than the Installation Options dropdown, IIRC. That has the effect of denoting it as an 'advanced feature' and requiring more expertise on the part of the user to set it up. 
--
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, 2013-07-31 at 11:52 +0200, Zdenek Kabelac wrote:
...
> ThinP should be configured in a way that admin is able to extend pool to
> honour promised space if really needed. It's not a good idea, to provision
> 1EB if you have at most just 1TB disk and then you expect you will have no
> problems when someone fallocate() 500TB.
>
> I.e. if someone is using iSCSI disc array with 'hw' thin provisioning
> support, there is no scsi command to provision space - it's admin's work to
> ensure there is enough disc space to keep up with user demands

Oops, Zdenek is likely repeating a mis-statement I made about the SCSI Standard on a call yesterday. I just checked and I was wrong: the latest draft of the Standard does provide a way to pre-provision space. Sorry, I should have checked before speaking.

In March 2010 the SCSI committee added the concept of "anchored thin provisioning" to the (proposed) SCSI Block Commands - 3 (SBC-3) Standard. This allows a logical block to be in one of three states: mapped, deallocated, or anchored. A write command that specifies an anchored LBA does not require allocation of additional LBA mapping resources for that LBA. A write command that specifies a deallocated LBA may require allocation of LBA mapping resources.

This change was proposed by David Black from EMC. The justification reflects our discussion:

"There is extensive experience with this sort of resource preallocation mechanism in filesystems, as most physical filesystems are effectively thin provisioned courtesy of the way that file space allocation works. In that domain, this sort of preallocation mechanism is useful and used selectively (e.g., the fallocate() primitive in Linux and Unix systems). In this context, SCSI anchored functionality can be viewed as extending filesystem notions of preallocation down to include SCSI thin provisioning."
So,
1) others have seen the need for pre-allocation in thinp environments,
2) hardware will eventually show up that implements it,
3) it appears as though the extension to fallocate that Mike suggested is worth investigating,
4) if we do this, we will want to add the concept to LVM thinp, and
5) to the plumbing in Linux SCSI so we can pass it to capable hardware.

Tom
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 31, 2013, at 1:44 PM, Mike Snitzer wrote:
> Did you ever look
> to see if CONFIG_DM_DEBUG_BLOCK_STACK_TRACING is enabled?

It's not enabled in either the regular or debug kernels found in koji.

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, Jul 31 2013 at 2:38pm -0400, Eric Sandeen wrote:
> On 7/31/13 12:08 PM, Chris Murphy wrote:
> >
> > On Jul 31, 2013, at 8:32 AM, Mike Snitzer wrote:
> >
> >> But on the desktop the fedora developers need to provide sane
> >> policy/defaults.
> >
> > Right. And the concern I have (other than a blatant bug), is the F20
> > feature for the installer to create thinp LVs; and to do that the
> > installer needs to know what sane default parameters are. I think
> > perhaps determining those defaults is non-obvious because of my
> > experience in this bug:
> > https://bugzilla.redhat.com/show_bug.cgi?id=984236
> >
> > If I'm going to use thinP mostly for snapshots, then that suggests a
> > smaller chunk size at thin pool creation time; whereas if I have no
> > need for snapshots but a greater need for provisioning then a larger
> > chunk size is better. And asking usage context in the installer, I
> > think is a problem.
>
> Quite some time ago I had asked whether we could get the allocation-tracking
> snapshot niceties from dm-thinp, without actually needing it to be thin.
>
> i.e. if you only want the efficient snapshots, a way to fully-provision
> a "thinp" device. I'm still not sure if this is possible...?

TBD, we could add a "sandeen_makes_thinp_his_bitch" param and if specified (likely for the entire pool, but we'll see) it would mean thin volumes allocating from the pool would have their logical address space reserved to be completely contiguous on creation (with all thin blocks flagged in metadata as RESERVED). The actual thin block allocation (zeroing of blocks on first write if configured, etc.) transitions the metadata's block from RESERVED to PROVISIONED.

It is not yet clear to me that the DM thinp code can be easily adapted to make the thin block allocation 2-staged. But it would seem to be a prereq for dm-thinp's REQ_RESERVE support.
I'll check with Joe (cc'd) and come back with his dose of reality ;)

> I guess I'm pretty nervous about offering actual thin provisioned
> storage to "average" Fedora users. I'm having nightmares about the "bug"
> reports already, just based on the likelihood of most users misunderstanding
> the feature and it's requirements & expected behavior...

Heh, you shouldn't be nervous. You can just punt said bugs over the fence, right? ;)

Mike
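The two-stage allocation Mike describes above (blocks flagged RESERVED when a volume is created, then moved to PROVISIONED on first write) can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyThinPool`, `create_thin`, `write`, and the `UNMAPPED` state are hypothetical names invented for illustration, nothing here reflects how dm-thinp metadata is really stored, and raising ENOSPC on exhaustion is a simplification (the thread notes that thinp did not return ENOSPC at the time).

```python
import errno
from enum import Enum


class BlockState(Enum):
    UNMAPPED = "unmapped"        # hypothetical: no pool space set aside yet
    RESERVED = "reserved"        # pool space set aside, never written
    PROVISIONED = "provisioned"  # pool space allocated and written


class ToyThinPool:
    """Toy model only; not the dm-thinp implementation."""

    def __init__(self, total_blocks):
        self.free_blocks = total_blocks
        self.volumes = {}  # volume name -> list of BlockState

    def create_thin(self, name, nr_blocks, fully_reserve=False):
        """Create a volume; optionally reserve all of its blocks up front."""
        if fully_reserve:
            if nr_blocks > self.free_blocks:
                raise OSError(errno.ENOSPC, "pool cannot back this volume")
            self.free_blocks -= nr_blocks
            initial = BlockState.RESERVED
        else:
            initial = BlockState.UNMAPPED
        self.volumes[name] = [initial] * nr_blocks

    def write(self, name, block):
        """First write moves a block to PROVISIONED."""
        blocks = self.volumes[name]
        if blocks[block] is BlockState.UNMAPPED:
            # A thin block needs pool space at write time and may fail.
            if self.free_blocks == 0:
                raise OSError(errno.ENOSPC, "pool exhausted")
            self.free_blocks -= 1
        # A RESERVED block already owns its space, so this cannot fail.
        blocks[block] = BlockState.PROVISIONED
```

In this model a write to a reserved volume keeps succeeding after the pool is exhausted, while a write to an ordinary thin volume fails. That is the guarantee a "fully provisioned thinp device" would be asked to give.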
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, Jul 31 2013 at 1:08pm -0400, Chris Murphy wrote:
>
> On Jul 31, 2013, at 8:32 AM, Mike Snitzer wrote:
>
> > But on the desktop the fedora
> > developers need to provide sane policy/defaults.
>
> Right. And the concern I have (other than a blatant bug), is the F20
> feature for the installer to create thinp LVs; and to do that the
> installer needs to know what sane default parameters are. I think
> perhaps determining those defaults is non-obvious because of my
> experience in this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=984236

Hmm, certainly a strange one. But some bugs can be. Did you ever look to see if CONFIG_DM_DEBUG_BLOCK_STACK_TRACING is enabled? It wouldn't explain any dmeventd memleak issues but could help explain the slowness associated with mkfs.btrfs on top of thinp. Anyway, to be continued in the BZ...

> If I'm going to use thinP mostly for snapshots, then that suggests a
> smaller chunk size at thin pool creation time; whereas if I have no
> need for snapshots but a greater need for provisioning then a larger
> chunk size is better. And asking usage context in the installer, I
> think is a problem.

It is certainly less than ideal but we haven't come up with an alternative yet. As Zdenek mentioned in comment#13 of the BZ you referenced, what we're looking to do is establish default profiles for at least the 2 use-cases you mentioned. lvm2 has recently grown profile support. We just need to come to terms with what constitutes sufficiently small and sufficiently large thinp block sizes.

We're doing work to zero in on the best defaults... so ultimately this is still up in the air. But my current thinking for these 2 profiles is:

* for performance, use data device's optimal_io_size if > 64K.
  - this will yield a thinp block_size that is a full stripe on RAID[56]
* for snapshots, use data device's minimum_io_size if > 64K.
If/when we have the kernel REQ_RESERVE support to prealloc space in the thin-pool it _could_ be that we make the snapshots profile the default; and anything that wanted more performance could use fallocate().

But it is a slippery slope because many apps could overcompensate and always use fallocate()... we really don't want that. So some form of quota might need to be enforced on a cgroup level (once a cgroup's reservation quota is exceeded, fallocate()'s REQ_RESERVE bio pass down will be skipped). Grafting cgroup-based policy into DM is a whole other can of worms, but doable.

Open to other ideas...

Mike
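Mike's two tentative profiles (optimal_io_size for performance, minimum_io_size for snapshots, each only "if > 64K") can be written down as a small selection function. This is a sketch, not lvm2's profile logic: `pick_chunk_size` and `read_io_hint` are hypothetical helper names, and falling back to exactly 64K when the hint is not larger is an assumption, since the message does not say what happens in that case. The `queue/minimum_io_size` and `queue/optimal_io_size` sysfs attributes are the standard block-layer I/O hints.

```python
from pathlib import Path

FLOOR = 64 * 1024  # the 64K threshold from the message; using it as the fallback is assumed


def read_io_hint(device, name, sysfs_root="/sys/block"):
    """Read queue/minimum_io_size or queue/optimal_io_size (bytes) for a device."""
    return int(Path(sysfs_root, device, "queue", name).read_text().strip())


def pick_chunk_size(profile, minimum_io_size, optimal_io_size, floor=FLOOR):
    """Choose a thin-pool chunk size in bytes for one of the two proposed profiles."""
    if profile == "performance":
        hint = optimal_io_size   # a full stripe on RAID5/6, per the message
    elif profile == "snapshots":
        hint = minimum_io_size   # smaller chunks copy less data per snapshot write
    else:
        raise ValueError(f"unknown profile: {profile!r}")
    # Use the device hint only when it exceeds the floor (assumed fallback).
    return hint if hint > floor else floor
```

For a device name such as `sda` (chosen purely for illustration), `pick_chunk_size("snapshots", read_io_hint("sda", "minimum_io_size"), read_io_hint("sda", "optimal_io_size"))` would give the snapshot-oriented value. The result would still need to satisfy whatever constraints the volume manager places on chunk sizes.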
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, Jul 31, 2013 at 08:18:52PM +0200, Reindl Harald wrote:
> you are aware how much 10% of 8 TB are?

So why *not* keep more logs, at least while nothing else is using it?

> you need at least a lot of more fuzzy logic
> * not more than XXX MB
> * or vary the percentage depending on the drive size
> * if /var/log is a dedicated partition *nothing* reserved

That last, at least, seems reasonable.

--
Matthew Miller ☁☁☁ Fedora Cloud Architect ☁☁☁
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 31.07.2013 20:14, Zbigniew Jędrzejewski-Szmek wrote:
> journald provides configuration knobs to exactly set the limits.
> But forcing the admin to always configure this is something that
> should be avoided, and reasonable values that work OK most of the
> time should be used. Those defaults (15% of available /var/log, 10% free)
> may not be perfect, but they give reasonable behaviour on various
> systems, large and small. This is true even on btrfs with 50%
> overestimate of free space

Are you aware how much 10% of 8 TB is? This is fundamentally broken in the same way as the "5% reserved for root" is these days.

You need at least a lot more fuzzy logic:
* not more than XXX MB
* or vary the percentage depending on the drive size
* if /var/log is a dedicated partition, *nothing* reserved
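The "fuzzy logic" asked for above can be sketched as a percentage rule with absolute caps. This is purely illustrative and is not journald's algorithm: `log_limits` is a hypothetical helper, the 15% and 10% figures are simply the defaults quoted in the thread, and the 4 GiB and 1 GiB caps are made-up stand-ins for the "XXX MB" placeholder.

```python
GIB = 1024 ** 3


def log_limits(fs_total_bytes, dedicated_partition=False,
               max_use_pct=15, keep_free_pct=10,
               max_use_cap=4 * GIB, keep_free_cap=1 * GIB):
    """Return (max_use, keep_free) in bytes for a log directory.

    Percentages scale with the filesystem; the caps (assumed values)
    stop them from growing without bound on very large disks.
    """
    max_use = min(fs_total_bytes * max_use_pct // 100, max_use_cap)
    if dedicated_partition:
        # Nothing else competes for the space, so reserve nothing.
        keep_free = 0
    else:
        keep_free = min(fs_total_bytes * keep_free_pct // 100, keep_free_cap)
    return max_use, keep_free
```

With these assumed caps, an 8 TB filesystem is limited by the absolute values rather than by hundreds of gigabytes, while a small disk is still governed by the percentages.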
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 7/31/13 12:08 PM, Chris Murphy wrote:
>
> On Jul 31, 2013, at 8:32 AM, Mike Snitzer wrote:
>
>> But on the desktop the fedora developers need to provide sane
>> policy/defaults.
>
> Right. And the concern I have (other than a blatant bug), is the F20
> feature for the installer to create thinp LVs; and to do that the
> installer needs to know what sane default parameters are. I think
> perhaps determining those defaults is non-obvious because of my
> experience in this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=984236
>
> If I'm going to use thinP mostly for snapshots, then that suggests a
> smaller chunk size at thin pool creation time; whereas if I have no
> need for snapshots but a greater need for provisioning then a larger
> chunk size is better. And asking usage context in the installer, I
> think is a problem.

Quite some time ago I had asked whether we could get the allocation-tracking snapshot niceties from dm-thinp, without actually needing it to be thin.

i.e. if you only want the efficient snapshots, a way to fully-provision a "thinp" device. I'm still not sure if this is possible...?

I guess I'm pretty nervous about offering actual thin provisioned storage to "average" Fedora users. I'm having nightmares about the "bug" reports already, just based on the likelihood of most users misunderstanding the feature and its requirements & expected behavior...

-Eric

> Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29, 2013 at 05:34:05PM -0400, Simo Sorce wrote: > On Mon, 2013-07-29 at 23:06 +0200, Lennart Poettering wrote: > > Well, the point I am making is that it is wrong to ask userspace to > > handle this. Get the APIs right you expose to userspace. > > If user space assume it can use 'all the space up to 15% from exhausting > space' then it is user space that is wrong to me. > Even w/o thin provisioning you do not want any application to > boundlessly consume terabytes of space just because it happen to sit on > a *big* disk. > Applications may use heuristics to better behave in 'common' situations, > but should limit themselves also on the max space they are going to > consume in general (or better have admin controllable knobs to do so > coupled with reasonable defaults). journald provides configuration knobs to exactly set the limits. But forcing the admin to always configure this is something that should be avoided, and reasonable values that work OK most of the time should be used. Those defaults (15% of available /var/log, 10% free) may not be perfect, but they give reasonable behaviour on various systems, large and small. This is true even on btrfs with 50% overestimate of free space. > > I mean, ultimately for me it doesn't matter I geuss, since you say > > neither the fs/block layer nor userspace should care, but that this is > > the admin's problem, but that really sounds like chickening out to > > me... > > Given the admin is the only one that knows for any given situation what > is more important to him I do not think there is much more you can do. > Sure you can set defaults or what not but there isn't any configuration > that will ever be right short of something that can read minds. > > What you can do is give admin knobs to tweak in user space, as well as > in the kernel. 
> So that applications can limit themselves based on
> configuration, lacking those knobs you need to provide mechanisms a la
> cgroups that are used to hard limit misbehaving apps that think they are
> at an all-you-can-it buffet.

Let's say that I'm downloading something in the browser, or creating an ISO image in brasero. I think it would be really awful if those applications couldn't tell me that I don't have enough space (without actually exhausting it and hitting a limit), like they can now. So it's not a question of misbehaving.

Zbyszek

--
they are not broken. they are refucktored
-- alxchk
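The kind of up-front check Zbyszek has in mind (a browser or disc-burning tool warning before it starts writing) usually comes down to a statvfs() call. A minimal sketch, assuming a hypothetical helper name `has_room`; as the rest of the thread points out, the answer is advisory only, since another writer can win the race and on thin-provisioned storage the reported free space may exceed what the pool can really back.

```python
import os


def has_room(path, needed_bytes):
    """Advisory check: does the filesystem holding `path` report enough free space?"""
    st = os.statvfs(path)
    # f_bavail counts blocks available to unprivileged users, in f_frsize units.
    available = st.f_bavail * st.f_frsize
    return needed_bytes <= available
```

An application would call something like `has_room(download_dir, expected_size)` before starting and still handle a failed write afterwards, because a positive answer is not a reservation.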
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 31, 2013, at 8:32 AM, Mike Snitzer wrote:
> But on the desktop the fedora
> developers need to provide sane policy/defaults.

Right. And the concern I have (other than a blatant bug), is the F20 feature for the installer to create thinp LVs; and to do that the installer needs to know what sane default parameters are. I think perhaps determining those defaults is non-obvious because of my experience in this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=984236

If I'm going to use thinP mostly for snapshots, then that suggests a smaller chunk size at thin pool creation time; whereas if I have no need for snapshots but a greater need for provisioning then a larger chunk size is better. And asking usage context in the installer, I think is a problem.

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/31/2013 10:32 AM, Mike Snitzer wrote:
> On Mon, Jul 29 2013 at 2:48pm -0400, Eric Sandeen wrote:
>> On 7/27/13 11:56 AM, Lennart Poettering wrote:
>>> On Fri, 26.07.13 22:13, Miloslav Trmač (mitr at volny.cz) wrote:
>>>> Hello all,
>>>> with thin provisioning available, the total and free space values
>>>> reported by a filesystem do not necessarily mean that that much space
>>>> is _actually_ available (the actual backing storage may be smaller, or
>>>> shared with other filesystems).
>>>>
>>>> If your package reports disk space usage to users, and bases this on
>>>> filesystem free space, please consider whether it might need to take
>>>> LVM thin provisioning into account.
>>>>
>>>> The same applies if your package automatically allocates a certain
>>>> proportion of the total or available space.
>>>>
>>>> A quick way to check whether your package is likely to be affected, is
>>>> to look for statfs() or statvfs() calls in C, or the equivalent in
>>>> your higher-level library / programming language.
>>>
>>> Well, I am pretty sure the burden must be on the file systems to report
>>> a useful estimate free blocks value in statfs()/statvfs(). Exporting that
>>> problem to userspace and expecting userspace to work around this is just
>>> wrong. In fact, this would be quite an API breakage if applications
>>> cannot rely that the value returned is at least a rough estimate on how
>>> much data can be stored on disk.
>>>
>>> journald will scale how much disk usage it will use of /var/log/journal
>>> based on the file system size and free level. It will also module the
>>> per-service rate limit levels based on the amount of free disk space. If
>>> you break the API of statfs()/statvfs(), then you will end up break this
>>> and all programs like it.
>>
>> Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere,
>> until you successfully write to it, it's not yours! :) (Ok, thinp
>> running out of space won't generate ENOSPC today, either, but you see
>> my general point...)
>>
>> And how much space are we really talking about here? If you're running
>> thin-provisioning on thin margins, especially w/o some way to automatically
>> hot-add storage, you're probably doing it wrong.
>>
>> (And if journald sees "100T free" and decides it can use 50T of that,
>> it's doing it wrong, too) ;)
>>
>> The truth is somewhere in the middle, but quibbling over whether this
>> app or that can claim a bit of space behind a thin-provisioned volume
>> probably isn't useful.
>
> Right, so picking up on what we've discussed: adding the ability to have
> fallocate propagate to the underlying storage via a new REQ_RESERVE bio
> (if the storage opts-in, which dm-thinp could). This bio would be the
> reciprocal of discard -- thus enabling the caller to efficiently reserve
> space in the underlying storage (e.g. dm-thin-pool). So volumes or apps
> (e.g. journald) that _expect_ to have fully-provisioned space from thinp
> could.

I think that this would be really useful and, as you mention, is the mirror image of our discard support.

ric

> This would also allow for a hybrid setup where the thin-pool is
> configured to use a smaller block size to benefit taking many snapshots
> -- but then allows select apps and/or volumes to reserve contiguous
> space from the thin-pool. It obviously also offers the other
> traditional fallocate benefits too (reserving large contiguous space for
> performance, etc).
>
> I'll draft an RFC patch or 2 for LKML... may take some time for me to
> get to it but I can make it a higher priority if others have serious
> interest.
>
>> The admin definitely needs tools to see the state of thinly provisioned
>> storage, but that's the admin's job to worry about, not the app's, IMHO.
>
> Yeah, in a data center the admin really should be all over these thinp
> concerns, making them a non-issue. But on the desktop the fedora
> developers need to provide sane policy/defaults.
>
> Mike
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Wed, Jul 31 2013 at 5:52am -0400, Zdenek Kabelac wrote: > Dne 31.7.2013 10:39, Florian Weimer napsal(a): > >On 07/29/2013 08:38 PM, Ric Wheeler wrote: > > > >>If application A does a stat or statvfs() call, sees 1GB of space left > >>and then does a write, we could easily lose that race to any other > >>application. > >> > >>If you want to reserve space, you need to grab the space yourself > >>(always works with a large "write()" but preallocation can also help > >>without dm-thin). > > > >In order to have it work "always", you'll have to write unpredictable data. > >If you write just zeros, the reservation isn't guaranteed if the file system > >supports compression. > > > >I'm pretty sure we want a crass layering violation for this one (probably a > >new mode flag for fallocate), to ensure proper storage reservation for things > >like VM images. > > > If someone wants to use preallocation - thus always allocate whole space, > than there is no reason to use provisioned devices unless someone > want's to use its snapshot feature (for this there could be > probably introduced something like creation of fully provisioned > device) - but then you end-up > with the same problem once you start to use snapshot. > > For me this rather looks like misuse of thin provisioning. > > ThinP should be configured in a way that admin is able to extend > pool to honour promised space if really needed. It's not a good > idea, to provision 1EB if you have at most just 1TB disk and then > you expect you will have no problems when someone fallocate() 500TB. fallocate doesn't allow you to reserve more physical space than you have (or allowed via quota). > I.e. 
> if someone is using iSCSI disc array with 'hw' thin
> provisioning support, there is no scsi command to provision space -
> it's admin's work to ensure there is enough disc space to keep up
> with user demands
>
> Maybe - just an idea - there could be a kernel bit-flag somewhere,
> which might tell if the device used by fs is 'fully provisioned' or
> 'thin provisioned' (something like rotational/non-rotational) But
> there is no way to return information about free disc space - since
> it's highly subjective value and moreover very expensive to
> calculate.

If things like journald _need_ to have a sysfs flag that denotes that the volume it is writing to is thinp, then I'd like to understand what it'd do differently knowing that info. Would it conditionally call fallocate(), assuming dm-thinp grows REQ_RESERVE support like I mentioned in my previous post?

I see little value in exposing whether some portion of the storage stack is thin or not. What is an app to do with that info? It'd have to do things like:
1) determine the blockdevice the FS is layered on
2) lookup sysfs file for that device

And a filesystem can span multiple devices, sometimes and sometimes not. It is just a rat's nest.

Thinly provisioned storage isn't exactly a new concept. But the Linux provided target obviously engages other parts of the OS to properly support it (at a minimum the volume manager and the installer).

If the fallocate() triggered REQ_RESERVE passdown to the underlying storage provides a reasonable stop gap we can really explore it -- at least we'd be piggybacking on an established interface that returns success or failure.

Mike
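To make the two lookup steps Mike lists concrete, here is roughly what an application would have to do on Linux. This is a sketch with hypothetical helper names (`sysfs_dir_for`, `backing_devices`); it relies on the `/sys/dev/block/MAJOR:MINOR` entries and the `slaves` directory that stacked block devices expose in sysfs, and it deliberately shows the weakness he points out: filesystems such as btrfs or tmpfs report a device number with no matching sysfs entry, and a stacked or multi-device filesystem needs further recursion.

```python
import os


def sysfs_dir_for(path):
    """Step 1: map a path to the sysfs directory of its block device."""
    dev = os.stat(path).st_dev
    return f"/sys/dev/block/{os.major(dev)}:{os.minor(dev)}"


def backing_devices(path):
    """Step 2: list the devices directly underneath (e.g. below a dm device).

    Returns an empty list when there is no sysfs entry or no 'slaves'
    directory, which is exactly the ambiguous case an app cannot resolve.
    """
    slaves = os.path.join(sysfs_dir_for(path), "slaves")
    if not os.path.isdir(slaves):
        return []
    return sorted(os.listdir(slaves))
```

Even when both steps succeed, the application only learns device names. It would still need some per-device "thin or not" attribute, which did not exist at the time of this thread.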
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29 2013 at 2:48pm -0400, Eric Sandeen wrote: > On 7/27/13 11:56 AM, Lennart Poettering wrote: > > On Fri, 26.07.13 22:13, Miloslav Trmač (mitr at volny.cz) wrote: > > > >> Hello all, > >> with thin provisioning available, the total and free space values > >> reported by a filesystem do not necessarily mean that that much space > >> is _actually_ available (the actual backing storage may be smaller, or > >> shared with other filesystems). > >> > >> If your package reports disk space usage to users, and bases this on > >> filesystem free space, please consider whether it might need to take > >> LVM thin provisioning into account. > >> > >> The same applies if your package automatically allocates a certain > >> proportion of the total or available space. > >> > >> A quick way to check whether your package is likely to be affected, is > >> to look for statfs() or statvfs() calls in C, or the equivalent in > >> your higher-level library / programming language. > > > > Well, I am pretty sure the burden must be on the file systems to report > > a useful estimate free blocks value in statfs()/statvfs(). Exporting that > > problem to userspace and expecting userspace to work around this is just > > wrong. In fact, this would be quite an API breakage if applications > > cannot rely that the value returned is at least a rough estimate on how > > much data can be stored on disk. > > > > journald will scale how much disk usage it will use of /var/log/journal > > based on the file system size and free level. It will also module the > > per-service rate limit levels based on the amount of free disk space. If > > you break the API of statfs()/statvfs(), then you will end up break this > > and all programs like it. > > Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere, > until you successfully write to it, it's not yours! :) (Ok, thinp > running out of space won't generate ENOSPC today, either, but you see > my general point...) 
>
> And how much space are we really talking about here? If you're running
> thin-provisioning on thin margins, especially w/o some way to automatically
> hot-add storage, you're probably doing it wrong.
>
> (And if journald sees "100T free" and decides it can use 50T of that,
> it's doing it wrong, too) ;)
>
> The truth is somewhere in the middle, but quibbling over whether this
> app or that can claim a bit of space behind a thin-provisioned volume
> probably isn't useful.

Right, so picking up on what we've discussed: we could add the ability to have fallocate propagate to the underlying storage via a new REQ_RESERVE bio (if the storage opts in, which dm-thinp could). This bio would be the reciprocal of discard, enabling the caller to efficiently reserve space in the underlying storage (e.g. dm-thin-pool). Volumes or apps (e.g. journald) that _expect_ fully provisioned space from thinp could then get it.

This would also allow for a hybrid setup where the thin-pool is configured to use a smaller block size to benefit taking many snapshots, while select apps and/or volumes can still reserve contiguous space from the thin-pool. It also offers the other traditional fallocate benefits (reserving large contiguous space for performance, etc).

I'll draft an RFC patch or two for LKML. It may take some time for me to get to it, but I can make it a higher priority if others have serious interest.

> The admin definitely needs tools to see the state of thinly provisioned
> storage, but that's the admin's job to worry about, not the app's, IMHO.

Yeah, in a data center the admin really should be all over these thinp concerns, making them a non-issue. But on the desktop the Fedora developers need to provide sane policy/defaults.

Mike
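For reference, here is a minimal sketch of what an application can already do from userspace today: preallocate with `posix_fallocate()` and treat ENOSPC as a normal outcome. The function name `reserve_space` is made up for this illustration. On a thin-provisioned volume this only reserves blocks in the filesystem, not in the thin-pool, which is the gap the REQ_RESERVE idea above is meant to close.

```python
import errno
import os


def reserve_space(path: str, size: int) -> bool:
    """Try to preallocate `size` bytes for `path`.

    Returns True on success and False if the filesystem reports ENOSPC.
    Any other error is re-raised unchanged.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Asks the filesystem to allocate blocks for [0, size).
        os.posix_fallocate(fd, 0, size)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise
    finally:
        os.close(fd)
```

A caller would typically fall back to a smaller size, or refuse to start, when this returns False.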
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29 2013 at 2:49pm -0400, Daniel P. Berrange wrote: > On Mon, Jul 29, 2013 at 02:38:23PM -0400, Ric Wheeler wrote: > > On 07/29/2013 10:18 AM, Daniel P. Berrange wrote: > > >On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote: > > >>On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" > >>redhat.com> wrote: > > >> > > >>>Yep, we need to be able to report free space on filesystems, so that > > >>>apps provisioning virtual machines can get an idea of how much storage > > >>>they can provide to VMs without risk of over comitting. > > >>> > > >>>I agree that we really want the kernel, or at least a reusable shared > > >>>library, to provide some kind of interface to determine this, rather > > >>>than requiring every userspace app which cares to re-invent the wheel. > > >>What does it mean for an app to use stat to get free space, and then > > >>proceeds to create too big a VM image in a directory that has a quota > > >>set? I still think apps are asking an inappropriate/unqualified question > > >>by asking for volume free space, instead of what's available to them for > > >>a specified path. > > > From an API POV, libvirt doesn't need/care about the free space on the > > >volume underlying the filesystem. We actually only care about the free > > >space in a given directory that we're using for disk images. It just > > >happens that we implement this using statvfs() currently. So when I > > >ask for an API above, don't take this to mean I want a statvfs() that > > >knows about sparse volumes. An API or syscall that provides free space > > >for individual directories is fine with me. > > > > > > > Just another note, it is never safe to assume that storage under any > > file system is yours for the taking. > > > > If application A does a stat or statvfs() call, sees 1GB of space > > left and then does a write, we could easily lose that race to any > > other application. > > This race doesn't matter from libvirt's POV. 
It is just providing a > mechanism via its API. It is upto the management application using > libvirt to make use of the mechanism to provide a usage policy. > Their usage scenario may well enable them to make certain assumptions > about the storage that you could not otherwise do in a race free > manner. > > In addition, even in more general purpose usage scenarios, it does > not neccessarily matter if there is a race, because there can be a > second line of defence. For example, KVM can be set to pause the VM > upon ENOSPC errors, giving management application or administrator > the chance to expand capacity the underlying storage and then unpause > the guest. In that case checking the free space is mostly just a > sanity check which serves to avoid hitting the pause-on-ENOSPC scenario > too frequently. Running out of free space _should_ be extremely rare. A properly configured dm-thin pool will have adequate free space, with an appropriate low water mark, that would give admins ample time to extend (even if a human were to do it). But lvm2 has support to autoextend the thin-pool with free space in the parent volume group. But I'm just talking about the not-really-chicken solution of leaning on a properly configured system (either by admins in a data center or by fedora developers with sane defaults). As an aside, this extra free space checking that KVM is doing is really broken by design (polling sucks -- especially if this polling is happening in the host for each guest). Would be much better to leverage something like lvm2 with a custom dmeventd plugin that fires when it receives the low watermark and/or -ENOSPC event. Thinly provisioned volumes offer the prospect of doing away with this polling -- as such proper dm-thin integration has been on the virt roadmap for a while. Just never seems to happen. 
Mike
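The low-water-mark and autoextend behaviour described above can be pictured with a small model. This is only a sketch of the policy idea, not lvm2's code: the function name, the parameter names and the default percentages are assumptions chosen for the example.

```python
def autoextend(pool_size, used, vg_free, threshold_pct=70, extend_pct=20):
    """Apply one step of a simplified autoextend policy.

    pool_size, used and vg_free are in the same unit (e.g. extents).
    Returns (new_pool_size, new_vg_free).
    """
    # Below the low water mark: nothing to do.
    if used * 100 < pool_size * threshold_pct:
        return pool_size, vg_free
    # Grow by a percentage of the current pool size, limited by what
    # the parent volume group still has free.
    grow = min(pool_size * extend_pct // 100, vg_free)
    return pool_size + grow, vg_free - grow
```

Once `vg_free` reaches zero the pool can no longer grow, which is the point at which a human (or a monitoring hook) has to add physical storage.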
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 31.7.2013 10:39, Florian Weimer wrote:
> On 07/29/2013 08:38 PM, Ric Wheeler wrote:
> > If application A does a stat or statvfs() call, sees 1GB of space
> > left and then does a write, we could easily lose that race to any
> > other application.
> >
> > If you want to reserve space, you need to grab the space yourself
> > (always works with a large "write()" but preallocation can also help
> > without dm-thin).
>
> In order to have it work "always", you'll have to write unpredictable
> data. If you write just zeros, the reservation isn't guaranteed if the
> file system supports compression.
>
> I'm pretty sure we want a crass layering violation for this one
> (probably a new mode flag for fallocate), to ensure proper storage
> reservation for things like VM images.

If someone wants to use preallocation, and thus always allocate the whole space, then there is no reason to use provisioned devices unless they want the snapshot feature (for this, something like creation of a fully provisioned device could probably be introduced). But then you end up with the same problem once you start to use snapshots.

To me this looks rather like misuse of thin provisioning. ThinP should be configured in a way that the admin is able to extend the pool to honour the promised space if really needed. It's not a good idea to provision 1EB if you have at most a 1TB disk and then expect to have no problems when someone fallocate()s 500TB.

I.e. if someone is using an iSCSI disk array with 'hw' thin provisioning support, there is no SCSI command to provision space; it's the admin's work to ensure there is enough disk space to keep up with user demands.

Maybe, just an idea, there could be a kernel bit-flag somewhere which tells whether the device used by the fs is 'fully provisioned' or 'thin provisioned' (something like rotational/non-rotational).

But there is no way to return information about free disk space, since it's a highly subjective value and moreover very expensive to calculate.
Zdenek
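To make the "something like rotational/non-rotational" comparison concrete, this is a small sketch of how userspace reads the existing rotational flag from sysfs. A thin-provisioning flag of the kind suggested above does not exist; if one were added next to it, it could be read the same way. The helper name `read_queue_flag` and the `sysfs_root` parameter are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def read_queue_flag(device: str, flag: str,
                    sysfs_root: str = "/sys") -> Optional[bool]:
    """Read a 0/1 attribute from <sysfs_root>/block/<device>/queue/<flag>.

    Returns True or False for "1" or "0", and None if the attribute
    does not exist (for example, a flag the kernel does not provide).
    """
    path = Path(sysfs_root) / "block" / device / "queue" / flag
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None
```

For example, `read_queue_flag("sda", "rotational")` reports whether the kernel considers `sda` a spinning disk, assuming such a device exists on the machine.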
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 08:38 PM, Ric Wheeler wrote:
> If application A does a stat or statvfs() call, sees 1GB of space
> left and then does a write, we could easily lose that race to any
> other application.
>
> If you want to reserve space, you need to grab the space yourself
> (always works with a large "write()" but preallocation can also help
> without dm-thin).

In order to have it work "always", you'll have to write unpredictable data. If you write just zeros, the reservation isn't guaranteed if the file system supports compression.

I'm pretty sure we want a crass layering violation for this one (probably a new mode flag for fallocate), to ensure proper storage reservation for things like VM images.

--
Florian Weimer / Red Hat Product Security Team
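The "write unpredictable data" approach can be sketched as follows. This is only an illustration of the idea, with a made-up function name and chunk size; it is slow, and on copy-on-write storage it still says nothing about whether later overwrites will find space.

```python
import os


def reserve_by_writing(path: str, size: int, chunk: int = 1 << 20) -> int:
    """Fill `path` with `size` bytes of random data and sync it to disk.

    Random data cannot be compressed away or detected as zero blocks,
    so the space is really consumed. ENOSPC surfaces as OSError.
    Returns the number of bytes written.
    """
    written = 0
    with open(path, "wb") as f:
        while written < size:
            n = min(chunk, size - written)
            f.write(os.urandom(n))
            written += n
        f.flush()
        os.fsync(f.fileno())
    return written
```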
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 29.7.2013 23:06, Lennart Poettering wrote:
> On Mon, 29.07.13 16:52, Ric Wheeler (rwhee...@redhat.com) wrote:
> > > Oh, we don't assume it's all ours. We recheck regularly, immediately
> > > before appending to the journal files, of course assuming that we are
> > > not the only writers.
> >
> > With thinly provisioned storage (or things like btrfs, writeable
> > snapshots, etc), you will not really ever know how much space is
> > really there.
>
> Yeah, and that's an API regression.

I guess there is a major misunderstanding of what the whole purpose of thin provisioning is.

From this thread one could get the feeling that thinp just complicates the estimation of free space for filesystems :)

But the usage is quite different from the beginning:

- disk space is a costly resource
- resizing a filesystem (e.g. ext4) blocks usage and could be risky
- a lot of filesystems do not support native snapshots

So thinp is here to help with these things:

- Instead of running multi-terabyte disk arrays when the user is only using gigs of disk space, thinp allows adding storage when needed (so you can slowly extend your disk arrays with more hw, which needs more energy).
- Instead of resizing the fs all the time, you create a large fs from the beginning and let the block layer work the magic (at the price of disk fragmentation).
- Instead of repeatedly writing snapshot code for every fs, again you let the block layer handle it (at the price of less efficient disk space usage).

So the idea that the fs would return a different amount of free space when it is run on a thinly provisioned device is simply wrong from many points of view. And there is no point in supporting this, since with LVM you could replace the thinp device with a linear mirrored device online if that were needed. Obviously this would give you very floating results from any statfs()-style function you think should be supported.

Secondly, a thin pool is designed to grow, so from which number would you actually want to estimate the free size? From the current pool size? From the free space in the whole volume group? From the size of all disks which could be attached to the volume group? If you have multiple thin pools in a VG, what do you think the resulting value should be?

And finally, the snapshot feature makes the estimation of free space a very costly operation. If you run multiple snapshots, how do you estimate free space? What would be the meaning of this value?

> thinp should work the same. Of course, this requires that the block
> layer has to pass more metadata up to the file systems than before,
> but there's really nothing intrinsically evil about that, I mean, it
> could be as basic as just passing along a "provisioning percentage" or
> so which the fs will simply multiply into the returned values... (Of
> course it won't be that simple, but you get the concept...)

Sorry, but the only broken concept I can see here is allocating 50% of the free disk space just because it can be done. Disk space is not local RAM: if the FS tells you it has 1EB, that doesn't mean the program should just allocate 500TB for nothing.

In this case the admin obviously must properly configure the provisioned disk space for those users who are used to eating every single byte of their fs. Thinp can't resolve this.

Zdenek
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 29, 2013, at 3:06 PM, Lennart Poettering wrote:
>
> Well, journald is totally fine if it is lied to in the sense that the
> values returned by statfs()/statvfs() are just estimates, and not
> precise. However, it is assumed that the values are not off by > 100% as
> they might be on thinp...

E.g. a VM has a 1TB virtual device backed by a qcow2 file residing on a file system with 100GB free space, on a hard drive. That's > 1000%. It's not just thinp; this has been going on for a long time. So if journald wants to know something more real in the case of thinp, why not a passthrough from the real file system to the virtual file system in the case of qcow2?

> Well, the point I am making is that it is wrong to ask userspace to
> handle this. Get the APIs right you expose to userspace.

Effectively, for a long time now there hasn't been an API exposing the information you think user space requires. Apps using stat are asking a question of an API that by design only knows about the most immediate file system, not anything beyond it that may be backing it. The request for free space in the immediate file system seems vaguely reasonable, but needing information beyond the file system seems specious.

> I mean, ultimately for me it doesn't matter I guess, since you say
> neither the fs/block layer nor userspace should care, but that this is
> the admin's problem, but that really sounds like chickening out to
> me...

It may be that admins are going to need better tools to assist them in monitoring before crisis events occur. But it does seem the admin shouldn't be creating 16TB qcow2 files on 1TB of real backing, any more than they should do the same with thinp. They're just asking for trouble; the lie is too big.

Chris Murphy
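The "> 1000%" figure above is just the ratio of promised space to real backing space. A tiny sketch, assuming binary units (1TB taken as 1024GB) and a helper name invented for the example:

```python
def overcommit_percent(virtual_bytes: int, backing_bytes: int) -> float:
    """Promised (virtual) size as a percentage of the real backing space."""
    if backing_bytes <= 0:
        raise ValueError("backing_bytes must be positive")
    return virtual_bytes * 100 / backing_bytes


GIB = 1024 ** 3
# 1TB virtual disk on a file system with 100GB free.
print(overcommit_percent(1024 * GIB, 100 * GIB))
# 16TB of qcow2 images on 1TB of real backing.
print(overcommit_percent(16 * 1024 * GIB, 1024 * GIB))
```

The first case comes out at 1024 percent and the second at 1600 percent, which is the scale of "lie" the message is warning about.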
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 29, 2013, at 3:34 PM, Simo Sorce wrote:
>
> btrfs consumes space on each write to the same block.
>
> If you have a 10GB file system with a 5GB, existing log file and overwrite it
> twice in place, you will run out of space.

It's a sufficiently confusing example that I almost wish df (the typical user-space tool to learn of free space) would lie for btrfs volumes by subtracting 15% in the report.

The raid1/10 case is even more confusing, but only because users now expect to be lied to that they have 1/2 the storage space compared to what they purchased. Btrfs is telling the whole truth about the total available storage and that data is taking twice the allocation. So again it's managing user expectations. Maybe df should persist in lying somehow (although that is difficult with mixed raid levels in a single volume), and the lower-level stat and btrfs tools should tell the deeper story?

> There is no magic pony here for you - if you configure thin, you mean to use
> it to lie to the users and their file systems for a valid reason.

And not new. Qcow has allowed the situation for some time. A legitimate concern is how the file system behaves when its virtual storage suddenly runs out of backing storage space; that it can fail semi-gracefully (i.e. without file system corruption, even if there would be data loss).

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 05:06 PM, Lennart Poettering wrote:
> On Mon, 29.07.13 16:52, Ric Wheeler (rwhee...@redhat.com) wrote:
> > > Oh, we don't assume it's all ours. We recheck regularly, immediately
> > > before appending to the journal files, of course assuming that we are
> > > not the only writers.
> >
> > With thinly provisioned storage (or things like btrfs, writeable
> > snapshots, etc), you will not really ever know how much space is
> > really there.
>
> Yeah, and that's an API regression.

It is actually not an API regression. This is how file systems have always operated on enterprise storage (including writeable snapshots) and, to all practical purposes, whenever you are running in a multi-application environment.

In effect, there never was an API that gave you what you want outside of the "write(2)" system call :)

> On btrfs you can just add/remove device as you wish during runtime and
> statvfs() does reflect this immediately.

btrfs consumes space on each write to the same block.

If you have a 10GB file system with a 5GB, existing log file and overwrite it twice in place, you will run out of space.

> thinp should work the same. Of course, this requires that the block
> layer has to pass more metadata up to the file systems than before,
> but there's really nothing intrinsically evil about that, I mean, it
> could be as basic as just passing along a "provisioning percentage" or
> so which the fs will simply multiply into the returned values... (Of
> course it won't be that simple, but you get the concept...)

I would argue that it is working how it should work. If you want fully provisioned storage and are a single application/single user file system, you can configure your box that way.

Thin provisioned storage, by design, has a pool of real storage that is shared across all file systems that sit on devices that it serves. On SAN volumes, that means exactly that you share the physical storage pool across multiple hosts and all of their file systems.

The way it works assumes:

* the system administrator understands thin provisioned storage and the system workload to some rough level
* the sys admin sets the water marks appropriately so that when we hit a low water mark, we can add physical storage to the pool

There is no magic pony here for you - if you configure thin, you mean to use it to lie to the users and their file systems for a valid reason.

Applications can do whatever they want as long as the sys admin monitors the box properly and has a way to add storage when needed. Think "just in time" storage provisioning.

> > I am starting to think that this is critical enough that we might
> > want to always fully provision this - just like we would for audit
> > logs
> >
> > Checking won't hurt anything, but the storage stack will lie to you
> > (and honestly, we always have in many cases :)).
>
> Well, journald is totally fine if it is lied to in the sense that the
> values returned by statfs()/statvfs() are just estimates, and not
> precise. However, it is assumed that the values are not off by > 100% as
> they might be on thinp...

Or on btrfs, or on copy on write LVM (not just ours, but hardware LVM) snapshots, etc. Or if a large application is running that is about to do a pre-allocation of the rest of the free data.

The heuristic you assume does not work in any but the most constrained of all use cases.

> That the values are not perfectly accurate has been known forever. Since
> file systems existed developers knew that book-keeping and stuff means
> the returned values are slightly higher than practically reachable. And
> since compressed file systems they also knew that they might be lower
> than actually reachable. However, it's one thing to return bad
> estimates, and it is another thing to be totally off in the woods as is
> the case for thinp!

This is not new or unique to thinp.

> > There are some alerts that we can raise when you hit a low water
> > mark for the device mapper physical pool, it would be interesting to
> > talk about how you might leverage these.
>
> Well, the point I am making is that it is wrong to ask userspace to
> handle this. Get the APIs right you expose to userspace.
>
> I mean, ultimately for me it doesn't matter I guess, since you say
> neither the fs/block layer nor userspace should care, but that this is
> the admin's problem, but that really sounds like chickening out to
> me...

Not chickening out, just working as designed.

If you don't like this, you need to use traditional, fully provisioned storage and not use copy on write technologies (like btrfs or LVM writeable snapshots).

Apparently we have lied to you so well over the years that you just never noticed the reality of many other misleading IO stack configurations :)

Ric
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, 2013-07-29 at 23:06 +0200, Lennart Poettering wrote:
> Well, the point I am making is that it is wrong to ask userspace to
> handle this. Get the APIs right you expose to userspace.

If user space assumes it can use 'all the space up to 15% from exhausting space', then to me it is user space that is wrong.

Even w/o thin provisioning you do not want any application to boundlessly consume terabytes of space just because it happens to sit on a *big* disk.

Applications may use heuristics to behave better in 'common' situations, but should also limit the maximum space they are going to consume in general (or better, have admin-controllable knobs to do so, coupled with reasonable defaults).

> I mean, ultimately for me it doesn't matter I guess, since you say
> neither the fs/block layer nor userspace should care, but that this is
> the admin's problem, but that really sounds like chickening out to
> me...

Given the admin is the only one who knows, for any given situation, what is more important to him, I do not think there is much more you can do. Sure, you can set defaults or whatnot, but there isn't any configuration that will ever be right, short of something that can read minds.

What you can do is give the admin knobs to tweak in user space, as well as in the kernel, so that applications can limit themselves based on configuration. Lacking those knobs, you need to provide mechanisms a la cgroups that are used to hard-limit misbehaving apps that think they are at an all-you-can-eat buffet.

Simo.

-- Simo Sorce * Red Hat, Inc * New York
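The combination suggested above, a percentage heuristic bounded by an admin-controllable maximum with a reasonable default, might look like this. It is a sketch only: the function name, the knob name `admin_max` and the 4 GiB default are arbitrary choices for the example and do not describe any existing program.

```python
from typing import Optional

GIB = 1024 ** 3
DEFAULT_MAX_BYTES = 4 * GIB  # arbitrary illustrative default


def space_budget(fs_size: int, pct_of_fs: int = 10,
                 admin_max: Optional[int] = None) -> int:
    """Bytes an application may use on a filesystem of `fs_size` bytes.

    The percentage heuristic handles small disks; the absolute cap keeps
    the application from claiming terabytes just because the disk is big.
    `admin_max`, when set, overrides the built-in default cap.
    """
    heuristic = fs_size * pct_of_fs // 100
    cap = admin_max if admin_max is not None else DEFAULT_MAX_BYTES
    return min(heuristic, cap)
```

On a 10 GiB filesystem the heuristic wins (1 GiB); on an 8 TiB one the cap wins unless the admin raises it.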
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, 29.07.13 16:52, Ric Wheeler (rwhee...@redhat.com) wrote:
> >Oh, we don't assume it's all ours. We recheck regularly, immediately
> >before appending to the journal files, of course assuming that we are
> >not the only writers.
>
> With thinly provisioned storage (or things like btrfs, writeable
> snapshots, etc), you will not really ever know how much space is
> really there.

Yeah, and that's an API regression.

On btrfs you can just add/remove devices as you wish during runtime and statvfs() does reflect this immediately.

thinp should work the same. Of course, this requires that the block layer has to pass more metadata up to the file systems than before, but there's really nothing intrinsically evil about that. I mean, it could be as basic as just passing along a "provisioning percentage" or so which the fs will simply multiply into the returned values... (Of course it won't be that simple, but you get the concept...)

> I am starting to think that this is critical enough that we might
> want to always fully provision this - just like we would for audit
> logs
>
> Checking won't hurt anything, but the storage stack will lie to you
> (and honestly, we always have in many cases :)).

Well, journald is totally fine if it is lied to in the sense that the values returned by statfs()/statvfs() are just estimates, and not precise. However, it is assumed that the values are not off by > 100% as they might be on thinp...

That the values are not perfectly accurate has been known forever. Ever since file systems have existed, developers have known that book-keeping and such means the returned values are slightly higher than practically reachable. And since compressed file systems appeared, they have also known that the values might be lower than actually reachable. However, it's one thing to return bad estimates, and it is another thing to be totally off in the woods as is the case for thinp!

> There are some alerts that we can raise when you hit a low water
> mark for the device mapper physical pool, it would be interesting to
> talk about how you might leverage these.

Well, the point I am making is that it is wrong to ask userspace to handle this. Get the APIs right you expose to userspace.

I mean, ultimately for me it doesn't matter I guess, since you say neither the fs/block layer nor userspace should care, but that this is the admin's problem, but that really sounds like chickening out to me...

Lennart

-- Lennart Poettering - Red Hat, Inc.
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 04:35 PM, Lennart Poettering wrote: On Mon, 29.07.13 13:48, Eric Sandeen (sand...@redhat.com) wrote: Well, I am pretty sure the burden must be on the file systems to report a useful estimate free blocks value in statfs()/statvfs(). Exporting that problem to userspace and expecting userspace to work around this is just wrong. In fact, this would be quite an API breakage if applications cannot rely that the value returned is at least a rough estimate on how much data can be stored on disk. journald will scale how much disk usage it will use of /var/log/journal based on the file system size and free level. It will also module the per-service rate limit levels based on the amount of free disk space. If you break the API of statfs()/statvfs(), then you will end up break this and all programs like it. Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere, until you successfully write to it, it's not yours! :) (Ok, thinp running out of space won't generate ENOSPC today, either, but you see my general point...) Oh, we don't assume it's all ours. We recheck regularly, immediately before appending to the journal files, of course assuming that we are not the only writers. With thinly provisioned storage (or things like btrfs, writeable snapshots, etc), you will not really ever know how much space is really there. And how much space are we really talking about here? If you're running thin-provisioning on thin margins, especially w/o some way to automatically hot-add storage, you're probably doing it wrong. journald will by default allow the journal files to grow to 10% of the filesystem size of /var/log/journal, but will make sure that 15% is always kept free. This really is just about finding some useful parameters for sizing things that are likely to just work. It's not at all about making any algorithms depending on that, a way to avoid ENOSPC handling or anything like that. It's just about finding some sensible default metircs. 
Lennart

I am starting to think that this is critical enough that we might want to always fully provision this, just like we would for audit logs.

Checking won't hurt anything, but the storage stack will lie to you (and honestly, we always have in many cases :)).

There are some alerts that we can raise when you hit a low water mark for the device mapper physical pool; it would be interesting to talk about how you might leverage these.

Ric
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Monday, July 29, 2013 01:41:12 PM Chris Murphy wrote:
> On Jul 29, 2013, at 1:05 PM, Steve Grubb wrote:
> > The audit system also cares about space available. We tell people to
> > create a partition specifically for auditing so that we can keep close
> > track on what's left.
>
> How does the audit system determine space available?

It uses fstatfs() on the descriptor currently opened for logging.

-Steve

> If it's using btrfs configured for raid1 or raid10, df and stat will report
> the total storage of all devices in the volume, unlike md raid (or even
> proprietary raid). Instead df reports log files as using twice as much
> space.
>
>
> Chris Murphy
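A minimal sketch of the descriptor-based check described above. Python does not expose fstatfs() directly, so this uses the closely related `os.fstatvfs()`; the helper name is made up, and this is not the audit daemon's code.

```python
import os


def free_bytes_for_fd(fd: int) -> int:
    """Bytes available to unprivileged writers on the filesystem behind `fd`."""
    st = os.fstatvfs(fd)
    # f_bavail is counted in units of f_frsize (the fragment size).
    return st.f_bavail * st.f_frsize
```

Querying the open log descriptor, rather than a path, means the answer always refers to the filesystem the log is actually being written to, even if the path is later renamed or remounted.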
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, 29.07.13 13:48, Eric Sandeen (sand...@redhat.com) wrote:
> > Well, I am pretty sure the burden must be on the file systems to report
> > a useful estimated free-blocks value in statfs()/statvfs(). Exporting that
> > problem to userspace and expecting userspace to work around this is just
> > wrong. In fact, this would be quite an API breakage if applications
> > cannot rely that the value returned is at least a rough estimate on how
> > much data can be stored on disk.
> >
> > journald will scale how much disk usage it will use of /var/log/journal
> > based on the file system size and free level. It will also modulate the
> > per-service rate limit levels based on the amount of free disk space. If
> > you break the API of statfs()/statvfs(), then you will end up breaking this
> > and all programs like it.
>
> Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere,
> until you successfully write to it, it's not yours! :) (Ok, thinp
> running out of space won't generate ENOSPC today, either, but you see
> my general point...)

Oh, we don't assume it's all ours. We recheck regularly, immediately before appending to the journal files, of course assuming that we are not the only writers.

> And how much space are we really talking about here? If you're running
> thin-provisioning on thin margins, especially w/o some way to automatically
> hot-add storage, you're probably doing it wrong.

journald will by default allow the journal files to grow to 10% of the filesystem size of /var/log/journal, but will make sure that 15% is always kept free.

This really is just about finding some useful parameters for sizing things that are likely to just work. It's not at all about making any algorithms depend on that, a way to avoid ENOSPC handling or anything like that. It's just about finding some sensible default metrics.

Lennart

-- Lennart Poettering - Red Hat, Inc.
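The two defaults mentioned above (grow to 10% of the filesystem, always keep 15% free) can be combined as in the sketch below. This is one plausible reading of that description, not journald's actual code; the function names and the way the two limits are combined are assumptions.

```python
import os


def limit_from(fs_size: int, fs_avail: int, current_use: int = 0,
               max_use_pct: int = 10, keep_free_pct: int = 15) -> int:
    """Upper bound in bytes for the journal, given filesystem numbers."""
    max_use = fs_size * max_use_pct // 100
    keep_free = fs_size * keep_free_pct // 100
    # Space we could occupy while still leaving keep_free untouched.
    room = max(fs_avail + current_use - keep_free, 0)
    return min(max_use, room)


def journal_limit(path: str, current_use: int = 0) -> int:
    """Same calculation, fed from statvfs() on `path`."""
    st = os.statvfs(path)
    return limit_from(st.f_blocks * st.f_frsize,
                      st.f_bavail * st.f_frsize,
                      current_use)
```

Because `statvfs()` reports the size of the thin volume rather than of the pool behind it, both inputs can be far larger than what is physically available, which is the concern running through this thread.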
-- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
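The default sizing rule described in the message above (let the journal grow to 10% of the filesystem, but always keep 15% free) can be sketched roughly as follows. This is only an illustration of the stated policy, not journald's actual code; the function name `journal_budget` and its parameters are made up for this example, and the two input values are assumed to come from statfs()/statvfs(), so on thin-provisioned storage they are only as accurate as what the filesystem reports.

```python
def journal_budget(total_bytes, avail_bytes, used_by_journal=0,
                   max_use_pct=10, keep_free_pct=15):
    """Return how many more bytes the journal may grow by (hypothetical helper)."""
    # Cap 1: the journal may occupy at most max_use_pct of the filesystem size.
    by_size = total_bytes * max_use_pct // 100 - used_by_journal
    # Cap 2: keep_free_pct of the filesystem must stay free for other writers.
    by_free = avail_bytes - total_bytes * keep_free_pct // 100
    # The tighter of the two caps wins; never return a negative budget.
    return max(0, min(by_size, by_free))
```

Because both caps are recomputed on every call, a caller that rechecks immediately before each append (as described above) automatically backs off when other writers consume the free space.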
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 03:50 PM, Chris Adams wrote:
> Once upon a time, Chris Murphy said:
> > How does the audit system determine space available? If it's using btrfs
> > configured for raid1 or raid10, df and stat will report the total storage
> > of all devices in the volume, unlike md raid (or even proprietary raid).
> > Instead df reports log files as using twice as much space.
>
> How is _anything_ (programs, users, admins) supposed to know how much
> free space is left on btrfs? This behavior seems like an admin's
> nightmare.

With copy-on-write file systems like btrfs, you can consume space on writes to existing files when overwriting them. You can even consume space by removing files :)

Ric
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
Once upon a time, Chris Murphy said: > How does the audit system determine space available? If it's using btrfs > configured for raid1 or raid10, df and stat will report the total storage of > all devices in the volume, unlike md raid (or even proprietary raid). Instead > df reports logs files as using twice as much space. How is _anything_ (programs, users, admins) supposed to know how much free space is left on btrfs? This behavior seems like an admin's nightmare. -- Chris Adams -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 29, 2013, at 1:05 PM, Steve Grubb wrote: > > The audit system also cares about space available. We tell people to create a > partition specifically for auditing so that we can keep close track on what's > left. How does the audit system determine space available? If it's using btrfs configured for raid1 or raid10, df and stat will report the total storage of all devices in the volume, unlike md raid (or even proprietary raid). Instead df reports logs files as using twice as much space. Chris Murphy -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 03:05 PM, Steve Grubb wrote:
> On Friday, July 26, 2013 09:29:41 PM Eric Sandeen wrote:
> > On 7/26/13 3:13 PM, Miloslav Trmač wrote:
> > > A quick way to check whether your package is likely to be affected, is
> > > to look for statfs() or statvfs() calls in C, or the equivalent in
> > > your higher-level library / programming language.
> >
> > statfs will still tell you how much space the filesystem has allocated,
> > as well as how much space it thinks it has left, based on the total
> > space the disk has *said* it has available, just like it always ever
> > did.
> >
> > The difference, of course, is that you might actually run out of blocks
> > before you fill the fs. But I can't think offhand what apps would care.
> > And again, it's something the admin shouldn't let happen.
>
> The audit system also cares about space available. We tell people to create a
> partition specifically for auditing so that we can keep close track on what's
> left. But we have the requirement that for people that depend on it to "take
> away" system access should the ability to record audit events fail. We also
> need an accurate estimation before we run out so we can send an admin defined
> warning when the disk has filled to a certain point so that they can archive
> files or make space available. If we run out of disk space and were not able
> to warn admins and this was a shop that really cares about auditing, the
> system will either be shutdown or sent to single user mode for corrective
> action.
>
> So, having accurate space left numbers is real important so that systems
> don't get shutdown unexpectedly.
>
> -Steve

Of course, if you simply can never run out of space and have a special file system/device of your own, you should use fully provisioned storage.

dm-thin is not about solving *every* problem :)

ric

> > For now, consider it completely transparent to the user (unless the
> > admin doesn't keep up, in which case it will be anything *but*
> > transparent).
> >
> > TBH, when the backing store runs out of space, things do get pretty
> > ugly at this point. It's work that needs to be done to make it more
> > robust & graceful.
> >
> > -Eric
> >
> > > Mirek
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Friday, July 26, 2013 09:29:41 PM Eric Sandeen wrote: > On 7/26/13 3:13 PM, Miloslav Trmač wrote: > > A quick way to check whether your package is likely to be affected, is > > to look for statfs() or statvfs() calls in C, or the equivalent in > > your higher-level library / programming language. > > statfs will still tell you how much space the filesystem has allocated, > as well as how much space it thinks it has left, based on the total > space the disk has *said* it has available, just like it always ever > did. > > The difference, of course, is that you might actually run out of blocks > before you fill the fs. But I can't think offhand what apps would care. > And again, it's something the admin shouldn't let happen. The audit system also cares about space available. We tell people to create a partition specifically for auditing so that we can keep close track on what's left. But we have the requirement that for people that depend on it to "take away" system access should the ability to record audit events fail. We also need an accurate estimation before we run out so we can send an admin defined warning when the disk has filled to a certain point so that they can archive files or make space available. If we run out of disk space and were not able to warn admins and this was a shop that really cares about auditing, the system will either be shutdown or sent to single user mode for corrective action. So, having accurate space left numbers is real important so that systems don't get shutdown unexpectedly. -Steve > For now, consider it completely transparent to the user (unless the > admin doesn't keep up, in which case it will be anything *but* > transparent). > > TBH, when the backing store runs out of space, things do get pretty > ugly at this point. It's work that needs to be done to make it more > robust & graceful. 
> > -Eric > > > Mirek -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29, 2013 at 02:38:23PM -0400, Ric Wheeler wrote:
> On 07/29/2013 10:18 AM, Daniel P. Berrange wrote:
> >On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote:
> >>On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote:
> >>
> >>>Yep, we need to be able to report free space on filesystems, so that
> >>>apps provisioning virtual machines can get an idea of how much storage
> >>>they can provide to VMs without risk of overcommitting.
> >>>
> >>>I agree that we really want the kernel, or at least a reusable shared
> >>>library, to provide some kind of interface to determine this, rather
> >>>than requiring every userspace app which cares to re-invent the wheel.
> >>What does it mean for an app to use stat to get free space, and then
> >>proceeds to create too big a VM image in a directory that has a quota
> >>set? I still think apps are asking an inappropriate/unqualified question
> >>by asking for volume free space, instead of what's available to them for
> >>a specified path.
> > From an API POV, libvirt doesn't need/care about the free space on the
> >volume underlying the filesystem. We actually only care about the free
> >space in a given directory that we're using for disk images. It just
> >happens that we implement this using statvfs() currently. So when I
> >ask for an API above, don't take this to mean I want a statvfs() that
> >knows about sparse volumes. An API or syscall that provides free space
> >for individual directories is fine with me.
> >
>
> Just another note, it is never safe to assume that storage under any
> file system is yours for the taking.
>
> If application A does a stat or statvfs() call, sees 1GB of space
> left and then does a write, we could easily lose that race to any
> other application.

This race doesn't matter from libvirt's POV. It is just providing a
mechanism via its API. It is up to the management application using
libvirt to make use of the mechanism to provide a usage policy. Their
usage scenario may well enable them to make certain assumptions about
the storage that you could not otherwise make in a race-free manner.

In addition, even in more general purpose usage scenarios, it does
not necessarily matter if there is a race, because there can be a
second line of defence. For example, KVM can be set to pause the VM
upon ENOSPC errors, giving the management application or administrator
the chance to expand the capacity of the underlying storage and then
unpause the guest. In that case checking the free space is mostly just
a sanity check which serves to avoid hitting the pause-on-ENOSPC
scenario too frequently.

Daniel
--
|: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
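The "second line of defence" idea above can be sketched in a few lines: the free-space check is only advisory, and the real safety net is handling ENOSPC when the write itself fails. This is a minimal illustration and not libvirt or KVM code; `append_with_fallback`, `write_fn` and `on_enospc` are hypothetical names, and it assumes the storage stack actually surfaces ENOSPC to the writer, which (as noted elsewhere in this thread) a thin pool running out of space did not necessarily do at the time.

```python
import errno

def append_with_fallback(write_fn, data, on_enospc):
    """Attempt the write; if storage is full, hand control to on_enospc.

    Returns True when the write succeeded, False when ENOSPC was handled.
    Any other OSError is re-raised unchanged.
    """
    try:
        write_fn(data)
        return True
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise
        # Analogous to pausing a guest: let the caller free or add space,
        # then decide whether to retry.
        on_enospc(exc)
        return False
```

A caller would typically pass a callback that alerts an administrator or pauses the workload, and retry the write once capacity has been added.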
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 7/27/13 11:56 AM, Lennart Poettering wrote: > On Fri, 26.07.13 22:13, Miloslav Trmač (m...@volny.cz) wrote: > >> Hello all, >> with thin provisioning available, the total and free space values >> reported by a filesystem do not necessarily mean that that much space >> is _actually_ available (the actual backing storage may be smaller, or >> shared with other filesystems). >> >> If your package reports disk space usage to users, and bases this on >> filesystem free space, please consider whether it might need to take >> LVM thin provisioning into account. >> >> The same applies if your package automatically allocates a certain >> proportion of the total or available space. >> >> A quick way to check whether your package is likely to be affected, is >> to look for statfs() or statvfs() calls in C, or the equivalent in >> your higher-level library / programming language. > > Well, I am pretty sure the burden must be on the file systems to report > a useful estimate free blocks value in statfs()/statvfs(). Exporting that > problem to userspace and expecting userspace to work around this is just > wrong. In fact, this would be quite an API breakage if applications > cannot rely that the value returned is at least a rough estimate on how > much data can be stored on disk. > > journald will scale how much disk usage it will use of /var/log/journal > based on the file system size and free level. It will also module the > per-service rate limit levels based on the amount of free disk space. If > you break the API of statfs()/statvfs(), then you will end up break this > and all programs like it. Any program needs to be prepared for ENOSPC; as Ric mentioned elsewhere, until you successfully write to it, it's not yours! :) (Ok, thinp running out of space won't generate ENOSPC today, either, but you see my general point...) And how much space are we really talking about here? 
If you're running thin-provisioning on thin margins, especially w/o some way to automatically hot-add storage, you're probably doing it wrong. (And if journald sees "100T free" and decides it can use 50T of that, it's doing it wrong, too) ;) The truth is somewhere in the middle, but quibbling over whether this app or that can claim a bit of space behind a thin-provisioned volume probably isn't useful. The admin definitely needs tools to see the state of thinly provisioned storage, but that's the admin's job to worry about, not the app's, IMHO. -Eric -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 10:18 AM, Daniel P. Berrange wrote:
> On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote:
> > On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote:
> > > Yep, we need to be able to report free space on filesystems, so that
> > > apps provisioning virtual machines can get an idea of how much storage
> > > they can provide to VMs without risk of overcommitting.
> > >
> > > I agree that we really want the kernel, or at least a reusable shared
> > > library, to provide some kind of interface to determine this, rather
> > > than requiring every userspace app which cares to re-invent the wheel.
> >
> > What does it mean for an app to use stat to get free space, and then
> > proceeds to create too big a VM image in a directory that has a quota
> > set? I still think apps are asking an inappropriate/unqualified question
> > by asking for volume free space, instead of what's available to them for
> > a specified path.
>
> From an API POV, libvirt doesn't need/care about the free space on the
> volume underlying the filesystem. We actually only care about the free
> space in a given directory that we're using for disk images. It just
> happens that we implement this using statvfs() currently. So when I
> ask for an API above, don't take this to mean I want a statvfs() that
> knows about sparse volumes. An API or syscall that provides free space
> for individual directories is fine with me.
>
> Daniel

Just another note, it is never safe to assume that storage under any file system is yours for the taking.

If application A does a stat or statvfs() call, sees 1GB of space left and then does a write, we could easily lose that race to any other application.

If you want to reserve space, you need to grab the space yourself (always works with a large "write()" but preallocation can also help without dm-thin).

The difference with dm-thin is that all applications on all file systems backed by the same block pool compete for that space.

Another worry here is that preallocation is a file system concept; thinly provisioned storage (dm-thin or array backed) only sees proper writes, so you need to write to the space to really claim it.

Ric
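To make the point about claiming space concrete, here is a minimal sketch of "grabbing the space yourself" with real writes. The helper name `claim_space` and its parameters are assumptions for illustration. Preallocation (for example `os.posix_fallocate()` in Python) reserves blocks at the file system level only, whereas the loop below issues actual writes, which is what a thin pool sees. Whether zero-filled writes are really allocated by a given storage backend is outside what this sketch can guarantee.

```python
import os

def claim_space(path, nbytes, chunk=1 << 20):
    """Fill `path` with nbytes of real writes (hypothetical helper).

    Raises OSError (e.g. ENOSPC) if the space cannot be written.
    """
    block = b"\0" * chunk
    with open(path, "wb") as f:
        remaining = nbytes
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
        # Push the data out of the page cache so the block layer sees it.
        f.flush()
        os.fsync(f.fileno())
```

The cost is obvious: claiming space this way takes time proportional to the amount reserved, which is why it only makes sense for applications that truly cannot tolerate running out later.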
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/29/2013 10:01 AM, Chris Murphy wrote:
> On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote:
> > Yep, we need to be able to report free space on filesystems, so that
> > apps provisioning virtual machines can get an idea of how much storage
> > they can provide to VMs without risk of overcommitting.
> >
> > I agree that we really want the kernel, or at least a reusable shared
> > library, to provide some kind of interface to determine this, rather
> > than requiring every userspace app which cares to re-invent the wheel.
>
> What does it mean for an app to use stat to get free space, and then
> proceeds to create too big a VM image in a directory that has a quota
> set? I still think apps are asking an inappropriate/unqualified question
> by asking for volume free space, instead of what's available to them for
> a specified path.
>
> Chris Murphy

When you use thinly provisioned storage, the file system itself does not know how much physical storage is really backing it, so stat, df and friends *really* have no way to tell.

Think of it as the equivalent of virtual memory backed by physical DRAM - the virtual storage is backed by physical disk. It is up to the admin/installation tools to provision enough real storage to make this work. If you provision 10 file systems with a virtual 1TB each and only back them with 2TB of real disk, you will need to monitor the space (via device mapper tools) and dynamically throw in more disk when the physical pool runs low.

Regards,

Ric
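The 10 x 1TB over 2TB example above works out to a 5x overcommit. A small sketch of that arithmetic, plus the kind of low-water check an admin tool might run, is below. The function names and the 80% threshold are assumptions chosen for illustration; the pool usage figure would have to come from the device mapper or LVM tools, which this sketch does not call.

```python
def overcommit_ratio(virtual_sizes, pool_size):
    """Total virtual size promised to file systems divided by real pool size."""
    return sum(virtual_sizes) / pool_size

def pool_needs_growth(pool_used, pool_size, low_water_pct=80):
    """True once pool usage reaches the (hypothetical) low-water threshold."""
    return pool_used * 100 >= pool_size * low_water_pct

# Ten file systems of 1 TB each, backed by a 2 TB pool (sizes in TB).
ratio = overcommit_ratio([1] * 10, 2)
```

Here `ratio` is 5.0, meaning the pool can only ever hold a fifth of what the file systems believe they own, so the growth check has to fire well before any single file system looks full.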
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 29, 2013, at 8:18 AM, Daniel P. Berrange wrote: > > From an API POV, libvirt doesn't need/care about the free space on the > volume underlying the filesystem. We actually only care about the free > space in a given directory that we're using for disk images. It just > happens that we implement this using statvfs() currently. So when I > ask for an API above, don't take this to mean I want a statvfs() that > knows about sparse volumes. An API or syscall that provides free space > for individual directories is fine with me. Got it. So what's needed is an API that can return available space to user/application, while abstracting whether the limit is a function of btrfs raid peculiarities, thinp overcommitting, or file system quota? And then applications that need to know what this limit is, need to use this new API? Chris Murphy -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote: > > On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote: > > > Yep, we need to be able to report free space on filesystems, so that > > apps provisioning virtual machines can get an idea of how much storage > > they can provide to VMs without risk of over comitting. > > > > I agree that we really want the kernel, or at least a reusable shared > > library, to provide some kind of interface to determine this, rather > > than requiring every userspace app which cares to re-invent the wheel. > > What does it mean for an app to use stat to get free space, and then > proceeds to create too big a VM image in a directory that has a > quota set? I still think apps are asking an > inappropriate/unqualified question by asking for volume free space, > instead of what's available to them for a specified path. libvirt only does what users (or applications) ask of it. The problem is that the information is not available to give to users/ applications so they can make a good decision. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Mon, Jul 29, 2013 at 08:01:23AM -0600, Chris Murphy wrote: > > On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote: > > > Yep, we need to be able to report free space on filesystems, so that > > apps provisioning virtual machines can get an idea of how much storage > > they can provide to VMs without risk of over comitting. > > > > I agree that we really want the kernel, or at least a reusable shared > > library, to provide some kind of interface to determine this, rather > > than requiring every userspace app which cares to re-invent the wheel. > > What does it mean for an app to use stat to get free space, and then > proceeds to create too big a VM image in a directory that has a quota > set? I still think apps are asking an inappropriate/unqualified question > by asking for volume free space, instead of what's available to them for > a specified path. From an API POV, libvirt doesn't need/care about the free space on the volume underlying the filesystem. We actually only care about the free space in a given directory that we're using for disk images. It just happens that we implement this using statvfs() currently. So when I ask for an API above, don't take this to mean I want a statvfs() that knows about sparse volumes. An API or syscall that provides free space for individual directories is fine with me. Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
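For reference, asking "how much space is available under this directory" via statvfs() looks roughly like the sketch below. This is not libvirt's implementation; `fs_available_bytes` is a name invented here. It returns the file system's own view for unprivileged writers, so it knows nothing about a thin pool underneath, and whether directory quotas are reflected depends on the file system.

```python
import os

def fs_available_bytes(path):
    """Bytes the file system reports as available to unprivileged users."""
    st = os.statvfs(path)
    # f_bavail is counted in units of f_frsize (the fragment size).
    return st.f_bavail * st.f_frsize
```

A per-directory API of the kind requested above would keep this call signature but change where the answer comes from.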
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 29, 2013, at 6:30 AM, "Daniel P. Berrange" wrote: > Yep, we need to be able to report free space on filesystems, so that > apps provisioning virtual machines can get an idea of how much storage > they can provide to VMs without risk of over comitting. > > I agree that we really want the kernel, or at least a reusable shared > library, to provide some kind of interface to determine this, rather > than requiring every userspace app which cares to re-invent the wheel. What does it mean for an app to use stat to get free space, and then proceeds to create too big a VM image in a directory that has a quota set? I still think apps are asking an inappropriate/unqualified question by asking for volume free space, instead of what's available to them for a specified path. Chris Murphy -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, Jul 26, 2013 at 10:06:20PM +0100, Richard W.M. Jones wrote: > On Fri, Jul 26, 2013 at 10:13:42PM +0200, Miloslav Trmač wrote: > > Hello all, > > with thin provisioning available, the total and free space values > > reported by a filesystem do not necessarily mean that that much space > > is _actually_ available (the actual backing storage may be smaller, or > > shared with other filesystems). > > > > If your package reports disk space usage to users, and bases this on > > filesystem free space, please consider whether it might need to take > > LVM thin provisioning into account. > > > > The same applies if your package automatically allocates a certain > > proportion of the total or available space. > > > > A quick way to check whether your package is likely to be affected, is > > to look for statfs() or statvfs() calls in C, or the equivalent in > > your higher-level library / programming language. > > Also libvirt has a whole set of APIs around storage and > free space. Yep, we need to be able to report free space on filesystems, so that apps provisioning virtual machines can get an idea of how much storage they can provide to VMs without risk of over comitting. I agree that we really want the kernel, or at least a reusable shared library, to provide some kind of interface to determine this, rather than requiring every userspace app which cares to re-invent the wheel. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Sat, Jul 27, 2013 at 6:56 PM, Lennart Poettering wrote:
> On Fri, 26.07.13 22:13, Miloslav Trmač (m...@volny.cz) wrote:
>> The same applies if your package automatically allocates a certain
>> proportion of the total or available space.
>>
>> A quick way to check whether your package is likely to be affected, is
>> to look for statfs() or statvfs() calls in C, or the equivalent in
>> your higher-level library / programming language.
>
> Well, I am pretty sure the burden must be on the file systems to report
> a useful estimate free blocks value in statfs()/statvfs(). Exporting that
> problem to userspace and expecting userspace to work around this is just
> wrong. In fact, this would be quite an API breakage if applications
> cannot rely that the value returned is at least a rough estimate on how
> much data can be stored on disk.

Well, we have two subsystems making quite reasonable assumptions, with
the composition being unreasonable. We'll just have to figure a
solution out. That's what distributions are for, after all.
   Mirek.

P.S. WRT stat{v,}fs API breakage: I've been (as can be expected)
thinking about this a lot, and the primary criteria for API breakage
are IMHO:

1) When an "old" application is used in an "old" environment, the
results should not change (so that "things keep working", especially
on system upgrades; OTOH it's OK for an old application not to be able
to handle a "new" environment/to be broken by it).
2) When faced with an environment change, an interface should usually
behave strictly as documented, not preserve a specific use case and
break other use cases (because we can't know how prevalent the various
use cases are, especially in binary-only applications).
3) This implies that introducing new features typically means adding
new interfaces and updating applications to be able to use them.
That's, I think, quite reasonable.

So, I think that stat{v,}fs should continue to only report about the
file system:

1) the value of "free space" reported by stat{v,}fs() for a
SAN-located FS shouldn't change if Fedora 21 suddenly learns about SAN
thin provisioning
2) stat{v,}fs is explicitly documented to report about the file
system, not about the storage stack.

To be clear, it's more important to have _a_ solution than to have
specifically a solution that follows these ideas. That's why it's a
P.S.
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 27.07.2013 05:07, Chris Murphy wrote:
> On Jul 26, 2013, at 4:53 PM, Pádraig Brady wrote:
> > On 07/26/2013 09:13 PM, Miloslav Trmač wrote:
> > > Hello all,
> > > with thin provisioning available, the total and free space values
> > > reported by a filesystem do not necessarily mean that that much space
> > > is _actually_ available (the actual backing storage may be smaller, or
> > > shared with other filesystems).
> > >
> > > If your package reports disk space usage to users, and bases this on
> > > filesystem free space, please consider whether it might need to take
> > > LVM thin provisioning into account.
> > >
> > > The same applies if your package automatically allocates a certain
> > > proportion of the total or available space.
> > >
> > > A quick way to check whether your package is likely to be affected, is
> > > to look for statfs() or statvfs() calls in C, or the equivalent in
> > > your higher-level library / programming language.
> >
> > Anything df(1) should do here?
>
> Example: Creating a btrfs raid1 volume from two 2TB drives, df shows it
> as having 4TB available:
>
> # parted -l
> Error: /dev/sdb: unrecognised disk label
> Model: ATA VBOX HARDDISK (scsi)
> Disk /dev/sdb: 2199GB
> Sector size (logical/physical): 512B/512B
> Partition Table: unknown
> Disk Flags:
>
> Error: /dev/sdc: unrecognised disk label
> Model: ATA VBOX HARDDISK (scsi)
> Disk /dev/sdc: 2199GB
> Sector size (logical/physical): 512B/512B
> Partition Table: unknown
> Disk Flags:
>
> # mkfs.btrfs -d raid1 -m raid1 /dev/sd[bc]
>
> WARNING! - Btrfs v0.20-rc1 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
>
> adding device /dev/sdc id 2
> fs created label (null) on /dev/sdb
>     nodesize 4096 leafsize 4096 sectorsize 4096 size 4.00TB
> Btrfs v0.20-rc1
>
> # mount /dev/sdb /mnt
> # df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda1        79G  4.2G   71G   6% /
> devtmpfs        1.5G     0  1.5G   0% /dev
> tmpfs           1.5G     0  1.5G   0% /dev/shm
> tmpfs           1.5G  680K  1.5G   1% /run
> tmpfs           1.5G     0  1.5G   0% /sys/fs/cgroup
> tmpfs           1.5G  4.0K  1.5G   1% /tmp
> none            224G   87G  138G  39% /media/sf_chris
> /dev/sdb        4.0T   56K  4.0T   1% /mnt
>
> The explanation is that the file system isn't raid1, but rather the
> allocated chunks have this attribute. Presently a volume only allocates
> with one profile, but the future plan is per subvolume and even per file
> raid profiles. So establishing how much free space there is on a btrfs
> volume is absolutely less than clear.
>
> Anyway, I think it will cause some confusion if by "available" an
> application thinks it can write out more than 2TB of data to this
> example volume.

I thought one of the features of combining the block layer and filesystem layer like btrfs does is that the filesystem can actually know the state/topology of the block layer and work more efficiently.

Combined with the already existing problem of getting out of diskspace errors long before use hits 100% (has this been fixed since?) this makes any sort of capacity planning difficult if not impossible.

Regards,
  Dennis
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 27, 2013, at 10:56 AM, Lennart Poettering wrote:
>
> Well, I am pretty sure the burden must be on the file systems to report
> a useful estimate of the free blocks value in statfs()/statvfs().

tl;dr 4 VMs, each using one thinp LV. Each LV has a virtualsize of 1TB. The VG backing those LVs is 1TB. If each LV is actually using only 150GB, the real free space in the VG is 400GB. But how do you propose informing each VM of the real free space? Are they all informed there's 400GB of free space? Or do you just do a simple scaling and tell them 400GB/4 is free? OK, well, what if 2 of those VMs actively make use of snapshotting? The scaling approach quickly isn't going to work out for any of the VMs.

I think the burden is on the virtual storage layer designer/implementer. He shouldn't make 1TB virtualsize LVs when only 150GB is needed. The idea isn't to use thinp to totally eliminate the need to ever grow an LV and the underlying fs, but to reduce the need (perhaps significantly).

> Note that btrfs RAID is broken in a similar way: it will return the
> amount of actual free blocks to the user. However, if RAID is enabled,
> each file requires twice (or some other factor) the number of blocks,
> so the value is completely bogus. The btrfs RAID userspace API is simply
> broken.

It's a problem. I'm unconvinced it's broken. As I mentioned earlier, a btrfs volume as a whole doesn't have a raid profile set. It's the subvolume (or possibly a file). Because the work isn't done to enable per subvolume or per file raid profiles, this is done at mkfs time. But this actually only sets the profile for the default subvolume, not the whole file system. It just seems it is that way now.

So you could argue that in the meantime, btrfs devs should punt, and report free space similar to md and lvm raid.

The long term fix seems to require the application making a more qualified inquiry. Asking for the free space of a whole volume that it may not even have write permission for seems unreasonable. It should instead ask for free space for a particular path. The actual write location might be a directory with a quota that must be honored; or a subvolume with a raid1 data profile set. The program asking for volume free space is a totally ambiguous inquiry.

> The accepted way to get an estimate of how much disk space is still
> available is statfs()/statvfs(); applications and admins rely on the
> values it returns. You cannot just break that and think you can get away
> with it.

Sorry, this is a half empty vs half full problem. A solution won't be found by disregarding the other perspective; as a consequence of calling it broken, you're saying that to not break it we can't have per subvolume or per file raid. And that's less acceptable than the original problem, which really is that some programs are making unacceptably vague and grandiose inquiries about free space availability.

>
> The thin provisioning folks need to find a way to make this work, not
> userspace programmers.

99.9% of userspace programs are writing out pretty small files, at a rate that's fairly knowable. They are thus well behaved. A handful of applications want to know how much free space there is, as if the answer entitles them to use all or most of that free space, compared to some other program that asks at the same time?

I think the expectation that programs can get ballpark free space information for a volume was probably always unreasonable; it's just that thin provisioning is making this more clear. Most of the burden is on the user/implementer who creates virtualsize LVs to not make them too big. And then I think there is some burden on programs to make more qualified inquiries for free space available.

Chris Murphy
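The numbers in the message above can be worked through in a few lines of Python. This is only an illustrative sketch: the function names are made up for the example, 1TB is taken as 1000GB, and filesystem overhead and snapshots are ignored. It shows that the real free space in the pool, the "scaled" share per VM, and what each filesystem would report from its own virtual size are three different figures.

```python
def pool_free_gb(pool_size_gb, used_per_lv_gb):
    """Space still unallocated in the pool that all thin LVs share."""
    return pool_size_gb - sum(used_per_lv_gb)


def fs_view_free_gb(virtual_size_gb, used_gb):
    """Rough free space a filesystem on one thin LV would report.

    The filesystem only knows the LV's virtual size, not the pool behind it.
    """
    return virtual_size_gb - used_gb


pool_size = 1000            # 1TB VG (assumption: 1TB = 1000GB)
virtual_sizes = [1000] * 4  # four thin LVs, 1TB virtualsize each
used = [150] * 4            # each LV has really allocated 150GB

real_free = pool_free_gb(pool_size, used)          # free space in the pool
naive_share = real_free / len(used)                # "400GB/4" scaling
seen_by_each = [fs_view_free_gb(v, u) for v, u in zip(virtual_sizes, used)]
oversubscription = sum(virtual_sizes) / pool_size  # virtual size vs. real size

print(real_free, naive_share, seen_by_each[0], oversubscription)
```

With these inputs each guest believes it has 850GB free, while only 400GB exists for all four together, so neither the per-filesystem figure nor an even split describes what any one guest can actually write.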
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Sat, Jul 27, 2013 at 06:25:57PM +0100, Richard W.M. Jones wrote:
> I also appreciate this will not be easy with the sheer variety of
> underlying storage. Also the problem is not well-defined: What
> happens if the underlying storage is storing two guests, and either of
> them could grow to the full size on their own, but if both together
> did it, we'd run out of space? What does "thin provisioning ratio"
> mean for this?

And, yes -- overprovisioning is certainly a huge use case.

--
Matthew Miller ☁☁☁ Fedora Cloud Architect ☁☁☁
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Sat, Jul 27, 2013 at 07:02:41PM +0200, Lennart Poettering wrote:
> On Fri, 26.07.13 21:29, Eric Sandeen (sand...@redhat.com) wrote:
> > On 7/26/13 3:13 PM, Miloslav Trmač wrote:
> > > Hello all,
> > > with thin provisioning available, the total and free space values
> > > reported by a filesystem do not necessarily mean that that much space
> > > is _actually_ available (the actual backing storage may be smaller, or
> > > shared with other filesystems).
> > >
> > > If your package reports disk space usage to users, and bases this on
> > > filesystem free space, please consider whether it might need to take
> > > LVM thin provisioning into account.
> >
> > Short answer: it doesn't (it can't).
> >
> > Just like an application doesn't know if it's got a 2.5" or 3.5" drive
> > behind it, or cloud behind it, or a usb stick behind it, it doesn't
> > know if it's got thinly provisioned storage behind it.
>
> Well, correct me if I am wrong but don't RAID devices communicate
> certain metrics to the file systems on them already? (stride size
> iirc?). It doesn't sound too difficult to communicate the thin
> provisioning ratio as well, and then leave it to the file system to
> scale the reported disk free space.

It's been like this since thin-provisioned SANs first came along, so twenty or more years. It's got a lot worse because of widespread virtualization and the use of raw-sparse, VMDK and qcow2.

I agree with you that underlying storage really ought to communicate this information up to the kernel (as has been recently done with alignment information, see [1]).

I also appreciate this will not be easy with the sheer variety of underlying storage. Also the problem is not well-defined: What happens if the underlying storage is storing two guests, and either of them could grow to the full size on their own, but if both together did it, we'd run out of space? What does "thin provisioning ratio" mean for this?

Rich.

[1] http://libguestfs.org/virt-alignment-scan.1.html#linux-host-block-and-i-o-size

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, 26.07.13 21:29, Eric Sandeen (sand...@redhat.com) wrote:
> On 7/26/13 3:13 PM, Miloslav Trmač wrote:
> > Hello all,
> > with thin provisioning available, the total and free space values
> > reported by a filesystem do not necessarily mean that that much space
> > is _actually_ available (the actual backing storage may be smaller, or
> > shared with other filesystems).
> >
> > If your package reports disk space usage to users, and bases this on
> > filesystem free space, please consider whether it might need to take
> > LVM thin provisioning into account.
>
> Short answer: it doesn't (it can't).
>
> Just like an application doesn't know if it's got a 2.5" or 3.5" drive
> behind it, or cloud behind it, or a usb stick behind it, it doesn't
> know if it's got thinly provisioned storage behind it.

Well, correct me if I am wrong, but don't RAID devices communicate certain metrics to the file systems on them already (stride size, iirc)? It doesn't sound too difficult to communicate the thin provisioning ratio as well, and then leave it to the file system to scale the reported disk free space.

> > The same applies if your package automatically allocates a certain
> > proportion of the total or available space.
>
> I can't imagine that anything actually does that, does it?
> Good lord I hope not. ;)

journald does that (see other mail).

Lennart

--
Lennart Poettering - Red Hat, Inc.
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, 26.07.13 22:13, Miloslav Trmač (m...@volny.cz) wrote:
> Hello all,
> with thin provisioning available, the total and free space values
> reported by a filesystem do not necessarily mean that that much space
> is _actually_ available (the actual backing storage may be smaller, or
> shared with other filesystems).
>
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.
>
> The same applies if your package automatically allocates a certain
> proportion of the total or available space.
>
> A quick way to check whether your package is likely to be affected, is
> to look for statfs() or statvfs() calls in C, or the equivalent in
> your higher-level library / programming language.

Well, I am pretty sure the burden must be on the file systems to report a useful estimate of the free blocks value in statfs()/statvfs(). Exporting that problem to userspace and expecting userspace to work around this is just wrong. In fact, this would be quite an API breakage if applications cannot rely on the value returned being at least a rough estimate of how much data can be stored on disk.

journald will scale how much disk space it uses for /var/log/journal based on the file system size and free level. It will also modulate the per-service rate limit levels based on the amount of free disk space. If you break the API of statfs()/statvfs(), then you will end up breaking this and all programs like it.

Note that btrfs RAID is broken in a similar way: it will return the amount of actual free blocks to the user. However, if RAID is enabled, each file requires twice (or some other factor) the number of blocks, so the value is completely bogus. The btrfs RAID userspace API is simply broken.

The accepted way to get an estimate of how much disk space is still available is statfs()/statvfs(); applications and admins rely on the values it returns. You cannot just break that and think you can get away with it.

The thin provisioning folks need to find a way to make this work, not userspace programmers.

Lennart

--
Lennart Poettering - Red Hat, Inc.
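To illustrate why a program that sizes itself from statvfs() depends on those values being meaningful, here is a minimal Python sketch of such a policy. It is not journald's actual algorithm: the function names and the 10% / 15% figures are assumptions chosen for the example. The budget is capped at a percentage of the filesystem size and shrinks as free space runs low.

```python
import os


def log_budget(total_bytes, avail_bytes, used_by_logs,
               max_use_pct=10, keep_free_pct=15):
    """Bytes the logs may occupy under a hypothetical size/free-space policy.

    max_use_pct   -- cap as a percentage of the filesystem size (assumed)
    keep_free_pct -- percentage of the filesystem to leave free (assumed)
    """
    cap_by_size = total_bytes * max_use_pct // 100
    # Space the logs could occupy while still leaving keep_free_pct untouched.
    headroom = avail_bytes + used_by_logs - total_bytes * keep_free_pct // 100
    return max(0, min(cap_by_size, headroom))


def log_budget_for_path(path, used_by_logs=0):
    """Apply the policy to whatever statvfs() reports for `path`."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # size as the filesystem sees it
    avail = st.f_bavail * st.f_frsize   # blocks available to unprivileged users
    return log_budget(total, avail, used_by_logs)


print(log_budget(1000, 900, 0), log_budget(1000, 200, 0), log_budget(1000, 100, 0))
```

On a thinly provisioned volume `total` is the virtual size, so a policy like this scales itself to space that may not be backed by real storage, which is the concern raised in the message above.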
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 26, 2013, at 4:53 PM, Pádraig Brady wrote:
> On 07/26/2013 09:13 PM, Miloslav Trmač wrote:
>> Hello all,
>> with thin provisioning available, the total and free space values
>> reported by a filesystem do not necessarily mean that that much space
>> is _actually_ available (the actual backing storage may be smaller, or
>> shared with other filesystems).
>>
>> If your package reports disk space usage to users, and bases this on
>> filesystem free space, please consider whether it might need to take
>> LVM thin provisioning into account.
>>
>> The same applies if your package automatically allocates a certain
>> proportion of the total or available space.
>>
>> A quick way to check whether your package is likely to be affected, is
>> to look for statfs() or statvfs() calls in C, or the equivalent in
>> your higher-level library / programming language.
>
> Anything df(1) should do here?

Example: Creating a btrfs raid1 volume from two 2TB drives, df shows it as having 4TB available:

# parted -l
Error: /dev/sdb: unrecognised disk label
Model: ATA VBOX HARDDISK (scsi)
Disk /dev/sdb: 2199GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:

Error: /dev/sdc: unrecognised disk label
Model: ATA VBOX HARDDISK (scsi)
Disk /dev/sdc: 2199GB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
Disk Flags:

# mkfs.btrfs -d raid1 -m raid1 /dev/sd[bc]

WARNING! - Btrfs v0.20-rc1 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

adding device /dev/sdc id 2
fs created label (null) on /dev/sdb
        nodesize 4096 leafsize 4096 sectorsize 4096 size 4.00TB
Btrfs v0.20-rc1

# mount /dev/sdb /mnt
# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        79G  4.2G   71G   6% /
devtmpfs        1.5G     0  1.5G   0% /dev
tmpfs           1.5G     0  1.5G   0% /dev/shm
tmpfs           1.5G  680K  1.5G   1% /run
tmpfs           1.5G     0  1.5G   0% /sys/fs/cgroup
tmpfs           1.5G  4.0K  1.5G   1% /tmp
none            224G   87G  138G  39% /media/sf_chris
/dev/sdb        4.0T   56K  4.0T   1% /mnt

The explanation is that the file system isn't raid1, but rather the allocated chunks have this attribute. Presently a volume only allocates with one profile, but the future plan is per subvolume and even per file raid profiles. So establishing how much free space there is on a btrfs volume is absolutely less than clear.

Anyway, I think it will cause some confusion if by "available" an application thinks it can write out more than 2TB of data to this example volume.

Chris Murphy
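The gap between the 4.0T that df reports and the roughly 2TB that can actually be written follows from the number of copies each data profile keeps. A small Python sketch of that arithmetic, assuming every future allocation uses the same profile (which, as the message notes, may not hold once per-subvolume or per-file profiles exist); the helper name and the decimal TB unit are made up for the example:

```python
TB = 10**12  # decimal terabyte, chosen only to keep the example readable

# Copies of each data block kept by some btrfs data profiles.
COPIES = {"single": 1, "dup": 2, "raid1": 2, "raid10": 2}


def usable_estimate(raw_free_bytes, data_profile):
    """Estimate writable bytes if all new data uses `data_profile`."""
    return raw_free_bytes // COPIES[data_profile]


raw_free = 4 * TB  # what df shows as available on the two-disk volume
print(usable_estimate(raw_free, "raid1") // TB)
```

For the raid1 volume in the example this gives 2, matching the 2TB limit mentioned above; the estimate is only as good as the assumption that the profile stays uniform.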
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 7/26/13 3:13 PM, Miloslav Trmač wrote:
> Hello all,
> with thin provisioning available, the total and free space values
> reported by a filesystem do not necessarily mean that that much space
> is _actually_ available (the actual backing storage may be smaller, or
> shared with other filesystems).
>
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.

Short answer: it doesn't (it can't).

Just like an application doesn't know if it's got a 2.5" or 3.5" drive behind it, or cloud behind it, or a usb stick behind it, it doesn't know if it's got thinly provisioned storage behind it.

It's up to the administrator to make sure that the thinly provisioned device doesn't fill up & run out of space, but if it does, there's nothing an app can do about that.

There's also no standard interface to query how full your thinly provisioned storage is; it depends on what's implementing the thin provisioning. So again, nothing an app can do/query/handle/change.

> The same applies if your package automatically allocates a certain
> proportion of the total or available space.

I can't imagine that anything actually does that, does it?
Good lord I hope not. ;)

> A quick way to check whether your package is likely to be affected, is
> to look for statfs() or statvfs() calls in C, or the equivalent in
> your higher-level library / programming language.

statfs will still tell you how much space the filesystem has allocated, as well as how much space it thinks it has left, based on the total space the disk has *said* it has available, just like it always ever did.

The difference, of course, is that you might actually run out of blocks before you fill the fs. But I can't think offhand what apps would care. And again, it's something the admin shouldn't let happen.

For now, consider it completely transparent to the user (unless the admin doesn't keep up, in which case it will be anything *but* transparent).

TBH, when the backing store runs out of space, things do get pretty ugly at this point. It's work that needs to be done to make it more robust & graceful.

-Eric

> Mirek
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On 07/26/2013 09:13 PM, Miloslav Trmač wrote:
> Hello all,
> with thin provisioning available, the total and free space values
> reported by a filesystem do not necessarily mean that that much space
> is _actually_ available (the actual backing storage may be smaller, or
> shared with other filesystems).
>
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.
>
> The same applies if your package automatically allocates a certain
> proportion of the total or available space.
>
> A quick way to check whether your package is likely to be affected, is
> to look for statfs() or statvfs() calls in C, or the equivalent in
> your higher-level library / programming language.

Anything df(1) should do here?
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 26, 2013, at 3:34 PM, David Lehman wrote:
> On Fri, 2013-07-26 at 22:59 +0200, Miloslav Trmač wrote:
>> On Fri, Jul 26, 2013 at 10:17 PM, DJ Delorie wrote:
>>> If your package reports disk space usage to users, and bases this on
>>> filesystem free space, please consider whether it might need to take
>>> LVM thin provisioning into account.
>>>
>>> Perhaps you could include a small code snippet explaining *how* to do
>>> this? Is there an lvm_thin_statfs() we can use?
>>
>> I'd love to, but I don't know how. David, could you suggest something,
>> please?
>
> As noted by drago01, this is not exactly new or specific to thinp -- a
> similar situation exists with btrfs.

The RAID 1/10 problem where df reports free space as the size of the volume, not the actual amount of stuff that can be stored on the volume? Or is there another case?

It does seem to be a problem that df works this way with btrfs. It's one thing for 'btrfs df' to do its own thing, but for df to do this doesn't seem like a good idea.

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, 2013-07-26 at 22:59 +0200, Miloslav Trmač wrote:
> On Fri, Jul 26, 2013 at 10:17 PM, DJ Delorie wrote:
> >
> >> If your package reports disk space usage to users, and bases this on
> >> filesystem free space, please consider whether it might need to take
> >> LVM thin provisioning into account.
> >
> > Perhaps you could include a small code snippet explaining *how* to do
> > this? Is there an lvm_thin_statfs() we can use?
>
> I'd love to, but I don't know how. David, could you suggest something,
> please?

As noted by drago01, this is not exactly new or specific to thinp -- a similar situation exists with btrfs. You would have to ask the developers of lvm and btrfs for a way to decode the magic. I don't know any manageable solution to this problem.

> Mirek
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Jul 26, 2013, at 3:02 PM, drago01 wrote:
> On Fri, Jul 26, 2013 at 10:59 PM, Miloslav Trmač wrote:
>> On Fri, Jul 26, 2013 at 10:17 PM, DJ Delorie wrote:
>>> If your package reports disk space usage to users, and bases this on
>>> filesystem free space, please consider whether it might need to take
>>> LVM thin provisioning into account.
>>>
>>> Perhaps you could include a small code snippet explaining *how* to do
>>> this? Is there an lvm_thin_statfs() we can use?
>>
>> I'd love to, but I don't know how. David, could you suggest something,
>> please?
>
> The same issue exists for btrfs ... shouldn't we somehow try to get a
> generic api for the "exotic" file systems?

I don't think btrfs yet supports thin provisioning itself via quotas. The quotas code is still in flux in any case.

I have had some problems with lvm thinp and btrfs that happen more rarely with xfs. The thinp LV is somewhat large (16TB) for the RAM available on the machine (4GB). But I don't have the same problems when using qcow2 to thin provision the same amount of space.

But isn't the idea that the file system itself isn't really aware of how much actual space is available? That's up to the manager of the thin provisioned space, in this case LVM. Not the file system.

Chris Murphy
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, Jul 26, 2013 at 10:13:42PM +0200, Miloslav Trmač wrote:
> Hello all,
> with thin provisioning available, the total and free space values
> reported by a filesystem do not necessarily mean that that much space
> is _actually_ available (the actual backing storage may be smaller, or
> shared with other filesystems).
>
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.
>
> The same applies if your package automatically allocates a certain
> proportion of the total or available space.
>
> A quick way to check whether your package is likely to be affected, is
> to look for statfs() or statvfs() calls in C, or the equivalent in
> your higher-level library / programming language.

Also libvirt has a whole set of APIs around storage and free space.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, Jul 26, 2013 at 10:13:42PM +0200, Miloslav Trmač wrote:
> Hello all,
> with thin provisioning available, the total and free space values
> reported by a filesystem do not necessarily mean that that much space
> is _actually_ available (the actual backing storage may be smaller, or
> shared with other filesystems).
>
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.

I guess virt-df could be such a package.

> The same applies if your package automatically allocates a certain
> proportion of the total or available space.
>
> A quick way to check whether your package is likely to be affected, is
> to look for statfs() or statvfs() calls in C, or the equivalent in
> your higher-level library / programming language.

What code needs to be used in addition/instead?

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, Jul 26, 2013 at 10:59 PM, Miloslav Trmač wrote:
> On Fri, Jul 26, 2013 at 10:17 PM, DJ Delorie wrote:
>>
>>> If your package reports disk space usage to users, and bases this on
>>> filesystem free space, please consider whether it might need to take
>>> LVM thin provisioning into account.
>>
>> Perhaps you could include a small code snippet explaining *how* to do
>> this? Is there an lvm_thin_statfs() we can use?
>
> I'd love to, but I don't know how. David, could you suggest something,
> please?

The same issue exists for btrfs ... shouldn't we somehow try to get a generic api for the "exotic" file systems?
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
On Fri, Jul 26, 2013 at 10:17 PM, DJ Delorie wrote:
>
>> If your package reports disk space usage to users, and bases this on
>> filesystem free space, please consider whether it might need to take
>> LVM thin provisioning into account.
>
> Perhaps you could include a small code snippet explaining *how* to do
> this? Is there an lvm_thin_statfs() we can use?

I'd love to, but I don't know how. David, could you suggest something, please?

Mirek
Re: Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
> If your package reports disk space usage to users, and bases this on
> filesystem free space, please consider whether it might need to take
> LVM thin provisioning into account.

Perhaps you could include a small code snippet explaining *how* to do this? Is there an lvm_thin_statfs() we can use?
Does your application depend on, or report, free disk space? Re: F20 Self Contained Change: OS Installer Support for LVM Thin Provisioning
Hello all,
with thin provisioning available, the total and free space values reported by a filesystem do not necessarily mean that that much space is _actually_ available (the actual backing storage may be smaller, or shared with other filesystems).

If your package reports disk space usage to users, and bases this on filesystem free space, please consider whether it might need to take LVM thin provisioning into account.

The same applies if your package automatically allocates a certain proportion of the total or available space.

A quick way to check whether your package is likely to be affected, is to look for statfs() or statvfs() calls in C, or the equivalent in your higher-level library / programming language.

Mirek
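As an example of "the equivalent in your higher-level language", this is roughly what such a call looks like in Python, where `os.statvfs()` wraps statvfs() and `shutil.disk_usage()` is built on the same data. It is a minimal sketch: the helper name `free_space_report` is made up for illustration. Code of this shape is what to search for when auditing a package.

```python
import os


def free_space_report(path):
    """Return sizes in bytes as reported by the filesystem holding `path`.

    These are the filesystem's own figures; on thinly provisioned storage
    they say nothing about how much backing space really exists.
    """
    st = os.statvfs(path)
    return {
        "total": st.f_blocks * st.f_frsize,  # size of the filesystem
        "free": st.f_bfree * st.f_frsize,    # free blocks, including reserved
        "avail": st.f_bavail * st.f_frsize,  # free blocks for unprivileged users
    }


report = free_space_report("/")
print(sorted(report))
```

Passing the directory the program will actually write to, rather than a mount point picked in advance, at least makes the question specific to that location.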