On 14.05.25 at 11:06, Fabian Grünbichler wrote:
>> Fiona Ebner <f.eb...@proxmox.com> wrote on 14.05.2025 10:22 CEST:
>>
>>
>> On 13.05.25 at 15:31, Fiona Ebner wrote:
>>> Signed-off-by: Fiona Ebner <f.eb...@proxmox.com>
>>> ---
>>>  src/PVE/Storage/RBDPlugin.pm | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/src/PVE/Storage/RBDPlugin.pm b/src/PVE/Storage/RBDPlugin.pm
>>> index 154fa00..b56f8e4 100644
>>> --- a/src/PVE/Storage/RBDPlugin.pm
>>> +++ b/src/PVE/Storage/RBDPlugin.pm
>>> @@ -703,6 +703,12 @@ sub status {
>>>
>>>      # max_avail -> max available space for data w/o replication in the pool
>>>      # stored -> amount of user data w/o replication in the pool
>>> +    # NOTE These values are used because they are most natural from a user perspective.
>>> +    # However, the %USED/percent_used value in Ceph is calculated from values before factoring out
>>> +    # replication, namely 'bytes_used / (bytes_used + avail_raw)'. In certain setups, e.g. with LZ4
>>> +    # compression, this percentage can be noticeably different from the percentage
>>> +    # 'stored / (stored + max_avail)' shown in the Proxmox VE CLI/UI. See also src/mon/PGMap.cc from
>>> +    # the Ceph source code, which also mentions that 'stored' is an approximation.
>>>      my $free = $d->{stats}->{max_avail};
>>>      my $used = $d->{stats}->{stored};
>>>      my $total = $used + $free;
>>
>> Thinking about this again, I don't think continuing to use 'stored' is
>> best after all, because that is before compression. And this is where
>> the mismatch really comes from AFAICT. For highly compressible data, the
>> mismatch between actual usage on the storage and 'stored' can be very
>> big (in a quick test using the 'yes' command to fill an RBD image, I got
>> stored = 2 * (used / replication_count)). And here in the storage stats
>> we are interested in the usage on the storage, not the actual amount of
>> data written by the user. For ZFS we also don't use 'logicalused', but
>> 'used'.
>
> but for ZFS, we actually use the "logical" view provided by `zfs list/get`,
> not the "physical" view provided by `zpool list/get` (and even the latter
> would already account for redundancy).

But that is not the same logical view as 'logicalused', which would not
consider compression.

>
> e.g., with a testpool consisting of three mirrored vdevs of size 1G, with
> a single dataset filled with a file with 512MB of random data:
>
> $ zpool list -v testpool
> NAME                 SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> testpool             960M   513M   447M        -         -    42%    53%  1.00x  ONLINE  -
>   mirror-0           960M   513M   447M        -         -    42%  53.4%      -  ONLINE
>     /tmp/vdev1.img     1G      -      -        -         -      -      -      -  ONLINE
>     /tmp/vdev2.img     1G      -      -        -         -      -      -      -  ONLINE
>     /tmp/vdev3.img     1G      -      -        -         -      -      -      -  ONLINE
>
> and what we use for the storage status:
>
> $ zfs get available,used testpool/data
> NAME           PROPERTY   VALUE  SOURCE
> testpool/data  available  319M   -
> testpool/data  used       512M   -
>
> if we switch away from `stored`, we'd have to account for replication
> ourselves to match that, right? and we don't have that information
> readily available (and also no idea how to handle EC pools?)? wouldn't
> we just exchange one wrong set of numbers with another (differently)
> wrong set of numbers?

I would've used avail_raw / max_avail to calculate the replication factor
and apply that to bytes_used. Sure, it won't be perfect, but it should lead
to matching the percent_used reported by Ceph:

percent_used = used_bytes / (used_bytes + avail_raw)
max_avail = avail_raw / rep

(rep is called raw_used_rate in the Ceph source, but I'm shortening it for
readability)

Thus:

rep = avail_raw / max_avail
our_used = used_bytes / rep
our_avail = max_avail = avail_raw / rep
our_percentage = our_used / (our_used + our_avail)
               = (used_bytes/rep) / (used_bytes/rep + avail_raw/rep)
               = used_bytes / (used_bytes + avail_raw)    (canceling rep)
               = percent_used from Ceph

The point is that it'd be much better than not considering compression.
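Roughly what I have in mind (untested sketch on top of the hunk above; it
assumes 'bytes_used' and 'avail_raw' end up in the parsed $d->{stats} like
in the 'ceph df detail' output quoted below, and falls back to the current
'stored'-based value otherwise):

    my $stats = $d->{stats};
    my $free = $stats->{max_avail};
    my $used = $stats->{stored}; # fallback, keeps the current behavior
    if ($free > 0 && $stats->{avail_raw} && defined($stats->{bytes_used})) {
        # raw_used_rate, i.e. replication factor (or EC overhead)
        my $rep = $stats->{avail_raw} / $free;
        $used = $stats->{bytes_used} / $rep;
    }
    my $total = $used + $free;

If max_avail really is avail_raw divided by raw_used_rate for all pool
types, the same factor should come out right for EC pools too, but I
haven't verified that part.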
>
> FWIW, we already provide raw numbers in the pool view, and could maybe
> expand that view to provide more details?
>
> e.g., for my test rbd pool the pool view shows 50,29% used amounting to
> 163,43GiB, whereas the storage status says 51.38% used amounting to
> 61.11GB of 118.94GB, with the default 3/2 replication
>
> ceph df detail says:
>
> {
>     "name": "rbd",
>     "id": 2,
>     "stats": {
>         "stored": 61108710142,                 => /1000/1000/1000 == storage used

But this is not really "storage used". This is the amount of user data,
before compression. The actual usage on the storage can be much lower than
this.

>         "stored_data": 61108699136,
>         "stored_omap": 11006,
>         "objects": 15579,
>         "kb_used": 171373017,
>         "bytes_used": 175485968635,            => /1024/1024/1024 == pool used
>         "data_bytes_used": 175485935616,
>         "omap_bytes_used": 33019,
>         "percent_used": 0.5028545260429382,    => rounded this is the pool view percentage
>         "max_avail": 57831211008,              => (this + stored)/1000/1000/1000 storage total
>         "quota_objects": 0,
>         "quota_bytes": 0,
>         "dirty": 0,
>         "rd": 253354,
>         "rd_bytes": 38036885504,
>         "wr": 75833,
>         "wr_bytes": 33857918976,
>         "compress_bytes_used": 0,
>         "compress_under_bytes": 0,
>         "stored_raw": 183326130176,
>         "avail_raw": 173493638191
>     }
> },
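Plugging the numbers from your rbd pool above into that (rounded, with GB
meaning 10^9 bytes):

rep = avail_raw / max_avail = 173493638191 / 57831211008 ~ 3.0
our_used = bytes_used / rep = 175485968635 / 3.0 ~ 58.5 GB
our_avail = max_avail ~ 57.8 GB
our_percentage = 58.5 / (58.5 + 57.8) ~ 0.503

which matches the percent_used above, while the current
stored / (stored + max_avail) = 61108710142 / (61108710142 + 57831211008) ~ 0.514
is where the 51.38% from the storage status comes from.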
>> From src/osd/osd_types.h:
>>
>>>   int64_t data_stored = 0;               ///< Bytes actually stored by the user
>>>   int64_t data_compressed = 0;           ///< Bytes stored after compression
>>>   int64_t data_compressed_allocated = 0; ///< Bytes allocated for compressed data
>>>   int64_t data_compressed_original = 0;  ///< Bytes that were compressed

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel