From: Thomas Weißschuh <[email protected]> Sent: Tuesday, July 29, 2025 11:47 AM > > On 2025-07-29 18:39:45+0000, Michael Kelley wrote: > > From: Naman Jain <[email protected]> Sent: Wednesday, July 23, > > 2025 12:02 AM > > > > > > The ring buffer size varies across VMBus channels. The size of sysfs > > > node for the ring buffer is currently hardcoded to 4 MB. Userspace > > > clients either use fstat() or hardcode this size for doing mmap(). > > > To address this, make the sysfs node size dynamic to reflect the > > > actual ring buffer size for each channel. This will ensure that > > > fstat() on ring sysfs node always returns the correct size of > > > ring buffer. > > > > > > This is a backport of the upstream commit > > > 65995e97a1ca ("Drivers: hv: Make the sysfs node size for the ring buffer > > > dynamic") > > > with modifications, as the original patch has missing dependencies on > > > kernel v6.12.x. The structure "struct attribute_group" does not have > > > bin_size field in v6.12.x kernel so the logic of configuring size of > > > sysfs node for ring buffer has been moved to > > > vmbus_chan_bin_attr_is_visible(). > > > > > > Original change was not a fix, but it needs to be backported to fix size > > > related discrepancy caused by the commit mentioned in Fixes tag. > > > > > > Fixes: bf1299797c3c ("uio_hv_generic: Align ring size to system page") > > > Cc: <[email protected]> # 6.12.x > > > Signed-off-by: Naman Jain <[email protected]> > > > --- > > > > > > This change won't apply on older kernels currently due to missing > > > dependencies. I will take care of them after this goes in. > > > > > > I did not retain any Reviewed-by or Tested-by tags, since the code has > > > changed completely, while the functionality remains same. > > > Requesting Michael, Dexuan, Wei to please review again. > > > > > > --- > > > drivers/hv/vmbus_drv.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > > > index 1f519e925f06..616e63fb2f15 100644 > > > --- a/drivers/hv/vmbus_drv.c > > > +++ b/drivers/hv/vmbus_drv.c > > > @@ -1810,7 +1810,6 @@ static struct bin_attribute chan_attr_ring_buffer = > > > { > > > .name = "ring", > > > .mode = 0600, > > > }, > > > - .size = 2 * SZ_2M, > > > .mmap = hv_mmap_ring_buffer_wrapper, > > > }; > > > static struct attribute *vmbus_chan_attrs[] = { > > > @@ -1866,6 +1865,7 @@ static umode_t > > > vmbus_chan_bin_attr_is_visible(struct kobject *kobj, > > > /* Hide ring attribute if channel's ring_sysfs_visible is set to false > > > */ > > > if (attr == &chan_attr_ring_buffer && !channel->ring_sysfs_visible) > > > return 0; > > > + attr->size = channel->ringbuffer_pagecount << PAGE_SHIFT; > > > > Suppose a VM has two devices using UIO, such as DPDK network device with > > a 2MiB ring buffer, and an fcopy device with a 16KiB ring buffer. Both > > devices > > will be referencing the same static instance of chan_attr_ring_buffer, and > > the > > .size field it contains. The above statement will change that .size field > > between 2MiB and 16KiB as the /sys entries are initially populated, and as > > the visibility is changed if the devices are removed and re-instantiated > > (which > > is much more likely for fcopy than for netvsc). That changing of the .size > > value will probably work most of the time, but it's racy if two devices with > > different ring buffer sizes get instantiated or re-instantiated at the same > > time. > > IIRC it works out in practice. While the global attribute instance is indeed > modified back-and-forth the size from it will be *copied* into kernfs > after each recalculation. So each attribute should get its own correct size.
The race I see is in fs/sysfs/group.c in the create_files() function. It calls the is_bin_visible() function, which this patch uses to set the .size field of the static attribute. Then creates_files() calls sysfs_add_bin_file_mode_ns(), which reads the .size field and uses it to create the sysfs entry. But if create_files() is called in parallel on two different kobjs of the same type, but with different values for the .size field, the second create_files() could overwrite the static .size field after the first create_files() has set it, but before it has used it. I don't see any global lock that would prevent such, though maybe I'm missing something. > > > Unfortunately, I don't see a fix, short of backporting support for the > > .bin_size function, as this is exactly the problem that function solves. > > It should work out in practice. (I introduced the .bin_size function) The race I describe is unlikely, particularly if attribute groups are created once and then not disturbed. But note that the Hyper-V fcopy group can get updated in a running VM via update_sysfs_group(), which also calls create_files(). Such an update might marginally increase the potential for the race and for getting the wrong size. Still, I agree it should work out in practice. Michael > > Thomas
