Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-14 Thread Michael Dalton
I'd like to confirm the preferred sysfs path structure for mergeable
receive buffers. Is 'mergeable_rx_buffer_size' the right attribute name
to use or is there a strong preference for a different name?

I believe the current approach proposed for the next patchset is to use a
per-netdev attribute group which we will add to the receive
queue kobj (struct netdev_rx_queue). That leaves us with at
least two options:
  (1) Name the attribute group something, e.g., 'virtio-net', in which
  case all virtio-net attributes for eth0 queue N will be of
  the form:
   /sys/class/net/eth0/queues/rx-N/virtio-net/<attribute name>

  (2) Do not name the attribute group (leave the name NULL), in which
  case AFAICT virtio-net and device-independent attributes would be
  mixed without any indication. For example, all virtio-net
  attributes for netdev eth0 queue N would be of the form:
   /sys/class/net/eth0/queues/rx-N/<attribute name>

FWIW, the bonding netdev has a similar sysfs issue and uses a per-netdev
attribute group (stored in the 'sysfs_groups' field of struct net_device).
In the case of bonding, the attribute group is named, so
device-independent netdev attributes are found in
/sys/class/net/eth0/<attribute name> while bonding attributes are placed
in /sys/class/net/eth0/bonding/<attribute name>.

So it seems like there is some precedent for using an attribute group
name corresponding to the driver name. Does using an attribute group
name of 'virtio-net' sound good or would an empty or different attribute
group name be preferred?
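
For reference, here is a rough sketch of what option (1) could look like on
the virtio-net side. All identifiers below are placeholders for illustration
(the show routine is stubbed out), not settled names:

static ssize_t mergeable_rx_buffer_size_show(struct netdev_rx_queue *queue,
                                             struct rx_queue_attribute *attr,
                                             char *buf)
{
        /* Placeholder: would report the per-queue EWMA-based buffer size. */
        return sprintf(buf, "%u\n", 0u);
}

static struct rx_queue_attribute mergeable_rx_buffer_size_attribute =
        __ATTR_RO(mergeable_rx_buffer_size);

static struct attribute *virtio_net_rx_attrs[] = {
        &mergeable_rx_buffer_size_attribute.attr,
        NULL
};

static const struct attribute_group virtio_net_rx_group = {
        .name  = "virtio-net", /* option (2) would leave .name NULL */
        .attrs = virtio_net_rx_attrs,
};

With .name set as above, the file would appear as
/sys/class/net/eth0/queues/rx-N/virtio-net/mergeable_rx_buffer_size.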

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-14 Thread Michael S. Tsirkin
On Tue, Jan 14, 2014 at 01:45:42PM -0800, Michael Dalton wrote:
 I'd like to confirm the preferred sysfs path structure for mergeable
 receive buffers. Is 'mergeable_rx_buffer_size' the right attribute name
 to use or is there a strong preference for a different name?
 
 I believe the current approach proposed for the next patchset is to use a
 per-netdev attribute group which we will add to the receive
 queue kobj (struct netdev_rx_queue). That leaves us with at
 least two options:
   (1) Name the attribute group something, e.g., 'virtio-net', in which
   case all virtio-net attributes for eth0 queue N will be of
   the form:
   /sys/class/net/eth0/queues/rx-N/virtio-net/<attribute name>
 
   (2) Do not name the attribute group (leave the name NULL), in which
   case AFAICT virtio-net and device-independent attributes would be
   mixed without any indication. For example, all virtio-net
   attributes for netdev eth0 queue N would be of the form:
   /sys/class/net/eth0/queues/rx-N/<attribute name>
 
 FWIW, the bonding netdev has a similar sysfs issue and uses a per-netdev
 attribute group (stored in the 'sysfs_groups' field of struct net_device).
 In the case of bonding, the attribute group is named, so
 device-independent netdev attributes are found in
 /sys/class/net/eth0/<attribute name> while bonding attributes are placed
 in /sys/class/net/eth0/bonding/<attribute name>.
 
 So it seems like there is some precedent for using an attribute group
 name corresponding to the driver name. Does using an attribute group
 name of 'virtio-net' sound good or would an empty or different attribute
 group name be preferred?
 
 Best,
 
 Mike

I'm guessing we should follow the bonding example.
What do others think?



Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-13 Thread Michael S. Tsirkin
On Sun, Jan 12, 2014 at 03:32:28PM -0800, Michael Dalton wrote:
 Hi Michael,
 
 On Sun, Jan 12, 2014 at 9:09 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
  Can't we add struct attribute * to netdevice, and pass that in when
  creating the kobj?
 
 I like that idea, I think that will work and should be better than
 the alternatives. The actual kobjs for RX queues (struct netdev_rx_queue)
 are allocated and deallocated by calls to net_rx_queue_update_kobjects,
 which resizes RX queue kobjects when the netdev RX queues are resized.
 
 Is this what you had in mind:
 (1) Add a pointer to an attribute group to struct net_device, used for
 per-netdev rx queue attributes and initialized before the call to
 register_netdevice().
 (2) Declare an attribute group containing the mergeable_rx_buffer_size
 attribute in virtio-net, and initialize the per-netdevice group pointer
 to the address of this group in virtnet_probe before register_netdevice
 (3) In net-sysfs, modify net_rx_queue_update_kobjects
 (or rx_queue_add_kobject) to call sysfs_create_group on the
 per-netdev attribute group (if non-NULL), adding the attributes in
 the group to the RX queue kobject.


Exactly.

 That should allow us to have per-RX queue attributes that are
 device-specific. I'm not a sysfs expert, but it seems that rx_queue_ktype
 and rx_queue_sysfs_ops presume that all rx queue sysfs operations are
 performed on attributes of type rx_queue_attribute. That type will need
 to be moved from net-sysfs.c to a header file like netdevice.h so that
 the type can be used in virtio-net when we declare the
 mergeable_rx_buffer_size attribute.
 
 The last issue is how the rx_queue_attribute 'show' function
 implementation for mergeable_rx_buffer_size will access the appropriate
 per-receive queue EWMA data. The arguments to the show function will be
 the netdev_rx_queue and the attribute itself. We can get to the
 struct net_device from the netdev_rx_queue.  If we extended
 netdev_rx_queue to indicate the queue_index or to store a void *priv_data
 pointer, that would be sufficient to allow us to resolve this issue.

Hmm netdev_rx_queue is not defined unless CONFIG_RPS is set.
Maybe we should use a different structure.


 Please let me know if the above sounds good or if you see a better way
 to accomplish this goal. Thanks!
 
 Best,
 
 Mike

Sounds good to me.


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-13 Thread Ben Hutchings
On Mon, 2014-01-13 at 11:40 +0200, Michael S. Tsirkin wrote:
 On Sun, Jan 12, 2014 at 03:32:28PM -0800, Michael Dalton wrote:
[...]
  The last issue is how the rx_queue_attribute 'show' function
  implementation for mergeable_rx_buffer_size will access the appropriate
  per-receive queue EWMA data. The arguments to the show function will be
  the netdev_rx_queue and the attribute itself. We can get to the
  struct net_device from the netdev_rx_queue.  If we extended
  netdev_rx_queue to indicate the queue_index or to store a void *priv_data
  pointer, that would be sufficient to allow us to resolve this issue.
 
 Hmm netdev_rx_queue is not defined unless CONFIG_RPS is set.
 Maybe we should use a different structure.
[...]

I don't think RPS should own this structure.  It's just that there are
currently no per-RX-queue attributes other than those defined by RPS.

By the way, CONFIG_RPS is equivalent to CONFIG_SMP so will likely be
enabled already in most places where virtio_net is used.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-13 Thread Michael Dalton
On Mon, Jan 13, 2014 at 7:38 AM, Ben Hutchings <bhutchi...@solarflare.com>
wrote:
 I don't think RPS should own this structure.  It's just that there are
 currently no per-RX-queue attributes other than those defined by RPS.

Agreed, there is useful attribute-independent functionality already
built around netdev_rx_queue - e.g., dynamically resizing the rx queue
kobjs as the number of RX queues enabled for the netdev is changed. While
the current attributes happen to be used only by RPS, it seems RPS
should not own netdev_rx_queue itself but rather only the RPS-specific
fields within it.

If there are no objections, it seems like I could modify
netdev_rx_queue and related functionality so that their existence does
not depend on CONFIG_RPS, and instead just have CONFIG_RPS control
whether or not the RPS-specific attributes/fields are present.
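
Concretely, that would keep struct netdev_rx_queue defined unconditionally
and gate only the RPS pointers, roughly like this (a sketch, with the field
names as they appear in the current struct):

/* Always defined; only the RPS-specific members depend on CONFIG_RPS. */
struct netdev_rx_queue {
#ifdef CONFIG_RPS
        struct rps_map __rcu            *rps_map;
        struct rps_dev_flow_table __rcu *rps_flow_table;
#endif
        struct kobject                  kobj;
        struct net_device               *dev;
} ____cacheline_aligned_in_smp;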

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-13 Thread Michael Dalton
Sorry, I missed an important piece of information: it appears that
netdev_queue (the TX equivalent of netdev_rx_queue) has already been
decoupled from CONFIG_XPS because of an attribute, queue_trans_timeout,
that does not depend on XPS functionality. So something roughly
equivalent has already happened on the TX side.

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-12 Thread Michael S. Tsirkin
On Fri, Jan 10, 2014 at 09:19:37PM -0800, Michael Dalton wrote:
 Hi Jason, Michael
 
 Sorry for the delay in response. Jason, I agree this patch ended up
 being larger than expected. The major implementation parts are:
 (1) Setup directory structure (driver/per-netdev/rx-queue directories)
 (2) Network device renames (optional, so debugfs dir has the right name)
 (3) Support resizing the # of RX queues (optional - we could just export
 max_queue_pairs files and not delete files if an RX queue is disabled)
 (4) Reference counting - used in case someone opens a debugfs
 file and then removes the virtio-net device.
 (5) The actual mergeable rx buffer file implementation itself. For now
 I have added a seqcount for memory safety, but if a read-only race
 condition is acceptable we could elide the seqcount. FWIW, the
 seqcount write in receive_mergeable() should, on modern x86,
 translate to two non-atomic adds and two compiler barriers, so
 overhead is not expected to be meaningful.
 
 We can move to sysfs and this would simplify or eliminate much of the
 above, including most of (1) - (4). I believe our choices for what to
 do for the next patchset include:
 (a) Use debugfs as is currently done, removing any optional features
 listed above that are deemed unnecessary.
 
 (b) Add a per-netdev sysfs attribute group to net_device->sysfs_groups.
 Each attribute would display the mergeable packet buffer size for a given
 RX queue, and there would be max_queue_pairs attributes in total. This
 is already supported by net/core/net-sysfs.c:netdev_register_kobject(),
 but means that we would have a static set of per-RX queue files for
 all RX queues supported by the netdev, rather than dynamically displaying
 only the files corresponding to enabled RX queues (e.g., when # of RX
 queues is changed by ethtool -L <device>).  For an example of this
 approach, see drivers/net/bonding/bond_sysfs.c.
 
 (c) Modify struct netdev_rx_queue to add virtio-net EWMA fields directly,
 and modify net-sysfs.c to manage the new fields. Unlike (b), this approach
 supports the RX queue resizing in (3) but means putting virtio-net info
 in netdev_rx_queue, which currently has only device-independent fields.

Can't we add struct attribute * to netdevice, and pass that in when
creating the kobj?

 My preference would be (b): try using sysfs and adding a device-specific
 attribute group to the virtio-net netdevice (stored in the existing
 'sysfs_groups' field and supported by net-sysfs).  This would avoid
 adding virtio-net specific information to net-sysfs. What would you
 prefer (or is there a better way than the approaches above)? Thanks!
 
 Best,
 
 Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-12 Thread Michael Dalton
Hi Michael,

On Sun, Jan 12, 2014 at 9:09 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
 Can't we add struct attribute * to netdevice, and pass that in when
 creating the kobj?

I like that idea, I think that will work and should be better than
the alternatives. The actual kobjs for RX queues (struct netdev_rx_queue)
are allocated and deallocated by calls to net_rx_queue_update_kobjects,
which resizes RX queue kobjects when the netdev RX queues are resized.

Is this what you had in mind:
(1) Add a pointer to an attribute group to struct net_device, used for
per-netdev rx queue attributes and initialized before the call to
register_netdevice().
(2) Declare an attribute group containing the mergeable_rx_buffer_size
attribute in virtio-net, and initialize the per-netdevice group pointer
to the address of this group in virtnet_probe before register_netdevice
(3) In net-sysfs, modify net_rx_queue_update_kobjects
(or rx_queue_add_kobject) to call sysfs_create_group on the
per-netdev attribute group (if non-NULL), adding the attributes in
the group to the RX queue kobject.
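
As a rough sketch of step (3), with 'sysfs_rx_queue_group' as a placeholder
name for the new per-netdev pointer:

static int rx_queue_add_kobject(struct net_device *dev, int index)
{
        struct netdev_rx_queue *queue = dev->_rx + index;
        struct kobject *kobj = &queue->kobj;
        int error;

        kobj->kset = dev->queues_kset;
        error = kobject_init_and_add(kobj, &rx_queue_ktype, NULL,
                                     "rx-%u", index);
        if (error)
                goto exit;

        /* New: add any driver-supplied per-RX-queue attributes. */
        if (dev->sysfs_rx_queue_group) {
                error = sysfs_create_group(kobj, dev->sysfs_rx_queue_group);
                if (error)
                        goto exit;
        }

        kobject_uevent(kobj, KOBJ_ADD);
        dev_hold(queue->dev);
        return 0;

exit:
        kobject_put(kobj);
        return error;
}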

That should allow us to have per-RX queue attributes that are
device-specific. I'm not a sysfs expert, but it seems that rx_queue_ktype
and rx_queue_sysfs_ops presume that all rx queue sysfs operations are
performed on attributes of type rx_queue_attribute. That type will need
to be moved from net-sysfs.c to a header file like netdevice.h so that
the type can be used in virtio-net when we declare the
mergeable_rx_buffer_size attribute.

The last issue is how the rx_queue_attribute 'show' function
implementation for mergeable_rx_buffer_size will access the appropriate
per-receive queue EWMA data. The arguments to the show function will be
the netdev_rx_queue and the attribute itself. We can get to the
struct net_device from the netdev_rx_queue.  If we extended
netdev_rx_queue to indicate the queue_index or to store a void *priv_data
pointer, that would be sufficient to allow us to resolve this issue.
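
For example, if the queue index can be recovered from the netdev_rx_queue
(and assuming the EWMA stays in struct receive_queue as mrg_avg_pkt_len),
the show routine could look roughly like this sketch:

static ssize_t mergeable_rx_buffer_size_show(struct netdev_rx_queue *queue,
                                             struct rx_queue_attribute *attr,
                                             char *buf)
{
        struct virtnet_info *vi = netdev_priv(queue->dev);
        unsigned int i = queue - queue->dev->_rx;       /* queue index */

        BUG_ON(i >= vi->max_queue_pairs);
        return sprintf(buf, "%u\n",
                       get_mergeable_buf_len(&vi->rq[i].mrg_avg_pkt_len));
}

get_mergeable_buf_len() here refers to the helper from the patch quoted at
the bottom of this thread.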

Please let me know if the above sounds good or if you see a better way
to accomplish this goal. Thanks!

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-12 Thread Jason Wang
On 01/13/2014 07:32 AM, Michael Dalton wrote:
 Hi Michael,

 On Sun, Jan 12, 2014 at 9:09 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
 Can't we add struct attribute * to netdevice, and pass that in when
 creating the kobj?
 I like that idea, I think that will work and should be better than
 the alternatives. The actual kobjs for RX queues (struct netdev_rx_queue)
 are allocated and deallocated by calls to net_rx_queue_update_kobjects,
 which resizes RX queue kobjects when the netdev RX queues are resized.

 Is this what you had in mind:
 (1) Add a pointer to an attribute group to struct net_device, used for
 per-netdev rx queue attributes and initialized before the call to
 register_netdevice().
 (2) Declare an attribute group containing the mergeable_rx_buffer_size
 attribute in virtio-net, and initialize the per-netdevice group pointer
 to the address of this group in virtnet_probe before register_netdevice
 (3) In net-sysfs, modify net_rx_queue_update_kobjects
 (or rx_queue_add_kobject) to call sysfs_create_group on the
 per-netdev attribute group (if non-NULL), adding the attributes in
 the group to the RX queue kobject.

 That should allow us to have per-RX queue attributes that are
 device-specific. I'm not a sysfs expert, but it seems that rx_queue_ktype
 and rx_queue_sysfs_ops presume that all rx queue sysfs operations are
 performed on attributes of type rx_queue_attribute. That type will need
 to be moved from net-sysfs.c to a header file like netdevice.h so that
 the type can be used in virtio-net when we declare the
 mergeable_rx_buffer_size attribute.

There's a possible issue: rx queue sysfs depends on CONFIG_RPS. So we
probably need a dedicated attribute just for virtio-net.

 The last issue is how the rx_queue_attribute 'show' function
 implementation for mergeable_rx_buffer_size will access the appropriate
 per-receive queue EWMA data. The arguments to the show function will be
 the netdev_rx_queue and the attribute itself. We can get to the
 struct net_device from the netdev_rx_queue.  If we extended
 netdev_rx_queue to indicate the queue_index or to store a void *priv_data
 pointer, that would be sufficient to allow us to resolve this issue.

 Please let me know if the above sounds good or if you see a better way
 to accomplish this goal. Thanks!

 Best,

 Mike



Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-10 Thread Michael Dalton
Hi Jason, Michael

Sorry for the delay in response. Jason, I agree this patch ended up
being larger than expected. The major implementation parts are:
(1) Setup directory structure (driver/per-netdev/rx-queue directories)
(2) Network device renames (optional, so debugfs dir has the right name)
(3) Support resizing the # of RX queues (optional - we could just export
max_queue_pairs files and not delete files if an RX queue is disabled)
(4) Reference counting - used in case someone opens a debugfs
file and then removes the virtio-net device.
(5) The actual mergeable rx buffer file implementation itself. For now
I have added a seqcount for memory safety, but if a read-only race
condition is acceptable we could elide the seqcount. FWIW, the
seqcount write in receive_mergeable() should, on modern x86,
translate to two non-atomic adds and two compiler barriers, so
overhead is not expected to be meaningful.
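
For context, the debugfs read side that pairs with that seqcount write would
be along these lines (a sketch only; the helper name is made up, the fields
match the patch quoted at the bottom of this thread):

/* Read the buffer size consistently w.r.t. the writer in receive_mergeable(). */
static unsigned int mergeable_buf_len_read(struct receive_queue_stats *rq_stats)
{
        unsigned int seq, len;

        do {
                seq = read_seqcount_begin(&rq_stats->dbg_seq);
                len = get_mergeable_buf_len(&rq_stats->avg_pkt_len);
        } while (read_seqcount_retry(&rq_stats->dbg_seq, seq));

        return len;
}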

We can move to sysfs and this would simplify or eliminate much of the
above, including most of (1) - (4). I believe our choices for what to
do for the next patchset include:
(a) Use debugfs as is currently done, removing any optional features
listed above that are deemed unnecessary.

(b) Add a per-netdev sysfs attribute group to net_device->sysfs_groups.
Each attribute would display the mergeable packet buffer size for a given
RX queue, and there would be max_queue_pairs attributes in total. This
is already supported by net/core/net-sysfs.c:netdev_register_kobject(),
but means that we would have a static set of per-RX queue files for
all RX queues supported by the netdev, rather than dynamically displaying
only the files corresponding to enabled RX queues (e.g., when # of RX
queues is changed by ethtool -L <device>).  For an example of this
approach, see drivers/net/bonding/bond_sysfs.c.

(c) Modify struct netdev_rx_queue to add virtio-net EWMA fields directly,
and modify net-sysfs.c to manage the new fields. Unlike (b), this approach
supports the RX queue resizing in (3) but means putting virtio-net info
in netdev_rx_queue, which currently has only device-independent fields.

My preference would be (b): try using sysfs and adding a device-specific
attribute group to the virtio-net netdevice (stored in the existing
'sysfs_groups' field and supported by net-sysfs).  This would avoid
adding virtio-net specific information to net-sysfs. What would you
prefer (or is there a better way than the approaches above)? Thanks!

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-10 Thread Michael Dalton
Also, one other note: if we use sysfs, the directory structure will
be different depending on our chosen sysfs strategy. If we augment
netdev_rx_queue, the new attributes will be found in the standard
'rx-N' netdev subdirectory, e.g.,
/sys/class/net/eth0/queues/rx-0/mergeable_rx_buffer_size

Whereas if we use per-netdev attributes, our attributes would be in
/sys/class/net/eth0/<group name>/<attribute name>, which may be
less intuitive as AFAICT we'd have to indicate both the queue # and
type of value being reported using the attribute name. E.g.,
/sys/class/net/eth0/virtio-net/rx-0_mergeable_buffer_size.
That's somewhat less elegant.

I don't see an easy way to add new attributes to the 'rx-N'
subdirectories without directly modifying struct netdev_rx_queue,
so I think this is another tradeoff between the two sysfs approaches.

Best,

Mike


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-08 Thread Michael S. Tsirkin
On Wed, Jan 08, 2014 at 02:34:31PM +0800, Jason Wang wrote:
 On 01/07/2014 01:25 PM, Michael Dalton wrote:
  Add initial support for debugfs to virtio-net. Each virtio-net network
  device will have a directory under /virtio-net in debugfs. The
  per-network device directory will contain one sub-directory per active,
  enabled receive queue. If mergeable receive buffers are enabled, each
  receive queue directory will contain a read-only file that returns the
  current packet buffer size for the receive queue.
 
  Signed-off-by: Michael Dalton <mwdal...@google.com>
 
 This looks more complicated than expected. How about just adding an
 entry in sysfs to the existing network class device, which looks
 simpler?

sysfs is part of the userspace ABI; I think that's the main issue: can we
commit to this attribute being there in the future?
If yes, we can use sysfs, but it may be reasonable to use debugfs for
a while until we are more sure of this.
I don't mind either way.


Re: [PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-07 Thread Jason Wang
On 01/07/2014 01:25 PM, Michael Dalton wrote:
 Add initial support for debugfs to virtio-net. Each virtio-net network
 device will have a directory under /virtio-net in debugfs. The
 per-network device directory will contain one sub-directory per active,
 enabled receive queue. If mergeable receive buffers are enabled, each
 receive queue directory will contain a read-only file that returns the
 current packet buffer size for the receive queue.

 Signed-off-by: Michael Dalton <mwdal...@google.com>

This looks more complicated than expected. How about just adding an
entry in sysfs to the existing network class device, which looks
simpler?


[PATCH net-next v2 4/4] virtio-net: initial debugfs support, export mergeable rx buffer size

2014-01-06 Thread Michael Dalton
Add initial support for debugfs to virtio-net. Each virtio-net network
device will have a directory under /virtio-net in debugfs. The
per-network device directory will contain one sub-directory per active,
enabled receive queue. If mergeable receive buffers are enabled, each
receive queue directory will contain a read-only file that returns the
current packet buffer size for the receive queue.

Signed-off-by: Michael Dalton <mwdal...@google.com>
---
 drivers/net/virtio_net.c | 314 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 296 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f6e1ee0..5da18d6 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -27,6 +27,9 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/average.h>
+#include <linux/seqlock.h>
+#include <linux/kref.h>
+#include <linux/debugfs.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -35,6 +38,9 @@ static bool csum = true, gso = true;
 module_param(csum, bool, 0444);
 module_param(gso, bool, 0444);
 
+/* Debugfs root directory for all virtio-net devices. */
+static struct dentry *virtnet_debugfs_root;
+
 /* FIXME: MTU in config. */
 #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
 #define GOOD_COPY_LEN  128
@@ -102,9 +108,6 @@ struct receive_queue {
/* Chain pages by the private ptr. */
struct page *pages;
 
-   /* Average packet length for mergeable receive buffers. */
-   struct ewma mrg_avg_pkt_len;
-
/* Page frag for packet buffer allocation. */
struct page_frag alloc_frag;
 
@@ -115,6 +118,28 @@ struct receive_queue {
char name[40];
 };
 
+/* Per-receive queue statistics exported via debugfs. */
+struct receive_queue_stats {
+   /* Average packet length of receive queue (for mergeable rx buffers). */
+   struct ewma avg_pkt_len;
+
+   /* Per-receive queue stats debugfs directory. */
+   struct dentry *dbg;
+
+   /* Reference count for the receive queue statistics, needed because
+* an open debugfs file may outlive the receive queue and netdevice.
+* Open files will remain in-use until all outstanding file descriptors
+* are closed, even after the underlying file is unlinked.
+*/
+   struct kref refcount;
+
+   /* Sequence counter to allow debugfs readers to safely access stats.
+* Assumes a single virtio-net writer, which is enforced by virtio-net
+* and NAPI.
+*/
+   seqcount_t dbg_seq;
+};
+
 struct virtnet_info {
struct virtio_device *vdev;
struct virtqueue *cvq;
@@ -147,6 +172,15 @@ struct virtnet_info {
/* Active statistics */
struct virtnet_stats __percpu *stats;
 
+   /* Per-receive queue statistics exported via debugfs. Stored in
+* virtnet_info to survive freeze/restore -- a task may have a per-rq
+* debugfs file open at the time of freeze.
+*/
+   struct receive_queue_stats **rq_stats;
+
+   /* Per-netdevice debugfs directory. */
+   struct dentry *dbg_dev_root;
+
/* Work struct for refilling if we run low on memory. */
struct delayed_work refill;
 
@@ -358,6 +392,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 unsigned int len)
 {
struct skb_vnet_hdr *hdr = ctx->buf;
+   struct virtnet_info *vi = netdev_priv(dev);
+   struct receive_queue_stats *rq_stats = vi->rq_stats[vq2rxq(rq->vq)];
int num_buf = hdr->mhdr.num_buffers;
struct page *page = virt_to_head_page(ctx->buf);
int offset = ctx->buf - page_address(page);
@@ -413,7 +449,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
}
}
 
-   ewma_add(&rq->mrg_avg_pkt_len, head_skb->len);
+   write_seqcount_begin(&rq_stats->dbg_seq);
+   ewma_add(&rq_stats->avg_pkt_len, head_skb->len);
+   write_seqcount_end(&rq_stats->dbg_seq);
return head_skb;
 
 err_skb:
@@ -600,18 +638,30 @@ static int add_recvbuf_big(struct receive_queue *rq, gfp_t gfp)
return err;
 }
 
+static unsigned int get_mergeable_buf_len(struct ewma *avg_pkt_len)
+{
+   const size_t hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   unsigned int len;
+
+   len = hdr_len + clamp_t(unsigned int, ewma_read(avg_pkt_len),
+   GOOD_PACKET_LEN, PAGE_SIZE - hdr_len);
+   return ALIGN(len, L1_CACHE_BYTES);
+}
+
 static int add_recvbuf_mergeable(struct receive_queue *rq, gfp_t gfp)
 {
const unsigned int ring_size = rq->mrg_buf_ctx_size;
-   const size_t hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
struct page_frag *alloc_frag = &rq->alloc_frag;
+   struct virtnet_info *vi = rq->vq->vdev->priv;
struct mergeable_receive_buf_ctx *ctx;
int err;
unsigned int len, hole;
 
-   len = hdr_len + clamp_t(unsigned