RE: [PATCH v30 2/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-04-04 Thread Wang, Wei W
On Thursday, April 5, 2018 9:12 AM, Michael S. Tsirkin wrote:
> On Thu, Apr 05, 2018 at 12:30:27AM +, Wang, Wei W wrote:
> > On Wednesday, April 4, 2018 10:08 PM, Michael S. Tsirkin wrote:
> > > On Wed, Apr 04, 2018 at 10:07:51AM +0800, Wei Wang wrote:
> > > > On 04/04/2018 02:47 AM, Michael S. Tsirkin wrote:
> > > > > On Wed, Apr 04, 2018 at 12:10:03AM +0800, Wei Wang wrote:
> > I'm afraid the driver couldn't be aware if the added hints are stale
> > or not,
> 
> 
> No - I mean that driver has code that compares two values and stops
> reporting. Can one of the values be stale?

The driver compares "vb->cmd_id_use != vb->cmd_id_received" to decide if it 
needs to stop reporting hints, and cmd_id_received is what the driver reads 
from host (host notifies the driver to read for the latest value). If host 
sends a new cmd id, it will notify the guest to read again. I'm not sure how 
that could be a stale cmd id (or maybe I misunderstood your point here?)

Best,
Wei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PULL] fwcfg, vhost: features and fixes

2018-04-04 Thread Michael S. Tsirkin
The following changes since commit 0c8efd610b58cb23cefdfa12015799079aef94ae:

  Linux 4.16-rc5 (2018-03-11 17:25:09 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to dc32bb678e103afbcfa4d814489af0566307f528:

  vhost: add vsock compat ioctl (2018-03-20 03:17:42 +0200)


fw_cfg, vhost: features fixes

This cleans up the qemu fw cfg device driver.
On top of this, vmcore is dumped there on crash to
help debugging witH kASLR enabled.
Also included are some fixes in vhost.

Signed-off-by: Michael S. Tsirkin 


Marc-André Lureau (10):
  fw_cfg: fix sparse warnings in fw_cfg_sel_endianness()
  fw_cfg: fix sparse warnings with fw_cfg_file
  fw_cfg: fix sparse warning reading FW_CFG_ID
  fw_cfg: fix sparse warnings around FW_CFG_FILE_DIR read
  fw_cfg: remove inline from fw_cfg_read_blob()
  fw_cfg: handle fw_cfg_read_blob() error
  fw_cfg: add a public uapi header
  fw_cfg: add DMA register
  crash: export paddr_vmcoreinfo_note()
  fw_cfg: write vmcoreinfo details

Michael S. Tsirkin (1):
  ptr_ring: fix build

Sonny Rao (2):
  vhost: fix vhost ioctl signature to build with clang
  vhost: add vsock compat ioctl

 MAINTAINERS  |   1 +
 drivers/firmware/qemu_fw_cfg.c   | 291 +++
 drivers/vhost/vhost.c|   2 +-
 drivers/vhost/vhost.h|   4 +-
 drivers/vhost/vsock.c|  11 ++
 include/uapi/linux/qemu_fw_cfg.h |  97 +
 kernel/crash_core.c  |   1 +
 tools/virtio/ringtest/ptr_ring.c |   5 +
 8 files changed, 348 insertions(+), 64 deletions(-)
 create mode 100644 include/uapi/linux/qemu_fw_cfg.h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v30 2/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-04-04 Thread Michael S. Tsirkin
On Thu, Apr 05, 2018 at 12:30:27AM +, Wang, Wei W wrote:
> On Wednesday, April 4, 2018 10:08 PM, Michael S. Tsirkin wrote:
> > On Wed, Apr 04, 2018 at 10:07:51AM +0800, Wei Wang wrote:
> > > On 04/04/2018 02:47 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 04, 2018 at 12:10:03AM +0800, Wei Wang wrote:
> > > > > +static int add_one_sg(struct virtqueue *vq, unsigned long pfn,
> > > > > +uint32_t len) {
> > > > > + struct scatterlist sg;
> > > > > + unsigned int unused;
> > > > > +
> > > > > + sg_init_table(, 1);
> > > > > + sg_set_page(, pfn_to_page(pfn), len, 0);
> > > > > +
> > > > > + /* Detach all the used buffers from the vq */
> > > > > + while (virtqueue_get_buf(vq, ))
> > > > > + ;
> > > > > +
> > > > > + /*
> > > > > +  * Since this is an optimization feature, losing a couple of 
> > > > > free
> > > > > +  * pages to report isn't important. We simply return without 
> > > > > adding
> > > > > +  * the page hint if the vq is full.
> > > > why not stop scanning of following pages though?
> > >
> > > Because continuing to send hints is a way to deliver the maximum
> > > possible hints to host. For example, host may have a delay in taking
> > > hints at some point, and then it resumes to take hints soon. If the
> > > driver does not stop when the vq is full, it will be able to put more
> > > hints to the vq once the vq has available entries to add.
> > 
> > What this appears to be is just lack of coordination between host and guest.
> > 
> > But meanwhile you are spending cycles walking the list uselessly.
> > Instead of trying nilly-willy, the standard thing to do is to wait for host 
> > to
> > consume an entry and proceed.
> > 
> > Coding it up might be tricky, so it's probably acceptable as is for now, but
> > please replace the justification about with a TODO entry that we should
> > synchronize with the host.
> 
> Thanks. I plan to add
> 
> TODO: The current implementation could be further improved by stopping the 
> reporting when the vq is full and continuing the reporting when host notifies 
> that there are available entries for the driver to add.

... that entries have been used.

> 
> > 
> > 
> > >
> > > >
> > > > > +  * We are adding one entry each time, which essentially results 
> > > > > in no
> > > > > +  * memory allocation, so the GFP_KERNEL flag below can be 
> > > > > ignored.
> > > > > +  * Host works by polling the free page vq for hints after 
> > > > > sending the
> > > > > +  * starting cmd id, so the driver doesn't need to kick after 
> > > > > filling
> > > > > +  * the vq.
> > > > > +  * Lastly, there is always one entry reserved for the cmd id to 
> > > > > use.
> > > > > +  */
> > > > > + if (vq->num_free > 1)
> > > > > + return virtqueue_add_inbuf(vq, , 1, vq, GFP_KERNEL);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int virtio_balloon_send_free_pages(void *opaque, unsigned long
> > pfn,
> > > > > +unsigned long nr_pages)
> > > > > +{
> > > > > + struct virtio_balloon *vb = (struct virtio_balloon *)opaque;
> > > > > + uint32_t len = nr_pages << PAGE_SHIFT;
> > > > > +
> > > > > + /*
> > > > > +  * If a stop id or a new cmd id was just received from host, 
> > > > > stop
> > > > > +  * the reporting, and return 1 to indicate an active stop.
> > > > > +  */
> > > > > + if (virtio32_to_cpu(vb->vdev, vb->cmd_id_use) != vb-
> > >cmd_id_received)
> > > > > + return 1;
> > 
> > functions returning int should return 0 or -errno on failure, positive 
> > return
> > code should indicate progress.
> > 
> > If you want a boolean, use bool pls.
> 
> OK. I plan to change 1  to -EBUSY to indicate the case that host actively 
> asks the driver to stop reporting (This makes the callback return value type 
> consistent with walk_free_mem_block). 
> 

something like EINTR might be a better fit.

> 
> > 
> > 
> > > > > +
> > > > this access to cmd_id_use and cmd_id_received without locks bothers
> > > > me. Pls document why it's safe.
> > >
> > > OK. Probably we could add below to the above comments:
> > >
> > > cmd_id_use and cmd_id_received don't need to be accessed under locks
> > > because the reporting does not have to stop immediately before
> > > cmd_id_received is changed (i.e. when host requests to stop). That is,
> > > reporting more hints after host requests to stop isn't an issue for
> > > this optimization feature, because host will simply drop the stale
> > > hints next time when it needs a new reporting.
> > 
> > What about the other direction? Can this observe a stale value and exit
> > erroneously?
> 
> I'm afraid the driver couldn't be aware if the added hints are stale or not,


No - I mean that driver has code that compares two values
and stops reporting. Can one of the values be stale?

> because host and guest actions happen asynchronously. That is, 

RE: [PATCH v30 2/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-04-04 Thread Wang, Wei W
On Wednesday, April 4, 2018 10:08 PM, Michael S. Tsirkin wrote:
> On Wed, Apr 04, 2018 at 10:07:51AM +0800, Wei Wang wrote:
> > On 04/04/2018 02:47 AM, Michael S. Tsirkin wrote:
> > > On Wed, Apr 04, 2018 at 12:10:03AM +0800, Wei Wang wrote:
> > > > +static int add_one_sg(struct virtqueue *vq, unsigned long pfn,
> > > > +uint32_t len) {
> > > > +   struct scatterlist sg;
> > > > +   unsigned int unused;
> > > > +
> > > > +   sg_init_table(, 1);
> > > > +   sg_set_page(, pfn_to_page(pfn), len, 0);
> > > > +
> > > > +   /* Detach all the used buffers from the vq */
> > > > +   while (virtqueue_get_buf(vq, ))
> > > > +   ;
> > > > +
> > > > +   /*
> > > > +* Since this is an optimization feature, losing a couple of 
> > > > free
> > > > +* pages to report isn't important. We simply return without 
> > > > adding
> > > > +* the page hint if the vq is full.
> > > why not stop scanning of following pages though?
> >
> > Because continuing to send hints is a way to deliver the maximum
> > possible hints to host. For example, host may have a delay in taking
> > hints at some point, and then it resumes to take hints soon. If the
> > driver does not stop when the vq is full, it will be able to put more
> > hints to the vq once the vq has available entries to add.
> 
> What this appears to be is just lack of coordination between host and guest.
> 
> But meanwhile you are spending cycles walking the list uselessly.
> Instead of trying nilly-willy, the standard thing to do is to wait for host to
> consume an entry and proceed.
> 
> Coding it up might be tricky, so it's probably acceptable as is for now, but
> please replace the justification about with a TODO entry that we should
> synchronize with the host.

Thanks. I plan to add

TODO: The current implementation could be further improved by stopping the 
reporting when the vq is full and continuing the reporting when host notifies 
that there are available entries for the driver to add.


> 
> 
> >
> > >
> > > > +* We are adding one entry each time, which essentially results 
> > > > in no
> > > > +* memory allocation, so the GFP_KERNEL flag below can be 
> > > > ignored.
> > > > +* Host works by polling the free page vq for hints after 
> > > > sending the
> > > > +* starting cmd id, so the driver doesn't need to kick after 
> > > > filling
> > > > +* the vq.
> > > > +* Lastly, there is always one entry reserved for the cmd id to 
> > > > use.
> > > > +*/
> > > > +   if (vq->num_free > 1)
> > > > +   return virtqueue_add_inbuf(vq, , 1, vq, GFP_KERNEL);
> > > > +
> > > > +   return 0;
> > > > +}
> > > > +
> > > > +static int virtio_balloon_send_free_pages(void *opaque, unsigned long
> pfn,
> > > > +  unsigned long nr_pages)
> > > > +{
> > > > +   struct virtio_balloon *vb = (struct virtio_balloon *)opaque;
> > > > +   uint32_t len = nr_pages << PAGE_SHIFT;
> > > > +
> > > > +   /*
> > > > +* If a stop id or a new cmd id was just received from host, 
> > > > stop
> > > > +* the reporting, and return 1 to indicate an active stop.
> > > > +*/
> > > > +   if (virtio32_to_cpu(vb->vdev, vb->cmd_id_use) != vb-
> >cmd_id_received)
> > > > +   return 1;
> 
> functions returning int should return 0 or -errno on failure, positive return
> code should indicate progress.
> 
> If you want a boolean, use bool pls.

OK. I plan to change 1  to -EBUSY to indicate the case that host actively asks 
the driver to stop reporting (This makes the callback return value type 
consistent with walk_free_mem_block). 



> 
> 
> > > > +
> > > this access to cmd_id_use and cmd_id_received without locks bothers
> > > me. Pls document why it's safe.
> >
> > OK. Probably we could add below to the above comments:
> >
> > cmd_id_use and cmd_id_received don't need to be accessed under locks
> > because the reporting does not have to stop immediately before
> > cmd_id_received is changed (i.e. when host requests to stop). That is,
> > reporting more hints after host requests to stop isn't an issue for
> > this optimization feature, because host will simply drop the stale
> > hints next time when it needs a new reporting.
> 
> What about the other direction? Can this observe a stale value and exit
> erroneously?

I'm afraid the driver couldn't be aware if the added hints are stale or not, 
because host and guest actions happen asynchronously. That is, host side 
iothread stops taking hints as soon as the migration thread asks to stop, it 
doesn't wait for any ACK from the driver to stop (as we discussed before, host 
couldn't always assume that the driver is in a responsive state).

Btw, we also don't need to worry about any memory left in the vq, since only 
addresses are added to the vq, there is no real memory allocations.

Best,
Wei

Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Andrew Lunn
> Networking vendors have out of tree kernel modules. Those modules use a
> netdev (call it a master netdev, a control netdev, cpu port, whatever)
> to pull packets from the ASIC and deliver to virtual netdevices
> representing physical ports.

Sounds a lot like DSA. Please ask the vendor to contribute the drivers
:-)

> The master netdev should not be mucked with by a user. It should be
> ignored by certain s/w with lldpd as just an *example*.

I have come across occasional problems with the master device in DSA.
But nothing too serious. Generally the switch will just toss frames it
gets which don't have the needed header, when they come direct from
the master device, rather than via the slave devices.

Andrew

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Jiri Pirko
Wed, Apr 04, 2018 at 07:37:49PM CEST, da...@davemloft.net wrote:
>From: David Ahern 
>Date: Wed, 4 Apr 2018 11:21:54 -0600
>
>> It is a netdev so there is no reason to have a separate ip command to
>> inspect it. 'ip link' is the right place.
>
>I agree on this.
>
>What I really don't understand still is the use case... really.
>
>So there are control netdevs, what exactly is the problem with that?
>
>Are we not exporting enough information for applications to handle
>these devices sanely?  If so, then's let add that information.
>
>We can set netdev->type to ETH_P_LINUXCONTROL or something like that.
>
>Another alternative is to add an interface flag like IFF_CONTROL or
>similar, and that probably is much nicer.
>
>Hiding the devices means that we acknowledge that applications are
>currently broken with control netdevs... and we want them to stay
>broken!
>
>That doesn't sound like a good plan to me.
>
>So let's fix handling of control netdevs instead of hiding them.

Exactly. Don't workaround userspace issues by kernel patches.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Siwei Liu
On Wed, Apr 4, 2018 at 10:21 AM, David Ahern  wrote:
> On 4/4/18 1:36 AM, Siwei Liu wrote:
>> On Tue, Apr 3, 2018 at 6:04 PM, David Ahern  wrote:
>>> On 4/3/18 9:42 AM, Jiri Pirko wrote:
>
> There are other use cases that want to hide a device from userspace. I

 What usecases do you have in mind?
>>>
>>> As mentioned in a previous response some kernel drivers create control
>>> netdevs. Just as in this case users should not be mucking with it, and
>>> S/W like lldpd should ignore it.
>>>

> would prefer a better solution than playing games with name prefixes and
> one that includes an API for users to list all devices -- even ones
> hidden by default.

 Netdevice hiding feels a bit scarry for me. This smells like a workaround
 for userspace issues. Why can't the netdevice be visible always and
 userspace would know what is it and what should it do with it?

 Once we start with hiding, there are other things related to that which
 appear. Like who can see what, levels of visibility etc...

>>>
>>> I would not advocate for any API that does not allow users to have full
>>> introspection. The intent is to hide the netdev by default but have an
>>> option to see it.
>>
>> I'm fine with having a link dump API to inspect the hidden netdev. As
>> said, the name for hidden netdevs should be in a separate device
>> namespace, and we did not even get closer to what it should look like
>> as I don't want to make it just an option for ip link. Perhaps a new
>> set of sub-commands of, say, 'ip device'.
>
> It is a netdev so there is no reason to have a separate ip command to
> inspect it. 'ip link' is the right place.

If you're still thinking the visibility is part of link attribute
rather than a separate namespace, I'd say we are trying to solve
essentially different problems, and I really don't understand your
proposal in solving that problem to be honest.

So, let's step back on studying your case if that's the right thing
and let's talk about concrete examples.

-Siwei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Stephen Hemminger
On Wed, 4 Apr 2018 11:37:52 -0600
David Ahern  wrote:

> Networking vendors have out of tree kernel modules. Those modules use a
> netdev (call it a master netdev, a control netdev, cpu port, whatever)
> to pull packets from the ASIC and deliver to virtual netdevices
> representing physical ports. The master netdev should not be mucked with
> by a user. It should be ignored by certain s/w with lldpd as just an
> *example*.

Sorry, the linux kernel maintainers have a clear well defined attitude
about out of tree kernel modules...
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread David Miller
From: David Ahern 
Date: Wed, 4 Apr 2018 11:37:52 -0600

> Networking vendors have out of tree kernel modules. Those modules use a
> netdev (call it a master netdev, a control netdev, cpu port, whatever)
> to pull packets from the ASIC and deliver to virtual netdevices
> representing physical ports. The master netdev should not be mucked with
> by a user. It should be ignored by certain s/w with lldpd as just an
> *example*.

Two approaches:

1) Add an IFF_CONTROL and make userspace understand this.  It is probably
   long overdue.

2) Design the driver properly.  Have a non-netdev master device like
   mlxsw does, and control it using devlink or similar.  This is exactly
   how this stuff was meant to be architected.

> From there I think you are confusing my intentions: I fundamentally do
> not believe the kernel should be hiding anything from an admin. Not
> showing data by default is completely different than not showing that
> data at all.

It is the same David.

It measn we have no intention of fixing applications to properly know
what to do with and how to handle these devices.

If you hide these objects, we are basically giving up on fixing the
tools and or the drivers themselves to be architected differently
(see #2 above).

That really isn't acceptable in my opinion.

> The intention of my patch with the IFF_HIDDEN attribute is:
> 1. it is a netdev attribute
> 
> 2. that attribute can be used by userpsace to indicate to the kernel I
> want all or I want the default
> 
> 3. that attribute can be controlled by an admin.
> 
> The patches go beyond my specific use case (preventing a user from
> modifying a netdev it should not be touching) but to defining the
> semantics of a generic capability which is what the kernel should have.

"Teach, do not hide!" -Yoda
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread David Miller
From: David Ahern 
Date: Wed, 4 Apr 2018 11:21:54 -0600

> It is a netdev so there is no reason to have a separate ip command to
> inspect it. 'ip link' is the right place.

I agree on this.

What I really don't understand still is the use case... really.

So there are control netdevs, what exactly is the problem with that?

Are we not exporting enough information for applications to handle
these devices sanely?  If so, then's let add that information.

We can set netdev->type to ETH_P_LINUXCONTROL or something like that.

Another alternative is to add an interface flag like IFF_CONTROL or
similar, and that probably is much nicer.

Hiding the devices means that we acknowledge that applications are
currently broken with control netdevs... and we want them to stay
broken!

That doesn't sound like a good plan to me.

So let's fix handling of control netdevs instead of hiding them.

Thanks.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v30 2/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-04-04 Thread Michael S. Tsirkin
On Wed, Apr 04, 2018 at 10:07:51AM +0800, Wei Wang wrote:
> On 04/04/2018 02:47 AM, Michael S. Tsirkin wrote:
> > On Wed, Apr 04, 2018 at 12:10:03AM +0800, Wei Wang wrote:
> > > +static int add_one_sg(struct virtqueue *vq, unsigned long pfn, uint32_t 
> > > len)
> > > +{
> > > + struct scatterlist sg;
> > > + unsigned int unused;
> > > +
> > > + sg_init_table(, 1);
> > > + sg_set_page(, pfn_to_page(pfn), len, 0);
> > > +
> > > + /* Detach all the used buffers from the vq */
> > > + while (virtqueue_get_buf(vq, ))
> > > + ;
> > > +
> > > + /*
> > > +  * Since this is an optimization feature, losing a couple of free
> > > +  * pages to report isn't important. We simply return without adding
> > > +  * the page hint if the vq is full.
> > why not stop scanning of following pages though?
> 
> Because continuing to send hints is a way to deliver the maximum possible
> hints to host. For example, host may have a delay in taking hints at some
> point, and then it resumes to take hints soon. If the driver does not stop
> when the vq is full, it will be able to put more hints to the vq once the vq
> has available entries to add.

What this appears to be is just lack of coordination between
host and guest.

But meanwhile you are spending cycles walking the list uselessly.
Instead of trying nilly-willy, the standard thing to do
is to wait for host to consume an entry and proceed.

Coding it up might be tricky, so it's probably acceptable as is
for now, but please replace the justification about with
a TODO entry that we should synchronize with the host.


> 
> > 
> > > +  * We are adding one entry each time, which essentially results in no
> > > +  * memory allocation, so the GFP_KERNEL flag below can be ignored.
> > > +  * Host works by polling the free page vq for hints after sending the
> > > +  * starting cmd id, so the driver doesn't need to kick after filling
> > > +  * the vq.
> > > +  * Lastly, there is always one entry reserved for the cmd id to use.
> > > +  */
> > > + if (vq->num_free > 1)
> > > + return virtqueue_add_inbuf(vq, , 1, vq, GFP_KERNEL);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int virtio_balloon_send_free_pages(void *opaque, unsigned long 
> > > pfn,
> > > +unsigned long nr_pages)
> > > +{
> > > + struct virtio_balloon *vb = (struct virtio_balloon *)opaque;
> > > + uint32_t len = nr_pages << PAGE_SHIFT;
> > > +
> > > + /*
> > > +  * If a stop id or a new cmd id was just received from host, stop
> > > +  * the reporting, and return 1 to indicate an active stop.
> > > +  */
> > > + if (virtio32_to_cpu(vb->vdev, vb->cmd_id_use) != vb->cmd_id_received)
> > > + return 1;

functions returning int should return 0 or -errno on failure,
positive return code should indicate progress.

If you want a boolean, use bool pls.


> > > +
> > this access to cmd_id_use and cmd_id_received without locks
> > bothers me. Pls document why it's safe.
> 
> OK. Probably we could add below to the above comments:
> 
> cmd_id_use and cmd_id_received don't need to be accessed under locks because
> the reporting does not have to stop immediately before cmd_id_received is
> changed (i.e. when host requests to stop). That is, reporting more hints
> after host requests to stop isn't an issue for this optimization feature,
> because host will simply drop the stale hints next time when it needs a new
> reporting.

What about the other direction? Can this observe a stale value and
exit erroneously?

> 
> 
> 
> > 
> > > + return add_one_sg(vb->free_page_vq, pfn, len);
> > > +}
> > > +
> > > +static int send_start_cmd_id(struct virtio_balloon *vb, uint32_t cmd_id)
> > > +{
> > > + struct scatterlist sg;
> > > + struct virtqueue *vq = vb->free_page_vq;
> > > +
> > > + vb->cmd_id_use = cpu_to_virtio32(vb->vdev, cmd_id);
> > > + sg_init_one(, >cmd_id_use, sizeof(vb->cmd_id_use));
> > > + return virtqueue_add_outbuf(vq, , 1, vb, GFP_KERNEL);
> > > +}
> > > +
> > > +static int send_stop_cmd_id(struct virtio_balloon *vb)
> > > +{
> > > + struct scatterlist sg;
> > > + struct virtqueue *vq = vb->free_page_vq;
> > > +
> > > + sg_init_one(, >stop_cmd_id, sizeof(vb->cmd_id_use));
> > why the inconsistency?
> 
> Thanks, will make it consistent.
> 
> Best,
> Wei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Siwei Liu
On Tue, Apr 3, 2018 at 6:04 PM, David Ahern  wrote:
> On 4/3/18 9:42 AM, Jiri Pirko wrote:
>>>
>>> There are other use cases that want to hide a device from userspace. I
>>
>> What usecases do you have in mind?
>
> As mentioned in a previous response some kernel drivers create control
> netdevs. Just as in this case users should not be mucking with it, and
> S/W like lldpd should ignore it.

I'm still not sure I understand your case: why you want to hide the
control netdev, as I assume those devices could choose either to
silently ignore the request, or fail loudly against user operations?
Is it creating issues already, or what problem you want to solve if
not making the netdev invisible. Why couldn't lldpd check some
specific flag and ignore the control netdevice (can you please give an
example of a concrete driver for control netdevice *in tree*).

And I'm completely lost why you want an API to make a hidden netdev
visible again for these control devices.

Thanks,
-Siwei


>
>>
>>> would prefer a better solution than playing games with name prefixes and
>>> one that includes an API for users to list all devices -- even ones
>>> hidden by default.
>>
>> Netdevice hiding feels a bit scarry for me. This smells like a workaround
>> for userspace issues. Why can't the netdevice be visible always and
>> userspace would know what is it and what should it do with it?
>>
>> Once we start with hiding, there are other things related to that which
>> appear. Like who can see what, levels of visibility etc...
>>
>
> I would not advocate for any API that does not allow users to have full
> introspection. The intent is to hide the netdev by default but have an
> option to see it.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [virtio-dev] Re: [RFC PATCH 3/3] virtio_net: make lower netdevs for virtio_bypass hidden

2018-04-04 Thread Siwei Liu
On Tue, Apr 3, 2018 at 5:20 AM, Michael S. Tsirkin  wrote:
> On Sun, Apr 01, 2018 at 05:13:10AM -0400, Si-Wei Liu wrote:
>> diff --git a/include/uapi/linux/virtio_net.h 
>> b/include/uapi/linux/virtio_net.h
>> index aa40664..0827b7e 100644
>> --- a/include/uapi/linux/virtio_net.h
>> +++ b/include/uapi/linux/virtio_net.h
>> @@ -80,6 +80,8 @@ struct virtio_net_config {
>>   __u16 max_virtqueue_pairs;
>>   /* Default maximum transmit unit advice */
>>   __u16 mtu;
>> + /* Device at bus:slot.function backed up by virtio_net */
>> + __u16 bsf2backup;
>>  } __attribute__((packed));
>
> I'm not sure this is a good interface.  This isn't unique even on some
> PCI systems, not to speak of non-PCI ones.

Are you suggesting adding PCI address domain besides to make it
universally unique? And what the non-PCI device you envisioned that
the main target, essetially live migration, can/should cover? Or is
there better option in your mind already?

Thanks,
-Siwei
>
>>  /*
>> --
>> 1.8.3.1
>
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [virtio-dev] Re: [RFC PATCH 1/3] qemu: virtio-bypass should explicitly bind to a passthrough device

2018-04-04 Thread Siwei Liu
On Tue, Apr 3, 2018 at 5:25 AM, Michael S. Tsirkin  wrote:
> On Sun, Apr 01, 2018 at 05:13:08AM -0400, Si-Wei Liu wrote:
>> @@ -896,6 +898,68 @@ void qmp_device_del(const char *id, Error **errp)
>>  }
>>  }
>>
>> +int pci_get_busdevfn_by_id(const char *id, uint16_t *busnr,
>> +   uint16_t *devfn, Error **errp)
>> +{
>> +uint16_t busnum = 0, slot = 0, func = 0;
>> +const char *pc, *pd, *pe;
>> +Error *local_err = NULL;
>> +ObjectClass *class;
>> +char value[1024];
>> +BusState *bus;
>> +uint64_t u64;
>> +
>> +if (!(pc = strchr(id, ':'))) {
>> +error_setg(errp, "Invalid id: backup=%s, "
>> +   "correct format should be backup="
>> +   "':[.]'", id);
>> +return -1;
>> +}
>> +get_opt_name(value, sizeof(value), id, ':');
>> +if (pc != id + 1) {
>> +bus = qbus_find(value, errp);
>> +if (!bus)
>> +return -1;
>> +
>> +class = object_get_class(OBJECT(bus));
>> +if (class != object_class_by_name(TYPE_PCI_BUS) &&
>> +class != object_class_by_name(TYPE_PCIE_BUS)) {
>> +error_setg(errp, "%s is not a device on pci bus", id);
>> +return -1;
>> +}
>> +busnum = (uint16_t)pci_bus_num(PCI_BUS(bus));
>> +}
>
> pci_bus_num is almost always a bug if not done within
> a context of a PCI host, bridge, etc.
>
> In particular this will not DTRT if run before guest assigns bus
> numbers.
>
I was seeking means to reserve a specific pci bus slot from drivers,
and update the driver when guest assigns the bus number but it seems
there's no low-hanging fruits. Because of that reason the bus_num is
only obtained until it's really needed (during get_config) and I
assume at that point the pci bus assignment is already done. I know
the current one is not perfect, but we need that information (PCI
bus:slot.func number) to name the guest device correctly.

Regards,
-Siwei
>
>> +
>> +if (!devfn)
>> +goto out;
>> +
>> +pd = strchr(pc, '.');
>> +pe = get_opt_name(value, sizeof(value), pc + 1, '.');
>> +if (pe != pc + 1) {
>> +parse_option_number("slot", value, , _err);
>> +if (local_err) {
>> +error_propagate(errp, local_err);
>> +return -1;
>> +}
>> +slot = (uint16_t)u64;
>> +}
>> +if (pd && *(pd + 1) != '\0') {
>> +parse_option_number("function", pd, , _err);
>> +if (local_err) {
>> +error_propagate(errp, local_err);
>> +return -1;
>> +}
>> +func = (uint16_t)u64;
>> +}
>> +
>> +out:
>> +if (busnr)
>> +*busnr = busnum;
>> +if (devfn)
>> +*devfn = ((slot & 0x1F) << 3) | (func & 0x7);
>> +return 0;
>> +}
>> +
>>  BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
>>  {
>>  DeviceState *dev;
>> --
>> 1.8.3.1
>
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Siwei Liu
On Tue, Apr 3, 2018 at 11:19 PM, Jiri Pirko  wrote:
> Wed, Apr 04, 2018 at 03:04:26AM CEST, dsah...@gmail.com wrote:
>>On 4/3/18 9:42 AM, Jiri Pirko wrote:

 There are other use cases that want to hide a device from userspace. I
>>>
>>> What usecases do you have in mind?
>>
>>As mentioned in a previous response some kernel drivers create control
>>netdevs. Just as in this case users should not be mucking with it, and
>
> virtio_net. Any other drivers?

netvsc if factoring out virtio_bypass to a common driver.

>
>
>>S/W like lldpd should ignore it.
>
> It's just a matter of identification of the netdevs, so the user knows
> what to do.
>
>
>>
>>>
 would prefer a better solution than playing games with name prefixes and
 one that includes an API for users to list all devices -- even ones
 hidden by default.
>>>
>>> Netdevice hiding feels a bit scarry for me. This smells like a workaround
>>> for userspace issues. Why can't the netdevice be visible always and
>>> userspace would know what is it and what should it do with it?
>>>
>>> Once we start with hiding, there are other things related to that which
>>> appear. Like who can see what, levels of visibility etc...
>>>
>>
>>I would not advocate for any API that does not allow users to have full
>>introspection. The intent is to hide the netdev by default but have an
>>option to see it.
>
> As an administrator, I want to see all by default. I think it is
> reasonable requirements. Again, this awfully smells like a workaround...

If the requirement is just for dumping the link info i.e. perform
read-only operation on the hidden netdev, it's completely fine.
However, I am not a big fan of creating a weird mechanism to allow
user deliberately manipulate the visibility (hide/unhide) of a netdev
in any case at any time. This is subject to becoming a slippery slope
to work around any software issue that should get fixed in the right
place.

Let's treat IFF_HIDDEN as a means to hide auto-managed netdevices. If
it's just the name is misleading, I can get it renamed to something
like IFF_AUTO_MANAGED which might reflect its nature more properly.

Thanks,
-Siwei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Siwei Liu
On Tue, Apr 3, 2018 at 6:04 PM, David Ahern  wrote:
> On 4/3/18 9:42 AM, Jiri Pirko wrote:
>>>
>>> There are other use cases that want to hide a device from userspace. I
>>
>> What usecases do you have in mind?
>
> As mentioned in a previous response some kernel drivers create control
> netdevs. Just as in this case users should not be mucking with it, and
> S/W like lldpd should ignore it.
>
>>
>>> would prefer a better solution than playing games with name prefixes and
>>> one that includes an API for users to list all devices -- even ones
>>> hidden by default.
>>
>> Netdevice hiding feels a bit scarry for me. This smells like a workaround
>> for userspace issues. Why can't the netdevice be visible always and
>> userspace would know what is it and what should it do with it?
>>
>> Once we start with hiding, there are other things related to that which
>> appear. Like who can see what, levels of visibility etc...
>>
>
> I would not advocate for any API that does not allow users to have full
> introspection. The intent is to hide the netdev by default but have an
> option to see it.

I'm fine with having a link dump API to inspect the hidden netdev. As
said, the name for hidden netdevs should be in a separate device
namespace, and we did not even get closer to what it should look like
as I don't want to make it just an option for ip link. Perhaps a new
set of sub-commands of, say, 'ip device'.

-Siwei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice

2018-04-04 Thread Jiri Pirko
Wed, Apr 04, 2018 at 03:04:26AM CEST, dsah...@gmail.com wrote:
>On 4/3/18 9:42 AM, Jiri Pirko wrote:
>>>
>>> There are other use cases that want to hide a device from userspace. I
>> 
>> What usecases do you have in mind?
>
>As mentioned in a previous response some kernel drivers create control
>netdevs. Just as in this case users should not be mucking with it, and

virtio_net. Any other drivers?


>S/W like lldpd should ignore it.

It's just a matter of identification of the netdevs, so the user knows
what to do.


>
>> 
>>> would prefer a better solution than playing games with name prefixes and
>>> one that includes an API for users to list all devices -- even ones
>>> hidden by default.
>> 
>> Netdevice hiding feels a bit scarry for me. This smells like a workaround
>> for userspace issues. Why can't the netdevice be visible always and
>> userspace would know what is it and what should it do with it?
>> 
>> Once we start with hiding, there are other things related to that which
>> appear. Like who can see what, levels of visibility etc...
>> 
>
>I would not advocate for any API that does not allow users to have full
>introspection. The intent is to hide the netdev by default but have an
>option to see it.

As an administrator, I want to see all by default. I think it is
reasonable requirements. Again, this awfully smells like a workaround...
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization