[Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net

2018-05-07 Thread Sridhar Samudrala
This feature bit can be used by the hypervisor to tell a virtio_net device to
act as a standby for another device with the same MAC address.

I tested this with a small change to the patch to mark the STANDBY feature
'true' by default, as I am using libvirt to start the VMs.
Is there a way to pass the newly added feature bit 'standby' to qemu via the
libvirt XML file?

Signed-off-by: Sridhar Samudrala <sridhar.samudr...@intel.com>
---
 hw/net/virtio-net.c | 2 ++
 include/standard-headers/linux/virtio_net.h | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 90502fca7c..38b3140670 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2198,6 +2198,8 @@ static Property virtio_net_properties[] = {
  true),
 DEFINE_PROP_INT32("speed", VirtIONet, net_conf.speed, SPEED_UNKNOWN),
 DEFINE_PROP_STRING("duplex", VirtIONet, net_conf.duplex_str),
+DEFINE_PROP_BIT64("standby", VirtIONet, host_features, VIRTIO_NET_F_STANDBY,
+  false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
index e9f255ea3f..01ec09684c 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -57,6 +57,9 @@
 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23  /* Set MAC address */
 
+#define VIRTIO_NET_F_STANDBY  62   /* Act as standby for another device
+                                    * with the same MAC.
+                                    */
 #define VIRTIO_NET_F_SPEED_DUPLEX 63   /* Device set linkspeed and duplex */
 
 #ifndef VIRTIO_NET_NO_LEGACY
-- 
2.14.3
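
Editorial sketch, not part of the patch: the bit number 62 is why the property is declared with DEFINE_PROP_BIT64 against a 64-bit host_features word. A shift by 62 does not fit in a 32-bit int, so any mask has to be built from a 64-bit constant. The helpers feature_set() and feature_has() below are hypothetical names used only for illustration; they are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number taken from the patch above. */
#define VIRTIO_NET_F_STANDBY 62

/* Hypothetical helper: return 'features' with the given bit set. */
static inline uint64_t feature_set(uint64_t features, unsigned int bit)
{
    /* UINT64_C(1) keeps the shift in 64-bit arithmetic. */
    return features | (UINT64_C(1) << bit);
}

/* Hypothetical helper: test whether the given bit is set. */
static inline bool feature_has(uint64_t features, unsigned int bit)
{
    return (features & (UINT64_C(1) << bit)) != 0;
}
```

With these helpers, a device model would only advertise standby when the bit is present in the offered feature word, which matches the 'false' default in the property definition.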




Re: [Qemu-devel] [net-next RFC PATCH 0/7] multiqueue support for tun/tap

2011-08-12 Thread Sridhar Samudrala
On Fri, 2011-08-12 at 09:54 +0800, Jason Wang wrote:
> As multi-queue NICs are commonly used in high-end servers,
> the current single-queue tap cannot satisfy the
> requirement of scaling guest network performance as the
> number of vcpus increases. So the following series
> implements multiple queue support in tun/tap.
>
> In order to take advantage of this, a multi-queue capable
> driver and qemu are also needed. I just rebased the latest
> version of Krishna's multi-queue virtio-net driver into this
> series to simplify testing. For a multiqueue-capable
> qemu, you can refer to the patches I posted at
> http://www.spinics.net/lists/kvm/msg52808.html. Vhost is
> also a must to achieve high performance, and its code can
> be used for multi-queue without modification. Alternatively,
> this series can also be used with Krishna's M:N
> implementation of multiqueue, but I didn't test that.
>
> The idea is simple: each socket is abstracted as a queue
> for tun/tap, and userspace may open as many files as
> required and then attach them to the device. In order to
> keep ABI compatibility, device creation is still
> done in TUNSETIFF, and two new ioctls, TUNATTACHQUEUE and
> TUNDETACHQUEUE, are added for the user to manipulate the number
> of queues for the tun/tap.

Is it possible to have tap create these queues automatically when
TUNSETIFF is called, instead of having userspace issue the new
ioctls? I am just wondering if it is possible to get multi-queue
enabled without any changes to qemu. I guess the number of queues
could be based on the number of vhost threads/guest virtio-net queues.
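
Editorial sketch for context: this is roughly how userspace creates a single-queue tap device with TUNSETIFF today, which is the call the question above proposes to extend. The helper names tap_fill_ifreq() and tap_open() are hypothetical. The RFC's TUNATTACHQUEUE and TUNDETACHQUEUE ioctls are not shown, because their argument layout is not given in the cover letter.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: prepare the TUNSETIFF request for a tap device. */
static void tap_fill_ifreq(struct ifreq *ifr, const char *name)
{
    memset(ifr, 0, sizeof(*ifr));
    /* IFF_TAP: Ethernet frames; IFF_NO_PI: no extra packet-info header. */
    ifr->ifr_flags = IFF_TAP | IFF_NO_PI;
    strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/* Hypothetical helper: open the clone device and create/attach the tap.
 * Returns a file descriptor, or -1 on failure. */
static int tap_open(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0) {
        return -1;
    }
    tap_fill_ifreq(&ifr, name);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

In the scheme described in the cover letter, each additional queue would be one more opened file attached to the same device, so a qemu-transparent variant would have to create those extra files inside the kernel.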

Also, is it possible to enable multi-queue on the host alone without
any guest virtio-net changes?

Have you done any TCP_RR/UDP_RR testing with multiple instances and small
packet sizes, e.g. 256-byte request/response with 50-100 instances?

 
> I've done some basic performance testing of multi-queue
> tap. For tun, I just tested it through vpnc.
>
> Notes:
> - Tests show improvement when receiving packets from a
>   local/external host to the guest, and when sending big packets
>   from the guest to a local/external host.
> - The current multiqueue-based virtio-net/tap introduces a
>   regression when sending small packets (512 bytes) from the guest to a
>   local/external host. I suspect it's an issue of queue
>   selection in both the guest driver and tap. I will continue to
>   investigate.
> - I will post the performance numbers as a reply to this
>   mail.
>
> TODO:
> - solve the issue with transmission of small packets
> - address the comments on the virtio-net driver
> - performance tuning
>
> Please review and comment. Thanks.
>
> ---
>
> Jason Wang (5):
>   tuntap: move socket/sock related structures to tun_file
>   tuntap: categorize ioctl
>   tuntap: introduce multiqueue related flags
>   tuntap: multiqueue support
>   tuntap: add ioctls to attach or detach a file form tap device
>
> Krishna Kumar (2):
>   Change virtqueue structure
>   virtio-net changes
>
>
>  drivers/net/tun.c           |  738 ++-
>  drivers/net/virtio_net.c    |  578 --
>  drivers/virtio/virtio_pci.c |   10 -
>  include/linux/if_tun.h      |    5
>  include/linux/virtio.h      |    1
>  include/linux/virtio_net.h  |    3
>  6 files changed, 867 insertions(+), 468 deletions(-)
 




[Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests

2011-01-20 Thread Sridhar Samudrala
On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> When MSI is off, each interrupt needs to be bounced through the io
> thread when it's set/cleared, so vhost-net causes more context switches and
> higher CPU utilization than userspace virtio which handles networking in
> the same thread.
>
> We'll need to fix this by adding level irq support in kvm irqfd,
> for now disable vhost-net in these configurations.
>
> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> ---
>
> I need to report some error from virtio-pci
> that would be handled specially (disable but don't
> report an error) so I wanted one that's never likely to be used by a
> userspace ioctl. I selected ERANGE but it'd
> be easy to switch to something else. Comments?

Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?

-Sridhar
 
>  hw/vhost.c      |    4 +++-
>  hw/virtio-net.c |    6 --
>  hw/virtio-pci.c |    3 +++
>  hw/virtio.h     |    2 ++
>  4 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/vhost.c b/hw/vhost.c
> index 1d09ed0..c79765a 100644
> --- a/hw/vhost.c
> +++ b/hw/vhost.c
> @@ -649,7 +649,9 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
>
>      r = vdev->binding->set_guest_notifiers(vdev->binding_opaque, true);
>      if (r < 0) {
> -        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +        if (r != -EVIRTIO_DISABLED) {
> +            fprintf(stderr, "Error binding guest notifier: %d\n", -r);
> +        }
>          goto fail_notifiers;
>      }
>
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ccb3e63..5de3fee 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -121,8 +121,10 @@ static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
>      if (!n->vhost_started) {
>          int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), n->vdev);
>          if (r < 0) {
> -            error_report("unable to start vhost net: %d: "
> -                         "falling back on userspace virtio", -r);
> +            if (r != -EVIRTIO_DISABLED) {
> +                error_report("unable to start vhost net: %d: "
> +                             "falling back on userspace virtio", -r);
> +            }
>          } else {
>              n->vhost_started = 1;
>          }
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index dd8887a..dbf4be0 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -628,6 +628,9 @@ static int virtio_pci_set_guest_notifier(void *opaque, int n, bool assign)
>      EventNotifier *notifier = virtio_queue_get_guest_notifier(vq);
>
>      if (assign) {
> +        if (!msix_enabled(proxy->pci_dev)) {
> +            return -EVIRTIO_DISABLED;
> +        }
>          int r = event_notifier_init(notifier, 0);
>          if (r < 0) {
>              return r;
> diff --git a/hw/virtio.h b/hw/virtio.h
> index d8546d5..53bbdba 100644
> --- a/hw/virtio.h
> +++ b/hw/virtio.h
> @@ -98,6 +98,8 @@ typedef struct {
>      void (*vmstate_change)(void * opaque, bool running);
>  } VirtIOBindings;
>
> +#define EVIRTIO_DISABLED ERANGE
> +
>  #define VIRTIO_PCI_QUEUE_MAX 64
>
>  #define VIRTIO_NO_VECTOR 0x




[Qemu-devel] Re: [PATCH] vhost: force vhost off for non-MSI guests

2011-01-20 Thread Sridhar Samudrala
On Thu, 2011-01-20 at 19:47 +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 20, 2011 at 08:31:53AM -0800, Sridhar Samudrala wrote:
> > On Thu, 2011-01-20 at 17:35 +0200, Michael S. Tsirkin wrote:
> > > When MSI is off, each interrupt needs to be bounced through the io
> > > thread when it's set/cleared, so vhost-net causes more context switches
> > > and higher CPU utilization than userspace virtio which handles
> > > networking in the same thread.
> > >
> > > We'll need to fix this by adding level irq support in kvm irqfd,
> > > for now disable vhost-net in these configurations.
> > >
> > > Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> > > ---
> > >
> > > I need to report some error from virtio-pci
> > > that would be handled specially (disable but don't
> > > report an error) so I wanted one that's never likely to be used by a
> > > userspace ioctl. I selected ERANGE but it'd
> > > be easy to switch to something else. Comments?
> >
> > Should this error be EVHOST_DISABLED rather than EVIRTIO_DISABLED?
> >
> > -Sridhar
>
> The error is reported by virtio-pci which does not know about vhost.
> I started with EVIRTIO_MSIX_DISABLED and made it shorter.
> Would EVIRTIO_MSIX_DISABLED be better?

I think so. This makes it more clear.
-Sridhar
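
Editorial sketch of the pattern under discussion: a low-level routine returns a dedicated sentinel error when MSI-X is off, and the caller falls back silently for that one code while still logging every other failure. The name EVIRTIO_MSIX_DISABLED follows the suggestion in this thread (the posted patch uses EVIRTIO_DISABLED), and set_guest_notifier() and try_start_vhost() are hypothetical stand-ins for the QEMU functions touched by the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Sentinel aliasing ERANGE, as in the patch; the longer name is the one
 * proposed in this thread. */
#define EVIRTIO_MSIX_DISABLED ERANGE

/* Hypothetical stand-in for virtio_pci_set_guest_notifier(). */
static int set_guest_notifier(bool msix_enabled)
{
    if (!msix_enabled) {
        return -EVIRTIO_MSIX_DISABLED;
    }
    return 0;
}

/* Hypothetical caller: returns true if vhost could be started.
 * Only unexpected errors are reported; the sentinel means
 * "quietly fall back to userspace virtio". */
static bool try_start_vhost(bool msix_enabled)
{
    int r = set_guest_notifier(msix_enabled);

    if (r < 0) {
        if (r != -EVIRTIO_MSIX_DISABLED) {
            fprintf(stderr, "unable to start vhost net: %d\n", -r);
        }
        return false;
    }
    return true;
}
```

Because the sentinel is only an alias for ERANGE, the scheme relies on no other code path returning ERANGE for a different reason, which is the concern raised in the patch notes.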
 
   




[Qemu-devel] Re: kvm networking todo wiki

2010-09-22 Thread Sridhar Samudrala

 On Tue, 2010-09-21 at 18:11 +0200, Michael S. Tsirkin wrote:

> I've put up a wiki page with a kvm networking todo list,
> mainly to avoid effort duplication, but also in the hope
> to draw attention to what I think we should try addressing
> in KVM:
>
> http://www.linux-kvm.org/page/NetworkingTodo
>
> This page could cover all networking related activity in KVM,
> currently most info is related to virtio-net.
>
> Note: if there's no developer listed for an item,
> this just means I don't know of anyone actively working
> on an issue at the moment, not that no one intends to.
>
> I would appreciate it if others working on one of the items on this list
> would add their names so we can communicate better.  If others like this
> wiki page, please go ahead and add stuff you are working on if any.
>
> It would be especially nice to add autotest projects:
> there is just a short test matrix and a catch-all
> 'Cover test matrix with autotest', currently.
>
> Currently there are some links to Red Hat bugzilla entries,
> feel free to add links to other bugzillas.


Thanks for capturing these items. It is really useful.

Another item that is missing is
- support assigning SR-IOV VF to a guest via tap/macvtap

Currently, this requires
 - VF to be put in promiscuous mode when using a bridge/tap
 - add a new mac address to VF when using macvtap.

I don't think any of the VF drivers provide these capabilities
at this time.

-Sridhar



[Qemu-devel] Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu

2010-01-29 Thread Sridhar Samudrala
On Wed, 2010-01-27 at 14:56 -0800, Sridhar Samudrala wrote:
> On Wed, 2010-01-27 at 22:39 +0100, Arnd Bergmann wrote:
> > On Wednesday 27 January 2010, Anthony Liguori wrote:
> > > > > I think -net socket,fd should just be (trivially) extended to work
> > > > > with raw sockets out of the box, with no support for opening it.
> > > > > Then you can have libvirt or some wrapper open a raw socket and a
> > > > > private namespace and just pass it down.
> > > > >
> > > > That'd work. Anthony?
> > >
> > > The fundamental problem that I have with all of this is that we should
> > > not be introducing new network backends that are based around something
> > > only a developer is going to understand.  If I'm a user and I want to
> > > use an external switch in VEPA mode, how in the world am I going to know
> > > that I'm supposed to use the -net raw backend or the -net socket
> > > backend?  It might as well be the -net butterflies backend as far as a
> > > user is concerned.
> >
> > My point is that we already have -net socket,fd and any user that passes
> > an fd into that already knows what he wants to do with it. Making it
> > work with raw sockets is just a natural extension to this, which works
> > on all kernels and (with separate namespaces) is reasonably secure.
>
> Didn't realize that -net socket is already there and supports TCP and
> UDP sockets. I will look into extending -net socket to support AF_PACKET
> SOCK_RAW type sockets.

OK. Here is a patch that adds AF_PACKET SOCK_RAW support to the -netdev socket
backend. It allows specifying an already-opened raw fd or an ifname to which a
raw socket can be bound.

   -netdev socket,fd=X,id=str
   -netdev socket,ifname=ethX/macvlanX,id=str

However, I found that struct NetSocketState doesn't include all the state info
that is required to support AF_PACKET raw sockets. So I had to add
NetSocketRawState and also couldn't re-use much of the code.

I think the -net socket backend is more geared towards AF_INET sockets. Adding
support for a new socket family doesn't fit nicely with the existing code.
But if this approach is more acceptable than a new -net raw,fd backend, I am
fine with it.

Thanks
Sridhar
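
Editorial sketch of what the ifname= form has to do underneath: open an AF_PACKET/SOCK_RAW socket and bind it to the named interface. This is a minimal illustration using the standard Linux socket API, not the code from the patch below; fill_sockaddr_ll() and open_raw_socket() are hypothetical names. Creating the socket requires CAP_NET_RAW.

```c
#include <arpa/inet.h>        /* htons */
#include <net/ethernet.h>     /* ETH_P_ALL */
#include <net/if.h>           /* if_nametoindex */
#include <netpacket/packet.h> /* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: build the link-layer address used for bind(). */
static void fill_sockaddr_ll(struct sockaddr_ll *sll, unsigned int ifindex)
{
    memset(sll, 0, sizeof(*sll));
    sll->sll_family = AF_PACKET;
    sll->sll_protocol = htons(ETH_P_ALL); /* all protocols, network order */
    sll->sll_ifindex = (int)ifindex;
}

/* Hypothetical helper: open a raw packet socket bound to 'ifname'.
 * Returns a file descriptor, or -1 on failure. */
static int open_raw_socket(const char *ifname)
{
    struct sockaddr_ll sll;
    unsigned int ifindex = if_nametoindex(ifname);
    int fd;

    if (ifindex == 0) {
        return -1; /* no such interface */
    }
    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) {
        return -1; /* typically EPERM without CAP_NET_RAW */
    }
    fill_sockaddr_ll(&sll, ifindex);
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The fd= form corresponds to a privileged wrapper doing these steps and handing the resulting descriptor to qemu, so qemu itself does not need the capability.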

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index eba578a..7d62dd9 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -15,6 +15,7 @@
 #include "net.h"
 #include "net/checksum.h"
 #include "net/tap.h"
+#include "net/socket.h"
 #include "qemu-timer.h"
 #include "virtio-net.h"
 
@@ -133,6 +134,9 @@ static int peer_has_vnet_hdr(VirtIONet *n)
     case NET_CLIENT_TYPE_TAP:
         n->has_vnet_hdr = tap_has_vnet_hdr(n->nic->nc.peer);
         break;
+    case NET_CLIENT_TYPE_SOCKET_RAW:
+        n->has_vnet_hdr = sock_raw_has_vnet_hdr(n->nic->nc.peer);
+        break;
     default:
         return 0;
     }
@@ -149,6 +153,9 @@ static int peer_has_ufo(VirtIONet *n)
     case NET_CLIENT_TYPE_TAP:
         n->has_ufo = tap_has_ufo(n->nic->nc.peer);
         break;
+    case NET_CLIENT_TYPE_SOCKET_RAW:
+        n->has_ufo = sock_raw_has_ufo(n->nic->nc.peer);
+        break;
     default:
         return 0;
     }
@@ -165,6 +172,9 @@ static void peer_using_vnet_hdr(VirtIONet *n, int using_vnet_hdr)
     case NET_CLIENT_TYPE_TAP:
         tap_using_vnet_hdr(n->nic->nc.peer, using_vnet_hdr);
         break;
+    case NET_CLIENT_TYPE_SOCKET_RAW:
+        sock_raw_using_vnet_hdr(n->nic->nc.peer, using_vnet_hdr);
+        break;
     default:
         break;
     }
@@ -180,6 +190,9 @@ static void peer_set_offload(VirtIONet *n, int csum, int tso4, int tso6,
     case NET_CLIENT_TYPE_TAP:
         tap_set_offload(n->nic->nc.peer, csum, tso4, tso6, ecn, ufo);
         break;
+    case NET_CLIENT_TYPE_SOCKET_RAW:
+        sock_raw_set_offload(n->nic->nc.peer, csum, tso4, tso6, ecn, ufo);
+        break;
     default:
         break;
     }
diff --git a/net.c b/net.c
index 6ef93e6..3d25d64 100644
--- a/net.c
+++ b/net.c
@@ -1002,6 +1002,11 @@ static struct {
                 .type = QEMU_OPT_STRING,
                 .help = "UDP multicast address and port number",
             },
+            {
+                .name = "ifname",
+                .type = QEMU_OPT_STRING,
+                .help = "interface name",
+            },
 { /* end of list */ }
 },
 #ifdef CONFIG_VDE
diff --git a/net.h b/net.h
index 116bb80..74b3e69 100644
--- a/net.h
+++ b/net.h
@@ -34,7 +34,8 @@ typedef enum {
 NET_CLIENT_TYPE_TAP,
 NET_CLIENT_TYPE_SOCKET,
 NET_CLIENT_TYPE_VDE,
-NET_CLIENT_TYPE_DUMP
+NET_CLIENT_TYPE_DUMP,
+NET_CLIENT_TYPE_SOCKET_RAW,
 } net_client_type;
 
 typedef void (NetPoll)(VLANClientState *, bool enable);
diff --git a/net/socket.c b/net/socket.c
index 5533737..56f5bad 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -32,6 +32,327 @@
 #include "qemu_socket.h"
 #include "sysemu.h"
 
+#include <netpacket/packet.h>
+#include <net/ethernet.h>
+#include <net/if.h>
+#include <sys/ioctl.h>
+
+/* Maximum GSO packet size (64k) plus plenty of room for
+ * the ethernet and virtio_net headers
+ */
+#define

[Qemu-devel] Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu

2010-01-26 Thread Sridhar Samudrala
On Tue, 2010-01-26 at 14:47 -0600, Anthony Liguori wrote:
> On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> > This patch adds raw socket backend to qemu and is based on Or Gerlitz's
> > patch re-factored and ported to the latest qemu-kvm git tree.
> > It also includes support for vnet_hdr option that enables gso/checksum
> > offload with raw backend. You can find the linux kernel patch to support
> > this feature here.
> > http://thread.gmane.org/gmane.linux.network/150308
> >
> > Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>
>
> See the previous discussion about the raw backend from Or's original
> patch.  There's no obvious reason why we should have this in addition to
> a tun/tap backend.
>
> The only use-case I know of is macvlan but macvtap addresses this
> functionality while not introducing the rather nasty security problems
> associated with a raw backend.

The raw backend can be attached to a physical device, a macvlan or an SR-IOV VF.
I don't think an AF_PACKET socket itself introduces any security problems. The
raw socket can be created only by a user with the CAP_NET_RAW capability. The
only issue is whether we need to assume that qemu itself is an untrusted
process, so that a raw fd cannot be passed to it.
But I think it is a useful backend to support in qemu, as it provides guest to
remote host connectivity without the need for a bridge/tap.

macvtap could be an alternative if it supports binding to SR-IOV VFs too.

Thanks
Sridhar