Re: [PATCH v2 1/5] virtio: add functions for piecewise addition of buffers
On 01/02/2013 01:03 PM, Rusty Russell wrote:
> Paolo Bonzini <pbonz...@redhat.com> writes:
>> The virtqueue_add_buf function has two limitations:
>>
>> 1) it requires the caller to provide all the buffers in a single call;
>>
>> 2) it does not support chained scatterlists: the buffers must be
>>    provided as an array of struct scatterlist;
>
> Chained scatterlists are a horrible interface, but that doesn't mean we
> shouldn't support them if there's a need.
>
> I think I once even had a patch which passed two chained sgs, rather
> than a combo sg and two length numbers. It's very old, but I've pasted
> it below.
>
> Duplicating the implementation by having another interface is pretty
> nasty; I think I'd prefer the chained scatterlists, if that's optimal
> for you.

I rebased against virtio-next and used it in virtio-scsi, and tested it
with 4 virtio-scsi target devices and host cpu idle=poll.
Saw a little performance regression here.

General:
Run status group 0 (all jobs):
   READ: io=34675MB, aggrb=248257KB/s, minb=248257KB/s, maxb=248257KB/s, mint=143025msec, maxt=143025msec
  WRITE: io=34625MB, aggrb=247902KB/s, minb=247902KB/s, maxb=247902KB/s, mint=143025msec, maxt=143025msec

Chained:
Run status group 0 (all jobs):
   READ: io=34863MB, aggrb=242320KB/s, minb=242320KB/s, maxb=242320KB/s, mint=147325msec, maxt=147325msec
  WRITE: io=34437MB, aggrb=239357KB/s, minb=239357KB/s, maxb=239357KB/s, mint=147325msec, maxt=147325msec

Thanks,
Wanlong Gao

From d3181b3f9bbdebbd3f2928b64821b406774757f8 Mon Sep 17 00:00:00 2001
From: Rusty Russell <ru...@rustcorp.com.au>
Date: Wed, 2 Jan 2013 16:43:49 +0800
Subject: [PATCH] virtio: use chained scatterlists

Rather than handing a scatterlist[] and out and in numbers to
virtqueue_add_buf(), hand two separate ones which can be chained.

I shall refrain from ranting about what a disgusting hack chained
scatterlists are. I'll just note that this doesn't make things simpler
(see diff).
The scatterlists we use can be too large for the stack, so we put them
in our device struct and reuse them. But in many cases we don't want to
pay the cost of sg_init_table() as we don't know how many elements we'll
have, and we'd have to initialize the entire table.

This means we have two choices: carefully reset the end markers after we
call virtqueue_add_buf(), which we do in virtio_net for the xmit path
where it's easy and we want to be optimal. Elsewhere we implement a
helper to unset the end markers after we've filled the array.

Signed-off-by: Rusty Russell <ru...@rustcorp.com.au>
Signed-off-by: Wanlong Gao <gaowanl...@cn.fujitsu.com>
---
 drivers/block/virtio_blk.c          | 57 ++---
 drivers/char/hw_random/virtio-rng.c |  2 +-
 drivers/char/virtio_console.c       |  6 ++--
 drivers/net/virtio_net.c            | 68 ++-
 drivers/scsi/virtio_scsi.c          | 18 ++
 drivers/virtio/virtio_balloon.c     |  6 ++--
 drivers/virtio/virtio_ring.c        | 71 +++--
 include/linux/virtio.h              | 14 ++--
 net/9p/trans_virtio.c               | 31 +---
 9 files changed, 172 insertions(+), 101 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 0bdde8f..17cf0b7 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -102,8 +102,8 @@ static inline struct virtblk_req *virtblk_alloc_req(struct virtio_blk *vblk,
 
 static void virtblk_add_buf_wait(struct virtio_blk *vblk,
 				 struct virtblk_req *vbr,
-				 unsigned long out,
-				 unsigned long in)
+				 struct scatterlist *out,
+				 struct scatterlist *in)
 {
 	DEFINE_WAIT(wait);
 
@@ -112,7 +112,7 @@ static void virtblk_add_buf_wait(struct virtio_blk *vblk,
 				TASK_UNINTERRUPTIBLE);
 
 		spin_lock_irq(vblk->disk->queue->queue_lock);
-		if (virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+		if (virtqueue_add_buf(vblk->vq, out, in, vbr,
 				      GFP_ATOMIC) < 0) {
 			spin_unlock_irq(vblk->disk->queue->queue_lock);
 			io_schedule();
@@ -128,12 +128,13 @@ static void virtblk_add_buf_wait(struct virtio_blk *vblk,
 }
 
 static inline void virtblk_add_req(struct virtblk_req *vbr,
-				   unsigned int out, unsigned int in)
+				   struct scatterlist *out,
+				   struct scatterlist *in)
 {
 	struct virtio_blk *vblk = vbr->vblk;
 
 	spin_lock_irq(vblk->disk->queue->queue_lock);
-	if (unlikely(virtqueue_add_buf(vblk->vq, vbr->sg, out, in, vbr,
+	if (unlikely(virtqueue_add_buf(vblk->vq, out, in, vbr,
 				       GFP_ATOMIC) < 0)) {
 		spin_unlock_irq(vblk->disk->queue->queue_lock);
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 02.01.13 at 12:26, Andrew Cooper <andrew.coop...@citrix.com> wrote:
> On 27/12/12 18:02, Eric W. Biederman wrote:
>> It probably makes sense to split them apart a little even.
>
> Thinking about this split, there might be a way to simplify it even
> more. /sbin/kexec can load the Xen crash kernel itself by issuing
> hypercalls using /dev/xen/privcmd. This would remove the need for the
> dom0 kernel to distinguish between loading a crash kernel for itself
> and loading a kernel for Xen.
>
> Or is this just a silly idea complicating the matter?

I don't think so (and suggested that before as a response to an earlier
submission of this patch set), and it would make most of the discussion
here moot.

Jan

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 02/11] x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
On 27.12.12 at 03:18, Daniel Kiper <daniel.ki...@oracle.com> wrote:
> Some implementations (e.g. Xen PVOPS) cannot use part of the identity
> page table to construct the transition page table. It means that they
> require separate PUDs, PMDs and PTEs for the virtual and physical
> (identity) mappings. To satisfy that requirement, add extra pointers
> to PGD, PUD, PMD and PTE and align the existing code.

So you keep posting this despite it having been pointed out on each
earlier submission that this is unnecessary, proven by the fact that
the non-pvops Xen kernels can get away without it. Why?

Jan
Re: [PATCH] virtio_console: correct error message on failure of debugfs_create_dir
On Friday 21 December 2012, Amit Shah wrote:
> On (Thu) 20 Dec 2012 [14:11:21], Sasha Levin wrote:
>> debugfs_create_dir() returns NULL if it fails; there's little point
>> in calling PTR_ERR on it.
>
> debugfs_create_dir() does return an error value if debugfs is not
> enabled. This check for !pdrvdata.debugfs_dir should in fact use
> IS_ERR_OR_NULL(). Care to submit a patch for that?

How about we fix the stub instead to return NULL when debugfs is
disabled?

	Arnd
Re: [RFC PATCH] virtio-net: reset virtqueue affinity when doing cpu hotplug
On Wed, 2012-12-26 at 15:06 +0800, Wanlong Gao wrote:
> Add a cpu notifier to virtio-net, so that we can reset the virtqueue
> affinity if a cpu hotplug happens. This improves performance by
> enabling or disabling the virtqueue affinity after a cpu hotplug.
>
> Cc: Rusty Russell <ru...@rustcorp.com.au>
> Cc: Michael S. Tsirkin <m...@redhat.com>
> Cc: Jason Wang <jasow...@redhat.com>
> Cc: virtualization@lists.linux-foundation.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Wanlong Gao <gaowanl...@cn.fujitsu.com>
> ---
>  drivers/net/virtio_net.c | 39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a6fcf15..9710cf4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -26,6 +26,7 @@
>  #include <linux/scatterlist.h>
>  #include <linux/if_vlan.h>
>  #include <linux/slab.h>
> +#include <linux/cpu.h>
>  
>  static int napi_weight = 128;
>  module_param(napi_weight, int, 0444);
> @@ -34,6 +35,8 @@ static bool csum = true, gso = true;
>  module_param(csum, bool, 0444);
>  module_param(gso, bool, 0444);
>  
> +static bool cpu_hotplug = false;
> +
>  /* FIXME: MTU in config. */
>  #define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>  #define GOOD_COPY_LEN	128
> @@ -1041,6 +1044,26 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>  	vi->affinity_hint_set = false;
>  }
>  
> +static int virtnet_cpu_callback(struct notifier_block *nfb,
> +				unsigned long action, void *hcpu)
> +{
> +	switch (action) {
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +	case CPU_DEAD:
> +	case CPU_DEAD_FROZEN:
> +		cpu_hotplug = true;
> +		break;
> +	default:
> +		break;
> +	}
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block virtnet_cpu_notifier = {
> +	.notifier_call = virtnet_cpu_callback,
> +};
> +
>  static void virtnet_get_ringparam(struct net_device *dev,
>  				  struct ethtool_ringparam *ring)
>  {
> @@ -1131,7 +1154,14 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>   */
>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>  {
> -	int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
> +	int txq;
> +
> +	if (unlikely(cpu_hotplug == true)) {
> +		virtnet_set_affinity(netdev_priv(dev), true);
> +		cpu_hotplug = false;
> +	}
> +
> +	txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>  		  smp_processor_id();
>  
>  	while (unlikely(txq >= dev->real_num_tx_queues))
> @@ -1248,6 +1278,8 @@ static void virtnet_del_vqs(struct virtnet_info *vi)
>  {
>  	struct virtio_device *vdev = vi->vdev;
>  
> +	unregister_hotcpu_notifier(&virtnet_cpu_notifier);
> +
>  	virtnet_set_affinity(vi, false);
>  
>  	vdev->config->del_vqs(vdev);
> @@ -1372,6 +1404,11 @@ static int init_vqs(struct virtnet_info *vi)
>  		goto err_free;
>  
>  	virtnet_set_affinity(vi, true);
> +
> +	ret = register_hotcpu_notifier(&virtnet_cpu_notifier);
> +	if (ret)
> +		goto err_free;
> +
>  	return 0;
>  
>  err_free:

It looks like this patch assumes virtio_net supports a single instance.
Try your patch with two instances; I am pretty sure it won't do very
well. It seems to me you need something else than a single boolean. A
sequence number, for example, should be better...
Re: [PATCH v3 02/11] x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
On Thu, Dec 27, 2012 at 03:19:24PM -0800, Daniel Kiper wrote:
>> Hmm... this code is being redone at the moment... this might conflict.
>
> Is this available somewhere? May I have a look at it?

http://marc.info/?l=linux-kernel&m=135581534620383

The for-x86-boot-v7 and -v8 branches. HTH.

--
Regards/Gruss,
Boris.