Re: AIO requests may be disordered by Qemu-kvm iothread with disk cache=writethrough, Bug or Feature?
On Thu, 10/08 19:59, charlie.song wrote:
> Dear KVM Developers:
>     I am Xiang Song from UCloud company. We currently encounter a weird
> phenomenon with the Qemu-KVM IOthread. We recently tried to use Linux AIO
> from the guest OS and found that the IOthread mechanism of Qemu-KVM will
> reorder I/O requests from the guest OS even when the AIO write requests are
> issued from a single thread in order. This does not happen on the host OS,
> however. We are not sure whether this is a feature of the Qemu-KVM IOthread
> mechanism or a bug.
>
> The testbed is as follows (the guest disk device cache is configured to
> writethrough):
> CPU: Intel(R) Xeon(R) CPU E5-2650
> QEMU version: 1.5.3
> Host/Guest Kernel: Both Linux 4.1.8 & Linux 2.6.32, OS type CentOS 6.5
> Simplified guest OS qemu cmd:
> /usr/libexec/qemu-kvm -machine rhel6.3.0,accel=kvm,usb=off -cpu kvm64 -smp 8,sockets=8,cores=1,threads=1
> -drive file=/var/lib/libvirt/images/song-disk.img,if=none,id=drive-virtio-disk0,format=qcow2,serial=UCLOUD_DISK_VDA,cache=writethrough
> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:22:d5:52,bus=pci.0,addr=0x4

You mentioned iothread above but it's not in your command line?

> The test code triggering this phenomenon works as follows: it uses the
> Linux AIO API to issue concurrent async write requests to a file. During
> execution it continuously writes data into the target test file. There are
> 'X' jobs in total, and each job is assigned a job id JOB_ID which starts
> from 0. Each job writes 16 * 512 bytes of data into the target file at
> offset = JOB_ID * 512 (the data is the repeated uint64_t JOB_ID). A single
> thread handles the 'X' jobs one by one through Linux AIO (io_submit),
> continuously issuing AIO requests without waiting for AIO callbacks. When
> it finishes, the file should look like:
> [0...0][1...1][2...2][3...3]...[X-1...X-1]
> Then we use a check program to test the resulting file: it reads the first
> 8 bytes (uint64_t) of each sector and prints them out. In normal cases, its
> output is like:
> 0 1 2 3 ... X-1
>
> Exec output (Set X=32):
> In our guest OS, the output is abnormal: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
> 15 16 17 18 18 18 18 18 18 24 25 26 27 28 29 30 31.
> It can be seen that job20~job24 are overwritten by job19.
> In our host OS, the output is as expected: 0 1 2 3 4 5 6 7 8 9 10 11 12 13
> 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31.

I'm not 100% sure but I don't think the returning of io_submit guarantees any
ordering; usually you need to wait for the callback to ensure that.

Fam

> I can provide the example code if needed.
>
> Best regards, song
>
> 2015-10-08
>
> charlie.song
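For reference, a minimal sketch in C of the kind of test program described
above, using the libaio userspace wrappers. The file name, job count and
error handling are illustrative assumptions, not the original reproducer:

#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <err.h>

#define X 32
#define SECTOR 512

int main(void)
{
    io_context_t ctx = 0;
    struct iocb iocbs[X], *iocbp[X];
    /* O_DIRECT so the writes actually reach the virtual disk */
    int fd = open("testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);

    if (fd < 0 || io_setup(X, &ctx) < 0)
        err(1, "setup");
    for (int i = 0; i < X; i++) {
        uint64_t *buf;
        if (posix_memalign((void **)&buf, SECTOR, 16 * SECTOR))
            err(1, "memalign");
        for (int j = 0; j < 16 * SECTOR / 8; j++)
            buf[j] = i;                      /* repeated JOB_ID */
        /* each job overlaps the sectors written by the previous job */
        io_prep_pwrite(&iocbs[i], fd, buf, 16 * SECTOR, (long long)i * SECTOR);
        iocbp[i] = &iocbs[i];
    }
    /* issue all jobs in order, without waiting for completions */
    for (int i = 0; i < X; i++)
        if (io_submit(ctx, 1, &iocbp[i]) != 1)
            err(1, "io_submit");
    /* a correct program would reap completions with io_getevents() here */
    return 0;
}

Compile with -laio. Note that nothing in this sketch orders one overlapping
write against the next: completions are never reaped before the following
io_submit, which is exactly the pattern under discussion.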
Re: AIO requests may be disordered by Qemu-kvm iothread with disk cache=writethrough, Bug or Feature?
On Fri, 10/09 11:25, charlie.song wrote:
> At 2015-10-08 23:37:02, "Fam Zheng" <f...@redhat.com> wrote:
> >On Thu, 10/08 19:59, charlie.song wrote:
> >> Dear KVM Developers:
> >>     I am Xiang Song from UCloud company. We currently encounter a weird
> >> phenomenon with the Qemu-KVM IOthread. We recently tried to use Linux
> >> AIO from the guest OS and found that the IOthread mechanism of Qemu-KVM
> >> will reorder I/O requests from the guest OS even when the AIO write
> >> requests are issued from a single thread in order. This does not happen
> >> on the host OS, however. We are not sure whether this is a feature of
> >> the Qemu-KVM IOthread mechanism or a bug.
> >>
> >> The testbed is as follows (the guest disk device cache is configured to
> >> writethrough):
> >> CPU: Intel(R) Xeon(R) CPU E5-2650
> >> QEMU version: 1.5.3
> >> Host/Guest Kernel: Both Linux 4.1.8 & Linux 2.6.32, OS type CentOS 6.5
> >> Simplified guest OS qemu cmd:
> >> /usr/libexec/qemu-kvm -machine rhel6.3.0,accel=kvm,usb=off -cpu kvm64 -smp 8,sockets=8,cores=1,threads=1
> >> -drive file=/var/lib/libvirt/images/song-disk.img,if=none,id=drive-virtio-disk0,format=qcow2,serial=UCLOUD_DISK_VDA,cache=writethrough
> >> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:22:d5:52,bus=pci.0,addr=0x4
> >
> >You mentioned iothread above but it's not in your command line?
>
> I mean the thread pool mechanism used by qemu-kvm to accelerate I/O
> processing. This is used by paio_submit (block/raw-posix.c) by default,
> with pool->max_threads = 64 as far as I know. (qemu-kvm version 1.5.3)

The thread pool parallelism may reorder non-overlapping requests, but it
shouldn't cause any reordering of overlapping requests like the case in your
IO pattern. QEMU ensures that. Do you see this with aio=native?

Fam

> >> The test code triggering this phenomenon works as follows: it uses the
> >> Linux AIO API to issue concurrent async write requests to a file. During
> >> execution it continuously writes data into the target test file. There
> >> are 'X' jobs in total, and each job is assigned a job id JOB_ID which
> >> starts from 0. Each job writes 16 * 512 bytes of data into the target
> >> file at offset = JOB_ID * 512 (the data is the repeated uint64_t
> >> JOB_ID). A single thread handles the 'X' jobs one by one through Linux
> >> AIO (io_submit), continuously issuing AIO requests without waiting for
> >> AIO callbacks. When it finishes, the file should look like:
> >> [0...0][1...1][2...2][3...3]...[X-1...X-1]
> >> Then we use a check program to test the resulting file: it reads the
> >> first 8 bytes (uint64_t) of each sector and prints them out. In normal
> >> cases, its output is like:
> >> 0 1 2 3 ... X-1
> >>
> >> Exec output (Set X=32):
> >> In our guest OS, the output is abnormal: 0 1 2 3 4 5 6 7 8 9 10 11 12 13
> >> 14 15 16 17 18 18 18 18 18 18 24 25 26 27 28 29 30 31.
> >> It can be seen that job20~job24 are overwritten by job19.
> >> In our host OS, the output is as expected: 0 1 2 3 4 5 6 7 8 9 10 11 12
> >> 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31.
> >
> >I'm not 100% sure but I don't think the returning of io_submit guarantees
> >any ordering, usually you need to wait for the callback to ensure that.
>
> Is there any proof or article about the ordering of io_submit requests?
>
> >Fam
> >
> >> I can provide the example code if needed.
> >>
> >> Best regards, song
> >>
> >> 2015-10-08
> >>
> >> charlie.song
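To illustrate the point about callbacks: with Linux AIO, the completion event
is the only reliable ordering barrier, so an application that needs
write-after-write ordering between overlapping requests must reap the first
completion before submitting the second. A hedged fragment (ctx, first and
second are assumed to be set up as in the sketch earlier in this thread):

    struct io_event ev;

    if (io_submit(ctx, 1, &first) != 1)
        err(1, "io_submit");
    /* block until the first write has actually completed... */
    if (io_getevents(ctx, 1, 1, &ev, NULL) != 1)
        err(1, "io_getevents");
    /* ...only now is it safe to submit an overlapping write */
    if (io_submit(ctx, 1, &second) != 1)
        err(1, "io_submit");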
Re: [RFC PATCH] os-android: Add support to android platform, built by ndk-r10
On Tue, 09/15 10:11, Houcheng Lin wrote:
> From: Houcheng

Thanks for sending patches! Please include the qemu-de...@nongnu.org list for
QEMU changes.

Fam

> This patch is to build qemu with the android ndk tool-chain, and has been
> tested on both x86_64 and x86 android platforms with hardware
> virtualization enabled. The patch is composed of three parts:
>
> - configure scripts for android
> - OS dependent code for android
> - kernel headers for android
>
> The configure scripts add cross-compile options for android, define compile
> flags and link flags, and call an android-specific pkg-config. A pseudo
> pkg-config script is added to report the correct compile flags for the
> android system.
>
> The OS dependent code for android implements functions missing in bionic
> C, including:
> - getdtablesize(): call getrlimit() instead.
> - sigtimedwait(): call __rt_sigtimedwait() instead.
> - a_ptsname(): call ptsname_r() instead.
> - lockf(): this feature is absent in android, directly return -1.
> - shm_open(): this feature is absent in android, directly return -1.
>
> The kernel headers for android add two kernel headers missing in android:
> scsi/sg.h and sys/io.h.
>
> How to build the android version:
>
> 1. download the ndk toolchain r10, build the following libraries and
>    install them in your toolchain sysroot:
>     libiconv-1.14
>     gettext-0.19
>     libffi-3.0.12
>     glib-2.34.3
>     libpng-1.2.52
>     pixman-0.30
>
> 2. configure qemu and build:
>
> % export SYSROOT="/opt/android-toolchain64/sysroot"
> % CFLAGS=" --sysroot=$SYSROOT -I$SYSROOT/usr/include -I$SYSROOT/usr/include/pixman-1/" \
>   ./configure --prefix="${SYSROOT}/usr" \
>     --cross-prefix=x86_64-linux-android- \
>     --enable-kvm --enable-trace-backend=nop --disable-fdt --target-list=x86_64-softmmu \
>     --disable-spice --disable-vhost-net --disable-libiscsi --audio-drv-list="" --disable-gtk \
>     --disable-gnutls --disable-libnfs --disable-glusterfs --disable-libssh2 --disable-seccomp \
>     --disable-usb-redir --disable-libusb
> % make -j4
>
> Or, configure a statically linked version:
>
> % export SYSROOT="/opt/android-toolchain64/sysroot"
> % CFLAGS=" --sysroot=$SYSROOT -I$SYSROOT/usr/include -I$SYSROOT/usr/include/pixman-1/" \
>   ./configure --prefix="${SYSROOT}/usr" \
>     --cross-prefix=x86_64-linux-android- \
>     --enable-kvm --enable-trace-backend=nop --disable-fdt --target-list=x86_64-softmmu \
>     --disable-spice --disable-vhost-net --disable-libiscsi --audio-drv-list="" --disable-gtk \
>     --disable-gnutls --disable-libnfs --disable-glusterfs --disable-libssh2 --disable-seccomp \
>     --disable-usb-redir --disable-libusb --static
> % make -j4
>
> Signed-off-by: Houcheng
> ---
>  configure                   |  30 -
>  include/android/scsi/sg.h   | 307 +++
>  include/qemu/osdep.h        |   4 +
>  include/sysemu/os-android.h |  35 +
>  kvm-all.c                   |   3 +
>  scripts/android-pkg-config  |  28 
>  tests/Makefile              |   2 +
>  util/osdep.c                |  53 
>  util/qemu-openpty.c         |  10 +-
>  9 files changed, 467 insertions(+), 5 deletions(-)
>  create mode 100644 include/android/scsi/sg.h
>  create mode 100644 include/android/sys/io.h
>  create mode 100644 include/sysemu/os-android.h
>  create mode 100755 scripts/android-pkg-config
>
> diff --git a/configure b/configure
> index 5c06663..3ff6ffa 100755
> --- a/configure
> +++ b/configure
> @@ -566,7 +566,6 @@ fi
>  
>  # host *BSD for user mode
>  HOST_VARIANT_DIR=""
> -
>  case $targetos in
>  CYGWIN*)
>    mingw32="yes"
> @@ -692,9 +691,23 @@ Haiku)
>    vhost_net="yes"
>    vhost_scsi="yes"
>    QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
> +  case $cross_prefix in
> +    *android*)
> +      android="yes"
> +      guest_agent="no"
> +      QEMU_INCLUDES="-I\$(SRC_PATH)/include/android $QEMU_INCLUDES"
> +    ;;
> +    *)
> +    ;;
> +  esac
>  ;;
>  esac
>  
> +if [ "$android" = "yes" ] ; then
> +  QEMU_CFLAGS="-DANDROID $QEMU_CFLAGS"
> +  LIBS="-lglib-2.0 -lgthread-2.0 -lz -lpixman-1 -lintl -liconv -lc $LIBS"
> +  libs_qga="-lglib-2.0 -lgthread-2.0 -lz -lpixman-1 -lintl -liconv -lc"
> +fi
>  if [ "$bsd" = "yes" ] ; then
>    if [ "$darwin" != "yes" ] ; then
>      bsd_user="yes"
> @@ -1736,7 +1749,14 @@ fi
>  # pkg-config probe
>  
>  if ! has "$pkg_config_exe"; then
> -  error_exit "pkg-config binary '$pkg_config_exe' not found"
> +  case $cross_prefix in
> +    *android*)
> +      pkg_config_exe=scripts/android-pkg-config
> +      ;;
> +    *)
> +      error_exit "pkg-config binary '$pkg_config_exe' not found"
> +      ;;
> +  esac
>  fi
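To make the bionic shims concrete, here is a hedged sketch of how the
getdtablesize() replacement described above could look. It illustrates the
getrlimit() approach only; it is not the patch's actual util/osdep.c code:

#include <sys/resource.h>

/* bionic has no getdtablesize(); derive the fd table size from the
 * RLIMIT_NOFILE soft limit instead. */
int getdtablesize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    return rl.rlim_cur;
}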
[PATCH v2] virtio-blk: Allow extended partitions
This will allow up to DISK_MAX_PARTS (256) partitions, with for example GPT
in the guest. Otherwise, the partition scan code will only discover the first
15 partitions.

Signed-off-by: Fam Zheng <f...@redhat.com>
---
 drivers/block/virtio_blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index d4d05f0..38ea01b 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -657,6 +657,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	vblk->disk->private_data = vblk;
 	vblk->disk->fops = &virtblk_fops;
 	vblk->disk->driverfs_dev = &vdev->dev;
+	vblk->disk->flags |= GENHD_FL_EXT_DEVT;
 	vblk->index = index;
 
 	/* configure queue flush support */
-- 
2.4.3
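A quick way to exercise this is to hand the guest a GPT disk with more than
15 partitions and check that partition 16 and up appear. A hedged test sketch
(guest-side commands; the scratch disk name /dev/vdb and the sizes are
invented for illustration):

    parted -s /dev/vdb mklabel gpt
    for i in $(seq 1 20); do
        parted -s /dev/vdb mkpart p$i ${i}MiB $((i+1))MiB
    done
    partprobe /dev/vdb && ls /dev/vdb*

Without GENHD_FL_EXT_DEVT only vdb1..vdb15 would show up after the rescan.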
Re: Virtio IRQ problem
On Mon, 07/13 10:41, Pavel Davydov wrote:
> The proprietary system doesn't have a virtio driver, that is the problem.
> What are the steps that Linux performs to work correctly with virtio
> devices?

It's done with a specific driver that talks to the virtio interface:

http://github.com/torvalds/linux/blob/master/drivers/net/virtio_net.c

It goes through regular PCI device discovery, and sets up ring buffers for rx
and tx, together with irq handlers etc., according to the device
specification. The device operations are documented at:

http://docs.oasis-open.org/virtio/virtio/v1.0/csprd01/virtio-v1.0-csprd01.html#x1-870001

Thanks,
Fam

> Thank you, Pavel
>
> On 07/12/2015 05:31 PM, Fam Zheng wrote:
> > On Fri, 07/10 22:34, Pavel Davydov wrote:
> > > Hello, I've got the following problem with KVM: I'm running a
> > > proprietary OS under KVM; the OS is neither Linux/Unix, nor Windows,
> > > and I don't get any IRQs from the Virtio device. The Virtio device in
> > > the system I'm running is a PCI Ethernet device. The device is detected
> > > correctly, the IRQ line is also identified, and it matches the IRQ line
> > > number retrieved by running the lspci command when Linux is run instead
> > > of the proprietary system. When Linux is run on exactly the same
> > > virtual machine configuration (with the Virtio device), the Virtio
> > > device is detected and works correctly, and there is traffic between
> > > the host and the guest.
> >
> > Linux does more than detection: you need to have drivers running in the
> > guest for virtio devices. Does your proprietary system have virtio device
> > drivers?
> >
> > Fam
> >
> > > When the only change in the configuration is that the type of the
> > > virtual network interface is changed from Virtio to E1000, in both OSes
> > > (the proprietary OS and Linux) the virtual network interface works
> > > correctly. The question is: what steps does the Virtio driver have to
> > > perform to enable the IRQs? (Obviously, the Linux driver for Virtio
> > > performs these steps; the IRQs are generated under the Linux VM.)
> > > Any hint would be appreciated.
> > >
> > > Thank you, Pavel
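To make that picture concrete, below is a heavily simplified, hedged skeleton
of a Linux virtio driver, showing where device discovery, virtqueue setup and
the IRQ-driven callback hook in. It is illustrative only; a real driver such
as virtio_net.c also negotiates features, sets up multiple queues and
registers a netdev:

#include <linux/module.h>
#include <linux/err.h>
#include <linux/virtio.h>
#include <linux/virtio_config.h>
#include <linux/virtio_ids.h>

static void demo_vq_done(struct virtqueue *vq)
{
    /* Runs when the device raises its interrupt (INTx or MSI-X) to
     * signal used buffers; drain them with virtqueue_get_buf() here. */
}

static int demo_probe(struct virtio_device *vdev)
{
    /* Find one virtqueue; the transport wires our callback to its IRQ. */
    struct virtqueue *vq = virtio_find_single_vq(vdev, demo_vq_done, "demo");

    if (IS_ERR(vq))
        return PTR_ERR(vq);
    virtio_device_ready(vdev);   /* tell the device the driver is set up */
    return 0;
}

static void demo_remove(struct virtio_device *vdev)
{
    vdev->config->reset(vdev);
    vdev->config->del_vqs(vdev);
}

static const struct virtio_device_id id_table[] = {
    { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
    { 0 },
};

static struct virtio_driver demo_driver = {
    .driver.name  = "virtio-demo",
    .driver.owner = THIS_MODULE,
    .id_table     = id_table,
    .probe        = demo_probe,
    .remove       = demo_remove,
};
module_virtio_driver(demo_driver);
MODULE_LICENSE("GPL");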
Re: Virtio IRQ problem
On Fri, 07/10 22:34, Pavel Davydov wrote:
> Hello, I've got the following problem with KVM: I'm running a proprietary
> OS under KVM; the OS is neither Linux/Unix, nor Windows, and I don't get
> any IRQs from the Virtio device. The Virtio device in the system I'm
> running is a PCI Ethernet device. The device is detected correctly, the IRQ
> line is also identified, and it matches the IRQ line number retrieved by
> running the lspci command when Linux is run instead of the proprietary
> system. When Linux is run on exactly the same virtual machine configuration
> (with the Virtio device), the Virtio device is detected and works
> correctly, and there is traffic between the host and the guest.

Linux does more than detection: you need to have drivers running in the guest
for virtio devices. Does your proprietary system have virtio device drivers?

Fam

> When the only change in the configuration is that the type of the virtual
> network interface is changed from Virtio to E1000, in both OSes (the
> proprietary OS and Linux) the virtual network interface works correctly.
> The question is: what steps does the Virtio driver have to perform to
> enable the IRQs? (Obviously, the Linux driver for Virtio performs these
> steps; the IRQs are generated under the Linux VM.) Any hint would be
> appreciated.
>
> Thank you, Pavel
Re: [PATCH target] vhost/scsi: vhost_skip_iovec_bytes() can be static
On Mon, 02/02 14:25, kbuild test robot wrote:
> drivers/vhost/scsi.c:1081:5: sparse: symbol 'vhost_skip_iovec_bytes' was
> not declared. Should it be static?
>
> Signed-off-by: Fengguang Wu <fengguang...@intel.com>
> ---
>  scsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> index d888bd9..8ac003f 100644
> --- a/drivers/vhost/scsi.c
> +++ b/drivers/vhost/scsi.c
> @@ -1078,7 +1078,7 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs,
>  		pr_err("Faulted on virtio_scsi_cmd_resp\n");
>  }
>  
> -int vhost_skip_iovec_bytes(size_t bytes, int max_niov,
> +static int vhost_skip_iovec_bytes(size_t bytes, int max_niov,
>  	struct iovec *iov_in, size_t off_in,
>  	struct iovec **iov_out, size_t *off_out)

Probably keep the parameter list lines aligned?

Fam
Re: How does kvm achieve an 'advanced' process separation
On Thu, 01/29 16:51, Kun Cheng wrote:
> Hey guys,

Hi!

> That might be a dumb question, but currently I find myself unable to
> clearly explain it to others. We all know how CPU and memory are
> virtualised, and how the memory address space is translated using the
> shadow page table or EPT, which creates for each VM an individual running
> space. However, as each VM process is essentially a Linux process, why are
> they unable to do IPC among themselves?
>
> I tried to answer that question, but I was not confident about my
> explanation. Here's what I thought about.
>
> First, a VMM like Xen or KVM supports virtualised OSes (paravirtualised or
> hardware-assisted virtualised). An OS provides an IPC mechanism but cannot
> itself use it to communicate with another OS. Although they run in guest
> machines which are essentially the host's processes, they still cannot do
> IPC with others.
>
> Second, each VM process runs on an individual virtualised platform; it's
> the only OS running dominantly on its own virtualised resources, so it's
> unable to be aware of others. (But as each VM process has its PID, their
> processes have the potential to do IPC if another one's PID is known?)
>
> Finally, the question can be described as: how does KVM enhance process
> isolation to prevent those VM processes from doing IPC with each other?

Unlike a normal process on the host OS, a VM doesn't have any access to host
OS resources, except those that are intentionally virtualized, such as CPU,
memory and IO devices: basically all of which have a behavior that resembles
real hardware. IPC, by contrast, is usually supported by an OS in the form of
system calls, which is a totally different category of resources or
functions, one that is not virtualized by the hypervisor, thus it is
essentially not exposed to the guest.

The hypervisor makes sure that the guest doesn't see the existence of the
host process where the guest lives at all - it doesn't need to know, nor
should it. In order to communicate with the outside, the guest has to use
only whatever is provided to it - specifically, IO devices, be it a
paravirtualized NIC or an emulated USB device.

> I also notice that KVM seems to benefit from cgroups; is that contributing
> to the isolation?

It's not fundamental to virtualization, although it could possibly be
utilized in some cases to enforce the isolation.

Hope that helps.

Fam

> I hope someone could give me a perfect answer. However, any useful reply
> is appreciated.
Re: Submit your Google Summer of Code project ideas and volunteer to mentor
On Fri, 01/23 17:21, Stefan Hajnoczi wrote:
> Dear libvirt, KVM, and QEMU contributors,
>
> The Google Summer of Code season begins soon and it's time to collect our
> thoughts for mentoring students this summer working full-time on libvirt,
> KVM, and QEMU.
>
> What is GSoC?
> Google Summer of Code 2015 (GSoC) funds students to work on open source
> projects for 12 weeks over the summer. Open source organizations apply to
> participate and those accepted receive funding for one or more students.
> We now need to collect a list of project ideas on our wiki. We also need
> mentors to volunteer.
> http://qemu-project.org/Google_Summer_of_Code_2015
>
> Project ideas
> Please post project ideas on the wiki page below. Project ideas should be
> suitable as a 12-week project that a student fluent in C/Python/etc can
> complete. No prior knowledge of QEMU/KVM/libvirt internals can be assumed.
> http://qemu-project.org/Google_Summer_of_Code_2015
>
> Mentors
> Please add your name to project ideas you are willing to mentor. In order
> to mentor you must be an established contributor (regularly contribute
> patches). You must be willing to spend about 5 hours per week from May 25
> to August 21.
> I have CCed the 8 most active committers since QEMU 2.1.0 as well as the
> previous libvirt and KVM mentors but everyone is invited.
>
> Official timeline:
> https://www.google-melange.com/gsoc/events/google/gsoc20145

s/20145/2015/

Thank you for organizing it!

Fam
Re: cannot receive qemu-dev/kvm-dev mails sent by myself
On Mon, 12/22 20:21, Zhang Haoyu wrote:
> On 2014/12/22 20:05, Paolo Bonzini wrote:
> > On 22/12/2014 12:40, Zhang Haoyu wrote:
> > > On 2014/12/22 17:54, Paolo Bonzini wrote:
> > > > On 22/12/2014 10:48, Zhang Haoyu wrote:
> > > > > Hi, I cannot receive qemu-dev/kvm-dev mails sent by myself, but
> > > > > mails from others can be received. Any help?
> > > >
> > > > For qemu-devel, you need to configure mailman to send messages even
> > > > if they are yours. For the kvm mailing list I'm not sure how it works
> > > > (I read it through GMANE).
> > >
> > > I didn't find the configuration; could you send out the configuration
> > > method or a link?
> >
> > https://lists.nongnu.org/mailman/options/qemu-devel
>
> Thanks, Paolo. I have already followed the default settings, and it didn't
> work, so forget it.
>
> Receive your own posts to the list?
> Ordinarily, you will get a copy of every message you post to the list. If
> you don't want to receive this copy, set this option to No.
> => Yes
>
> Avoid duplicate copies of messages?
> When you are listed explicitly in the To: or Cc: headers of a list message,
> you can opt to not receive another copy from the mailing list. Select Yes
> to avoid receiving copies from the mailing list; select No to receive
> copies. If the list has member personalized messages enabled, and you elect
> to receive copies, every copy will have a "X-Mailman-Copy: yes" header
> added to it.
> => No

So you're probably talking about the gmail feature that you don't get copies
of the messages sent by you. I don't use gmail to send emails on lists;
Stefan (Cc'ed) may know better.

Fam
Re: [Qemu-devel] [question] virtio-blk performance degradation happened with virtio-serial
On Mon, 09/22 21:23, Zhang Haoyu wrote:
> > Amit,
> >
> > It's related to the big number of ioeventfds used in virtio-serial-pci.
> > With virtio-serial-pci's ioeventfd=off, the performance is not affected
> > no matter if the guest initializes it or not.
> >
> > In my test, there are 12 fds to poll in qemu_poll_ns before loading the
> > guest virtio_console.ko, whereas 76 once virtio_console is modprobed.
> > Looks like the ppoll takes more time to poll more fds.
> >
> > Some trace data with systemtap:
> >
> > 12 fds:
> > time  rel_time  symbol
> > 15    (+1)      qemu_poll_ns  [enter]
> > 18    (+3)      qemu_poll_ns  [return]
> >
> > 76 fds:
> > 12    (+2)      qemu_poll_ns  [enter]
> > 18    (+6)      qemu_poll_ns  [return]
> >
> > I haven't looked at the virtio-serial code; I'm not sure if we should
> > reduce the number of ioeventfds in virtio-serial-pci or focus on lower
> > level efficiency.
>
> Does ioeventfd=off hamper the performance of virtio-serial?

In theory it has an impact, but I have no data about this. If you have a
performance demand, it's best to try it against your use case to answer this
question.

Fam
Re: [Qemu-devel] [question] virtio-blk performance degradation happened with virtio-serial
On Tue, 09/02 12:06, Amit Shah wrote:
> On (Mon) 01 Sep 2014 [20:52:46], Zhang Haoyu wrote:
> > > > Hi, all
> > > > I start a VM with virtio-serial (default ports number: 31), and found
> > > > that virtio-blk performance degradation happened, about 25%. This
> > > > problem can be reproduced 100%.
> > > > Without virtio-serial: 4k-read-random 1186 IOPS
> > > > With virtio-serial: 4k-read-random 871 IOPS
> > > > But if I use the max_ports=2 option to limit the max number of
> > > > virtio-serial ports, then the IO performance degradation is not so
> > > > serious, about 5%. And ide performance degradation does not happen
> > > > with virtio-serial.
> > >
> > > Pretty sure it's related to MSI vectors in use. It's possible that the
> > > virtio-serial device takes up all the available vectors in the guests,
> > > leaving old-style irqs for the virtio-blk device.
> >
> > I don't think so. I use iometer to test the 64k-read(or write)-sequence
> > case: if I disable the virtio-serial dynamically via device manager ->
> > virtio-serial => disable, then the performance gets a promotion of about
> > 25% immediately; then I re-enable the virtio-serial via device manager ->
> > virtio-serial => enable, and the performance drops back again. Very
> > obvious.
> > To add: although the virtio-serial is enabled, I don't use it at all; the
> > degradation still happens.
>
> Using the vectors= option as mentioned below, you can restrict the number
> of MSI vectors the virtio-serial device gets. You can then confirm whether
> it's MSI that's related to these issues.

Amit,

It's related to the big number of ioeventfds used in virtio-serial-pci. With
virtio-serial-pci's ioeventfd=off, the performance is not affected no matter
if the guest initializes it or not.

In my test, there are 12 fds to poll in qemu_poll_ns before loading the guest
virtio_console.ko, whereas 76 once virtio_console is modprobed. Looks like
the ppoll takes more time to poll more fds.

Some trace data with systemtap:

12 fds:
time  rel_time  symbol
15    (+1)      qemu_poll_ns  [enter]
18    (+3)      qemu_poll_ns  [return]

76 fds:
12    (+2)      qemu_poll_ns  [enter]
18    (+6)      qemu_poll_ns  [return]

I haven't looked at the virtio-serial code; I'm not sure if we should reduce
the number of ioeventfds in virtio-serial-pci or focus on lower level
efficiency. Haven't compared with g_poll, but I think the underlying syscall
should be the same.

Any ideas?

Fam

> > So, I think it has no business with legacy interrupt mode, right?
> > I am going to observe the difference of perf top data on qemu and perf
> > kvm stat data when disabling/enabling virtio-serial in the guest, and the
> > difference of perf top data on the guest when disabling/enabling
> > virtio-serial in the guest. Any ideas?
> >
> > Thanks, Zhang Haoyu
>
> If you restrict the number of vectors the virtio-serial device gets (using
> the -device virtio-serial-pci,vectors= param), does that make things better
> for you?
>
> Amit
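For what it's worth, the ppoll() scaling effect can be reproduced outside
QEMU. Below is a hedged, self-contained microbenchmark sketch (not QEMU
code): it times ppoll() over N idle eventfds, which is roughly what the QEMU
main loop pays per iteration; the fd counts and iteration count are arbitrary
choices for illustration:

#define _GNU_SOURCE
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <time.h>

int main(int argc, char **argv)
{
    int n = argc > 1 ? atoi(argv[1]) : 76;      /* fd count: try 12 vs 76 */
    struct pollfd *fds = calloc(n, sizeof(*fds));
    struct timespec t0, t1, timeout = { 0, 0 }; /* non-blocking poll */

    for (int i = 0; i < n; i++) {
        fds[i].fd = eventfd(0, 0);              /* idle fd, never signaled */
        fds[i].events = POLLIN;
    }
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 100000; i++)
        ppoll(fds, n, &timeout, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("%d fds: %ld ns/call\n", n,
           ((t1.tv_sec - t0.tv_sec) * 1000000000L +
            t1.tv_nsec - t0.tv_nsec) / 100000);
    return 0;
}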
Re: qemu.git build fails with error
On Wed, 05/21 05:40, Liu, RongrongX wrote:
> Hi,
> After downloading the latest qemu.git
> (http://git.qemu.org/?p=qemu.git;a=summary) and compiling it, the build
> fails with an error.
>
> Some build log:
>   CC    trace/generated-events.o
>   CC    trace/generated-tracers.o
>   CC    util/cutils.o
>   AR    libqemustub.a
>   lt LINK vscclient
>   AR    libqemuutil.a
>   LINK  qemu-ga
>   LINK  qemu-nbd
>   LINK  qemu-img
>   LINK  qemu-io
> qemu-img.o: In function `add_format_to_seq':
> /root/qemu/qemu-img.c:73: undefined reference to `g_sequence_lookup'
> collect2: ld returned 1 exit status
> make: *** [qemu-img] Error 1
> make: *** Waiting for unfinished jobs....

Hi,

A recent change 1a443c1b8 (qemu-img: sort block formats in help message)
added a silent dependency on glib >= 2.28. A fix posted by Mike Day is
already on the way into qemu.git in Kevin's PULL request. Please wait for it
to be applied and try again, or compile QEMU against a more recent glib.

Thanks,
Fam
Re: ssh from host to guest using qemu to boot VM
On Tue, 04/15 14:03, Jobin Raju George wrote:
> Yes, you were right, the port was already being used when I was doing:
>
> -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
>
> However, even after using a unix socket for this socket as:
>
> -chardev socket,path=/tmp/g2g,server,nowait,id=port1-char
>
> the VM has booted, giving a warning at the host console:
>
> Warning: vlan 0 with no nics
>
> but when I do ifconfig on the guest I see only lo, and I still get
>
> ssh: connect to host 10.0.2.15 port 22: Connection timed out
>
> when I try to ssh; the IP I used to ssh is 10.0.2.15, which according to
> man qemu-system-x86_64 is the IP assigned to the first VM booted if a
> static IP is not assigned. And now there is no internet connection on the
> guest.

That means you have to create a device. Add an -net nic.

Fam

> On Mon, Apr 14, 2014 at 6:15 PM, Fam Zheng <f...@redhat.com> wrote:
> > On Mon, 04/14 17:36, Jobin Raju George wrote:
> > > I retried using:
> > >
> > > /usr/bin/qemu-system-x86_64 \
> > >   -m 1024 \
> > >   -name vserialtest \
> > >   -cdrom ubuntu-12.04-desktop-amd64.iso -hda ubuntu1204-virtio-serial \
> > >   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > >   -device virtio-serial \
> > >   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
> > >   -net user,hostfwd=tcp:127.0.0.1:-:8001
> > >
> > > but get the following:
> > >
> > > qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: could not
> > > set up host forwarding rule 'tcp:127.0.0.1:-:8001'
> > > qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: Device
> > > 'user' could not be initialized
> > >
> > > Also tried:
> > >
> > > -net user,hostfwd=tcp::-:8001
> > >
> > > but get the following error:
> > >
> > > qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: could not set up
> > > host forwarding rule 'tcp::-:8001'
> > > qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: Device 'user' could
> > > not be initialized
> >
> > Is the port busy? What does netstat -ltn say?
> >
> > Fam
> >
> > > On Mon, Apr 14, 2014 at 5:31 PM, Fam Zheng <f...@redhat.com> wrote:
> > > > On Mon, 04/14 17:14, Jobin Raju George wrote:
> > > > > Hey!
> > > > >
> > > > > How do I setup ssh from the host to the guest using qemu?
> > > > >
> > > > > 1) I am able to use port redirection when I boot the VM without any
> > > > > special parameter (explained in point 2) as follows:
> > > > >
> > > > > /usr/bin/qemu-system-x86_64 -hda ubuntu1204 -m 512 -redir tcp:::8001
> > > > >
> > > > > 2) But when I try to boot using the following
> > > > >
> > > > > /usr/bin/qemu-system-x86_64 \
> > > > >   -m 1024 \
> > > > >   -name vserialtest \
> > > > >   -cdrom ubuntu-12.04-desktop-amd64.iso \
> > > > >   -hda ubuntu1204-virtio-serial \
> > > > >   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > > > >   -device virtio-serial \
> > > > >   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
> > > > >   -net user,hostfwd=tcp:::8001
> > > > >
> > > > > I get the following error and the VM does not boot:
> > > > >
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: invalid host
> > > > > forwarding rule 'tcp:::8001'
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: Device 'user'
> > > > > could not be initialized
> > > >
> > > > Format: hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport
> > > >
> > > > Try: -net user,hostfwd=tcp::-:8001
> > > >
> > > > Fam
> > > >
> > > > > Please note that I am able to boot the VM without the -net
> > > > > parameter without any issues; however, I want to set up ssh from
> > > > > the host to the guest. ssh from guest to host works fine as
> > > > > expected.
>
> --
> Thanks and regards,
> Jobin Raju George
> Final Year, Information Technology
> College of Engineering Pune
> Alternate e-mail: georgejr10...@coep.ac.in
Re: ssh from host to guest using qemu to boot VM
On Tue, 04/15 15:49, Jobin Raju George wrote:
> Adding -net nic to the end of the booting command, I am able to connect to
> the internet now and have 10.0.2.15 as the IP address, but
>
> ssh username@10.0.2.15
>
> still returns
>
> ssh: connect to host 10.0.2.15 port 22: Connection timed out

Ugh, you totally missed it: neither 10.0.2.15 nor 22 is any of ssh's business
here. It should be something like the following, depending on your hostfwd
port:

ssh username@127.0.0.1 -p

Fam

> On Tue, Apr 15, 2014 at 2:25 PM, Fam Zheng <f...@redhat.com> wrote:
> > On Tue, 04/15 14:03, Jobin Raju George wrote:
> > > Yes, you were right, the port was already being used when I was doing:
> > >
> > > -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > >
> > > However, even after using a unix socket for this socket as:
> > >
> > > -chardev socket,path=/tmp/g2g,server,nowait,id=port1-char
> > >
> > > the VM has booted, giving a warning at the host console:
> > >
> > > Warning: vlan 0 with no nics
> > >
> > > but when I do ifconfig on the guest I see only lo, and I still get
> > >
> > > ssh: connect to host 10.0.2.15 port 22: Connection timed out
> > >
> > > when I try to ssh; the IP I used to ssh is 10.0.2.15, which according
> > > to man qemu-system-x86_64 is the IP assigned to the first VM booted if
> > > a static IP is not assigned. And now there is no internet connection
> > > on the guest.
> >
> > That means you have to create a device. Add an -net nic.
> >
> > Fam
> >
> > > On Mon, Apr 14, 2014 at 6:15 PM, Fam Zheng <f...@redhat.com> wrote:
> > > > On Mon, 04/14 17:36, Jobin Raju George wrote:
> > > > > I retried using:
> > > > >
> > > > > /usr/bin/qemu-system-x86_64 \
> > > > >   -m 1024 \
> > > > >   -name vserialtest \
> > > > >   -cdrom ubuntu-12.04-desktop-amd64.iso -hda ubuntu1204-virtio-serial \
> > > > >   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > > > >   -device virtio-serial \
> > > > >   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
> > > > >   -net user,hostfwd=tcp:127.0.0.1:-:8001
> > > > >
> > > > > but get the following:
> > > > >
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: could
> > > > > not set up host forwarding rule 'tcp:127.0.0.1:-:8001'
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: Device
> > > > > 'user' could not be initialized
> > > > >
> > > > > Also tried:
> > > > >
> > > > > -net user,hostfwd=tcp::-:8001
> > > > >
> > > > > but get the following error:
> > > > >
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: could not set up
> > > > > host forwarding rule 'tcp::-:8001'
> > > > > qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: Device 'user'
> > > > > could not be initialized
> > > >
> > > > Is the port busy? What does netstat -ltn say?
> > > >
> > > > Fam
> > > >
> > > > > On Mon, Apr 14, 2014 at 5:31 PM, Fam Zheng <f...@redhat.com> wrote:
> > > > > > On Mon, 04/14 17:14, Jobin Raju George wrote:
> > > > > > > Hey!
> > > > > > >
> > > > > > > How do I setup ssh from the host to the guest using qemu?
> > > > > > >
> > > > > > > 1) I am able to use port redirection when I boot the VM without
> > > > > > > any special parameter (explained in point 2) as follows:
> > > > > > >
> > > > > > > /usr/bin/qemu-system-x86_64 -hda ubuntu1204 -m 512 -redir tcp:::8001
> > > > > > >
> > > > > > > 2) But when I try to boot using the following
> > > > > > >
> > > > > > > /usr/bin/qemu-system-x86_64 \
> > > > > > >   -m 1024 \
> > > > > > >   -name vserialtest \
> > > > > > >   -cdrom ubuntu-12.04-desktop-amd64.iso \
> > > > > > >   -hda ubuntu1204-virtio-serial \
> > > > > > >   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > > > > > >   -device virtio-serial \
> > > > > > >   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
> > > > > > >   -net user,hostfwd=tcp:::8001
> > > > > > >
> > > > > > > I get the following error and the VM does not boot:
> > > > > > >
> > > > > > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: invalid host
> > > > > > > forwarding rule 'tcp:::8001'
> > > > > > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: Device 'user'
> > > > > > > could not be initialized
> > > > > >
> > > > > > Format: hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport
> > > > > >
> > > > > > Try: -net user,hostfwd=tcp::-:8001
> > > > > >
> > > > > > Fam
> > > > > >
> > > > > > > Please note that I am able to boot the VM without the -net
> > > > > > > parameter without any issues; however, I want to set up ssh
> > > > > > > from the host to the guest. ssh from guest to host works fine
> > > > > > > as expected.
>
> --
> Thanks and regards,
> Jobin Raju George
Re: ssh from host to guest using qemu to boot VM
On Mon, 04/14 17:14, Jobin Raju George wrote:
> Hey!
>
> How do I setup ssh from the host to the guest using qemu?
>
> 1) I am able to use port redirection when I boot the VM without any
> special parameter (explained in point 2) as follows:
>
> /usr/bin/qemu-system-x86_64 -hda ubuntu1204 -m 512 -redir tcp:::8001
>
> 2) But when I try to boot using the following
>
> /usr/bin/qemu-system-x86_64 \
>   -m 1024 \
>   -name vserialtest \
>   -cdrom ubuntu-12.04-desktop-amd64.iso \
>   -hda ubuntu1204-virtio-serial \
>   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
>   -device virtio-serial \
>   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
>   -net user,hostfwd=tcp:::8001
>
> I get the following error and the VM does not boot:
>
> qemu-system-x86_64: -net user,hostfwd=tcp:::8001: invalid host forwarding
> rule 'tcp:::8001'
> qemu-system-x86_64: -net user,hostfwd=tcp:::8001: Device 'user' could not
> be initialized

Format: hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport

Try: -net user,hostfwd=tcp::-:8001

Fam

> Please note that I am able to boot the VM without the -net parameter
> without any issues; however, I want to set up ssh from the host to the
> guest. ssh from guest to host works fine as expected.
>
> --
> Thanks and regards,
> Jobin Raju George
> Final Year, Information Technology
> College of Engineering Pune
> Alternate e-mail: georgejr10...@coep.ac.in
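Putting the pieces of this thread together, a known-working shape of the
whole setup looks like the following, with 2222 as a hypothetical host port
(any free port works, and the guest must run sshd on its port 22):

    qemu-system-x86_64 -hda ubuntu1204 -m 512 \
        -net nic -net user,hostfwd=tcp::2222-:22

Then, from the host:

    ssh -p 2222 username@127.0.0.1

The -net nic creates the guest-visible device, -net user provides the slirp
backend with the forwarding rule, and ssh connects to the forwarded host
port on 127.0.0.1 rather than to the guest's 10.0.2.15 address.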
Re: ssh from host to guest using qemu to boot VM
On Mon, 04/14 17:36, Jobin Raju George wrote:
> I retried using:
>
> /usr/bin/qemu-system-x86_64 \
>   -m 1024 \
>   -name vserialtest \
>   -cdrom ubuntu-12.04-desktop-amd64.iso -hda ubuntu1204-virtio-serial \
>   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
>   -device virtio-serial \
>   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
>   -net user,hostfwd=tcp:127.0.0.1:-:8001
>
> but get the following:
>
> qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: could not set
> up host forwarding rule 'tcp:127.0.0.1:-:8001'
> qemu-system-x86_64: -net user,hostfwd=tcp:127.0.0.1:-:8001: Device 'user'
> could not be initialized
>
> Also tried:
>
> -net user,hostfwd=tcp::-:8001
>
> but get the following error:
>
> qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: could not set up host
> forwarding rule 'tcp::-:8001'
> qemu-system-x86_64: -net user,hostfwd=tcp::-:8001: Device 'user' could not
> be initialized

Is the port busy? What does netstat -ltn say?

Fam

> On Mon, Apr 14, 2014 at 5:31 PM, Fam Zheng <f...@redhat.com> wrote:
> > On Mon, 04/14 17:14, Jobin Raju George wrote:
> > > Hey!
> > >
> > > How do I setup ssh from the host to the guest using qemu?
> > >
> > > 1) I am able to use port redirection when I boot the VM without any
> > > special parameter (explained in point 2) as follows:
> > >
> > > /usr/bin/qemu-system-x86_64 -hda ubuntu1204 -m 512 -redir tcp:::8001
> > >
> > > 2) But when I try to boot using the following
> > >
> > > /usr/bin/qemu-system-x86_64 \
> > >   -m 1024 \
> > >   -name vserialtest \
> > >   -cdrom ubuntu-12.04-desktop-amd64.iso \
> > >   -hda ubuntu1204-virtio-serial \
> > >   -chardev socket,host=localhost,port=,server,nowait,id=port1-char \
> > >   -device virtio-serial \
> > >   -device virtserialport,id=port1,chardev=port1-char,name=org.fedoraproject.port.0 \
> > >   -net user,hostfwd=tcp:::8001
> > >
> > > I get the following error and the VM does not boot:
> > >
> > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: invalid host
> > > forwarding rule 'tcp:::8001'
> > > qemu-system-x86_64: -net user,hostfwd=tcp:::8001: Device 'user' could
> > > not be initialized
> >
> > Format: hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport
> >
> > Try: -net user,hostfwd=tcp::-:8001
> >
> > Fam
> >
> > > Please note that I am able to boot the VM without the -net parameter
> > > without any issues; however, I want to set up ssh from the host to the
> > > guest. ssh from guest to host works fine as expected.
>
> --
> Thanks and regards,
> Jobin Raju George
> Final Year, Information Technology
> College of Engineering Pune
> Alternate e-mail: georgejr10...@coep.ac.in
[PATCH v2] virtio-scsi: Skip setting affinity on uninitialized vq
virtscsi_init calls virtscsi_remove_vqs on err, even before initializing the
vqs. The latter calls virtscsi_set_affinity, so let's check the pointer there
before setting affinity on it.

This fixes a panic when setting device's num_queues=2 on RHEL 6.5:

qemu-system-x86_64 ... \
    -device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \
    -drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \
    -device scsi-hd,drive=drive-scsi-disk,...

[0.354734] scsi0 : Virtio SCSI HBA
[0.379504] BUG: unable to handle kernel NULL pointer dereference at 0020
[0.380141] IP: [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141] PGD 0
[0.380141] Oops: [#1] SMP
[0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ #5
[0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[0.380141] task: 88003c9f ti: 88003c9f8000 task.ti: 88003c9f8000
[0.380141] RIP: 0010:[814741ef] [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141] RSP: :88003c9f9c08 EFLAGS: 00010256
[0.380141] RAX: RBX: 88003c3a9d40 RCX: 1070
[0.380141] RDX: 0002 RSI: RDI:
[0.380141] RBP: 88003c9f9c28 R08: 000136c0 R09: 88003c801c00
[0.380141] R10: 81475229 R11: 0008 R12:
[0.380141] R13: 81cc7ca8 R14: 88003cac3d40 R15: 88003cac37a0
[0.380141] FS: () GS:88003e40() knlGS:
[0.380141] CS: 0010 DS: ES: CR0: 8005003b
[0.380141] CR2: 0020 CR3: 01c0e000 CR4: 06f0
[0.380141] Stack:
[0.380141]  88003c3a9d40 88003cac3d80 88003cac3d40
[0.380141]  88003c9f9c48 814742e8 88003c26d000 88003c26d000
[0.380141]  88003c9f9c68 81474321 88003c26d000 88003c3a9d40
[0.380141] Call Trace:
[0.380141]  [814742e8] virtscsi_set_affinity+0x28/0x40
[0.380141]  [81474321] virtscsi_remove_vqs+0x21/0x50
[0.380141]  [81475231] virtscsi_init+0x91/0x240
[0.380141]  [81365290] ? vp_get+0x50/0x70
[0.380141]  [81475544] virtscsi_probe+0xf4/0x280
[0.380141]  [81363ea5] virtio_dev_probe+0xe5/0x140
[0.380141]  [8144c669] driver_probe_device+0x89/0x230
[0.380141]  [8144c8ab] __driver_attach+0x9b/0xa0
[0.380141]  [8144c810] ? driver_probe_device+0x230/0x230
[0.380141]  [8144c810] ? driver_probe_device+0x230/0x230
[0.380141]  [8144ac1c] bus_for_each_dev+0x8c/0xb0
[0.380141]  [8144c499] driver_attach+0x19/0x20
[0.380141]  [8144bf28] bus_add_driver+0x198/0x220
[0.380141]  [8144ce9f] driver_register+0x5f/0xf0
[0.380141]  [81d27c91] ? spi_transport_init+0x79/0x79
[0.380141]  [8136403b] register_virtio_driver+0x1b/0x30
[0.380141]  [81d27d19] init+0x88/0xd6
[0.380141]  [81d27c18] ? scsi_init_procfs+0x5b/0x5b
[0.380141]  [81ce88a7] do_one_initcall+0x7f/0x10a
[0.380141]  [81ce8aa7] kernel_init_freeable+0x14a/0x1de
[0.380141]  [81ce8b3b] ? kernel_init_freeable+0x1de/0x1de
[0.380141]  [817dec20] ? rest_init+0x80/0x80
[0.380141]  [817dec29] kernel_init+0x9/0xf0
[0.380141]  [817e68fc] ret_from_fork+0x7c/0xb0
[0.380141]  [817dec20] ? rest_init+0x80/0x80
[0.380141] RIP [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141]  RSP 88003c9f9c08
[0.380141] CR2: 0020
[0.380141] ---[ end trace 8074b70c3d5e1d73 ]---
[0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009
[0.475018]
[0.475068] Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff)
[0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009

Signed-off-by: Fam Zheng <f...@redhat.com>
Acked-by: Paolo Bonzini <pbonz...@redhat.com>
Cc: sta...@vger.kernel.org
---
 drivers/scsi/virtio_scsi.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 16bfd50..3019267 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -750,8 +750,12 @@ static void __virtscsi_set_affinity(struct virtio_scsi *vscsi, bool affinity)
 
 		vscsi->affinity_hint_set = true;
 	} else {
-		for (i = 0; i < vscsi->num_queues; i++)
+		for (i = 0; i < vscsi->num_queues; i++) {
+			if (!vscsi->req_vqs[i].vq) {
+				continue;
+			}
 			virtqueue_set_affinity(vscsi->req_vqs[i].vq, -1);
+		}
 
 		vscsi->affinity_hint_set = false;
 	}
[PATCH] virtio-scsi: Skip setting affinity on uninitialized vq
virtscsi_init calls virtscsi_remove_vqs on err, even before initializing the
vqs. The latter calls virtscsi_set_affinity, so let's check the pointer there
before setting affinity on it.

This fixes a panic when setting device's num_queues=2 on RHEL 6.5:

qemu-system-x86_64 ... \
    -device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \
    -drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \
    -device scsi-hd,drive=drive-scsi-disk,...

[0.354734] scsi0 : Virtio SCSI HBA
[0.379504] BUG: unable to handle kernel NULL pointer dereference at 0020
[0.380141] IP: [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141] PGD 0
[0.380141] Oops: [#1] SMP
[0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ #5
[0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[0.380141] task: 88003c9f ti: 88003c9f8000 task.ti: 88003c9f8000
[0.380141] RIP: 0010:[814741ef] [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141] RSP: :88003c9f9c08 EFLAGS: 00010256
[0.380141] RAX: RBX: 88003c3a9d40 RCX: 1070
[0.380141] RDX: 0002 RSI: RDI:
[0.380141] RBP: 88003c9f9c28 R08: 000136c0 R09: 88003c801c00
[0.380141] R10: 81475229 R11: 0008 R12:
[0.380141] R13: 81cc7ca8 R14: 88003cac3d40 R15: 88003cac37a0
[0.380141] FS: () GS:88003e40() knlGS:
[0.380141] CS: 0010 DS: ES: CR0: 8005003b
[0.380141] CR2: 0020 CR3: 01c0e000 CR4: 06f0
[0.380141] Stack:
[0.380141]  88003c3a9d40 88003cac3d80 88003cac3d40
[0.380141]  88003c9f9c48 814742e8 88003c26d000 88003c26d000
[0.380141]  88003c9f9c68 81474321 88003c26d000 88003c3a9d40
[0.380141] Call Trace:
[0.380141]  [814742e8] virtscsi_set_affinity+0x28/0x40
[0.380141]  [81474321] virtscsi_remove_vqs+0x21/0x50
[0.380141]  [81475231] virtscsi_init+0x91/0x240
[0.380141]  [81365290] ? vp_get+0x50/0x70
[0.380141]  [81475544] virtscsi_probe+0xf4/0x280
[0.380141]  [81363ea5] virtio_dev_probe+0xe5/0x140
[0.380141]  [8144c669] driver_probe_device+0x89/0x230
[0.380141]  [8144c8ab] __driver_attach+0x9b/0xa0
[0.380141]  [8144c810] ? driver_probe_device+0x230/0x230
[0.380141]  [8144c810] ? driver_probe_device+0x230/0x230
[0.380141]  [8144ac1c] bus_for_each_dev+0x8c/0xb0
[0.380141]  [8144c499] driver_attach+0x19/0x20
[0.380141]  [8144bf28] bus_add_driver+0x198/0x220
[0.380141]  [8144ce9f] driver_register+0x5f/0xf0
[0.380141]  [81d27c91] ? spi_transport_init+0x79/0x79
[0.380141]  [8136403b] register_virtio_driver+0x1b/0x30
[0.380141]  [81d27d19] init+0x88/0xd6
[0.380141]  [81d27c18] ? scsi_init_procfs+0x5b/0x5b
[0.380141]  [81ce88a7] do_one_initcall+0x7f/0x10a
[0.380141]  [81ce8aa7] kernel_init_freeable+0x14a/0x1de
[0.380141]  [81ce8b3b] ? kernel_init_freeable+0x1de/0x1de
[0.380141]  [817dec20] ? rest_init+0x80/0x80
[0.380141]  [817dec29] kernel_init+0x9/0xf0
[0.380141]  [817e68fc] ret_from_fork+0x7c/0xb0
[0.380141]  [817dec20] ? rest_init+0x80/0x80
[0.380141] RIP [814741ef] __virtscsi_set_affinity+0x4f/0x120
[0.380141]  RSP 88003c9f9c08
[0.380141] CR2: 0020
[0.380141] ---[ end trace 8074b70c3d5e1d73 ]---
[0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009
[0.475018]
[0.475068] Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff)
[0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009

Signed-off-by: Fam Zheng <f...@redhat.com>
---
 drivers/scsi/virtio_scsi.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 16bfd50..3019267 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -750,8 +750,12 @@ static void __virtscsi_set_affinity(struct virtio_scsi *vscsi, bool affinity)
 
 		vscsi->affinity_hint_set = true;
 	} else {
-		for (i = 0; i < vscsi->num_queues; i++)
+		for (i = 0; i < vscsi->num_queues; i++) {
+			if (!vscsi->req_vqs[i].vq) {
+				continue;
+			}
 			virtqueue_set_affinity(vscsi->req_vqs[i].vq, -1);
+		}
 
 		vscsi->affinity_hint_set = false;
 	}
-- 
1.9.1
Re: [PATCH RFC V2 4/4] tools: virtio: add a top-like utility for displaying vhost satistics
On Fri, 03/21 17:41, Jason Wang wrote:
> This patch adds a simple python script to display vhost statistics; the
> code is based on the kvm_stat script from qemu. As the work functions have
> been recorded, filters can be used to distinguish which kinds of work are
> being executed or queued:
>
> vhost statistics
>
> vhost_virtio_update_used_idx          1215215   0
> vhost_virtio_get_vq_desc              1215215   0
> vhost_work_queue_wakeup                986808   0
> vhost_virtio_signal                    811601   0
> vhost_net_tx                           611457   0
> vhost_net_rx                           603758   0
> vhost_net_tx(datacopy)                 601903   0
> vhost_work_queue_wakeup(rx_net)        565081   0
> vhost_virtio_signal(rx)                461603   0
> vhost_work_queue_wakeup(tx_kick)       421718   0
> vhost_virtio_update_avail_event        417346   0
> vhost_virtio_signal(tx)                349998   0
> vhost_work_queue_coalesce               39384   0
> vhost_work_queue_coalesce(rx_net)       38677   0
> vhost_net_tx(zerocopy)                   9554   0
> vhost_work_queue_coalesce(tx_kick)        707   0
> vhost_work_queue_wakeup(rx_kick)            9   0
>
> Signed-off-by: Jason Wang <jasow...@redhat.com>
> ---
>  tools/virtio/vhost_stat | 375 
>  1 file changed, 375 insertions(+)
>  create mode 100755 tools/virtio/vhost_stat
>
> diff --git a/tools/virtio/vhost_stat b/tools/virtio/vhost_stat
> new file mode 100755
> index 000..398fd4a
> --- /dev/null
> +++ b/tools/virtio/vhost_stat
> @@ -0,0 +1,375 @@
> +#!/usr/bin/python
> +#
> +# top-like utility for displaying vhost statistics
> +#
> +# Copyright 2012 Red Hat, Inc.

Should it be 2014?

Fam

<snip>
Re: Guest VM kernel panic with VIRTIO_BLK with KVM
On Wed, 04/02 11:32, saurabh agarwal wrote:
> We have compiled a new linux kernel for a guest VM with
> CONFIG_VIRTIO_BLK=y. But it doesn't boot and the kernel panics. To the
> kernel command line I tried passing root=/dev/vda and root=/dev/vda1, but
> the same kernel panic comes every time. VIRTIO_NET worked fine. But with
> VIRTIO_BLK we see the below kernel panic. Can someone suggest what could be
> going wrong?
>
> Kernel version in the VM is 3.0.76. QEMU emulator version 1.3.0.
> Kernel command line is:
> kernel /boot/bzImage root=/dev/vda1 early_printk=serial console=hvc0 console=tty0 console=ttyS0,115200n8 e1000.disable_vlan_offload=1
> initrd /boot/initramfs.cpio
>
> Below is a portion of the guest VM kernel console logs while booting up:
>
> brd: module loaded
> vda: vda1 vda2 vda3 vda4
> scsi0 : ata_piix
> scsi1 : ata_piix
> ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc180 irq 14
> ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc188 irq 15
> ...
> VFS: Cannot open root device "vda" or unknown-block(253,0)
> Please append a correct "root=" boot option; here are the available
> partitions:
> fd00  8388608  vda   driver: virtio_blk
> fd01  7340032  vda1  ----
> fd02   512000  vda2  ----
> fd03   535552  vda3  ----

Does /dev/vda1 have a root filesystem? Looks like the partitions on vda are
correctly detected, so the virtio-blk driver already works.

Thanks,
Fam

> Kernel panic - not syncing: VFS: Unable to mount root fs on
> unknown-block(253,0)
>
> Can someone suggest what could be wrong?
>
> Regards,
> Saurabh
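One way to check this from the host is to attach the image via qemu-nbd and
inspect the first partition. These commands are illustrative; the image path
is hypothetical:

    modprobe nbd max_part=16
    qemu-nbd -c /dev/nbd0 /path/to/guest-disk.img
    file -s /dev/nbd0p1          # should report a filesystem, e.g. ext4
    mount /dev/nbd0p1 /mnt && ls /mnt/sbin/init
    umount /mnt; qemu-nbd -d /dev/nbd0

If /dev/nbd0p1 shows no filesystem or no init, the panic is a guest image
problem rather than a virtio-blk driver problem.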
[PATCH v2] scsi: Change sense buffer size to 252
According to SPC-4, section 4.5.2.1, 252 is the limit of sense data. So
increase the values.

Tested by hacking QEMU to fake virtio-scsi request sense len to 252. Without
this patch the driver stops working immediately when it gets the request.

Signed-off-by: Fam Zheng <f...@redhat.com>
---
 include/linux/virtio_scsi.h | 2 +-
 include/scsi/scsi_cmnd.h    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/virtio_scsi.h b/include/linux/virtio_scsi.h
index 4195b97..a437f7f 100644
--- a/include/linux/virtio_scsi.h
+++ b/include/linux/virtio_scsi.h
@@ -28,7 +28,7 @@
 #define _LINUX_VIRTIO_SCSI_H
 
 #define VIRTIO_SCSI_CDB_SIZE   32
-#define VIRTIO_SCSI_SENSE_SIZE 96
+#define VIRTIO_SCSI_SENSE_SIZE 252
 
 /* SCSI command request, followed by data-out */
 struct virtio_scsi_cmd_req {

diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 91558a1..a64dac03 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -104,7 +104,7 @@ struct scsi_cmnd {
 
 	struct request *request;	/* The command we are working on */
 
-#define SCSI_SENSE_BUFFERSIZE	96
+#define SCSI_SENSE_BUFFERSIZE	252
 	unsigned char *sense_buffer;
 				/* obtained by REQUEST SENSE when
 				 * CHECK CONDITION is received on original
-- 
1.9.0
Re: I/O redirection Guest VM
On Fri, 03/14 02:00, Shivaramakrishnan Vaidyanathan wrote:
> Hello Fam,
> Thanks a lot for your reply. I think I needed to be a bit clearer in my
> explanation. Here is my requirement:
> At first, I have a guest VM, with its virtual disk performing no storage
> intrusion detection, running on top of the KVM hypervisor. Then later I
> create a new app VM that performs the storage intrusion detection
> functionality, again running on top of the KVM hypervisor. I run an NFS
> share inside this VM. Once I do this, I need the I/Os happening on the
> guest VM to undergo the intrusion detection functionality before the writes
> are performed on the disks.
> How could I point the image to the NFS share if the guest VM didn't have
> this functionality before? And if I add the functionality now, only
> NFS-shared partitions could be intercepted. What about the other partition
> writes occurring on the guest VM?

Sounds like you need to change the backend of the VM to your NFS share on the
fly. But I don't think that's supported for now. (Assuming you use QEMU as
the device virtualizer.)

Fam
Re: I/O redirection Guest VM
On Fri, 03/14 04:46, Shiva wrote:
> Hello,
>
> I am working on building a Storage Intrusion Detection System (SIDS) app
> inside a VM on the KVM hypervisor. I want I/Os from other VMs to first pass
> through this app VM and only then be written to disk. I went with network
> storage as an option to achieve this objective. But what about I/Os that
> are not on the NFS share? How can I intercept those? The guest VM has its
> own virtual disk and can write to partitions that are not part of the NFS
> share.
>
> I know modifying QEMU could be another option here (like developing a
> driver or adding I/O redirection), but I have a time constraint and am
> unlikely to achieve it that way. Looking forward to your help/comments.
> Thanks.

Not sure I understand it correctly, but this sounds doable at the block
backend level. If you want to intercept the I/Os from another VM, can you
move its virtual disk onto the NFS share that your app VM controls, and let
the VM access it through there? If you only want to observe the I/O, you can
mirror the writes to a target on the NFS share.

Thanks,
Fam
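For the observe-only case, a rough sketch of driving QEMU's drive-mirror
command over a QMP socket toward a file on the NFS mount (the socket path,
device id, and target path are invented for the example, and this assumes a
QEMU new enough to have drive-mirror):

# socat - UNIX-CONNECT:/tmp/qmp.sock <<'EOF'
{"execute": "qmp_capabilities"}
{"execute": "drive-mirror",
 "arguments": {"device": "drive-virtio-disk0",
               "target": "/mnt/nfs/shadow.qcow2",
               "format": "qcow2", "sync": "full"}}
EOF

With sync=full the whole disk is copied once and new guest writes keep being
forwarded to the target, so the app VM serving the NFS share sees every write
that lands there.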
[PATCH] virtio-scsi: Change sense buffer size to 252
According to SPC-4, section 4.5.2.1, 252 is the limit of sense data. So
increase the value.

Signed-off-by: Fam Zheng <f...@redhat.com>
---
 include/linux/virtio_scsi.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/virtio_scsi.h b/include/linux/virtio_scsi.h
index 4195b97..a437f7f 100644
--- a/include/linux/virtio_scsi.h
+++ b/include/linux/virtio_scsi.h
@@ -28,7 +28,7 @@
 #define _LINUX_VIRTIO_SCSI_H
 
 #define VIRTIO_SCSI_CDB_SIZE      32
-#define VIRTIO_SCSI_SENSE_SIZE    96
+#define VIRTIO_SCSI_SENSE_SIZE    252
 
 /* SCSI command request, followed by data-out */
 struct virtio_scsi_cmd_req {
-- 
1.9.0
Re: [PATCH] virtio-scsi: Change sense buffer size to 252
On Thu, 03/06 11:09, Paolo Bonzini wrote:
> On 06/03/2014 09:47, Fam Zheng wrote:
> > According to SPC-4, section 4.5.2.1, 252 is the limit of sense data. So
> > increase the value.
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
> > ---
> >  include/linux/virtio_scsi.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/linux/virtio_scsi.h b/include/linux/virtio_scsi.h
> > index 4195b97..a437f7f 100644
> > --- a/include/linux/virtio_scsi.h
> > +++ b/include/linux/virtio_scsi.h
> > @@ -28,7 +28,7 @@
> >  #define _LINUX_VIRTIO_SCSI_H
> >  
> >  #define VIRTIO_SCSI_CDB_SIZE      32
> > -#define VIRTIO_SCSI_SENSE_SIZE    96
> > +#define VIRTIO_SCSI_SENSE_SIZE    252
> >  
> >  /* SCSI command request, followed by data-out */
> >  struct virtio_scsi_cmd_req {
>
> Hi Fam, how did you test this?

I only tested the basic functionality of virtio-scsi. I'm doing more testing
on this and on your fix on the QEMU side now.

Thanks,
Fam
Re: [PATCH] virtio-scsi: Change sense buffer size to 252
On Thu, 03/06 12:55, Paolo Bonzini wrote:
> On 06/03/2014 12:22, Hannes Reinecke wrote:
> > On 03/06/2014 11:09 AM, Paolo Bonzini wrote:
> > > On 06/03/2014 09:47, Fam Zheng wrote:
> > > > According to SPC-4, section 4.5.2.1, 252 is the limit of sense data.
> > > > So increase the value.
> > > >
> > > > Signed-off-by: Fam Zheng <f...@redhat.com>
> > > > ---
> > > >  include/linux/virtio_scsi.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/virtio_scsi.h b/include/linux/virtio_scsi.h
> > > > index 4195b97..a437f7f 100644
> > > > --- a/include/linux/virtio_scsi.h
> > > > +++ b/include/linux/virtio_scsi.h
> > > > @@ -28,7 +28,7 @@
> > > >  #define _LINUX_VIRTIO_SCSI_H
> > > >  
> > > >  #define VIRTIO_SCSI_CDB_SIZE      32
> > > > -#define VIRTIO_SCSI_SENSE_SIZE    96
> > > > +#define VIRTIO_SCSI_SENSE_SIZE    252
> > > >  
> > > >  /* SCSI command request, followed by data-out */
> > > >  struct virtio_scsi_cmd_req {
> > >
> > > Hi Fam, how did you test this?
> >
> > Is there a specific reason _not_ to use the Linux default? The SCSI stack
> > typically limits the sense code to SCSI_SENSE_BUFFERSIZE, so using other
> > values has limited sense.
>
> Literally :-)

OK. So do we need to lift the limit in other parts of the SCSI stack as
well?

> Indeed I don't think this patch makes a difference. Though I asked not
> from the SCSI stack perspective, but because right now both virtio-scsi
> targets (QEMU and vhost-scsi) are also truncating at 96.

Oh, I missed that. I'll do a more thorough review and testing. Thank you for
pointing that out.

Thanks,
Fam
Re: [Qemu-devel] [libvirt] Looking for project ideas and mentors for Google Summer of Code 2014
On Fri, 02/07 15:01, Fam Zheng wrote:
> I'd like to add persistent dirty bitmaps as an idea, but I seem to have no
> account on the wiki, so I'll just reply here. Please help with review and
> update the page if it makes sense. (Who could create an account for me,
> BTW?)

Now I've got two, because Paolo and Kevin helped me at the same time :).
Thank you!

Fam
Re: [Qemu-devel] [libvirt] Looking for project ideas and mentors for Google Summer of Code 2014
On Thu, 02/06 13:27, Stefan Hajnoczi wrote:
> On Tue, Feb 4, 2014 at 1:38 PM, Michal Privoznik <mpriv...@redhat.com> wrote:
> > On 03.02.2014 08:45, Stefan Hajnoczi wrote:
> > > KVM libvirt: you are welcome to join the QEMU umbrella organization
> > > like last year.
> >
> > I've updated the wiki with a libvirt idea. But I can sense more to come
> > later as I have some time to think about it :)
>
> Great, thanks! I have added my QEMU block layer ideas too.
>
> Deadline for the organization application is 14th of February. As part of
> the application form we need to show our list of ideas. Thanks for posting
> your project ideas so that our list is ready.

I'd like to add persistent dirty bitmaps as an idea, but I seem to have no
account on the wiki, so I'll just reply here. Please help with review and
update the page if it makes sense. (Who could create an account for me,
BTW?)

Project Idea Below
==================

Incremental backup of block images
----------------------------------

Summary: Implement persistent incremental image backup.

Users want to do regular backups of VM image data to protect it from
unexpected loss. Incremental backup is a backup strategy that only copies
out the new data changed since the previous backup, to reduce the overhead
of backup and improve storage utilization.

To track which part of the guest data has changed, QEMU needs to store the
image's dirty bitmap on disk alongside the image data itself. The task is to
implement a new block driver (a filter) to load/store this persistent dirty
bitmap file, and to maintain the dirty bits while the guest writes to the
data image. As a prerequisite, you also need to design the bitmap file
format. Then, design test cases and write scripts to test the driver.

The persistent bitmap file must contain (a byte-level sketch of one possible
layout follows this message):

0. Magic bits to identify the format of this file.
1. Bitmap granularity (e.g. 64 KB).
2. The actual bitmap (1 TB disk @ 64 KB granularity = 2 MB bitmap).
3. Flags, including a clean flag.

The clean flag is used to tell whether the persistent bitmap file is safe to
use again. When QEMU opens the persistent dirty bitmap, it clears the clean
flag. When QEMU deactivates and finishes writing out the dirty bitmap, it
sets the clean flag. If the QEMU process crashes, it is not safe to trust
the dirty bitmap and a full backup must be performed. Make use of this flag
in the driver to limit the performance overhead.

Links:
[http://en.wikipedia.org/wiki/Incremental_backup Incremental backup]
[http://lists.nongnu.org/archive/html/qemu-devel/2014-01/msg02156.html QMP: Introduce incremental drive-backup with in-memory dirty bitmap]

Details:
Skill level: intermediate
Language: C
Mentors: Fam Zheng <f...@redhat.com> (fam on IRC), Stefan Hajnoczi
<stefa...@redhat.com> (stefanha on IRC)

Thanks,
Fam
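As a byte-level illustration of items 0-3 above (the magic string, field
widths, and offsets here are invented for the example, not a settled
format):

# One possible layout: 8-byte magic, u64 granularity, u64 flags, then bitmap.
printf 'QDBITMAP' > dirty.bitmap                  # offset  0: magic
printf '%016x' 65536 | xxd -r -p >> dirty.bitmap  # offset  8: granularity, 64 KB
printf '%016x' 1     | xxd -r -p >> dirty.bitmap  # offset 16: flags, bit 0 = clean
dd if=/dev/zero bs=1M count=2 >> dirty.bitmap     # 2 MB bitmap (1 TB @ 64 KB)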
Re: Qemu-img
On 2013-12-04 07:47, RB wrote:
> On Tue, Dec 3, 2013 at 4:26 PM, CDR <vene...@gmail.com> wrote:
> > In Fedora 20:
> >
> > rpm -qa | grep qemu-img
> > qemu-img-1.6.1-2.fc20.x86_64
> >
> > qemu-img convert -O qcow2 Neustar-flat.vmdk /home/neustar.qcow2
> >
> > but when I try to boot with the image, it fails. Please see the image
> > attached. So the conversion is unbootable. Any idea?
>
> I'd appreciate your not unicasting the query to me; it's a public list.
>
> Your attached image indicates the guest VM started booting but failed at
> some point, probably when trying to mount the root partition. The format
> conversion appears to have worked, but it's not going to fiddle with the
> internals of the image if it was dependent on something specific to
> VMware.

I don't see the screenshot here, but... judging from the file name, this
vmdk file should in fact be a raw disk, and the conversion shouldn't be a
problem (you can use `xxd -l 512 Neustar-flat.vmdk` to see whether it is
simply an MBR or contains the KDMV magic in the first bytes). It might be
the guest OS driver or configuration that doesn't automatically work on KVM.

How did you export this vmdk from ESX, what's the guest OS, and what's your
command line to boot it?

Fam
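For reference, the check Fam describes can be spelled out like this (the
file name is the one from the report; the commands only read the image):

xxd -l 4 Neustar-flat.vmdk         # "KDMV" in the first bytes => real VMDK data
xxd -s 510 -l 2 Neustar-flat.vmdk  # "55aa" => an MBR signature, i.e. a raw disk
qemu-img info Neustar-flat.vmdk    # what qemu-img detects the format as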
Re: Equivalent of vmware SIOC (Storage IO Control) in KVM
On Mon, 10/14 02:13, Andrey Korolyov wrote:
> Hello,
>
> By the way, are there plans to enhance QEMU I/O throttling to be able to
> swallow peaks, or to apply various disciplines? The current one-second
> flat discipline is seemingly not enough for uneven workloads, especially
> when there is no alternative like cgroups for NBD usage.

Hi,

In the current upstream master there are improvements to throttling
(actually a total rework), which added burst limits with the new options
bps_max, iops_max, etc.

Fam
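To make the burst limits concrete, a sketch of the reworked -drive syntax
(option spellings are from memory of the rework and may differ slightly
between releases): a sustained 10 MB/s limit that may briefly peak at
50 MB/s to absorb bursts.

qemu-system-x86_64 ... \
    -drive file=guest.qcow2,if=virtio,throttling.bps-total=10485760,throttling.bps-total-max=52428800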
Re: Equivalent of vmware SIOC (Storage IO Control) in KVM
On Mon, 10/14 02:18, Wangshen (Peter) wrote:
> On Sunday, October 13, 2013 9:27 PM, Paolo Bonzini wrote:
> > On 12/10/2013 08:09, Soumendu Satapathy (sosatapa) wrote:
> > > Do we have an equivalent of the VMware SIOC feature in KVM?
> >
> > Yes, you have two choices:
> >
> > 1) use cgroups to throttle I/O at the level of the host disk (i.e.
> > multiple virtual disks stored on the same disk share the limit). If
> > you're using libvirt, add the blkiotune element to the definition of the
> > virtual machine.
> >
> > 2) enable I/O throttling in QEMU, to apply limits at the level of the
> > guest disk. If you're using libvirt, add the iotune element within the
> > disk element in the definition of the virtual machine.
> >
> > Both are documented at http://libvirt.org/formatdomain.html (search for
> > blkiotune and iotune).
>
> Both blkiotune and iotune only take effect on one single host. How do we
> throttle I/O across multiple hosts with shared storage devices?

Depending on how you use the shared storage, I think you could:

1) Throttle disk/network I/O on the shared device with cgroups.
2) Set a throttle limit (iotune) on each domain.

Thanks,
Fam
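Both knobs are also reachable from the shell through libvirt, without
editing the domain XML; a rough sketch (domain and device names invented,
and option spellings vary a little across libvirt versions):

# virsh blkiotune mydomain --device-weights /dev/sda,500       # host level, cgroups
# virsh blkdeviotune mydomain vda --total-bytes-sec 10485760   # guest disk, QEMU iotune

The first adjusts the host-level cgroup weight for the backing disk; the
second sets a per-guest-disk limit implemented by QEMU's throttling.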
Re: KVM in HA active/active + fault-tolerant configuration
On Wed, 08/21 22:49, g.da...@assyoma.it wrote:
> On 2013-08-21 21:40, Brian Jackson wrote:
> > On Wednesday, August 21, 2013 6:02:31 AM CDT, g.da...@assyoma.it wrote:
> > > Hi all,
> > > I have a question about Linux KVM HA clusters. I understand that in an
> > > HA setup I can live migrate virtual machines between hosts that share
> > > the same storage (via various methods, e.g. DRBD). This enables us to
> > > migrate the VMs based on host load and performance. My current
> > > understanding is that, with this setup, a host crash will cause the
> > > VMs to be restarted on another host.
> > > However, I wonder if there is a method to have a fully fault-tolerant
> > > HA configuration, where by fully fault-tolerant I mean that a host
> > > crash (e.g. a power failure) will cause the VMs to be migrated to
> > > another host with no state change. In other words: is it possible to
> > > have an always-synchronized (both disk and memory) VM instance on
> > > another host, so that the migrated VM does not need to be restarted
> > > but only restored/unpaused? For disk data synchronization we can use
> > > shared storage (bypassing the problem) or something similar to DRBD,
> > > but what about memory?
> >
> > You're looking for something that doesn't exist for KVM. There was a
> > project once for it called Kemari, but AFAIK it's been abandoned for a
> > while.
>
> Hi Brian, thank you for your reply. As I googled extensively without
> finding anything, I was prepared for a similar response. Anyway, from what
> I understand, QEMU already uses a similar approach (tracking dirty memory
> pages) when live migrating virtual machines to another host.

Active/active sounds hard to get, as it seems to me, since you would need to
make sure the VMs on both nodes are in the same state all the time, which
sounds impossible for two emulator processes on two different hosts. I think
a hot spare is more practical: in the background you repetitively trigger
migration of the delta memory and copy it to the hot spare, but don't start
running it. Once the active one fails, you can resume the hot spare, which
is at the latest checkpoint. But I think this needs some work on the current
live migration code.

Fam
Re: KVM Block Device Driver
On Wed, 08/14 07:29, Spensky, Chad - 0559 - MITLL wrote:
> Stefan, Fam,
>
>   We are trying to keep an active shadow copy while the system is running,
> without any need for pausing. More precisely, we want to log every
> individual access to the drive into a database so that the entire stream
> of accesses can be replayed at a later time.

There's no I/O request log infrastructure in QEMU for now. What drive-mirror
can do is repetitively send the sectors changed since the last sending
point, but not in guest request order or at guest operation size; it works
just with a dirty bitmap.

In your methodology you didn't mention the hook you added in block.c, but I
think it is necessary to hack block.c to log every r/w access to the drive.
I assume you write synchronously to the shadow image on each write, right?

> On 8/14/13 6:05 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> > On Wed, Aug 14, 2013 at 10:40:06AM +0800, Fam Zheng wrote:
> > > On Tue, 08/13 16:13, Spensky, Chad - 0559 - MITLL wrote:
> > > > Hi All,
> > > >
> > > >   I'm working on some disk introspection on KVM, and we are trying
> > > > to create a shadow image of the disk. We've hooked the functions in
> > > > block.c, in particular bdrv_aio_writev. However we are seeing writes
> > > > go through, pausing the VM, comparing our shadow image with the
> > > > actual VM image, and they aren't 100% synced up. The first 1-2
> > > > sectors always appear to be correct; after that, there are sometimes
> > > > discrepancies. I believe we have exhausted the most obvious bugs
> > > > (malloc bugs, incorrect size calculations, etc.). Has anyone had any
> > > > experience with this or have any insights?
> > > >
> > > > Our methodology is as follows:
> > > > 1. Boot the VM.
> > > > 2. Pause the VM.
> > > > 3. Copy the disk to our shadow image.
> > >
> > > How do you copy the disk, from guest or host?
> > >
> > > > 4. Perform very few reads/writes.
> > >
> > > Did you flush to disk?
> > >
> > > > 5. Pause the VM.
> > > > 6. Compare the shadow copy with the active VM disk. And this is
> > > > where we are seeing discrepancies.
> > > >
> > > > Any help is much appreciated! We are running on Ubuntu 12.04 with a
> > > > modified Debian build.
> > > >
> > > > - Chad
> > >
> > > I think the drive-backup command does just what you want: it creates
> > > an image and copies the copy-on-write data from the guest disk to the
> > > target, without pausing the VM.
> >
> > Or perhaps drive-mirror. Maybe Chad can explain what the use case is.
> > There is probably an existing command that does this or that could be
> > extended to do this safely.
> >
> > Stefan
Re: KVM Block Device Driver
On Wed, 08/14 08:19, Spensky, Chad - 0559 - MITLL wrote:
> Fam,
>
>   That's correct, we modified block.c to hook the appropriate functions
> and output the information through a unix socket. One of the functions
> that we hooked is bdrv_aio_writev, however it seems like the data that we
> are seeing at that point in the call stack is not what actually makes it
> to the disk image for the guest. The first couple of sectors always seem
> to be the same, but after sector 2 it's a toss-up. I'm guessing there may
> be some sort of caching going on, and we appear to be missing it.

Your approach sounds right. For your problem, I think it should be possible
to debug it using gdb: start QEMU in gdb and boot with a live CD, so there
is no disturbing disk I/O from the OS; attach an image to the guest; insert
a breakpoint at bdrv_aio_writev; then in the guest you can start I/O:

# echo "your data" | dd of=/dev/sda oflag=sync seek=XXX

You should now be at the breakpoint in gdb, and you can trace your data to
the unix socket and compare it with the guest I/O request. (A concrete
session sketch follows this message.)

Fam

> - Chad
>
> --
> Chad S. Spensky
> MIT Lincoln Laboratory
> Group 59 (Cyber Systems Assessment)
> Ph: (781) 981-4173
>
> On 8/14/13 8:16 AM, Fam Zheng <f...@redhat.com> wrote:
> > On Wed, 08/14 07:29, Spensky, Chad - 0559 - MITLL wrote:
> > > Stefan, Fam,
> > >
> > >   We are trying to keep an active shadow copy while the system is
> > > running, without any need for pausing. More precisely, we want to log
> > > every individual access to the drive into a database so that the
> > > entire stream of accesses can be replayed at a later time.
> >
> > There's no I/O request log infrastructure in QEMU for now. What
> > drive-mirror can do is repetitively send the sectors changed since the
> > last sending point, but not in guest request order or at guest operation
> > size; it works just with a dirty bitmap.
> >
> > In your methodology you didn't mention the hook you added in block.c,
> > but I think it is necessary to hack block.c to log every r/w access to
> > the drive. I assume you write synchronously to the shadow image on each
> > write, right?
> >
> > > On 8/14/13 6:05 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> > > > On Wed, Aug 14, 2013 at 10:40:06AM +0800, Fam Zheng wrote:
> > > > > On Tue, 08/13 16:13, Spensky, Chad - 0559 - MITLL wrote:
> > > > > > Hi All,
> > > > > >
> > > > > >   I'm working on some disk introspection on KVM, and we are
> > > > > > trying to create a shadow image of the disk. We've hooked the
> > > > > > functions in block.c, in particular bdrv_aio_writev. However we
> > > > > > are seeing writes go through, pausing the VM, comparing our
> > > > > > shadow image with the actual VM image, and they aren't 100%
> > > > > > synced up. The first 1-2 sectors always appear to be correct;
> > > > > > after that, there are sometimes discrepancies. I believe we have
> > > > > > exhausted the most obvious bugs (malloc bugs, incorrect size
> > > > > > calculations, etc.). Has anyone had any experience with this or
> > > > > > have any insights?
> > > > > >
> > > > > > Our methodology is as follows:
> > > > > > 1. Boot the VM.
> > > > > > 2. Pause the VM.
> > > > > > 3. Copy the disk to our shadow image.
> > > > >
> > > > > How do you copy the disk, from guest or host?
> > > > >
> > > > > > 4. Perform very few reads/writes.
> > > > >
> > > > > Did you flush to disk?
> > > > >
> > > > > > 5. Pause the VM.
> > > > > > 6. Compare the shadow copy with the active VM disk. And this is
> > > > > > where we are seeing discrepancies.
> > > > > >
> > > > > > Any help is much appreciated! We are running on Ubuntu 12.04
> > > > > > with a modified Debian build.
> > > > > >
> > > > > > - Chad
> > > > >
> > > > > I think the drive-backup command does just what you want: it
> > > > > creates an image and copies the copy-on-write data from the guest
> > > > > disk to the target, without pausing the VM.
> > > >
> > > > Or perhaps drive-mirror. Maybe Chad can explain what the use case
> > > > is. There is probably an existing command that does this or that
> > > > could be extended to do this safely.
> > > >
> > > > Stefan
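Spelling that recipe out as a session (the image path, guest device, and
seek offset are placeholders):

$ gdb --args qemu-system-x86_64 -m 1G -cdrom live.iso \
      -drive file=test.img,format=raw,if=virtio
(gdb) break bdrv_aio_writev
(gdb) run
    # in the guest:  echo "your data" | dd of=/dev/vda oflag=sync seek=1234
(gdb) print sector_num    # should match the dd seek offset (512-byte sectors)
(gdb) print *qiov         # the guest data about to be written
(gdb) continue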
Re: KVM Block Device Driver
On Tue, 08/13 16:13, Spensky, Chad - 0559 - MITLL wrote:
> Hi All,
>
>   I'm working on some disk introspection on KVM, and we are trying to
> create a shadow image of the disk. We've hooked the functions in block.c,
> in particular bdrv_aio_writev. However we are seeing writes go through,
> pausing the VM, comparing our shadow image with the actual VM image, and
> they aren't 100% synced up. The first 1-2 sectors always appear to be
> correct; after that, there are sometimes discrepancies. I believe we have
> exhausted the most obvious bugs (malloc bugs, incorrect size calculations,
> etc.). Has anyone had any experience with this or have any insights?
>
> Our methodology is as follows:
> 1. Boot the VM.
> 2. Pause the VM.
> 3. Copy the disk to our shadow image.

How do you copy the disk, from guest or host?

> 4. Perform very few reads/writes.

Did you flush to disk?

> 5. Pause the VM.
> 6. Compare the shadow copy with the active VM disk. And this is where we
> are seeing discrepancies.
>
> Any help is much appreciated! We are running on Ubuntu 12.04 with a
> modified Debian build.
>
> - Chad

I think the drive-backup command does just what you want: it creates an
image and copies the copy-on-write data from the guest disk to the target,
without pausing the VM.
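For reference, drive-backup is driven over QMP; a minimal sketch with an
invented socket path, device id, and target (this assumes a QEMU new enough
to have drive-backup):

# socat - UNIX-CONNECT:/tmp/qmp.sock <<'EOF'
{"execute": "qmp_capabilities"}
{"execute": "drive-backup",
 "arguments": {"device": "drive-virtio-disk0",
               "target": "/backup/shadow.img",
               "format": "raw", "sync": "full"}}
EOF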
Re: Migration route from Parallels on Mac for Windows images?
On Fri, 06/28 22:36, Ken Roberts wrote:
> On Jun 28, 2013, at 3:39 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> > On 28/06/2013 03:01, Ken Roberts wrote:
> > > > More details on "not bootable" would be nice. Do you get a blue
> > > > screen? SeaBIOS screen? You may need to prep the image before you
> > > > convert it (google mergeide).
> > >
> > > Not sure if you support screenshots on the list, so I'm typing it
> > > below the post. The only configured device is the hard disk; it
> > > INSTANTLY shows "Boot failed: not a bootable disk".
> >
> > If you attach one of your Windows images to one of the Linux VMs and run
> > the Linux VM on Parallels, what does file -s say if you pass it the
> > Windows disk?
>
> # fdisk -l /dev/sdb
>
> Disk /dev/sdb: 68.7 GB, 68719730688 bytes
> 255 heads, 63 sectors/track, 8354 cylinders, total 134218224 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00010001
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1   *          63   134190944    67095441    7  HPFS/NTFS/exFAT
>
> # file -s /dev/sdb
> /dev/sdb: x86 boot sector, Microsoft Windows XP MBR, Serial 0x10001;
> partition 1: ID=0x7, active, starthead 1, startsector 63, 134190882
> sectors, code offset 0xc0
>
> # file -s /dev/sdb1
> /dev/sdb1: x86 boot sector, code offset 0x52, OEM-ID "NTFS",
> sectors/cluster 8, reserved sectors 0, Media descriptor 0xf8, heads 255,
> hidden sectors 63, dos < 4.0 BootSector (0x80)
>
> > And what does file say (on the host) about the same Windows image after
> > conversion to raw?
>
> $ file -s popeye-c-raw.img
> popeye-c-raw.img: data
>
> > Also, can you do
> >
> >   dd if=/path/to/windows-image.raw bs=512 count=1 | od -tx1
> >
> > (of course you have to replace /path/to/windows-image.raw) and include
> > the output?
>
> $ dd if=popeye-c-raw.img bs=512 count=1 | od -tx1
> 1+0 records in
> 1+0 records out
> 512 bytes (512 B) copied, 0.000139388 s, 3.7 MB/s
> 0000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> *
> 0001000

That is definitely not a correct conversion. Can you post your conversion
command? I think it should be one like:

qemu-img convert -f parallels -O qcow2 original.parallels target.qcow2

--
Fam