Re: [Qemu-devel] [PATCH 2/2 V7] qemu,qmp: add inject-nmi qmp command

2011-04-11 Thread Markus Armbruster
Avi Kivity a...@redhat.com writes:

 On 04/08/2011 12:41 AM, Anthony Liguori wrote:

 And it's a good thing to have, but exposing this as the only API to
 do something as simple as generating a guest crash dump is not the
 friendliest thing in the world to do to users.

 "nmi" is a fine name for something that corresponds to a real-life NMI
 button (often labeled NMI).

Agree.

 "generate-crash-dump" is a wrong name for something that doesn't
 generate a crash dump (the guest may not be configured for it, or it
 may fail to work).

Or the OS uses the NMI button for something else.

 I'd expect that to be host-side functionality.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Ulrich Obergfell

   typedef struct HPETState {
 @@ -248,7 +253,7 @@ static int hpet_post_load(void *opaque, int
 version_id)

   static const VMStateDescription vmstate_hpet_timer = {
  .name = "hpet_timer",
 - .version_id = 1,
 + .version_id = 3,

 Why jump from 1 to 3?
 
   .minimum_version_id = 1,
   .minimum_version_id_old = 1,
   .fields = (VMStateField []) {
 @@ -258,6 +263,11 @@ static const VMStateDescription
 vmstate_hpet_timer = {
   VMSTATE_UINT64(fsb, HPETTimer),
   VMSTATE_UINT64(period, HPETTimer),
   VMSTATE_UINT8(wrap_flag, HPETTimer),
 + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
 + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
 + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
 + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
 + VMSTATE_UINT32_V(divisor, HPETTimer, 3),


Anthony,

I incremented the version ID of 'vmstate_hpet' from 2 to 3 to make sure
that migrations from a QEMU process that is capable of 'driftfix' to a
QEMU process that is _not_ capable of 'driftfix' will fail. I assigned
version ID 3 to 'vmstate_hpet_timer' and to the new fields in there too
to indicate that adding those fields was the reason why the version ID
of 'vmstate_hpet' was incremented to 3.

As far as the flow of execution in vmstate_load_state() is concerned, I
think it does not matter whether the version ID of 'vmstate_hpet_timer'
and the new fields in there is 2 or 3 (as long as they are consistent).
When the 'while (field->name)' loop in vmstate_load_state() gets to the
following field in 'vmstate_hpet' ...

VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
vmstate_hpet_timer, HPETTimer),

... it calls itself recursively ...

if (field->flags & VMS_STRUCT) {
    ret = vmstate_load_state(f, field->vmsd, addr, field->vmsd->version_id);

'field->vmsd->version_id' is the version ID of 'vmstate_hpet_timer' [1].
Hence 'vmstate_hpet_timer.version_id' is being checked against itself ...

if (version_id > vmsd->version_id) {
return -EINVAL;
}

... and the version IDs of the new fields are also being checked against
'vmstate_hpet_timer.version_id' ...

if ((field->field_exists &&
     field->field_exists(opaque, version_id)) ||
    (!field->field_exists &&
     field->version_id <= version_id)) {

If you want me to change the version ID of 'vmstate_hpet_timer' and the
new fields in there from 3 to 2, I can do that.
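
The checks Uli walks through can be illustrated outside QEMU with a reduced model of the field-filtering logic. The names below (MiniField, MiniVMSD, field_present, check_version) are invented for this sketch; QEMU's real vmstate_load_state() is considerably more involved:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's VMStateField / VMStateDescription. */
typedef struct MiniField {
    const char *name;
    int version_id;          /* version in which the field was added */
} MiniField;

typedef struct MiniVMSD {
    int version_id;
    const MiniField *fields; /* terminated by an entry with a NULL name */
} MiniVMSD;

/* Mirrors the 'field->version_id <= version_id' test quoted above:
 * a field is loaded only if the incoming stream is new enough. */
static int field_present(const MiniField *field, int version_id)
{
    return field->version_id <= version_id;
}

/* Mirrors 'if (version_id > vmsd->version_id) return -EINVAL;':
 * an incoming version newer than what we support is rejected. */
static int check_version(const MiniVMSD *vmsd, int version_id)
{
    return version_id > vmsd->version_id ? -1 : 0;
}
```

Because the recursion passes 'vmstate_hpet_timer' its own version_id, check_version() always compares that value against itself and field_present() compares the new fields against it too, so using 2 or 3 consistently behaves the same, as Uli argues.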


Regards,

Uli


[1] Ref.: commit fa3aad24d94a6cf894db52d83f72a399324a17bb


Re: EuroSec'11 Presentation

2011-04-11 Thread Stefan Hajnoczi
On Sun, Apr 10, 2011 at 4:19 PM, Kuniyasu Suzaki k.suz...@aist.go.jp wrote:

 From: Avi Kivity a...@redhat.com
 Subject: Re: EuroSec'11 Presentation
 Date: Sun, 10 Apr 2011 17:49:52 +0300

 On 04/10/2011 05:23 PM, Kuniyasu Suzaki wrote:
  Dear,
 
   I made a presentation about a memory disclosure attack on KSM (Kernel
   Samepage Merging) with KVM at EuroSec 2011.
   The title is "Memory Deduplication as a Threat to the Guest OS".
     http://www.iseclab.org/eurosec-2011/program.html
 
   The slide is downloadable.
     http://www.slideshare.net/suzaki/eurosec2011-slide-memory-deduplication
   The paper will be downloadable from the ACM Digital Library.
 
  Please tell me, if you have comments. Thank you.

 Very interesting presentation.  It seems every time you share something,
  it becomes a target for attacks.

 I'm happy to hear your comments.
  The referees' comments were severe. They said there was no brand-new
  point, but there were real attack experiences.  My paper just
  evaluated the detection of apache2 and sshd on a Linux guest OS and
  Firefox and IE6 on a Windows guest OS.

If I have a VM on the same physical host as someone else, I may be able
to determine which programs and specific versions they are currently
running.

Is there some creative attack using this technique that I'm missing?
I don't see many serious threats.

Stefan


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Ulrich Obergfell

 vmstate_hpet_timer = {
   VMSTATE_UINT64(fsb, HPETTimer),
   VMSTATE_UINT64(period, HPETTimer),
   VMSTATE_UINT8(wrap_flag, HPETTimer),
 + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
 + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
 + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
 + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
 + VMSTATE_UINT32_V(divisor, HPETTimer, 3),

 We ought to be able to use a subsection keyed off of whether any ticks
 are currently accumulated, no?


Anthony,

I'm not sure if I understand your question correctly. Are you suggesting
to migrate the driftfix-related state conditionally / only if there are
any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?

The size of the driftfix-related state is 28 bytes per timer and we have
32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
Hence, unconditional migration of the driftfix-related state should not
cause significant additional overhead.

Maybe I missed something. Could you please explain which benefit you see
in using a subsection ?


Regards,

Uli


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Avi Kivity

On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:

  vmstate_hpet_timer = {
VMSTATE_UINT64(fsb, HPETTimer),
VMSTATE_UINT64(period, HPETTimer),
VMSTATE_UINT8(wrap_flag, HPETTimer),
  + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
  + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
  + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
  + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
  + VMSTATE_UINT32_V(divisor, HPETTimer, 3),

  We ought to be able to use a subsection keyed off of whether any ticks
  are currently accumulated, no?


Anthony,

I'm not sure if I understand your question correctly. Are you suggesting
to migrate the driftfix-related state conditionally / only if there are
any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?

The size of the driftfix-related state is 28 bytes per timer and we have
32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
Hence, unconditional migration of the driftfix-related state should not
cause significant additional overhead.



It's not about overhead.


Maybe I missed something. Could you please explain which benefit you see
in using a subsection ?


In the common case of there being no drift, you can migrate from a qemu 
that supports driftfix to a qemu that doesn't.


--
error compiling committee.c: too many arguments to function



Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread ya su
David:

I have applied the patch to 0.14.0, and there is a bug if I add an
Optiarc CRRWDVD CRX890A USB device on Windows XP. I first commented out
the following code in usb-linux.c:

if (is_halted(s, p->devep)) {
    ret = ioctl(s->fd, USBDEVFS_CLEAR_HALT, &urb->endpoint);
#if 0
    if (ret < 0) {
        DPRINTF("husb: failed to clear halt. ep 0x%x errno %d\n",
                urb->endpoint, errno);
        return USB_RET_NAK;
    }
#endif
    clear_halt(s, p->devep);
}

 Then it can continue to run on Linux, but it still stalls on Windows
XP and Win7. I turned on debug; part of the output is as follows:

husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -32 alen 0
invoking packet_complete. plen = -3
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x1 index 0 len 0
husb: ctrl set addr 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 64 aurb 0x1616cd0
[Thread 0x74f75710 (LWP 3317) exited]
husb: async completed. aurb 0x1616cd0 status 0 alen 4
invoking packet_complete. plen = 4
husb: data submit. ep 0x81 len 13 aurb 0x1616cd0
husb: async cancel. aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status -2 alen 0
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 64
husb: submit ctrl. len 72 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: reset device 6.8
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: ctrl type 0x0 req 0x5 val 0x2 index 0 len 0
husb: ctrl set addr 2
husb: ctrl type 0x80 req 0x6 val 0x100 index 0 len 18
husb: submit ctrl. len 26 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 18
invoking packet_complete. plen = 8
husb: ctrl type 0x0 req 0x9 val 0x1 index 0 len 0
husb: releasing interfaces
husb: ctrl set config 1 ret 0 errno 11
husb: claiming interfaces. config 1
husb: i is 18, descr_len is 50, dl 9, dt 2
husb: config #1 need 1
husb: 1 interfaces claimed for configuration 1
husb: data submit. ep 0x2 len 31 aurb 0x1616cd0
husb: async completed. aurb 0x1616cd0 status 0 alen 31
invoking packet_complete. plen = 31
husb: data submit. ep 0x81 len 40 aurb 

RE: Administration panel for KVM

2011-04-11 Thread Martin Maurer
Hi Daniel,

Proxmox VE can be installed on existing Lenny installations (see
http://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Lenny), and the upcoming
2.x series on Squeeze. But we still provide a bare-metal installer, as this is
the most user-friendly way to install (the auto-partitioning makes sure that
there is enough free space for the LVM snapshots used for backups; see vzdump
for OpenVZ and KVM).

This means you just have an additional repo in your sources.list and you still
get Debian security updates (except some packages which are provided by our repo,
like KVM or the kernel).

We do not use libvirt, we have a web gui and also tools for the command line, 
e.g. qm for managing KVM guests.  http://pve.proxmox.com/wiki/Qm_manual

Here is the link to the roadmap for 2.0 - a major change and a big step forward:
http://pve.proxmox.com/wiki/Roadmap#Roadmap_for_2.x

Best Regards,

Martin


 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org]
 On Behalf Of Daniel Bareiro
 Sent: Sonntag, 10. April 2011 17:00
 To: KVM General
 Subject: Re: Administration panel for KVM
 
 On Sunday, 10 April 2011 14:00:41 +0200, Matthias Hovestadt wrote:
 
  Hi!
 
 Hi, Matthias!
 
   With a group of college buddies, we are evaluating the possibility
   of initiating a project to develop a management panel of KVM virtual
   machines. The idea is to do something similar to OpenXenManager but
   for KVM.
 
  At our university we developed a Perl-based management tool named
  kvm-top. This tool is command-line only, not offering any GUI at the
  moment. The initial idea of that tool was to make the start-up of VMs
  easier than doing it manually. The tool analyzes a VM-specific config
  file like
 
  GUEST_ID=219
  GUEST_NAME=attic
  .
  .
 
  defining all parameters for starting up a VM. For actually starting
  this VM, a single command now is sufficient:
 
  asok01 ~ # kvm-top -start attic
 
  This will not only start up the VM attic, but also check if this VM
  is running on some other cluster node and connect to the iSCSI target
  if required.
 
  Meanwhile, the tool has evolved, not only consisting of the kvm-top
  tool, but also a server component named kvm-ctld running on each
  cluster node. The kvm-top tool connects to the kvm-ctld running on
  the local host, executing the desired command. At this, the command
  does not necessarily have to be executed on the same cluster node. For
  instance, it is easily possible to start/stop a VM running on a
  different cluster node.
 
 
  However, the main feature of kvm-top is giving information about the
  current status of the running VMs:
 
  asok01 ~ # kvm-top
  VM   NODE   AS 5s  30s USER PID   #CPU MEM   VNC   SPICE #LAN
 
 =========================================================================
  atticasok02  4   4 root  66141  2048 36003 -2
  cbaseasok08  1   1 root 102221  1048 36142 -1
  cbase-spice  asok08  0   0 root  42691  1024 36143  59241
  cloud-pj asok02 14  18 root 240711  1024 36001 -2
  .
  .
  .
 
  where 5s and 30s contain the average system load over the last 5s
  resp. 30s. There are several ways of filtering or sorting the output,
  e.g. sorting by cluster nodes:
 
  asok01 ~ # kvm-top -s node
  NODE   VM   AS 5s  30s USER PID   #CPU MEM   VNC   SPICE #LAN
 
 =========================================================================
  asok01(ENABLED): 0(0) VMs, CPU=0%, MEM=2%, AGE 00:00
  asok02(ENABLED): 7(8) VMs, CPU=13%, MEM=99%, AGE 00:05
 attic 4   4 root  66141  2048 36003 -2
 cloud-pj 21  19 root 240711  1024 36001 -2
  .
  .
 
 
  The kvm-top tool even allows migration of VMs between the cluster
  nodes. The following command would migrate the VM attic from the
  currently used cluster node asok02 to cluster node asok07 (note:
  the command has been executed on a different cluster node asok01):
 
  asok01 ~ # kvm-top -migrate attic asok07
 
 
  As I mentioned, the tool is command line only at the moment, however
  it shouldn't be too difficult to create a web-based interface, since
  the kvm-ctld allows communication not only with kvm-top. Connecting to
  the port of kvm-ctld, it's pretty easy to get information about all
  currently running VMs or start/stop/migrate VMs.
 
 
  If there's interest in that tool, please let me know. I'll gladly
  publish it.
 
 Sounds interesting. If you publish it, I'd take a look.
 
 Researching on the Internet I found virt-manager [1], although I'm not sure if
 it can interact with KVM. In any case, virt-manager uses libvirt and my idea
 was not to use libvirt in the VMHost. I guess kvm-ctld
 will supply some of the functions of libvirt at the remote end.
 
 Thanks for your reply.
 
 Regards,
 Daniel
 
 [1] http://virt-manager.et.redhat.com/
 --
 Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37 Powered
 

Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Luiz Capitulino
On Sat, 9 Apr 2011 13:34:43 +0300
Blue Swirl blauwir...@gmail.com wrote:

 On Sat, Apr 9, 2011 at 2:25 AM, Luiz Capitulino lcapitul...@redhat.com 
 wrote:
  Hi there,
 
  Summary:
 
   - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 minutes. 
  Got
    the problem with e1000, virtio and rtl8139. However, pcnet *works* (it's
    as fast as qemu-kvm.git)
 
   - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a minute. 
  Tried
    with e1000, virtio and rtl8139 (I don't remember if I tried with pcnet)
 
  I tried with qemu.git v0.13.0 in order to check if this was a regression, 
  but
  I got the same problem...
 
  Then I inspected qemu-kvm.git under the assumption that it could have a fix
  that wasn't commited to qemu.git. Found this:
 
   - commit 0836b77f0f65d56d08bdeffbac25cd6d78267dc9 which is merge, works
 
   - commit cc015e9a5dde2f03f123357fa060acbdfcd570a4 does not work (it's slow)
 
  I tried a bisect, but it breaks due to gcc4 vs. gcc3 changes. Then I 
  inspected
  commits manually, and found out that commit 64d7e9a4 doesn't work, which 
  makes
  me think that the fix could be in the conflict resolution of 0836b77f, which
  makes me remember that I'm late for dinner, so my conclusions at this point 
  are
  not reliable :)
 
  Ideas?
 
 What is the test case?

It's an external PXE server, command-line is:

 qemu -boot n -enable-kvm -net nic,model=virtio -net tap,ifname=vnet0,script=

 I tried PXE booting a 10M file with and without
 KVM and the results are pretty much the same with pcnet and e1000.
 time qemu -monitor stdio -boot n -net nic,model=e1000 -net
 user,tftp=.,bootfile=10M -net dump,file=foo -enable-kvm
 time qemu -monitor stdio -boot n -net nic,model=pcnet -net
 user,tftp=.,bootfile=10M -net dump,file=foo -enable-kvm
 time qemu -monitor stdio -boot n -net nic,model=e1000 -net
 user,tftp=.,bootfile=10M -net dump,file=foo
 time qemu -monitor stdio -boot n -net nic,model=pcnet -net
 user,tftp=.,bootfile=10M -net dump,file=foo
 
 All times are ~10s.

Yeah, you're using the internal tftp server.


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Anthony Liguori

On 04/11/2011 04:08 AM, Avi Kivity wrote:

On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:

  vmstate_hpet_timer = {
VMSTATE_UINT64(fsb, HPETTimer),
VMSTATE_UINT64(period, HPETTimer),
VMSTATE_UINT8(wrap_flag, HPETTimer),
  + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
  + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
  + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
  + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
  + VMSTATE_UINT32_V(divisor, HPETTimer, 3),

  We ought to be able to use a subsection keyed off of whether any 
ticks

  are currently accumulated, no?


Anthony,

I'm not sure if I understand your question correctly. Are you suggesting
to migrate the driftfix-related state conditionally / only if there are
any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?

The size of the driftfix-related state is 28 bytes per timer and we have
32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
Hence, unconditional migration of the driftfix-related state should not
cause significant additional overhead.



It's not about overhead.


Maybe I missed something. Could you please explain which benefit you see
in using a subsection ?


In the common case of there being no drift, you can migrate from a 
qemu that supports driftfix to a qemu that doesn't.




Right, subsections are a trick.  The idea is that when you introduce new 
state for a device model that is not always going to be set, then when 
you do the migration, you detect whether the state is set or not, and if 
it's not set, instead of sending empty versions of that state (i.e. 
missed_ticks=0) you just don't send the new state at all.


This means that you can migrate to an older version of QEMU provided the 
migration would work correctly.
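
The subsection trick Anthony describes can be sketched with a reduced model. The names below (DriftState, driftfix_needed) are invented for illustration; QEMU's actual mechanism is a VMStateSubsection list on the VMStateDescription, whose `.needed` callback decides whether the subsection is emitted:

```c
#include <assert.h>
#include <stdint.h>

/* Invented stand-in for the driftfix state discussed in this thread. */
typedef struct DriftState {
    uint64_t ticks_not_accounted;
    uint32_t irqs_to_inject;
} DriftState;

/* Plays the role of a subsection's .needed callback: the subsection is
 * sent only when the optional state carries real information. */
static int driftfix_needed(const DriftState *s)
{
    return s->ticks_not_accounted != 0 || s->irqs_to_inject != 0;
}
```

With such a predicate, a source QEMU that supports driftfix but has no accumulated drift sends no subsection at all, so an older destination that has never heard of the new fields can still accept the stream - the migration property Avi points out above.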


Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Anthony Liguori

On 04/11/2011 03:24 AM, Ulrich Obergfell wrote:

   typedef struct HPETState {
@@ -248,7 +253,7 @@ static int hpet_post_load(void *opaque, int
version_id)

   static const VMStateDescription vmstate_hpet_timer = {
   .name = "hpet_timer",
- .version_id = 1,
+ .version_id = 3,

Why jump from 1 to 3?


   .minimum_version_id = 1,
   .minimum_version_id_old = 1,
   .fields = (VMStateField []) {
@@ -258,6 +263,11 @@ static const VMStateDescription
vmstate_hpet_timer = {
   VMSTATE_UINT64(fsb, HPETTimer),
   VMSTATE_UINT64(period, HPETTimer),
   VMSTATE_UINT8(wrap_flag, HPETTimer),
+ VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
+ VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
+ VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
+ VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
+ VMSTATE_UINT32_V(divisor, HPETTimer, 3),


Anthony,

I incremented the version ID of 'vmstate_hpet' from 2 to 3 to make sure
that migrations from a QEMU process that is capable of 'driftfix' to a
QEMU process that is _not_ capable of 'driftfix' will fail. I assigned
version ID 3 to 'vmstate_hpet_timer' and to the new fields in there too
to indicate that adding those fields was the reason why the version ID
of 'vmstate_hpet' was incremented to 3.

As far as the flow of execution in vmstate_load_state() is concerned, I
think it does not matter whether the version ID of 'vmstate_hpet_timer'
and the new fields in there is 2 or 3 (as long as they are consistent).
When the 'while (field->name)' loop in vmstate_load_state() gets to the
following field in 'vmstate_hpet' ...

 VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
 vmstate_hpet_timer, HPETTimer),

... it calls itself recursively ...

     if (field->flags & VMS_STRUCT) {
         ret = vmstate_load_state(f, field->vmsd, addr, 
field->vmsd->version_id);

'field->vmsd->version_id' is the version ID of 'vmstate_hpet_timer' [1].
Hence 'vmstate_hpet_timer.version_id' is being checked against itself ...

     if (version_id > vmsd->version_id) {
 return -EINVAL;
 }

... and the version IDs of the new fields are also being checked against
'vmstate_hpet_timer.version_id' ...

     if ((field->field_exists &&
          field->field_exists(opaque, version_id)) ||
         (!field->field_exists &&
          field->version_id <= version_id)) {

If you want me to change the version ID of 'vmstate_hpet_timer' and the
new fields in there from 3 to 2, I can do that.


It avoids surprises so I think it's a reasonable thing to do.  But yes, 
your analysis is correct.


Regards,

Anthony Liguori



Regards,

Uli


[1] Ref.: commit fa3aad24d94a6cf894db52d83f72a399324a17bb




Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread David Ahern


On 04/11/11 03:40, ya su wrote:
 David:
 
 I have applied the patch to 0.14.0, and there is a bug if I add an
 Optiarc CRRWDVD CRX890A USB device on Windows XP. I first commented out
 the following code in usb-linux.c:
 
 if (is_halted(s, p->devep)) {
     ret = ioctl(s->fd, USBDEVFS_CLEAR_HALT, &urb->endpoint);
 #if 0
     if (ret < 0) {
         DPRINTF("husb: failed to clear halt. ep 0x%x errno %d\n",
                 urb->endpoint, errno);
         return USB_RET_NAK;
     }
 #endif
     clear_halt(s, p->devep);
 }
 
  Then it can continue to run on Linux, but it still stalls on Windows
 XP and Win7. I turned on debug; part of the output is as follows:

The EHCI code is very rough and needs someone to step up and finish it.

It seems to work ok for USB storage devices (keys and drives), and seems
to work fine with printers and scanners (at least it works with mine
;-)). I see stalls from time to time, but it recovers and continues on.
Clearly some touchups are needed.

On the other hand, it is known not to work with any audio and video
devices (webcams, iphones).

Something like the DVD, I have no idea - I've never tried.

I lost momentum on the code last August and have not been able to get
back to it for a variety of reasons. It really needs someone to pick it
up and continue - or look at adding xhci code which might be a better
solution for virtualization.

David


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Glauber Costa
On Mon, 2011-04-11 at 08:10 -0500, Anthony Liguori wrote:
 On 04/11/2011 04:08 AM, Avi Kivity wrote:
  On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:
vmstate_hpet_timer = {
  VMSTATE_UINT64(fsb, HPETTimer),
  VMSTATE_UINT64(period, HPETTimer),
  VMSTATE_UINT8(wrap_flag, HPETTimer),
+ VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
+ VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
+ VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
+ VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
+ VMSTATE_UINT32_V(divisor, HPETTimer, 3),
  
We ought to be able to use a subsection keyed off of whether any 
  ticks
are currently accumulated, no?
 
 
  Anthony,
 
  I'm not sure if I understand your question correctly. Are you suggesting
  to migrate the driftfix-related state conditionally / only if there are
  any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?
 
  The size of the driftfix-related state is 28 bytes per timer and we have
  32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
  maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
  Hence, unconditional migration of the driftfix-related state should not
  cause significant additional overhead.
 
 
  It's not about overhead.
 
  Maybe I missed something. Could you please explain which benefit you see
  in using a subsection ?
 
  In the common case of there being no drift, you can migrate from a 
  qemu that supports driftfix to a qemu that doesn't.
 
 
 Right, subsections are a trick.  The idea is that when you introduce new 
 state for a device model that is not always going to be set, when you do 
 the migration, you detect whether the state is set or not and if it's 
 not set, instead of sending empty versions of that state (i.e. 
 missed_ticks=0) you just don't send the new state at all.
 
 This means that you can migrate to an older version of QEMU provided the 
 migration would work correctly.

Using subsections and testing for the hpet option being disabled vs. enabled
is fine. But checking for the existence of drift, as you suggested (or
at least as I understood you), is very tricky. It is expected to change
many times during the guest's lifetime, and would make our migration
predictability something Heisenberg would be proud of.



Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Avi Kivity

On 04/11/2011 04:39 PM, Glauber Costa wrote:

On Mon, 2011-04-11 at 08:10 -0500, Anthony Liguori wrote:
  On 04/11/2011 04:08 AM, Avi Kivity wrote:
On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:
   vmstate_hpet_timer = {
 VMSTATE_UINT64(fsb, HPETTimer),
 VMSTATE_UINT64(period, HPETTimer),
 VMSTATE_UINT8(wrap_flag, HPETTimer),
   + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
   + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
   + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
   + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
   + VMSTATE_UINT32_V(divisor, HPETTimer, 3),

   We ought to be able to use a subsection keyed off of whether any
ticks
   are currently accumulated, no?
  
  
Anthony,
  
I'm not sure if I understand your question correctly. Are you suggesting
to migrate the driftfix-related state conditionally / only if there are
any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?
  
The size of the driftfix-related state is 28 bytes per timer and we have
32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
Hence, unconditional migration of the driftfix-related state should not
cause significant additional overhead.
  
  
It's not about overhead.
  
Maybe I missed something. Could you please explain which benefit you see
in using a subsection ?
  
In the common case of there being no drift, you can migrate from a
qemu that supports driftfix to a qemu that doesn't.
  

  Right, subsections are a trick.  The idea is that when you introduce new
  state for a device model that is not always going to be set, when you do
  the migration, you detect whether the state is set or not and if it's
  not set, instead of sending empty versions of that state (i.e.
  missed_ticks=0) you just don't send the new state at all.

  This means that you can migrate to an older version of QEMU provided the
  migration would work correctly.

Using subsections and testing for hpet option being disabled vs enabled,
is fine. But checking for the existence of drift, like you suggested (or
at least how I understood you), is very tricky. It is expected to change
many times during guest lifetime, and would make our migration
predictability something Heisenberg would be proud of.


First, I'd expect no drift under normal circumstances, at least without 
overcommit.  We may also allow a small amount of drift to pass migration 
(we lost time during the last phase anyway).


Second, the problem only occurs on new-old migrations.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Anthony Liguori

On 04/11/2011 08:39 AM, Glauber Costa wrote:

On Mon, 2011-04-11 at 08:10 -0500, Anthony Liguori wrote:

On 04/11/2011 04:08 AM, Avi Kivity wrote:

On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:

  vmstate_hpet_timer = {
VMSTATE_UINT64(fsb, HPETTimer),
VMSTATE_UINT64(period, HPETTimer),
VMSTATE_UINT8(wrap_flag, HPETTimer),
  + VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
  + VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
  + VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
  + VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
  + VMSTATE_UINT32_V(divisor, HPETTimer, 3),

  We ought to be able to use a subsection keyed off of whether any ticks
  are currently accumulated, no?


Anthony,

I'm not sure if I understand your question correctly. Are you suggesting
to migrate the driftfix-related state conditionally / only if there are
any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?

The size of the driftfix-related state is 28 bytes per timer and we have
32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
Hence, unconditional migration of the driftfix-related state should not
cause significant additional overhead.


It's not about overhead.


Maybe I missed something. Could you please explain which benefit you see
in using a subsection?

In the common case of there being no drift, you can migrate from a
qemu that supports driftfix to a qemu that doesn't.


Right, subsections are a trick.  The idea is that when you introduce new
state for a device model that is not always going to be set, when you do
the migration, you detect whether the state is set or not and if it's
not set, instead of sending empty versions of that state (i.e.
missed_ticks=0) you just don't send the new state at all.

This means that you can migrate to an older version of QEMU provided the
migration would work correctly.

Using subsections and testing for the hpet option being disabled vs. enabled
is fine. But checking for the existence of drift, like you suggested (or
at least how I understood you), is very tricky. It is expected to change
many times during guest lifetime, and would make our migration
predictability something Heisenberg would be proud of.


Is this true?  I would expect it to be very tied to workloads.  For idle 
workloads, you should never have accumulated missed ticks whereas with 
heavy workloads, you always will have accumulated ticks.


Is that not correct?

Regards,

Anthony Liguori

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v2 3/5] hpet 'driftfix': add fields to HPETTimer and VMStateDescription

2011-04-11 Thread Glauber Costa
On Mon, 2011-04-11 at 08:47 -0500, Anthony Liguori wrote:
 On 04/11/2011 08:39 AM, Glauber Costa wrote:
  On Mon, 2011-04-11 at 08:10 -0500, Anthony Liguori wrote:
  On 04/11/2011 04:08 AM, Avi Kivity wrote:
  On 04/11/2011 12:06 PM, Ulrich Obergfell wrote:
vmstate_hpet_timer = {
  VMSTATE_UINT64(fsb, HPETTimer),
  VMSTATE_UINT64(period, HPETTimer),
  VMSTATE_UINT8(wrap_flag, HPETTimer),
+ VMSTATE_UINT64_V(saved_period, HPETTimer, 3),
+ VMSTATE_UINT64_V(ticks_not_accounted, HPETTimer, 3),
+ VMSTATE_UINT32_V(irqs_to_inject, HPETTimer, 3),
+ VMSTATE_UINT32_V(irq_rate, HPETTimer, 3),
+ VMSTATE_UINT32_V(divisor, HPETTimer, 3),
We ought to be able to use a subsection keyed off of whether any ticks
are currently accumulated, no?
 
  Anthony,
 
  I'm not sure if I understand your question correctly. Are you suggesting
  to migrate the driftfix-related state conditionally / only if there are
  any ticks accumulated in 'ticks_not_accounted' and 'irqs_to_inject' ?
 
  The size of the driftfix-related state is 28 bytes per timer and we have
  32 timers per HPETState, i.e. 896 additional bytes per HPETState. With a
  maximum number of 8 HPET blocks (HPETState), this amounts to 7168 bytes.
  Hence, unconditional migration of the driftfix-related state should not
  cause significant additional overhead.
 
  It's not about overhead.
 
  Maybe I missed something. Could you please explain which benefit you see
  in using a subsection?
  In the common case of there being no drift, you can migrate from a
  qemu that supports driftfix to a qemu that doesn't.
 
  Right, subsections are a trick.  The idea is that when you introduce new
  state for a device model that is not always going to be set, when you do
  the migration, you detect whether the state is set or not and if it's
  not set, instead of sending empty versions of that state (i.e.
  missed_ticks=0) you just don't send the new state at all.
 
  This means that you can migrate to an older version of QEMU provided the
  migration would work correctly.
  Using subsections and testing for the hpet option being disabled vs. enabled
  is fine. But checking for the existence of drift, like you suggested (or
  at least how I understood you), is very tricky. It is expected to change
  many times during guest lifetime, and would make our migration
  predictability something Heisenberg would be proud of.
 
 Is this true?  I would expect it to be very tied to workloads.  For idle 
 workloads, you should never have accumulated missed ticks whereas with 
 heavy workloads, you always will have accumulated ticks.
 
 Is that not correct?
Yes, it is, but we lose a lot of reliability by tying migration to the
workload. Given that we still have to start qemu the same way on both
sides, we end up with a situation in which at time t, migration is
possible, and at time t+1 migration is not.

I'd rather have subsections enabled at all times when the option to
allow driftfix is enabled.
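For reference, the subsection mechanism under discussion could be sketched roughly like this. This is a simplified, self-contained sketch, not the actual patch: the struct stands in for the real HPETTimer, and the VMState wiring shown in the comment is illustrative.

```c
#include <stdint.h>

/* Illustrative stand-in for the driftfix fields added by the patch. */
typedef struct HPETTimer {
    uint64_t saved_period;
    uint64_t ticks_not_accounted;
    uint32_t irqs_to_inject;
    uint32_t irq_rate;
    uint32_t divisor;
} HPETTimer;

/*
 * A ".needed"-style predicate: send the driftfix subsection only when
 * drift state is actually pending.  In QEMU this would be hooked into a
 * VMStateDescription subsection, roughly:
 *
 *     static const VMStateDescription vmstate_hpet_driftfix = {
 *         .name   = "hpet_timer/driftfix",        // name is illustrative
 *         .fields = (VMStateField[]) { ... driftfix fields ... },
 *     };
 *
 * so that a guest with no accumulated drift can still migrate to an
 * older QEMU that knows nothing about these fields.
 */
int hpet_driftfix_needed(const HPETTimer *t)
{
    return t->ticks_not_accounted != 0 || t->irqs_to_inject != 0;
}
```

With such a predicate an idle guest sends no subsection and stays migratable to a driftfix-unaware QEMU; only a guest with a pending backlog carries the extra state.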



Re: EuroSec'11 Presentation

2011-04-11 Thread Anthony Liguori

On 04/11/2011 03:51 AM, Stefan Hajnoczi wrote:

I'm happy to hear your comments.
The referee's comment was severe. It said there was no brand-new
point, but there were real attack experiences.  My paper just
evaluated the detection on apache2 and sshd on a Linux guest OS and
Firefox and IE6 on a Windows guest OS.

If I have a VM on the same physical host as someone else I may be able
to determine which programs and specific versions they are currently
running.

Is there some creative attack using this technique that I'm missing?
I don't see many serious threats.


It's a deviation of a previously demonstrated attack where memory access 
timing is used to guess memory content.  This has been demonstrated in 
the past to be a viable technique to reduce the keyspace of things like 
ssh keys, which makes an attack a bit easier.


But it's a well known issue with colocation and the attack can be 
executed just by looking at raw memory access time (to guess whether 
another process brought something into the cache).


Regards,

Anthony Liguori


Stefan


Re: EuroSec'11 Presentation

2011-04-11 Thread Kuniyasu Suzaki

Stefan,

From: Stefan Hajnoczi stefa...@gmail.com
Subject: Re: EuroSec'11 Presentation
Date: Mon, 11 Apr 2011 09:51:42 +0100

 On Sun, Apr 10, 2011 at 4:19 PM, Kuniyasu Suzaki k.suz...@aist.go.jp wrote:
 
  From: Avi Kivity a...@redhat.com
  Subject: Re: EuroSec'11 Presentation
  Date: Sun, 10 Apr 2011 17:49:52 +0300
 
  On 04/10/2011 05:23 PM, Kuniyasu Suzaki wrote:
   Dear,
  
   I made a presentation about a memory disclosure attack on KSM (Kernel
   Samepage Merging) with KVM at EuroSec 2011.
   The title is Memory Deduplication as a Threat to the Guest OS.
      http://www.iseclab.org/eurosec-2011/program.html
  
   The slide is downloadable.
      
   http://www.slideshare.net/suzaki/eurosec2011-slide-memory-deduplication
   The paper will be downloadable from the ACM Digital Library.
  
   Please tell me, if you have comments. Thank you.
 
  Very interesting presentation.  It seems every time you share something,
  it becomes a target for attacks.
 
  I'm happy to hear your comments.
  The referee's comment was severe. It said there was no brand-new
  point, but there were real attack experiences.  My paper just
  evaluated the detection on apache2 and sshd on a Linux guest OS and
  Firefox and IE6 on a Windows guest OS.
 
 If I have a VM on the same physical host as someone else I may be able
 to determine which programs and specific versions they are currently
 running.
 
 Is there some creative attack using this technique that I'm missing?
 I don't see many serious threats.

The memory disclosure attack is assumed to be applied to cloud computing,
which offers multi-tenancy. Even if an application has a vulnerability,
an attacker can find and attack it.
As I show in my slides, IE6 is an example.

The situation resembles the cross-VM side channel attack mentioned in the
CCS'10 paper "Hey, You, Get Off of My Cloud".


Kuniyasu Suzaki



Re: EuroSec'11 Presentation

2011-04-11 Thread Kuniyasu Suzaki

Anthony,

From: Anthony Liguori anth...@codemonkey.ws
Subject: Re: EuroSec'11 Presentation
Date: Mon, 11 Apr 2011 10:27:27 -0500

 On 04/11/2011 03:51 AM, Stefan Hajnoczi wrote:
  I'm happy to hear your comments.
  The referee's comment was severe. It said there was no brand-new
  point, but there were real attack experiences.  My paper just
  evaluated the detection on apache2 and sshd on a Linux guest OS and
  Firefox and IE6 on a Windows guest OS.
  If I have a VM on the same physical host as someone else I may be able
  to determine which programs and specific versions they are currently
  running.
 
  Is there some creative attack using this technique that I'm missing?
  I don't see many serious threats.
 
 It's a deviation of a previously demonstrated attack where memory access 
 timing is used to guess memory content.  This has been demonstrated in 
 the past to be a viable technique to reduce the keyspace of things like 
  ssh keys, which makes an attack a bit easier.
 
 But it's a well known issue with colocation and the attack can be 
 executed just by looking at raw memory access time (to guess whether 
 another process brought something into the cache).

Thank you for the comments.
The memory disclosure attack can be prevented in several ways mentioned in my
Countermeasure slide (page 22).

If we limit KSM to READ-ONLY pages, we can detect and prevent the attack.
I also think most memory deduplication is on READ-ONLY pages.

--
Kuniysu Suzaki



Re: EuroSec'11 Presentation

2011-04-11 Thread Avi Kivity

On 04/11/2011 06:46 PM, Kuniyasu Suzaki wrote:


  But it's a well known issue with colocation and the attack can be
  executed just by looking at raw memory access time (to guess whether
  another process brought something into the cache).

Thank you for the comments.
The memory disclosure attack can be prevented in several ways mentioned in my
Countermeasure slide (page 22).

If we limit KSM to READ-ONLY pages, we can detect and prevent the attack.
I also think most memory deduplication is on READ-ONLY pages.



With EPT or NPT you cannot detect if a page is read only.

Furthermore, at least Linux (without highmem) maps all of memory with a 
read/write mapping in addition to the per-process mapping, so no page is 
read-only.


--
error compiling committee.c: too many arguments to function



Re: EuroSec'11 Presentation

2011-04-11 Thread Kuniyasu Suzaki

From: Avi Kivity a...@redhat.com
Subject: Re: EuroSec'11 Presentation
Date: Mon, 11 Apr 2011 18:48:41 +0300

 On 04/11/2011 06:46 PM, Kuniyasu Suzaki wrote:
  
But it's a well known issue with colocation and the attack can be
executed just by looking at raw memory access time (to guess whether
another process brought something into the cache).
 
  Thank you for the comments.
  The memory disclosure attack can be prevented in several ways mentioned in my
  Countermeasure slide (page 22).

  If we limit KSM to READ-ONLY pages, we can detect and prevent the attack.
  I also think most memory deduplication is on READ-ONLY pages.
 
 
 With EPT or NPT you cannot detect if a page is read only.
 
 Furthermore, at least Linux (without highmem) maps all of memory with a 
 read/write mapping in addition to the per-process mapping, so no page is 
 read-only.

Unfortunately, yes. The Linux kernel maps all memory read/write.
I have met this problem already.
I have to find another OS which clearly separates read-only pages.

I also know the CPU cannot distinguish read-only pages.
However, if a VMM can trace CR3 and retrieve the page tables, we can
distinguish whether a page is read-only or not.
Yes, it is an academic interest.
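The CR3-tracing idea reduces to checking the R/W bit (bit 1) at every level of the 4-level x86-64 walk: a mapping is effectively read-only if any level clears it. The toy model below operates on the four entries such a walk would return; it is a sketch of the classification step, not a real guest-memory walker, and the bit names follow the standard x86 PTE layout.

```c
#include <stdint.h>

#define PTE_P  (1ULL << 0)   /* present bit  */
#define PTE_RW (1ULL << 1)   /* writable bit */

/*
 * 'entries' holds the PML4E, PDPTE, PDE and PTE that a walk starting at
 * CR3 would visit for one virtual address.  The mapping is writable only
 * if every level has R/W set.  Returns 1 for read-only, 0 for writable,
 * -1 if the address is not mapped at all.
 */
int walk_is_readonly(const uint64_t entries[4])
{
    for (int i = 0; i < 4; i++) {
        if (!(entries[i] & PTE_P)) {
            return -1;   /* not present: no mapping */
        }
        if (!(entries[i] & PTE_RW)) {
            return 1;    /* R/W clear at some level: read-only */
        }
    }
    return 0;
}
```

Note that this only classifies one mapping: as pointed out earlier in the thread, the kernel's own linear mapping can still reach the same page read/write through a second mapping.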

--
suzaki



Re: [PATCHv2] fix regression caused by e48672fa25e879f7ae21785c7efd187738139593

2011-04-11 Thread Nikola Ciprich
Hello Zachary,
what is the current status, are you going to post this patch to Avi?
I'd like to see one (or both) in stable eventually, I think it's a good
candidate.
BR
nik


-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: EuroSec'11 Presentation

2011-04-11 Thread Stefan Hajnoczi
On Mon, Apr 11, 2011 at 4:27 PM, Anthony Liguori anth...@codemonkey.ws wrote:
 On 04/11/2011 03:51 AM, Stefan Hajnoczi wrote:

 I'm happy to hear your comments.
  The referee's comment was severe. It said there was no brand-new
  point, but there were real attack experiences.  My paper just
  evaluated the detection on apache2 and sshd on a Linux guest OS and
  Firefox and IE6 on a Windows guest OS.

 If I have a VM on the same physical host as someone else I may be able
 to determine which programs and specific versions they are currently
 running.

 Is there some creative attack using this technique that I'm missing?
 I don't see many serious threats.

 It's a deviation of a previously demonstrated attack where memory access
 timing is used to guess memory content.  This has been demonstrated in the
 past to be a viable technique to reduce the keyspace of things like ssh keys
 which makes an attack a bit easier.

How can you reduce the key space by determining whether the guest has
arbitrary 4 KB data in physical memory?

Stefan


KVM call agenda for April 12

2011-04-11 Thread Juan Quintela

Please, send in any agenda items you are interested in covering.

Later, Juan.


Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread Jan Kiszka
On 2011-04-11 15:23, David Ahern wrote:
 I lost momentum on the code last August and have not been able to get
 back to it for a variety of reasons. It really needs someone to pick it
 up and continue - or look at adding xhci code which might be a better
 solution for virtualization.

xHCI is on the way [1], but the code was not yet published AFAIK.

Jan

[1]
http://www.linuxtag.org/2011/de/program/freies-vortragsprogramm/popup/vortragsdetails.html?no_cache=1&talkid=103

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread Jan Kiszka
On 2011-04-11 15:23, David Ahern wrote:
 I lost momentum on the code last August and have not been able to get
 back to it for a variety of reasons. It really needs someone to pick it
 up and continue - or look at adding xhci code which might be a better
 solution for virtualization.

xHCI is on the way [1], but the code was not yet published AFAIK.

Jan

[1]
http://www.linuxtag.org/2011/de/program/freies-vortragsprogramm/popup/vortragsdetails.html?no_cache=1&talkid=103

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [Qemu-devel] [PATCH 2/2 V7] qemu,qmp: add inject-nmi qmp command

2011-04-11 Thread Blue Swirl
On Mon, Apr 11, 2011 at 10:01 AM, Markus Armbruster arm...@redhat.com wrote:
 Avi Kivity a...@redhat.com writes:

 On 04/08/2011 12:41 AM, Anthony Liguori wrote:

 And it's a good thing to have, but exposing this as the only API to
 do something as simple as generating a guest crash dump is not the
 friendliest thing in the world to do to users.

 nmi is a fine name for something that corresponds to a real-life nmi
 button (often labeled NMI).

 Agree.

We could also introduce an alias mechanism for user-friendly names, so
nmi could be used in addition to the full path. Aliases could be useful
for device paths as well.
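Such an alias mechanism might amount to no more than a lookup table consulted before path resolution. Everything below is hypothetical: neither the alias names nor the device paths exist in QEMU, they only illustrate the shape of the idea.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table mapping friendly names to full device paths. */
struct alias {
    const char *name;
    const char *path;
};

static const struct alias aliases[] = {
    { "nmi", "/machine/nmi-button" },   /* made-up example path */
};

/*
 * Resolve a friendly name to a path.  Unknown names fall through
 * unchanged so that full paths keep working as before.
 */
const char *resolve_alias(const char *name)
{
    for (size_t i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
        if (strcmp(aliases[i].name, name) == 0) {
            return aliases[i].path;
        }
    }
    return name;
}
```

The fall-through keeps the alias layer purely additive: existing tooling that already passes full paths is unaffected.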


Re: EuroSec'11 Presentation

2011-04-11 Thread Anthony Liguori

On 04/11/2011 10:46 AM, Kuniyasu Suzaki wrote:



But it's a well known issue with colocation and the attack can be
executed just by looking at raw memory access time (to guess whether
another process brought something into the cache).

Thank you for the comments.
The memory disclosure attack can be prevented in several ways mentioned in my
Countermeasure slide (page 22).


Not to be discouraging, but this class of attacks (side channel 
information disclosures) is very well known and very well documented.


Side channel attacks are extremely difficult to use from a practical 
perspective.  First, you have to know that your target is colocated with 
you and that you are actually sharing a resource.  Second, you have to 
be able to exploit the additional information you've gathered.


This type of attack is just as applicable to any multi-user environment 
and is not at all unique to virtualization.



If we limit KSM to READ-ONLY pages, we can detect and prevent the attack.
I also think most memory deduplication is on READ-ONLY pages.


There's really no point in worrying about these sorts of things.  
Either you're not going to colocate, you'll colocate and do the best you 
can with what the hardware provides (socket isolation, no KSM, etc.), or 
you're not going to worry about these types of things.


Again, it is extremely difficult to use side channel information 
disclosures to actually exploit anything.  If you are worried about this 
level of security, you shouldn't be using x86 hardware as more advanced 
hardware has more rigorous support for protecting against these sort of 
things.


Regards,

Anthony Liguori


--
Kuniysu Suzaki





Re: EuroSec'11 Presentation

2011-04-11 Thread Anthony Liguori

On 04/11/2011 11:25 AM, Stefan Hajnoczi wrote:

On Mon, Apr 11, 2011 at 4:27 PM, Anthony Liguorianth...@codemonkey.ws  wrote:


It's a deviation of a previously demonstrated attack where memory access
timing is used to guess memory content.  This has been demonstrated in the
past to be a viable technique to reduce the keyspace of things like ssh keys
which makes an attack a bit easier.

How can you reduce the key space by determining whether the guest has
arbitrary 4 KB data in physical memory?


I'm not sure that you can.  But the way the cache timing attack worked 
is that by doing a cache timing analysis in another process that's 
sharing the cache with a process doing key generation, you can make 
predictions about the paths taken by the key generation code, which lets 
you narrow down the key space.


Of course, even this is extremely hard to exploit because you need to 
happen to be sharing a cache with something that's doing ssh key 
generation, you have to know when it starts and when it ends, and you 
have to know exactly what version of ssh is running.  And even then, 
it just reduces the time needed to brute force.  It still takes a long 
time.


I think knowing whether a 4kb page is shared by some other guest in the 
system is so little information that I don't see what you could 
practically do with it that can't already be done via a cache timing attack.
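For comparison, the deduplication channel works on write latency rather than cache hits: the first write to a KSM-merged page takes a copy-on-write fault plus a page copy, so it is markedly slower than a write to an exclusive page. The probe then reduces to a calibrated threshold test. The function and the 5x factor below are illustrative, not taken from the paper; a real probe would calibrate the threshold per host and repeat measurements to filter noise.

```c
#include <stdint.h>

/*
 * Classify one probe: write_ns is the measured latency of the first
 * write to the candidate page, baseline_ns the latency of writing an
 * exclusive (non-shared) page.  A COW break costs a page fault plus a
 * full page copy, so a large multiple of the baseline hints that the
 * page had been deduplicated with another guest's identical page.
 */
int looks_deduplicated(uint64_t write_ns, uint64_t baseline_ns)
{
    /* 5x is an arbitrary illustrative threshold. */
    return write_ns > 5 * baseline_ns;
}
```

This is what makes the channel so coarse: a single noisy measurement flips the verdict, which is why it leaks little beyond "this exact 4 KB page exists elsewhere on the host".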


Regards,

Anthony Liguori


Stefan




Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread David Ahern


On 04/11/11 10:46, Jan Kiszka wrote:
 On 2011-04-11 15:23, David Ahern wrote:
 I lost momentum on the code last August and have not been able to get
 back to it for a variety of reasons. It really needs someone to pick it
 up and continue - or look at adding xhci code which might be a better
 solution for virtualization.
 
 xHCI is on the way [1], but the code was not yet published AFAIK.
 
 Jan
 
 [1]
 http://www.linuxtag.org/2011/de/program/freies-vortragsprogramm/popup/vortragsdetails.html?no_cache=1talkid=103
 

Interesting. And will it be released / submitted to qemu for inclusion?

David


Host crash

2011-04-11 Thread FinnTux
Hello,

I ran into a crash today while I tried to log into one of my servers.
I noticed ssh didn't respond at all. All virtual machines were running
ok though. I had one terminal open to the server and it was running ok
except ssh didn't work.

I'm not quite sure if this is kvm related but I was hoping you experts
could figure out what went wrong. Sorry for the noise if it is
something else.


My setup (two similar machines):
Asus P5K SE mobo
Quad Core Q6600
8 GB RAM
Several NICs for different networks
NICs:
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
02:00.0 Ethernet controller: Atheros Communications L1 Gigabit Ethernet (rev b0)
04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053
PCI-E Gigabit Ethernet Controller (rev 20)
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
06:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751
Gigabit Ethernet PCI Express (rev 21)
07:01.0 Ethernet controller: Intel Corporation 82540EM Gigabit
Ethernet Controller (rev 02)


Realteks are bonded for drbd sync
Marvell for my local net (using sk98lin driver)
Intel for internet connection
Broadcom for my SAN, not in use yet.

two vlans on my local net

Debian squeeze (all software stock squeeze except qemu-kvm)
kernel 2.6.32-5-amd64
drbd used for shared storage between hosts
qemu-kvm-0.14
home made script for starting and stopping virtual machines

Just let me know if you need more info.

Below is a cut from dmesg that I was able to save:

[210618.760363] [ cut here ]
[210618.760397] kernel BUG at
/build/buildd-linux-2.6_2.6.32-31-amd64-vrfdM4/linux-2.6-2.6.32/debian/build/source_amd64_none/mm/slub.c:2969!
[210618.760455] invalid opcode:  [#1] SMP
[210618.760489] last sysfs file:
/sys/devices/virtual/net/vlan240/statistics/tx_dropped
[210618.760542] CPU 3
[210618.760568] Modules linked in: nfs fscache ocfs2 jbd2 quota_tree
drbd lru_cache cn nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs
ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi ocfs2_dlmfs
ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue fuse
configfs bridge loop 8021q garp stp bonding kvm_intel kvm tun
snd_hda_codec_realtek snd_hda_intel snd_hda_codec nouveau snd_hwdep
ttm snd_pcm snd_timer drm_kms_helper snd drm soundcore i2c_i801
i2c_algo_bit serio_raw snd_page_alloc asus_atk0110 evdev i2c_core
pcspkr button processor ext3 jbd mbcache dm_mod raid1 raid0 md_mod
sd_mod crc_t10dif pata_marvell ata_generic tg3 ahci ata_piix libphy
uhci_hcd e1000 atl1 sk98lin libata scsi_mod ehci_hcd thermal
thermal_sys usbcore r8169 nls_base mii [last unloaded: scsi_wait_scan]
[210618.761141] Pid: 2588, comm: smartd Tainted: G   M
2.6.32-5-amd64 #1 P5K SE
[210618.761188] RIP: 0010:[810e723f]  [810e723f]
kfree+0x55/0xcb
[210618.761244] RSP: 0018:88022447baa8  EFLAGS: 00010246
[210618.761272] RAX: 02100068 RBX: 8801fdc35560 RCX:
015e
[210618.761320] RDX: 880207d68380 RSI: ea0007945700 RDI:
ea000700
[210618.761367] RBP: 8802 R08:  R09:
81455200
[210618.761413] R10: 0002 R11:  R12:
8110fe65
[210618.761460] R13: 880224d56a80 R14: 88022a49 R15:
880207d68380
[210618.761508] FS:  7f70fb4207e0() GS:880008d8()
knlGS:
[210618.761557] CS:  0010 DS:  ES:  CR0: 8005003b
[210618.761586] CR2: 7fa94149d6f0 CR3: 00022448e000 CR4:
26e0
[210618.761633] DR0:  DR1:  DR2:

[210618.761680] DR3:  DR6: 0ff0 DR7:
0400
[210618.761728] Process smartd (pid: 2588, threadinfo
88022447a000, task 88022ce92350)
[210618.761776] Stack:
[210618.761799]  8801fdc35560 880207d68380 
8110fe65
[210618.761839] 0 8801fdc35560 81110845 88020001
88022aca2350
[210618.761899] 0  880207d68380 
8118210a
[210618.764184] Call Trace:
[210618.764184]  [8110fe65] ? bio_free_map_data+0x15/0x1e
[210618.764184]  [81110845] ? bio_uncopy_user+0x47/0x59
[210618.764184]  [8118210a] ? blk_rq_unmap_user+0x1e/0x45
[210618.764184]  [811859e7] ? sg_io+0x37a/0x3d7
[210618.764184]  [81185f43] ? scsi_cmd_ioctl+0x217/0x3f4
[210618.764184]  [810f6145] ? path_to_nameidata+0x15/0x37
[210618.764184]  [a00a8b0c] ? sd_ioctl+0x9d/0xcb [sd_mod]
[210618.764184]  [81183915] ? __blkdev_driver_ioctl+0x69/0x7e
[210618.764184]  [81184110] ? blkdev_ioctl+0x7e6/0x836
[210618.764184]  [810bc307] ? release_pages+0x17b/0x18d
[210618.764184]  [810ff946] ? touch_atime+0x7c/0x127
[210618.764184]  

Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Luiz Capitulino
On Fri, 08 Apr 2011 19:50:57 -0500
Anthony Liguori anth...@codemonkey.ws wrote:

 On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
  Hi there,
 
  Summary:
 
- PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 minutes. 
  Got
  the problem with e1000, virtio and rtl8139. However, pcnet *works* (it's
  as fast as qemu-kvm.git)
 
- PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a minute. 
  Tried
  with e1000, virtio and rtl8139 (I don't remember if I tried with pcnet)
 
  I tried with qemu.git v0.13.0 in order to check if this was a regression, 
  but
  I got the same problem...
 
  Then I inspected qemu-kvm.git under the assumption that it could have a fix
  that wasn't commited to qemu.git. Found this:
 
- commit 0836b77f0f65d56d08bdeffbac25cd6d78267dc9 which is merge, works
 
- commit cc015e9a5dde2f03f123357fa060acbdfcd570a4 does not work (it's 
  slow)
 
  I tried a bisect, but it breaks due to gcc4 vs. gcc3 changes. Then I
  inspected commits manually, and found out that commit 64d7e9a4 doesn't
  work, which makes me think that the fix could be in the conflict
  resolution of 0836b77f, which makes me remember that I'm late for
  dinner, so my conclusions at this point are not reliable :)
 
 Can you run kvm_stat to see what the exit rates are?

Here you go, both collected after the VM is fully booted:

qemu.git:

efer_reload0 0
exits  15976719599
fpu_reload   203 0
halt_exits   54427
halt_wakeup0 0
host_state_reload 29985170
hypercalls 0 0
insn_emulation 13449597341
insn_emulation_fail0 0
invlpg  9687 0
io_exits   85979 0
irq_exits 162179 4
irq_injections 1158227
irq_window 2071227
largepages 0 0
mmio_exits  954541
mmu_cache_miss  5307 0
mmu_flooded 2493 0
mmu_pde_zapped  1188 0
mmu_pte_updated 5355 0
mmu_pte_write 181550 0
mmu_recycled   0 0
mmu_shadow_zapped   6437 0
mmu_unsync15 0
nmi_injections 0 0
nmi_window 0 0
pf_fixed   73983 0
pf_guest4027 0
remote_tlb_flush   1 0
request_irq6 0
signal_exits  135731 2
tlb_flush  26760 0

qemu-kvm.git:

efer_reload0 0
exits869724433
fpu_reload46 0
halt_exits   206 8
halt_wakeup7 0
host_state_reload 105173 8
hypercalls 0 0
insn_emulation   698411821
insn_emulation_fail0 0
invlpg  9682 0
io_exits  626201 0
irq_exits  22930 4
irq_injections  2815 8
irq_window  1029 0
largepages 0 0
mmio_exits  3657 0
mmu_cache_miss  5271 0
mmu_flooded 2466 0
mmu_pde_zapped  1146 0
mmu_pte_updated 5294 0
mmu_pte_write 191173 0
mmu_recycled   0 0
mmu_shadow_zapped   6405 0
mmu_unsync17 0
nmi_injections 0 0
nmi_window 0 0
pf_fixed   73580 0
pf_guest4169 0
remote_tlb_flush   1 0
request_irq0 0
signal_exits   24873 0
tlb_flush  26628 0

 
 Maybe we're missing a coalesced io in qemu.git?  It's also possible that 
 gpxe is hitting the apic or pit quite a lot.
 
 Regards,
 
 Anthony Liguori
 
  Ideas?
 



Re: [PATCH] kvm tools: Use mmap for working with disk image V2

2011-04-11 Thread Christoph Hellwig
How do you plan to handle I/O errors or ENOSPC conditions?  Note that
shared writeable mappings are by far the feature in the VM/FS code
that is most error-prone, including the impossibility of doing sensible
error handling.

The version that accidentally used MAP_PRIVATE actually makes a lot of
sense for an equivalent of qemu's snapshot mode where the image is
readonly and changes are kept private as long as the amount of modified
blocks is small enough to not kill the host VM, but using shared
writeable mappings just seems dangerous.
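The MAP_PRIVATE "snapshot" semantics described above can be sketched in a few lines. This is a minimal illustration, not kvm-tool code; the throwaway page-sized file and the function name are made up:

```python
# Minimal illustration of MAP_PRIVATE "snapshot" semantics: writes to
# the mapping land in private copy-on-write pages and never reach the
# image file.
import mmap
import os
import tempfile

def private_mapping_demo():
    fd, path = tempfile.mkstemp()
    try:
        os.pwrite(fd, b"A" * mmap.PAGESIZE, 0)        # stand-in disk image
        m = mmap.mmap(fd, mmap.PAGESIZE,
                      flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        m[0:1] = b"Z"                 # guest write hits a private copy
        on_disk = os.pread(fd, 1, 0)  # the file itself is untouched
        in_map = bytes(m[0:1])
        m.close()
        return on_disk, in_map
    finally:
        os.close(fd)
        os.unlink(path)

print(private_mapping_demo())  # -> (b'A', b'Z')
```

With MAP_SHARED instead, the dirtied pages would eventually be written back to the file, which is where the I/O-error and ENOSPC handling problems come in.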



Re: [PATCH] kvm tools: Use mmap for working with disk image V2

2011-04-11 Thread Pekka Enberg
On Mon, Apr 11, 2011 at 9:41 PM, Christoph Hellwig h...@infradead.org wrote:
 How do you plan to handle I/O errors or ENOSPC conditions?  Note that
 shared writeable mappings are by far the most error-prone feature in
 the VM/FS code, including the impossibility of doing sensible error
 handling.

Good point. I reverted the commit. Thanks!

On Mon, Apr 11, 2011 at 9:41 PM, Christoph Hellwig h...@infradead.org wrote:
 The version that accidentally used MAP_PRIVATE actually makes a lot of
 sense for an equivalent of qemu's snapshot mode where the image is
 readonly and changes are kept private as long as the amount of modified
 blocks is small enough to not kill the host VM, but using shared
 writeable mappings just seems dangerous.

Yup. Sasha, mind submitting a MAP_PRIVATE version that's enabled with a
'--snapshot' (or equivalent) command-line option?

Pekka


Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Alex Williamson
On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
 On Fri, 08 Apr 2011 19:50:57 -0500
 Anthony Liguori anth...@codemonkey.ws wrote:
 
  On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
   Hi there,
  
   Summary:
  
 - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
   minutes. Got
   the problem with e1000, virtio and rtl8139. However, pcnet *works* 
   (it's
   as fast as qemu-kvm.git)
  
 - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a minute. 
   Tried
   with e1000, virtio and rtl8139 (I don't remember if I tried with 
   pcnet)
  

I was having this problem too, but I think it's because I forgot to
build qemu with --enable-io-thread, which is the default for qemu-kvm.
Can you re-configure and build with that and see if it's fast?  Thanks,

Alex



Re: kvm tools: rhel6.0 guest hung during shutdown

2011-04-11 Thread Pekka Enberg
On Sun, 2011-04-10 at 16:58 +0300, Gleb Natapov wrote:
 On Sun, Apr 10, 2011 at 09:49:31PM +0800, Amos Kong wrote:
  System halted.
  [note: guest hung ...]
  
 Isn't that the expected result without ACPI support? I would expect all guests
 to hang like that at the end.

I see hangs with a Debian Squeeze image too, but not with the minimal QEMU
image I usually test things with. I wonder, though, why userspace
insists on using ACPI for shutdown as we boot with 'noapic'.

Pekka



Re: kvm tools: rhel6.0 guest hung during shutdown

2011-04-11 Thread Gleb Natapov
On Mon, Apr 11, 2011 at 10:01:30PM +0300, Pekka Enberg wrote:
 On Sun, 2011-04-10 at 16:58 +0300, Gleb Natapov wrote:
  On Sun, Apr 10, 2011 at 09:49:31PM +0800, Amos Kong wrote:
   System halted.
   [note: guest hung ...]
   
  Isn't that the expected result without ACPI support? I would expect all guests
  to hang like that at the end.
 
 I see hangs with Debian Squeeze image too but not with the minimal QEMU
 image I usually test things with. I wonder, though, why userspace
 insists on using ACPI for shutdown as we boot with 'noapic'.
 
There is no way to power down a PC from software without ACPI (maybe APM
has something, but I doubt kvm-tool implements it either). Do you remember
the Windows 95 'It is now safe to turn off your computer' screen?
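For reference, the ACPI mechanism in question boils down to a single register write: the guest puts SLP_TYPx | SLP_EN into the PM1a control register (SLP_TYP in bits 10-12, SLP_EN at bit 13, per the ACPI spec). A tiny sketch of the bit layout; the port address and S5 SLP_TYP value are chipset-specific and come from the DSDT's \_S5 package, so the value 0 below is only a placeholder:

```python
# Bit layout of an ACPI PM1 control write that requests a sleep state:
# SLP_TYP occupies bits 10-12 and SLP_EN is bit 13 (ACPI spec).
SLP_EN = 1 << 13

def pm1a_poweroff_value(slp_typ_s5):
    """Value a guest writes to PM1a_CNT to request S5 (power off)."""
    return ((slp_typ_s5 & 0x7) << 10) | SLP_EN

print(hex(pm1a_poweroff_value(0)))  # -> 0x2000
```

Emulating this register write (and raising a shutdown event on it) is essentially what a VMM has to implement for a guest-initiated power-off to work.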

--
Gleb.


Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Luiz Capitulino
On Mon, 11 Apr 2011 13:00:32 -0600
Alex Williamson alex.william...@redhat.com wrote:

 On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
  On Fri, 08 Apr 2011 19:50:57 -0500
  Anthony Liguori anth...@codemonkey.ws wrote:
  
   On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
Hi there,
   
Summary:
   
  - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
minutes. Got
the problem with e1000, virtio and rtl8139. However, pcnet *works* 
(it's
as fast as qemu-kvm.git)
   
  - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a 
minute. Tried
with e1000, virtio and rtl8139 (I don't remember if I tried with 
pcnet)
   
 
 I was having this problem too, but I think it's because I forgot to
 build qemu with --enable-io-thread, which is the default for qemu-kvm.
 Can you re-configure and build with that and see if it's fast?  Thanks,

Yes, nice catch, it's faster with I/O thread enabled, even seem faster
than qemu-kvm.git.

So, does this have to be fixed w/o I/O thread?


Re: kvm tools: rhel6.0 guest hung during shutdown

2011-04-11 Thread Cyrill Gorcunov
On 04/11/2011 11:07 PM, Gleb Natapov wrote:
 On Mon, Apr 11, 2011 at 10:01:30PM +0300, Pekka Enberg wrote:
 On Sun, 2011-04-10 at 16:58 +0300, Gleb Natapov wrote:
 On Sun, Apr 10, 2011 at 09:49:31PM +0800, Amos Kong wrote:
 System halted.
 [note: guest hung ...]

 Isn't that the expected result without ACPI support? I would expect all guests
 to hang like that at the end.

 I see hangs with Debian Squeeze image too but not with the minimal QEMU
 image I usually test things with. I wonder, though, why userspace
 insists on using ACPI for shutdown as we boot with 'noapic'.

 There is no way to power down a PC from software without ACPI (maybe APM
 has something, but I doubt kvm-tool implements it either). Do you remember
 the Windows 95 'It is now safe to turn off your computer' screen?

yup, iirc APM had some set power state entry point, but not sure, need to find 
docs ;)

 
 --
   Gleb.


-- 
Cyrill


Re: kvm tools: rhel6.0 guest hung during shutdown

2011-04-11 Thread Gleb Natapov
On Mon, Apr 11, 2011 at 11:28:22PM +0400, Cyrill Gorcunov wrote:
 On 04/11/2011 11:07 PM, Gleb Natapov wrote:
  On Mon, Apr 11, 2011 at 10:01:30PM +0300, Pekka Enberg wrote:
  On Sun, 2011-04-10 at 16:58 +0300, Gleb Natapov wrote:
  On Sun, Apr 10, 2011 at 09:49:31PM +0800, Amos Kong wrote:
  System halted.
  [note: guest hung ...]
 
  Isn't that the expected result without ACPI support? I would expect all guests
  to hang like that at the end.
 
  I see hangs with Debian Squeeze image too but not with the minimal QEMU
  image I usually test things with. I wonder, though, why userspace
  insists on using ACPI for shutdown as we boot with 'noapic'.
 
  There is no way to power down a PC from software without ACPI (maybe APM
  has something, but I doubt kvm-tool implements it either). Do you remember
  the Windows 95 'It is now safe to turn off your computer' screen?
 
 yup, iirc APM had some set power state entry point, but not sure, need to 
 find docs ;)
 
Just go for ACPI then. APM is dead.

--
Gleb.


Re: USB EHCI patch for 0.14.0?

2011-04-11 Thread Jan Kiszka
On 2011-04-11 19:53, David Ahern wrote:
 
 
 On 04/11/11 10:46, Jan Kiszka wrote:
 On 2011-04-11 15:23, David Ahern wrote:
 I lost momentum on the code last August and have not been able to get
 back to it for a variety of reasons. It really needs someone to pick it
 up and continue - or look at adding xhci code which might be a better
 solution for virtualization.

 xHCI is on the way [1], but the code was not yet published AFAIK.

 Jan

 [1]
 http://www.linuxtag.org/2011/de/program/freies-vortragsprogramm/popup/vortragsdetails.html?no_cache=1talkid=103

 
 interesting. And will it be released / submitted to qemu for inclusion?

I suppose so. But maybe Alex can tell more.

Jan



signature.asc
Description: OpenPGP digital signature


Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Alex Williamson
On Mon, 2011-04-11 at 22:04 +0200, Jan Kiszka wrote:
 On 2011-04-11 21:15, Luiz Capitulino wrote:
  On Mon, 11 Apr 2011 13:00:32 -0600
  Alex Williamson alex.william...@redhat.com wrote:
  
  On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
  On Fri, 08 Apr 2011 19:50:57 -0500
  Anthony Liguori anth...@codemonkey.ws wrote:
 
  On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
  Hi there,
 
  Summary:
 
- PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
  minutes. Got
  the problem with e1000, virtio and rtl8139. However, pcnet *works* 
  (it's
  as fast as qemu-kvm.git)
 
- PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a 
  minute. Tried
  with e1000, virtio and rtl8139 (I don't remember if I tried with 
  pcnet)
 
 
  I was having this problem too, but I think it's because I forgot to
  build qemu with --enable-io-thread, which is the default for qemu-kvm.
  Can you re-configure and build with that and see if it's fast?  Thanks,
  
  Yes, nice catch, it's faster with I/O thread enabled, even seem faster
  than qemu-kvm.git.
 
 What's the performance under qemu-kvm with -no-kvm-irqchip?
 
  
  So, does this have to be fixed w/o I/O thread?
 
 If it's most probably an architectural deficit of non-io-thread mode, I
 would say let it rest in peace. But maybe it points to a generic issue
 that is just magnified by non-threaded mode.

I've probably been told, but I forget.  Why isn't io-thread enabled by
default?  Thanks,

Alex




Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Jan Kiszka
On 2011-04-11 22:14, Alex Williamson wrote:
 On Mon, 2011-04-11 at 22:04 +0200, Jan Kiszka wrote:
 On 2011-04-11 21:15, Luiz Capitulino wrote:
 On Mon, 11 Apr 2011 13:00:32 -0600
 Alex Williamson alex.william...@redhat.com wrote:

 On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
 On Fri, 08 Apr 2011 19:50:57 -0500
 Anthony Liguori anth...@codemonkey.ws wrote:

 On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
 Hi there,

 Summary:

   - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
 minutes. Got
 the problem with e1000, virtio and rtl8139. However, pcnet *works* 
 (it's
 as fast as qemu-kvm.git)

   - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a 
 minute. Tried
 with e1000, virtio and rtl8139 (I don't remember if I tried with 
 pcnet)


 I was having this problem too, but I think it's because I forgot to
 build qemu with --enable-io-thread, which is the default for qemu-kvm.
 Can you re-configure and build with that and see if it's fast?  Thanks,

 Yes, nice catch, it's faster with I/O thread enabled, even seem faster
 than qemu-kvm.git.

 What's the performance under qemu-kvm with -no-kvm-irqchip?


 So, does this have to be fixed w/o I/O thread?

 If it's most probably an architectural deficit of non-io-thread mode, I
 would say let it rest in peace. But maybe it points to a generic issue
 that is just magnified by non-threaded mode.
 
 I've probably been told, but I forget.  Why isn't io-thread enabled by
 default?  Thanks,

TCG performance still sucks in io-threaded mode. I've three patches in
my queue that reduce the overhead a bit further - for me to a
reasonable level (I'll post them in the next few days). But, still, YMMV
depending on the workload.

At least Windows should no longer be a functional blocker thanks to
Paolo's work.

Jan





Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Jan Kiszka
On 2011-04-11 22:18, Jan Kiszka wrote:
 On 2011-04-11 22:14, Alex Williamson wrote:
 On Mon, 2011-04-11 at 22:04 +0200, Jan Kiszka wrote:
 On 2011-04-11 21:15, Luiz Capitulino wrote:
 On Mon, 11 Apr 2011 13:00:32 -0600
 Alex Williamson alex.william...@redhat.com wrote:

 On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
 On Fri, 08 Apr 2011 19:50:57 -0500
 Anthony Liguori anth...@codemonkey.ws wrote:

 On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
 Hi there,

 Summary:

   - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
 minutes. Got
 the problem with e1000, virtio and rtl8139. However, pcnet *works* 
 (it's
 as fast as qemu-kvm.git)

   - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a 
 minute. Tried
 with e1000, virtio and rtl8139 (I don't remember if I tried with 
 pcnet)


 I was having this problem too, but I think it's because I forgot to
 build qemu with --enable-io-thread, which is the default for qemu-kvm.
 Can you re-configure and build with that and see if it's fast?  Thanks,

 Yes, nice catch, it's faster with I/O thread enabled, even seem faster
 than qemu-kvm.git.

 What's the performance under qemu-kvm with -no-kvm-irqchip?


 So, does this have to be fixed w/o I/O thread?

 If it's most probably an architectural deficit of non-io-thread mode, I
 would say let it rest in peace. But maybe it points to a generic issue
 that is just magnified by non-threaded mode.

 I've probably been told, but I forget.  Why isn't io-thread enabled by
 default?  Thanks,
 
 TCG performance still sucks in io-threaded mode. I've three patches in
 my queue that reduce the overhead a bit further - for me to a
 reasonable level (I'll post them in the next few days). But, still, YMMV
 depending on the workload.

In fact, they were already prepared. So I've just sent them out.

Jan





Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Jan Kiszka
On 2011-04-11 23:05, Luiz Capitulino wrote:
 On Mon, 11 Apr 2011 22:04:52 +0200
 Jan Kiszka jan.kis...@web.de wrote:
 
 On 2011-04-11 21:15, Luiz Capitulino wrote:
 On Mon, 11 Apr 2011 13:00:32 -0600
 Alex Williamson alex.william...@redhat.com wrote:

 On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:
 On Fri, 08 Apr 2011 19:50:57 -0500
 Anthony Liguori anth...@codemonkey.ws wrote:

 On 04/08/2011 06:25 PM, Luiz Capitulino wrote:
 Hi there,

 Summary:

   - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 
 minutes. Got
 the problem with e1000, virtio and rtl8139. However, pcnet *works* 
 (it's
 as fast as qemu-kvm.git)

   - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a 
 minute. Tried
 with e1000, virtio and rtl8139 (I don't remember if I tried with 
 pcnet)


 I was having this problem too, but I think it's because I forgot to
 build qemu with --enable-io-thread, which is the default for qemu-kvm.
 Can you re-configure and build with that and see if it's fast?  Thanks,

 Yes, nice catch, it's faster with I/O thread enabled, even seem faster
 than qemu-kvm.git.

 What's the performance under qemu-kvm with -no-kvm-irqchip?
 
 Still fast, 

I meant: is it even faster with unaccelerated userspace irqchip? I've
seen such effects with emulated NICs before.

 but just realized that qemu-kvm's configure says that I/O thread
 is disabled:
 
  IO thread no
 
 And it's fast..

That only means (so far) that the upstream io-thread code is disabled.
Qemu-kvm's own solution is enabled all the time, and you can't switch to
upstream anyway as both are incompatible. That's going to change soon
(hopefully) when we migrate qemu-kvm to the upstream version.

Jan





Re: [PATCH v2 1/2] rbd: use the higher level librbd instead of just librados

2011-04-11 Thread Josh Durgin

On 04/08/2011 01:43 AM, Stefan Hajnoczi wrote:

On Mon, Mar 28, 2011 at 04:15:57PM -0700, Josh Durgin wrote:

librbd stacks on top of librados to provide access
to rbd images.

Using librbd simplifies the qemu code, and allows
qemu to use new versions of the rbd format
with few (if any) changes.

Signed-off-by: Josh Durginjosh.dur...@dreamhost.com
Signed-off-by: Yehuda Sadehyeh...@hq.newdream.net
---
  block/rbd.c   |  785 +++--
  block/rbd_types.h |   71 -
  configure |   33 +--
  3 files changed, 221 insertions(+), 668 deletions(-)
  delete mode 100644 block/rbd_types.h


Hi Josh,
I have applied your patches onto qemu.git/master and am running
ceph.git/master.

Unfortunately qemu-iotests fails for me.


Test 016 seems to hang in qemu-io -g -c write -P 66 128M 512
rbd:rbd/t.raw.  I can reproduce this consistently.  Here is the
backtrace of the hung process (not consuming CPU, probably deadlocked):


This hung because it wasn't checking the return value of rbd_aio_write.
I've fixed this in the for-qemu branch of 
http://ceph.newdream.net/git/qemu-kvm.git. Also, the existing rbd 
implementation is not 'growable' - writing to a large offset will not 
expand the rbd image correctly. Should we implement bdrv_truncate to 
support this (librbd has a resize operation)? Is bdrv_truncate useful 
outside of qemu-img and qemu-io?
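The class of bug behind the hang can be illustrated abstractly. This is a toy model, not the librbd API; submit() and done() are made-up names:

```python
# Why an ignored async-submit return value hangs: if submission fails,
# the completion callback never runs and the caller waits forever on
# the "pending" count.
pending = 1

def done(_result):
    global pending
    pending -= 1

def submit(data, cb, fail):
    if fail:
        return -1       # error path: cb will never be invoked
    cb(0)
    return 0

ret = submit(b"write", done, fail=True)
if ret < 0:
    done(ret)           # account for the failed request instead of hanging
assert pending == 0
```

The fix in the for-qemu branch presumably follows this shape: a negative return from the submit call has to complete the request with an error rather than leave it outstanding.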



Test 008 failed with an assertion but succeeded when run again.  I think
this is a race condition:


This is likely a use-after-free, but I haven't been able to find the 
race condition yet (or reproduce it). Could you get a backtrace from the 
core file?


Thanks,
Josh


Re: Slow PXE boot in qemu.git (fast in qemu-kvm.git)

2011-04-11 Thread Anthony Liguori

On 04/11/2011 03:04 PM, Jan Kiszka wrote:

On 2011-04-11 21:15, Luiz Capitulino wrote:

On Mon, 11 Apr 2011 13:00:32 -0600
Alex Williamsonalex.william...@redhat.com  wrote:


On Mon, 2011-04-11 at 15:35 -0300, Luiz Capitulino wrote:

On Fri, 08 Apr 2011 19:50:57 -0500
Anthony Liguorianth...@codemonkey.ws  wrote:


On 04/08/2011 06:25 PM, Luiz Capitulino wrote:

Hi there,

Summary:

   - PXE boot in qemu.git (HEAD f124a41) is quite slow, more than 5 minutes. Got
 the problem with e1000, virtio and rtl8139. However, pcnet *works* (it's
 as fast as qemu-kvm.git)

   - PXE boot in qemu-kvm.git (HEAD df85c051) is fast, less than a minute. Tried
 with e1000, virtio and rtl8139 (I don't remember if I tried with pcnet)


I was having this problem too, but I think it's because I forgot to
build qemu with --enable-io-thread, which is the default for qemu-kvm.
Can you re-configure and build with that and see if it's fast?  Thanks,

Yes, nice catch, it's faster with I/O thread enabled, even seem faster
than qemu-kvm.git.

What's the performance under qemu-kvm with -no-kvm-irqchip?


So, does this have to be fixed w/o I/O thread?

If it's most probably an architectural deficit of non-io-thread mode, I
would say let it rest in peace. But maybe it points to a generic issue
that is just magnified by non-threaded mode.


If gpxe is spinning waiting for I/O to complete, that's going to prevent 
select from running until the next signal (timer event).
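A minimal sketch of that starvation, assuming a single thread shared by guest execution and a select()-based loop (illustrative only, not qemu's actual main loop):

```python
# With no separate I/O thread, the "vcpu" and the select()-based main
# loop share one thread, so I/O that became ready while the guest
# spins is not serviced until control returns to the loop (in qemu,
# via a signal such as a timer).
import select
import socket
import time

def run_iteration(spin_seconds):
    r, w = socket.socketpair()
    w.send(b"irq")                  # I/O became ready during guest execution
    t0 = time.monotonic()
    time.sleep(spin_seconds)        # stand-in for gpxe busy-waiting in the vcpu
    ready, _, _ = select.select([r], [], [], 0)
    latency = time.monotonic() - t0
    r.close()
    w.close()
    return bool(ready), latency

ready, latency = run_iteration(0.05)
print(ready, latency)               # the data sat unserviced for the whole spin
```

With a dedicated I/O thread, the select() loop would have picked up the ready descriptor while the vcpu was still spinning, which matches the observed speedup with --enable-io-thread.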


Regards,

Anthony Liguori


Jan





Re: [ANNOUNCE] Native Linux KVM tool

2011-04-11 Thread Andrea Arcangeli
On Sat, Apr 09, 2011 at 09:40:09AM +0200, Ingo Molnar wrote:
 
 * Andrea Arcangeli aarca...@redhat.com wrote:
 
  [...] I thought the whole point of a native kvm tool was to go all the 
  paravirt way to provide max performance and maybe also depend on vhost as 
  much as possible.

BTW, I should elaborate on the all the paravirt way: going 100%
paravirt isn't what I meant. I was thinking of the performance-critical
drivers mainly, like storage and network. The kvm tool could
be more hackable and evolve faster by exposing a single hardware view
to the Linux guest (using paravirt only where it improves
performance, like network/storage).

Whenever full emulation doesn't affect any fast path, it should be
preferred rather than inventing new paravirt interfaces for no
good reason.

That for example applies first and foremost to the EPT support which
is simpler and more optimal than any shadow paravirt pagetables. It'd
be a dead end to do it all in paravirt, performance-wise. I definitely
didn't mean any resemblance to lguest when I said full paravirt ;).
Sorry for the confusion.

 To me it's more than that: today i can use it to minimally boot test various 
 native bzImages just by typing:
 
   kvm run ./bzImage
 
 this will get me past most of the kernel init, up to the point where it would 
 try to mount user-space. ( That's rather powerful to me personally, as i 
 introduce most of my bugs to these stages of kernel bootup - and as a kernel 
 developer i'm not alone there ;-)
 
 I would be sad if i were forced to compile in some sort of paravirt support, 
 just to be able to boot-test random native kernel images.

 Really, if you check the code, serial console and timer support is not a big 
 deal complexity-wise and it is rather useful:

Agree with that.

 
   git pull git://github.com/penberg/linux-kvm master
 
 So i think up to a point hardware emulation is both fun to implement (it's
 fun to be on the receiving end of hw calls, for a change) and a no-brainer
 to have from a usability POV. How far it wants to go we'll see! :-)

About using the kvm tool as a debugging tool, I don't see the point,
though. It's very unlikely the kvm tool will ever be able to match
qemu's power and capabilities for debugging; in fact qemu lets you
do basic debugging of several device drivers too (e1000, IDE, etc.),
and it is mature in terms of monitor memory-inspection commands and
its gdbstub. If it's debugging you're after, adding more features to
the qemu monitor looks like a better way to go.

The only way I see this useful is to lead it into a full performance
direction, using paravirt whenever it saves CPU (like virtio-blk,
vhost-net) and allow it to scale to hundreds of CPUs doing I/O
simultaneously and get there faster than qemu. Now smp scaling with
qemu-kvm driver backends hasn't been a big issue according to Avi, so
it's not like we're under pressure from it, but clearly someday it may
become a bigger issue and having fewer drivers to deal with (especially
only having vhost-blk in userland with vhost-net already being in the
kernel) may provide an advantage in allowing a more performance
oriented implementation of the backends without breaking lots of
existing and valuable full-emulated drivers.

In terms of pure kernel debugging I'm afraid this will be a dead end, and
for the kernel testing you describe I think qemu-kvm will work best
already. We already have a simpler kvm support in qemu (vs qemu-kvm)
and we don't want a third that is even slower than qemu kvm support,
so it has to be faster than qemu-kvm or nothing IMHO :).


Re: KVM call agenda for April 12

2011-04-11 Thread Anthony Liguori

On 04/11/2011 11:35 AM, Juan Quintela wrote:

Please, send in any agenda items you are interested in covering.


I won't be able to attend.

Regards,

Anthony Liguori


Later, Juan.

