RE: PCI passthrough of TV-tuners with VT-d

2009-11-09 Thread Han, Weidong
Kenni Lund wrote:
> Hi
> 
> While considering upgrading my MythTV backend to a VT-d capable
> system, I'm trying to figure out if it would be possible to utilize
> PCI passthrough with KVM.
> 
> As far as I understand the wiki page on "How to assign devices with
> VT-d" [1] and various posts on mailing lists, then all conventional
> PCI devices should (hopefully) work, as long as they don't share an
> IRQ with another device, and IF they share a IRQ, the device must be
> MSI capable - is this assumption correct? I know that graphics cards

Correct.

> are special, but I'm only interested in TV tuners in my specific case.
> 
> I don't know if anyone has tested TV tuners on KVM, which perhaps
> makes it a hard question, but would you expect that a regular PCI TV
> Tuner with onboard MPEG2 encoder (not PCI Express, not MSI-capable,
> using a unique IRQ) would work?

It can avoid IRQ sharing limitation, and you can assign it to guest. We didn't 
try TV tuner passthrough. Anyway, you can have a try first.

Regards,
Weidong

> 
> If it makes any difference, the TV tuners are all Hauppauge
> PVR-150/250, recognized by lspci as "Multimedia video controller:
> Internext Compression Inc iTVC16 (CX23416) MPEG-2 Encoder".
> 
> Thanks in advance :)
> 
> Best Regards
> Kenni Lund
> 
> [1]
> http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI Passthrough and graphic cards ...

2009-11-05 Thread Han, Weidong
Basic graphics passthrough works in xen-unstable, we are working on iGFX 
passthrough now (Xen Client already has iGFX passthrough support). After 
complete it, we plan to do the same in KVM. Fede said he is porting xen code to 
kvm.I think you can also think how to implement it in kvm cleanly. Actually 
there are many aspects can be enhanced. For example, copy/load video BIOS to 
guest, easy use for various cases (primary/secondary graphics in host/guest), 
etc.

Regards,
Weidong

Jurgen Baier wrote:
> Is there anything I can do to help with this feature. It is probably
> one of the most sort after requests. I have been toying with the Xen
> component and would love to see this in KVM 
> 
> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
> Behalf Of Han, Weidong 
> Sent: Friday, 6 November 2009 11:37 AM
> To: 'Fede'; 'Asdo'
> Cc: 'kvm@vger.kernel.org'
> Subject: RE: PCI Passthrough and graphic cards ...
> 
> Fede wrote:
>> On Wed, Nov 4, 2009 at 11:00, Asdo  wrote:
>>> In case of a single video card in the system, it would be wonderful
>>> to be able to suddenly give the video card as passthrough to the
>>> guest removing it from the host. Since the video card internal state
>>> is unknown to the guest, the guest should then re-initialize it with
>>> some help of a guest driver. For example it could act like resuming
>>> the graphics from standby. Giving the video card back to the host
>>> would need some kind of a hotkey. While the video card is at the
>>> host, the guest graphics would be invisible (not windowed).
>>> Alternatively the guest could be frozen. 
>>> 
>>> Would that be anyhow feasible?
>>> 
>>> That would be just great for the people who use Windows for
>>> videogames 
>>> 
>>> Thank you
>>> Asdo
>> 
>> I'm currently working in this as a research project at university.
>> I'll have more information soon. But basically, the problem is that
>> graphic cards have BIOS. In order to make this work, graphic card
>> must be reseted, just like it happens whenever a computer is powered
>> up. 
>> 
> 
> Yes, need to reset the graphics card and re-execute video BIOS in
> guest. 
> 
> Regards,
> Weidong
> 
>> I'm currently porting xen patch into kvm.
>> 
>> If someone has more information, please, let me know.
>> 
>> Thanks
>> Federico
>> 
>>> 
>>> Han, Weidong wrote:
>>>> 
>>>> Passthrough graphic card to guest can satisfy your requirement.
>>>> Currently it's not supported in kvm, but we have a plan to support
>>>> it, but you still need wait for a while.
>>>> 
>>>> I think sharing the host graphics power is not easy to implement.
>>>> 
>>>> Regards,
>>>> Weidong
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI Passthrough and graphic cards ...

2009-11-05 Thread Han, Weidong
Fede wrote:
> On Wed, Nov 4, 2009 at 11:00, Asdo  wrote:
>> In case of a single video card in the system, it would be wonderful
>> to be able to suddenly give the video card as passthrough to the
>> guest removing it from the host. Since the video card internal state
>> is unknown to the guest, the guest should then re-initialize it with
>> some help of a guest driver. For example it could act like resuming
>> the graphics from standby. 
>> Giving the video card back to the host would need some kind of a
>> hotkey. While the video card is at the host, the guest graphics
>> would be invisible (not windowed). Alternatively the guest could be
>> frozen. 
>> 
>> Would that be anyhow feasible?
>> 
>> That would be just great for the people who use Windows for
>> videogames 
>> 
>> Thank you
>> Asdo
> 
> I'm currently working in this as a research project at university.
> I'll have more information soon. But basically, the problem is that
> graphic cards have BIOS. In order to make this work, graphic card must
> be reseted, just like it happens whenever a computer is powered up.
> 

Yes, need to reset the graphics card and re-execute video BIOS in guest.

Regards,
Weidong

> I'm currently porting xen patch into kvm.
> 
> If someone has more information, please, let me know.
> 
> Thanks
> Federico
> 
>> 
>> Han, Weidong wrote:
>>> 
>>> Passthrough graphic card to guest can satisfy your requirement.
>>> Currently it's not supported in kvm, but we have a plan to support
>>> it, but you still need wait for a while. 
>>> 
>>> I think sharing the host graphics power is not easy to implement.
>>> 
>>> Regards,
>>> Weidong
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI Passthrough and graphic cards ...

2009-11-04 Thread Han, Weidong
Asdo wrote:
> In case of a single video card in the system, it would be wonderful to
> be able to suddenly give the video card as passthrough to the guest
> removing it from the host. Since the video card internal state is
> unknown to the guest, the guest should then re-initialize it with some
> help of a guest driver. For example it could act like resuming the
> graphics from standby.
> Giving the video card back to the host would need some kind of a
> hotkey. While the video card is at the host, the guest graphics would
> be invisible (not windowed). Alternatively the guest could be frozen.
> 
> Would that be anyhow feasible?

I'm not sure. From passthrough point of view, it needs to re-assign the video 
card to the target guest or host which wants to own it. That's to say it needs 
to hotplug video card. don't know if it can work smoothly.

Regards,
Weidong

> 
> That would be just great for the people who use Windows for videogames
> 
> Thank you
> Asdo
> 
> Han, Weidong wrote:
>> Passthrough graphic card to guest can satisfy your requirement.
>> Currently it's not supported in kvm, but we have a plan to support
>> it, but you still need wait for a while.  
>> 
>> I think sharing the host graphics power is not easy to implement.
>> 
>> Regards,
>> Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI Passthrough and graphic cards ...

2009-11-04 Thread Han, Weidong
Passthrough graphic card to guest can satisfy your requirement. Currently it's 
not supported in kvm, but we have a plan to support it, but you still need wait 
for a while.

I think sharing the host graphics power is not easy to implement.

Regards,
Weidong

Michael McStarfighter wrote:
> By the way: How good is the KVM own graphic card? Is it possible to
> use / share the host graphics power?
> 
> 2009/11/4 Avi Kivity :
>> On 11/04/2009 01:14 PM, Michael McStarfighter wrote:
>>> 
>>> My thoughts are only to give the Windows guest more graphic power.
>>> Isn't it possible to get this without a dedicated monitor?
>>> 
>> 
>> I'm no expert on graphics, but unless the two cards somehow
>> multiplex the monitor, anything Windows outputs will be lost.
>> 
>> --
>> error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VGA pass-through

2009-08-18 Thread Han, Weidong
nathan binkert wrote:
 Currently VGA pass-through is not supported in KVM. It needs much
 work to support it.
>>> 
>>> Here's what I know of:
>>> 
>>> - load the vga bios at 0xc
>>> - implement whatever main bios interfaces the vga vios expects
>>> - pass through the vga I/O ports
>>> 
>> I think it also needs:
>>  - map video RAM: 0xa - 0xb
>>  - remove Qemu VGA device emulation
>>  - reserve enough guest memory space for PCI BARs (most
>> graphics cards has more than 256M memory, I didn't check
>> how much guest memory kvm reserve for PCI BARs)
>> 
>> In addition, some graphics cards are specific, such as pBAR
>> must equal to vBAR. To my feeling, there are some tricks to
>> make it work.
> 
> Hi All,
> 
> Out of curiosity, has any progress been made on this?  I tried to set
> this up myself several months ago (KVM-84) with no luck as primary or
> secondary.  No matter what I did, if I passed through a video card,
> kvm wouldn't even enter the bios.  Presumably because of vga bios
> issues.  I even hacked the nvidia driver on my host system to make
> certain that the host didn't touch the card at all.
> 

If you didn't do above changes, obviously it won't work directly.

We are working on VGA passthrough on Xen side now. KVM should be similiar. 
Basically we can reuse most of the code. We will post the code when we port it 
to KVM.

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Device assignment theoretical questions

2009-07-23 Thread Han, Weidong
Wolfgang Nothdurft wrote:
> Hi,
> 
> what limits does kvm have with device assignment and vt-d enabled.
> 
> - Is it possible to assign pci devices behind the same pci bridge to
> different guests?

For devices behind conventional PCI bridges, the source-id in the DMA requests 
is the requester-id of the bridge device. For devices behind PCI 
Express-to-PCI/PCI-X bridges, the source-id possibly is the source-id of the 
original PCI-X transaction or the source-id provided by the bridge. So devices 
behind these bridges can only be collectively assigned to a single guest.

> - Is there a limit how many devices can be assigned to one guest?

VT-d doesn't have the limitation. kvm guest supports 32 devices.

> - What kind of cards work with the dma=none option?

dma=none implements DMA remapping using software method. I think it needn't 
depend on device. You can refer to Amit's pv-dma tree for more information.

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI device passthrough for FreeDOS guests

2009-07-13 Thread Han, Weidong
Carl-Daniel Hailfinger wrote:
> Hi,
> 
> I want to trace the PCI device accesses (config space, I/O ports,
> memory BARs) of a DOS application (EEPROM flasher) and it seems KVM
> can help with this.
> The current plan is to install KVM, use FreeDOS as a guest, activate
> passthrough for the PCI device I'm interested in and log the accesses
> somehow.
> 
> Possible problems I'd have to solve:
> - According to
> http://www.linux-kvm.org/wiki/images/d/d0/KvmForum2008$kdf2008_14.pdf
> PCI device passthrough needs a CPU with VT-d/IOMMU support.
> - http://www.linux-kvm.org/page/TODO says passthrough won't work for
> conventional PCI devices (I assume a PCI graphics card is such a

It supports conventional PCI devices passgthrough now. 

> device). - http://www.linux-kvm.org/page/TODO says passthrough won't
> work for devices without Function Level Reset (FLR).

FLR is not must for passthrough. If the device status is correct, passthrough 
can work. FLR is used to keep device in correct status before and after 
passthrough. Now more device reset methods are supported, such as secondary bus 
reset and D0hot->D3 transition. So even the device won't support PCIe FLR, it 
still can be reset. It's not a problem now.

The VT-d TODO is obsolete. I will update it.

> - http://www.linux-kvm.org/page/TODO says VT support for real mode is
> not good, so FreeDOS might be a problem.
> - The KVM wiki does not mention how to get PCI passthrough working.

There is VT-d Howto: 
http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM. pls refer 
to it to passthrough pci.

> - No idea if logging of PCI accesses is possible with KVM or if I have
> to use helper tools for that.

You can log PCI accesses. Pls refer to qemu-kvm/hw/device-assignment.c.

Regards,
Weidong

> 
> Any insights/hints are appreciated. If KVM is the wrong tool for my
> task, please point me in the right direction.
> 
> Please CC me as I'm not subscribed to the list. Thanks.
> 
> Regards,
> Carl-Daniel
> 
> P.S. The linux-kvm.org wiki returns zero results if you search the
> wiki text for "PCI", but a Google search finds quite a few pages
> mentioning PCI. Is there a bug in the seach function?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VTd pass through and 32 bit (sorry previous mail had no subject)

2009-07-09 Thread Han, Weidong
Subash Kalbarga wrote:
> -Original Message-
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
> Behalf Of Subash Kalbarga
> Sent: Thursday, July 09, 2009 12:51 PM
> To: kvm@vger.kernel.org
> Subject:
> 
> Hi
> 
> I find that the HOW-TO for VT-d pass through on linux-kvm.org asks for
> Interrupt Remapping to be enabled in the kernel. However Interrupt
> Remapping is dependent on x86_64 (DMAR was also dependent but I guess
> the 2.6.30 kernel allows it for 32 bit systems)
> 

Device pass through doesn't depend on interrupt remapping. Interrupt remapping 
is optional. I will change the VT-d HOW-TO to clarify it. 

Regards,
Weidong

> 
> Does that mean I will not be able to try device pass through on 32 bit
> systems?

> 
> Thanks in advance
> Subash
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in the
> body of a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VGA pass-through

2009-06-30 Thread Han, Weidong
Michael Tokarev wrote:
> Han, Weidong wrote:
>> Michael Tokarev wrote:
> []
>>> But how about using it as a "secondary" video card?  Like, I can
>>> plug another add-on vga card into a free PCI slot and tell X to
>>> use that one instead of "default" card.  Can kvm work like this?
>> 
>> "secondary" means secondary video card in host, and will be
>> "primary" in guest, right? As long as it's primary in guest, I think
>> you still need most of above changes.  
> 
> No, I mean "secondary" on guest - no matter if it's secondary or
> primary on host.  So that we'll have qemu-emulated VGA as primary
> and a hardware-based secondary on *guest*.
> 

Oh, for "secondary" in guest, it may needn't above changes. You can try it 
without any change.

> How much useful it is - it's another question.  For instance, I've
> no idea how windows guest will be able to use such "secondary" vga
> card on guest.
> 

I don't know. Is there any interesting usage of this secondary VGA in guest?

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VGA pass-through

2009-06-30 Thread Han, Weidong
Michael Tokarev wrote:
> Han, Weidong wrote:
> []
>>>> Currently VGA pass-through is not supported in KVM. It needs much
>>>> work to support it. 
>>>> 
>>> What does it need?
>>> 
>>> Here's what I know of:
>>> 
>>> - load the vga bios at 0xc
>>> - implement whatever main bios interfaces the vga vios expects
>>> - pass through the vga I/O ports
>>> 
>>> Anything else?
>> 
>> I think it also needs:
>>   - map video RAM: 0xa - 0xb
>>   - remove Qemu VGA device emulation
>>   - reserve enough guest memory space for PCI BARs (most graphics
>> cards has more than 256M memory, I didn't check how much guest
>> memory kvm reserve for PCI BARs)  
>> 
>> In addition, some graphics cards are specific, such as pBAR must
>> equal to vBAR. To my feeling, there are some tricks to make it work. 
> 
> That all seems to be necessary for "console" support, i.e., so that
> the guest bios etc will know this device as "primary display" or
> whatever it is called.
> 
> But how about using it as a "secondary" video card?  Like, I can
> plug another add-on vga card into a free PCI slot and tell X to
> use that one instead of "default" card.  Can kvm work like this?

"secondary" means secondary video card in host, and will be "primary" in guest, 
right? As long as it's primary in guest, I think you still need most of above 
changes. 

Regards,
Weidong

> 
> Thanks.
> 
> /mjt

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VGA pass-through

2009-06-30 Thread Han, Weidong
Avi Kivity wrote:
> On 06/30/2009 09:16 AM, Han, Weidong wrote:
>> Subash Kalbarga wrote:
>> 
>>> Hi
>>> 
>>> I am using kvm-86 on 2.6.28 and I am able to pass through most of my
>>> PCI devices successfully. However, I do not see even the KVM BIOS
>>> output if I pass through my VGA adapter. I am doing -vga none,
>>> -nographic and -pcidevice host=xx:yy.z
>>> 
>>> 
>>> Should I expect to see the kvm BIOS output?  Is VGA pass-through
>>> something that is expected to work?
>>> 
>> 
>> Currently VGA pass-through is not supported in KVM. It needs much
>> work to support it. 
>> 
>> 
> 
> What does it need?
> 
> Here's what I know of:
> 
> - load the vga bios at 0xc
> - implement whatever main bios interfaces the vga vios expects
> - pass through the vga I/O ports
> 
> Anything else?

I think it also needs:
  - map video RAM: 0xa - 0xb
  - remove Qemu VGA device emulation
  - reserve enough guest memory space for PCI BARs (most graphics cards has 
more than 256M memory, I didn't check how much guest memory kvm reserve for PCI 
BARs) 
  
In addition, some graphics cards are specific, such as pBAR must equal to vBAR. 
To my feeling, there are some tricks to make it work.

Regards,
Weidong
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: VGA pass-through

2009-06-29 Thread Han, Weidong
Subash Kalbarga wrote:
> Hi
> 
> I am using kvm-86 on 2.6.28 and I am able to pass through most of my
> PCI devices successfully. However, I do not see even the KVM BIOS
> output if I pass through my VGA adapter. I am doing -vga none,
> -nographic and -pcidevice host=xx:yy.z  
> 
> 
> Should I expect to see the kvm BIOS output?  Is VGA pass-through
> something that is expected to work? 

Currently VGA pass-through is not supported in KVM. It needs much work to 
support it.

Regards,
Weidong

> 
> Thanks in advance.
> 
> Subash

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH RFC] qemu: fix hot remove assigned device

2009-06-10 Thread Han, Weidong
Gerd Hoffmann wrote:
> On 06/10/09 10:31, Han, Weidong wrote:
>> Avi Kivity wrote:
>>> Can you check pci_dev->qdev instead of assigned?  A little less
>>> ugly.
>> 
>> I tried to find an easy and clean way to check it, but I found the
>> members of struct PCIDevice and DeviceState were not suitable for
>> this checking, and qdev is not pointer in struct PCIDevice.
> 
> Well, certain DeviceState pointers (type for example) are set only in
> case the device was created using the qdev api.  I think you certainly
> should get away without adding a new parameter to
> pci_unregister_device. 
> 
> Also I've just sent a patch to the qemu-devel fixing the qdev register
> API for pci devices, allowing to register config space callbacks. 
> Once this is merged you can switch pci passthrough to the new qdev
> API. 
> 

Good. Extending the qdev APIs is the most elegant solution. Thanks.

Regards,
Weidong
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH RFC] qemu: fix hot remove assigned device

2009-06-10 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>> 
>> -int pci_unregister_device(PCIDevice *pci_dev)
>> +int pci_unregister_device(PCIDevice *pci_dev, int assigned)  {
>>  int ret = 0;
>> 
>> @@ -377,7 +377,11 @@ int pci_unregister_device(PCIDevice *pci_dev)
>>  qemu_free_irqs(pci_dev->irq);
>>  pci_irq_index--;
>>  pci_dev->bus->devices[pci_dev->devfn] = NULL;
>> -qdev_free(&pci_dev->qdev);
>> +
>> +if (assigned)
>> +qemu_free(pci_dev);
>> +else
>> +qdev_free(&pci_dev->qdev);
>>  return 0;
>>  }
>> 
> 
> Can you check pci_dev->qdev instead of assigned?  A little less ugly.

I tried to find an easy and clean way to check it, but I found the members of 
struct PCIDevice and DeviceState were not suitable for this checking, and qdev 
is not pointer in struct PCIDevice.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH RFC] qemu: fix hot remove assigned device

2009-06-10 Thread Han, Weidong
Gerd Hoffmann wrote:
> On 06/09/09 16:51, Paul Brook wrote:
>> On Tuesday 09 June 2009, Han, Weidong wrote:
>>> Paul Brook wrote:
>>>> On Monday 08 June 2009, Weidong Han wrote:
>>>>> When hot remove an assigned device, segmentation fault was
>>>>> triggered by qemu_free(&pci_dev->qdev) in pci_unregister_device().
>>>>> pci_register_device() doesn't initialize or set pci_dev->qdev.
>>>>> For an assigned device, qdev variable isn't touched at all. So
>>>>> segmentation fault happens when to free a non-initialized qdev.
>>>> Better would be to just disable hot remove for devices still using
>>>> the legacy pci_register_device API.
>>> PCI passthrough uses pci_register_device to register assigned
>>> device to qemu. Is there newer API to do so?
>> 
>> Yes. See e.g. LSI scsi emulation.
> 
> Well.  Except that you can't (yet) register pci config read/write
> callbacks using the qdev-based API.
> 

So pci passthrough have to use pci_register_device now. I cooked a new patch 
(post below) to fix this issue. Thanks.


When hot remove an assigned device, segmentation fault was triggered
by qdev_free(&pci_dev->qdev) in pci_unregister_device().
pci_register_device() doesn't initialize or set pci_dev->qdev. For an
assigned device, qdev variable isn't touched at all. So segmentation
fault happens when to free a non-initialized qdev.

This patch adds a parameter in pci_unregister_device to check if
it's an assigned device. For assgined device, free pci_dev instead of
qdev. No changes for other devices.


Signed-off-by: Weidong Han 
---
 hw/device-assignment.c |2 +-
 hw/pci-hotplug.c   |2 +-
 hw/pci.c   |8 ++--
 hw/pci.h   |2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 624d15a..65920d0 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -581,7 +581,7 @@ static void free_assigned_device(AssignedDevInfo *adev)
 dev->real_device.config_fd = 0;
 }
 
-pci_unregister_device(&dev->dev);
+pci_unregister_device(&dev->dev, 1);
 #ifdef KVM_CAP_IRQ_ROUTING
 free_dev_irq_entries(dev);
 #endif
diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c
index d7c8b84..18a4912 100644
--- a/hw/pci-hotplug.c
+++ b/hw/pci-hotplug.c
@@ -271,6 +271,6 @@ void pci_device_hot_remove_success(int pcibus, int slot)
 break;
 }
 
-pci_unregister_device(d);
+pci_unregister_device(d, 0);
 }
 
diff --git a/hw/pci.c b/hw/pci.c
index 25581a4..35fd064 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -363,7 +363,7 @@ static void pci_unregister_io_regions(PCIDevice *pci_dev)
 }
 }
 
-int pci_unregister_device(PCIDevice *pci_dev)
+int pci_unregister_device(PCIDevice *pci_dev, int assigned)
 {
 int ret = 0;
 
@@ -377,7 +377,11 @@ int pci_unregister_device(PCIDevice *pci_dev)
 qemu_free_irqs(pci_dev->irq);
 pci_irq_index--;
 pci_dev->bus->devices[pci_dev->devfn] = NULL;
-qdev_free(&pci_dev->qdev);
+
+if (assigned)
+qemu_free(pci_dev);
+else
+qdev_free(&pci_dev->qdev);
 return 0;
 }
 
diff --git a/hw/pci.h b/hw/pci.h
index f2dccb5..f658e78 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -199,7 +199,7 @@ PCIDevice *pci_register_device(PCIBus *bus, const char 
*name,
int instance_size, int devfn,
PCIConfigReadFunc *config_read,
PCIConfigWriteFunc *config_write);
-int pci_unregister_device(PCIDevice *pci_dev);
+int pci_unregister_device(PCIDevice *pci_dev, int assigned);
 
 void pci_register_io_region(PCIDevice *pci_dev, int region_num,
 uint32_t size, int type,
-- 
1.6.0.4--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH RFC] qemu: fix hot remove assigned device

2009-06-08 Thread Han, Weidong
Paul Brook wrote:
> On Monday 08 June 2009, Weidong Han wrote:
>> When hot remove an assigned device, segmentation fault was triggered
>> by qemu_free(&pci_dev->qdev) in pci_unregister_device().
>> pci_register_device() doesn't initialize or set pci_dev->qdev. For an
>> assigned device, qdev variable isn't touched at all. So segmentation
>> fault happens when to free a non-initialized qdev.
> 
> Better would be to just disable hot remove for devices still using
> the legacy pci_register_device API.
> 

PCI passthrough uses pci_register_device to register assigned device to qemu. 
Is there newer API to do so?

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 0/2] Intel-IOMMU: source-id checking for interrupt remapping

2009-06-01 Thread Han, Weidong
Any comments on the patchset? Thanks. 

Regards,
Weidong

Han, Weidong wrote:
> Support source-id checking for interrupt remapping, and then
> isolates interrupts for guests/VMs with assigned devices.
> 
> Eric raised pci rebalance issue with VT-d. Yes, it's an issue now.
> Linux needs to handle pci rebalance changes to DRHD scopes. It's
> tricky to support it. This patch just supports source-id for
> interrupt remapping, won't touch that.
> 
> The patchset can be applied on linux-2.6-tip tree.
> 
> v2 -> v3 changelog:
>   As Ingo suggested, restructured some code and fixed some code
> style issues.
> 
> v1 -> v2 changelog:
> Access PCI directly (read_pci_config_byte) to parse IOAPIC,
> instead of PCI related discovery, because PCI subsystem is not
> initialized at that time.
> 
> 
> Weidong Han (2):
>   Intel-IOMMU, intr-remap: set the whole 128bits of irte when
> modify/free it
>   Intel-IOMMU, intr-remap: source-id checking
> 
>  arch/x86/kernel/apic/io_apic.c |6 ++
>  drivers/pci/intr_remapping.c   |  160
>  +++ drivers/pci/intr_remapping.h
>  |2 + include/linux/dmar.h   |   11 +++
>  4 files changed, 162 insertions(+), 17 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-21 Thread Han, Weidong
Eric W. Biederman wrote:
> "Han, Weidong"  writes:
> 
>> Ingo Molnar wrote:
>>> * Eric W. Biederman  wrote:
>>> 
>>>> Not finding an upstream pcie_bridge and then concluding we are a
>>>> pcie device seems bogus. 
>>>> 
>> 
>> If device is pcie, pci_find_upstream_pcie_bridge() will return
>> NULL. For root complex integrated device, it won't find upstream
>> bridge, and also return NULL. What's more, pcie device and root
>> complex integrated device will be handled as the same way to set
>> sid, so I mix them here. But it also returns NULL for busted
>> hardware. I think no parent bus can be considered Root Complex
>> integrated device, right? If so, I think can handle it as follows:
> 
> I have problems with pci_find_upstream_pcie_bridge.  The
> name is actively misleading about what that function does.
> Returning a pci_to_pci bridge is strongly counter intuitive
> give that name.
> 
> Can we change it to pci_find_upstream_pcie_to_pci_bridge.
> Returning NULL in all cases when there is not an upstream
> pcie_to_pci bridge.

pci_find_upstream_pcie_bridge returns upstream pcie-to-pci/pcix bridge or 
legacy pci bridge for a pci device. pci_find_upstream_bridge may be more 
suitable.

> 
> For the set_sid case that is ideal.  For the other cases in
> intel-iommu.c it may be a problem.  Is it even possible to have a
> genuine pci device not behind a pcie to pci bridge on an intel
> chipset with this iommu?
> 

I think it's little possible. But coincide to VT-d spec, we need to handle 
differently for devices behind pcie-to-pci/pcix bridge and behind legacy pci 
bridge.

> 
>>  ...
>>  if (dev->is_pcie || !dev->bus->parent) {
>>  set_irte_sid(...);
>>  return 0;
>>  }
>> 
>>  bridge = pci_find_upstream_pcie_bridge(dev);
>>  if (bridge) {
>>  if (bridge->is_pcie) /* PCIE-to-PCI/PCIx bridge */
>>  set_irte_sid(...); else /* legacy PCI bridge */
>>  set_irte_sid(..);
>>  }
>> 
>>  return 0;
>> 
>>>> Why if we do have an upstream pcie bridge do we only want to do a
>>>> bus range verification instead of checking just for the bus
>>>>> devfn?
>>>> 
>>>> The legacy PCI case seems even stranger.
>> 
>> Why? If a PCI device isn't connected to a PCIe bridge, it should be
>> a legacy bridge. 
> 
> I am not deep in the IOV specification at the moment.  I am mostly
> wondering why we pick the parts we pick to verify.  I recall
> bus and devfn being on the pcie packets so that makes sense.
> 
> Why would we ever want to do something different?  Does a pcie to pci
> bridge do something different in it's translation?

source-id is different between devices pcie-to-pci/pcix bridge and behind 
legacy pci bridge. Thus VT-d needs to handle it differently.

The pcie-to-pci/pcix bridges may generate a different requester-id and tag 
combination in some instances for transactions forwarded to the bridge's PCI 
Express interface. The action of replacing the original transaction's 
requester-id with one assigned by the bridge is generally referred to as taking 
'ownership' of the transaction. If the bridge generates a new requester-id for 
a transaction forwarded from the secondary interface to the primary interface, 
the bridge assigns the PCI Express requester-id using the secondary interface's 
bus number, and sets both the device number and function number fields to zero. 
Refer to the PCI Express-to-PCI/PCI-X bridge specifications for more details.

For devices behind conventional PCI bridges, the source-id in the DMA requests 
is the requester-id of the bridge device.

Regards,
Weidong


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-21 Thread Han, Weidong
Ingo Molnar wrote:
> * Eric W. Biederman  wrote:
> 
>> Not finding an upstream pcie_bridge and then concluding we are a
>> pcie device seems bogus. 
>> 

If device is pcie, pci_find_upstream_pcie_bridge() will return NULL. For root 
complex integrated device, it won't find upstream bridge, and also return NULL. 
What's more, pcie device and root complex integrated device will be handled as 
the same way to set sid, so I mix them here. But it also returns NULL for 
busted hardware. I think no parent bus can be considered Root Complex 
integrated device, right? If so, I think can handle it as follows:

...
if (dev->is_pcie || !dev->bus->parent) {
set_irte_sid(...);
return 0;
}

bridge = pci_find_upstream_pcie_bridge(dev);
if (bridge) {
if (bridge->is_pcie) /* PCIE-to-PCI/PCIx bridge */
set_irte_sid(...);
else /* legacy PCI bridge */
set_irte_sid(..);
}

return 0;

>> Why if we do have an upstream pcie bridge do we only want to do a
>> bus range verification instead of checking just for the bus
>>> devfn?
>> 
>> The legacy PCI case seems even stranger.

Why? If a PCI device isn't connected to a PCIe bridge, it should be a legacy 
bridge.

Regards,
Weidong

> 
> Good points. Would be nice to get an ack from the PCI folks to make
> sure these bits are sane.
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-20 Thread Han, Weidong
Ingo Molnar wrote:
> * Eric W. Biederman  wrote:
> 
>> Ingo Molnar  writes:
>> 
>>> * Weidong Han  wrote:
>>> 
 To support domain-isolation usages, the platform hardware must be
 capable of uniquely identifying the requestor (source-id) for each
 interrupt message. Without source-id checking for interrupt
 remapping , a rouge guest/VM with assigned devices can launch
 interrupt attacks to bring down anothe guest/VM or the VMM itself.
 
 This patch adds source-id checking for interrupt remapping, and
 then really isolates interrupts for guests/VMs with assigned
 devices.
 
 Because PCI subsystem is not initialized yet when set up IOAPIC
 entries, use read_pci_config_byte to access PCI config space
 directly.
 
 Signed-off-by: Weidong Han 
 ---
  arch/x86/kernel/apic/io_apic.c |6 +++
  drivers/pci/intr_remapping.c   |   90
  ++-
  drivers/pci/intr_remapping.h   |2 + include/linux/dmar.h 
  |   11 + 4 files changed, 106 insertions(+), 3 deletions(-)
>>> 
>>> Code structure looks nice now. (and i susect you have tested this on
>>> real and relevant hardware?) I've Cc:-ed Eric too ... does this
>>> direction look good to you too Eric?
>> 
>> Being a major nitpick, I have to point out that the code is not
>> structured to support other iommus, and I think AMD has one that
>> can do this as well.
> 
> (Joerg Cc:-ed)

The patch just changes code in interrupt remapping of Intel IOMMU. How does AMD 
IOMMU remap interrupts in ioapic code? I didn't find relative code. If AMD 
IOMMU already enabled the same code, it can wrap the same code in IOMMU API. 

> 
>> The early pci reading of the bus is just wrong.  What happens if
>> the pci layer decided to renumber things?  It looks like we have a
>> real dependency on pci there and are avoiding sorting it out with
>> this.
> 
> Yes ... but is there much we can do about this bootstrap dependency?
> We want to enable the IO-APIC very early in its final form. There's
> quite a bit of IRQ functionality that doesnt go via the PCI layer,
> and which is being relied on by early bootup. The timer irq must
> work, etc.

Currently VT-d code didn't support pci rebalance. There is much work to do. It 
needs to track devcie identity changes and resultant imapct to Device Scope 
under each VT-d engine.

> 
>> Hmm.  But that is what we use in setup_ioapic_sid I expect the
>> right solution is to delay enabling ioapic entries until driver
>> enable them.  That could also reduce screaming irqs during bootup
>> in the kdump case.
> 
> Yes, and note that we are moving in that direction in tip:irq/numa
> (Yinghai Cc:-ed) - we are delaying IRQ setup to the very last
> moment. We recently got rid of 32-bit IRQ compression in that branch
> as well and the IRQ vectoring code is now unified between 64-bit and
> 32-bit so it's nify and you might want to check it and look for
> holes ...
> 
> ( The motivation there was different: delay allocation of the
>   irq_desc so that we can make it NUMA-local - but it has the same
>   general effect. )
> 
>> set_msi_sid looks wrong.  The comment are unhelpful. irte->svt
>> should get an enum value or a deine (removing the repeated
>> explanations of the magic value) and then we could have room to
>> explain why we are doing what we are doing.
> 
> (yes, it probably wants a helper method - i pointed these smaller
> details out in my review.)

will update as Ingo suggested.

Regards,
Weidong

> 
>> Not finding an upstream pcie_bridge and then concluding we are a
>> pcie device seems bogus. 
>> 
>> Why if we do have an upstream pcie bridge do we only want to do a
>> bus range verification instead of checking just for the bus
>>> devfn?
>> 
>> The legacy PCI case seems even stranger.
> 
> Good points. Would be nice to get an ack from the PCI folks to make
> sure these bits are sane.
> 
>> 
>> 
>> The table of apic information by apic_id also seems wrong.  Don't
>> we have chip_data or something that should point it that we can
>> get from the irq?
> 
> ->chip_data is already used for io-apic configuration bits - if it's
> reused then the right way to do it is to extend struct irq_cfg in
> arch/x86/kernel/apic/io_apic.c.
> 
>   Ingo

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 1/2] Intel-IOMMU, intr-remap: set the whole 128bits of irte when modify/free it

2009-05-19 Thread Han, Weidong
Ingo Molnar wrote:
> * Weidong Han  wrote:
> 
>> Interrupt remapping table entry is 128bits. Currently, it only sets
>> low 64bits of irte in modify_irte and free_irte. This ignores high
>> 64bits setting of irte, that means source-id setting will be
>> ignored. This patch sets the whole 128bits of irte when modify/free
>> it. Following source-id checking patch depends on this.
>> 
>> Signed-off-by: Weidong Han 
>> ---
>>  drivers/pci/intr_remapping.c |   10 +++---
>>  1 files changed, 7 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/pci/intr_remapping.c
>> b/drivers/pci/intr_remapping.c index f5e0ea7..946e170 100644 ---
>> a/drivers/pci/intr_remapping.c +++ b/drivers/pci/intr_remapping.c
>> @@ -309,7 +309,8 @@ int modify_irte(int irq, struct irte
>>  *irte_modified) index = irq_iommu->irte_index +
>>  irq_iommu->sub_handle; irte = &iommu->ir_table->base[index];
>> 
>> -set_64bit((unsigned long *)irte, irte_modified->low);
>> +set_64bit((unsigned long *)&irte->low, irte_modified->low);
>> +set_64bit((unsigned long *)&irte->high, irte_modified->high);
>> 
>>  __iommu_flush_cache(iommu, irte, sizeof(*irte));
>> 
>>  rc = qi_flush_iec(iommu, index, 0);
>> @@ -386,8 +387,11 @@ int free_irte(int irq)
>>  irte = &iommu->ir_table->base[index];
>> 
>>  if (!irq_iommu->sub_handle) {
>> -for (i = 0; i < (1 << irq_iommu->irte_mask); i++)
>> -set_64bit((unsigned long *)(irte + i), 0);
>> +for (i = 0; i < (1 << irq_iommu->irte_mask); i++) {
>> +set_64bit((unsigned long *)&irte->low, 0);
>> +set_64bit((unsigned long *)&irte->high, 0);
>> +irte++;
>> +}
> 
> The loop is a bit unclean. It has a side-effect on 'irte' - and
> other patterns in the driver usually treat 'irte' as a generally
> available variable.
> 
> So the above code, while correct, opens up the possibility of later
> code added to this function relying on 'irte', thinking that it's
> set to "&iommu->ir_table->base[index]", and then breaking because
> 'irte' has been iterated to the end of it in certain circumstances.
> 
> It's better to factor out the whole loop into a helper function,
> which does something like:
> 
> int flush_entries(struct irq_2_iommu *irq_iommu)
> {
>   struct irte *start, *entry, *end;
>   struct intel_iommu *iommu;
>   int index;
> 
>   if (irq_iommu->sub_handle)
>   return 0;
> 
>   iommu = irq_iommu->iommu;
>   index = irq_iommu->irte_index + irq_iommu->sub_handle;
> 
>   start = iommu->ir_table->base + index;
>   end = start + (1 << irq_iommu->irte_mask);
> 
>   for (entry = start; entry < end; entry++) {
>   set_64bit((unsigned long *)&entry->low,  0);
>   set_64bit((unsigned long *)&entry->high, 0);
>   }
> 
>   return qi_flush_iec(iommu, index, irq_iommu->irte_mask);
> }
> 
> Note how clearer this is - the new method has one purpose and
> 'entry' is a clear iterator.
> 
> ( And note how much clearer the flow of 'rc' has become as well as a
>   side-effect: it is clear now that it's set to 0 when
>   irq_iommu->sub_handle is still present. )
> 
> Thanks,
> 
>   Ingo

Yes, more clearer. will update it in next version. Thanks.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-19 Thread Han, Weidong
Ingo Molnar wrote:
> * Weidong Han  wrote:
> 
>> To support domain-isolation usages, the platform hardware must be
>> capable of uniquely identifying the requestor (source-id) for each
>> interrupt message. Without source-id checking for interrupt
>> remapping , a rouge guest/VM with assigned devices can launch
>> interrupt attacks to bring down anothe guest/VM or the VMM itself.
>> 
>> This patch adds source-id checking for interrupt remapping, and
>> then really isolates interrupts for guests/VMs with assigned
>> devices.
>> 
>> Because PCI subsystem is not initialized yet when set up IOAPIC
>> entries, use read_pci_config_byte to access PCI config space
>> directly.
>> 
>> Signed-off-by: Weidong Han 
>> ---
>>  arch/x86/kernel/apic/io_apic.c |6 +++
>>  drivers/pci/intr_remapping.c   |   90
>>  ++-
>>  drivers/pci/intr_remapping.h   |2 + include/linux/dmar.h   
>>  |   11 + 4 files changed, 106 insertions(+), 3 deletions(-)
> 
> Code structure looks nice now. (and i susect you have tested this on
> real and relevant hardware?) I've Cc:-ed Eric too ... does this
> direction look good to you too Eric?
> 
> Have a few minor nits only:
> 
>> diff --git a/arch/x86/kernel/apic/io_apic.c
>> b/arch/x86/kernel/apic/io_apic.c 
>> index 30da617..3d10c68 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -1559,6 +1559,9 @@ int setup_ioapic_entry(int apic_id, int irq, 
>>  irte.vector = vector; irte.dest_id = IRTE_DEST(destination);
>> 
>> +/* Set source-id of interrupt request */
>> +set_ioapic_sid(&irte, apic_id);
>> +
>>  modify_irte(irq, &irte);
>> 
>>  ir_entry->index2 = (index >> 15) & 0x1;
>> @@ -3329,6 +3332,9 @@ static int msi_compose_msg(struct pci_dev
>>  *pdev, unsigned int irq, struct msi_ms  
>> irte.vector =
>> cfg->vector; irte.dest_id = IRTE_DEST(dest); 
>> 
>> +/* Set source-id of interrupt request */
>> +set_msi_sid(&irte, pdev);
>> +
>>  modify_irte(irq, &irte);
>> 
>>  msg->address_hi = MSI_ADDR_BASE_HI;
>> diff --git a/drivers/pci/intr_remapping.c
>> b/drivers/pci/intr_remapping.c 
>> index 946e170..9ef7b0d 100644
>> --- a/drivers/pci/intr_remapping.c
>> +++ b/drivers/pci/intr_remapping.c
>> @@ -10,6 +10,8 @@
>>  #include 
>>  #include "intr_remapping.h"
>>  #include 
>> +#include 
>> +#include "pci.h"
>> 
>>  static struct ioapic_scope ir_ioapic[MAX_IO_APICS];  static int
>> ir_ioapic_num; @@ -405,6 +407,61 @@ int free_irte(int irq)
>>  return rc;
>>  }
>> 
>> +int set_ioapic_sid(struct irte *irte, int apic)
>> +{
>> +int i;
>> +u16 sid = 0;
>> +
>> +if (!irte)
>> +return -1;
>> +
>> +for (i = 0; i < MAX_IO_APICS; i++)
>> +if (ir_ioapic[i].id == apic) {
>> +sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn; +   
>> break;
>> +}
> 
> Please generally put extra curly braces around each multi-line loop
> body. One reason beyond readability is robustness: the above
> structure can be easily extended in a buggy way via [mockup patch
> hunk]:
> 
>>  sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn; 
>> break;
>>  }
>> +if (!sid)
>> +break;
> 
> And note that if this slips in by accident how unobvious this bug is
> during patch review - the loop head context is not present in the
> 3-line default context and the code "looks" correct at a glance.
> 
> With extra braces, such typos or mismerges:
> 
>>  }
>>  }
>> +if (!sid)
>> +break;
> 
> stick out during review like a sore thumb :-)
> 
>> +if (sid == 0) {
>> +printk(KERN_WARNING "Failed to set source-id of "
>> +   "I/O APIC (%d), because it is not under "
>> +   "any DRHD\n", apic);
>> +return -1;
>> +}
> 
> please try to keep kernel messages on a single line - even if
> checkpatch complains. Also, it's a good idea to use pr_warning(),
> it's shorter by 8 characters.
> 
>> +
>> +irte->svt = 1; /* requestor ID verification SID/SQ */
>> +irte->sq = 0;  /* comparing all 16-bit of SID */
>> +irte->sid = sid;
> 
> this is a borderline suggestion:
> 
> Note how you already lined up the _comments_ vertically, so you did
> notice that it makes sense to align vertically. The same visual
> arguments can be made for the initialization itself too:
> 
>> +
>> +irte->svt   = 1;/* requestor ID verification SID/SQ */
>> +irte->sq= 0;/* comparing all 16-bit of SID */
>> +irte->sid   = sid;
>> +
>> +return 0;
> 
> But ... it might make even more sense to introduce a set_irte()
> helper method, so it can all be written in a single line as:
> 
>   set_irte(irte, 1, 0, sid);
> 
> and explain common values

RE: [PATCH 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-19 Thread Han, Weidong
Ingo Molnar wrote:
> * Han, Weidong  wrote:
> 
>> Ingo Molnar wrote:
>>> * Han, Weidong  wrote:
>>> 
>>>> Siddha, Suresh B wrote:
>>>>> On Wed, 2009-05-06 at 23:16 -0700, Han, Weidong wrote:
>>>>>> @@ -634,6 +694,44 @@ static int ir_parse_ioapic_scope(struct
>>>>>> acpi_dmar_header *header, " 0x%Lx\n",
>>>>>> scope->enumeration_id, drhd->address);
>>>>>> 
>>>>>> +bus = pci_find_bus(drhd->segment, scope->bus);
>>>>>> +path = (struct acpi_dmar_pci_path *)(scope + 
>>>>>> 1); +  count =
>>>>>> (scope->length - +sizeof(struct 
>>>>>> acpi_dmar_device_scope))
>>>>>> +/ sizeof(struct acpi_dmar_pci_path);
>>>>>> +
>>>>>> +while (count) {
>>>>>> +if (pdev)
>>>>>> +pci_dev_put(pdev);
>>>>>> +
>>>>>> +if (!bus)
>>>>>> +break;
>>>>>> +
>>>>>> +pdev = pci_get_slot(bus,
>>>>>> +PCI_DEVFN(path->dev, path->fn));
>>>>>> +if (!pdev)
>>>>>> +break;
>>>>> 
>>>>> ir_parse_ioapic_scope() happens very early in the boot. So, I
>>>>> don't think we can do the pci related discovery here.
>>>>> 
>>>> 
>>>> Thanks for your pointing it out. It should enable the source-id
>>>> checking for io-apic's after the pci subsystem is up. I will
>>>> change it.
>>> 
>>> Note, there's ways to do early PCI quirks too, check
>>> arch/x86/kernel/early-quirks.c. It's done by reading the PCI
>>> configuration space directly via a careful early-capable subset of
>>> the PCI config space APIs. 
>>> 
>>> But it's a method of last resort.
>>> 
>> 
>> Thanks for your reminder. It can use direct PCI access here as
>> follows. It's easy and clean. I think it's better than adding the
>> source-id checking for io-apic's after the pci subsystem is up. I
>> will send out updated patches after some tests.   
>> 
>> @@ -634,6 +695,24 @@ static int ir_parse_ioapic_scope(struct
>>acpi_dmar_header *header, " 0x%Lx\n",
>>scope->enumeration_id, drhd->address);
>> 
>> +   bus = scope->bus;
>> +   path = (struct acpi_dmar_pci_path *)(scope +
>> 1); +   count = (scope->length -
>> +sizeof(struct
>> acpi_dmar_device_scope)) +   /
>> sizeof(struct acpi_dmar_pci_path); + +   while
>> (--count > 0) { +   /* Access PCI
>> directly due to the PCI +* subsystem
>> isn't initialized yet. +*/
>> +   bus = read_pci_config_byte(bus,
>> path->dev, +   path->fn,
>> PCI_SECONDARY_BUS); +   path++;
>> +   }
>> +
>> +   ir_ioapic[ir_ioapic_num].bus = bus;
>> +   ir_ioapic[ir_ioapic_num].devfn =
>> +   PCI_DEVFN(path->dev,
>> path->fn); 
> 
> looks good IMO, beyond the obligatory comment-style nitpick [*] :-)
> Also, the function above seems to be way too large - please split it
> into a couple of natural helper functions.
> 
> Thanks,
> 
>   Ingo
> 
> [*]
> 
> Please use the customary comment style:
> 
>   /*
>* Comment .
>* .. goes here:
>*/
> 
> specified in Documentation/CodingStyle.

I have sent out the updated patches. Thanks!

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-18 Thread Han, Weidong
Ingo Molnar wrote:
> * Han, Weidong  wrote:
> 
>> Siddha, Suresh B wrote:
>>> On Wed, 2009-05-06 at 23:16 -0700, Han, Weidong wrote:
>>>> @@ -634,6 +694,44 @@ static int ir_parse_ioapic_scope(struct
>>>>   acpi_dmar_header *header, " 0x%Lx\n",
>>>>   scope->enumeration_id, drhd->address);
>>>> 
>>>> +  bus = pci_find_bus(drhd->segment, scope->bus);
>>>> +  path = (struct acpi_dmar_pci_path *)(scope + 1); +  
>>>> count =
>>>> (scope->length - +  sizeof(struct 
>>>> acpi_dmar_device_scope))
>>>> +  / sizeof(struct acpi_dmar_pci_path);
>>>> +
>>>> +  while (count) {
>>>> +  if (pdev)
>>>> +  pci_dev_put(pdev);
>>>> +
>>>> +  if (!bus)
>>>> +  break;
>>>> +
>>>> +  pdev = pci_get_slot(bus,
>>>> +  PCI_DEVFN(path->dev, path->fn));
>>>> +  if (!pdev)
>>>> +  break;
>>> 
>>> ir_parse_ioapic_scope() happens very early in the boot. So, I
>>> don't think we can do the pci related discovery here.
>>> 
>> 
>> Thanks for your pointing it out. It should enable the source-id
>> checking for io-apic's after the pci subsystem is up. I will
>> change it.
> 
> Note, there's ways to do early PCI quirks too, check
> arch/x86/kernel/early-quirks.c. It's done by reading the PCI
> configuration space directly via a careful early-capable subset of
> the PCI config space APIs.
> 
> But it's a method of last resort.
> 

Thanks for your reminder. It can use direct PCI access here as follows. It's 
easy and clean. I think it's better than adding the source-id checking for 
io-apic's after the pci subsystem is up. I will send out updated patches after 
some tests.
 
@@ -634,6 +695,24 @@ static int ir_parse_ioapic_scope(struct acpi_dmar_header 
*header,
   " 0x%Lx\n", scope->enumeration_id,
   drhd->address);

+   bus = scope->bus;
+   path = (struct acpi_dmar_pci_path *)(scope + 1);
+   count = (scope->length -
+sizeof(struct acpi_dmar_device_scope))
+   / sizeof(struct acpi_dmar_pci_path);
+
+   while (--count > 0) {
+   /* Access PCI directly due to the PCI
+* subsystem isn't initialized yet.
+*/
+   bus = read_pci_config_byte(bus, path->dev,
+   path->fn, PCI_SECONDARY_BUS);
+   path++;
+   }
+
+   ir_ioapic[ir_ioapic_num].bus = bus;
+   ir_ioapic[ir_ioapic_num].devfn =
+   PCI_DEVFN(path->dev, path->fn);
ir_ioapic[ir_ioapic_num].iommu = iommu;
ir_ioapic[ir_ioapic_num].id = scope->enumeration_id;
ir_ioapic_num++;




--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] Intel-IOMMU, intr-remap: source-id checking

2009-05-10 Thread Han, Weidong
Siddha, Suresh B wrote:
> On Wed, 2009-05-06 at 23:16 -0700, Han, Weidong wrote:
>> @@ -634,6 +694,44 @@ static int ir_parse_ioapic_scope(struct
>> acpi_dmar_header *header, " 0x%Lx\n",
>> scope->enumeration_id, drhd->address);
>> 
>> +bus = pci_find_bus(drhd->segment, scope->bus);
>> +path = (struct acpi_dmar_pci_path *)(scope + 1); +  
>> count =
>> (scope->length - +sizeof(struct 
>> acpi_dmar_device_scope))
>> +/ sizeof(struct acpi_dmar_pci_path);
>> +
>> +while (count) {
>> +if (pdev)
>> +pci_dev_put(pdev);
>> +
>> +if (!bus)
>> +break;
>> +
>> +pdev = pci_get_slot(bus,
>> +PCI_DEVFN(path->dev, path->fn));
>> +if (!pdev)
>> +break;
> 
> ir_parse_ioapic_scope() happens very early in the boot. So, I don't
> think we can do the pci related discovery here.
> 

Thanks for your pointing it out. It should enable the source-id checking for 
io-apic's after the pci subsystem is up. I will change it.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: tg3 driver in guest fails for VT-d passthrough NIC

2009-05-06 Thread Han, Weidong
Yuji virtualized more capabilities to guest in Xen Qemu. I think your tg3 
driver can work in Xen guest with VT-d passthrough. If anyone interests, he/she 
can port the code from Xen Qemu to KVM Qemu.

Regards,
Weidong

Alex Williamson wrote:
> On Wed, May 6, 2009 at 11:31 AM, Nadolski, Ed 
> wrote: 
>> Hi,
>> 
>> I am running the KVM kernel & userspace downloaded on 4/24 with
>> Fedora 10 host Linux on a Dell T7400 Xeon with VT-x and VT-d
>> enabled. The T7400 has an onboard Broadcom NIC, but when I use VT-d
>> to assign this NIC to a Fedora 10 guest, the NIC's tg3 driver in the
>> guest aborts because it cannot find the PM Capability in the
>> device's PCI config space. ... 
>> So has anyone else seen this, and what is the right way to address
>> it?  It's not good to simply pass thru certain device Capabilities
>> if they cannot be properly handled by KVM/QEMU, but OTOH the
>> unmodified guest OS drivers should not break because they were
>> written to expect certain Capabilities.  Is there any plan to add
>> support for these missing Capabilities? 
> 
> Yes, I've seen this with bnx2 devices.  For a Linux guest, it's a
> fairly trivial change to the driver to not depend on these
> capabilities, but it would be far more compatible with existing driver
> code out there to make an attempt to support support them in KVM.
> AFAIK, it's not being working on yet.
> 
> Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI hotplug and passthrough together supported?

2009-04-07 Thread Han, Weidong
hotplug is supported. You can refer to 
http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM.

Regards,
Weidong

Aaron Dailey wrote:
> I'm interested in using PCI passthrough on a PCI device that may not
> be present at boot time, but is later inserted.  Is this possible in
> KVM? I believe both passthrough and hotplug are individually
> supported, but I couldn't find an answer about hot plugging a device
> and then using it via passthrough.
> 
> I'm not subscribed to the mailing list, so please copy me on replies.
> 
> Thanks for your help,
> 
> Aaron Dailey
> --
>   Aaron Dailey
>   a...@alumni.virginia.edu

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI passtthrought & intel 82574L can't boot from disk

2009-04-07 Thread Han, Weidong
Hi,

Currently, expansion ROM of assigned devices will be loaded into guest and 
execute. You can just comment out one line code (assigned_dev_load_option_rom() 
in qemu/hw/pc.c) to let qemu don't load expansion ROM, and then this issue 
should be gone. I met this issue before, but there was no problem when I 
assigned the same device to a guest in Xen which also load and execute 
expansion ROM. I guess it's caused by the difference between Xen guest bios and 
kvm guest bios. 

Regards,
Weidong

Hauke Hoffmann wrote:
> On Thursday 02 April 2009 18:58:26 you wrote:
>> It is my understanding that you need vt-d/iommu support. I didn't
>> think any existing amd chipsets had iommu support. You may want to
>> look into that. 
> 
> Hi Brian,
> 
> thanks for you response.
> 
> I found a tool [1] from Intel to disable the Boot ROM on the nic.
> Thats resolves the boot problem.
> 
> Regards
> Hauke
> 
> [1]
> http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&ProductID=412&DwnldID=8242
> 
>> 
>> --Brian Jackson
>> 
>> On Thursday 02 April 2009 07:00:07 Hauke Hoffmann wrote:
>>> Hi,
>>> 
>>> qemu-system-x86_64 runs well and i can boot and run the guest
>>> system. Thats works very well. 
>>> 
>>> Command:
>>> /usr/local/kvm/bin/qemu-system-x86_64 -m
>>> 512 -hda /var/VM/roadrunner.local/hda.qcow2 -smp 1 -vnc
>>> 192.168.2.30: -net nic,macaddr=DE:AD:BE:EF:90:26 -net
>>> tap,ifname=tap0,script=no,downscript=no -boot c
>>> 
>>> Then i tried to add an intel 82574L network adapter to the guest.
>>> Just the same command with addtionally "-pcidevice host=07:00.0"
>>> 
>>> Then i connected via VNC and see BIOS startpage and the following
>>> lines: Initializing Intel(r) boot agent ge v1.3.21
>>> pxe 2.1 build 086 (WfM 2.0)
>>> Press f12 for moot menu
>>> 
>>> You can see a screenshot at http://nxt7.de/download/qemu.png
>>> 
>>> The guests keep on this point and nothing changes. (I have wait
>>> hours.) 
>>> 
>>> I tried to press F12 in ThightVNC but no action.
>>> I must say that ThightVNC has problems with special chars (in my
>>> case). 
>>> 
>>> At this point, i need your help.
>>> 
>>> 
>>> Here are some details of my system
>>> 
>>> Kernel: 2.6.29 form kernel.org (self compiled)
>>> kvm userspace: kvm-84 (self compiled)
>>> OS: Ubuntu 8.04.2 server
>>> 
>>> r...@ls:~# lspci
>>> 00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev
>>> a2) 00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a3)
>>> 00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a3)
>>> 00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller
>>> (rev a1) 00:02.1 USB Controller: nVidia Corporation MCP55 USB
>>> Controller (rev a2) 00:04.0 IDE interface: nVidia Corporation MCP55
>>> IDE (rev a1) 00:05.0 IDE interface: nVidia Corporation MCP55 SATA
>>> Controller (rev a3) 00:05.1 IDE interface: nVidia Corporation MCP55
>>> SATA Controller (rev a3) 00:05.2 IDE interface: nVidia Corporation
>>> MCP55 SATA Controller (rev a3) 00:06.0 PCI bridge: nVidia
>>> Corporation MCP55 PCI bridge (rev a2) 00:08.0 Bridge: nVidia
>>> Corporation MCP55 Ethernet (rev a3) 00:09.0 Bridge: nVidia
>>> Corporation MCP55 Ethernet (rev a3) 00:0a.0 PCI bridge: nVidia
>>> Corporation MCP55 PCI Express bridge (rev a3) 00:0b.0 PCI bridge:
>>> nVidia Corporation MCP55 PCI Express bridge (rev a3) 00:0c.0 PCI
>>> bridge: nVidia Corporation MCP55 PCI Express bridge (rev a3)
>>> 00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge
>>> (rev a3) 00:0e.0 PCI bridge: nVidia Corporation MCP55 PCI Express
>>> bridge (rev a3) 00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI
>>> Express bridge (rev a3) 00:18.0 Host bridge: Advanced Micro Devices
>>> [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
>>> 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8
>>> [Athlon64/Opteron] Address Map 00:18.2 Host bridge: Advanced Micro
>>> Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:18.3 Host
>>> bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
>>> Miscellaneous Control 01:09.0 Ethernet controller: Lite-On
>>> Communications Inc LNE100TX [Linksys EtherFast 10/100] (rev 25)
>>> 01:0a.0 VGA compatible controller: XGI Technology Inc. (eXtreme
>>> Graphics Innovation) Volari Z7 06:00.0 SATA controller: JMicron
>>> Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
>>> 06:00.1 IDE interface: JMicron Technologies, Inc. JMicron
>>> 20360/20363 AHCI Controller (rev 03) 07:00.0 Ethernet controller:
>>> Intel Corporation 82574L Gigabit Network Connection 
>>> 
>>> 
>>> r...@ls:~# lspci -tvvv
>>> -[:00]-+-00.0  nVidia Corporation MCP55 Memory Controller
>>>+-01.0  nVidia Corporation MCP55 LPC Bridge
>>>+-01.1  nVidia Corporation MCP55 SMBus
>>>+-02.0  nVidia Corporation MCP55 USB Controller
>>>+-02.1  nVidia Corporation MCP55 USB Controller
>>>+-04.0  nVidia Corporation MCP55 IDE
>>>+-05.0  nVidia Corporation MCP55 SATA Controller
>>> 

RE: [PATCH 2/2] kvm: qemu: check device assignment command

2009-03-30 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>>> I suggest replacing the parsing code with pci_parse_devaddr() (needs
>>> to be extended to support functions) so that all the checking and
>>> parsing is done in one place. 
>>> 
>> 
>> If use pci_parse_devaddr(), it needs to add domain section to
>> assigning command, and add function section to pci_add/pci_del
>> commands. What's more, pci_parse_devaddr() parses guest device bdf,
>> there are some assumption, such as function is 0. But here parse
>> host bdf. It's a little complex to combine them together.
>> 
> 
> Right, but we end up with overall better code.

pci_parse_devaddr parses [[:][:], it's valid when even enter 
only slot, whereas it must be bus:slot.func in device assignment command  
(-pcidevice host=bus:slot.func). So I implemented a dedicated function to parse 
device bdf in device assignment command, rather than mix two parsing function 
together.

Signed-off-by: Weidong Han 

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index cef7c8a..53375ff 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -1195,8 +1195,7 @@ out:
  */
 AssignedDevInfo *add_assigned_device(const char *arg)
 {
-char *cp, *cp1;
-char device[8];
+char device[16];
 char dma[6];
 int r;
 AssignedDevInfo *adev;
@@ -1207,6 +1206,13 @@ AssignedDevInfo *add_assigned_device(const char *arg)
 return NULL;
 }
 r = get_param_value(device, sizeof(device), "host", arg);
+if (!r)
+ goto bad;
+
+r = pci_parse_host_devaddr(device, &adev->bus, &adev->dev, &adev->func);
+if (r)
+goto bad;
+
 r = get_param_value(adev->name, sizeof(adev->name), "name", arg);
 if (!r)
snprintf(adev->name, sizeof(adev->name), "%s", device);
@@ -1216,18 +1222,6 @@ AssignedDevInfo *add_assigned_device(const char *arg)
 if (r && !strncmp(dma, "none", 4))
 adev->disable_iommu = 1;
 #endif
-cp = device;
-adev->bus = strtoul(cp, &cp1, 16);
-if (*cp1 != ':')
-goto bad;
-cp = cp1 + 1;
-
-adev->dev = strtoul(cp, &cp1, 16);
-if (*cp1 != '.')
-goto bad;
-cp = cp1 + 1;
-
-adev->func = strtoul(cp, &cp1, 16);
 
 LIST_INSERT_HEAD(&adev_head, adev, next);
 return adev;
diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index eca0517..bf97c8c 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -163,6 +163,7 @@ static int pci_set_default_subsystem_id(PCIDevice *pci_dev)
 }
 
 /*
+ * Parse pci address in qemu command
  * Parse [[:]:], return -1 on error
  */
 static int pci_parse_devaddr(const char *addr, int *domp, int *busp, unsigned 
*slotp)
@@ -211,6 +212,55 @@ static int pci_parse_devaddr(const char *addr, int *domp, 
int *busp, unsigned *s
 return 0;
 }
 
+/*
+ * Parse device bdf in device assignment command:
+ *
+ * -pcidevice host=bus:dev.func
+ *
+ * Parse :. return -1 on error
+ */
+int pci_parse_host_devaddr(const char *addr, int *busp,
+   int *slotp, int *funcp)
+{
+const char *p;
+char *e;
+int val;
+int bus = 0, slot = 0, func = 0;
+
+p = addr;
+val = strtoul(p, &e, 16);
+if (e == p)
+   return -1;
+if (*e == ':') {
+   bus = val;
+   p = e + 1;
+   val = strtoul(p, &e, 16);
+   if (e == p)
+   return -1;
+   if (*e == '.') {
+   slot = val;
+   p = e + 1;
+   val = strtoul(p, &e, 16);
+   if (e == p)
+   return -1;
+   func = val;
+   } else
+   return -1;
+} else
+   return -1;
+
+if (bus > 0xff || slot > 0x1f || func > 0x7)
+   return -1;
+
+if (*e)
+   return -1;
+
+*busp = bus;
+*slotp = slot;
+*funcp = func;
+return 0;
+}
+
 int pci_read_devaddr(const char *addr, int *domp, int *busp, unsigned *slotp)
 {
 char devaddr[32];
diff --git a/qemu/hw/pci.h b/qemu/hw/pci.h
index a7438f2..bfdd29a 100644
--- a/qemu/hw/pci.h
+++ b/qemu/hw/pci.h
@@ -227,6 +227,9 @@ PCIDevice *pci_find_device(int bus_num, int slot, int 
function);
 int pci_read_devaddr(const char *addr, int *domp, int *busp, unsigned *slotp);
 int pci_assign_devaddr(const char *addr, int *domp, int *busp, unsigned 
*slotp);
 
+int pci_parse_host_devaddr(const char *addr, int *busp,
+   int *slotp, int *funcp);
+
 void pci_info(Monitor *mon);
 PCIBus *pci_bridge_init(PCIBus *bus, int devfn, uint16_t vid, uint16_t did,
 pci_map_irq_fn map_irq, const char *name);--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/2] kvm: qemu: check device assignment command

2009-03-27 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>> Device assignment command is like "-pcidevice host=xx:yy.z".
>> Check bus:dev.func length to make sure its format is xx:yy.z.
>> 
>> Signed-off-by: Weidong Han 
>> ---
>>  qemu/hw/device-assignment.c |5 -
>>  1 files changed, 4 insertions(+), 1 deletions(-)
>> 
>> diff --git a/qemu/hw/device-assignment.c
>> b/qemu/hw/device-assignment.c index cef7c8a..50a0d5c 100644 ---
>> a/qemu/hw/device-assignment.c +++ b/qemu/hw/device-assignment.c
>> @@ -1196,7 +1196,7 @@ out:
>>  AssignedDevInfo *add_assigned_device(const char *arg)  {
>>  char *cp, *cp1;
>> -char device[8];
>> +char device[9];
>>  char dma[6];
>>  int r;
>>  AssignedDevInfo *adev;
>> @@ -1207,6 +1207,9 @@ AssignedDevInfo *add_assigned_device(const
>>  char *arg)  return NULL; }
>>  r = get_param_value(device, sizeof(device), "host", arg);
>> +/* b:d.f format: xx:yy.z */
>> +if (r != 7)
>> +goto bad;
>>  r = get_param_value(adev->name, sizeof(adev->name), "name",
>>  arg);  if (!r) snprintf(adev->name, sizeof(adev->name), "%s",
>> device); 
>> 
> 
> I suggest replacing the parsing code with pci_parse_devaddr() (needs
> to be extended to support functions) so that all the checking and
> parsing is done in one place.

If use pci_parse_devaddr(), it needs to add domain section to assigning 
command, and add function section to pci_add/pci_del commands. What's more, 
pci_parse_devaddr() parses guest device bdf, there are some assumption, such as 
function is 0. But here parse host bdf. It's a little complex to combine them 
together.

Regards,
Weidong





--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/2] kvm: qemu: add warning message when assign device without IOMMU

2009-03-26 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>> When user wants to assign device with IOMMU, but IOMMU is not
>> enabled, add warning messages to prompt user this situation, and
>> return error. Or device will be assigned but it cannot work.
>> 
> 
> If the device has does not do DMA (say a serial port) then it can be
> assigned.  Can we check the device for dma capability and allow this
> special case?

Do you have any hint to check it? This warning be triggered only when user 
wants to assign device with IOMMU. If user doesn't want to use IOMMU to assign 
device, he can use "-pcidevice host=xx:yy.z,dma=none". 

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] kvm: qemu: check device assignment command

2009-03-26 Thread Han, Weidong
Device assignment command is like "-pcidevice host=xx:yy.z".
Check bus:dev.func length to make sure its format is xx:yy.z.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index cef7c8a..50a0d5c 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -1196,7 +1196,7 @@ out:
 AssignedDevInfo *add_assigned_device(const char *arg)
 {
 char *cp, *cp1;
-char device[8];
+char device[9];
 char dma[6];
 int r;
 AssignedDevInfo *adev;
@@ -1207,6 +1207,9 @@ AssignedDevInfo *add_assigned_device(const char *arg)
 return NULL;
 }
 r = get_param_value(device, sizeof(device), "host", arg);
+/* b:d.f format: xx:yy.z */
+if (r != 7)
+goto bad;
 r = get_param_value(adev->name, sizeof(adev->name), "name", arg);
 if (!r)
snprintf(adev->name, sizeof(adev->name), "%s", device);
-- 
1.6.0.4


0002-kvm-qemu-check-device-assignment-command.patch
Description: 0002-kvm-qemu-check-device-assignment-command.patch


[PATCH 1/2] kvm: qemu: add warning message when assign device without IOMMU

2009-03-26 Thread Han, Weidong
When user wants to assign device with IOMMU, but IOMMU is not
enabled, add warning messages to prompt user this situation, and
return error. Or device will be assigned but it cannot work.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index b7f9fa6..cef7c8a 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -616,6 +616,11 @@ static int assign_device(AssignedDevInfo *adev)
 r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
 if (r && !adev->disable_iommu)
assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+else if (!adev->disable_iommu && r == 0) {
+   fprintf(stderr, "IOMMU is not enabled. You cannot use "
+"it to assign device!\n");
+   return -EINVAL;
+}
 #endif
 
 r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
-- 
1.6.0.4


0001-kvm-qemu-add-warning-message-when-assign-device-wi.patch
Description: 0001-kvm-qemu-add-warning-message-when-assign-device-wi.patch


RE: Build breakage between kvm-userspace.git HEAD and 2.6.29-rc8

2009-03-16 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>> kvm_deassign_pci_device is defined when KVM_CAP_DEVICE_DEASSIGNMENT
>> is defined. Following patch should fix the breakage. 
>> 
> 
> Signoff please.

Sorry for missing it.  Signed-off-by: Weidong Han 

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Build breakage between kvm-userspace.git HEAD and 2.6.29-rc8

2009-03-16 Thread Han, Weidong
Gregory Haskins wrote:
> Hi kvm-devs,
>   Been away from the kvm scene for a while...hope everyone is well.
> 
> FYI: There are a few build breakages between kvm-userspace.git HEAD
> (f29d402ab885d30bfa7350378cff911c9c17a226) when you compile against
> Linus' linux-2.6.git HEAD
> (v2.6.29-rc8-041b62374c7fedc11a8a1eeda2868612d3d1436c)
> 
> I am not sure if this is a supported configuration (e.g. perhaps you
> only support building kvm-userspace HEAD against Avi's kernel tree?).
> But in case this was unintentional, read on...
> 
> First error:
> 
> libkvm.c:702: error: 'struct kvm_irq_level' has no member named
> 'status' make[1]: *** [libkvm.o] Error 1
> 
> Which is attributable to:
> 
> commit ea1b668e7684dc43e9d198ba0b25fe47a0b2acd2
> Author: Gleb Natapov 
> Date:   Wed Feb 4 17:30:01 2009 +0200
> 
> kvm: qemu: handle IRQ status injection in userspace
> 
> Reverting this patch does allow the build to get past this error. 
> Then I hit the next one:
> 
> device-assignment.o: In function `deassign_device':
> /home/ghaskins/sandbox/git/fabric/kvm-userspace/qemu/hw/device-assignment.c:562:
> undefined reference to `kvm_deassign_pci_device'
> 
> Reverting:
> 
> commit 1435f18213edee9fabf936f5d6be70054f1f2bd1
> Author: Weidong Han 
> Date:   Wed Feb 18 14:33:41 2009 +0800
> 
> kvm: qemu: fix hot remove assigned device
> 
>  Followed by:
> 
> commit 71ae34caaa961330f53f40090cd180a74fcda5fc
> Author: Weidong Han 
> Date:   Wed Feb 18 14:28:30 2009 +0800
> 
> kvm: qemu: deassign device from guest
> 
> Does fix the build.  I realize reversion is probably not the desirable
> long term solution, but at least it points to the proper places that
> need some backwards-compat love ;)
> 

kvm_deassign_pci_device is defined when KVM_CAP_DEVICE_DEASSIGNMENT is defined. 
Following patch should fix the breakage.

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 7c73210..6765683 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -551,6 +551,7 @@ static int assign_irq(AssignedDevInfo *adev)
 
 static void deassign_device(AssignedDevInfo *adev)
 {
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
 struct kvm_assigned_pci_dev assigned_dev_data;
 AssignedDevice *dev = adev->assigned_dev;
 int r;
@@ -563,6 +564,7 @@ static void deassign_device(AssignedDevInfo *adev)
 if (r < 0)
fprintf(stderr, "Failed to deassign device \"%s\" : %s\n",
 adev->name, strerror(-r));
+#endif
 }
 
 void remove_assigned_device(AssignedDevInfo *adev)



> Regards,
> -Greg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: PCI Passthrough - create_userspace_phys_mem: Invalid argument

2009-02-26 Thread Han, Weidong
freisei wrote:
> Hi List!
> 
> i want to passthrough my isdn-card in a kvm-vm.
> 
>> qemu-system-x86_64 -vnc :3 -m 2048 -name $vmname -hda
>> /dev/vg0/xp8asterisk -boot n -net nic,macaddr=52:54:00:58:08:03 -net
>> tap,script=/etc/qemu-ifup -usbdevice tablet -k de -smp 2 -monitor
>> stdio -pcidevice host=05:04.0
>> Running /etc/qemu-ifup
>> ...Bringing up  for bridged mode
>> ...Adding tap0 to intern-interface
>> init_assigned_device: Registering real physical device 05:04.0
>> (bus=5 dev=4 func=0) get_real_device: region 0 size 32 start
>> 0xf910 type 512 resource_fd 14 get_real_device: region 1 size 32
>> start 0x3000 type 256 resource_fd 0 QEMU 0.9.1 monitor - type 'help'
>> for more information (qemu) assigned_dev_pci_read_config: (6.0):
>> address= 
>> val=0x1244 len=2
>> assigned_dev_pci_read_config: (6.0): address=0002 val=0x0e00
>> len=2 assigned_dev_pci_read_config: (6.0): address=
>> val=0x1244 len=2 assigned_dev_pci_read_config: (6.0):
>> address=0002 val=0x0e00 len=2 assigned_dev_pci_read_config:
>> (6.0): address= val=0x1244 len=2
>> assigned_dev_pci_read_config: (6.0): address=0002 val=0x0e00
>> len=2 assigned_dev_pci_read_config: (6.0): address=000a
>> val=0x0280 len=2 assigned_dev_pci_read_config: (6.0):
>> address= val=0x1244 len=2 assigned_dev_pci_read_config:
>> (6.0): address=0002 val=0x0e00 len=2
>> assigned_dev_pci_write_config: (6.0): address=0010 val=0x
>> len=4 assigned_dev_pci_read_config: (6.0): address=0010
>> val=0xffe0 len=4 assigned_dev_pci_read_config: (6.0):
>> address=0010 val=0xffe0 len=4 assigned_dev_pci_write_config:
>> (6.0): address=0010 val=0xf2001100 len=4 assigned_dev_iomem_map:
>> e_phys=f2001100 r_virt=0x7f9374876000 type=0 len=0020
>> region_num=0 create_userspace_phys_mem: Invalid argument   
>> assigned_dev_iomem_map: Error: create new mapping failed

This error happened when the MMIO of assigned device is not page aligned. Here 
MMIO size of your device is 0x20 Bytes. The commit 
fb48c7c858cc612e3e1cb011c534075ac5de553d in kvm userspace should fix it. Was it 
included in your userspace? The commit changes rom bios, you can use "make 
bios", and "make install" to make it take effect. And I think replacing "-net 
nic,macaddr=52:54:00:58:08:03 -net tap,script=/etc/qemu-ifup" with "-net none" 
will also work for you.

Regards,
Weidong

> Already tried:
> - remove not needed-parameters
> - dma=none
> - set assign-device DEBUG (as you see)
> 
> Additional Info:
> lspci:
> 05:04.0 Network controller: AVM GmbH Fritz!PCI v2.0 ISDN (rev 02)

Can you post output of "lspci -v"? We can see if its mmio is page aligned or 
not. Can you check if commit fb48c7c858cc612e3e1cb011c534075ac5de553d was 
included in your kvm-userspace?

> r...@xp8main3:/etc/kvm# uname -a
> Linux xp8main3 2.6.29-rc6-xp8no7 #1 SMP Thu Feb 26 12:27:17 CET 2009
> x86_64 GNU/Linux
> 
> 
> greets freisei

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] VT-d: fix PCI device detach from virtual machine

2009-02-26 Thread Han, Weidong
When assign a device behind conventional PCI bridge or PCIe to
PCI/PCI-x bridge to a domain, it must assign its bridge and may
also need to assign secondary interface to the same domain. 

Dependent assignment is already there, but dependent
deassignment is missed when detach device from virtual machine.
This results in conventional PCI device assignment failure after
it has been assigned once. This patch addes dependent
deassignment, and fixes the issue.


Signed-off-by: Weidong Han 
---
 drivers/pci/intel-iommu.c |   29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index f3f6865..d21a1a5 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2772,6 +2772,33 @@ static int vm_domain_add_dev_info(struct dmar_domain 
*domain,
return 0;
 }
 
+static void iommu_detach_dependent_devices(struct intel_iommu *iommu,
+  struct pci_dev *pdev)
+{
+   struct pci_dev *tmp, *parent;
+
+   if (!iommu || !pdev)
+   return;
+
+   /* dependent device detach */
+   tmp = pci_find_upstream_pcie_bridge(pdev);
+   /* Secondary interface's bus number and devfn 0 */
+   if (tmp) {
+   parent = pdev->bus->self;
+   while (parent != tmp) {
+   iommu_detach_dev(iommu, parent->bus->number,
+   parent->devfn);
+   parent = parent->bus->self;
+   }
+   if (tmp->is_pcie) /* this is a PCIE-to-PCI bridge */
+   iommu_detach_dev(iommu,
+   tmp->subordinate->number, 0);
+   else /* this is a legacy PCI bridge */
+   iommu_detach_dev(iommu,
+   tmp->bus->number, tmp->devfn);
+   }
+}
+
 static void vm_domain_remove_one_dev_info(struct dmar_domain *domain,
  struct pci_dev *pdev)
 {
@@ -2797,6 +2824,7 @@ static void vm_domain_remove_one_dev_info(struct 
dmar_domain *domain,
spin_unlock_irqrestore(&device_domain_lock, flags);
 
iommu_detach_dev(iommu, info->bus, info->devfn);
+   iommu_detach_dependent_devices(iommu, pdev);
free_devinfo_mem(info);
 
spin_lock_irqsave(&device_domain_lock, flags);
@@ -2846,6 +2874,7 @@ static void vm_domain_remove_all_dev_info(struct 
dmar_domain *domain)
 
iommu = device_to_iommu(info->bus, info->devfn);
iommu_detach_dev(iommu, info->bus, info->devfn);
+   iommu_detach_dependent_devices(iommu, info->dev);
 
/* clear this iommu in iommu_bmp, update iommu count
 * and coherency
-- 
1.6.0.4


0001-VT-d-fix-PCI-device-detach-from-virtual-machine.patch
Description: 0001-VT-d-fix-PCI-device-detach-from-virtual-machine.patch


RE: [PATCH 2/2 v2] PCI: add remove_id sysfs entry

2009-02-25 Thread Han, Weidong
Chris Wright wrote:
> This adds a remove_id sysfs entry to allow users of new_id to later
> remove the added dynid.  One use case is management tools that want to
> dynamically bind/unbind devices to pci-stub driver while devices are
> assigned to KVM guests.  Rather than having to track which driver was
> originally bound to the driver, a mangement tool can simply:
> 
> # echo "8086 10f5" > /sys/bus/pci/drivers/pci-stub/new_id
> # echo :00:19.0 > /sys/bus/pci/devices/:00:19.0/driver/unbind
> # echo :00:19.0 > /sys/bus/pci/drivers/pci-stub/bind
> 
> Guest uses device
> 
> # echo "8086 10f5" > /sys/bus/pci/drivers/pci-stub/remove_id
> # echo :00:19.0 > /sys/bus/pci/drivers_probe
> 

After above two commands, I found device 00:19.0 was still bound to pci-stub, 
and doesn't work. I found it needs following unbind command between remove_id 
and drivers_probe to make it work with original driver:
  # echo "8086 10f5" > /sys/bus/pci/drivers/pci-stub/unbind

Chris, did it also happen on your side? or did I miss something? 

Regards,
Weidong

> Signed-off-by: Chris Wright 
> ---
> v2
>  - update changelog to match ABI docs (remove echo -n)
> 
>  Documentation/ABI/testing/sysfs-bus-pci |   16 ++
>  drivers/pci/pci-driver.c|   80
>  +++- 2 files changed, 94 insertions(+),
> 2 deletions(-) 
> 
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -99,6 +99,52 @@ store_new_id(struct device_driver *drive
>  }
>  static DRIVER_ATTR(new_id, S_IWUSR, NULL, store_new_id);
> 
> +/**
> + * store_remove_id - remove a PCI device ID from this driver
> + * @driver: target device driver
> + * @buf: buffer for scanning device ID data
> + * @count: input size
> + *
> + * Removes a dynamic pci device ID to this driver.
> + */
> +static ssize_t
> +store_remove_id(struct device_driver *driver, const char *buf,
> size_t count) +{
> + struct pci_dynid *dynid, *n;
> + struct pci_driver *pdrv = to_pci_driver(driver);
> + __u32 vendor, device, subvendor = PCI_ANY_ID,
> + subdevice = PCI_ANY_ID, class = 0, class_mask = 0;
> + int fields = 0;
> + int retval = -ENODEV;
> +
> + fields = sscanf(buf, "%x %x %x %x %x %x",
> + &vendor, &device, &subvendor, &subdevice,
> + &class, &class_mask);
> + if (fields < 2)
> + return -EINVAL;
> +
> + spin_lock(&pdrv->dynids.lock);
> + list_for_each_entry_safe(dynid, n, &pdrv->dynids.list, node) {
> + struct pci_device_id *id = &dynid->id;
> + if ((id->vendor == vendor) &&
> + (id->device == device) &&
> + (subvendor == PCI_ANY_ID || id->subvendor == subvendor) &&
> + (subdevice == PCI_ANY_ID || id->subdevice == subdevice) &&
> + !((id->class ^ class) & class_mask)) {
> + list_del(&dynid->node);
> + kfree(dynid);
> + retval = 0;
> + break;
> + }
> + }
> + spin_unlock(&pdrv->dynids.lock);
> +
> + if (retval)
> + return retval;
> + return count;
> +}
> +static DRIVER_ATTR(remove_id, S_IWUSR, NULL, store_remove_id);
> +
>  static void
>  pci_free_dynids(struct pci_driver *drv)
>  {
> @@ -125,6 +171,20 @@ static void pci_remove_newid_file(struct
>  {
>   driver_remove_file(&drv->driver, &driver_attr_new_id);
>  }
> +
> +static int
> +pci_create_removeid_file(struct pci_driver *drv)
> +{
> + int error = 0;
> + if (drv->probe != NULL)
> + error = driver_create_file(&drv->driver,&driver_attr_remove_id);
> + return error;
> +}
> +
> +static void pci_remove_removeid_file(struct pci_driver *drv)
> +{
> + driver_remove_file(&drv->driver, &driver_attr_remove_id);
> +}
>  #else /* !CONFIG_HOTPLUG */
>  static inline void pci_free_dynids(struct pci_driver *drv) {}
>  static inline int pci_create_newid_file(struct pci_driver *drv)
> @@ -132,6 +192,11 @@ static inline int pci_create_newid_file(
>   return 0;
>  }
>  static inline void pci_remove_newid_file(struct pci_driver *drv) {}
> +static inline int pci_create_removeid_file(struct pci_driver *drv)
> +{
> + return 0;
> +}
> +static inline void pci_remove_removeid_file(struct pci_driver *drv)
>  {} #endif
> 
>  /**
> @@ -852,13 +917,23 @@ int __pci_register_driver(struct pci_dri
>   /* register with core */
>   error = driver_register(&drv->driver);
>   if (error)
> - return error;
> + goto out;
> 
>   error = pci_create_newid_file(drv);
>   if (error)
> - driver_unregister(&drv->driver);
> + goto out_newid;
> 
> + error = pci_create_removeid_file(drv);
> + if (error)
> + goto out_removeid;
> +out:
>   return error;
> +
> +out_removeid:
> + pci_remove_newid_file(drv);
> +out_newid:
> + driver_unregister(&drv->driver);
> + goto out;
>  }
> 
>  /**
> 

[PATCH 1/1] kvm: bios: make MMIO address page aligned in guest

2009-02-19 Thread Han, Weidong
MMIO of some devices are not page aligned, such as some EHCI
controllers and virtual Realtek NIC in guest. Current guest
bios doesn't guarantee the start address of MMIO page aligned.
This may result in failure of device assignment, because KVM
only allow to register page aligned memory slots. For example,
it fails to assign EHCI controller (its MMIO size is 1KB) with
virtual Realtek NIC (its MMIO size is 256Bytes), because MMIO
of virtual Realtek NIC in guest starts from 0xf2001000, MMIO of
the EHCI controller will starts from 0xf2001400.

MMIO addresses in guest are allocated in guest bios. This patch
makes MMIO address page aligned in bios, then fixes the issue.

Signed-off-by: Weidong Han 
---
 bios/rombios32.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/bios/rombios32.c b/bios/rombios32.c
index 9d2eaaa..4dea066 100755
--- a/bios/rombios32.c
+++ b/bios/rombios32.c
@@ -967,6 +967,9 @@ static void pci_bios_init_device(PCIDevice *d)
 *paddr = (*paddr + size - 1) & ~(size - 1);
 pci_set_io_region_addr(d, i, *paddr);
 *paddr += size;
+/* make memory address page aligned */
+if (!(val & PCI_ADDRESS_SPACE_IO))
+*paddr = (*paddr + 0xfff) & 0xf000;
 }
 }
 break;
-- 
1.6.0.4


0001-kvm-bios-make-MMIO-address-page-aligned-in-guest.patch
Description: 0001-kvm-bios-make-MMIO-address-page-aligned-in-guest.patch


RE: [PATCH 7/8] kvm: qemu: deassign device from guest

2009-02-18 Thread Han, Weidong
Marcelo Tosatti wrote:
> Weidong,
> 
> Does this set fix
> 
> http://sourceforge.net/tracker2/?func=detail&aid=2432316&group_id=180599&atid=893831
> 

I found above bug was already gone even without my patch. I guess it's fixed by 
Mark:

commit: 02874f4272b6787ff94ee7256ef083257b9d1eb1
Author: Mark McLoughlin 
Date:   Fri Nov 28 17:10:47 2008 +

kvm: qemu: device-assignment: free device if hotplug fails

Signed-off-by: Mark McLoughlin 
Signed-off-by: Avi Kivity 


Actually, my patch just moves free_assigned_device into init_assigned_device, 
no functional change. But I updated the patch to also call free_assigned_device 
when pci_register_device fails in init_assigned_device, because adev is 
allocated by qemu_mallocz in add_assigned_device.

>From ce48b0d6c636d8f49bc5977d1d144fa047273846 Mon Sep 17 00:00:00 2001
From: Weidong Han 
Date: Thu, 19 Feb 2009 10:49:30 +0800
Subject: [PATCH] kvm: qemu: free device on error in init_assigned_device

make init_assigned_device call free_assigned_device on error,
and then make free_assigned_device is static because it's only
invoked in device-assigned.c.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   14 +-
 qemu/hw/device-assignment.h |1 -
 qemu/hw/pci-hotplug.c   |1 -
 3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..0b96ee4 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
 AssignedDevice *dev = adev->assigned_dev;
 
@@ -550,7 +550,7 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (NULL == dev) {
 fprintf(stderr, "%s: Error: Couldn't register real device %s\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 adev->assigned_dev = dev;
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   dev->real_device.region_number,
   dev))
-return NULL;
+goto out;
 
 /* handle interrupt routing */
 e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0) {
fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
 adev->name, strerror(-r));
-   return NULL;
+   goto out;
 }
 
 return &dev->dev;
+
+out:
+free_assigned_device(adev);
+return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
 int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 8c76453..65fafd1 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -143,7 +143,6 @@ static PCIDevice *qemu_pci_hot_assign_device(PCIBus 
*pci_bus, const char *opts)
 ret = init_assigned_device(adev, pci_bus);
 if (ret == NULL) {
 term_printf("Failed to assign device\n");
-free_assigned_device(adev);
 return NULL;
 }
 
-- 
1.6.0.4



> 
> On Wed, Feb 18, 2009 at 03:13:05PM +0800, Han, Weidong wrote:
>> free_assigned_device just frees device from qemu, it should also
>> deassign the device from guest when guest exits or hot remove
>> assigned device. 
>> 
>> Acked-by: Mark McLoughlin 
>> Signed-off-by: Weidong Han 
>> ---
>>  qemu/hw/device-assignment.c |   28 ++--
>>  qemu/hw/device-assignment.h |1 +
>>  2 files changed, 27 insertions(+), 2 deletions(-)



0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch
Description: 0003-kvm-qemu-free-device-on-error-in-init_assigned_dev-v2.patch


[PATCH 7/8] kvm: qemu: deassign device from guest

2009-02-17 Thread Han, Weidong
free_assigned_device just frees device from qemu, it should also
deassign the device from guest when guest exits or hot remove
assigned device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   28 ++--
 qemu/hw/device-assignment.h |1 +
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 5341ef2..fc89c6f 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -549,6 +549,28 @@ static int assign_irq(AssignedDevInfo *adev)
 return r;
 }
 
+static void deassign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+
+r = kvm_deassign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to deassign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+}
+
+void remove_assigned_device(AssignedDevInfo *adev)
+{
+deassign_device(adev);
+free_assigned_device(adev);
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -563,7 +585,7 @@ void assigned_dev_update_irqs()
 
 r = assign_irq(adev);
 if (r < 0)
-free_assigned_device(adev);
+remove_assigned_device(adev);
 
 adev = next;
 }
@@ -615,10 +637,12 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 /* assign device to guest */
 r = assign_device(adev);
 if (r < 0)
-goto out;
+goto assigned_out;
 
 return &dev->dev;
 
+assigned_out:
+deassign_device(adev);
 out:
 free_assigned_device(adev);
 return NULL;
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 6a9b9fa..84f3f32 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -97,6 +97,7 @@ struct AssignedDevInfo {
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
+void remove_assigned_device(AssignedDevInfo *adev);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
-- 
1.6.0.4


0007-kvm-qemu-deassign-device-from-guest.patch
Description: 0007-kvm-qemu-deassign-device-from-guest.patch


[PATCH 5/8] kvm: qemu: free device on error in init_assigned_device

2009-02-17 Thread Han, Weidong
make init_assigned_device call free_assigned_device on error,
and then make free_assigned_device is static because it's only
invoked in device-assigned.c.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   12 
 qemu/hw/device-assignment.h |1 -
 qemu/hw/pci-hotplug.c   |1 -
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..6046fdd 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
 AssignedDevice *dev = adev->assigned_dev;
 
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   dev->real_device.region_number,
   dev))
-return NULL;
+goto out;
 
 /* handle interrupt routing */
 e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0) {
fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
 adev->name, strerror(-r));
-   return NULL;
+   goto out;
 }
 
 return &dev->dev;
+
+out:
+free_assigned_device(adev);
+return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
 int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 8c76453..65fafd1 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -143,7 +143,6 @@ static PCIDevice *qemu_pci_hot_assign_device(PCIBus 
*pci_bus, const char *opts)
 ret = init_assigned_device(adev, pci_bus);
 if (ret == NULL) {
 term_printf("Failed to assign device\n");
-free_assigned_device(adev);
 return NULL;
 }
 
-- 
1.6.0.4


0005-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch
Description: 0005-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch


[PATCH 6/8] kvm: qemu: wrap assign_device and assign_irq

2009-02-17 Thread Han, Weidong
Just wrap assign_device and assign_irq, no functional changes

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |  121 +--
 1 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 6046fdd..5341ef2 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -487,6 +487,68 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -497,37 +559,11 @@ void assigned_dev_update_irqs()
 adev = LIST_FIRST(&adev_head);
 while (adev) {
 AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
-
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifdef TARGET_IA64
-   irq = ipf_map_irq(&assigned_dev->dev, irq);
-#endif
+int r;
 
-if (irq != assigned_dev->girq) {
-struct kvm_assigned_irq assigned_irq_data;
-
-memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
-assigned_irq_data.assigned_dev_id  =
-calc_assigned_dev_id(assigned_dev->h_busnr,
- (uint8_t) assigned_dev->h_devfn);
-assigned_irq_data.guest_irq = irq;
-assigned_irq_data.host_irq = assigned_dev->real_device.irq;
-r = kvm_assign_irq(kvm_context, &assigned_irq_data);
-if (r < 0) {
-fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-adev->name, strerror(-r));
-fprintf(stderr, "Perhaps you are assigning a device "
-"that shares an IRQ with another device?\n");
-free_assigned_device(adev);
-adev = next;
-continue;
-}
-assigned_dev->girq = irq;
-}
+r = assign_irq(adev);
+if (r < 0)
+free_assigned_device(adev);
 
 adev = next;
 }
@@ -576,27 +612,10 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 dev->h_busnr = adev->bus;
 dev->h_devfn = PCI_DEVFN(adev->dev, adev->func);
 
-memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
-assigned_dev_data.assigned_dev_id  =
-   calc_assigned_dev_id(dev->h_busnr, (uint32_t)dev->h_devfn);
-assigned_dev_data.busnr = dev->h_busnr;
-assigned_dev_data.devfn = dev->h_devfn;
-
-#ifdef KVM_CAP_IOMMU
-/* We always enable the IOMMU if present
- * (or when not disabled on the command line)
- */
-r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
-if (r && !adev->disable_iommu)
-   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
-#endif
-
-r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
-if (r < 0

[PATCH 8/8] kvm: qemu: fix hot remove assigned device

2009-02-17 Thread Han, Weidong
When hot remove assigned device, should deassign it
from guest and free it from qemu.

assign_dev_update_irqs may not be invoked when hot add a
device, so need to assign irq after device assignment in
init_assigned_device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   20 
 qemu/hw/device-assignment.h |1 +
 qemu/hw/pci-hotplug.c   |   17 +
 3 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index fc89c6f..d6acc67 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -571,6 +571,21 @@ void remove_assigned_device(AssignedDevInfo *adev)
 free_assigned_device(adev);
 }
 
+AssignedDevInfo *get_assigned_device(int pcibus, int slot)
+{
+AssignedDevice *assigned_dev = NULL;
+AssignedDevInfo *adev = NULL;
+
+LIST_FOREACH(adev, &adev_head, next) {
+assigned_dev = adev->assigned_dev;
+if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
+PCI_SLOT(assigned_dev->dev.devfn) == slot)
+return adev;
+}
+
+return NULL;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -639,6 +654,11 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0)
 goto assigned_out;
 
+/* assign irq for the device */
+r = assign_irq(adev);
+if (r < 0)
+goto assigned_out;
+ 
 return &dev->dev;
 
 assigned_out:
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 84f3f32..da775d7 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -98,6 +98,7 @@ PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus 
*bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
 void remove_assigned_device(AssignedDevInfo *adev);
+AssignedDevInfo *get_assigned_device(int pcibus, int slot);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 65fafd1..d2ee2c5 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -155,6 +155,14 @@ static PCIDevice *qemu_pci_hot_assign_device(PCIBus 
*pci_bus, const char *opts)
 return ret;
 }
 
+static void qemu_pci_hot_deassign_device(AssignedDevInfo *adev)
+{
+remove_assigned_device(adev);
+
+term_printf("Unregister host PCI device %02x:%02x.%1x "
+   "(\"%s\") from guest\n", 
+   adev->bus, adev->dev, adev->func, adev->name);
+}
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
 
 void pci_device_hot_add(const char *pci_addr, const char *type, const char 
*opts)
@@ -231,12 +239,21 @@ void pci_device_hot_remove_success(int pcibus, int slot)
 {
 PCIDevice *d = pci_find_device(pcibus, slot, 0);
 int class_code;
+AssignedDevInfo *adev;
 
 if (!d) {
 term_printf("invalid slot %d\n", slot);
 return;
 }
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
+adev = get_assigned_device(pcibus, slot);
+if (adev) {
+qemu_pci_hot_deassign_device(adev);
+return;
+}
+#endif /* USE_KVM_DEVICE_ASSIGNMENT */
+
 class_code = d->config_read(d, PCI_CLASS_DEVICE+1, 1);
 
 switch(class_code) {
-- 
1.6.0.4


0008-kvm-qemu-fix-hot-remove-assigned-device.patch
Description: 0008-kvm-qemu-fix-hot-remove-assigned-device.patch


[PATCH 3/8] kvm: qemu: fix hot assign device

2009-02-17 Thread Han, Weidong
Last qemu merge broke device assignment hotplug. Call
qemu_pci_hot_assign_device in pci_device_hot_add for
hot assign device, and add the command for it.
for example hot assign 01:00.0, can use following command:
  pci_add pci_addr=auto host host=01:00.0

Signed-off-by: Weidong Han 
---
 qemu/hw/device-hotplug.c |   37 -
 qemu/hw/pci-hotplug.c|   35 +++
 qemu/monitor.c   |2 +-
 3 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/qemu/hw/device-hotplug.c b/qemu/hw/device-hotplug.c
index ae72a00..782b6b4 100644
--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -30,49 +30,12 @@
 #include "pci.h"
 #include "pc.h"
 #include "console.h"
-#include "device-assignment.h"
 #include "config.h"
 #include "virtio-blk.h"
 
 #define PCI_BASE_CLASS_STORAGE  0x01
 #define PCI_BASE_CLASS_NETWORK  0x02
 
-#ifdef USE_KVM_DEVICE_ASSIGNMENT
-static PCIDevice *qemu_system_hot_assign_device(const char *opts, int bus_nr)
-{
-PCIBus *pci_bus;
-AssignedDevInfo *adev;
-PCIDevice *ret;
-
-pci_bus = pci_find_bus(bus_nr);
-if (!pci_bus) {
-term_printf ("Can't find pci_bus %d\n", bus_nr);
-return NULL;
-}
-adev = add_assigned_device(opts);
-if (adev == NULL) {
-term_printf ("Error adding device; check syntax\n");
-return NULL;
-}
- 
-ret = init_assigned_device(adev, pci_bus);
-if (ret == NULL) {
-term_printf("Failed to assign device\n");
-free_assigned_device(adev);
-return NULL;
-}
-
-term_printf("Registered host PCI device %02x:%02x.%1x "
-   "(\"%s\") as guest device %02x:%02x.%1x\n",
-   adev->bus, adev->dev, adev->func, adev->name,
-   pci_bus_num(pci_bus), (ret->devfn >> 3) & 0x1f,
-   adev->func);
-
-return ret;
-}
-
-#endif /* USE_KVM_DEVICE_ASSIGNMENT */
-
 int add_init_drive(const char *opts)
 {
 int drive_opt_idx, drive_idx;
diff --git a/qemu/hw/pci-hotplug.c b/qemu/hw/pci-hotplug.c
index 6286764..8c76453 100644
--- a/qemu/hw/pci-hotplug.c
+++ b/qemu/hw/pci-hotplug.c
@@ -31,6 +31,7 @@
 #include "console.h"
 #include "block_int.h"
 #include "virtio-blk.h"
+#include "device-assignment.h"
 
 #if defined(TARGET_I386) || defined(TARGET_X86_64)
 static PCIDevice *qemu_pci_hot_add_nic(PCIBus *pci_bus, const char *opts)
@@ -127,6 +128,36 @@ static PCIDevice *qemu_pci_hot_add_storage(PCIBus 
*pci_bus, const char *opts)
 return opaque;
 }
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
+static PCIDevice *qemu_pci_hot_assign_device(PCIBus *pci_bus, const char *opts)
+{
+AssignedDevInfo *adev;
+PCIDevice *ret;
+
+adev = add_assigned_device(opts);
+if (adev == NULL) {
+term_printf ("Error adding device; check syntax\n");
+return NULL;
+}
+ 
+ret = init_assigned_device(adev, pci_bus);
+if (ret == NULL) {
+term_printf("Failed to assign device\n");
+free_assigned_device(adev);
+return NULL;
+}
+
+term_printf("Registered host PCI device %02x:%02x.%1x "
+   "(\"%s\") as guest device %02x:%02x.%1x\n",
+   adev->bus, adev->dev, adev->func, adev->name,
+   pci_bus_num(pci_bus), (ret->devfn >> 3) & 0x1f,
+   adev->func);
+
+return ret;
+}
+
+#endif /* USE_KVM_DEVICE_ASSIGNMENT */
+
 void pci_device_hot_add(const char *pci_addr, const char *type, const char 
*opts)
 {
 PCIDevice *dev = NULL;
@@ -149,6 +180,10 @@ void pci_device_hot_add(const char *pci_addr, const char 
*type, const char *opts
 dev = qemu_pci_hot_add_nic(pci_bus, opts);
 else if (strcmp(type, "storage") == 0)
 dev = qemu_pci_hot_add_storage(pci_bus, opts);
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
+else if (strcmp(type, "host") == 0)
+dev = qemu_pci_hot_assign_device(pci_bus, opts);
+#endif /* USE_KVM_DEVICE_ASSIGNMENT */
 else
 term_printf("invalid type: %s\n", type);
 
diff --git a/qemu/monitor.c b/qemu/monitor.c
index 2699b6f..a4a6eb4 100644
--- a/qemu/monitor.c
+++ b/qemu/monitor.c
@@ -1545,7 +1545,7 @@ static const term_cmd_t term_cmds[] = {
 "[,cyls=c,heads=h,secs=s[,trans=t]]\n"
 "[snapshot=on|off][,cache=on|off]",
 "add drive to PCI storage controller" 
},
-{ "pci_add", "sss", pci_device_hot_add, 
"pci_addr=auto|[[:]:] nic|storage 
[[vlan=n][,macaddr=addr][,model=type]] [file=file][,if=type][,bus=nr]...", 
"hot-add PCI device" },
+{ "pci_add", "sss", pci_device_hot_add, 
"pci_addr=auto|[[:]:] nic|storage|host 
[[vlan=n][,macaddr=addr][,model=type]] [file=file][,if=type][,bus=nr]... 
[host=02:00.0[,name=string][,dma=none]", "hot-add PCI device" },
 { "pci_del", "s", pci_device_hot_remove, 
"pci_addr=[[:]:]", "hot remove PCI device" },
 { "host_net_add", "ss", net_host_device_add,
   "[tap,user,so

[PATCH 4/8] kvm: libkvm: add deassign ioctl

2009-02-17 Thread Han, Weidong
Add this to support hot remove assigned device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 libkvm/libkvm.c |   14 ++
 libkvm/libkvm.h |   13 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 92ffe10..a559a0d 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1141,6 +1141,20 @@ int kvm_assign_irq(kvm_context_t kvm,
 }
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev)
+{
+   int ret;
+
+   ret = ioctl(kvm->vm_fd, KVM_DEASSIGN_PCI_DEVICE, assigned_dev);
+   if (ret < 0)
+   return -errno;
+
+   return ret;
+}
+#endif
+
 int kvm_destroy_memory_region_works(kvm_context_t kvm)
 {
int ret = 0;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index e79e4fd..f6d653c 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -737,6 +737,19 @@ int kvm_assign_irq(kvm_context_t kvm,
 int kvm_destroy_memory_region_works(kvm_context_t kvm);
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+/*!
+ * \brief Notifies host kernel about a PCI device to be deassigned from a guest
+ *
+ * Used for hot remove PCI device, this function notifies the host
+ * kernel about the deassigning of the physical PCI device from a guest.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param assigned_dev Parameters, like bus, devfn number, etc
+ */
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev);
+#endif
 
 /*!
  * \brief Checks whether the generic irq routing capability is present
-- 
1.6.0.4


0004-kvm-libkvm-add-deassign-ioctl.patch
Description: 0004-kvm-libkvm-add-deassign-ioctl.patch


[PATCH 2/8] kvm: fix kvm_vm_ioctl_deassign_device

2009-02-17 Thread Han, Weidong
only need to set assigned_dev_id for deassignment, use
match->flags to judge and deassign it.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ef52622..3aacdb9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -574,7 +574,7 @@ static int kvm_vm_ioctl_deassign_device(struct kvm *kvm,
goto out;
}
 
-   if (assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
+   if (match->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
kvm_deassign_device(kvm, match);
 
kvm_free_assigned_device(kvm, match);
-- 
1.6.0.4


0002-kvm-fix-kvm_vm_ioctl_deassign_device.patch
Description: 0002-kvm-fix-kvm_vm_ioctl_deassign_device.patch


[PATCH 1/8] kvm: define KVM_CAP_DEVICE_DEASSIGNMENT

2009-02-17 Thread Han, Weidong
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.

the ioctl has been already implemented in the
commit: 0a920356748df4fb06e86c21c23d2ed6d31d37ad

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 include/linux/kvm.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 869462c..02dfb1b 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
+#define KVM_CAP_DEVICE_DEASSIGNMENT 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -466,6 +469,8 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_DEASSIGN_PCI_DEVICE _IOR(KVMIO, 0x72, \
+struct kvm_assigned_pci_dev)
 
 /*
  * ioctls for vcpu fds
-- 
1.6.0.4


0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch
Description: 0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch


[PATCH 0/8] kvm: fix device assignment hotplug

2009-02-17 Thread Han, Weidong
Last qemu merge broke device assignment hotplug. It didn't call 
qemu_pci_hot_assign_device in pci_device_hot_add for device assignment hot add, 
and also missed device assignment hotplug command in pci_add command. This 
patchset fixes this issue. Note that this patchset rebases and includes the 
patches of fixing hot remove assigned device, which have been sent in mailing 
list for version 3.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 7/7] [V3] kvm: qemu: fix hot remove assigned device with iommu

2009-02-17 Thread Han, Weidong
Han, Weidong wrote:
> Marcelo Tosatti wrote:
>> On Fri, Feb 13, 2009 at 05:49:49PM +0800, Han, Weidong wrote:
>>> when hot remove the assigned device with iommu, it should
>>> deassign it from guest and free it from qemu.
>>> 
>>> assign_dev_update_irqs may not be invoked when hot add a device,
>>> so need to assign irq after device assignment in
>>> init_assigned_device.
>> 
>> Weidong,
>> 
>> Can you please test device assignment hotplug with the current tree?
>> 

device assignment hotplug doesn't work on current tree. I had a quick glance, 
found device assignment didn't be considered, qemu_system_hot_assign_device is 
not used at all. I will fix it.

Regards,
Weidong

>> There was a qemu merge yesterday.
>> 
>> Thanks
> 
> Ok.
> 
> Regards,
> Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 7/7] [V3] kvm: qemu: fix hot remove assigned device with iommu

2009-02-17 Thread Han, Weidong
Marcelo Tosatti wrote:
> On Fri, Feb 13, 2009 at 05:49:49PM +0800, Han, Weidong wrote:
>> when hot remove the assigned device with iommu, it should
>> deassign it from guest and free it from qemu.
>> 
>> assign_dev_update_irqs may not be invoked when hot add a device,
>> so need to assign irq after device assignment in
>> init_assigned_device.
> 
> Weidong,
> 
> Can you please test device assignment hotplug with the current tree?
> 
> There was a qemu merge yesterday.
> 
> Thanks

Ok. 

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/7] [V3] kvm: qemu: deassign device from guest

2009-02-13 Thread Han, Weidong
free_assigned_device just frees device from qemu, it should also
deassign the device from guest when guest exits or hot remove
a device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   28 ++--
 qemu/hw/device-assignment.h |1 +
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 5341ef2..fc89c6f 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -549,6 +549,28 @@ static int assign_irq(AssignedDevInfo *adev)
 return r;
 }
 
+static void deassign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+
+r = kvm_deassign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to deassign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+}
+
+void remove_assigned_device(AssignedDevInfo *adev)
+{
+deassign_device(adev);
+free_assigned_device(adev);
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -563,7 +585,7 @@ void assigned_dev_update_irqs()
 
 r = assign_irq(adev);
 if (r < 0)
-free_assigned_device(adev);
+remove_assigned_device(adev);
 
 adev = next;
 }
@@ -615,10 +637,12 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 /* assign device to guest */
 r = assign_device(adev);
 if (r < 0)
-goto out;
+goto assigned_out;
 
 return &dev->dev;
 
+assigned_out:
+deassign_device(adev);
 out:
 free_assigned_device(adev);
 return NULL;
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 6a9b9fa..84f3f32 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -97,6 +97,7 @@ struct AssignedDevInfo {
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
+void remove_assigned_device(AssignedDevInfo *adev);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
-- 
1.6.0.4


0006-kvm-qemu-deassign-device-from-guest.patch
Description: 0006-kvm-qemu-deassign-device-from-guest.patch


[PATCH 7/7] [V3] kvm: qemu: fix hot remove assigned device with iommu

2009-02-13 Thread Han, Weidong
when hot remove the assigned device with iommu, it should
deassign it from guest and free it from qemu.

assign_dev_update_irqs may not be invoked when hot add a device,
so need to assign irq after device assignment in
init_assigned_device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   20 
 qemu/hw/device-assignment.h |1 +
 qemu/hw/device-hotplug.c|   17 +
 3 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index fc89c6f..d6acc67 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -571,6 +571,21 @@ void remove_assigned_device(AssignedDevInfo *adev)
 free_assigned_device(adev);
 }
 
+AssignedDevInfo *get_assigned_device(int pcibus, int slot)
+{
+AssignedDevice *assigned_dev = NULL;
+AssignedDevInfo *adev = NULL;
+
+LIST_FOREACH(adev, &adev_head, next) {
+assigned_dev = adev->assigned_dev;
+if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
+PCI_SLOT(assigned_dev->dev.devfn) == slot)
+return adev;
+}
+
+return NULL;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -639,6 +654,11 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0)
 goto assigned_out;
 
+/* assign irq for the device */
+r = assign_irq(adev);
+if (r < 0)
+goto assigned_out;
+ 
 return &dev->dev;
 
 assigned_out:
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 84f3f32..da775d7 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -98,6 +98,7 @@ PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus 
*bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
 void remove_assigned_device(AssignedDevInfo *adev);
+AssignedDevInfo *get_assigned_device(int pcibus, int slot);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
diff --git a/qemu/hw/device-hotplug.c b/qemu/hw/device-hotplug.c
index 671acb2..03987ab 100644
--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -63,6 +63,14 @@ static PCIDevice *qemu_system_hot_assign_device(const char 
*opts, int bus_nr)
 return ret;
 }
 
+static void qemu_system_hot_deassign_device(AssignedDevInfo *adev)
+{
+remove_assigned_device(adev);
+
+term_printf("Unregister host PCI device %02x:%02x.%1x "
+   "(\"%s\") from guest\n", 
+   adev->bus, adev->dev, adev->func, adev->name);
+}
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
 
 static int add_init_drive(const char *opts)
@@ -240,12 +248,21 @@ void device_hot_remove_success(int pcibus, int slot)
 {
 PCIDevice *d = pci_find_device(pcibus, slot);
 int class_code;
+AssignedDevInfo *adev;
 
 if (!d) {
 term_printf("invalid slot %d\n", slot);
 return;
 }
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
+adev = get_assigned_device(pcibus, slot);
+if (adev) {
+qemu_system_hot_deassign_device(adev);
+return;
+}
+#endif /* USE_KVM_DEVICE_ASSIGNMENT */
+
 class_code = d->config_read(d, PCI_CLASS_DEVICE+1, 1);
 
 pci_unregister_device(d);
-- 
1.6.0.4


0007-kvm-qemu-fix-hot-remove-assigned-device-with-iommu.patch
Description: 0007-kvm-qemu-fix-hot-remove-assigned-device-with-iommu.patch


[PATCH 5/7] [V3] kvm: qemu: wrap assign_device and assign_irq

2009-02-13 Thread Han, Weidong
just wrap assign_device and assign_irq, no functional changes

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |  121 +--
 1 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 6046fdd..5341ef2 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -487,6 +487,68 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -497,37 +559,11 @@ void assigned_dev_update_irqs()
 adev = LIST_FIRST(&adev_head);
 while (adev) {
 AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
-
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifdef TARGET_IA64
-   irq = ipf_map_irq(&assigned_dev->dev, irq);
-#endif
+int r;
 
-if (irq != assigned_dev->girq) {
-struct kvm_assigned_irq assigned_irq_data;
-
-memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
-assigned_irq_data.assigned_dev_id  =
-calc_assigned_dev_id(assigned_dev->h_busnr,
- (uint8_t) assigned_dev->h_devfn);
-assigned_irq_data.guest_irq = irq;
-assigned_irq_data.host_irq = assigned_dev->real_device.irq;
-r = kvm_assign_irq(kvm_context, &assigned_irq_data);
-if (r < 0) {
-fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-adev->name, strerror(-r));
-fprintf(stderr, "Perhaps you are assigning a device "
-"that shares an IRQ with another device?\n");
-free_assigned_device(adev);
-adev = next;
-continue;
-}
-assigned_dev->girq = irq;
-}
+r = assign_irq(adev);
+if (r < 0)
+free_assigned_device(adev);
 
 adev = next;
 }
@@ -576,27 +612,10 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 dev->h_busnr = adev->bus;
 dev->h_devfn = PCI_DEVFN(adev->dev, adev->func);
 
-memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
-assigned_dev_data.assigned_dev_id  =
-   calc_assigned_dev_id(dev->h_busnr, (uint32_t)dev->h_devfn);
-assigned_dev_data.busnr = dev->h_busnr;
-assigned_dev_data.devfn = dev->h_devfn;
-
-#ifdef KVM_CAP_IOMMU
-/* We always enable the IOMMU if present
- * (or when not disabled on the command line)
- */
-r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
-if (r && !adev->disable_iommu)
-   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
-#endif
-
-r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
-if (r < 0

[PATCH 4/7] [V3] kvm: qemu: free device on error in init_assigned_device

2009-02-13 Thread Han, Weidong
make init_assigned_device call free_assigned_device on error,
and then make free_assigned_device is static because it's only
invoked in device-assigned.c.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   12 
 qemu/hw/device-assignment.h |1 -
 qemu/hw/device-hotplug.c|1 -
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..6046fdd 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
 AssignedDevice *dev = adev->assigned_dev;
 
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   dev->real_device.region_number,
   dev))
-return NULL;
+goto out;
 
 /* handle interrupt routing */
 e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0) {
fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
 adev->name, strerror(-r));
-   return NULL;
+   goto out;
 }
 
 return &dev->dev;
+
+out:
+free_assigned_device(adev);
+return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
 int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/device-hotplug.c b/qemu/hw/device-hotplug.c
index d8f0fcc..671acb2 100644
--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -51,7 +51,6 @@ static PCIDevice *qemu_system_hot_assign_device(const char 
*opts, int bus_nr)
 ret = init_assigned_device(adev, pci_bus);
 if (ret == NULL) {
 term_printf("Failed to assign device\n");
-free_assigned_device(adev);
 return NULL;
 }
 
-- 
1.6.0.4


0004-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch
Description: 0004-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch


[PATCH 3/7] [V3] kvm: libkvm: add deassign ioctl

2009-02-13 Thread Han, Weidong
add this to support hot remove device with iommu

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 libkvm/libkvm.c |   14 ++
 libkvm/libkvm.h |   13 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 92ffe10..a559a0d 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1141,6 +1141,20 @@ int kvm_assign_irq(kvm_context_t kvm,
 }
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev)
+{
+   int ret;
+
+   ret = ioctl(kvm->vm_fd, KVM_DEASSIGN_PCI_DEVICE, assigned_dev);
+   if (ret < 0)
+   return -errno;
+
+   return ret;
+}
+#endif
+
 int kvm_destroy_memory_region_works(kvm_context_t kvm)
 {
int ret = 0;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index e79e4fd..f6d653c 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -737,6 +737,19 @@ int kvm_assign_irq(kvm_context_t kvm,
 int kvm_destroy_memory_region_works(kvm_context_t kvm);
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+/*!
+ * \brief Notifies host kernel about a PCI device to be deassigned from a guest
+ *
+ * Used for hot remove PCI device, this function notifies the host
+ * kernel about the deassigning of the physical PCI device from a guest.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param assigned_dev Parameters, like bus, devfn number, etc
+ */
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev);
+#endif
 
 /*!
  * \brief Checks whether the generic irq routing capability is present
-- 
1.6.0.4


0003-kvm-libkvm-add-deassign-ioctl.patch
Description: 0003-kvm-libkvm-add-deassign-ioctl.patch


[PATCH 1/7] [V3] kvm: define KVM_CAP_DEVICE_DEASSIGNMENT

2009-02-13 Thread Han, Weidong
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.

the ioctl has been already implemented in the
commit: 0a920356748df4fb06e86c21c23d2ed6d31d37ad

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 include/linux/kvm.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 869462c..02dfb1b 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
+#define KVM_CAP_DEVICE_DEASSIGNMENT 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -466,6 +469,8 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_DEASSIGN_PCI_DEVICE _IOR(KVMIO, 0x72, \
+struct kvm_assigned_pci_dev)
 
 /*
  * ioctls for vcpu fds
-- 
1.6.0.4


0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch
Description: 0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch


[PATCH 0/7] [V3] kvm: fix hot remove assigned device with IOMMU

2009-02-13 Thread Han, Weidong
Changelog from v2 -> v3:

only need to set assigned_dev_id for deassignment in userspace, and use 
match->flags to judge and deassign it in kvm ioctl.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/7] [V3] kvm: fix kvm_vm_ioctl_deassign_device

2009-02-13 Thread Han, Weidong
only need to set assigned_dev_id for deassignment, use
match->flags to judge and deassign it.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ef52622..3aacdb9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -574,7 +574,7 @@ static int kvm_vm_ioctl_deassign_device(struct kvm *kvm,
goto out;
}
 
-   if (assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
+   if (match->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
kvm_deassign_device(kvm, match);
 
kvm_free_assigned_device(kvm, match);
-- 
1.6.0.4


0002-kvm-fix-kvm_vm_ioctl_deassign_device.patch
Description: 0002-kvm-fix-kvm_vm_ioctl_deassign_device.patch


RE: [PATCH 5/6] kvm: qemu: deassign device from guest

2009-02-13 Thread Han, Weidong
Mark McLoughlin wrote:
> On Fri, 2009-02-13 at 14:58 +0800, Han, Weidong wrote:
>> +static void deassign_device(AssignedDevInfo *adev) +{
>> +struct kvm_assigned_pci_dev assigned_dev_data;
>> +AssignedDevice *dev = adev->assigned_dev;
>> +int r;
>> +
>> +memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
>> +assigned_dev_data.assigned_dev_id  =
>> +   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
> 
>> From here:
> 
>> +assigned_dev_data.busnr = dev->h_busnr;
>> +assigned_dev_data.devfn = dev->h_devfn;
>> +
>> +#ifdef KVM_CAP_IOMMU
>> +/* We always enable the IOMMU if present
>> + * (or when not disabled on the command line)
>> + */
>> +r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
>> +if (r && !adev->disable_iommu)
>> +   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
>> +#endif
> 
> to here.
> 
> As, I said, these lines should not be needed and the ioctl should be
> fixed like so:
> 
> -   if (assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
> +   if (match->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
> kvm_deassign_device(kvm, match);
> 

I ignored this comment, will update it.

Regards,
Weidong

> Cheers,
> Mark.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] kvm: define KVM_CAP_DEVICE_DEASSIGNMENT

2009-02-12 Thread Han, Weidong
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.

the ioctl has been already implemented in the
commit: 0a920356748df4fb06e86c21c23d2ed6d31d37ad

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 include/linux/kvm.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 869462c..02dfb1b 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
+#define KVM_CAP_DEVICE_DEASSIGNMENT 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -466,6 +469,8 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_DEASSIGN_PCI_DEVICE _IOR(KVMIO, 0x72, \
+struct kvm_assigned_pci_dev)
 
 /*
  * ioctls for vcpu fds
-- 
1.6.0.4


0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch
Description: 0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch


[PATCH 2/6] kvm: libkvm: add deassign ioctl

2009-02-12 Thread Han, Weidong
add this to support hot remove device with iommu

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 libkvm/libkvm.c |   14 ++
 libkvm/libkvm.h |   13 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 92ffe10..a559a0d 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1141,6 +1141,20 @@ int kvm_assign_irq(kvm_context_t kvm,
 }
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev)
+{
+   int ret;
+
+   ret = ioctl(kvm->vm_fd, KVM_DEASSIGN_PCI_DEVICE, assigned_dev);
+   if (ret < 0)
+   return -errno;
+
+   return ret;
+}
+#endif
+
 int kvm_destroy_memory_region_works(kvm_context_t kvm)
 {
int ret = 0;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index e79e4fd..f6d653c 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -737,6 +737,19 @@ int kvm_assign_irq(kvm_context_t kvm,
 int kvm_destroy_memory_region_works(kvm_context_t kvm);
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+/*!
+ * \brief Notifies host kernel about a PCI device to be deassigned from a guest
+ *
+ * Used for hot remove PCI device, this function notifies the host
+ * kernel about the deassigning of the physical PCI device from a guest.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param assigned_dev Parameters, like bus, devfn number, etc
+ */
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev);
+#endif
 
 /*!
  * \brief Checks whether the generic irq routing capability is present
-- 
1.6.0.4


0002-kvm-libkvm-add-deassign-ioctl.patch
Description: 0002-kvm-libkvm-add-deassign-ioctl.patch


[PATCH 5/6] kvm: qemu: deassign device from guest

2009-02-12 Thread Han, Weidong
free_assigned_device just frees device from qemu, it should also
deassign the device from guest when guest exits or hot remove
a device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   39 +--
 qemu/hw/device-assignment.h |1 +
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 5341ef2..7240d1e 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -549,6 +549,39 @@ static int assign_irq(AssignedDevInfo *adev)
 return r;
 }
 
+static void deassign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+
+r = kvm_deassign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to deassign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+}
+
+void remove_assigned_device(AssignedDevInfo *adev)
+{
+deassign_device(adev);
+free_assigned_device(adev);
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -563,7 +596,7 @@ void assigned_dev_update_irqs()
 
 r = assign_irq(adev);
 if (r < 0)
-free_assigned_device(adev);
+remove_assigned_device(adev);
 
 adev = next;
 }
@@ -615,10 +648,12 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 /* assign device to guest */
 r = assign_device(adev);
 if (r < 0)
-goto out;
+goto assigned_out;
 
 return &dev->dev;
 
+assigned_out:
+deassign_device(adev);
 out:
 free_assigned_device(adev);
 return NULL;
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 6a9b9fa..84f3f32 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -97,6 +97,7 @@ struct AssignedDevInfo {
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
+void remove_assigned_device(AssignedDevInfo *adev);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
-- 
1.6.0.4


0005-kvm-qemu-deassign-device-from-guest.patch
Description: 0005-kvm-qemu-deassign-device-from-guest.patch


[PATCH 6/6] kvm: qemu: fix hot remove assigned device with iommu

2009-02-12 Thread Han, Weidong
when hot remove the assigned device with iommu, it should
deassign it from guest and free it from qemu.

assign_dev_update_irqs may not be invoked when hot add a device,
so need to assign irq after device assignment in
init_assigned_device.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   20 
 qemu/hw/device-assignment.h |1 +
 qemu/hw/device-hotplug.c|   17 +
 3 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 7240d1e..d4fa1a9 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -582,6 +582,21 @@ void remove_assigned_device(AssignedDevInfo *adev)
 free_assigned_device(adev);
 }
 
+AssignedDevInfo *get_assigned_device(int pcibus, int slot)
+{
+AssignedDevice *assigned_dev = NULL;
+AssignedDevInfo *adev = NULL;
+
+LIST_FOREACH(adev, &adev_head, next) {
+assigned_dev = adev->assigned_dev;
+if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
+PCI_SLOT(assigned_dev->dev.devfn) == slot)
+return adev;
+}
+
+return NULL;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -650,6 +665,11 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0)
 goto assigned_out;
 
+/* assign irq for the device */
+r = assign_irq(adev);
+if (r < 0)
+goto assigned_out;
+ 
 return &dev->dev;
 
 assigned_out:
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 84f3f32..da775d7 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -98,6 +98,7 @@ PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus 
*bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
 void remove_assigned_device(AssignedDevInfo *adev);
+AssignedDevInfo *get_assigned_device(int pcibus, int slot);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
 void assigned_dev_update_irqs(void);
 
diff --git a/qemu/hw/device-hotplug.c b/qemu/hw/device-hotplug.c
index 671acb2..03987ab 100644
--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -63,6 +63,14 @@ static PCIDevice *qemu_system_hot_assign_device(const char 
*opts, int bus_nr)
 return ret;
 }
 
+static void qemu_system_hot_deassign_device(AssignedDevInfo *adev)
+{
+remove_assigned_device(adev);
+
+term_printf("Unregister host PCI device %02x:%02x.%1x "
+   "(\"%s\") from guest\n", 
+   adev->bus, adev->dev, adev->func, adev->name);
+}
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
 
 static int add_init_drive(const char *opts)
@@ -240,12 +248,21 @@ void device_hot_remove_success(int pcibus, int slot)
 {
 PCIDevice *d = pci_find_device(pcibus, slot);
 int class_code;
+AssignedDevInfo *adev;
 
 if (!d) {
 term_printf("invalid slot %d\n", slot);
 return;
 }
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
+adev = get_assigned_device(pcibus, slot);
+if (adev) {
+qemu_system_hot_deassign_device(adev);
+return;
+}
+#endif /* USE_KVM_DEVICE_ASSIGNMENT */
+
 class_code = d->config_read(d, PCI_CLASS_DEVICE+1, 1);
 
 pci_unregister_device(d);
-- 
1.6.0.4


0006-kvm-qemu-fix-hot-remove-assigned-device-with-iommu.patch
Description: 0006-kvm-qemu-fix-hot-remove-assigned-device-with-iommu.patch


[PATCH 4/6] kvm: qemu: wrap assign_device and assign_irq

2009-02-12 Thread Han, Weidong
just wrap assign_device and assign_irq, no functional changes

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |  121 +--
 1 files changed, 70 insertions(+), 51 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 6046fdd..5341ef2 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -487,6 +487,68 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -497,37 +559,11 @@ void assigned_dev_update_irqs()
 adev = LIST_FIRST(&adev_head);
 while (adev) {
 AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
-
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifdef TARGET_IA64
-   irq = ipf_map_irq(&assigned_dev->dev, irq);
-#endif
+int r;
 
-if (irq != assigned_dev->girq) {
-struct kvm_assigned_irq assigned_irq_data;
-
-memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
-assigned_irq_data.assigned_dev_id  =
-calc_assigned_dev_id(assigned_dev->h_busnr,
- (uint8_t) assigned_dev->h_devfn);
-assigned_irq_data.guest_irq = irq;
-assigned_irq_data.host_irq = assigned_dev->real_device.irq;
-r = kvm_assign_irq(kvm_context, &assigned_irq_data);
-if (r < 0) {
-fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-adev->name, strerror(-r));
-fprintf(stderr, "Perhaps you are assigning a device "
-"that shares an IRQ with another device?\n");
-free_assigned_device(adev);
-adev = next;
-continue;
-}
-assigned_dev->girq = irq;
-}
+r = assign_irq(adev);
+if (r < 0)
+free_assigned_device(adev);
 
 adev = next;
 }
@@ -576,27 +612,10 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 dev->h_busnr = adev->bus;
 dev->h_devfn = PCI_DEVFN(adev->dev, adev->func);
 
-memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
-assigned_dev_data.assigned_dev_id  =
-   calc_assigned_dev_id(dev->h_busnr, (uint32_t)dev->h_devfn);
-assigned_dev_data.busnr = dev->h_busnr;
-assigned_dev_data.devfn = dev->h_devfn;
-
-#ifdef KVM_CAP_IOMMU
-/* We always enable the IOMMU if present
- * (or when not disabled on the command line)
- */
-r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
-if (r && !adev->disable_iommu)
-   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
-#endif
-
-r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
-if (r < 0

[PATCH 3/6] kvm: qemu: free device on error in init_assigned_device

2009-02-12 Thread Han, Weidong
make init_assigned_device call free_assigned_device on error,
and then make free_assigned_device is static because it's only
invoked in device-assigned.c.

Acked-by: Mark McLoughlin 
Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |   12 
 qemu/hw/device-assignment.h |1 -
 qemu/hw/device-hotplug.c|1 -
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..6046fdd 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
 AssignedDevice *dev = adev->assigned_dev;
 
@@ -558,14 +558,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   dev->real_device.region_number,
   dev))
-return NULL;
+goto out;
 
 /* handle interrupt routing */
 e_device = (dev->dev.devfn >> 3) & 0x1f;
@@ -595,10 +595,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (r < 0) {
fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
 adev->name, strerror(-r));
-   return NULL;
+   goto out;
 }
 
 return &dev->dev;
+
+out:
+free_assigned_device(adev);
+return NULL;
 }
 
 /*
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index f216bb0..6a9b9fa 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -94,7 +94,6 @@ struct AssignedDevInfo {
 int disable_iommu;
 };
 
-void free_assigned_device(AssignedDevInfo *adev);
 PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
diff --git a/qemu/hw/device-hotplug.c b/qemu/hw/device-hotplug.c
index d8f0fcc..671acb2 100644
--- a/qemu/hw/device-hotplug.c
+++ b/qemu/hw/device-hotplug.c
@@ -51,7 +51,6 @@ static PCIDevice *qemu_system_hot_assign_device(const char 
*opts, int bus_nr)
 ret = init_assigned_device(adev, pci_bus);
 if (ret == NULL) {
 term_printf("Failed to assign device\n");
-free_assigned_device(adev);
 return NULL;
 }
 
-- 
1.6.0.4


0003-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch
Description: 0003-kvm-qemu-free-device-on-error-in-init_assigned_dev.patch


[PATCH 0/6] [v2] kvm: fix hot remove assigned device with IOMMU

2009-02-12 Thread Han, Weidong
Changes from v1 -> v2:

As Mark suggested, droped the loglevel change in kvm_vm_ioctl_deassign_device, 
and splited a patch into multiple patches for easier review.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 4/4] kvm: qemu: fix hot remove device

2009-02-12 Thread Han, Weidong
Good suggestion. I will split it and resend them.

Regards,
Weidong

Mark McLoughlin wrote:
> Hi Weidong,
> 
> In general, this looks like a good cleanup.
> 
> With deassign_device() fixed to only require assigned_dev_id, I would
> be happy to ACK this whole patch.
> 
> However, it would be much, much easier to review the patch if you had
> split it into multiple patches e.g.
> 
>   1) Make init_assigned_device() call free_assigned_device on error
>   2) Split out assign_device() with no functional changes
>   3) Split out assign_irq() with no functional changes
>   4) Add deassign_device() and make init_assigned_device() and
>  assigned_dev_update_irqs() use it
>   5) Add device hotunplug
> 
> On Tue, 2009-02-10 at 20:40 +0800, Han, Weidong wrote:
>> when hot remove a device with iommu, should deassign it from guest,
>> and free it from qemu.
>> 
>> assign_dev_update_irqs may not be invoked when hot add a device,
>> so need to assign irq after assign device in init_assigned_device.
>> 
>> Signed-off-by: Weidong Han 
>> ---
>>  qemu/hw/device-assignment.c |  187
>>  ++-
>>  qemu/hw/device-assignment.h |3 +- qemu/hw/device-hotplug.c|
>>  18 - 3 files changed, 151 insertions(+), 57 deletions(-)
>> 
>> diff --git a/qemu/hw/device-assignment.c
>> b/qemu/hw/device-assignment.c 
>> index e6d2352..6362798 100644
>> --- a/qemu/hw/device-assignment.c
>> +++ b/qemu/hw/device-assignment.c
>> @@ -443,7 +443,7 @@ again:
>> 
>>  static LIST_HEAD(, AssignedDevInfo) adev_head;
>> 
>> -void free_assigned_device(AssignedDevInfo *adev)
>> +static void free_assigned_device(AssignedDevInfo *adev)
> 
> Right, if init_assigned_device() fails, the device gets freed, so
> nothing outside of this file needs this.
> 
> 
>> @@ -487,6 +487,116 @@ static uint32_t calc_assigned_dev_id(uint8_t
>>  bus, uint8_t devfn) return (uint32_t)bus << 8 | (uint32_t)devfn;
>>  }
>> 
>> +static int assign_device(AssignedDevInfo *adev)
>> +{
>> +struct kvm_assigned_pci_dev assigned_dev_data;
>> +AssignedDevice *dev = adev->assigned_dev;
>> +int r;
>> +
>> +memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
>> +assigned_dev_data.assigned_dev_id  =
>> +calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
>> +assigned_dev_data.busnr = dev->h_busnr;
>> +assigned_dev_data.devfn = dev->h_devfn;
>> +
>> +#ifdef KVM_CAP_IOMMU
>> +/* We always enable the IOMMU if present
>> + * (or when not disabled on the command line)
>> + */
>> +r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
>> +if (r && !adev->disable_iommu)
>> +assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU; +#endif
>> +
>> +r = kvm_assign_pci_device(kvm_context, &assigned_dev_data); +  
>> if (r < 0) + fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
>> +adev->name, strerror(-r));
>> +return r;
>> +}
> 
> Just a copy of the code from init_assigned_device() with no functional
> changes.
> 
>> +static void deassign_device(AssignedDevInfo *adev) +{
>> +struct kvm_assigned_pci_dev assigned_dev_data;
>> +AssignedDevice *dev = adev->assigned_dev;
>> +int r;
>> +
>> +memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
>> +assigned_dev_data.assigned_dev_id  =
>> +calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
> 
> We should only need to set assigned_dev_id for deassignment, so these
> lines should not be needed:
> 
>> +assigned_dev_data.busnr = dev->h_busnr;
>> +assigned_dev_data.devfn = dev->h_devfn;
>> +
>> +#ifdef KVM_CAP_IOMMU
>> +/* We always enable the IOMMU if present
>> + * (or when not disabled on the command line)
>> + */
>> +r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
>> +if (r && !adev->disable_iommu)
>> +assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
>> +#endif
> 
> I think the kernel side needs this fix:
> 
> -   if (assigned_dev->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
> +   if (match->flags & KVM_DEV_ASSIGN_ENABLE_IOMMU)
> kvm_deassign_device(kvm, match);
> 
>> +if (kvm_deassign_pci_device(kvm_context, &assigned_dev_data))
>> +fprintf(stderr, "Could not notify kernel about "
>> +"deassigned de

RE: [PATCH 2/4] kvm: change a loglevel in kvm_vm_ioctl_deassign_device

2009-02-12 Thread Han, Weidong
Mark McLoughlin wrote:
> On Tue, 2009-02-10 at 20:40 +0800, Han, Weidong wrote:
>> change from KERN_INFO to KERN_WARNING to prompt users when
>> deassign an unassigned device
>> 
>> Signed-off-by: Weidong Han 
>> ---
>>  virt/kvm/kvm_main.c |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>> 
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index ef52622..03f9807 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -568,7 +568,7 @@ static int kvm_vm_ioctl_deassign_device(struct
>>  kvm *kvm, match =
>>
>> kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
>>  assigned_dev->assigned_dev_id); if (!match) {
>> -printk(KERN_INFO "%s: device hasn't been assigned before, "
>> +printk(KERN_WARNING "%s: device hasn't been assigned before, "
>>"so cannot be deassigned\n", __func__);
> 
> Personally, I'd drop the printk() - the calling program has passed an
> invalid device ID and we've returned -EINVAL, I think we should be
> silent in such cases and let the app do the error handling.
> 

Reasonable.

Regards,
Weidong

> Cheers,
> Mark.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] kvm: qemu: fix hot remove device

2009-02-10 Thread Han, Weidong
when hot remove a device with iommu, should deassign it from guest,
and free it from qemu.

assign_dev_update_irqs may not be invoked when hot add a device,
so need to assign irq after assign device in init_assigned_device.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |  187 ++-
 qemu/hw/device-assignment.h |3 +-
 qemu/hw/device-hotplug.c|   18 -
 3 files changed, 151 insertions(+), 57 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index e6d2352..6362798 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -443,7 +443,7 @@ again:
 
 static LIST_HEAD(, AssignedDevInfo) adev_head;
 
-void free_assigned_device(AssignedDevInfo *adev)
+static void free_assigned_device(AssignedDevInfo *adev)
 {
 AssignedDevice *dev = adev->assigned_dev;
 
@@ -487,6 +487,116 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static void deassign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+
+if (kvm_deassign_pci_device(kvm_context, &assigned_dev_data))
+   fprintf(stderr, "Could not notify kernel about "
+"deassigned device (%x:%x.%x)\n",
+dev->h_busnr, PCI_SLOT(dev->h_devfn), PCI_FUNC(dev->h_devfn));
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
+void remove_assigned_device(AssignedDevInfo *adev)
+{
+deassign_device(adev);
+free_assigned_device(adev);
+}
+
+AssignedDevInfo *get_assigned_device(int pcibus, int slot)
+{
+AssignedDevice *assigned_dev = NULL;
+AssignedDevInfo *adev = NULL;
+
+LIST_FOREACH(adev, &adev_head, next) {
+assigned_dev = adev->assigned_dev;
+if (pci_bus_num(assigned_dev->dev.bus) == pcibus &&
+PCI_SLOT(assigned_dev->dev.devfn) == slot)
+return adev;
+}
+
+return NULL;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -496,38 +606,12 @@ void assigned_dev_update_irqs()
 
 adev = LIST_FIRST(&adev_head);
 while (adev) {
-AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
+int r;
+struct AssignedDevInfo *next = LIST_NEXT(adev, next);
 
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifde

[PATCH 3/4] kvm: libkvm: add deassign ioctl

2009-02-10 Thread Han, Weidong
add this ioctl to deassign a device from guest

Signed-off-by: Weidong Han 
---
 libkvm/libkvm.c |   14 ++
 libkvm/libkvm.h |   13 +
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 80300c9..cbd4c6f 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -1141,6 +1141,20 @@ int kvm_assign_irq(kvm_context_t kvm,
 }
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev)
+{
+   int ret;
+
+   ret = ioctl(kvm->vm_fd, KVM_DEASSIGN_PCI_DEVICE, assigned_dev);
+   if (ret < 0)
+   return -errno;
+
+   return ret;
+}
+#endif
+
 int kvm_destroy_memory_region_works(kvm_context_t kvm)
 {
int ret = 0;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index e79e4fd..f6d653c 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -737,6 +737,19 @@ int kvm_assign_irq(kvm_context_t kvm,
 int kvm_destroy_memory_region_works(kvm_context_t kvm);
 #endif
 
+#ifdef KVM_CAP_DEVICE_DEASSIGNMENT
+/*!
+ * \brief Notifies host kernel about a PCI device to be deassigned from a guest
+ *
+ * Used for hot remove PCI device, this function notifies the host
+ * kernel about the deassigning of the physical PCI device from a guest.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param assigned_dev Parameters, like bus, devfn number, etc
+ */
+int kvm_deassign_pci_device(kvm_context_t kvm,
+   struct kvm_assigned_pci_dev *assigned_dev);
+#endif
 
 /*!
  * \brief Checks whether the generic irq routing capability is present
-- 
1.6.0.4


0003-kvm-libkvm-add-deassign-ioctl.patch
Description: 0003-kvm-libkvm-add-deassign-ioctl.patch


[PATCH 2/4] kvm: change a loglevel in kvm_vm_ioctl_deassign_device

2009-02-10 Thread Han, Weidong
change from KERN_INFO to KERN_WARNING to prompt users when
deassign an unassigned device

Signed-off-by: Weidong Han 
---
 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ef52622..03f9807 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -568,7 +568,7 @@ static int kvm_vm_ioctl_deassign_device(struct kvm *kvm,
match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head,
  assigned_dev->assigned_dev_id);
if (!match) {
-   printk(KERN_INFO "%s: device hasn't been assigned before, "
+   printk(KERN_WARNING "%s: device hasn't been assigned before, "
  "so cannot be deassigned\n", __func__);
r = -EINVAL;
goto out;
-- 
1.6.0.4


0002-kvm-change-a-loglevel-in-kvm_vm_ioctl_deassign_devi.patch
Description: 0002-kvm-change-a-loglevel-in-kvm_vm_ioctl_deassign_devi.patch


[PATCH 1/4] kvm: define KVM_CAP_DEVICE_DEASSIGNMENT

2009-02-10 Thread Han, Weidong
define KVM_CAP_DEVICE_DEASSIGNMENT and KVM_DEASSIGN_PCI_DEVICE
for device deassignment.

Signed-off-by: Weidong Han 
---
 include/linux/kvm.h |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 869462c..02dfb1b 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -402,6 +402,9 @@ struct kvm_trace_rec {
 #ifdef __KVM_HAVE_IOAPIC
 #define KVM_CAP_IRQ_ROUTING 25
 #endif
+#ifdef __KVM_HAVE_DEVICE_ASSIGNMENT
+#define KVM_CAP_DEVICE_DEASSIGNMENT 26
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -466,6 +469,8 @@ struct kvm_irq_routing {
 #define KVM_ASSIGN_IRQ _IOR(KVMIO, 0x70, \
struct kvm_assigned_irq)
 #define KVM_REINJECT_CONTROL  _IO(KVMIO, 0x71)
+#define KVM_DEASSIGN_PCI_DEVICE _IOR(KVMIO, 0x72, \
+struct kvm_assigned_pci_dev)
 
 /*
  * ioctls for vcpu fds
-- 
1.6.0.4


0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch
Description: 0001-kvm-define-KVM_CAP_DEVICE_DEASSIGNMENT.patch


[PATCH 0/4] kvm: fix hot remove device

2009-02-10 Thread Han, Weidong
When hot remove a device with IOMMU, should deassign it from guest and free it 
from qemu. Currently, a device cannot work when it is hot added twice with 
IOMMU (hot add -> hot remove -> hot add). This patchset fixes this issue.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] [RESEND] kvm: qemu: fix double removal of assigned device

2009-02-06 Thread Han, Weidong
free_assigned_device removes the assigned device from the
list adev_head, needn't remove it in assigned_dev_update_irqs.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 82ff00a..e6d2352 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -522,7 +522,6 @@ void assigned_dev_update_irqs()
 adev->name, strerror(-r));
 fprintf(stderr, "Perhaps you are assigning a device "
 "that shares an IRQ with another device?\n");
-LIST_REMOVE(adev, next);
 free_assigned_device(adev);
 adev = next;
 continue;
-- 
1.6.0.4


0003-kvm-qemu-fix-double-removal-of-assigned-device.patch
Description: 0003-kvm-qemu-fix-double-removal-of-assigned-device.patch


[PATCH 1/3] [RESEND] kvm: qemu: remove the useless parameter

2009-02-06 Thread Han, Weidong
should pass &assigned_dev->dev to ipf_map_irq in while loop, the
parameter PCIDevice *d is useless. And rename assign_dev_update_irq
to assigned_dev_update_irqs() because it updates irq on all assigned devices.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |4 ++--
 qemu/hw/device-assignment.h |2 +-
 qemu/hw/pci.c   |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 3fa8053..5003611 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -488,7 +488,7 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
-void assigned_dev_update_irq(PCIDevice *d)
+void assigned_dev_update_irqs()
 {
 AssignedDevInfo *adev;
 
@@ -502,7 +502,7 @@ void assigned_dev_update_irq(PCIDevice *d)
 irq = piix_get_irq(irq);
 
 #ifdef TARGET_IA64
-   irq = ipf_map_irq(d, irq);
+   irq = ipf_map_irq(&assigned_dev->dev, irq);
 #endif
 
 if (irq != assigned_dev->girq) {
diff --git a/qemu/hw/device-assignment.h b/qemu/hw/device-assignment.h
index 8b3b105..f216bb0 100644
--- a/qemu/hw/device-assignment.h
+++ b/qemu/hw/device-assignment.h
@@ -99,7 +99,7 @@ PCIDevice *init_assigned_device(AssignedDevInfo *adev, PCIBus 
*bus);
 AssignedDevInfo *add_assigned_device(const char *arg);
 void add_assigned_devices(PCIBus *bus, const char **devices, int n_devices);
 ram_addr_t assigned_dev_load_option_roms(ram_addr_t rom_base_offset);
-void assigned_dev_update_irq(PCIDevice *d);
+void assigned_dev_update_irqs(void);
 
 #define MAX_DEV_ASSIGN_CMDLINE 8
 
diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index a6d57e5..e207c7c 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -484,7 +484,7 @@ void pci_default_write_config(PCIDevice *d,
 if (kvm_enabled() && qemu_kvm_irqchip_in_kernel() &&
 address >= PIIX_CONFIG_IRQ_ROUTE &&
address < PIIX_CONFIG_IRQ_ROUTE + 4)
-assigned_dev_update_irq(d);
+assigned_dev_update_irqs();
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
 
 end = address + len;
-- 
1.6.0.4


0001-kvm-qemu-remove-the-useless-parameter.patch
Description: 0001-kvm-qemu-remove-the-useless-parameter.patch


[PATCH 0/3] [RESEND] kvm: qemu: several fixes for device assignment

2009-02-06 Thread Han, Weidong
This patchset had been sent to kvm mailing list few months ago. I suspect they 
were ignored. Now I rebased them and resend again.

This patchset removes useless paramter, fixes leak of ioperm data of assigned 
device, and fixes double removal of assigned device.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] [RESEND] kvm: qemu: fix leak of ioperm data

2009-02-06 Thread Han, Weidong
implement kvm_remove_ioperm_data to free ioperm data, and call it
in free_assigned_device to avoid leak.

Signed-off-by: Weidong Han 
---
 qemu/hw/device-assignment.c |2 ++
 qemu/qemu-kvm.c |   17 +
 qemu/qemu-kvm.h |1 +
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 5003611..82ff00a 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -457,6 +457,8 @@ void free_assigned_device(AssignedDevInfo *adev)
 if (!pci_region->valid || !(pci_region->type & IORESOURCE_MEM))
 continue;
 
+kvm_remove_ioperm_data(region->u.r_baseport, region->r_size);
+
 if (region->u.r_virtbase) {
 int ret = munmap(region->u.r_virtbase,
  (pci_region->size + 0xFFF) & 0xF000);
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 5ff63ad..f14237e 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -1329,6 +1329,23 @@ void kvm_add_ioperm_data(struct ioperm_data *data)
 LIST_INSERT_HEAD(&ioperm_head, data, entries);
 }
 
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num)
+{
+struct ioperm_data *data;
+
+data = LIST_FIRST(&ioperm_head);
+while (data) {
+struct ioperm_data *next = LIST_NEXT(data, entries);
+
+if (data->start_port == start_port && data->num == num) {
+LIST_REMOVE(data, entries);
+qemu_free(data);
+}
+
+data = next;
+}
+}
+
 void kvm_ioperm(CPUState *env, void *data)
 {
 if (kvm_enabled() && qemu_system_ready)
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index 042dd93..e7acd2e 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -127,6 +127,7 @@ struct ioperm_data;
 
 void kvm_ioperm(CPUState *env, void *data);
 void kvm_add_ioperm_data(struct ioperm_data *data);
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num);
 void kvm_arch_do_ioperm(void *_data);
 #endif
 
-- 
1.6.0.4


0002-kvm-qemu-fix-leak-of-ioperm-data.patch
Description: 0002-kvm-qemu-fix-leak-of-ioperm-data.patch


RE: [PATCH] kvm: userspace: change vtd.o to iommu.o in Kbuild

2009-01-08 Thread Han, Weidong
Amit Shah wrote:
> On Thu, Jan 08, 2009 at 03:27:12PM +0800, Huang, Wei W wrote:
>> vtd.c has been renamed to iommu.c, need to change it in Kbuild
>> accordingly. 
>> 
>> Signed-off-by: Wei Huang 
>> ---
>>  kernel/x86/Kbuild |2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>> 
>> diff --git a/kernel/x86/Kbuild b/kernel/x86/Kbuild
>> index c4723b1..48339b4 100644
>> --- a/kernel/x86/Kbuild
>> +++ b/kernel/x86/Kbuild
>> @@ -10,7 +10,7 @@ ifeq ($(EXT_CONFIG_KVM_TRACE),y)
>>  kvm-objs += kvm_trace.o
>>  endif
>>  ifeq ($(CONFIG_DMAR),y)
>> -kvm-objs += vtd.o
>> +kvm-objs += iommu.o
> 
> This won't be able to build an older kernel.
> 

Because IOMMU APIs were changed in kernel, it needs external module to support 
older kernel.

Regards,
Weidong

> Amit

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Failed to use PCI device assignment

2008-12-17 Thread Han, Weidong
l_agp agpgart shpchp pci_hotplug ext3 jbd
> mbcache usbhid hid sr_mod cdrom sd_mod crc_t10dif sg ata_generic ahci
> pata_acpi ehci_hcd uhci_hcd usbcore libata scsi_mod thermal processor
> fan fuse [last unloaded: e1000e] Dec 16 10:25:51 toto kernel:
> [324498.625679] Pid: 4028, comm: qemu-system-x86 Not tainted
> 2.6.28-rc8 #4  
> Dec 16 10:25:51 toto kernel: [324498.625681] Call Trace:
> Dec 16 10:25:51 toto kernel: [324498.625684]  []
> warn_slowpath+0x60/0x80 
> Dec 16 10:25:51 toto kernel: [324498.625693]  [] ?
> paging32_prefetch_page+0xc2/0x1a0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625700]  [] ?
> pic_irq_request+0x26/0x70 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625703]  []
> __enable_irq+0x30/0x70 
> Dec 16 10:25:51 toto kernel: [324498.625704]  []
> enable_irq+0x3c/0x60 
> Dec 16 10:25:51 toto kernel: [324498.625710]  []
> kvm_assigned_dev_ack_irq+0x2c/0x40 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625717]  []
> kvm_notify_acked_irq+0x2e/0x50 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625724]  []
> kvm_ioapic_update_eoi+0x3c/0x80 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625730]  []
> apic_mmio_write+0x221/0x610 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625736]  [] ?
> paging32_gva_to_gpa+0x38/0x70 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625742]  [] ?
> vcpu_find_mmio_dev+0x3c/0x80 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625749]  []
> emulator_write_emulated_onepage+0x9a/0x140 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625755]  []
> emulator_write_emulated+0x59/0x70 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625767]  []
> x86_emulate_insn+0x40f/0x25f0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625773]  [] ?
> emulator_read_std+0x4c/0xa0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625784]  [] ?
> do_insn_fetch+0x74/0xd0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625794]  [] ?
> x86_decode_insn+0x846/0xc90 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625804]  [] ?
> kvm_emulate_pio+0x10f/0x250 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625810]  []
> emulate_instruction+0x12c/0x2b0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625814]  []
> handle_apic_access+0x41/0x80 [kvm_intel] 
> Dec 16 10:25:51 toto kernel: [324498.625817]  []
> kvm_handle_exit+0x91/0x180 [kvm_intel] 
> Dec 16 10:25:51 toto kernel: [324498.625823]  [] ?
> kvm_apic_has_interrupt+0x52/0xd0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625829]  [] ?
> kvm_cpu_has_interrupt+0x27/0x40 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625836]  []
> kvm_arch_vcpu_ioctl_run+0x51e/0x830 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625839]  [] ?
> _spin_unlock_irqrestore+0x4e/0x50 
> Dec 16 10:25:51 toto kernel: [324498.625845]  []
> kvm_vcpu_ioctl+0x3a5/0x4a0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625848]  [] ?
> raw_pci_read+0x79/0x90 
> Dec 16 10:25:51 toto kernel: [324498.625850]  [] ?
> pty_write+0x3c/0x60 
> Dec 16 10:25:51 toto kernel: [324498.625852]  [] ?
> _spin_unlock_irqrestore+0x2f/0x50 
> Dec 16 10:25:51 toto kernel: [324498.625858]  [] ?
> kvm_vcpu_ioctl+0x0/0x4a0 [kvm] 
> Dec 16 10:25:51 toto kernel: [324498.625861]  []
> vfs_ioctl+0x28/0x90 
> Dec 16 10:25:51 toto kernel: [324498.625863]  [] ?
> __wake_up+0x40/0x50 
> Dec 16 10:25:51 toto kernel: [324498.625865]  []
> do_vfs_ioctl+0x5e/0x4c0 
> Dec 16 10:25:51 toto kernel: [324498.625867]  [] ?
> tty_ldisc_deref+0x53/0x70 
> Dec 16 10:25:51 toto kernel: [324498.625871]  [] ?
> inotify_inode_queue_event+0xbc/0xe0 
> Dec 16 10:25:51 toto kernel: [324498.625874]  [] ?
> vfs_write+0xfe/0x160 
> Dec 16 10:25:51 toto kernel: [324498.625876]  [] ?
> fget_light+0xe0/0x110 
> Dec 16 10:25:51 toto kernel: [324498.625878]  []
> sys_ioctl+0x7a/0x80 
> Dec 16 10:25:51 toto kernel: [324498.625880]  []
> sysenter_do_call+0x12/0x2f 
> Dec 16 10:25:51 toto kernel: [324498.625882] ---[ end trace
> 00bcc41ab2b61b55 ]--- 
> Dec 16 10:25:51 toto kernel: [324498.662236] [ cut here
> ]--- 
> 
> 
> Thanks,
> 
> Olivier
> 
> -Original Message-
> From: Han, Weidong [mailto:weidong@intel.com]
> Sent: Sat 12/13/08 3:12
> To: Courtay Olivier; 'kvm@vger.kernel.org'
> Subject: RE: Failed to use PCI device assignment
> 
> Courtay Olivier wrote:
>> Hi,
>> 
>> I try to use PCI device assignment on a 2.6.28-rc8 with kvm-80 on a
>> Intel that support VT-D. 
>> 
>> I have a error:
>> 
>> # qemu-system-x86_64 -pcidevice host=00:19.0 -hda  WinXP.img  -cdrom
>> /dev/scd0  -m 1024 Warning: No DNS servers found
>> create_userspace_phys_mem: Invalid argument
>> kvm_cpu_register_physical_memory: failed
>> 
>> Where 00:19.0 is a e100e intel Ethernet card. I have removed the
>> driver before launch. 
>> 
>> I have compiled qemu in DEBUG mode but none interesting information.
>> 
> 
> You can enable DEVICE_ASSIGNMENT_DEBUG in qemu/hw/device-assignment.c
> to get more information. 
> 
> I suspect your error is not related to device asisgnment. Otherwise,
> it should output "assigned_dev_iomem_map: Error: create new mapping
> failed".  
> 
> Regards,
> Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Re:RE: [PATCH 1/3] KVM: pci passthrough un-page-aligned mmio device on machines w/o VT-d

2008-12-16 Thread Han, Weidong
刘晓建 wrote:
> 2008-12-16,"Han, Weidong"  wrote:
>> Pls note that multiple devices may share one page for their MMIO,
>> and they may be assigned to different guests, there is potential
>> isolation issue. So we don't allow to assign these devices.
> 
> In my view, we can modify the get_real_device() to open the file
> /proc/iomem to find out whether the "potential issue" is a real one,
> instead of disabling this feature all the times.
> 
> Xiaojian Liu

As I know, there are few devices that have unpage-aligned mmio, and they are 
almost legacy. What's more, it's a little weird to map unpage-aligned memory 
slot, though I don't know what's problem on it. In short, I don't perfer to 
assign these devices.

Regards,
Weidong

RE: [PATCH 1/3] KVM: pci passthrough un-page-aligned mmio device on machines w/o VT-d

2008-12-15 Thread Han, Weidong
刘晓建 wrote:
> the following patches apply to KVM-79. however, just now I viewed the
> code in KVM-81, and saw the problem remains.
> 
> when assigning a pci device to guest os, the qemu can not guarantee
> the device's virtual iomem address has the same page offset with the
> real one. 
> for example: in native linux, the the starting iomem address for my
> via-rhine nic is 0xdf9fff00, which is not page aligned. But
> QEMU will emulate this nic has a page aligned mmio address. This
> difference will make guest os access the wrong address when it tries
> to do mmio. 
> 

Pls note that multiple devices may share one page for their MMIO, and they may 
be assigned to different guests, there is potential isolation issue. So we 
don't allow to assign these devices.

Regards,
Weidong

> It seems Han Weidong has disabled QEMU assign this kinds of devices
> to guest os. The following patch will remove this constraint.
> ---
>  diff -uNr kvm-79/bios/rombios32.c kvm-79-fixed/bios/rombios32.c
> --- kvm-79/bios/rombios32.c 2008-11-12 19:48:01.0 +0800
> +++ kvm-79-fixed/bios/rombios32.c 2008-12-12 11:38:41.0 +0800
> @@ -931,7 +931,9 @@
>  paddr = &pci_bios_bigmem_addr;
>  else
>  paddr = &pci_bios_mem_addr;
> -*paddr = (*paddr + size - 1) & ~(size - 1);
> +
> +   /* To preserve the iomem page offset. --Xiaojian Liu */
> +*paddr = ((*paddr + size - 1) & ~(size - 1)) | (val
>  & 0x0fff); pci_set_io_region_addr(d, i, *paddr);
>  *paddr += size;
>  }
> 
> Xiaojian Liu

N�Р骒r��yb�X�肚�v�^�)藓{.n�+�筏�hФ�≤�}��财�z�&j:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ��&�)撷f

RE: How to use PCI-passthrough with kvm

2008-12-14 Thread Han, Weidong
I think you need sync with kernel code:

$ ./configure
$ cd kernel
$ make sync LINUX=../../kvm  -- kvm kernel directory
$ cd ..
$ make
$ make install

BTW, you should also need following patch to pass compilation.

---
 kernel/x86/external-module-compat.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/x86/external-module-compat.h 
b/kernel/x86/external-module-compat.h
index b5e11e2..2f16ca7 100644
--- a/kernel/x86/external-module-compat.h
+++ b/kernel/x86/external-module-compat.h
@@ -335,7 +335,7 @@ struct kvm_desc_ptr {
 #define FEATURE_CONTROL_VMXON_ENABLED  (1<<2)
 #endif
 
-#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,29)
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,28)
 
 struct mtrr_var_range {
u32 base_lo;
-- 
1.5.1

w1ndoz wrote:
> Hi Weidong,
> Thank you for your answer.
> 
> I tried to test latest Avi's kvm.git and kvm-userspace.git.
> but, kvm-userspace fails to compile.
> 
> I used the following git repositories:
> git clone
> git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm-userspace.git
> vtd 
> 
> $ ./configure
> $ make
>  make -C libkvm
> make[1]: Entering directory `/home/ddk/vtd/libkvm'
> gcc -m32 -D__i386__ -MMD -MF ./.libkvm.d -g -fomit-frame-pointer -Wall
>  -fno-stack-protector   -I /home/ddk/vtd/kernel/include   -c -o
> libkvm.o libkvm.c
> In file included from libkvm.c:38:
> libkvm.h:630: warning: 'struct kvm_pit_state' declared inside
> parameter list libkvm.h:630: warning: its scope is only this
> definition or 
> declaration, which is probably not what you want
> libkvm.h:641: warning: 'struct kvm_pit_state' declared inside
> parameter list gcc -m32 -D__i386__ -MMD -MF ./.libkvm-x86.d -g
> -fomit-frame-pointer -Wall  -fno-stack-protector   -I
> /home/ddk/vtd/kernel/include   -c -o libkvm-x86.o libkvm-x86.c
> In file included from libkvm-x86.c:1:
> libkvm.h:630: warning: 'struct kvm_pit_state' declared inside
> parameter list libkvm.h:630: warning: its scope is only this
> definition or 
> declaration, which is probably not what you want
> libkvm.h:641: warning: 'struct kvm_pit_state' declared inside
> parameter list libkvm-x86.c:246: warning: 'struct kvm_pit_state'
> declared inside parameter list libkvm-x86.c:247: error: conflicting
> types for 'kvm_get_pit' 
> libkvm.h:630: error: previous declaration of 'kvm_get_pit' was here
> libkvm-x86.c: In function 'kvm_get_pit':
> libkvm-x86.c:251: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> libkvm-x86.c:251: error: array type has incomplete element type
> libkvm-x86.c:251: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> libkvm-x86.c:251: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> libkvm-x86.c: At top level:
> libkvm-x86.c:259: warning: 'struct kvm_pit_state' declared inside
> parameter list libkvm-x86.c:260: error: conflicting types for
> 'kvm_set_pit' 
> libkvm.h:641: error: previous declaration of 'kvm_set_pit' was here
> libkvm-x86.c: In function 'kvm_set_pit':
> libkvm-x86.c:264: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> libkvm-x86.c:264: error: array type has incomplete element type
> libkvm-x86.c:264: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> libkvm-x86.c:264: error: invalid application of 'sizeof' to incomplete
> type 'struct kvm_pit_state'
> make[1]: *** [libkvm-x86.o] Error 1
> make[1]: Leaving directory `/home/ddk/vtd/libkvm'
> make: *** [libkvm] Error 2
> 
> Any advice or tips will help?
> 
> Thanks,
> Kazushi
> 
> From: "Han, Weidong" 
> Subject: RE: How to use PCI-passthrough with kvm
> Date: Thu, 11 Dec 2008 18:26:27 +0800
> 
>> It's not related to shared irq. MSI is already supported in KVM, so
>> if the device has MSI capability, there is no sharing irq issue. 
>> 
>> Can you try latest kvm.git and kvm-userspace.git? At least you are
>> not using latest kvm-userspace because there is output "BUG:
>> kvm_destroy_phys_mem: invalid parameters (slot=-1)". In addition,
>> can you try out other device, such as add-on PCIe NIC or USB?   
>> 
>> Regards,
>> Weidong
>> 
>> w1ndoz wrote:
>>> Hi Weidong,
>>> 
>>> Thank you for your advice.
>>> 
>>> The other messages which I found are the following messages which
>>> KVM outputs:
>>> BUG: kvm_destroy_ph

RE: Failed to use PCI device assignment

2008-12-12 Thread Han, Weidong
Courtay Olivier wrote:
> Hi,
> 
> I try to use PCI device assignment on a 2.6.28-rc8 with kvm-80 on a
> Intel that support VT-D. 
> 
> I have a error:
> 
> # qemu-system-x86_64 -pcidevice host=00:19.0 -hda  WinXP.img  -cdrom
> /dev/scd0  -m 1024 Warning: No DNS servers found
> create_userspace_phys_mem: Invalid argument
> kvm_cpu_register_physical_memory: failed
> 
> Where 00:19.0 is a e100e intel Ethernet card. I have removed the
> driver before launch. 
> 
> I have compiled qemu in DEBUG mode but none interesting information.
> 

You can enable DEVICE_ASSIGNMENT_DEBUG in qemu/hw/device-assignment.c to get 
more information. 

I suspect your error is not related to device asisgnment. Otherwise, it should 
output "assigned_dev_iomem_map: Error: create new mapping failed".

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: How to use PCI-passthrough with kvm

2008-12-11 Thread Han, Weidong
It's not related to shared irq. MSI is already supported in KVM, so if the 
device has MSI capability, there is no sharing irq issue.

Can you try latest kvm.git and kvm-userspace.git? At least you are not using 
latest kvm-userspace because there is output "BUG: kvm_destroy_phys_mem: 
invalid parameters (slot=-1)". In addition, can you try out other device, such 
as add-on PCIe NIC or USB? 

Regards,
Weidong

w1ndoz wrote:
> Hi Weidong,
> 
> Thank you for your advice.
> 
> The other messages which I found are the following messages which
> KVM outputs:
> BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> BUG: kvm_destroy_phys_mem: invalid parameters (slot=-1)
> 
> and, IRQ status is as follows:
> 
> /proc/interrupt of the host OS is as follows:
> cat /proc/interrupts
>CPU0   CPU1
>   0: 45  0   IO-APIC-edge  timer
>   1:  1  1   IO-APIC-edge  i8042
>   4:  1  1   IO-APIC-edge
>   9:  0  0   IO-APIC-fasteoi   acpi
>  12:  2  2   IO-APIC-edge  i8042
>  14:   8675   8589   IO-APIC-edge  ata_piix
>  15:  0  0   IO-APIC-edge  ata_piix
>  17:  0  1   IO-APIC-fasteoi   uhci_hcd:usb3,
>  ehci_hcd:usb7 18:   2201   2200   IO-APIC-fasteoi  
> uhci_hcd:usb1, 
> uhci_hcd:usb6, pata_marvell
>  19: 43 45   IO-APIC-fasteoi   uhci_hcd:usb5, ohci1394
>  21:   1091   1060   IO-APIC-fasteoi   uhci_hcd:usb2,
>  ata_piix, eth1 23:  2  2   IO-APIC-fasteoi  
> uhci_hcd:usb4, ehci_hcd:usb8 505:   7198   7310  
> PCI-MSI-edge  [EMAIL PROTECTED]::00:02.0 506: 50 53  
> PCI-MSI-edge  kvm_assigned_msi_device 
> NMI:  0  0   Non-maskable interrupts
> LOC:  87300  87267   Local timer interrupts
> RES:  54424  49773   Rescheduling interrupts
> CAL:494446   Function call interrupts
> TLB:651625   TLB shootdowns
> SPU:  0  0   Spurious interrupts
> ERR:  0
> MIS:  0
> 
> and GuestOS is as follows:
> $ cat /proc/interrupts
>CPU0
>   0:  42779   IO-APIC-edge  timer
>   1: 76   IO-APIC-edge  i8042
>   2:  0XT-PIC-XTcascade
>   4:  1   IO-APIC-edge
>   8:  3   IO-APIC-edge  rtc
>  10: 68   IO-APIC-edge  eth1
>  11: 45   IO-APIC-edge  0, eth0
>  12:228   IO-APIC-edge  i8042
>  14:   9982   IO-APIC-edge  libata
>  15:740   IO-APIC-edge  libata
> NMI:  0   Non-maskable interrupts
> LOC:  42638   Local timer interrupts
> RES:  0   Rescheduling interrupts
> CAL:  0   function call interrupts
> TLB:  0   TLB shootdowns
> TRM:  0   Thermal event interrupts
> SPU:  0   Spurious interrupts
> ERR:  0
> MIS:  0
> Note that eth1 is Intel NIC.
> 
> Any idea ?
> 
> Thanks,
> Kazushi

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: How to use PCI-passthrough with kvm

2008-12-10 Thread Han, Weidong
Hi,

This onboard NIC assignment works for me. I used RHEL5u1 guest and e1000e 
0.2.9.5 driver. So I don't think this issue is caused by PCI passthrough on 
KVM. Is there any error messages of assigning device?

Pls note that the device that shares IRQ with other devices in host cannot work 
by PCI-passthrough currently. 

Regards,
Weidong

w1ndoz wrote:
> Hi Weidong,
> 
> Thank you for your advice
> Finally, I could recognize a Intel NIC from the guest OS.
> 
> But, I had a new problem.
> 
> A guest os have been able to recognize the NIC with
> pci-passthrough. However the NIC could not get a IP address on a
> guest os
> 
> The NIC causes the following errors and stops:
> :00:05.0: eth1: Detected Tx Unit Hang:
> 
> I collected information on webs and I knew that it was a bug of
> e1000e drivers, and used a driver of the newest version (e1000e
> 0.5.8.2.tar.gz), but the problem did not solve it.
> 
> Will this error be caused by PCI-passthrough on KVM ?
> Of course the NIC works on a host OS normally.
> 
> dmesg is as follows:
> [  361.424319] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow
> Control: RX/TX
> [  387.711336] ACPI: PCI interrupt for device :00:05.0 disabled
> [  414.760362] e1000e: Intel(R) PRO/1000 Network Driver - 0.5.8.2-NAPI
> [  414.760366] e1000e: Copyright (c) 1999-2008 Intel Corporation.
> [  414.760853] ACPI: PCI Interrupt :00:05.0[A] -> Link [LNKA] ->
> GSI 10 (level, high) -> IRQ 10
> [  414.760901] PCI: Setting latency timer of device :00:05.0 to 64
> [  414.851311] :00:05.0: : Failed to initialize MSI interrupts.
> Falling back to legacy interrupts.
> [  415.235355] :00:05.0: eth1: (PCI Express:2.5GB/s:Width x1)
> 00:19:d1:a3:fa:b3
> [  415.235359] :00:05.0: eth1: Intel(R) PRO/1000 Network
> Connection [  415.235381] :00:05.0: eth1: MAC: 7, PHY: 6, PBA No:
> ff-0ff [  415.395924] ADDRCONF(NETDEV_UP): eth1: link is not ready
> [  416.887888] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow
> Control: RX/TX
> [  416.887893] :00:05.0: eth1: 10/100 speed: disabling TSO
> [  416.888981] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> [  419.124848] :00:05.0: eth1: Detected Tx Unit Hang:
> [  419.124850]   TDH  <0>
> [  419.124851]   TDT  <3>
> [  419.124858]   next_to_use  <3>
> [  419.124858]   next_to_clean<0>
> [  419.124859] buffer_info[next_to_clean]:
> [  419.124860]   time_stamp   <4ffc>
> [  419.124861]   next_to_watch<0>
> [  419.124861]   jiffies  <5208>
> [  419.124862]   next_to_watch.status <0>
> [  421.251409] :00:05.0: eth1: Detected Tx Unit Hang:
> [  421.251411]   TDH  <0>
> [  421.251412]   TDT  <3>
> [  421.251412]   next_to_use  <3>
> [  421.251413]   next_to_clean<0>
> [  421.251414] buffer_info[next_to_clean]:
> [  421.251415]   time_stamp   <4ffc>
> [  421.251415]   next_to_watch<0>
> [  421.251416]   jiffies  <53fc>
> [  421.251417]   next_to_watch.status <0>
> [  423.378487] :00:05.0: eth1: Detected Tx Unit Hang:
> [  423.378489]   TDH  <0>
> [  423.378490]   TDT  <3>
> [  423.378490]   next_to_use  <3>
> [  423.378491]   next_to_clean<0>
> [  423.378491] buffer_info[next_to_clean]:
> [  423.378492]   time_stamp   <4ffc>
> [  423.378492]   next_to_watch<0>
> [  423.378493]   jiffies  <55f0>
> [  423.378494]   next_to_watch.status <0>
> [  425.505743] :00:05.0: eth1: Detected Tx Unit Hang:
> [  425.505746]   TDH  <0>
> [  425.505746]   TDT  <3>
> [  425.505747]   next_to_use  <3>
> [  425.505748]   next_to_clean<0>
> [  425.505749] buffer_info[next_to_clean]:
> [  425.505749]   time_stamp   <4ffc>
> [  425.505750]   next_to_watch<0>
> [  425.505751]   jiffies  <57e4>
> [  425.505751]   next_to_watch.status <0>
> [  427.615676] eth1: no IPv6 routers present
> [  427.632701] NETDEV WATCHDOG: eth1: transmit timed out
> [  429.462389] e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow
> Control: RX/TX
> [  429.462394] :00:05.0: eth1: 10/100 speed: disabling TSO
> 
> Any advice or tips will help?
> 
> Thanks,
> Kazushi
> 
>> Hi Kazushi,
>> 
>> Make sure unload the driver of the device before assign it.
>> 
>> Regards,
>> Weidong
>> 
>> w1ndoz wrote:
>>> Hi
>>> 
>>> I'm interested in PCI passthrough support.
>>> 
>>> I downloaded Linux Kernel 2.6.28-rc7 and build and ran KVM (which
>>> is KVM-79) on this kernel. I started KVM with the following
>>> command-lines: $ qemu-system-x86_64 -boot c -m 512 -hda test.qcow
>>> -localtime -k en-us -pcidevice host=00:19.0 (00:19.0 is Intel(R)
>>> PRO/1000 Network Connection Device) 
>>> 
>>> However, KVM caused the following errors:
>>> Could not notify kernel about assigned device "00:19.0"
>>> register_real_device: Device or resource busy
>>> Segma

RE: [PATCHSETS #2] KVM device passthrough support with AMD IOMMU

2008-12-10 Thread Han, Weidong
Joerg Roedel wrote:
> On Wed, Dec 10, 2008 at 02:22:08PM +, David Woodhouse wrote:
>> On Wed, 2008-12-10 at 15:11 +0100, Joerg Roedel wrote:
>>> So now its open how this will be merged alltogether. We can merge
>>> it in three steps (first from Dave's tree, then Avi and at last my
>>> IOMMU updates which has to happen in that order).
>>> The other option is to merge this all with one pull-request I send
>>> to Linus. It should not conflict with your updates Avi because all
>>> these changes apply also cleanly to current linus/master.
>>> Is this acceptable for everyone?
>>> 
>>> David, its important to hear your opinion on that because this
>>> single pull-request would include all VT-d updates you have queued
>>> in your tree? Is this ok for you?
>> 
>> Yes, that's fine by me. Should I apply patches 1-14 of Weidong's
>> patch set from Monday first? Or just let you do that?
> 
> I already did it on-top of your tree because Han Weidong's patches
> 1-17 were rebased to your tree and my IOMMU-API patches apply on-top
> of his patches.
> 
> Joerg

It's good. Thanks Dave and Joerg.

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Change P2P domain flags

2008-12-10 Thread Han, Weidong
Mike Day wrote:
> On 09/12/08 00:02 +0800, Han, Weidong wrote:
> 
>> +/* devices under the same p2p bridge are owned in one domain */
>> +#define DOMAIN_FLAG_P2P_MULTIPLE_DEVICES (1 < 0)
> 
> Shouldn't that be (1 << 0 )?
> 
> 

You are right. Thanks.

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] [v3] kvm: qemu: Assign irq in init_assigned_device

2008-12-10 Thread Han, Weidong
v2->v3:  fixed "use after free bug" reported by Mark.

assign_dev_update_irqs may not be invoked when hot add a device, so need to 
assign irq after assign device in init_assigned_device.

Additionally, clean up assign_dev_update_irqs code, and free the assigned 
device in init_assigned_device when it fails.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |  134 +-
 qemu/hw/device-hotplug.c|1 -
 2 files changed, 80 insertions(+), 55 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 03a52e6..2fbc4bf 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -490,6 +490,68 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -499,38 +561,12 @@ void assigned_dev_update_irqs()
 
 adev = LIST_FIRST(&adev_head);
 while (adev) {
-AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
-
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifdef TARGET_IA64
-   irq = ipf_map_irq(&assigned_dev->dev, irq);
-#endif
-
-if (irq != assigned_dev->girq) {
-struct kvm_assigned_irq assigned_irq_data;
-
-memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
-assigned_irq_data.assigned_dev_id  =
-calc_assigned_dev_id(assigned_dev->h_busnr,
- (uint8_t) assigned_dev->h_devfn);
-assigned_irq_data.guest_irq = irq;
-assigned_irq_data.host_irq = assigned_dev->real_device.irq;
-r = kvm_assign_irq(kvm_context, &assigned_irq_data);
-if (r < 0) {
-fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-adev->name, strerror(-r));
-fprintf(stderr, "Perhaps you are assigning a device "
-"that shares an IRQ with another device?\n");
-free_assigned_device(adev);
-adev = next;
-continue;
-}
-assigned_dev->girq = irq;
-}
+int r;
+struct AssignedDevInfo *next = LIST_NEXT(adev, next);
+
+r = assign_irq(adev);
+if (r < 0)
+free_assigned_device(adev);
 
 adev = next;
 }
@@ -561,14 +597,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   

[PATCH 2/4] [v3] kvm: qemu: Fix leak of ioperm data

2008-12-10 Thread Han, Weidong
v2->v3:  fixed "use after free bug" reported by Mark.

Free ioperm data in free_assigned_device.
And also, define ioperm_data structure and declare related functions when 
USE_KVM_DEVICE_ASSIGNMENT is defined, because ioperm data is only used by 
device assignment.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |2 ++
 qemu/qemu-kvm.c |   17 +
 qemu/qemu-kvm.h |   12 +++-
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 74f183a..04b6450 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -460,6 +460,8 @@ void free_assigned_device(AssignedDevInfo *adev)
 if (!pci_region->valid || !(pci_region->type & IORESOURCE_MEM))
 continue;
 
+kvm_remove_ioperm_data(region->u.r_baseport, region->r_size);
+
 if (region->u.r_virtbase) {
 int ret = munmap(region->u.r_virtbase,
  (pci_region->size + 0xFFF) & 0xF000);
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 067cf03..960668f 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -1145,6 +1145,23 @@ void kvm_add_ioperm_data(struct ioperm_data *data)
 LIST_INSERT_HEAD(&ioperm_head, data, entries);
 }
 
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num)
+{
+struct ioperm_data *data;
+
+data = LIST_FIRST(&ioperm_head);
+while (data) {
+struct ioperm_data *next = LIST_NEXT(data, entries);
+
+if (data->start_port == start_port && data->num == num) {
+LIST_REMOVE(data, entries);
+qemu_free(data);
+}
+
+data = next;
+}
+}
+
 void kvm_ioperm(CPUState *env, void *data)
 {
 if (kvm_enabled() && qemu_system_ready)
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index 90fadcd..3e86586 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -90,11 +90,6 @@ int qemu_kvm_unregister_coalesced_mmio(target_phys_addr_t 
addr,
 
 void qemu_kvm_system_reset_request(void);
 
-#ifdef USE_KVM_DEVICE_ASSIGNMENT
-void kvm_ioperm(CPUState *env, void *data);
-void kvm_arch_do_ioperm(void *_data);
-#endif
-
 #ifdef TARGET_PPC
 int handle_powerpc_dcr_read(int vcpu, uint32_t dcrn, uint32_t *data);
 int handle_powerpc_dcr_write(int vcpu,uint32_t dcrn, uint32_t data);
@@ -110,6 +105,7 @@ int handle_powerpc_dcr_write(int vcpu,uint32_t dcrn, 
uint32_t data);
 extern int kvm_allowed;
 extern kvm_context_t kvm_context;
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
 struct ioperm_data {
 unsigned long start_port;
 unsigned long num;
@@ -117,6 +113,12 @@ struct ioperm_data {
 LIST_ENTRY(ioperm_data) entries;
 };
 
+void kvm_ioperm(CPUState *env, void *data);
+void kvm_add_ioperm_data(struct ioperm_data *data);
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num);
+void kvm_arch_do_ioperm(void *_data);
+#endif
+
 int qemu_kvm_has_sync_mmu(void);
 
 #define kvm_enabled() (kvm_allowed)
-- 
1.6.0.4


0002-Fix-leak-of-ioperm-data.patch
Description: 0002-Fix-leak-of-ioperm-data.patch


RE: [PATCH 2/4] [v2] kvm: qemu: Fix leak of ioperm data

2008-12-10 Thread Han, Weidong
Mark McLoughlin wrote:
> On Wed, 2008-12-10 at 21:22 +0800, Han, Weidong wrote:
>> 
>> +void kvm_remove_ioperm_data(unsigned long start_port, unsigned long
>> num) +{ +struct ioperm_data *data;
>> +
>> +data = LIST_FIRST(&ioperm_head);
>> +while (data) {
>> +if (data->start_port == start_port && data->num == num) {
>> +LIST_REMOVE(data, entries);
>> +qemu_free(data);
>> +}
>> +
>> +data = LIST_NEXT(data, entries);
>> +}
>> +}
> 
> Repeating what I said last time:
> 
>  You've a "use after free bug" here; you free the structure and
>  LIST_NEXT de-references the pointer to it in order to obtain the
>  pointer to the next structure.
> 
> What you need is:
> 
> {
> struct ioperm_data *data;
> 
> data = LIST_FIRST(&ioperm_head);
> while (data) {
> struct ioperm_data *next = LIST_NEXT(data, entries);
> 
> if (data->start_port == start_port && data->num == num) {
> LIST_REMOVE(data, entries);
> qemu_free(data);
> }
> 
> data = next;
> }
> }
> 
> Cheers,
> Mark.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/4] [v2] kvm: qemu: Fix leak of ioperm data

2008-12-10 Thread Han, Weidong
Mark McLoughlin wrote:
> On Wed, 2008-12-10 at 21:22 +0800, Han, Weidong wrote:
>> 
>> +void kvm_remove_ioperm_data(unsigned long start_port, unsigned long
>> num) +{ +struct ioperm_data *data;
>> +
>> +data = LIST_FIRST(&ioperm_head);
>> +while (data) {
>> +if (data->start_port == start_port && data->num == num) {
>> +LIST_REMOVE(data, entries);
>> +qemu_free(data);
>> +}
>> +
>> +data = LIST_NEXT(data, entries);
>> +}
>> +}
> 
> Repeating what I said last time:
> 
>  You've a "use after free bug" here; you free the structure and
>  LIST_NEXT de-references the pointer to it in order to obtain the
>  pointer to the next structure.
> 

my God, will fix it soon. Thanks.

Regards,
Weidong

> What you need is:
> 
> {
> struct ioperm_data *data;
> 
> data = LIST_FIRST(&ioperm_head);
> while (data) {
> struct ioperm_data *next = LIST_NEXT(data, entries);
> 
> if (data->start_port == start_port && data->num == num) {
> LIST_REMOVE(data, entries);
> qemu_free(data);
> }
> 
> data = next;
> }
> }
> 
> Cheers,
> Mark.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] [v2] kvm: qemu: Assign irq in init_assigned_device

2008-12-10 Thread Han, Weidong
assign_dev_update_irqs may not be invoked when hot add a device, so need to 
assign irq after assign device in init_assigned_device.

Additionally, clean up assign_dev_update_irqs code, and free the assigned 
device in init_assigned_device when it fails.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |  135 +-
 qemu/hw/device-hotplug.c|1 -
 2 files changed, 80 insertions(+), 56 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 03a52e6..160f001 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -490,6 +490,68 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 return (uint32_t)bus << 8 | (uint32_t)devfn;
 }
 
+static int assign_device(AssignedDevInfo *adev)
+{
+struct kvm_assigned_pci_dev assigned_dev_data;
+AssignedDevice *dev = adev->assigned_dev;
+int r;
+
+memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
+assigned_dev_data.assigned_dev_id  =
+   calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_dev_data.busnr = dev->h_busnr;
+assigned_dev_data.devfn = dev->h_devfn;
+
+#ifdef KVM_CAP_IOMMU
+/* We always enable the IOMMU if present
+ * (or when not disabled on the command line)
+ */
+r = kvm_check_extension(kvm_context, KVM_CAP_IOMMU);
+if (r && !adev->disable_iommu)
+   assigned_dev_data.flags |= KVM_DEV_ASSIGN_ENABLE_IOMMU;
+#endif
+ 
+r = kvm_assign_pci_device(kvm_context, &assigned_dev_data);
+if (r < 0)
+   fprintf(stderr, "Failed to assign device \"%s\" : %s\n",
+adev->name, strerror(-r));
+return r;
+}
+
+static int assign_irq(AssignedDevInfo *adev)
+{
+struct kvm_assigned_irq assigned_irq_data;
+AssignedDevice *dev = adev->assigned_dev;
+int irq, r = 0;
+
+irq = pci_map_irq(&dev->dev, dev->intpin);
+irq = piix_get_irq(irq);
+
+#ifdef TARGET_IA64
+irq = ipf_map_irq(&dev->dev, irq);
+#endif
+
+if (dev->girq == irq)
+return r;
+
+memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
+assigned_irq_data.assigned_dev_id =
+calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
+assigned_irq_data.guest_irq = irq;
+assigned_irq_data.host_irq = dev->real_device.irq;
+r = kvm_assign_irq(kvm_context, &assigned_irq_data);
+if (r < 0) {
+fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
+adev->name, strerror(-r));
+fprintf(stderr, "Perhaps you are assigning a device "
+"that shares an IRQ with another device?\n");
+return r;
+}
+
+dev->girq = irq;
+return r;
+}
+
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
@@ -499,40 +561,13 @@ void assigned_dev_update_irqs()
 
 adev = LIST_FIRST(&adev_head);
 while (adev) {
-AssignedDevInfo *next = LIST_NEXT(adev, next);
-AssignedDevice *assigned_dev = adev->assigned_dev;
-int irq, r;
+int r;
+
+r = assign_irq(adev);
+if (r < 0)
+free_assigned_device(adev);
 
-irq = pci_map_irq(&assigned_dev->dev, assigned_dev->intpin);
-irq = piix_get_irq(irq);
-
-#ifdef TARGET_IA64
-   irq = ipf_map_irq(&assigned_dev->dev, irq);
-#endif
-
-if (irq != assigned_dev->girq) {
-struct kvm_assigned_irq assigned_irq_data;
-
-memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
-assigned_irq_data.assigned_dev_id  =
-calc_assigned_dev_id(assigned_dev->h_busnr,
- (uint8_t) assigned_dev->h_devfn);
-assigned_irq_data.guest_irq = irq;
-assigned_irq_data.host_irq = assigned_dev->real_device.irq;
-r = kvm_assign_irq(kvm_context, &assigned_irq_data);
-if (r < 0) {
-fprintf(stderr, "Failed to assign irq for \"%s\": %s\n",
-adev->name, strerror(-r));
-fprintf(stderr, "Perhaps you are assigning a device "
-"that shares an IRQ with another device?\n");
-free_assigned_device(adev);
-adev = next;
-continue;
-}
-assigned_dev->girq = irq;
-}
-
-adev = next;
+adev = LIST_NEXT(adev, next);
 }
 }
 
@@ -561,14 +596,14 @@ struct PCIDevice *init_assigned_device(AssignedDevInfo 
*adev, PCIBus *bus)
 if (get_real_device(dev, adev->bus, adev->dev, adev->func)) {
 fprintf(stderr, "%s: Error: Couldn't get real device (%s)!\n",
 __func__, adev->name);
-return NULL;
+goto out;
 }
 
 /* handle real device's MMIO/PIO BARs */
 if (assigned_dev_register_regions(dev->real_device.regions,
   dev->real_device.region_number,
  

[PATCH 3/4] [v2] kvm: qemu: Fix double LIST_REMOVE device

2008-12-10 Thread Han, Weidong
>From 614ef2e7027089a716fdccbb2da1d75630ef41f7 Mon Sep 17 00:00:00 2001
From: Weidong Han <[EMAIL PROTECTED]>
Date: Wed, 10 Dec 2008 21:17:46 +0800
Subject: [PATCH] Fix double LIST_REMOVE device

adev will be removed in free_assigned_device, shouldn't remove it in 
assigned_dev_update_irqs.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 04b6450..03a52e6 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -525,7 +525,6 @@ void assigned_dev_update_irqs()
 adev->name, strerror(-r));
 fprintf(stderr, "Perhaps you are assigning a device "
 "that shares an IRQ with another device?\n");
-LIST_REMOVE(adev, next);
 free_assigned_device(adev);
 adev = next;
 continue;
-- 
1.6.0.4


0003-Fix-double-LIST_REMOVE-device.patch
Description: 0003-Fix-double-LIST_REMOVE-device.patch


[PATCH 2/4] [v2] kvm: qemu: Fix leak of ioperm data

2008-12-10 Thread Han, Weidong
Free ioperm data in free_assigned_device.
And also, define ioperm_data structure and declare related functions when 
USE_KVM_DEVICE_ASSIGNMENT is defined, because ioperm data is only used by 
device assignment.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |2 ++
 qemu/qemu-kvm.c |   15 +++
 qemu/qemu-kvm.h |   12 +++-
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 74f183a..04b6450 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -460,6 +460,8 @@ void free_assigned_device(AssignedDevInfo *adev)
 if (!pci_region->valid || !(pci_region->type & IORESOURCE_MEM))
 continue;
 
+kvm_remove_ioperm_data(region->u.r_baseport, region->r_size);
+
 if (region->u.r_virtbase) {
 int ret = munmap(region->u.r_virtbase,
  (pci_region->size + 0xFFF) & 0xF000);
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 067cf03..604f84e 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -1145,6 +1145,21 @@ void kvm_add_ioperm_data(struct ioperm_data *data)
 LIST_INSERT_HEAD(&ioperm_head, data, entries);
 }
 
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num)
+{
+struct ioperm_data *data;
+
+data = LIST_FIRST(&ioperm_head);
+while (data) {
+if (data->start_port == start_port && data->num == num) {
+LIST_REMOVE(data, entries);
+qemu_free(data);
+}
+
+data = LIST_NEXT(data, entries);
+}
+}
+
 void kvm_ioperm(CPUState *env, void *data)
 {
 if (kvm_enabled() && qemu_system_ready)
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index 90fadcd..3e86586 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -90,11 +90,6 @@ int qemu_kvm_unregister_coalesced_mmio(target_phys_addr_t 
addr,
 
 void qemu_kvm_system_reset_request(void);
 
-#ifdef USE_KVM_DEVICE_ASSIGNMENT
-void kvm_ioperm(CPUState *env, void *data);
-void kvm_arch_do_ioperm(void *_data);
-#endif
-
 #ifdef TARGET_PPC
 int handle_powerpc_dcr_read(int vcpu, uint32_t dcrn, uint32_t *data);
 int handle_powerpc_dcr_write(int vcpu,uint32_t dcrn, uint32_t data);
@@ -110,6 +105,7 @@ int handle_powerpc_dcr_write(int vcpu,uint32_t dcrn, 
uint32_t data);
 extern int kvm_allowed;
 extern kvm_context_t kvm_context;
 
+#ifdef USE_KVM_DEVICE_ASSIGNMENT
 struct ioperm_data {
 unsigned long start_port;
 unsigned long num;
@@ -117,6 +113,12 @@ struct ioperm_data {
 LIST_ENTRY(ioperm_data) entries;
 };
 
+void kvm_ioperm(CPUState *env, void *data);
+void kvm_add_ioperm_data(struct ioperm_data *data);
+void kvm_remove_ioperm_data(unsigned long start_port, unsigned long num);
+void kvm_arch_do_ioperm(void *_data);
+#endif
+
 int qemu_kvm_has_sync_mmu(void);
 
 #define kvm_enabled() (kvm_allowed)
-- 
1.6.0.4


0002-Fix-leak-of-ioperm-data.patch
Description: 0002-Fix-leak-of-ioperm-data.patch


[PATCH 1/4] [v2] kvm: qemu: Remove the useless parameter of assigned_dev_update_irq

2008-12-10 Thread Han, Weidong
Should pass &assigned_dev->dev to ipf_map_irq in while loop. And rename it to 
assigned_dev_update_irqs() because it updates irq on all assigned devices.

Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
---
 qemu/hw/device-assignment.c |4 ++--
 qemu/hw/pci.c   |4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/qemu/hw/device-assignment.c b/qemu/hw/device-assignment.c
index 4a38a22..74f183a 100644
--- a/qemu/hw/device-assignment.c
+++ b/qemu/hw/device-assignment.c
@@ -491,7 +491,7 @@ static uint32_t calc_assigned_dev_id(uint8_t bus, uint8_t 
devfn)
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
-void assigned_dev_update_irq(PCIDevice *d)
+void assigned_dev_update_irqs()
 {
 AssignedDevInfo *adev;
 
@@ -505,7 +505,7 @@ void assigned_dev_update_irq(PCIDevice *d)
 irq = piix_get_irq(irq);
 
 #ifdef TARGET_IA64
-   irq = ipf_map_irq(d, irq);
+   irq = ipf_map_irq(&assigned_dev->dev, irq);
 #endif
 
 if (irq != assigned_dev->girq) {
diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c
index c93758d..bdd2985 100644
--- a/qemu/hw/pci.c
+++ b/qemu/hw/pci.c
@@ -51,7 +51,7 @@ struct PCIBus {
 
 static void pci_update_mappings(PCIDevice *d);
 static void pci_set_irq(void *opaque, int irq_num, int level);
-void assigned_dev_update_irq(PCIDevice *d);
+void assigned_dev_update_irqs(void);
 
 target_phys_addr_t pci_mem_base;
 static int pci_irq_index;
@@ -459,7 +459,7 @@ void pci_default_write_config(PCIDevice *d,
 if (kvm_enabled() && qemu_kvm_irqchip_in_kernel() &&
 address >= PIIX_CONFIG_IRQ_ROUTE &&
address < PIIX_CONFIG_IRQ_ROUTE + 4)
-assigned_dev_update_irq(d);
+assigned_dev_update_irqs();
 #endif /* USE_KVM_DEVICE_ASSIGNMENT */
 
 end = address + len;
-- 
1.6.0.4


0001-Remove-the-useless-parameter-of-assigned_dev_update_.patch
Description: 0001-Remove-the-useless-parameter-of-assigned_dev_update_.patch


RE: [PATCH 4/4] kvm: qemu: Assign irq in init_assigned_device

2008-12-10 Thread Han, Weidong
Mark McLoughlin wrote:
> On Wed, 2008-12-10 at 17:45 +0800, Han, Weidong wrote:
> 
>> assign_dev_update_irq may not be invoked when hot add a device, so
>> need to assign irq after assign device in init_assigned_device.
> 
> Makes sense, but ...
> 
>> Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
>> ---
>>  qemu/hw/device-assignment.c |   99
>>  -- 1 files changed, 66
>> insertions(+), 33 deletions(-) 
>> 
>> diff --git a/qemu/hw/device-assignment.c
>> b/qemu/hw/device-assignment.c 
>> index 20618a4..557beb8 100644
>> --- a/qemu/hw/device-assignment.c
>> +++ b/qemu/hw/device-assignment.c
>> @@ -490,6 +490,65 @@ static uint32_t calc_assigned_dev_id(uint8_t
>>  bus, uint8_t devfn) return (uint32_t)bus << 8 | (uint32_t)devfn;
>>  }
>> 
> ...
>> +static int assign_irq(AssignedDevInfo *adev)
>> +{
>> +struct kvm_assigned_irq assigned_irq_data;
>> +AssignedDevice *dev = adev->assigned_dev;
>> +int irq, r = 0;
>> +
>> +irq = pci_map_irq(&dev->dev, dev->intpin);
>> +irq = piix_get_irq(irq);
>> +
>> +#ifdef TARGET_IA64
>> +irq = ipf_map_irq(&dev->dev, irq);
>> +#endif
> 
> If you added here:
> 
>if (irq == dev->girq)
>return;
> 
> then ...
> 
>> +memset(&assigned_irq_data, 0, sizeof(assigned_irq_data));
>> +assigned_irq_data.assigned_dev_id =
>> +calc_assigned_dev_id(dev->h_busnr, dev->h_devfn);
>> +assigned_irq_data.guest_irq = irq;
>> +assigned_irq_data.host_irq = dev->real_device.irq;
>> +r = kvm_assign_irq(kvm_context, &assigned_irq_data); +if (r
>> < 0) { +fprintf(stderr, "Failed to assign irq for \"%s\":
>> %s\n", +adev->name, strerror(-r));
>> +fprintf(stderr, "Perhaps you are assigning a device "
>> +"that shares an IRQ with another device?\n"); +
>> return r; +}
>> +
>> +dev->girq = irq;
>> +return r;
>> +}
>> +
>>  /* The pci config space got updated. Check if irq numbers have
>> changed 
>>   * for our devices
>>   */
>> @@ -511,20 +570,8 @@ void assigned_dev_update_irq()  #endif
>> 
>>  if (irq != assigned_dev->girq) {
>> -struct kvm_assigned_irq assigned_irq_data; -
>> -memset(&assigned_irq_data, 0,
>> sizeof(assigned_irq_data)); 
>> -assigned_irq_data.assigned_dev_id  =
>> -calc_assigned_dev_id(assigned_dev->h_busnr,
>> - (uint8_t)
>> assigned_dev->h_devfn); 
>> -assigned_irq_data.guest_irq = irq;
>> -assigned_irq_data.host_irq =
>> assigned_dev->real_device.irq; 
>> -r = kvm_assign_irq(kvm_context, &assigned_irq_data);
>> +r = assign_irq(adev);
>>  if (r < 0) {
>> -fprintf(stderr, "Failed to assign irq for \"%s\":
>> %s\n", 
>> -adev->name, strerror(-r));
>> -fprintf(stderr, "Perhaps you are assigning a device
>> " 
>> -"that shares an IRQ with another
>>  device?\n"); free_assigned_device(adev);
>>  adev = next;
>>  continue;
> 
>  you end up with:
> 
> void assigned_dev_update_irq(PCIDevice *d)
> {
> AssignedDevInfo *adev;
> 
> adev = LIST_FIRST(&adev_head);
> while (adev) {
> AssignedDevInfo *next = LIST_NEXT(adev, next);
> int r;
> 
> r = assign_irq(adev);
> if (r < 0)
> free_assigned_device(adev);
> 
> adev = next;
> }
> 
> which is much nicer.

I will clean up it.

> 
>> @@ -579,27 +626,13 @@ struct PCIDevice
>>  *init_assigned_device(AssignedDevInfo *adev, PCIBus *bus)
>>  dev->h_busnr = adev->bus; dev->h_devfn = PCI_DEVFN(adev->dev,
>> adev->func); 
>> 
>> -memset(&assigned_dev_data, 0, sizeof(assigned_dev_data));
>> -assigned_dev_data.assigned_dev_id  =
>> -calc_assigned_dev_id(dev->h_busnr, (uint32_t)dev->h_devfn);
>> -assigned_dev_data.busnr = dev->h_busnr;
>> -assigned_dev_data.devfn = dev->h_devfn;
>> -
>> -#ifdef KVM_CAP_IOMMU
>> -/* We always enable the IOMMU if present
>> - * (or when not d

RE: [PATCH 16/17] KVM: support device assignment

2008-12-10 Thread Han, Weidong
Avi Kivity wrote:
> Han, Weidong wrote:
>> Support device assignment, it can be used in device hotplug.
>> 
>> 
> 
> device _de_assignment.

Yes, it's device deassignment. Sorry tt's a typo. 

Regards,
Weidong--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/4] kvm: qemu: Remove the useless parameter of assigned_dev_update_irq

2008-12-10 Thread Han, Weidong
Mark McLoughlin wrote:
> On Wed, 2008-12-10 at 17:44 +0800, Han, Weidong wrote:
>> Should pass &assigned_dev->dev to ipf_map_irq in while loop.
>> 
>> Signed-off-by: Weidong Han <[EMAIL PROTECTED]>
>> ---
>>  qemu/hw/device-assignment.c |4 ++--
>>  qemu/hw/pci.c   |4 ++--
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>> 
>> diff --git a/qemu/hw/device-assignment.c
>> b/qemu/hw/device-assignment.c 
>> index 4a38a22..7240158 100644
>> --- a/qemu/hw/device-assignment.c
>> +++ b/qemu/hw/device-assignment.c
>> @@ -491,7 +491,7 @@ static uint32_t calc_assigned_dev_id(uint8_t
>>  bus, uint8_t devfn) /* The pci config space got updated. Check if
>> irq numbers have changed 
>>   * for our devices
>>   */
>> -void assigned_dev_update_irq(PCIDevice *d)
>> +void assigned_dev_update_irq()
> 
> Yes, but perhaps also rename to update_assigned_device_irqs() ?
> 
> (i.e. the current name implies that it's an operation on a single
> assigned device, rather than an operation on all assigned devices)
> 

Will rename it.

Regards,
Weidong

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   >