Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-23 Thread Alex Williamson
On Fri, 23 Sep 2016 14:52:46 +0800
Wei Xu  wrote:

> On 2016-09-21 22:50, Alex Williamson wrote:
> > On Wed, 21 Sep 2016 14:04:20 +0800
> > Wei Xu  wrote:
> >  
> >> On 2016-09-21 13:41, Wei Xu wrote:
> >>   > On 2016-09-21 12:31, Alex Williamson wrote:
> >>   >> On Wed, 21 Sep 2016 11:52:31 +0800
> >>   >> Wei Xu  wrote:
> >>   >>  
> >>   >>> On 2016-09-21 02:59, Nick Sarnie wrote:
> >>    Hi Wei,
> >>   
> >>    My system is a desktop, so it must just be a general Gigabyte BIOS  
> >> bug.  
> >>    I submitted a help ticket about this issue and just gave a brief
> >>    explanation and then sent Alex's explanation. Hopefully it will be
> >>    escalated correctly.  
> >>   >>>
> >>   >>> Thanks for your feedback, i'm also using a Gigabyte board, i have
> >>   >>> checked out the firmware update history and updated my firmware to 
> >> the
> >>   >>> latest one which was released at March, looks it's a long way to get 
> >> a
> >>   >>> feedback for this issue from them.
> >>   >>>
> >>   >>> Alex,
> >>   >>> It's a hard time for us to do nothing but wait, the reason why i use 
> >> my
> >>   >>> desktop is i got a com console on it, so it's quite convenient to
> >>   >>> debugging kernel via kgdb, and i want to keep my realtek nic for ssh
> >>   >>> access from my notebook, anyway to workaround it to just bypass the
> >>   >>> wireless nic only as a temporary experiment?
> >>   >>>
> >>   >>> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which
> >>   >>> need pass through a pcie unit from host, and one more virtio nic  
> >> for the  
> >>   >>> guest due to the feedbacks, maybe i can pass through a device in 
> >> other
> >>   >>> groups instead of a nic?  
> >>   >>
> >>   >> Sure, but skylake platforms are notoriously bad for their lack of
> >>   >> device isolation, even things like USB controllers and audio devices
> >>   >> are now part of multifunction packages that do not expose isolation
> >>   >> through ACS.  If you can't resolve the IOMMU grouping otherwise, your
> >>   >> choices are as I told Nick in the other thread:
> >>   >>
> >>   >>"Your choices are to run an unsupported (and unsupportable)
> >>   >>configuration using the ACS override patch, get your hardware 
> >> vendor
> >>   >>to fix their platform, or upgrade to better hardware with better
> >>   >>isolation characteristics."
> >>   >>
> >>   >> It's unfortunate that Intel provides VT-d on consumer platforms 
> >> without
> >>   >> sufficient device isolation to really make it usable, but that's often
> >>   >> the state of things.  The workstation and server class platforms,
> >>   >> supporting Xeon E5 or High End Desktop Processors provide the 
> >> necessary
> >>   >> isolation.  Thanks,  
> >>   >
> >>   > Yes, fortunately i get it solved finally, i tried adding the 'r8169'
> >>   > driver to the kernel group whitelist behind 'pci-stub' and recompile  &
> >>   > update the kernel firstly, and the VM boot up successfully, but a map
> >>   > page to iova error for realtek nic during DMA crashed the system later,
> >>   > looks it was caused by the group dependency, i remembered the vfio doc
> >>   > tells the group is the minimum isolation unit.  
> >
> > This approach is just a bad idea.
> >  
> >>   >
> >>   > Then i found there are 3 pci bridges on my board, 2 of them are with a
> >>   > group, another is a separate group, after plug the iwl wlan nic to this
> >>   > one, everything works well.  
> >>
> >> Just noticed a topology change of my system, looks the PCI bridges is
> >> different as before after i changed the slot for my wlan nic, i used to
> >> think i plugged it to 00:1d.0 but it was connected to Sky Lake PCIe
> >> controller, does this mean there are hidden PCI bridges for pci
> >> enumeration in the system, is this allowable?
> >>
> >> Before:
> >> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
> >> Port #5 (rev f1) (prog-if 00 [Normal decode])
> >> 00:1c.7 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
> >> Port #8 (rev f1) (prog-if 00 [Normal decode])  wlan nic
> >> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
> >> Port #9 (rev f1) (prog-if 00 [Normal decode])
> >>
> >> Now:
> >> 00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16)
> >> (rev 07) (prog-if 00 [Normal decode])  wlan nic
> >> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
> >> Port #5 (rev f1) (prog-if 00 [Normal decode])
> >> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
> >> Port #9 (rev f1) (prog-if 00 [Normal decode])  
> >
> > There are generally two sources of PCIe root ports on Intel systems,
> > the processor itself and the PCH (Platform Controller Hub).  Look at a
> > block diagram for a modern system and you'll see this.  Typically for a
> > client processor (i3/i5/i7) there is no isolation between or
> > downstrea

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-22 Thread Wei Xu

On 2016-09-21 22:50, Alex Williamson wrote:

On Wed, 21 Sep 2016 14:04:20 +0800
Wei Xu  wrote:


On 2016-09-21 13:41, Wei Xu wrote:
  > On 2016-09-21 12:31, Alex Williamson wrote:
  >> On Wed, 21 Sep 2016 11:52:31 +0800
  >> Wei Xu  wrote:
  >>
  >>> On 2016-09-21 02:59, Nick Sarnie wrote:
   Hi Wei,
  
   My system is a desktop, so it must just be a general Gigabyte BIOS
bug.
   I submitted a help ticket about this issue and just gave a brief
   explanation and then sent Alex's explanation. Hopefully it will be
   escalated correctly.
  >>>
  >>> Thanks for your feedback, i'm also using a Gigabyte board, i have
  >>> checked out the firmware update history and updated my firmware to the
  >>> latest one which was released at March, looks it's a long way to get a
  >>> feedback for this issue from them.
  >>>
  >>> Alex,
  >>> It's a hard time for us to do nothing but wait, the reason why i use my
  >>> desktop is i got a com console on it, so it's quite convenient to
  >>> debugging kernel via kgdb, and i want to keep my realtek nic for ssh
  >>> access from my notebook, anyway to workaround it to just bypass the
  >>> wireless nic only as a temporary experiment?
  >>>
  >>> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which
  >>> need pass through a pcie unit from host, and one more virtio nic
for the
  >>> guest due to the feedbacks, maybe i can pass through a device in other
  >>> groups instead of a nic?
  >>
  >> Sure, but skylake platforms are notoriously bad for their lack of
  >> device isolation, even things like USB controllers and audio devices
  >> are now part of multifunction packages that do not expose isolation
  >> through ACS.  If you can't resolve the IOMMU grouping otherwise, your
  >> choices are as I told Nick in the other thread:
  >>
  >>"Your choices are to run an unsupported (and unsupportable)
  >>configuration using the ACS override patch, get your hardware vendor
  >>to fix their platform, or upgrade to better hardware with better
  >>isolation characteristics."
  >>
  >> It's unfortunate that Intel provides VT-d on consumer platforms without
  >> sufficient device isolation to really make it usable, but that's often
  >> the state of things.  The workstation and server class platforms,
  >> supporting Xeon E5 or High End Desktop Processors provide the necessary
  >> isolation.  Thanks,
  >
  > Yes, fortunately i get it solved finally, i tried adding the 'r8169'
  > driver to the kernel group whitelist behind 'pci-stub' and recompile  &
  > update the kernel firstly, and the VM boot up successfully, but a map
  > page to iova error for realtek nic during DMA crashed the system later,
  > looks it was caused by the group dependency, i remembered the vfio doc
  > tells the group is the minimum isolation unit.


This approach is just a bad idea.


  >
  > Then i found there are 3 pci bridges on my board, 2 of them are with a
  > group, another is a separate group, after plug the iwl wlan nic to this
  > one, everything works well.

Just noticed a topology change of my system, looks the PCI bridges is
different as before after i changed the slot for my wlan nic, i used to
think i plugged it to 00:1d.0 but it was connected to Sky Lake PCIe
controller, does this mean there are hidden PCI bridges for pci
enumeration in the system, is this allowable?

Before:
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
Port #5 (rev f1) (prog-if 00 [Normal decode])
00:1c.7 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
Port #8 (rev f1) (prog-if 00 [Normal decode])  wlan nic
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
Port #9 (rev f1) (prog-if 00 [Normal decode])

Now:
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16)
(rev 07) (prog-if 00 [Normal decode])  wlan nic
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
Port #5 (rev f1) (prog-if 00 [Normal decode])
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root
Port #9 (rev f1) (prog-if 00 [Normal decode])


There are generally two sources of PCIe root ports on Intel systems,
the processor itself and the PCH (Platform Controller Hub).  Look at a
block diagram for a modern system and you'll see this.  Typically for a
client processor (i3/i5/i7) there is no isolation between or
downstream of the individual processor root ports and isolation between
the individual PCH root ports is via quirks, because Intel didn't
include ACS or broke ACS.  You've found these processor root ports.
Why don't they show up in lspci when nothing is plugged into them?  Why
should they?  Chances are almost certain that your system does not
support PCI hotplug, so there's no requirement to expose empty
bridges.  I'm glad you've found a working setup, desktop class systems
often have poor isolation characteristics which make device assignment
difficult.  Thanks,
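
To check which root port a device really sits behind after moving it to another slot, one option is to resolve the device's sysfs link and read the parent component of the resulting path. The following is a minimal Python sketch, assuming the usual /sys/bus/pci/devices/<address> symlink layout; the helper names `upstream_bridge` and `upstream_bridge_of`, and the example addresses in the docstring, are illustrative and not taken from the system discussed above.

```python
import os
import re
from typing import Optional

# Matches a full PCI address such as 0000:00:1c.7 (domain:bus:device.function).
BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def upstream_bridge(resolved_path: str) -> Optional[str]:
    """Return the address of the bridge directly above a device.

    `resolved_path` is the real path of a PCI device in sysfs, e.g.
    /sys/devices/pci0000:00/0000:00:1c.7/0000:02:00.0
    Returns None when the device sits directly on the root bus.
    """
    # Keep only the path components that look like PCI addresses;
    # the last one is the device itself, the one before it is its parent.
    parts = [p for p in resolved_path.split("/") if BDF_RE.match(p)]
    return parts[-2] if len(parts) >= 2 else None


def upstream_bridge_of(bdf: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Resolve the sysfs symlink for `bdf` and report its parent bridge."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf)
    return upstream_bridge(os.path.realpath(link))
```

Calling `upstream_bridge_of()` with the wlan nic's address before and after the slot change would show whether it hangs off a PCH root port (00:1c.x / 00:1d.x) or the processor root port (00:01.0).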


Than

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-21 Thread Alex Williamson
On Wed, 21 Sep 2016 14:04:20 +0800
Wei Xu  wrote:

> On 2016-09-21 13:41, Wei Xu wrote:
>  > On 2016-09-21 12:31, Alex Williamson wrote:
>  >> On Wed, 21 Sep 2016 11:52:31 +0800
>  >> Wei Xu  wrote:
>  >>  
>  >>> On 2016-09-21 02:59, Nick Sarnie wrote:
>   Hi Wei,
>  
>   My system is a desktop, so it must just be a general Gigabyte BIOS   
> bug.
>   I submitted a help ticket about this issue and just gave a brief
>   explanation and then sent Alex's explanation. Hopefully it will be
>   escalated correctly.  
>  >>>
>  >>> Thanks for your feedback, i'm also using a Gigabyte board, i have
>  >>> checked out the firmware update history and updated my firmware to the
>  >>> latest one which was released at March, looks it's a long way to get a
>  >>> feedback for this issue from them.
>  >>>
>  >>> Alex,
>  >>> It's a hard time for us to do nothing but wait, the reason why i use my
>  >>> desktop is i got a com console on it, so it's quite convenient to
>  >>> debugging kernel via kgdb, and i want to keep my realtek nic for ssh
>  >>> access from my notebook, anyway to workaround it to just bypass the
>  >>> wireless nic only as a temporary experiment?
>  >>>
>  >>> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which
>  >>> need pass through a pcie unit from host, and one more virtio nic   
> for the
>  >>> guest due to the feedbacks, maybe i can pass through a device in other
>  >>> groups instead of a nic?  
>  >>
>  >> Sure, but skylake platforms are notoriously bad for their lack of
>  >> device isolation, even things like USB controllers and audio devices
>  >> are now part of multifunction packages that do not expose isolation
>  >> through ACS.  If you can't resolve the IOMMU grouping otherwise, your
>  >> choices are as I told Nick in the other thread:
>  >>
>  >>"Your choices are to run an unsupported (and unsupportable)
>  >>configuration using the ACS override patch, get your hardware vendor
>  >>to fix their platform, or upgrade to better hardware with better
>  >>isolation characteristics."
>  >>
>  >> It's unfortunate that Intel provides VT-d on consumer platforms without
>  >> sufficient device isolation to really make it usable, but that's often
>  >> the state of things.  The workstation and server class platforms,
>  >> supporting Xeon E5 or High End Desktop Processors provide the necessary
>  >> isolation.  Thanks,  
>  >
>  > Yes, fortunately i get it solved finally, i tried adding the 'r8169'
>  > driver to the kernel group whitelist behind 'pci-stub' and recompile  &
>  > update the kernel firstly, and the VM boot up successfully, but a map
>  > page to iova error for realtek nic during DMA crashed the system later,
>  > looks it was caused by the group dependency, i remembered the vfio doc
>  > tells the group is the minimum isolation unit.

This approach is just a bad idea.

>  >
>  > Then i found there are 3 pci bridges on my board, 2 of them are with a
>  > group, another is a separate group, after plug the iwl wlan nic to this
>  > one, everything works well.  
> 
> Just noticed a topology change of my system, looks the PCI bridges is
> different as before after i changed the slot for my wlan nic, i used to
> think i plugged it to 00:1d.0 but it was connected to Sky Lake PCIe 
> controller, does this mean there are hidden PCI bridges for pci 
> enumeration in the system, is this allowable?
> 
> Before:
> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root 
> Port #5 (rev f1) (prog-if 00 [Normal decode])
> 00:1c.7 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root 
> Port #8 (rev f1) (prog-if 00 [Normal decode])  wlan nic
> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root 
> Port #9 (rev f1) (prog-if 00 [Normal decode])
> 
> Now:
> 00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) 
> (rev 07) (prog-if 00 [Normal decode])  wlan nic
> 00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root 
> Port #5 (rev f1) (prog-if 00 [Normal decode])
> 00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root 
> Port #9 (rev f1) (prog-if 00 [Normal decode])

There are generally two sources of PCIe root ports on Intel systems,
the processor itself and the PCH (Platform Controller Hub).  Look at a
block diagram for a modern system and you'll see this.  Typically for a
client processor (i3/i5/i7) there is no isolation between or
downstream of the individual processor root ports and isolation between
the individual PCH root ports is via quirks, because Intel didn't
include ACS or broke ACS.  You've found these processor root ports.
Why don't they show up in lspci when nothing is plugged into them?  Why
should they?  Chances are almost certain that your system does not
support PCI hotplug, so there's no requirement to expose empty
bridges.  I'm glad you've found a working setup, desktop class systems
often have poo

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu

On 2016-09-21 13:41, Wei Xu wrote:
> On 2016-09-21 12:31, Alex Williamson wrote:
>> On Wed, 21 Sep 2016 11:52:31 +0800
>> Wei Xu  wrote:
>>
>>> On 2016-09-21 02:59, Nick Sarnie wrote:
 Hi Wei,

 My system is a desktop, so it must just be a general Gigabyte BIOS 
bug.

 I submitted a help ticket about this issue and just gave a brief
 explanation and then sent Alex's explanation. Hopefully it will be
 escalated correctly.
>>>
>>> Thanks for your feedback, i'm also using a Gigabyte board, i have
>>> checked out the firmware update history and updated my firmware to the
>>> latest one which was released at March, looks it's a long way to get a
>>> feedback for this issue from them.
>>>
>>> Alex,
>>> It's a hard time for us to do nothing but wait, the reason why i use my
>>> desktop is i got a com console on it, so it's quite convenient to
>>> debugging kernel via kgdb, and i want to keep my realtek nic for ssh
>>> access from my notebook, anyway to workaround it to just bypass the
>>> wireless nic only as a temporary experiment?
>>>
>>> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which
>>> need pass through a pcie unit from host, and one more virtio nic 
for the

>>> guest due to the feedbacks, maybe i can pass through a device in other
>>> groups instead of a nic?
>>
>> Sure, but skylake platforms are notoriously bad for their lack of
>> device isolation, even things like USB controllers and audio devices
>> are now part of multifunction packages that do not expose isolation
>> through ACS.  If you can't resolve the IOMMU grouping otherwise, your
>> choices are as I told Nick in the other thread:
>>
>>"Your choices are to run an unsupported (and unsupportable)
>>configuration using the ACS override patch, get your hardware vendor
>>to fix their platform, or upgrade to better hardware with better
>>isolation characteristics."
>>
>> It's unfortunate that Intel provides VT-d on consumer platforms without
>> sufficient device isolation to really make it usable, but that's often
>> the state of things.  The workstation and server class platforms,
>> supporting Xeon E5 or High End Desktop Processors provide the necessary
>> isolation.  Thanks,
>
> Yes, fortunately i get it solved finally, i tried adding the 'r8169'
> driver to the kernel group whitelist behind 'pci-stub' and recompile  &
> update the kernel firstly, and the VM boot up successfully, but a map
> page to iova error for realtek nic during DMA crashed the system later,
> looks it was caused by the group dependency, i remembered the vfio doc
> tells the group is the minimum isolation unit.
>
> Then i found there are 3 pci bridges on my board, 2 of them are with a
> group, another is a separate group, after plug the iwl wlan nic to this
> one, everything works well.

I just noticed a topology change on my system: the set of PCI bridges is
different after I moved my wlan nic to another slot. I thought I had
plugged it into 00:1d.0, but it is actually connected to the Sky Lake PCIe
controller. Does this mean there are PCI bridges hidden from PCI
enumeration in the system, and is that allowed?


Before:
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1) (prog-if 00 [Normal decode])
00:1c.7 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #8 (rev f1) (prog-if 00 [Normal decode])  <- wlan nic
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1) (prog-if 00 [Normal decode])

Now:
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07) (prog-if 00 [Normal decode])  <- wlan nic
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1) (prog-if 00 [Normal decode])
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1) (prog-if 00 [Normal decode])



>
> Wei
>
>
>>
>> Alex
>>
>
> ___
> vfio-users mailing list
> vfio-users@redhat.com
> https://www.redhat.com/mailman/listinfo/vfio-users



Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu

Hi Nick,
I'm using a Gigabyte B150M-ds3h board. I'm not sure about your board and
requirements, but if you want to pass through a usb device to a VM and are
keen enough to pay for some extra hardware but not a new board, maybe you
could try an extra PCIe-to-USB adapter and plug it into the other iommu
group :).

Wei

On 2016-09-21 02:59, Nick Sarnie wrote:

Hi Wei,

My system is a desktop, so it must just be a general Gigabyte BIOS bug.
I submitted a help ticket about this issue and just gave a brief
explanation and then sent Alex's explanation. Hopefully it will be
escalated correctly.

Thanks,
Nick

On Tue, Sep 20, 2016 at 1:08 PM, Wei Xu <w...@redhat.com> wrote:

On 2016-09-20 22:20, Alex Williamson wrote:

On Tue, 20 Sep 2016 08:14:45 -0600
Alex Williamson <alex.william...@redhat.com> wrote:

On Tue, 20 Sep 2016 21:56:33 +0800
Wei Xu <w...@redhat.com> wrote:

On 2016-09-20 09:59, Alex Williamson wrote:

On Tue, 20 Sep 2016 09:28:57 +0800
Wei Xu <w...@redhat.com> wrote:

Hi Guys,
I'm trying to pass through a rtl8168 nic to
linux guest on my laptop
recently, the link is directly connected to my
notebook with a cable.

qemu: 2.7.0-rc4
host/guest kernel: 4.7.0-rc5
driver name: r8169

After binding the driver to vfio-pci and
launching the VM for a few
seconds, the connection led on the nic was
turned off, and the guest
only see a link down nic with below messages.

[6.173188] r8169 :00:04.0 ens4:
rtl_phy_reset_cond == 1 (loop:
100, delay: 1).
[6.177234] r8169 :00:04.0 ens4: link down
[6.177592] r8169 :00:04.0 ens4: link down
[6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4:
link is not ready


It's quite similar as below bug except it's for
windows driver while
i'm testing linux.

https://bugs.launchpad.net/qemu/+bug/1384892



More info:
My vm image is a pre-installed fedora 22
desktop, i also tried a fresh
fedora live iso, looks it can load the driver
correctly.

I tried to disable auto negotiation for the link
but it didn't work for me.

I did the same test with my notebook with a
Intel I218-LM ethernet
controller, it works pretty well every time.

I googled around and looks it happened to bare
metal too, so just wonder
if this is a bug of network-manager, or is being
caused by the nic
driver or an issue in qemu/kernel vfio, anybody
can help?


Realtek nics don't work well with device assignment,
they barely work
well on bare metal.  Stick with the Intel nic or
just use the rtl nic
with virtio.  I've spent longer than I care to admit
on the rtl quirks
we have in QEMU and I expect they still only work
with a few devices.


OK, I'll ignore Realtek, so I added one Intel iwl6235
wireless nic to my
laptop, the pci tree shows both the rtl and iwl are
behind a separate
pci bridge, after bind iwl to vfio-pci driver, i failed
to pass through
it again with error message from qemu.

qemu-system-x86_64: -device vfio-pci,host=:02:00.0:
vfio: error,
group 5 is not viable, please ensure all devices within
the iommu_group
are bound to their vfio bus driver.
qemu-system-x86_64: -device vfio-pci,host=:02:00.0:
vfio: failed to
get group 5
qemu-system-x86_64: -device vfio-pci,host=:02:00.0:
Device
initialization failed

Seems it's caused by the rtl nic is under the same iommu
group with iwl
as well, and when the kernel vfio driver checking the
viablity, it'll
  

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu

On 2016-09-21 12:31, Alex Williamson wrote:
> On Wed, 21 Sep 2016 11:52:31 +0800
> Wei Xu wrote:
>
>> On 2016-09-21 02:59, Nick Sarnie wrote:
>>> Hi Wei,
>>>
>>> My system is a desktop, so it must just be a general Gigabyte BIOS bug.
>>> I submitted a help ticket about this issue and just gave a brief
>>> explanation and then sent Alex's explanation. Hopefully it will be
>>> escalated correctly.
>>
>> Thanks for your feedback, i'm also using a Gigabyte board, i have
>> checked out the firmware update history and updated my firmware to the
>> latest one which was released at March, looks it's a long way to get a
>> feedback for this issue from them.
>>
>> Alex,
>> It's a hard time for us to do nothing but wait, the reason why i use my
>> desktop is i got a com console on it, so it's quite convenient to
>> debugging kernel via kgdb, and i want to keep my realtek nic for ssh
>> access from my notebook, anyway to workaround it to just bypass the
>> wireless nic only as a temporary experiment?
>>
>> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which
>> need pass through a pcie unit from host, and one more virtio nic for the
>> guest due to the feedbacks, maybe i can pass through a device in other
>> groups instead of a nic?
>
> Sure, but skylake platforms are notoriously bad for their lack of
> device isolation, even things like USB controllers and audio devices
> are now part of multifunction packages that do not expose isolation
> through ACS.  If you can't resolve the IOMMU grouping otherwise, your
> choices are as I told Nick in the other thread:
>
>   "Your choices are to run an unsupported (and unsupportable)
>   configuration using the ACS override patch, get your hardware vendor
>   to fix their platform, or upgrade to better hardware with better
>   isolation characteristics."
>
> It's unfortunate that Intel provides VT-d on consumer platforms without
> sufficient device isolation to really make it usable, but that's often
> the state of things.  The workstation and server class platforms,
> supporting Xeon E5 or High End Desktop Processors provide the necessary
> isolation.  Thanks,


Yes, fortunately I finally got it solved. I first tried adding the 'r8169'
driver to the kernel's group whitelist alongside 'pci-stub', then
recompiled and updated the kernel. The VM booted successfully, but later a
map-page-to-iova error for the realtek nic during DMA crashed the system.
It looks like this was caused by the group dependency; I remember the vfio
doc says the group is the minimum isolation unit.
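
Since the group is the unit that has to be handed over as a whole, it can help to look at what every device in a group is currently bound to before launching QEMU. The following Python sketch reads /sys/kernel/iommu_groups/<group>/devices and each device's `driver` link. It only approximates the kernel's viability check: treating 'vfio-pci', 'pci-stub' and unbound devices as acceptable is an assumption made for this sketch, and the real check has more cases than are modelled here. The function names are illustrative.

```python
import os
from typing import Dict, Optional

# Drivers treated as acceptable for group members in this sketch.
# Assumption: a loose approximation of the kernel's check, not a copy of it.
SAFE_DRIVERS = {"vfio-pci", "pci-stub"}


def group_drivers(group: str, sysfs_root: str = "/sys") -> Dict[str, Optional[str]]:
    """Map each device in an IOMMU group to its bound driver (None if unbound)."""
    devdir = os.path.join(sysfs_root, "kernel", "iommu_groups", group, "devices")
    result: Dict[str, Optional[str]] = {}
    for bdf in sorted(os.listdir(devdir)):
        driver_link = os.path.join(devdir, bdf, "driver")
        if os.path.islink(driver_link):
            # The link target ends in the driver name, e.g. .../drivers/r8169
            result[bdf] = os.path.basename(os.readlink(driver_link))
        else:
            result[bdf] = None
    return result


def blocking_devices(drivers: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return the devices whose host driver would keep the group from being used."""
    return {
        bdf: drv
        for bdf, drv in drivers.items()
        if drv is not None and drv not in SAFE_DRIVERS
    }
```

For example, `blocking_devices(group_drivers("5"))` would list any device in group 5 that is still bound to a regular host driver such as r8169.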


Then I found there are 3 pci bridges on my board: 2 of them share a
group and the other is in a separate group. After plugging the iwl wlan
nic into that one, everything works well.


Wei




Alex





Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Alex Williamson
On Wed, 21 Sep 2016 11:52:31 +0800
Wei Xu  wrote:

> On 2016-09-21 02:59, Nick Sarnie wrote:
> > Hi Wei,
> >
> > My system is a desktop, so it must just be a general Gigabyte BIOS bug.
> > I submitted a help ticket about this issue and just gave a brief
> > explanation and then sent Alex's explanation. Hopefully it will be
> > escalated correctly.  
> 
> Thanks for your feedback, i'm also using a Gigabyte board, i have
> checked out the firmware update history and updated my firmware to the
> latest one which was released at March, looks it's a long way to get a
> feedback for this issue from them.
> 
> Alex,
> It's a hard time for us to do nothing but wait, the reason why i use my 
> desktop is i got a com console on it, so it's quite convenient to 
> debugging kernel via kgdb, and i want to keep my realtek nic for ssh 
> access from my notebook, anyway to workaround it to just bypass the 
> wireless nic only as a temporary experiment?
> 
> I'm trying VirtIO DMAR patch with vIOMMU in the guest recently, which 
> need pass through a pcie unit from host, and one more virtio nic for the 
> guest due to the feedbacks, maybe i can pass through a device in other 
> groups instead of a nic?

Sure, but skylake platforms are notoriously bad for their lack of
device isolation, even things like USB controllers and audio devices
are now part of multifunction packages that do not expose isolation
through ACS.  If you can't resolve the IOMMU grouping otherwise, your
choices are as I told Nick in the other thread:

  "Your choices are to run an unsupported (and unsupportable)
  configuration using the ACS override patch, get your hardware vendor
  to fix their platform, or upgrade to better hardware with better
  isolation characteristics."

It's unfortunate that Intel provides VT-d on consumer platforms without
sufficient device isolation to really make it usable, but that's often
the state of things.  The workstation and server class platforms,
supporting Xeon E5 or High End Desktop Processors provide the necessary
isolation.  Thanks,

Alex
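
To see how a given board actually groups its devices, the IOMMU groups can be read straight from sysfs. The following is a minimal Python sketch, assuming the standard /sys/kernel/iommu_groups/<n>/devices layout; `iommu_groups` and `shared_groups` are illustrative helper names, and the `sysfs_root` parameter exists only so the functions can be pointed at a test directory.

```python
import os
from typing import Dict, List


def iommu_groups(sysfs_root: str = "/sys") -> Dict[int, List[str]]:
    """Return {group number: sorted list of PCI addresses in that group}."""
    base = os.path.join(sysfs_root, "kernel", "iommu_groups")
    groups: Dict[int, List[str]] = {}
    if not os.path.isdir(base):
        # No groups directory: the IOMMU is disabled or not present.
        return groups
    for name in os.listdir(base):
        if not name.isdigit():
            continue
        devdir = os.path.join(base, name, "devices")
        groups[int(name)] = sorted(os.listdir(devdir))
    return groups


def shared_groups(groups: Dict[int, List[str]]) -> Dict[int, List[str]]:
    """Keep only the groups that contain more than one device."""
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devices in sorted(iommu_groups().items()):
        print(f"group {group}: {' '.join(devices)}")
```

Any device that shows up in `shared_groups()` cannot be assigned on its own; everything else in its group has to go with it or be left without a host driver.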



Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu

On 2016-09-21 02:59, Nick Sarnie wrote:

> Hi Wei,
>
> My system is a desktop, so it must just be a general Gigabyte BIOS bug.
> I submitted a help ticket about this issue and just gave a brief
> explanation and then sent Alex's explanation. Hopefully it will be
> escalated correctly.


Thanks for your feedback. I'm also using a Gigabyte board; I checked the
firmware update history and updated my firmware to the latest one, which
was released in March. It looks like it will take a long time to get
feedback on this issue from them.

Alex,
It's hard for us to do nothing but wait. The reason I use my desktop is
that it has a COM console, so it's quite convenient for debugging the
kernel via kgdb, and I want to keep my realtek nic for ssh access from my
notebook. Is there any way to work around this and pass through only the
wireless nic, as a temporary experiment?

I've recently been trying the VirtIO DMAR patch with a vIOMMU in the
guest, which needs a PCIe device passed through from the host plus one
more virtio nic for the guest, according to the feedback. Maybe I can pass
through a device in another group instead of a nic?


Wei




Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Nick Sarnie
Hi Wei,

My system is a desktop, so it must just be a general Gigabyte BIOS bug. I
submitted a help ticket about this issue and just gave a brief explanation
and then sent Alex's explanation. Hopefully it will be escalated correctly.

Thanks,
Nick

On Tue, Sep 20, 2016 at 1:08 PM, Wei Xu  wrote:

>
>
> On 2016-09-20 22:20, Alex Williamson wrote:
>
>> On Tue, 20 Sep 2016 08:14:45 -0600
>> Alex Williamson  wrote:
>>
>> On Tue, 20 Sep 2016 21:56:33 +0800
>>> Wei Xu  wrote:
>>>
>>> On 2016-09-20 09:59, Alex Williamson wrote:

> On Tue, 20 Sep 2016 09:28:57 +0800
> Wei Xu  wrote:
>
> Hi Guys,
>> I'm trying to pass through a rtl8168 nic to linux guest on my laptop
>> recently, the link is directly connected to my notebook with a cable.
>>
>> qemu: 2.7.0-rc4
>> host/guest kernel: 4.7.0-rc5
>> driver name: r8169
>>
>> After binding the driver to vfio-pci and launching the VM for a few
>> seconds, the connection led on the nic was turned off, and the guest
>> only see a link down nic with below messages.
>>
>> [6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop:
>> 100, delay: 1).
>> [6.177234] r8169 :00:04.0 ens4: link down
>> [6.177592] r8169 :00:04.0 ens4: link down
>> [6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready
>>
>>
>> It's quite similar as below bug except it's for windows driver while
>> i'm testing linux.
>>
>> https://bugs.launchpad.net/qemu/+bug/1384892
>>
>>
>> More info:
>> My vm image is a pre-installed fedora 22 desktop, i also tried a fresh
>> fedora live iso, looks it can load the driver correctly.
>>
>> I tried to disable auto negotiation for the link but it didn't work
>> for me.
>>
>> I did the same test with my notebook with a Intel I218-LM ethernet
>> controller, it works pretty well every time.
>>
>> I googled around and looks it happened to bare metal too, so just
>> wonder
>> if this is a bug of network-manager, or is being caused by the nic
>> driver or an issue in qemu/kernel vfio, anybody can help?
>>
>
> Realtek nics don't work well with device assignment, they barely work
> well on bare metal.  Stick with the Intel nic or just use the rtl nic
> with virtio.  I've spent longer than I care to admit on the rtl quirks
> we have in QEMU and I expect they still only work with a few devices.
>

 OK, I'll ignore Realtek, so I added one Intel iwl6235 wireless nic to my
 laptop, the pci tree shows both the rtl and iwl are behind a separate
 pci bridge, after bind iwl to vfio-pci driver, i failed to pass through
 it again with error message from qemu.

 qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: error,
 group 5 is not viable, please ensure all devices within the iommu_group
 are bound to their vfio bus driver.
 qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: failed to
 get group 5
 qemu-system-x86_64: -device vfio-pci,host=:02:00.0: Device
 initialization failed

 Seems it's caused by the rtl nic is under the same iommu group with iwl
 as well, and when the kernel vfio driver checking the viablity, it'll
 make sure all the devices under the same group are viable, it works fine
 after i bound the rtl to vfio-pci too, i'm wonder if this a discipline?
 can't i just bind the iwl nic and pass through the the guest?

 pci tree:
 -[:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
 +-1c.0-[01]00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
 PCI Express Gigabit Ethernet Controller
 +-1c.7-[02]00.0 Intel Corporation Centrino Advanced-N 6235

>>>
>>> If your PCH root ports report an ACS capability then you can run kernel
>>> v4.7 kernel on the host to expose the isolation.  If the root ports
>>> (00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
>>> similar to Nick's system in this thread
>>> https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html
>>>
>>
>> And I see you're running a v4.7 kernel already (though I'm not sure why
>> you're running an rc release for kernel or QEMU since both of those
>> have been released).  So you need to check them with lspci to see if an
>> ACS capability is exposed on the root ports, check whether your root
>> ports are covered by the device IDs in this quirk:
>>
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.
>> git/commit/?id=1bf2bf229b64540f91ac6fa3af37c81249037a0b
>>
>> If there's no ACS capability but the root ports fall within the quirk,
>> it's a BIOS bug on the system.  Sorry.
>>
>
> Unfortunately, the device id is within your list in the commit qurik
> but it failed still, my ACS dump of pci is as the same as Nick's, just
> wondering why the bios doesn't report it, looks it's a default optio

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu



On 2016-09-20 22:20, Alex Williamson wrote:

On Tue, 20 Sep 2016 08:14:45 -0600
Alex Williamson  wrote:


On Tue, 20 Sep 2016 21:56:33 +0800
Wei Xu  wrote:


On 2016-09-20 09:59, Alex Williamson wrote:

On Tue, 20 Sep 2016 09:28:57 +0800
Wei Xu  wrote:


Hi Guys,
I'm trying to pass through a rtl8168 nic to linux guest on my laptop
recently, the link is directly connected to my notebook with a cable.

qemu: 2.7.0-rc4
host/guest kernel: 4.7.0-rc5
driver name: r8169

After binding the driver to vfio-pci and launching the VM for a few
seconds, the connection led on the nic was turned off, and the guest
only see a link down nic with below messages.

[6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop:
100, delay: 1).
[6.177234] r8169 :00:04.0 ens4: link down
[6.177592] r8169 :00:04.0 ens4: link down
[6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready


It's quite similar as below bug except it's for windows driver while
i'm testing linux.

https://bugs.launchpad.net/qemu/+bug/1384892


More info:
My vm image is a pre-installed fedora 22 desktop, i also tried a fresh
fedora live iso, looks it can load the driver correctly.

I tried to disable auto negotiation for the link but it didn't work for me.

I did the same test with my notebook with a Intel I218-LM ethernet
controller, it works pretty well every time.

I googled around and looks it happened to bare metal too, so just wonder
if this is a bug of network-manager, or is being caused by the nic
driver or an issue in qemu/kernel vfio, anybody can help?


Realtek nics don't work well with device assignment, they barely work
well on bare metal.  Stick with the Intel nic or just use the rtl nic
with virtio.  I've spent longer than I care to admit on the rtl quirks
we have in QEMU and I expect they still only work with a few devices.


OK, I'll ignore Realtek, so I added one Intel iwl6235 wireless nic to my
laptop, the pci tree shows both the rtl and iwl are behind a separate
pci bridge, after bind iwl to vfio-pci driver, i failed to pass through
it again with error message from qemu.

qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: error,
group 5 is not viable, please ensure all devices within the iommu_group
are bound to their vfio bus driver.
qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: failed to
get group 5
qemu-system-x86_64: -device vfio-pci,host=:02:00.0: Device
initialization failed

Seems it's caused by the rtl nic is under the same iommu group with iwl
as well, and when the kernel vfio driver checking the viablity, it'll
make sure all the devices under the same group are viable, it works fine
after i bound the rtl to vfio-pci too, i'm wonder if this a discipline?
can't i just bind the iwl nic and pass through the the guest?

pci tree:
-[:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
+-1c.0-[01]00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
PCI Express Gigabit Ethernet Controller
+-1c.7-[02]00.0 Intel Corporation Centrino Advanced-N 6235


If your PCH root ports report an ACS capability then you can run kernel
v4.7 kernel on the host to expose the isolation.  If the root ports
(00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
similar to Nick's system in this thread
https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html


And I see you're running a v4.7 kernel already (though I'm not sure why
you're running an rc release for kernel or QEMU since both of those
have been released).  So you need to check them with lspci to see if an
ACS capability is exposed on the root ports, check whether your root
ports are covered by the device IDs in this quirk:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1bf2bf229b64540f91ac6fa3af37c81249037a0b

If there's no ACS capability but the root ports fall within the quirk,
it's a BIOS bug on the system.  Sorry.


Unfortunately, the device ID is within the list in the quirk commit,
but it still fails. My ACS dump from lspci is the same as Nick's. I just
wonder why the BIOS doesn't report it; it looks like this is the default
for most laptops. Do you know the possible reason behind that? Is it to
connect all the components by default, even with VT-d enabled?

00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)

00: 86 80 14 a1 07 00 10 00 f1 00 04 06 00 00 81 00
10: 00 00 00 00 00 00 00 00 00 01 01 00 e0 e0 00 20
20: 10 df 10 df f1 ff 01 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0a 01 10 00
40: 10 80 42 01 01 80 00 00 00 00 10 00 13 40 72 05
50: 40 00 11 70 00 b2 44 00 00 00 40 01 00 00 00 00
60: 00 00 00 00 37 08 00 00 00 04 00 00 0e 00 00 00
70: 03 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 05 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 0d a0 00 00 58 14 01 50 00 00 00 00 00 00 00 00
a0: 01 00 03 c8 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 

Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Karsten Mettke
When I tried to pass such a NIC to the guest, it would hard-lock the device;
even the link indicator LEDs stayed off, which normally work even when the
driver is not loaded. The only way to get it working again was, and still is,
to power off the machine completely. After spending hours trying all kinds of
recommendations, I gave up and bought an Intel NIC.
So my guess is that this chip is "broken" at a very low level and there is not
much I could do to get it working.
So better not to waste your time; that would be my advice.

Karsten Mettke

Sent from my BlackBerry 10 smartphone.
  Original Message  
From: Sebastian Hahn
Sent: Tuesday, 20 September 2016 16:31
To: Alex Williamson
Cc: vfio-users@redhat.com
Subject: Re: [vfio-users] Lost link when pass through rtl8168 to guest

Hi Alex,

> On 20 Sep 2016, at 16:14, Alex Williamson  wrote:
> If your PCH root ports report an ACS capability then you can run kernel
> v4.7 kernel on the host to expose the isolation. If the root ports
> (00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
> similar to Nick's system in this thread
> https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html
> Thanks,

first off, thank you so much for your work on all this, it's been a
great pleasure to discover everything you've done to support device
passthrough.

I have a Gigabyte GA-Z170-HD3P that exhibits the same issue that Nick's
system has (I checked with lspci -s 1c). I want to follow your
recommendation to report this to the vendor, but I do not know how to
describe the issue in a way that they can understand what's going on and
hopefully work on a fix. I guess making more high-quality bugreports
about this will have an actual chance of getting it fixed, rather than
just a single person complaining.

Thank you
Sebastian

___
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users


Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Sebastian Hahn
Hi Alex,

> On 20 Sep 2016, at 16:14, Alex Williamson  wrote:
> If your PCH root ports report an ACS capability then you can run kernel
> v4.7 kernel on the host to expose the isolation.  If the root ports
> (00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
> similar to Nick's system in this thread
> https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html
> Thanks,

first off, thank you so much for your work on all this, it's been a
great pleasure to discover everything you've done to support device
passthrough.

I have a Gigabyte GA-Z170-HD3P that exhibits the same issue that Nick's
system has (I checked with lspci -s 1c). I want to follow your
recommendation to report this to the vendor, but I do not know how to
describe the issue in a way that they can understand what's going on and
hopefully work on a fix. I guess making more high-quality bug reports
about this will have an actual chance of getting it fixed, rather than
just a single person complaining.

Thank you
Sebastian



Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Alex Williamson
On Tue, 20 Sep 2016 21:56:33 +0800
Wei Xu  wrote:

> On 2016-09-20 09:59, Alex Williamson wrote:
> > On Tue, 20 Sep 2016 09:28:57 +0800
> > Wei Xu  wrote:
> >  
> >> Hi Guys,
> >> I'm trying to pass through a rtl8168 nic to linux guest on my laptop
> >> recently, the link is directly connected to my notebook with a cable.
> >>
> >> qemu: 2.7.0-rc4
> >> host/guest kernel: 4.7.0-rc5
> >> driver name: r8169
> >>
> >> After binding the driver to vfio-pci and launching the VM for a few
> >> seconds, the connection led on the nic was turned off, and the guest
> >> only see a link down nic with below messages.
> >>
> >> [6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop:
> >> 100, delay: 1).
> >> [6.177234] r8169 :00:04.0 ens4: link down
> >> [6.177592] r8169 :00:04.0 ens4: link down
> >> [6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready
> >>
> >>
> >> It's quite similar as below bug except it's for windows driver while
> >> i'm testing linux.
> >>
> >> https://bugs.launchpad.net/qemu/+bug/1384892
> >>
> >>
> >> More info:
> >> My vm image is a pre-installed fedora 22 desktop, i also tried a fresh
> >> fedora live iso, looks it can load the driver correctly.
> >>
> >> I tried to disable auto negotiation for the link but it didn't work for me.
> >>
> >> I did the same test with my notebook with a Intel I218-LM ethernet
> >> controller, it works pretty well every time.
> >>
> >> I googled around and looks it happened to bare metal too, so just wonder
> >> if this is a bug of network-manager, or is being caused by the nic
> >> driver or an issue in qemu/kernel vfio, anybody can help?  
> >
> > Realtek nics don't work well with device assignment, they barely work
> > well on bare metal.  Stick with the Intel nic or just use the rtl nic
> > with virtio.  I've spent longer than I care to admit on the rtl quirks
> > we have in QEMU and I expect they still only work with a few devices.  
> 
> OK, I'll ignore Realtek, so I added one Intel iwl6235 wireless nic to my 
> laptop, the pci tree shows both the rtl and iwl are behind a separate 
> pci bridge, after bind iwl to vfio-pci driver, i failed to pass through 
> it again with error message from qemu.
> 
> qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: error, 
> group 5 is not viable, please ensure all devices within the iommu_group 
> are bound to their vfio bus driver.
> qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: failed to 
> get group 5
> qemu-system-x86_64: -device vfio-pci,host=:02:00.0: Device 
> initialization failed
> 
> Seems it's caused by the rtl nic is under the same iommu group with iwl 
> as well, and when the kernel vfio driver checking the viablity, it'll 
> make sure all the devices under the same group are viable, it works fine 
> after i bound the rtl to vfio-pci too, i'm wonder if this a discipline? 
> can't i just bind the iwl nic and pass through the the guest?
> 
> pci tree:
> -[:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
> +-1c.0-[01]00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 
> PCI Express Gigabit Ethernet Controller
> +-1c.7-[02]00.0 Intel Corporation Centrino Advanced-N 6235

If your PCH root ports report an ACS capability then you can run a
v4.7 kernel on the host to expose the isolation.  If the root ports
(00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
similar to Nick's system in this thread
https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html
Thanks,

Alex



Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Alex Williamson
On Tue, 20 Sep 2016 08:14:45 -0600
Alex Williamson  wrote:

> On Tue, 20 Sep 2016 21:56:33 +0800
> Wei Xu  wrote:
> 
> > On 2016-09-20 09:59, Alex Williamson wrote:
> > > On Tue, 20 Sep 2016 09:28:57 +0800
> > > Wei Xu  wrote:
> > >
> > >> Hi Guys,
> > >> I'm trying to pass through a rtl8168 nic to linux guest on my laptop
> > >> recently, the link is directly connected to my notebook with a cable.
> > >>
> > >> qemu: 2.7.0-rc4
> > >> host/guest kernel: 4.7.0-rc5
> > >> driver name: r8169
> > >>
> > >> After binding the driver to vfio-pci and launching the VM for a few
> > >> seconds, the connection led on the nic was turned off, and the guest
> > >> only see a link down nic with below messages.
> > >>
> > >> [6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop:
> > >> 100, delay: 1).
> > >> [6.177234] r8169 :00:04.0 ens4: link down
> > >> [6.177592] r8169 :00:04.0 ens4: link down
> > >> [6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready
> > >>
> > >>
> > >> It's quite similar as below bug except it's for windows driver while
> > >> i'm testing linux.
> > >>
> > >> https://bugs.launchpad.net/qemu/+bug/1384892
> > >>
> > >>
> > >> More info:
> > >> My vm image is a pre-installed fedora 22 desktop, i also tried a fresh
> > >> fedora live iso, looks it can load the driver correctly.
> > >>
> > >> I tried to disable auto negotiation for the link but it didn't work for 
> > >> me.
> > >>
> > >> I did the same test with my notebook with a Intel I218-LM ethernet
> > >> controller, it works pretty well every time.
> > >>
> > >> I googled around and looks it happened to bare metal too, so just wonder
> > >> if this is a bug of network-manager, or is being caused by the nic
> > >> driver or an issue in qemu/kernel vfio, anybody can help?
> > >
> > > Realtek nics don't work well with device assignment, they barely work
> > > well on bare metal.  Stick with the Intel nic or just use the rtl nic
> > > with virtio.  I've spent longer than I care to admit on the rtl quirks
> > > we have in QEMU and I expect they still only work with a few devices.
> > 
> > OK, I'll ignore Realtek, so I added one Intel iwl6235 wireless nic to my 
> > laptop, the pci tree shows both the rtl and iwl are behind a separate 
> > pci bridge, after bind iwl to vfio-pci driver, i failed to pass through 
> > it again with error message from qemu.
> > 
> > qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: error, 
> > group 5 is not viable, please ensure all devices within the iommu_group 
> > are bound to their vfio bus driver.
> > qemu-system-x86_64: -device vfio-pci,host=:02:00.0: vfio: failed to 
> > get group 5
> > qemu-system-x86_64: -device vfio-pci,host=:02:00.0: Device 
> > initialization failed
> > 
> > Seems it's caused by the rtl nic is under the same iommu group with iwl 
> > as well, and when the kernel vfio driver checking the viablity, it'll 
> > make sure all the devices under the same group are viable, it works fine 
> > after i bound the rtl to vfio-pci too, i'm wonder if this a discipline? 
> > can't i just bind the iwl nic and pass through the the guest?
> > 
> > pci tree:
> > -[:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
> > +-1c.0-[01]00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 
> > PCI Express Gigabit Ethernet Controller
> > +-1c.7-[02]00.0 Intel Corporation Centrino Advanced-N 6235  
> 
> If your PCH root ports report an ACS capability then you can run kernel
> v4.7 kernel on the host to expose the isolation.  If the root ports
> (00:1c.*) do not expose an ACS capability, it's probably a BIOS bug
> similar to Nick's system in this thread
> https://www.redhat.com/archives/vfio-users/2016-September/msg00059.html

And I see you're running a v4.7 kernel already (though I'm not sure why
you're running an rc release for kernel or QEMU since both of those
have been released).  So you need to check them with lspci to see if an
ACS capability is exposed on the root ports, check whether your root
ports are covered by the device IDs in this quirk:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1bf2bf229b64540f91ac6fa3af37c81249037a0b

If there's no ACS capability but the root ports fall within the quirk,
it's a BIOS bug on the system.  Sorry.  Thanks,

Alex
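
The lspci check described above comes down to walking the PCIe extended
capability list of each root port and looking for capability ID 0x000D
(ACS). Below is a minimal Python sketch of that walk, assuming the
standard extended capability header layout and root access to read the
full config space from sysfs. The function names and the example address
0000:00:1c.0 are illustrative and not part of any tool mentioned in this
thread.

```python
import struct

PCI_EXT_CAP_START = 0x100    # extended capabilities start here in PCIe config space
PCI_EXT_CAP_ID_ACS = 0x000D  # Access Control Services


def find_ext_capability(config: bytes, cap_id: int):
    """Return the offset of extended capability `cap_id`, or None if absent."""
    offset = PCI_EXT_CAP_START
    seen = set()  # guards against a malformed, looping capability list
    while (offset >= PCI_EXT_CAP_START
           and offset + 4 <= len(config)
           and offset not in seen):
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            return None  # empty list or unreadable config space
        if header & 0xFFFF == cap_id:  # bits 15:0 hold the capability ID
            return offset
        offset = (header >> 20) & 0xFFC  # bits 31:20 hold the next offset
    return None


def has_acs(bdf: str) -> bool:
    """Hypothetical helper: check one device, e.g. has_acs("0000:00:1c.0")."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        config = f.read()
    if len(config) <= PCI_EXT_CAP_START:
        # Non-root readers only get the first 64 bytes of config space.
        raise RuntimeError("extended config space not readable (run as root?)")
    return find_ext_capability(config, PCI_EXT_CAP_ID_ACS) is not None
```

A result of False for a root port only says that no ACS capability is
exposed; whether that is a BIOS issue still depends on the port's device
ID being covered by the quirk.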



Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-20 Thread Wei Xu

On 2016-09-20 09:59, Alex Williamson wrote:

On Tue, 20 Sep 2016 09:28:57 +0800
Wei Xu  wrote:


Hi Guys,
I'm trying to pass through a rtl8168 nic to linux guest on my laptop
recently, the link is directly connected to my notebook with a cable.

qemu: 2.7.0-rc4
host/guest kernel: 4.7.0-rc5
driver name: r8169

After binding the driver to vfio-pci and launching the VM for a few
seconds, the connection led on the nic was turned off, and the guest
only see a link down nic with below messages.

[6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop:
100, delay: 1).
[6.177234] r8169 :00:04.0 ens4: link down
[6.177592] r8169 :00:04.0 ens4: link down
[6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready


It's quite similar as below bug except it's for windows driver while
i'm testing linux.

https://bugs.launchpad.net/qemu/+bug/1384892


More info:
My vm image is a pre-installed fedora 22 desktop, i also tried a fresh
fedora live iso, looks it can load the driver correctly.

I tried to disable auto negotiation for the link but it didn't work for me.

I did the same test with my notebook with a Intel I218-LM ethernet
controller, it works pretty well every time.

I googled around and looks it happened to bare metal too, so just wonder
if this is a bug of network-manager, or is being caused by the nic
driver or an issue in qemu/kernel vfio, anybody can help?


Realtek nics don't work well with device assignment, they barely work
well on bare metal.  Stick with the Intel nic or just use the rtl nic
with virtio.  I've spent longer than I care to admit on the rtl quirks
we have in QEMU and I expect they still only work with a few devices.


OK, I'll ignore Realtek. I added an Intel iwl6235 wireless NIC to my
laptop; the PCI tree shows that the rtl and the iwl each sit behind a
separate PCI bridge. After binding the iwl to the vfio-pci driver,
passthrough failed again with this error message from QEMU:

qemu-system-x86_64: -device vfio-pci,host=0000:02:00.0: vfio: error, group 5 is not viable, please ensure all devices within the iommu_group are bound to their vfio bus driver.
qemu-system-x86_64: -device vfio-pci,host=0000:02:00.0: vfio: failed to get group 5
qemu-system-x86_64: -device vfio-pci,host=0000:02:00.0: Device initialization failed

It seems this happens because the rtl NIC is in the same IOMMU group as
the iwl: when the kernel vfio driver checks viability, it makes sure that
all the devices in the group are viable. It works fine after I bound the
rtl to vfio-pci too. I wonder whether this is a fixed rule. Can't I just
bind the iwl NIC and pass it through to the guest?

pci tree:
-[0000:00]-+-00.0 Intel Corporation Sky Lake Host Bridge/DRAM Registers
+-1c.0-[01]00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
+-1c.7-[02]00.0 Intel Corporation Centrino Advanced-N 6235

Initial sysfs devices in the group after system boot:
'/sys/bus/pci/devices/0000:02:00.0/iommu_group/devices':
lrwxrwxrwx 1 root root 0 Sep 20 21:35 0000:00:1c.0 -> ../../../../devices/pci0000:00/0000:00:1c.0/
lrwxrwxrwx 1 root root 0 Sep 20 21:35 0000:00:1c.7 -> ../../../../devices/pci0000:00/0000:00:1c.7/
lrwxrwxrwx 1 root root 0 Sep 20 21:35 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.0/0000:01:00.0/
lrwxrwxrwx 1 root root 0 Sep 20 21:35 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.7/0000:02:00.0/
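
The viability rule behind the "group 5 is not viable" error can be
checked from sysfs before starting QEMU. Below is a minimal Python
sketch, assuming the usual sysfs layout. The helper names are
hypothetical, and the set of drivers treated as acceptable (vfio-pci,
pci-stub, and pcieport on the bridges) is an assumption about kernels of
this era rather than something stated in this thread.

```python
import os

# Assumption: drivers that do not make a group non-viable. pcieport only
# ever binds to bridges/root ports, so it is not special-cased further.
ACCEPTABLE_DRIVERS = {"vfio-pci", "pci-stub", "pcieport"}


def group_drivers(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Map every device in bdf's IOMMU group to its bound driver (or None)."""
    group_dir = os.path.join(sysfs_root, bdf, "iommu_group", "devices")
    drivers = {}
    for name in sorted(os.listdir(group_dir)):
        link = os.path.join(group_dir, name, "driver")
        # The "driver" symlink exists only while a driver is bound.
        if os.path.exists(link):
            drivers[name] = os.path.basename(os.path.realpath(link))
        else:
            drivers[name] = None
    return drivers


def blocking_devices(drivers):
    """Return the devices whose current driver would block the group."""
    return [dev for dev, drv in drivers.items()
            if drv is not None and drv not in ACCEPTABLE_DRIVERS]
```

With the group listed above, blocking_devices(group_drivers("0000:02:00.0"))
would be expected to name the Realtek NIC for as long as r8169 is bound
to it.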



Thanks,

Alex





Re: [vfio-users] Lost link when pass through rtl8168 to guest

2016-09-19 Thread Alex Williamson
On Tue, 20 Sep 2016 09:28:57 +0800
Wei Xu  wrote:

> Hi Guys,
> I'm trying to pass through a rtl8168 nic to linux guest on my laptop 
> recently, the link is directly connected to my notebook with a cable.
> 
> qemu: 2.7.0-rc4
> host/guest kernel: 4.7.0-rc5
> driver name: r8169
> 
> After binding the driver to vfio-pci and launching the VM for a few 
> seconds, the connection led on the nic was turned off, and the guest
> only see a link down nic with below messages.
> 
> [6.173188] r8169 :00:04.0 ens4: rtl_phy_reset_cond == 1 (loop: 
> 100, delay: 1).
> [6.177234] r8169 :00:04.0 ens4: link down
> [6.177592] r8169 :00:04.0 ens4: link down
> [6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready
> 
> 
> It's quite similar as below bug except it's for windows driver while
> i'm testing linux.
> 
> https://bugs.launchpad.net/qemu/+bug/1384892
> 
> 
> More info:
> My vm image is a pre-installed fedora 22 desktop, i also tried a fresh 
> fedora live iso, looks it can load the driver correctly.
> 
> I tried to disable auto negotiation for the link but it didn't work for me.
> 
> I did the same test with my notebook with a Intel I218-LM ethernet 
> controller, it works pretty well every time.
> 
> I googled around and looks it happened to bare metal too, so just wonder
> if this is a bug of network-manager, or is being caused by the nic 
> driver or an issue in qemu/kernel vfio, anybody can help?

Realtek nics don't work well with device assignment, they barely work
well on bare metal.  Stick with the Intel nic or just use the rtl nic
with virtio.  I've spent longer than I care to admit on the rtl quirks
we have in QEMU and I expect they still only work with a few devices.
Thanks,

Alex



[vfio-users] Lost link when pass through rtl8168 to guest

2016-09-19 Thread Wei Xu

Hi Guys,
I'm trying to pass through an rtl8168 NIC to a Linux guest on my laptop;
the link is directly connected to my notebook with a cable.

qemu: 2.7.0-rc4
host/guest kernel: 4.7.0-rc5
driver name: r8169

A few seconds after binding the device to vfio-pci and launching the VM,
the link LED on the NIC turns off, and the guest only sees a link-down
NIC, with the messages below.

[6.173188] r8169 0000:00:04.0 ens4: rtl_phy_reset_cond == 1 (loop: 100, delay: 1).
[6.177234] r8169 0000:00:04.0 ens4: link down
[6.177592] r8169 0000:00:04.0 ens4: link down
[6.177889] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready


It's quite similar to the bug below, except that one is for the Windows
driver while I'm testing Linux.

https://bugs.launchpad.net/qemu/+bug/1384892


More info:
My VM image is a pre-installed Fedora 22 desktop. I also tried a fresh
Fedora live ISO; it looks like it can load the driver correctly.

I tried disabling auto-negotiation for the link, but that didn't help.

I ran the same test on my notebook, which has an Intel I218-LM Ethernet
controller, and it works well every time.

I googled around and it looks like this happens on bare metal too, so I
wonder whether this is a bug in network-manager, or is caused by the NIC
driver, or is an issue in QEMU/kernel vfio. Can anybody help?


Thanks,
Wei
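
The "binding to vfio-pci" step is not spelled out in this message. One
common way to do it is the sysfs driver_override interface. Below is a
minimal Python sketch, assuming the vfio-pci module is already loaded and
the script runs as root; the function name and the sysfs parameter are
illustrative, and the exact method used in this report is not known.

```python
import os


def rebind_to_vfio(bdf, sysfs="/sys/bus/pci"):
    """Rebind PCI device `bdf` (e.g. "0000:01:00.0") to vfio-pci."""
    dev = os.path.join(sysfs, "devices", bdf)

    # 1. Tell the PCI core that only vfio-pci may claim this device.
    with open(os.path.join(dev, "driver_override"), "w") as f:
        f.write("vfio-pci")

    # 2. Detach the current driver (r8169, iwlwifi, ...) if one is bound.
    unbind = os.path.join(dev, "driver", "unbind")
    if os.path.exists(unbind):
        with open(unbind, "w") as f:
            f.write(bdf)

    # 3. Ask the PCI core to probe the device again; with the override
    #    in place, vfio-pci is the driver that picks it up.
    with open(os.path.join(sysfs, "drivers_probe"), "w") as f:
        f.write(bdf)
```

Every endpoint in the same IOMMU group has to go through the same step
before QEMU accepts the device.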
