On 2016-09-23 02:47, Nick Sarnie wrote:
Hi again,

Very much to my surprise, Gigabyte replied and sent me a fixed BIOS. The
new IOMMU groups (with the ACS override patch kernel command line option
removed for this boot) and my lspci information are below. I now see the
following four messages in dmesg:

[    0.523892] pci 0000:00:1c.0: Intel SPT PCH root port ACS workaround enabled
[    0.524031] pci 0000:00:1c.4: Intel SPT PCH root port ACS workaround enabled
[    0.524159] pci 0000:00:1c.5: Intel SPT PCH root port ACS workaround enabled
[    0.524292] pci 0000:00:1d.0: Intel SPT PCH root port ACS workaround enabled


IOMMU Groups: http://pastebin.com/raw/0dcHk8Xk
lspci: http://pastebin.com/raw/1zAZuPBM

That's cool. How did you report your issue to Gigabyte? I'd like to try as well.

Wei


Alex, please let me know if they missed anything else, so I can report
it to them.

Thanks,
Nick

On Sun, Sep 18, 2016 at 4:03 PM, Nick Sarnie <commendsar...@gmail.com> wrote:

    Hi again,

    Thanks a lot for investigating. I've reported the issue to the
    manufacturer.


    Thanks,
    sarnex

    On Sat, Sep 17, 2016 at 5:35 PM, Alex Williamson
    <alex.l.william...@gmail.com> wrote:

        On Sat, Sep 17, 2016 at 12:29 PM, Nick Sarnie
        <commendsar...@gmail.com> wrote:

            Hi Alex,

            The output is here: http://pastebin.com/raw/qjnpuaVr


        Ok, you need to go complain to your motherboard manufacturer,
        they're the ones hiding the ACS capability.  PCIe extended
        capabilities always start at 0x100, the dword there is:

        100: 01 00 01 22 = 0x22010001

        Breaking that down, the capability at 0x100 is ID 0x0001 (AER),
        version 0x1, and the next capability is at 0x220.  So we do the
        same there:

        220: 19 00 01 00 = 0x00010019

        Capability ID 0x0019 (Secondary PCIe), version 0x1, next
        capability 0x0, terminating the capability list.
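
The breakdown above can be sketched in a few lines of Python. This is
only an illustration, assuming the usual PCIe extended capability header
layout (ID in bits 15:0, version in bits 19:16, next pointer in bits
31:20); the function names are made up for this example and are not part
of any tool.

```python
def dword_from_dump(hex_bytes: str) -> int:
    """Turn dump bytes such as '01 00 01 22' into a little-endian dword."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def decode_ext_cap_header(dword: int):
    """Split an extended capability header into (id, version, next offset)."""
    cap_id = dword & 0xFFFF              # bits 15:0
    version = (dword >> 16) & 0xF        # bits 19:16
    next_offset = (dword >> 20) & 0xFFC  # bits 31:20, low two bits reserved
    return cap_id, version, next_offset


if __name__ == "__main__":
    # The two headers quoted in this thread, at 0x100 and 0x220.
    for offset, dump in ((0x100, "01 00 01 22"), (0x220, "19 00 01 00")):
        cap_id, version, nxt = decode_ext_cap_header(dword_from_dump(dump))
        print(f"{offset:#05x}: id={cap_id:#06x} version={version} next={nxt:#05x}")
```

A next offset of 0 means the list ends at that capability.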

        Per Intel documentation for the chipset
        (http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html),
        the ACS capability and control registers live at 0x144 and 0x148
        respectively and we can see that you do have data here matching
        the default value of the capability register:

        140: 00 00 00 00 0f 00 00 00 00 00 00 00 00 00 00 00

        ie. default value of 0x144 is 0xf.  It appears that this BIOS
        vendor didn't connect the capability into the chain or fill in
        the capability header.  The registers to do this are RW/O, ie.
        Read-Write-Once.  IOW, the registers can only be written once,
        which is intended to be used by the BIOS.  The capability bits
        themselves are RW/O, allowing vendors to expose different sets
        of ACS capabilities.  Given that this vendor has not exposed the
        capability, we have no basis to believe that the default value
        of the register represents the real capabilities of the system
        and therefore we cannot assume we're able to control ACS.  File
        a bug with the vendor or look for a BIOS update where they may
        have already fixed this.
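
For anyone who wants to repeat this check on their own root ports, here
is a rough Python sketch that walks the extended capability chain in a
raw config space image and reports whether ACS (extended capability ID
0x000D) is linked into it. The function names are invented for this
example, and the test data is a synthetic image built only from the
values quoted above, not a real dump. On Linux the real bytes can be read
from /sys/bus/pci/devices/<BDF>/config, normally as root to get past the
first 64 bytes.

```python
import struct

PCI_EXT_CAP_START = 0x100   # first extended capability header
PCI_EXT_CAP_ID_ACS = 0x000D  # Access Control Services


def walk_ext_caps(config: bytes):
    """Return [(offset, cap_id, version), ...] for the linked capabilities."""
    caps = []
    seen = set()
    offset = PCI_EXT_CAP_START
    while offset and offset not in seen and offset + 4 <= len(config):
        seen.add(offset)  # guard against a looping chain
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            break  # empty list or unreadable config space
        caps.append((offset, header & 0xFFFF, (header >> 16) & 0xF))
        offset = (header >> 20) & 0xFFC
    return caps


def has_acs(config: bytes) -> bool:
    """True only if ACS is reachable through the capability chain."""
    return any(cap_id == PCI_EXT_CAP_ID_ACS for _, cap_id, _ in walk_ext_caps(config))


def synthetic_config_from_thread() -> bytes:
    """Build a 4 KiB image holding only the values quoted in this thread."""
    config = bytearray(4096)
    struct.pack_into("<I", config, 0x100, 0x22010001)  # AER, next 0x220
    struct.pack_into("<I", config, 0x220, 0x00010019)  # Secondary PCIe, end
    struct.pack_into("<I", config, 0x144, 0x0000000F)  # unlinked ACS default
    return bytes(config)


if __name__ == "__main__":
    image = synthetic_config_from_thread()
    for offset, cap_id, version in walk_ext_caps(image):
        print(f"{offset:#05x}: id={cap_id:#06x} version={version}")
    print("ACS linked into chain:", has_acs(image))
```

The data at 0x144 is present in the image but never visited, which is the
situation described above: the register holds a value, yet nothing in the
chain points to it.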

            Also, is there any way we could move the USB controller into
            its own group, or move the Ethernet and SATA controllers
            into a separate group? Ideally, I could pass the USB
            Controller in group 7 without the ACS patch.


        That's not how IOMMU groups work.  See
        http://vfio.blogspot.com/2014/08/iommu-groups-inside-and-out.html
        We aren't creating these groups arbitrarily, we base them on
        the information provided to us by the IOMMU driver and PCI
        topology features, including ACS.  If we cannot determine that
        there is isolation between components, we must assume that they
        are not isolated.  Your choices are to run an unsupported (and
        unsupportable) configuration using the ACS override patch, get
        your hardware vendor to fix their platform, or upgrade to better
        hardware with better isolation characteristics.
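
To see which devices the kernel has placed together, the groups can be
read back from sysfs. The sketch below assumes the standard Linux layout
/sys/kernel/iommu_groups/<N>/devices/<BDF>; the function name and the
root parameter are just choices made for this example.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of device addresses}."""
    groups = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return groups  # IOMMU disabled or not a Linux sysfs tree
    for group_dir in root_path.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return groups


if __name__ == "__main__":
    for group, devices in sorted(list_iommu_groups().items()):
        print(f"IOMMU group {group}: {' '.join(devices)}")
```

Everything listed in one group has to be assigned together, so a USB
controller that shares a group with Ethernet and SATA will show up on the
same line as them.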





_______________________________________________
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users

