Re: [PATCH] iommu/amd: Add sanity check for interrupt remapping table length macros

2020-12-10 Thread Jerry Snitselaar


Suravee Suthikulpanit @ 2020-12-10 09:24 MST:

> Currently, macros related to the interrupt remapping table length are
> defined separately. This has resulted in an oversight in which one of
> the macros was missed when changing the length. To prevent this,
> redefine the macros to add a built-in sanity check.
>
> Also, rename macros to use the name of the DTE[IntTabLen] field as
> specified in the AMD IOMMU specification. There is no functional change.
>
> Suggested-by: Linus Torvalds 
> Reviewed-by: Tom Lendacky 
> Signed-off-by: Suravee Suthikulpanit 
> Cc: Will Deacon 
> Cc: Jerry Snitselaar 
> Cc: Joerg Roedel 
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 19 ++-
>  drivers/iommu/amd/init.c            |  6 +++---
>  drivers/iommu/amd/iommu.c           |  2 +-
>  3 files changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 494b42a31b7a..899ce62df3f0 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -255,11 +255,19 @@
>  /* Bit value definition for dte irq remapping fields*/
>  #define DTE_IRQ_PHYS_ADDR_MASK   (((1ULL << 45)-1) << 6)
>  #define DTE_IRQ_REMAP_INTCTL_MASK(0x3ULL << 60)
> -#define DTE_IRQ_TABLE_LEN_MASK   (0xfULL << 1)
>  #define DTE_IRQ_REMAP_INTCTL(2ULL << 60)
> -#define DTE_IRQ_TABLE_LEN   (9ULL << 1)
>  #define DTE_IRQ_REMAP_ENABLE1ULL
>  
> +/*
> + * AMD IOMMU hardware only support 512 IRTEs despite
> + * the architectural limitation of 2048 entries.
> + */
> +#define DTE_INTTAB_ALIGNMENT128
> +#define DTE_INTTABLEN_VALUE 9ULL
> +#define DTE_INTTABLEN   (DTE_INTTABLEN_VALUE << 1)
> +#define DTE_INTTABLEN_MASK  (0xfULL << 1)
> +#define MAX_IRQS_PER_TABLE  (1 << DTE_INTTABLEN_VALUE)
> +
>  #define PAGE_MODE_NONE0x00
>  #define PAGE_MODE_1_LEVEL 0x01
>  #define PAGE_MODE_2_LEVEL 0x02
> @@ -409,13 +417,6 @@ extern bool amd_iommu_np_cache;
>  /* Only true if all IOMMUs support device IOTLBs */
>  extern bool amd_iommu_iotlb_sup;
>  
> -/*
> - * AMD IOMMU hardware only support 512 IRTEs despite
> - * the architectural limitation of 2048 entries.
> - */
> -#define MAX_IRQS_PER_TABLE   512
> -#define IRQ_TABLE_ALIGNMENT  128
> -
>  struct irq_remap_table {
>   raw_spinlock_t lock;
>   unsigned min_index;
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 23a790f8f550..6bec8913d064 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -989,10 +989,10 @@ static bool copy_device_table(void)
>  
>   irq_v = old_devtb[devid].data[2] & DTE_IRQ_REMAP_ENABLE;
>   int_ctl = old_devtb[devid].data[2] & DTE_IRQ_REMAP_INTCTL_MASK;
> - int_tab_len = old_devtb[devid].data[2] & DTE_IRQ_TABLE_LEN_MASK;
> + int_tab_len = old_devtb[devid].data[2] & DTE_INTTABLEN_MASK;
>   if (irq_v && (int_ctl || int_tab_len)) {
>   if ((int_ctl != DTE_IRQ_REMAP_INTCTL) ||
> - (int_tab_len != DTE_IRQ_TABLE_LEN)) {
> + (int_tab_len != DTE_INTTABLEN)) {
>   pr_err("Wrong old irq remapping flag: %#x\n", devid);
>   return false;
>   }
> @@ -2674,7 +2674,7 @@ static int __init early_amd_iommu_init(void)
>   remap_cache_sz = MAX_IRQS_PER_TABLE * (sizeof(u64) * 2);
>   amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
>   remap_cache_sz,
> - IRQ_TABLE_ALIGNMENT,
> + DTE_INTTAB_ALIGNMENT,
>   0, NULL);
>   if (!amd_iommu_irq_cache)
>   goto out;
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index b9cf59443843..f7abf16d1e3a 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -3191,7 +3191,7 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
>   dte &= ~DTE_IRQ_PHYS_ADDR_MASK;
>   dte |= iommu_virt_to_phys(table->table);
>   dte |= DTE_IRQ_REMAP_INTCTL;
> - dte |= DTE_IRQ_TABLE_LEN;
> + dte |= DTE_INTTABLEN;
>   dte |= DTE_IRQ_REMAP_ENABLE;
>  
>   amd_iommu_dev_table[devid].data[2] = dte;


Reviewed-by: Jerry Snitselaar 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [GIT PULL] IOMMU fix for 5.10 (-final)

2020-12-09 Thread Jerry Snitselaar
On Wed, Dec 9, 2020 at 12:18 PM Linus Torvalds
 wrote:
>
> On Wed, Dec 9, 2020 at 11:12 AM Jerry Snitselaar  wrote:
> >
> > Since the field in the device table entry format expects it to be n
> > where there are 2^n entries in the table I guess it should be:
> >
> > #define DTE_IRQ_TABLE_LEN 9
> > #define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN)
>
> No, that "DTE_IRQ_TABLE_LEN" is not the size shift - it's the size
> shift value in that DTE field, which is shifted up by 1.
>
> That's why the current code does that
>
>#define DTE_IRQ_TABLE_LEN   (9ULL << 1)
>
> there..
>
> Which was why I suggested that new #define that is the *actual* shift
> value, and then the DTE thing and the MAX_IRQS_PER_TABLE values would
> depend on that.
>
>Linus
>

Yes, when I read it my head was translating it as setting them both to
512, and then I forgot that it gets shifted over by 1. Considering I was
the one who noticed the original problem of it still being 8, that was a
nice brain fart. This should be fixed like you suggest.



Re: [GIT PULL] IOMMU fix for 5.10 (-final)

2020-12-09 Thread Jerry Snitselaar
On Wed, Dec 9, 2020 at 12:12 PM Jerry Snitselaar  wrote:
>
>
> Will Deacon @ 2020-12-09 11:50 MST:
>
> > On Wed, Dec 09, 2020 at 10:07:46AM -0800, Linus Torvalds wrote:
> >> On Wed, Dec 9, 2020 at 6:12 AM Will Deacon  wrote:
> >> >
> >> > Please pull this one-liner AMD IOMMU fix for 5.10. It's actually a fix
> >> > for a fix, where the size of the interrupt remapping table was increased
> >> > but a related constant for the size of the interrupt table was forgotten.
> >>
> >> Pulled.
> >
> > Thanks.
> >
> >> However, why didn't this then add some sanity checking for the two
> >> different #defines to be in sync?
> >>
> >> IOW, something like
> >>
> >>#define AMD_IOMMU_IRQ_TABLE_SHIFT 9
> >>
> >>#define MAX_IRQS_PER_TABLE (1 << AMD_IOMMU_IRQ_TABLE_SHIFT)
> >>#define DTE_IRQ_TABLE_LEN ((u64)AMD_IOMMU_IRQ_TABLE_SHIFT << 1)
>
> Since the field in the device table entry format expects it to be n
> where there are 2^n entries in the table I guess it should be:
>
> #define DTE_IRQ_TABLE_LEN 9
> #define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN)
>
No, ignore that. I'm being stupid.


> >>
> >> or whatever. Hmm?
> >
> > This looks like a worthwhile change to me, but I don't have any hardware
> > so I've been very reluctant to make even "obvious" driver changes here.
> >
> > Suravee -- please can you post a patch implementing the above?
> >
> >> That way this won't happen again, but perhaps equally importantly the
> >> linkage will be more clear, and there won't be those random constants.
> >>
> >> Naming above is probably garbage - I assume there's some actual
> >> architectural name for that irq table length field in the DTE?
> >
> > The one in the spec is even better: "IntTabLen".
> >
> > Will
>



Re: [GIT PULL] IOMMU fix for 5.10 (-final)

2020-12-09 Thread Jerry Snitselaar


Will Deacon @ 2020-12-09 11:50 MST:

> On Wed, Dec 09, 2020 at 10:07:46AM -0800, Linus Torvalds wrote:
>> On Wed, Dec 9, 2020 at 6:12 AM Will Deacon  wrote:
>> >
>> > Please pull this one-liner AMD IOMMU fix for 5.10. It's actually a fix
>> > for a fix, where the size of the interrupt remapping table was increased
>> > but a related constant for the size of the interrupt table was forgotten.
>> 
>> Pulled.
>
> Thanks.
>
>> However, why didn't this then add some sanity checking for the two
>> different #defines to be in sync?
>> 
>> IOW, something like
>> 
>>#define AMD_IOMMU_IRQ_TABLE_SHIFT 9
>> 
>>#define MAX_IRQS_PER_TABLE (1 << AMD_IOMMU_IRQ_TABLE_SHIFT)
>>#define DTE_IRQ_TABLE_LEN ((u64)AMD_IOMMU_IRQ_TABLE_SHIFT << 1)

Since the field in the device table entry format expects it to be n
where there are 2^n entries in the table I guess it should be:

#define DTE_IRQ_TABLE_LEN 9
#define MAX_IRQS_PER_TABLE (1 << DTE_IRQ_TABLE_LEN)

>> 
>> or whatever. Hmm?
>
> This looks like a worthwhile change to me, but I don't have any hardware
> so I've been very reluctant to make even "obvious" driver changes here.
>
> Suravee -- please can you post a patch implementing the above?
>
>> That way this won't happen again, but perhaps equally importantly the
>> linkage will be more clear, and there won't be those random constants.
>> 
>> Naming above is probably garbage - I assume there's some actual
>> architectural name for that irq table length field in the DTE?
>
> The one in the spec is even better: "IntTabLen".
>
> Will



Re: [PATCH] iommu/amd: Set DTE[IntTabLen] to represent 512 IRTEs

2020-12-07 Thread Jerry Snitselaar


Suravee Suthikulpanit @ 2020-12-07 02:19 MST:

> According to the AMD IOMMU spec, the commit 73db2fc595f3
> ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
> also requires the interrupt table length (IntTabLen) to be set to 9
> (power of 2) in the device table mapping entry (DTE).
>
> Fixes: 73db2fc595f3 ("iommu/amd: Increase interrupt remapping table limit to 512 entries")
> Reported-by: Jerry Snitselaar 
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 89647700bab2..494b42a31b7a 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -257,7 +257,7 @@
>  #define DTE_IRQ_REMAP_INTCTL_MASK(0x3ULL << 60)
>  #define DTE_IRQ_TABLE_LEN_MASK   (0xfULL << 1)
>  #define DTE_IRQ_REMAP_INTCTL(2ULL << 60)
> -#define DTE_IRQ_TABLE_LEN   (8ULL << 1)
> +#define DTE_IRQ_TABLE_LEN   (9ULL << 1)
>  #define DTE_IRQ_REMAP_ENABLE1ULL
>  
>  #define PAGE_MODE_NONE0x00

Reviewed-by: Jerry Snitselaar 



Re: [PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries

2020-12-01 Thread Jerry Snitselaar


Suravee Suthikulpanit @ 2020-10-14 19:50 MST:

> Certain device drivers allocate IO queues on a per-cpu basis.
> On the AMD EPYC platform, which can support up to 256 CPU threads,
> this can exceed the current MAX_IRQ_PER_TABLE limit of 256,
> and result in the error message:
>
> AMD-Vi: Failed to allocate IRTE
>
> This has been observed with certain NVME devices.
>
> AMD IOMMU hardware can actually support up to 512 interrupt
> remapping table entries. Therefore, update the driver to
> match the hardware limit.
>
> Please note that this also increases the size of interrupt remapping
> table to 8KB per device when using the 128-bit IRTE format.
>
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  drivers/iommu/amd/amd_iommu_types.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
> index 30a5d412255a..427484c45589 100644
> --- a/drivers/iommu/amd/amd_iommu_types.h
> +++ b/drivers/iommu/amd/amd_iommu_types.h
> @@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
>  /* Only true if all IOMMUs support device IOTLBs */
>  extern bool amd_iommu_iotlb_sup;
>  
> -#define MAX_IRQS_PER_TABLE   256
> +/*
> + * AMD IOMMU hardware only support 512 IRTEs despite
> + * the architectural limitation of 2048 entries.
> + */
> +#define MAX_IRQS_PER_TABLE   512
>  #define IRQ_TABLE_ALIGNMENT  128
>  
>  struct irq_remap_table {

With this change should DTE_IRQ_TABLE_LEN be changed to 9? If I understand
the spec correctly, leaving it at 8 says the table is 256 entries long.

Regards,
Jerry



Re: Question about domain_init (v5.3-v5.7)

2020-11-30 Thread Jerry Snitselaar


Jerry Snitselaar @ 2020-11-30 10:50 MST:

> Lu Baolu @ 2020-11-26 19:12 MST:
>
>> Hi Jerry,
>>
>> On 11/27/20 5:35 AM, Jerry Snitselaar wrote:
>>> Lu Baolu @ 2020-11-26 04:01 MST:
>>> 
>>>> Hi Jerry,
>>>>
>>>> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>>>>> Is there a reason we check the requested guest address width against
>>>>> the
>>>>> iommu's mgaw, instead of the agaw that we already know for the iommu?
>>>>> I've run into a case with a new system where the mgaw reported is 57,
>>>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>>>>> the highest supported agaw is 48 and the domain_init code fails here. In
>>>>
>>>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
>>>> maybe have to add a platform specific quirk to set mgaw to the highest
>>>> supported agaw?
>>>>
>>>> Best regards,
>>>> baolu
>>> Is there somewhere you can point me to that discusses how they
>>> should be
>>> setting the mgaw? I misunderstood when I previously asked you about
>>> whether the mgaw could be a value that was greater than any of sagaw.
>>> If it is a bios issue, then they should fix it there.
>>
>> MGAW indicates the max gpa width supported by 2nd translation. The VT-d
>> spec requires that this value must be at least equal to the host
> >> physical addressability. According to this, BIOS is good, right?
>>
>> For this failure case, domain_init() just wants to find a suitable agaw
>> for the private domain. I think it makes sense to check against
>> iommu->agaw instead of cap_mgaw.
>>
>> Best regards,
>> baolu
>>
>
> From this bit in the spec about MGAW:
>
> Guest addressability for a given DMA request is limited to the
> minimum of the value reported through this field and the adjusted
> guest address width of the corresponding page-table structure.
> (Adjusted guest address widths supported by hardware are reported
> through the SAGAW field).
>
> That does suggest it should be adjusted down to the sagaw value in this case, yes?
> Just want to make sure I'm understanding it correctly.

Or I guess that is really talking about if you had an mgaw lower than the
sagaw, the dma request would be limited to that lower mgaw value.

>
>>> 
>>>>
>>>>> other places like prepare_domain_attach_device, the dmar domain agaw
>>>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>>>>> determined based off what is reported for sagaw. I'm wondering if it
>>>>> can't instead do:
>>>>> ---
>>>>>drivers/iommu/intel-iommu.c | 4 ++--
>>>>>1 file changed, 2 insertions(+), 2 deletions(-)
>>>>> diff --git a/drivers/iommu/intel-iommu.c
>>>>> b/drivers/iommu/intel-iommu.c
>>>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>>>>> --- a/drivers/iommu/intel-iommu.c
>>>>> +++ b/drivers/iommu/intel-iommu.c
>>>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, 
>>>>> struct intel_iommu *iommu,
>>>>>   domain_reserve_special_ranges(domain);
>>>>>   /* calculate AGAW */
>>>>> - if (guest_width > cap_mgaw(iommu->cap))
>>>>> - guest_width = cap_mgaw(iommu->cap);
>>>>> + if (guest_width > agaw_to_width(iommu->agaw))
>>>>> + guest_width = agaw_to_width(iommu->agaw);
>>>>>   domain->gaw = guest_width;
>>>>>   adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>>>   agaw = width_to_agaw(adjust_width);
>>>>> --
>>>>> 2.27.0
>>>>>
>>>>> Thoughts? With the former code the ehci device for the ilo fails when
>>>>> trying to get a private domain.
>>>>> Thanks,
>>>>> Jerry
>>>>>
>>> 



Re: Question about domain_init (v5.3-v5.7)

2020-11-30 Thread Jerry Snitselaar


Lu Baolu @ 2020-11-26 19:12 MST:

> Hi Jerry,
>
> On 11/27/20 5:35 AM, Jerry Snitselaar wrote:
>> Lu Baolu @ 2020-11-26 04:01 MST:
>> 
>>> Hi Jerry,
>>>
>>> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>>>> Is there a reason we check the requested guest address width against
>>>> the
>>>> iommu's mgaw, instead of the agaw that we already know for the iommu?
>>>> I've run into a case with a new system where the mgaw reported is 57,
>>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>>>> the highest supported agaw is 48 and the domain_init code fails here. In
>>>
>>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
>>> maybe have to add a platform specific quirk to set mgaw to the highest
>>> supported agaw?
>>>
>>> Best regards,
>>> baolu
>> Is there somewhere you can point me to that discusses how they
>> should be
>> setting the mgaw? I misunderstood when I previously asked you about
>> whether the mgaw could be a value that was greater than any of sagaw.
>> If it is a bios issue, then they should fix it there.
>
> MGAW indicates the max gpa width supported by 2nd translation. The VT-d
> spec requires that this value must be at least equal to the host
> physical addressability. According to this, BIOS is good, right?
>
> For this failure case, domain_init() just wants to find a suitable agaw
> for the private domain. I think it makes sense to check against
> iommu->agaw instead of cap_mgaw.
>
> Best regards,
> baolu
>

From this bit in the spec about MGAW:

Guest addressability for a given DMA request is limited to the
minimum of the value reported through this field and the adjusted
guest address width of the corresponding page-table structure.
(Adjusted guest address widths supported by hardware are reported
through the SAGAW field).

That does suggest it should be adjusted down to the sagaw value in this case, yes?
Just want to make sure I'm understanding it correctly.

>> 
>>>
>>>> other places like prepare_domain_attach_device, the dmar domain agaw
>>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>>>> determined based off what is reported for sagaw. I'm wondering if it
>>>> can't instead do:
>>>> ---
>>>>drivers/iommu/intel-iommu.c | 4 ++--
>>>>1 file changed, 2 insertions(+), 2 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c
>>>> b/drivers/iommu/intel-iommu.c
>>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, 
>>>> struct intel_iommu *iommu,
>>>>domain_reserve_special_ranges(domain);
>>>>/* calculate AGAW */
>>>> -  if (guest_width > cap_mgaw(iommu->cap))
>>>> -  guest_width = cap_mgaw(iommu->cap);
>>>> +  if (guest_width > agaw_to_width(iommu->agaw))
>>>> +  guest_width = agaw_to_width(iommu->agaw);
>>>>domain->gaw = guest_width;
>>>>adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>>agaw = width_to_agaw(adjust_width);
>>>> --
>>>> 2.27.0
>>>>
>>>> Thoughts? With the former code the ehci device for the ilo fails when
>>>> trying to get a private domain.
>>>> Thanks,
>>>> Jerry
>>>>
>> 



Re: Question about domain_init (v5.3-v5.7)

2020-11-26 Thread Jerry Snitselaar


Lu Baolu @ 2020-11-26 19:12 MST:

> Hi Jerry,
>
> On 11/27/20 5:35 AM, Jerry Snitselaar wrote:
>> Lu Baolu @ 2020-11-26 04:01 MST:
>> 
>>> Hi Jerry,
>>>
>>> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>>>> Is there a reason we check the requested guest address width against
>>>> the
>>>> iommu's mgaw, instead of the agaw that we already know for the iommu?
>>>> I've run into a case with a new system where the mgaw reported is 57,
>>>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>>>> the highest supported agaw is 48 and the domain_init code fails here. In
>>>
>>> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
>>> maybe have to add a platform specific quirk to set mgaw to the highest
>>> supported agaw?
>>>
>>> Best regards,
>>> baolu
>> Is there somewhere you can point me to that discusses how they
>> should be
>> setting the mgaw? I misunderstood when I previously asked you about
>> whether the mgaw could be a value that was greater than any of sagaw.
>> If it is a bios issue, then they should fix it there.
>
> MGAW indicates the max gpa width supported by 2nd translation. The VT-d
> spec requires that this value must be at least equal to the host
> physical addressability. According to this, BIOS is good, right?
>

Yes, the host address width is 46. MGAW reports 57 (56+1), and the
highest sagaw bit corresponds to 48.


> For this failure case, domain_init() just wants to find a suitable agaw
> for the private domain. I think it makes sense to check against
> iommu->agaw instead of cap_mgaw.
>
> Best regards,
> baolu
>
>> 
>>>
>>>> other places like prepare_domain_attach_device, the dmar domain agaw
>>>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>>>> determined based off what is reported for sagaw. I'm wondering if it
>>>> can't instead do:
>>>> ---
>>>>drivers/iommu/intel-iommu.c | 4 ++--
>>>>1 file changed, 2 insertions(+), 2 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c
>>>> b/drivers/iommu/intel-iommu.c
>>>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, 
>>>> struct intel_iommu *iommu,
>>>>domain_reserve_special_ranges(domain);
>>>>/* calculate AGAW */
>>>> -  if (guest_width > cap_mgaw(iommu->cap))
>>>> -  guest_width = cap_mgaw(iommu->cap);
>>>> +  if (guest_width > agaw_to_width(iommu->agaw))
>>>> +  guest_width = agaw_to_width(iommu->agaw);
>>>>domain->gaw = guest_width;
>>>>adjust_width = guestwidth_to_adjustwidth(guest_width);
>>>>agaw = width_to_agaw(adjust_width);
>>>> --
>>>> 2.27.0
>>>>
>>>> Thoughts? With the former code the ehci device for the ilo fails when
>>>> trying to get a private domain.
>>>> Thanks,
>>>> Jerry
>>>>
>> 



Re: Question about domain_init (v5.3-v5.7)

2020-11-26 Thread Jerry Snitselaar


Lu Baolu @ 2020-11-26 04:01 MST:

> Hi Jerry,
>
> On 2020/11/26 4:27, Jerry Snitselaar wrote:
>> Is there a reason we check the requested guest address width against
>> the
>> iommu's mgaw, instead of the agaw that we already know for the iommu?
>> I've run into a case with a new system where the mgaw reported is 57,
>> but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
>> the highest supported agaw is 48 and the domain_init code fails here. In
>
> Isn't this a platform bug? If it's too late to fix it in the BIOS, you
> maybe have to add a platform specific quirk to set mgaw to the highest
> supported agaw?
>
> Best regards,
> baolu

Is there somewhere you can point me to that discusses how they should be
setting the mgaw? I misunderstood when I previously asked you about
whether the mgaw could be a value that was greater than any of sagaw.
If it is a bios issue, then they should fix it there.

>
>> other places like prepare_domain_attach_device, the dmar domain agaw
>> gets adjusted down to the iommu agaw. The agaw of the iommu gets
>> determined based off what is reported for sagaw. I'm wondering if it
>> can't instead do:
>> ---
>>   drivers/iommu/intel-iommu.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c
>> b/drivers/iommu/intel-iommu.c
>> index 6ca5c92ef2e5..a8e41ec36d9e 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, 
>> struct intel_iommu *iommu,
>>  domain_reserve_special_ranges(domain);
>>  /* calculate AGAW */
>> -if (guest_width > cap_mgaw(iommu->cap))
>> -guest_width = cap_mgaw(iommu->cap);
>> +if (guest_width > agaw_to_width(iommu->agaw))
>> +guest_width = agaw_to_width(iommu->agaw);
>>  domain->gaw = guest_width;
>>  adjust_width = guestwidth_to_adjustwidth(guest_width);
>>  agaw = width_to_agaw(adjust_width);
>> --
>> 2.27.0
>> 
>> Thoughts? With the former code the ehci device for the ilo fails when
>> trying to get a private domain.
>> Thanks,
>> Jerry
>> 



Question about domain_init (v5.3-v5.7)

2020-11-25 Thread Jerry Snitselaar


Is there a reason we check the requested guest address width against the
iommu's mgaw, instead of the agaw that we already know for the iommu?
I've run into a case with a new system where the mgaw reported is 57,
but if they set PAE to 46 instead of 52 in the bios, then sagaw reports
the highest supported agaw is 48 and the domain_init code fails here. In
other places like prepare_domain_attach_device, the dmar domain agaw
gets adjusted down to the iommu agaw. The agaw of the iommu gets
determined based off what is reported for sagaw. I'm wondering if it
can't instead do:

---
 drivers/iommu/intel-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6ca5c92ef2e5..a8e41ec36d9e 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1862,8 +1862,8 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
domain_reserve_special_ranges(domain);

/* calculate AGAW */
-   if (guest_width > cap_mgaw(iommu->cap))
-   guest_width = cap_mgaw(iommu->cap);
+   if (guest_width > agaw_to_width(iommu->agaw))
+   guest_width = agaw_to_width(iommu->agaw);
domain->gaw = guest_width;
adjust_width = guestwidth_to_adjustwidth(guest_width);
agaw = width_to_agaw(adjust_width);
--
2.27.0


Thoughts? With the former code the ehci device for the ilo fails when
trying to get a private domain.

Thanks,
Jerry



kdump boot failing with IVRS checksum failure

2020-09-21 Thread Jerry Snitselaar


Hello Joerg,

We are seeing a kdump kernel boot failure in test on an HP DL325 Gen10,
and it was tracked down to 387caf0b759a ("iommu/amd: Treat per-device
exclusion ranges as r/w unity-mapped regions"). It reproduces on 5.9-rc5
and goes away with a revert of that commit. A follow-on commit that
depends on it, 2ca6b6dc8512 ("iommu/amd: Remove unused variable"), was
reverted as well. I'm working on getting system access and want to see
what the IVRS table looks like, but thought I'd give you a heads-up.

Regards,
Jerry



Re: [PATCH v2 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories

2020-07-27 Thread Jerry Snitselaar


Jerry Snitselaar @ 2020-06-30 13:06 MST:

> This patchset implements the suggestion from Linus to move the
> Kconfig and Makefile bits for AMD and Intel into their respective
> directories.
>
> v2: Rebase against v5.8-rc3. Dropped ---help--- changes from Kconfig as that was
> dealt with in a systemwide cleanup.
>
> Jerry Snitselaar (2):
>   iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
>   iommu/amd: Move Kconfig and Makefile bits down into amd directory
>
>

Hi Joerg,

Looks like I forgot to cc you on this cover letter for v2.
Does this work for you now?

Regards,
Jerry



[PATCH v2 2/2] iommu/amd: Move Kconfig and Makefile bits down into amd directory

2020-06-30 Thread Jerry Snitselaar
Move AMD Kconfig and Makefile bits down into the amd directory
with the rest of the AMD specific files.

Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/Kconfig  | 45 +-
 drivers/iommu/Makefile |  5 +
 drivers/iommu/amd/Kconfig  | 44 +
 drivers/iommu/amd/Makefile |  4 
 4 files changed, 50 insertions(+), 48 deletions(-)
 create mode 100644 drivers/iommu/amd/Kconfig
 create mode 100644 drivers/iommu/amd/Makefile

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 281cd6bd0fe0..24000e7ed0fa 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -132,50 +132,7 @@ config IOMMU_PGTABLES_L2
def_bool y
depends on MSM_IOMMU && MMU && SMP && CPU_DCACHE_DISABLE=n
 
-# AMD IOMMU support
-config AMD_IOMMU
-   bool "AMD IOMMU support"
-   select SWIOTLB
-   select PCI_MSI
-   select PCI_ATS
-   select PCI_PRI
-   select PCI_PASID
-   select IOMMU_API
-   select IOMMU_IOVA
-   select IOMMU_DMA
-   depends on X86_64 && PCI && ACPI
-   help
- With this option you can enable support for AMD IOMMU hardware in
- your system. An IOMMU is a hardware component which provides
- remapping of DMA memory accesses from devices. With an AMD IOMMU you
- can isolate the DMA memory of different devices and protect the
- system from misbehaving device drivers or hardware.
-
- You can find out if your system has an AMD IOMMU if you look into
- your BIOS for an option to enable it or if you have an IVRS ACPI
- table.
-
-config AMD_IOMMU_V2
-   tristate "AMD IOMMU Version 2 driver"
-   depends on AMD_IOMMU
-   select MMU_NOTIFIER
-   help
- This option enables support for the AMD IOMMUv2 features of the IOMMU
- hardware. Select this option if you want to use devices that support
- the PCI PRI and PASID interface.
-
-config AMD_IOMMU_DEBUGFS
-   bool "Enable AMD IOMMU internals in DebugFS"
-   depends on AMD_IOMMU && IOMMU_DEBUGFS
-   help
- !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!
-
- DO NOT ENABLE THIS OPTION UNLESS YOU REALLY, -REALLY- KNOW WHAT YOU 
ARE DOING!!!
- Exposes AMD IOMMU device internals in DebugFS.
-
- This option is -NOT- intended for production environments, and should
- not generally be enabled.
-
+source "drivers/iommu/amd/Kconfig"
 source "drivers/iommu/intel/Kconfig"
 
 config IRQ_REMAP
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 71dd2f382e78..f356bc12b1c7 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-y += intel/
+obj-y += amd/ intel/
 obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
@@ -12,9 +12,6 @@ obj-$(CONFIG_IOASID) += ioasid.o
 obj-$(CONFIG_IOMMU_IOVA) += iova.o
 obj-$(CONFIG_OF_IOMMU) += of_iommu.o
 obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
-obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
-obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
-obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
 arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
new file mode 100644
index ..1f061d91e0b8
--- /dev/null
+++ b/drivers/iommu/amd/Kconfig
@@ -0,0 +1,44 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# AMD IOMMU support
+config AMD_IOMMU
+   bool "AMD IOMMU support"
+   select SWIOTLB
+   select PCI_MSI
+   select PCI_ATS
+   select PCI_PRI
+   select PCI_PASID
+   select IOMMU_API
+   select IOMMU_IOVA
+   select IOMMU_DMA
+   depends on X86_64 && PCI && ACPI
+   help
+ With this option you can enable support for AMD IOMMU hardware in
+ your system. An IOMMU is a hardware component which provides
+ remapping of DMA memory accesses from devices. With an AMD IOMMU you
+ can isolate the DMA memory of different devices and protect the
+ system from misbehaving device drivers or hardware.
+
+ You can find out if your system has an AMD IOMMU if you look into
+ your BIOS for an option to enable it or if you have an IVRS ACPI
+ table.
+
+config AMD_IOMMU_V2
+   tristate "AMD IOMMU Version 2 driver"
+   depends on AMD_IOMMU
+   select MMU_NOTIFIER
+   help
+ This option enables support for the AMD IOMMUv2 features of the IOMMU
+ hardware. Select this option if you want to use devices that support
+ the PCI PRI and PASID interface.

[PATCH v2 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories

2020-06-30 Thread Jerry Snitselaar
This patchset implements the suggestion from Linus to move the
Kconfig and Makefile bits for AMD and Intel into their respective
directories.

v2: Rebase against v5.8-rc3. Dropped ---help--- changes from Kconfig as that was
dealt with in a systemwide cleanup.

Jerry Snitselaar (2):
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/amd: Move Kconfig and Makefile bits down into amd directory


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/2] iommu/vt-d: Move Kconfig and Makefile bits down into intel directory

2020-06-30 Thread Jerry Snitselaar
Move Intel Kconfig and Makefile bits down into intel directory
with the rest of the Intel specific files.

Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/Kconfig| 86 +---
 drivers/iommu/Makefile   |  8 +---
 drivers/iommu/intel/Kconfig  | 86 
 drivers/iommu/intel/Makefile |  7 +++
 4 files changed, 96 insertions(+), 91 deletions(-)
 create mode 100644 drivers/iommu/intel/Kconfig
 create mode 100644 drivers/iommu/intel/Makefile

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6dc49ed8377a..281cd6bd0fe0 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -176,91 +176,7 @@ config AMD_IOMMU_DEBUGFS
  This option is -NOT- intended for production environments, and should
  not generally be enabled.
 
-# Intel IOMMU support
-config DMAR_TABLE
-   bool
-
-config INTEL_IOMMU
-   bool "Support for Intel IOMMU using DMA Remapping Devices"
-   depends on PCI_MSI && ACPI && (X86 || IA64)
-   select IOMMU_API
-   select IOMMU_IOVA
-   select NEED_DMA_MAP_STATE
-   select DMAR_TABLE
-   select SWIOTLB
-   select IOASID
-   help
- DMA remapping (DMAR) devices support enables independent address
- translations for Direct Memory Access (DMA) from devices.
- These DMA remapping devices are reported via ACPI tables
- and include PCI device scope covered by these DMA
- remapping devices.
-
-config INTEL_IOMMU_DEBUGFS
-   bool "Export Intel IOMMU internals in Debugfs"
-   depends on INTEL_IOMMU && IOMMU_DEBUGFS
-   help
- !!!WARNING!!!
-
- DO NOT ENABLE THIS OPTION UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!!!
-
- Expose Intel IOMMU internals in Debugfs.
-
- This option is -NOT- intended for production environments, and should
- only be enabled for debugging Intel IOMMU.
-
-config INTEL_IOMMU_SVM
-   bool "Support for Shared Virtual Memory with Intel IOMMU"
-   depends on INTEL_IOMMU && X86_64
-   select PCI_PASID
-   select PCI_PRI
-   select MMU_NOTIFIER
-   select IOASID
-   help
- Shared Virtual Memory (SVM) provides a facility for devices
- to access DMA resources through process address space by
- means of a Process Address Space ID (PASID).
-
-config INTEL_IOMMU_DEFAULT_ON
-   def_bool y
-   prompt "Enable Intel DMA Remapping Devices by default"
-   depends on INTEL_IOMMU
-   help
- Selecting this option will enable a DMAR device at boot time if
- one is found. If this option is not selected, DMAR support can
- be enabled by passing intel_iommu=on to the kernel.
-
-config INTEL_IOMMU_BROKEN_GFX_WA
-   bool "Workaround broken graphics drivers (going away soon)"
-   depends on INTEL_IOMMU && BROKEN && X86
-   help
- Current Graphics drivers tend to use physical address
- for DMA and avoid using DMA APIs. Setting this config
- option permits the IOMMU driver to set a unity map for
- all the OS-visible memory. Hence the driver can continue
- to use physical addresses for DMA, at least until this
- option is removed in the 2.6.32 kernel.
-
-config INTEL_IOMMU_FLOPPY_WA
-   def_bool y
-   depends on INTEL_IOMMU && X86
-   help
- Floppy disk drivers are known to bypass DMA API calls
- thereby failing to work when IOMMU is enabled. This
- workaround will setup a 1:1 mapping for the first
- 16MiB to make floppy (an ISA device) work.
-
-config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
-   bool "Enable Intel IOMMU scalable mode by default"
-   depends on INTEL_IOMMU
-   help
- Selecting this option will enable by default the scalable mode if
- hardware presents the capability. The scalable mode is defined in
- VT-d 3.0. The scalable mode capability could be checked by reading
- /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. If this option
- is not selected, scalable mode support could also be enabled by
- passing intel_iommu=sm_on to the kernel. If not sure, please use
- the default value.
+source "drivers/iommu/intel/Kconfig"
 
 config IRQ_REMAP
bool "Support for Interrupt Remapping"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 342190196dfb..71dd2f382e78 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-y += intel/
 obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
@@ -17,13 +18,8 @@ obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
 arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o

Re: [PATCH 00/13] iommu: Remove usage of dev->archdata.iommu

2020-06-25 Thread Jerry Snitselaar

On Thu Jun 25 20, Joerg Roedel wrote:

From: Joerg Roedel 

Hi,

here is a patch-set to remove the usage of dev->archdata.iommu from
the IOMMU code in the kernel and replace its uses by the iommu per-device
private data field. The changes also remove the field entirely from
the architectures which no longer need it.

On PowerPC the field is called dev->archdata.iommu_domain and was only
used by the PAMU IOMMU driver. It gets removed as well.

The patches have been runtime tested on Intel VT-d and compile tested
with allyesconfig for:

* x86 (32 and 64 bit)
* arm and arm64
* ia64 (only drivers/ because build failed for me in
arch/ia64)
* PPC64

Besides that the changes also survived my IOMMU tree compile tests.

Please review.

Regards,

Joerg

Joerg Roedel (13):
 iommu/exynos: Use dev_iommu_priv_get/set()
 iommu/vt-d: Use dev_iommu_priv_get/set()
 iommu/msm: Use dev_iommu_priv_get/set()
 iommu/omap: Use dev_iommu_priv_get/set()
 iommu/rockchip: Use dev_iommu_priv_get/set()
 iommu/tegra: Use dev_iommu_priv_get/set()
 iommu/pamu: Use dev_iommu_priv_get/set()
 iommu/mediatek: Do no use dev->archdata.iommu
 x86: Remove dev->archdata.iommu pointer
 ia64: Remove dev->archdata.iommu pointer
 arm: Remove dev->archdata.iommu pointer
 arm64: Remove dev->archdata.iommu pointer
 powerpc/dma: Remove dev->archdata.iommu_domain

arch/arm/include/asm/device.h |  3 ---
arch/arm64/include/asm/device.h   |  3 ---
arch/ia64/include/asm/device.h|  3 ---
arch/powerpc/include/asm/device.h |  3 ---
arch/x86/include/asm/device.h |  3 ---
.../gpu/drm/i915/selftests/mock_gem_device.c  | 10 --
drivers/iommu/exynos-iommu.c  | 20 +--
drivers/iommu/fsl_pamu_domain.c   |  8 
drivers/iommu/intel/iommu.c   | 18 -
drivers/iommu/msm_iommu.c |  4 ++--
drivers/iommu/mtk_iommu.h |  2 ++
drivers/iommu/mtk_iommu_v1.c  | 10 --
drivers/iommu/omap-iommu.c| 20 +--
drivers/iommu/rockchip-iommu.c|  8 
drivers/iommu/tegra-gart.c|  8 
drivers/iommu/tegra-smmu.c|  8 
.../media/platform/s5p-mfc/s5p_mfc_iommu.h|  4 +++-
17 files changed, 64 insertions(+), 71 deletions(-)

--
2.27.0



Reviewed-by: Jerry Snitselaar 



Re: [PATCH 1/1] iommu/vt-d: Fix misuse of iommu_domain_identity_map()

2020-06-18 Thread Jerry Snitselaar

On Fri Jun 19 20, Lu Baolu wrote:

The iommu_domain_identity_map() helper takes start/end PFN as arguments.
Fix a misuse case where the start and end addresses are passed.

Fixes: e70b081c6f376 ("iommu/vt-d: Remove IOVA handling code from the non-dma_ops path")
Cc: Tom Murphy 
Reported-by: Alex Williamson 
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 



[PATCH 2/2] iommu/amd: Move Kconfig and Makefile bits down into amd directory

2020-06-12 Thread Jerry Snitselaar
Move AMD Kconfig and Makefile bits down into the amd directory
with the rest of the AMD specific files.

Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/Kconfig  | 45 +-
 drivers/iommu/Makefile |  5 +
 drivers/iommu/amd/Kconfig  | 44 +
 drivers/iommu/amd/Makefile |  4 
 4 files changed, 50 insertions(+), 48 deletions(-)
 create mode 100644 drivers/iommu/amd/Kconfig
 create mode 100644 drivers/iommu/amd/Makefile

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b12d4ec124f6..78a8be0053b3 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -132,50 +132,7 @@ config IOMMU_PGTABLES_L2
def_bool y
depends on MSM_IOMMU && MMU && SMP && CPU_DCACHE_DISABLE=n
 
-# AMD IOMMU support
-config AMD_IOMMU
-   bool "AMD IOMMU support"
-   select SWIOTLB
-   select PCI_MSI
-   select PCI_ATS
-   select PCI_PRI
-   select PCI_PASID
-   select IOMMU_API
-   select IOMMU_IOVA
-   select IOMMU_DMA
-   depends on X86_64 && PCI && ACPI
-   ---help---
- With this option you can enable support for AMD IOMMU hardware in
- your system. An IOMMU is a hardware component which provides
- remapping of DMA memory accesses from devices. With an AMD IOMMU you
- can isolate the DMA memory of different devices and protect the
- system from misbehaving device drivers or hardware.
-
- You can find out if your system has an AMD IOMMU if you look into
- your BIOS for an option to enable it or if you have an IVRS ACPI
- table.
-
-config AMD_IOMMU_V2
-   tristate "AMD IOMMU Version 2 driver"
-   depends on AMD_IOMMU
-   select MMU_NOTIFIER
-   ---help---
- This option enables support for the AMD IOMMUv2 features of the IOMMU
- hardware. Select this option if you want to use devices that support
- the PCI PRI and PASID interface.
-
-config AMD_IOMMU_DEBUGFS
-   bool "Enable AMD IOMMU internals in DebugFS"
-   depends on AMD_IOMMU && IOMMU_DEBUGFS
-   ---help---
- !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!  !!!WARNING!!!
-
- DO NOT ENABLE THIS OPTION UNLESS YOU REALLY, -REALLY- KNOW WHAT YOU ARE DOING!!!
- Exposes AMD IOMMU device internals in DebugFS.
-
- This option is -NOT- intended for production environments, and should
- not generally be enabled.
-
+source "drivers/iommu/amd/Kconfig"
 source "drivers/iommu/intel/Kconfig"
 
 config IRQ_REMAP
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 71dd2f382e78..f356bc12b1c7 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-y += intel/
+obj-y += amd/ intel/
 obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
@@ -12,9 +12,6 @@ obj-$(CONFIG_IOASID) += ioasid.o
 obj-$(CONFIG_IOMMU_IOVA) += iova.o
 obj-$(CONFIG_OF_IOMMU) += of_iommu.o
 obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
-obj-$(CONFIG_AMD_IOMMU) += amd/iommu.o amd/init.o amd/quirks.o
-obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd/debugfs.o
-obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
 arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
diff --git a/drivers/iommu/amd/Kconfig b/drivers/iommu/amd/Kconfig
new file mode 100644
index ..1f061d91e0b8
--- /dev/null
+++ b/drivers/iommu/amd/Kconfig
@@ -0,0 +1,44 @@
+# SPDX-License-Identifier: GPL-2.0-only
+# AMD IOMMU support
+config AMD_IOMMU
+   bool "AMD IOMMU support"
+   select SWIOTLB
+   select PCI_MSI
+   select PCI_ATS
+   select PCI_PRI
+   select PCI_PASID
+   select IOMMU_API
+   select IOMMU_IOVA
+   select IOMMU_DMA
+   depends on X86_64 && PCI && ACPI
+   help
+ With this option you can enable support for AMD IOMMU hardware in
+ your system. An IOMMU is a hardware component which provides
+ remapping of DMA memory accesses from devices. With an AMD IOMMU you
+ can isolate the DMA memory of different devices and protect the
+ system from misbehaving device drivers or hardware.
+
+ You can find out if your system has an AMD IOMMU if you look into
+ your BIOS for an option to enable it or if you have an IVRS ACPI
+ table.
+
+config AMD_IOMMU_V2
+   tristate "AMD IOMMU Version 2 driver"
+   depends on AMD_IOMMU
+   select MMU_NOTIFIER
+   help
+ This option enables support for the AMD IOMMUv2 features of the IOMMU
+ hardware. Select this option if you want to use devices that support
+ the PCI PRI and PASID interface.

[PATCH 0/2] iommu: Move AMD and Intel Kconfig + Makefile bits into their directories

2020-06-12 Thread Jerry Snitselaar
This patchset implements the suggestion from Linus to move the Kconfig
and Makefile bits for AMD and Intel into their respective directories.
It also cleans up a couple Kconfig entries to use the newer help
attribute instead of ---help--- (complaint from checkpatch).

Jerry Snitselaar (2):
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/amd: Move Kconfig and Makefile bits down into amd directory




[PATCH 1/2] iommu/vt-d: Move Kconfig and Makefile bits down into intel directory

2020-06-12 Thread Jerry Snitselaar
Move Intel Kconfig and Makefile bits down into intel directory
with the rest of the Intel specific files.

Cc: Joerg Roedel 
Cc: Lu Baolu 
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/Kconfig| 86 +---
 drivers/iommu/Makefile   |  8 +---
 drivers/iommu/intel/Kconfig  | 86 
 drivers/iommu/intel/Makefile |  7 +++
 4 files changed, 96 insertions(+), 91 deletions(-)
 create mode 100644 drivers/iommu/intel/Kconfig
 create mode 100644 drivers/iommu/intel/Makefile

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index aca76383f201..b12d4ec124f6 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -176,91 +176,7 @@ config AMD_IOMMU_DEBUGFS
  This option is -NOT- intended for production environments, and should
  not generally be enabled.
 
-# Intel IOMMU support
-config DMAR_TABLE
-   bool
-
-config INTEL_IOMMU
-   bool "Support for Intel IOMMU using DMA Remapping Devices"
-   depends on PCI_MSI && ACPI && (X86 || IA64)
-   select IOMMU_API
-   select IOMMU_IOVA
-   select NEED_DMA_MAP_STATE
-   select DMAR_TABLE
-   select SWIOTLB
-   select IOASID
-   help
- DMA remapping (DMAR) devices support enables independent address
- translations for Direct Memory Access (DMA) from devices.
- These DMA remapping devices are reported via ACPI tables
- and include PCI device scope covered by these DMA
- remapping devices.
-
-config INTEL_IOMMU_DEBUGFS
-   bool "Export Intel IOMMU internals in Debugfs"
-   depends on INTEL_IOMMU && IOMMU_DEBUGFS
-   help
- !!!WARNING!!!
-
- DO NOT ENABLE THIS OPTION UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!!!
-
- Expose Intel IOMMU internals in Debugfs.
-
- This option is -NOT- intended for production environments, and should
- only be enabled for debugging Intel IOMMU.
-
-config INTEL_IOMMU_SVM
-   bool "Support for Shared Virtual Memory with Intel IOMMU"
-   depends on INTEL_IOMMU && X86
-   select PCI_PASID
-   select PCI_PRI
-   select MMU_NOTIFIER
-   select IOASID
-   help
- Shared Virtual Memory (SVM) provides a facility for devices
- to access DMA resources through process address space by
- means of a Process Address Space ID (PASID).
-
-config INTEL_IOMMU_DEFAULT_ON
-   def_bool y
-   prompt "Enable Intel DMA Remapping Devices by default"
-   depends on INTEL_IOMMU
-   help
- Selecting this option will enable a DMAR device at boot time if
- one is found. If this option is not selected, DMAR support can
- be enabled by passing intel_iommu=on to the kernel.
-
-config INTEL_IOMMU_BROKEN_GFX_WA
-   bool "Workaround broken graphics drivers (going away soon)"
-   depends on INTEL_IOMMU && BROKEN && X86
-   ---help---
- Current Graphics drivers tend to use physical address
- for DMA and avoid using DMA APIs. Setting this config
- option permits the IOMMU driver to set a unity map for
- all the OS-visible memory. Hence the driver can continue
- to use physical addresses for DMA, at least until this
- option is removed in the 2.6.32 kernel.
-
-config INTEL_IOMMU_FLOPPY_WA
-   def_bool y
-   depends on INTEL_IOMMU && X86
-   ---help---
- Floppy disk drivers are known to bypass DMA API calls
- thereby failing to work when IOMMU is enabled. This
- workaround will setup a 1:1 mapping for the first
- 16MiB to make floppy (an ISA device) work.
-
-config INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON
-   bool "Enable Intel IOMMU scalable mode by default"
-   depends on INTEL_IOMMU
-   help
- Selecting this option will enable by default the scalable mode if
- hardware presents the capability. The scalable mode is defined in
- VT-d 3.0. The scalable mode capability could be checked by reading
- /sys/devices/virtual/iommu/dmar*/intel-iommu/ecap. If this option
- is not selected, scalable mode support could also be enabled by
- passing intel_iommu=sm_on to the kernel. If not sure, please use
- the default value.
+source "drivers/iommu/intel/Kconfig"
 
 config IRQ_REMAP
bool "Support for Interrupt Remapping"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 342190196dfb..71dd2f382e78 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-y += intel/
 obj-$(CONFIG_IOMMU_API) += iommu.o
 obj-$(CONFIG_IOMMU_API) += iommu-traces.o
 obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
@@ -17,13 +18,8 @@ obj-$(CONFIG_AMD_IOMMU_V2) += amd/iommu_v2.o
 obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
 arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o

[PATCH] iommu: add include/uapi/linux/iommu.h to MAINTAINERS file

2020-06-05 Thread Jerry Snitselaar
When include/uapi/linux/iommu.h was created it was never
added to the file list in MAINTAINERS.

Cc: Joerg Roedel 
Signed-off-by: Jerry Snitselaar 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e1897ed32930..061648b6e393 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8954,6 +8954,7 @@ F: drivers/iommu/
 F: include/linux/iommu.h
 F: include/linux/iova.h
 F: include/linux/of_iommu.h
+F: include/uapi/linux/iommu.h
 
 IO_URING
 M: Jens Axboe 
-- 
2.24.0



[PATCH] iommu: Don't attach deferred device in iommu_group_do_dma_attach

2020-06-04 Thread Jerry Snitselaar
Attaching a deferred device should be delayed until dma api is called.

Cc: iommu@lists.linux-foundation.org
Suggested-by: Joerg Roedel 
Signed-off-by: Jerry Snitselaar 
---
If you already have thrown a patch together, then ignore this. Also
feel free to swap out the signed-off-by with yours since
this is more your patch than mine. You can put a reviewed-by
and tested-by instead for me.

 drivers/iommu/iommu.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b5ea203f6c68..d43120eb1dc5 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
 static int iommu_group_do_dma_attach(struct device *dev, void *data)
 {
struct iommu_domain *domain = data;
+   int ret = 0;
 
-   return __iommu_attach_device(domain, dev);
+   if (!iommu_is_attach_deferred(domain, dev))
+   ret = __iommu_attach_device(domain, dev);
+
+   return ret;
 }
 
 static int __iommu_group_dma_attach(struct iommu_group *group)
-- 
2.24.0



Re: [PATCH 0/2] iommu: Move Intel and AMD drivers into their own subdirectory

2020-06-03 Thread Jerry Snitselaar

On Thu Jun 04 20, Lu Baolu wrote:

Hi Joerg,

On 6/2/20 5:26 PM, Joerg Roedel wrote:

Hi,

two small patches to move the Intel and AMD IOMMU drivers into their own
subdirectory under drivers/iommu/ to make the file structure a bit less
cluttered.



Does the MAINTAINERS file need to be updated?

Best regards,
baolu



Yes, that should be updated to point at the new directories. Good catch.



Regards,

Joerg

Joerg Roedel (2):
  iommu/amd: Move AMD IOMMU driver into subdirectory
  iommu/vt-d: Move Intel IOMMU driver into subdirectory

 drivers/iommu/Makefile | 18 +-
 drivers/iommu/{ => amd}/amd_iommu.h|  0
 drivers/iommu/{ => amd}/amd_iommu_types.h  |  0
 .../{amd_iommu_debugfs.c => amd/debugfs.c} |  0
 drivers/iommu/{amd_iommu_init.c => amd/init.c} |  2 +-
 drivers/iommu/{amd_iommu.c => amd/iommu.c} |  2 +-
 .../iommu/{amd_iommu_v2.c => amd/iommu_v2.c}   |  0
 .../iommu/{amd_iommu_quirks.c => amd/quirks.c} |  0
 .../{intel-iommu-debugfs.c => intel/debugfs.c} |  0
 drivers/iommu/{ => intel}/dmar.c   |  2 +-
 drivers/iommu/{ => intel}/intel-pasid.h|  0
 drivers/iommu/{intel-iommu.c => intel/iommu.c} |  2 +-
 .../irq_remapping.c}   |  2 +-
 drivers/iommu/{intel-pasid.c => intel/pasid.c} |  0
 drivers/iommu/{intel-svm.c => intel/svm.c} |  0
 drivers/iommu/{intel-trace.c => intel/trace.c} |  0
 16 files changed, 14 insertions(+), 14 deletions(-)
 rename drivers/iommu/{ => amd}/amd_iommu.h (100%)
 rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
 rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%)
 rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
 rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%)
 rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%)
 rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)
 rename drivers/iommu/{intel-iommu-debugfs.c => intel/debugfs.c} (100%)
 rename drivers/iommu/{ => intel}/dmar.c (99%)
 rename drivers/iommu/{ => intel}/intel-pasid.h (100%)
 rename drivers/iommu/{intel-iommu.c => intel/iommu.c} (99%)
 rename drivers/iommu/{intel_irq_remapping.c => intel/irq_remapping.c} (99%)
 rename drivers/iommu/{intel-pasid.c => intel/pasid.c} (100%)
 rename drivers/iommu/{intel-svm.c => intel/svm.c} (100%)
 rename drivers/iommu/{intel-trace.c => intel/trace.c} (100%)







Re: [PATCH 0/2] iommu: Move Intel and AMD drivers into their own subdirectory

2020-06-03 Thread Jerry Snitselaar

On Tue Jun 02 20, Joerg Roedel wrote:

Hi,

two small patches to move the Intel and AMD IOMMU drivers into their own
subdirectory under drivers/iommu/ to make the file structure a bit less
cluttered.

Regards,

Joerg

Joerg Roedel (2):
 iommu/amd: Move AMD IOMMU driver into subdirectory
 iommu/vt-d: Move Intel IOMMU driver into subdirectory

drivers/iommu/Makefile | 18 +-
drivers/iommu/{ => amd}/amd_iommu.h|  0
drivers/iommu/{ => amd}/amd_iommu_types.h  |  0
.../{amd_iommu_debugfs.c => amd/debugfs.c} |  0
drivers/iommu/{amd_iommu_init.c => amd/init.c} |  2 +-
drivers/iommu/{amd_iommu.c => amd/iommu.c} |  2 +-
.../iommu/{amd_iommu_v2.c => amd/iommu_v2.c}   |  0
.../iommu/{amd_iommu_quirks.c => amd/quirks.c} |  0
.../{intel-iommu-debugfs.c => intel/debugfs.c} |  0
drivers/iommu/{ => intel}/dmar.c   |  2 +-
drivers/iommu/{ => intel}/intel-pasid.h|  0
drivers/iommu/{intel-iommu.c => intel/iommu.c} |  2 +-
.../irq_remapping.c}   |  2 +-
drivers/iommu/{intel-pasid.c => intel/pasid.c} |  0
drivers/iommu/{intel-svm.c => intel/svm.c} |  0
drivers/iommu/{intel-trace.c => intel/trace.c} |  0
16 files changed, 14 insertions(+), 14 deletions(-)
rename drivers/iommu/{ => amd}/amd_iommu.h (100%)
rename drivers/iommu/{ => amd}/amd_iommu_types.h (100%)
rename drivers/iommu/{amd_iommu_debugfs.c => amd/debugfs.c} (100%)
rename drivers/iommu/{amd_iommu_init.c => amd/init.c} (99%)
rename drivers/iommu/{amd_iommu.c => amd/iommu.c} (99%)
rename drivers/iommu/{amd_iommu_v2.c => amd/iommu_v2.c} (100%)
rename drivers/iommu/{amd_iommu_quirks.c => amd/quirks.c} (100%)
rename drivers/iommu/{intel-iommu-debugfs.c => intel/debugfs.c} (100%)
rename drivers/iommu/{ => intel}/dmar.c (99%)
rename drivers/iommu/{ => intel}/intel-pasid.h (100%)
rename drivers/iommu/{intel-iommu.c => intel/iommu.c} (99%)
rename drivers/iommu/{intel_irq_remapping.c => intel/irq_remapping.c} (99%)
rename drivers/iommu/{intel-pasid.c => intel/pasid.c} (100%)
rename drivers/iommu/{intel-svm.c => intel/svm.c} (100%)
rename drivers/iommu/{intel-trace.c => intel/trace.c} (100%)

--
2.17.1



Reviewed-by: Jerry Snitselaar 



Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-06-02 Thread Jerry Snitselaar

On Tue Jun 02 20, Jerry Snitselaar wrote:

On Tue Jun 02 20, Joerg Roedel wrote:

Hi Jerry,

On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote:


Yeah, that will solve the panic.



If you still see the kdump faults, can you please try with the attached
diff? I was not able to reproduce them in my setup.

Regards,

Joerg



I have another hp proliant server now, and reproduced. I will have the
patch below tested shortly. Minor change, I switched group->domain to
domain since group isn't an argument, and *data being passed in comes
from group->domain anyway.



Looks like it solves the problem for both the EPYC system and the HP ProLiant
server.


diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b5ea203f6c68..5a6d509f72b6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
static int iommu_group_do_dma_attach(struct device *dev, void *data)
{
struct iommu_domain *domain = data;
+   int ret = 0;

-   return __iommu_attach_device(domain, dev);
+   if (!iommu_is_attach_deferred(domain, dev))
+   ret = __iommu_attach_device(domain, dev);
+
+   return ret;
}

static int __iommu_group_dma_attach(struct iommu_group *group)


Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-06-02 Thread Jerry Snitselaar

On Tue Jun 02 20, Joerg Roedel wrote:

Hi Jerry,

On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote:


Yeah, that will solve the panic.



If you still see the kdump faults, can you please try with the attached
diff? I was not able to reproduce them in my setup.

Regards,

Joerg



I have another hp proliant server now, and reproduced. I will have the
patch below tested shortly. Minor change, I switched group->domain to
domain since group isn't an argument, and *data being passed in comes
from group->domain anyway.


diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b5ea203f6c68..5a6d509f72b6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus,
static int iommu_group_do_dma_attach(struct device *dev, void *data)
{
struct iommu_domain *domain = data;
+   int ret = 0;

-   return __iommu_attach_device(domain, dev);
+   if (!iommu_is_attach_deferred(domain, dev))
+   ret = __iommu_attach_device(domain, dev);
+
+   return ret;
}

static int __iommu_group_dma_attach(struct iommu_group *group)


Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-06-01 Thread Jerry Snitselaar

On Tue Jun 02 20, Lu Baolu wrote:

Hi Jerry,

On 6/1/20 6:42 PM, Jerry Snitselaar wrote:


Hi Joerg,

With this patchset, I have an epyc system where if I boot with
iommu=nopt and force a dump I will see some io page faults for a nic
on the system. The vmcore is harvested and the system reboots. I
haven't reproduced it on other systems yet, but without the patchset I
don't see the io page faults during the kdump.

Regards,
Jerry


I just hit an issue on a separate intel based system (kdump iommu=nopt),
where it panics in during intel_iommu_attach_device, in is_aux_domain,
due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't
get set to a valid address until the domain_add_dev_info call.

Is it as simple as the following?


I guess you won't hit this issue if you use iommu/next branch of Joerg's
tree. We've changed to use a generic helper to retrieve the valid per
device iommu data or NULL (if there's none).

Best regards,
baolu



Yeah, that will solve the panic. 



diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 29d3940847d3..f1bbeed46a4c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain)
 {
    struct device_domain_info *info = dev->archdata.iommu;

-   return info && info->auxd_enabled &&
-   domain->type == IOMMU_DOMAIN_UNMANAGED;
+   return info && info != DEFER_DEVICE_DOMAIN_INFO &&
+   info->auxd_enabled && domain->type == IOMMU_DOMAIN_UNMANAGED;
 }

 static void auxiliary_link_device(struct dmar_domain *domain,


Regards,
Jerry


Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-06-01 Thread Jerry Snitselaar

On Mon Jun 01 20, Jerry Snitselaar wrote:

On Fri May 29 20, Jerry Snitselaar wrote:

On Tue Apr 14 20, Joerg Roedel wrote:

Hi,

here is the second version of this patch-set. The first version with
some more introductory text can be found here:

https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Changes v1->v2:

* Rebased to v5.7-rc1

* Re-wrote the arm-smmu changes as suggested by Robin Murphy

* Re-worked the Exynos patches to hopefully not break the
  driver anymore

* Fixed a missing mutex_unlock() reported by Marek Szyprowski,
  thanks for that.

There is also a git-branch available with these patches applied:


https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2

Please review.

Thanks,

Joerg

Joerg Roedel (32):
iommu: Move default domain allocation to separate function
iommu/amd: Implement iommu_ops->def_domain_type call-back
iommu/vt-d: Wire up iommu_ops->def_domain_type
iommu/amd: Remove dma_mask check from check_device()
iommu/amd: Return -ENODEV in add_device when device is not handled by
 IOMMU
iommu: Add probe_device() and remove_device() call-backs
iommu: Move default domain allocation to iommu_probe_device()
iommu: Keep a list of allocated groups in __iommu_probe_device()
iommu: Move new probe_device path to separate function
iommu: Split off default domain allocation from group assignment
iommu: Move iommu_group_create_direct_mappings() out of
 iommu_group_add_device()
iommu: Export bus_iommu_probe() and make is safe for re-probing
iommu/amd: Remove dev_data->passthrough
iommu/amd: Convert to probe/release_device() call-backs
iommu/vt-d: Convert to probe/release_device() call-backs
iommu/arm-smmu: Convert to probe/release_device() call-backs
iommu/pamu: Convert to probe/release_device() call-backs
iommu/s390: Convert to probe/release_device() call-backs
iommu/virtio: Convert to probe/release_device() call-backs
iommu/msm: Convert to probe/release_device() call-backs
iommu/mediatek: Convert to probe/release_device() call-backs
iommu/mediatek-v1 Convert to probe/release_device() call-backs
iommu/qcom: Convert to probe/release_device() call-backs
iommu/rockchip: Convert to probe/release_device() call-backs
iommu/tegra: Convert to probe/release_device() call-backs
iommu/renesas: Convert to probe/release_device() call-backs
iommu/omap: Remove orphan_dev tracking
iommu/omap: Convert to probe/release_device() call-backs
iommu/exynos: Use first SYSMMU in controllers list for IOMMU core
iommu/exynos: Convert to probe/release_device() call-backs
iommu: Remove add_device()/remove_device() code-paths
iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
iommu: Add def_domain_type() callback in iommu_ops

drivers/iommu/amd_iommu.c   |  97 
drivers/iommu/amd_iommu_types.h |   1 -
drivers/iommu/arm-smmu-v3.c |  38 +--
drivers/iommu/arm-smmu.c|  39 ++--
drivers/iommu/exynos-iommu.c|  24 +-
drivers/iommu/fsl_pamu_domain.c |  22 +-
drivers/iommu/intel-iommu.c |  68 +-
drivers/iommu/iommu.c   | 393 +---
drivers/iommu/ipmmu-vmsa.c  |  60 ++---
drivers/iommu/msm_iommu.c   |  34 +--
drivers/iommu/mtk_iommu.c   |  24 +-
drivers/iommu/mtk_iommu_v1.c|  50 ++--
drivers/iommu/omap-iommu.c  |  99 ++--
drivers/iommu/qcom_iommu.c  |  24 +-
drivers/iommu/rockchip-iommu.c  |  26 +--
drivers/iommu/s390-iommu.c  |  22 +-
drivers/iommu/tegra-gart.c  |  24 +-
drivers/iommu/tegra-smmu.c  |  31 +--
drivers/iommu/virtio-iommu.c|  41 +---
include/linux/iommu.h   |  21 +-
20 files changed, 533 insertions(+), 605 deletions(-)

--
2.17.1




Hi Joerg,

With this patchset, I have an epyc system where if I boot with
iommu=nopt and force a dump I will see some io page faults for a nic
on the system. The vmcore is harvested and the system reboots. I
haven't reproduced it on other systems yet, but without the patchset I
don't see the io page faults during the kdump.

Regards,
Jerry


I just hit an issue on a separate Intel-based system (kdump iommu=nopt),
where it panics during intel_iommu_attach_device, in is_aux_domain,
because device_domain_info is still DEFER_DEVICE_DOMAIN_INFO. It doesn't
get set to a valid address until the domain_add_dev_info call.

Is it as simple as the following?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 29d3940847d3..f1bbeed46a4c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain)
{
   struct device_domain_info *info = dev->archdata.iommu;
-   return info && info->auxd_enabled &&
- 

Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-06-01 Thread Jerry Snitselaar

On Fri May 29 20, Jerry Snitselaar wrote:

On Tue Apr 14 20, Joerg Roedel wrote:

Hi,

here is the second version of this patch-set. The first version with
some more introductory text can be found here:

https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Changes v1->v2:

* Rebased to v5.7-rc1

* Re-wrote the arm-smmu changes as suggested by Robin Murphy

* Re-worked the Exynos patches to hopefully not break the
  driver anymore

* Fixed a missing mutex_unlock() reported by Marek Szyprowski,
  thanks for that.

There is also a git-branch available with these patches applied:


https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2

Please review.

Thanks,

Joerg

Joerg Roedel (32):
iommu: Move default domain allocation to separate function
iommu/amd: Implement iommu_ops->def_domain_type call-back
iommu/vt-d: Wire up iommu_ops->def_domain_type
iommu/amd: Remove dma_mask check from check_device()
iommu/amd: Return -ENODEV in add_device when device is not handled by
  IOMMU
iommu: Add probe_device() and remove_device() call-backs
iommu: Move default domain allocation to iommu_probe_device()
iommu: Keep a list of allocated groups in __iommu_probe_device()
iommu: Move new probe_device path to separate function
iommu: Split off default domain allocation from group assignment
iommu: Move iommu_group_create_direct_mappings() out of
  iommu_group_add_device()
iommu: Export bus_iommu_probe() and make is safe for re-probing
iommu/amd: Remove dev_data->passthrough
iommu/amd: Convert to probe/release_device() call-backs
iommu/vt-d: Convert to probe/release_device() call-backs
iommu/arm-smmu: Convert to probe/release_device() call-backs
iommu/pamu: Convert to probe/release_device() call-backs
iommu/s390: Convert to probe/release_device() call-backs
iommu/virtio: Convert to probe/release_device() call-backs
iommu/msm: Convert to probe/release_device() call-backs
iommu/mediatek: Convert to probe/release_device() call-backs
iommu/mediatek-v1 Convert to probe/release_device() call-backs
iommu/qcom: Convert to probe/release_device() call-backs
iommu/rockchip: Convert to probe/release_device() call-backs
iommu/tegra: Convert to probe/release_device() call-backs
iommu/renesas: Convert to probe/release_device() call-backs
iommu/omap: Remove orphan_dev tracking
iommu/omap: Convert to probe/release_device() call-backs
iommu/exynos: Use first SYSMMU in controllers list for IOMMU core
iommu/exynos: Convert to probe/release_device() call-backs
iommu: Remove add_device()/remove_device() code-paths
iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
iommu: Add def_domain_type() callback in iommu_ops

drivers/iommu/amd_iommu.c   |  97 
drivers/iommu/amd_iommu_types.h |   1 -
drivers/iommu/arm-smmu-v3.c |  38 +--
drivers/iommu/arm-smmu.c|  39 ++--
drivers/iommu/exynos-iommu.c|  24 +-
drivers/iommu/fsl_pamu_domain.c |  22 +-
drivers/iommu/intel-iommu.c |  68 +-
drivers/iommu/iommu.c   | 393 +---
drivers/iommu/ipmmu-vmsa.c  |  60 ++---
drivers/iommu/msm_iommu.c   |  34 +--
drivers/iommu/mtk_iommu.c   |  24 +-
drivers/iommu/mtk_iommu_v1.c|  50 ++--
drivers/iommu/omap-iommu.c  |  99 ++--
drivers/iommu/qcom_iommu.c  |  24 +-
drivers/iommu/rockchip-iommu.c  |  26 +--
drivers/iommu/s390-iommu.c  |  22 +-
drivers/iommu/tegra-gart.c  |  24 +-
drivers/iommu/tegra-smmu.c  |  31 +--
drivers/iommu/virtio-iommu.c|  41 +---
include/linux/iommu.h   |  21 +-
20 files changed, 533 insertions(+), 605 deletions(-)

--
2.17.1




Hi Joerg,

With this patchset, I have an epyc system where if I boot with
iommu=nopt and force a dump I will see some io page faults for a nic
on the system. The vmcore is harvested and the system reboots. I
haven't reproduced it on other systems yet, but without the patchset I
don't see the io page faults during the kdump.

Regards,
Jerry


I just hit an issue on a separate Intel-based system (kdump iommu=nopt),
where it panics during intel_iommu_attach_device, in is_aux_domain,
because device_domain_info is still DEFER_DEVICE_DOMAIN_INFO. It doesn't
get set to a valid address until the domain_add_dev_info call.

Is it as simple as the following?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 29d3940847d3..f1bbeed46a4c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain)
 {
struct device_domain_info *info = dev->archdata.iommu;
 
-   return info && info->auxd_enabled &&

-   domain->type == IOMMU_DOMAIN_U

Re: [PATCH v2 00/33] iommu: Move iommu_group setup to IOMMU core code

2020-05-29 Thread Jerry Snitselaar

On Tue Apr 14 20, Joerg Roedel wrote:

Hi,

here is the second version of this patch-set. The first version with
some more introductory text can be found here:

https://lore.kernel.org/lkml/20200407183742.4344-1-j...@8bytes.org/

Changes v1->v2:

* Rebased to v5.7-rc1

* Re-wrote the arm-smmu changes as suggested by Robin Murphy

* Re-worked the Exynos patches to hopefully not break the
  driver anymore

* Fixed a missing mutex_unlock() reported by Marek Szyprowski,
  thanks for that.

There is also a git-branch available with these patches applied:


https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2

Please review.

Thanks,

Joerg

Joerg Roedel (32):
 iommu: Move default domain allocation to separate function
 iommu/amd: Implement iommu_ops->def_domain_type call-back
 iommu/vt-d: Wire up iommu_ops->def_domain_type
 iommu/amd: Remove dma_mask check from check_device()
 iommu/amd: Return -ENODEV in add_device when device is not handled by
   IOMMU
 iommu: Add probe_device() and remove_device() call-backs
 iommu: Move default domain allocation to iommu_probe_device()
 iommu: Keep a list of allocated groups in __iommu_probe_device()
 iommu: Move new probe_device path to separate function
 iommu: Split off default domain allocation from group assignment
 iommu: Move iommu_group_create_direct_mappings() out of
   iommu_group_add_device()
 iommu: Export bus_iommu_probe() and make is safe for re-probing
 iommu/amd: Remove dev_data->passthrough
 iommu/amd: Convert to probe/release_device() call-backs
 iommu/vt-d: Convert to probe/release_device() call-backs
 iommu/arm-smmu: Convert to probe/release_device() call-backs
 iommu/pamu: Convert to probe/release_device() call-backs
 iommu/s390: Convert to probe/release_device() call-backs
 iommu/virtio: Convert to probe/release_device() call-backs
 iommu/msm: Convert to probe/release_device() call-backs
 iommu/mediatek: Convert to probe/release_device() call-backs
 iommu/mediatek-v1 Convert to probe/release_device() call-backs
 iommu/qcom: Convert to probe/release_device() call-backs
 iommu/rockchip: Convert to probe/release_device() call-backs
 iommu/tegra: Convert to probe/release_device() call-backs
 iommu/renesas: Convert to probe/release_device() call-backs
 iommu/omap: Remove orphan_dev tracking
 iommu/omap: Convert to probe/release_device() call-backs
 iommu/exynos: Use first SYSMMU in controllers list for IOMMU core
 iommu/exynos: Convert to probe/release_device() call-backs
 iommu: Remove add_device()/remove_device() code-paths
 iommu: Unexport iommu_group_get_for_dev()

Sai Praneeth Prakhya (1):
 iommu: Add def_domain_type() callback in iommu_ops

drivers/iommu/amd_iommu.c   |  97 
drivers/iommu/amd_iommu_types.h |   1 -
drivers/iommu/arm-smmu-v3.c |  38 +--
drivers/iommu/arm-smmu.c|  39 ++--
drivers/iommu/exynos-iommu.c|  24 +-
drivers/iommu/fsl_pamu_domain.c |  22 +-
drivers/iommu/intel-iommu.c |  68 +-
drivers/iommu/iommu.c   | 393 +---
drivers/iommu/ipmmu-vmsa.c  |  60 ++---
drivers/iommu/msm_iommu.c   |  34 +--
drivers/iommu/mtk_iommu.c   |  24 +-
drivers/iommu/mtk_iommu_v1.c|  50 ++--
drivers/iommu/omap-iommu.c  |  99 ++--
drivers/iommu/qcom_iommu.c  |  24 +-
drivers/iommu/rockchip-iommu.c  |  26 +--
drivers/iommu/s390-iommu.c  |  22 +-
drivers/iommu/tegra-gart.c  |  24 +-
drivers/iommu/tegra-smmu.c  |  31 +--
drivers/iommu/virtio-iommu.c|  41 +---
include/linux/iommu.h   |  21 +-
20 files changed, 533 insertions(+), 605 deletions(-)

--
2.17.1




Hi Joerg,

With this patchset, I have an epyc system where if I boot with
iommu=nopt and force a dump I will see some io page faults for a nic
on the system. The vmcore is harvested and the system reboots. I
haven't reproduced it on other systems yet, but without the patchset I
don't see the io page faults during the kdump.

Regards,
Jerry



Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-19 Thread Jerry Snitselaar

On Mon May 18 20, Joerg Roedel wrote:

On Fri, May 15, 2020 at 08:23:13PM +0100, Robin Murphy wrote:

But that's not what this is; this is (supposed to be) the exact same "don't
actually perform the attach yet" logic as before, just restricting it to
default domains in the one place that it actually needs to be, so as not to
fundamentally bugger up iommu_attach_device() in a way that prevents it from
working as expected at the correct point later.


You are right, that is better. I tested it and it seems to work. Updated
diff attached, with a minor cleanup included. Mind sending it as a
proper patch I can send upstream?

Thanks,

Joerg

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7b375421afba..a9d02bc3ab5b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -693,6 +693,15 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
return ret;
}

+static bool iommu_is_attach_deferred(struct iommu_domain *domain,
+struct device *dev)
+{
+   if (domain->ops->is_attach_deferred)
+   return domain->ops->is_attach_deferred(domain, dev);
+
+   return false;
+}
+
/**
 * iommu_group_add_device - add a device to an iommu group
 * @group: the group into which to add the device (reference should be held)
@@ -705,6 +714,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
{
int ret, i = 0;
struct group_device *device;
+   struct iommu_domain *domain;

device = kzalloc(sizeof(*device), GFP_KERNEL);
if (!device)
@@ -747,7 +757,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)

mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
-   if (group->domain)
+   domain = group->domain;
+   if (domain && !iommu_is_attach_deferred(domain, dev))
ret = __iommu_attach_device(group->domain, dev);
mutex_unlock(&group->mutex);
if (ret)
@@ -1653,9 +1664,6 @@ static int __iommu_attach_device(struct iommu_domain *domain,
 struct device *dev)
{
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;

if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
@@ -1727,8 +1735,7 @@ EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
static void __iommu_detach_device(struct iommu_domain *domain,
  struct device *dev)
{
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
+   if (iommu_is_attach_deferred(domain, dev))
return;

if (unlikely(domain->ops->detach_dev == NULL))



This worked for me as well.



Re: [PATCH] iommu: Implement deferred domain attachment

2020-05-18 Thread Jerry Snitselaar

On Mon May 18 20, Joerg Roedel wrote:

On Fri, May 15, 2020 at 08:23:13PM +0100, Robin Murphy wrote:

But that's not what this is; this is (supposed to be) the exact same "don't
actually perform the attach yet" logic as before, just restricting it to
default domains in the one place that it actually needs to be, so as not to
fundamentally bugger up iommu_attach_device() in a way that prevents it from
working as expected at the correct point later.


You are right, that is better. I tested it and it seems to work. Updated
diff attached, with a minor cleanup included. Mind sending it as a
proper patch I can send upstream?

Thanks,

Joerg



I should have this tested this afternoon.



Re: amd kdump failure with iommu=nopt

2020-05-14 Thread Jerry Snitselaar

On Thu May 14 20, Joerg Roedel wrote:

On Thu, May 14, 2020 at 05:36:23PM +0200, Joerg Roedel wrote:

This commit also removes the deferred attach of the device to its new
domain. Does the attached diff fix the problem for you?
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
+{
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;

ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
 }


Sorry, this didn't compile, here is an updated version that actually
compiles:


diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4050569188be..f54ebb964271 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1889,13 +1889,19 @@ void iommu_domain_free(struct iommu_domain *domain)
}
EXPORT_SYMBOL_GPL(iommu_domain_free);

-static int __iommu_attach_device(struct iommu_domain *domain,
-struct device *dev)
+static bool __iommu_is_attach_deferred(struct iommu_domain *domain,
+  struct device *dev)
+{
+   if (!domain->ops->is_attach_deferred)
+   return false;
+
+   return domain->ops->is_attach_deferred(domain, dev);
+}
+
+static int __iommu_attach_device_no_defer(struct iommu_domain *domain,
+ struct device *dev)
{
int ret;
-   if ((domain->ops->is_attach_deferred != NULL) &&
-   domain->ops->is_attach_deferred(domain, dev))
-   return 0;

if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
@@ -1903,9 +1909,19 @@ static int __iommu_attach_device(struct iommu_domain *domain,
ret = domain->ops->attach_dev(domain, dev);
if (!ret)
trace_attach_device_to_domain(dev);
+
return ret;
}

+static int __iommu_attach_device(struct iommu_domain *domain,
+struct device *dev)
+{
+   if (__iommu_is_attach_deferred(domain, dev))
+   return 0;
+
+   return __iommu_attach_device_no_defer(domain, dev);
+}
+
int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
{
struct iommu_group *group;
@@ -2023,7 +2039,12 @@ EXPORT_SYMBOL_GPL(iommu_get_domain_for_dev);
 */
struct iommu_domain *iommu_get_dma_domain(struct device *dev)
{
-   return dev->iommu_group->default_domain;
+   struct iommu_domain *domain = dev->iommu_group->default_domain;
+
+   if (__iommu_is_attach_deferred(domain, dev))
+   __iommu_attach_device_no_defer(domain, dev);
+
+   return domain;
}

/*



Yes, that works.

Tested-by: Jerry Snitselaar 



amd kdump failure with iommu=nopt

2020-05-13 Thread Jerry Snitselaar

We've seen kdump failures with recent kernels (5.5, 5.6, 5.7-rc1) on
amd systems when iommu is enabled in translation mode. In the cases so
far there has been mpt3sas involved, but I'm also seeing io page
faults for ahci right before mpt3sas has an io page fault:

[   15.156620] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xfff9b300 flags=0x0020]
[   15.166889] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xfff9b320 flags=0x0020]
[   15.177169] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   15.186100] ata4.00: failed to IDENTIFY (device reports invalid type, 
err_mask=0x0)
[   15.193786] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f730c0 flags=0x0020]
[   15.204059] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f732c0 flags=0x0020]
[   15.214327] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f734c0 flags=0x0020]
[   15.224597] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f736c0 flags=0x0020]
[   15.234867] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f738c0 flags=0x0020]
[   15.245138] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ac0 flags=0x0020]
[   15.255407] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73cc0 flags=0x0020]
[   15.265677] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ec0 flags=0x0020]
[   20.599101] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, 
err_mask=0x80)
[   20.916172] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   20.922429] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xfff9b300 flags=0x0020]
[   20.932703] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xfff9b320 flags=0x0020]
[   20.943234] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   20.949430] ata4.00: failed to IDENTIFY (device reports invalid type, 
err_mask=0x0)
[   20.957115] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f730c0 flags=0x0020]
[   20.967384] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f732c0 flags=0x0020]
[   20.977654] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f734c0 flags=0x0020]
[   20.987923] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f736c0 flags=0x0020]
[   20.998193] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f738c0 flags=0x0020]
[   21.008464] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ac0 flags=0x0020]
[   21.018733] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73cc0 flags=0x0020]
[   21.029005] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ec0 flags=0x0020]
[   26.231097] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, 
err_mask=0x80)
[   26.238415] ata2: limiting SATA link speed to 3.0 Gbps
[   26.548169] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   26.564483] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[   26.571026] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f730c0 flags=0x0020]
[   26.581301] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f732c0 flags=0x0020]
[   26.591568] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f734c0 flags=0x0020]
[   26.601839] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f736c0 flags=0x0020]
[   26.612109] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f738c0 flags=0x0020]
[   26.622377] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ac0 flags=0x0020]
[   26.632647] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73cc0 flags=0x0020]
[   26.642917] ahci :63:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT 
domain=0x0042 address=0xf1f73ec0 flags=0x0020]
[   26.654047] ata2.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, 
err_mask=0x80)
[   26.743097] xhci_hcd :05:00.3: Error while assigning device slot ID
[   26.749718] xhci_hcd :05:00.3: Max number of devices this xHCI host 
supports is 64.
[   26.757730] usb usb1-port2: couldn't allocate usb_device
[   26.987555] mpt3sas version 33.100.00.00 loaded
[   26.994668] mpt3sas_cm0: 63 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem 
(226256 kB)
[   27.060443] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page 
size to 4k
[   27.068469] mpt3sas_cm0: MSI-X vectors supported: 96
[   27.073444]   no of cores: 1, max_msix_vectors: -1
[   27.078244] 

Re: [PATCH v4 0/3] Replace private domain with per-group default domain

2020-05-12 Thread Jerry Snitselaar

On Wed May 06 20, Lu Baolu wrote:

Some devices are required to use a specific type (identity or dma) of
default domain when they are used with a vendor iommu. When the system
level default domain type is different from it, the vendor iommu driver
has to request a new default domain with either
iommu_request_dma_domain_for_dev() or iommu_request_dm_for_dev() in the
add_dev() callback. Unfortunately, these two helpers only work when the
group hasn't been assigned to any other devices, hence, some vendor iommu
driver has to use a private domain if it fails to request a new default
one.

Joerg proposed an on-going proposal which makes the default domain
framework to support configuring per-group default domain during boot
process.

https://lkml.org/lkml/2020/4/14/616
[This has been applied in iommu/next.]

Hence, there is no need to keep the private domain implementation
in the Intel IOMMU driver. This patch series aims to remove it.

Best regards,
baolu

Change log:
v3->v4:
- Make the commit message of the first patch more comprehensive.

v2->v3:
- Port necessary patches on the top of Joerg's new proposal.
  https://lkml.org/lkml/2020/4/14/616
  The per-group default domain proposed previously in this series
  will be deprecated due to a race concern between domain switching
  and device driver probing.

v1->v2:
- Rename the iommu ops callback to def_domain_type

Lu Baolu (3):
 iommu/vt-d: Allow 32bit devices to uses DMA domain
 iommu/vt-d: Allow PCI sub-hierarchy to use DMA domain
 iommu/vt-d: Apply per-device dma_ops

drivers/iommu/intel-iommu.c | 396 +++-
1 file changed, 26 insertions(+), 370 deletions(-)

--
2.17.1



Reviewed-by: Jerry Snitselaar 



Re: question about iommu_need_mapping

2020-02-20 Thread Jerry Snitselaar

On Thu Feb 20 20, Jerry Snitselaar wrote:

On Thu Feb 20 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/20 7:55, Jerry Snitselaar wrote:

Is it possible for a device to end up with dev->archdata.iommu == NULL
on iommu_need_mapping in the following instance:

1. iommu_group has dma domain for default
2. device gets private identity domain in intel_iommu_add_device
3. iommu_need_mapping gets called with that device.
4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via 
unlink_domain_info.
5. request_default_domain_for_dev exits after checking that 
group->default_domain

   exists, and group->default_domain->type is dma.
6. iommu_request_dma_domain_for_dev returns 0 from 
request_default_domain_for_dev

   and a private dma domain isn't created for the device.



Yes. It's possible.


The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
Allow devices with RMRRs to use identity domain"), because it changed
which domain the group and devices were using, but it seems like it is
still a possibility with the code. Baolu, you mentioned possibly
removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
32bit device with identity default domain") makes it sound like the
domain switch is required.


It's more "nice to have" than "required" if the iommu driver doesn't
disable swiotlb explicitly. The device access of system memory higher
than the device's addressing capability could go through the bounced
buffer implemented in swiotlb.

Best regards,
baolu


Hi Baolu,

Would this mean switching to bounce_dma_ops instead?



Never mind. I see that it would go into the dma_direct code.


Regards,
Jerry



Re: question about iommu_need_mapping

2020-02-20 Thread Jerry Snitselaar

On Thu Feb 20 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/20 7:55, Jerry Snitselaar wrote:

Is it possible for a device to end up with dev->archdata.iommu == NULL
on iommu_need_mapping in the following instance:

1. iommu_group has dma domain for default
2. device gets private identity domain in intel_iommu_add_device
3. iommu_need_mapping gets called with that device.
4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via 
unlink_domain_info.
5. request_default_domain_for_dev exits after checking that 
group->default_domain

   exists, and group->default_domain->type is dma.
6. iommu_request_dma_domain_for_dev returns 0 from 
request_default_domain_for_dev

   and a private dma domain isn't created for the device.



Yes. It's possible.


The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
Allow devices with RMRRs to use identity domain"), because it changed
which domain the group and devices were using, but it seems like it is
still a possibility with the code. Baolu, you mentioned possibly
removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
32bit device with identity default domain") makes it sound like the
domain switch is required.


It's more "nice to have" than "required" if the iommu driver doesn't
disable swiotlb explicitly. The device access of system memory higher
than the device's addressing capability could go through the bounced
buffer implemented in swiotlb.

Best regards,
baolu


Hi Baolu,

Would this mean switching to bounce_dma_ops instead?

Regards,
Jerry



question about iommu_need_mapping

2020-02-19 Thread Jerry Snitselaar

Is it possible for a device to end up with dev->archdata.iommu == NULL
on iommu_need_mapping in the following instance:

1. iommu_group has dma domain for default
2. device gets private identity domain in intel_iommu_add_device
3. iommu_need_mapping gets called with that device.
4. dmar_remove_one_dev_info sets dev->archdata.iommu = NULL via 
unlink_domain_info.
5. request_default_domain_for_dev exits after checking that 
group->default_domain
   exists, and group->default_domain->type is dma.
6. iommu_request_dma_domain_for_dev returns 0 from 
request_default_domain_for_dev
   and a private dma domain isn't created for the device.

The case I was seeing went away with commit 9235cb13d7d1 ("iommu/vt-d:
Allow devices with RMRRs to use identity domain"), because it changed
which domain the group and devices were using, but it seems like it is
still a possibility with the code. Baolu, you mentioned possibly
removing the domain switch. Commit 98b2fffb5e27 ("iommu/vt-d: Handle
32bit device with identity default domain") makes it sound like the
domain switch is required.

Regards,
Jerry



Re: dmar fault right around domain switch in iommu_need_mapping

2020-02-18 Thread Jerry Snitselaar

On Wed Feb 19 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/18 23:45, Jerry Snitselaar wrote:

Hi Joerg and Baolu,

I'm chasing down one last issue. I'm waiting to hear back from them
testing with Joerg's patchset, but I'm guessing this will still pop
up. It looks like right around when the domain switch occurs in
iommu_need_mapping there are some dmar faults (below is from 5.6-rc1
plus earlier fix attempt that moved deferred attach to beginning of
iommu_need_mapping):

[   12.546920] DMAR: DRHD: handling fault status reg 2
[   12.546923] DMAR: [DMA Read] Request device [02:00.0] PASID 
 fault addr 791dd000 [fault reason 02] Present bit in 
context entry is clear

[   12.635193] hpsa :02:00.0: Using iommu dma mapping
[   12.776712] hpsa :02:00.0: DMAR: 32bit DMA uses non-identity mapping

[   14.091219] DMAR: [DMA Read] Request device [07:00.0] PASID 
 fault addr 791dd000 [fault reason 02] Present bit in 
context entry is clear

[   14.180842] DMAR: DRHD: handling fault status reg 202
[   14.180845] DMAR: [DMA Read] Request device [07:00.0] PASID 
 fault addr 791dd000 [fault reason 02] Present bit in 
context entry is clear

[   14.268756] DMAR: DRHD: handling fault status reg 302
[   15.542551] hpsa :07:00.0: Using iommu dma mapping
[   15.567256] hpsa :07:00.0: DMAR: 32bit DMA uses non-identity mapping


It seems to only happen right then, and then things are fine. Happens
during both regular and kdump boot.  With the kdump boot the faults
are from the hpilo in the logs I'm looking at, so it doesn't seem to
be tied to a device, or certain rmrr. The faulting address always
seems to be the base address of the rmrr.  The dmar tables look sane.


Perhaps like this?

The device was boot with an identity domain (iommu=pt). When loading the
driver for this device, iommu driver finds that it's a 32-bit device and
tries to convert it to DMA domain. The rmrr is still active during the
switch, hence you see dma faults during that time window.

Best regards,
baolu



It looks like it doesn't occur with Joerg's patchset.


Re: [PATCH 3/5 v2] iommu/vt-d: Do deferred attachment in iommu_need_mapping()

2020-02-18 Thread Jerry Snitselaar

On Tue Feb 18 20, Joerg Roedel wrote:

Hi Baolu,

On Tue, Feb 18, 2020 at 10:38:14AM +0800, Lu Baolu wrote:

> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 42cdcce1602e..32f43695a22b 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2541,9 +2541,6 @@ static void do_deferred_attach(struct device *dev)
>   static struct dmar_domain *deferred_attach_domain(struct device *dev)
>   {
> -  if (unlikely(attach_deferred(dev)))
> -  do_deferred_attach(dev);
> -

This should also be moved to the call place of deferred_attach_domain()
in bounce_map_single().

bounce_map_single() assumes that devices always use DMA domain, so it
doesn't call iommu_need_mapping(). We could do_deferred_attach() there
manually.
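A rough sketch of the pattern being discussed — completing the deferred attachment by hand before using the device's domain, as proposed for bounce_map_single(). The structures below are hypothetical stand-ins for illustration, not the kernel's actual VT-d types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VT-d structures (not the kernel API). */
struct dmar_domain { int id; };

struct device_ctx {
	bool attach_deferred;       /* translation was pre-enabled (e.g. kdump) */
	struct dmar_domain *domain; /* NULL until the attachment completes */
};

static struct dmar_domain dma_domain = { .id = 1 };

/* Replays the domain attachment that was deferred at probe time. */
static void do_deferred_attach(struct device_ctx *dev)
{
	dev->attach_deferred = false;
	dev->domain = &dma_domain;
}

/*
 * Pattern proposed for bounce_map_single(): it assumes a DMA domain and
 * never calls iommu_need_mapping(), so it must run the deferred attach
 * itself before touching the device's domain.
 */
static struct dmar_domain *bounce_map_get_domain(struct device_ctx *dev)
{
	if (dev->attach_deferred)
		do_deferred_attach(dev);
	return dev->domain;
}
```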


Good point, thanks for your review. Updated patch below.

From 3a5b8a66d288d86ac1fd45092e7d96f842d0cccf Mon Sep 17 00:00:00 2001
From: Joerg Roedel 
Date: Mon, 17 Feb 2020 17:20:59 +0100
Subject: [PATCH 3/5] iommu/vt-d: Do deferred attachment in
iommu_need_mapping()

The attachment of deferred devices needs to happen before the check
whether the device is identity mapped or not. Otherwise the check will
return wrong results, causing warnings and boot failures in kdump kernels, like

WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 
domain_get_iommu+0x61/0x70

[...]

 Call Trace:
  __intel_map_single+0x55/0x190
  intel_alloc_coherent+0xac/0x110
  dmam_alloc_attrs+0x50/0xa0
  ahci_port_start+0xfb/0x1f0 [libahci]
  ata_host_start.part.39+0x104/0x1e0 [libata]

With the earlier check the kdump boot succeeds and a crashdump is written.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



dmar fault right around domain switch in iommu_need_mapping

2020-02-18 Thread Jerry Snitselaar

Hi Joerg and Baolu,

I'm chasing down one last issue. I'm waiting to hear back from them
testing with Joerg's patchset, but I'm guessing this will still pop
up. It looks like right around when the domain switch occurs in
iommu_need_mapping there are some dmar faults (below is from 5.6-rc1
plus earlier fix attempt that moved deferred attach to beginning of
iommu_need_mapping):

[   12.546920] DMAR: DRHD: handling fault status reg 2
[   12.546923] DMAR: [DMA Read] Request device [02:00.0] PASID  fault 
addr 791dd000 [fault reason 02] Present bit in context entry is clear
[   12.635193] hpsa :02:00.0: Using iommu dma mapping
[   12.776712] hpsa :02:00.0: DMAR: 32bit DMA uses non-identity mapping

[   14.091219] DMAR: [DMA Read] Request device [07:00.0] PASID  fault 
addr 791dd000 [fault reason 02] Present bit in context entry is clear
[   14.180842] DMAR: DRHD: handling fault status reg 202
[   14.180845] DMAR: [DMA Read] Request device [07:00.0] PASID  fault 
addr 791dd000 [fault reason 02] Present bit in context entry is clear
[   14.268756] DMAR: DRHD: handling fault status reg 302
[   15.542551] hpsa :07:00.0: Using iommu dma mapping
[   15.567256] hpsa :07:00.0: DMAR: 32bit DMA uses non-identity mapping


It seems to only happen right then, and then things are fine. It happens
during both regular and kdump boot. With the kdump boot the faults
are from the hpilo in the logs I'm looking at, so it doesn't seem to
be tied to a particular device or a certain rmrr. The faulting address
always seems to be the base address of the rmrr. The dmar tables look sane.

Regards,
Jerry



Re: [PATCH 5/5] iommu/vt-d: Simplify check in identity_mapping()

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Joerg Roedel wrote:

From: Joerg Roedel 

The function only has one call-site and there it is never called with
dummy or deferred devices. Simplify the check in the function to
account for that.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



Re: [PATCH 3/5] iommu/vt-d: Do deferred attachment in iommu_need_mapping()

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Joerg Roedel wrote:

From: Joerg Roedel 

The attachment of deferred devices needs to happen before the check
whether the device is identity mapped or not. Otherwise the check will
return wrong results, causing warnings and boot failures in kdump kernels, like

WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 
domain_get_iommu+0x61/0x70

[...]

 Call Trace:
  __intel_map_single+0x55/0x190
  intel_alloc_coherent+0xac/0x110
  dmam_alloc_attrs+0x50/0xa0
  ahci_port_start+0xfb/0x1f0 [libahci]
  ata_host_start.part.39+0x104/0x1e0 [libata]

With the earlier check the kdump boot succeeds and a crashdump is written.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



Re: [PATCH 2/5] iommu/vt-d: Move deferred device attachment into helper function

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Joerg Roedel wrote:

From: Joerg Roedel 

Move the code that does the deferred device attachment into a separate
helper function.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



Re: [PATCH 4/5] iommu/vt-d: Remove deferred_attach_domain()

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Joerg Roedel wrote:

From: Joerg Roedel 

The function is now only a wrapper around find_domain(). Remove the
function and call find_domain() directly at the call-sites.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



Re: [PATCH 1/5] iommu/vt-d: Add attach_deferred() helper

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Joerg Roedel wrote:

From: Joerg Roedel 

Implement a helper function to check whether a device's attach process
is deferred.

Signed-off-by: Joerg Roedel 


Reviewed-by: Jerry Snitselaar 



Re: arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1

2020-02-17 Thread Jerry Snitselaar

On Mon Feb 17 20, Robin Murphy wrote:

On 16/02/2020 10:11 pm, Jerry Snitselaar wrote:

On Fri Feb 14 20, Robin Murphy wrote:

Hi Jerry,

On 2020-02-14 8:13 pm, Jerry Snitselaar wrote:

Hi Will,

On a gigabyte system with Cavium CN8xx, when doing a fio test against
an nvme drive we are seeing the following:

[  637.161194] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.174329] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x80136000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.186887] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.199275] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.211885] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x801000392000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.224580] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x80118000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.237241] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x80100036, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.249657] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x801ba000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.262120] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x8013e000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.274468] arm-smmu arm-smmu.1.auto: Unhandled context 
fault: fsr=0x8402, iova=0x801000304000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7


Those "IOVAs" don't look much like IOVAs from the DMA allocator - 
if they were physical addresses, would they correspond to an 
expected region of the physical memory map?


I would suspect that this is most likely misbehaviour in the NVMe 
driver (issuing a write to a non-DMA-mapped address), and the SMMU 
is just doing its job in blocking and reporting it.


I also reproduced with 5.5-rc7, and will check 5.6-rc1 later 
today. I couldn't narrow it down further into 5.4-rc1.
I don't know smmu or the code well, any thoughts on where to 
start digging into this?


fio test that is being run is:

#fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite 
-ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting 
-name=mytest -numjobs=32


Just to clarify, do other tests work OK on the same device?

Thanks,
Robin.



I was able to get back on the system today. I think I know what the 
problem is:


[    0.036189] iommu: Gigabyte R120-T34-00 detected, force iommu passthrough mode

[    6.324282] iommu: Default domain type: Translated

So the new default domain code in 5.4 overrides the iommu quirk code setting
default passthrough. Testing a quick patch that tracks whether the default
domain was set in the quirk code, and leaves it alone if it was. So far it
seems to be working.


Ah, OK. Could you point me at that quirk code? I can't seem to track 
it down in mainline, and seeing this much leaves me dubious that it's 
even correct - matching a particular board implies that it's a 
firmware issue (as far as I'm aware the SMMUs in CN88xx SoCs are 
usable in general), but if the firmware description is wrong to the 
point that DMA ops translation doesn't work, then no other translation 
(e.g. VFIO) is likely to work either. In that case it's simply not 
safe to enable the SMMU at all, and fudging the default domain type 
merely hides one symptom of the problem.


Robin.



Ugh. It is a RHEL only patch, but for some reason it is applied to the
ark kernel builds as well. Sorry for the noise.


Re: arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1

2020-02-16 Thread Jerry Snitselaar

On Fri Feb 14 20, Robin Murphy wrote:

Hi Jerry,

On 2020-02-14 8:13 pm, Jerry Snitselaar wrote:

Hi Will,

On a gigabyte system with Cavium CN8xx, when doing a fio test against
an nvme drive we are seeing the following:

[  637.161194] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.174329] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80136000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.186887] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.199275] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.211885] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801000392000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.224580] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80118000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.237241] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80100036, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.249657] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801ba000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.262120] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8013e000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7
[  637.274468] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801000304000, fsynr=0x70091, 
cbfrsynra=0x9000, cb=7


Those "IOVAs" don't look much like IOVAs from the DMA allocator - if 
they were physical addresses, would they correspond to an expected 
region of the physical memory map?


I would suspect that this is most likely misbehaviour in the NVMe 
driver (issuing a write to a non-DMA-mapped address), and the SMMU is 
just doing its job in blocking and reporting it.


I also reproduced with 5.5-rc7, and will check 5.6-rc1 later today. 
I couldn't narrow it down further into 5.4-rc1.
I don't know smmu or the code well, any thoughts on where to start 
digging into this?


fio test that is being run is:

#fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite 
-ioengine=libaio -bs=4k -runtime=43200 -size=-group_reporting 
-name=mytest -numjobs=32


Just to clarify, do other tests work OK on the same device?

Thanks,
Robin.



I was able to get back on the system today. I think I know what the problem is:

[0.036189] iommu: Gigabyte R120-T34-00 detected, force iommu passthrough mode
[6.324282] iommu: Default domain type: Translated

So the new default domain code in 5.4 overrides the iommu quirk code setting
default passthrough. Testing a quick patch that tracks whether the default
domain was set in the quirk code, and leaves it alone if it was. So far it
seems to be working.
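The idea behind that quick patch — remembering that a quirk already chose the default domain type and leaving it alone afterwards — could be modeled roughly as below. All names here are made up for illustration; this is not the actual patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up model of the interaction: a board quirk forces passthrough
 * early, and the later default-domain code must not override it. */
enum def_domain { DOMAIN_DMA, DOMAIN_IDENTITY };

static enum def_domain iommu_def_domain = DOMAIN_DMA;
static bool def_domain_set_by_quirk;

/* Early platform quirk: force passthrough (identity) as the default. */
static void quirk_force_passthrough(void)
{
	iommu_def_domain = DOMAIN_IDENTITY;
	def_domain_set_by_quirk = true;
}

/* Later default-domain selection: respect a quirk-made choice. */
static void iommu_set_def_domain(enum def_domain type)
{
	if (def_domain_set_by_quirk)
		return; /* leave the quirk's choice alone */
	iommu_def_domain = type;
}
```

As Robin notes in his reply, this only papers over one symptom if the firmware description is wrong enough that translation is unsafe in general.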

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

arm-smmu.1.auto: Unhandled context fault starting with 5.4-rc1

2020-02-14 Thread Jerry Snitselaar

Hi Will,

On a gigabyte system with Cavium CN8xx, when doing a fio test against
an nvme drive we are seeing the following:

[  637.161194] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010003f6000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.174329] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80136000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.186887] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010002ee000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.199275] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8010003c7000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.211885] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801000392000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.224580] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80118000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.237241] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x80100036, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.249657] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801ba000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.262120] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x8013e000, fsynr=0x70091, cbfrsynra=0x9000, cb=7
[  637.274468] arm-smmu arm-smmu.1.auto: Unhandled context fault: 
fsr=0x8402, iova=0x801000304000, fsynr=0x70091, cbfrsynra=0x9000, cb=7

I also reproduced with 5.5-rc7, and will check 5.6-rc1 later today. I
couldn't narrow it down further within 5.4-rc1.
I don't know smmu or the code well; any thoughts on where to start digging
into this?

fio test that is being run is:

#fio -filename=/dev/nvme0n1 -iodepth=64 -thread -rw=randwrite -ioengine=libaio 
-bs=4k -runtime=43200 -size=-group_reporting -name=mytest -numjobs=32


Regards,
Jerry



Re: warning from domain_get_iommu

2020-02-08 Thread Jerry Snitselaar

On Sat Feb 08 20, Lu Baolu wrote:

Hi Jerry,

On 2020/2/7 17:34, Jerry Snitselaar wrote:

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:
I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[    2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller 
(EHCI) Driver

[    2.832615] ehci-pci: EHCI PCI platform driver
[    2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[    2.835974] ehci-pci :00:1a.0: new USB bus registered, 
assigned bus number 1

[    2.838276] ehci-pci :00:1a.0: debug port 2
[    2.839700] WARNING: CPU: 0 PID: 1 at 
drivers/iommu/intel-iommu.c:598 domain_get_iommu+0x55/0x60

[    2.840671] Modules linked in:
[    2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[    2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant 
DL360 Gen9, BIOS P89 07/21/2019

[    2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[    2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 
0b 48 63 c8 48 39 c2 75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 
91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 31 c9 eb eb 66 90 0f 1f 44 
00 00 41 55 40 0f b6 f6

[    2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[    2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 

[    2.840671] RDX: fff0 RSI:  RDI: 
88ec7f1c8000
[    2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 
88ec7cbfcd00
[    2.840671] R10: 0095 R11: c90df928 R12: 

[    2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 

[    2.840671] FS:  () 
GS:88ec7f60() knlGS:

[    2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[    2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 
001606b0

[    2.840671] Call Trace:
[    2.840671]  __intel_map_single+0x62/0x140
[    2.840671]  intel_alloc_coherent+0xa6/0x130
[    2.840671]  dma_pool_alloc+0xd8/0x1e0
[    2.840671]  e_qh_alloc+0x55/0x130
[    2.840671]  ehci_setup+0x284/0x7b0
[    2.840671]  ehci_pci_setup+0xa3/0x530
[    2.840671]  usb_add_hcd+0x2b6/0x800
[    2.840671]  usb_hcd_pci_probe+0x375/0x460
[    2.840671]  local_pci_probe+0x41/0x90
[    2.840671]  pci_device_probe+0x105/0x1b0
[    2.840671]  driver_probe_device+0x12d/0x460
[    2.840671]  device_driver_attach+0x50/0x60
[    2.840671]  __driver_attach+0x61/0x130
[    2.840671]  ? device_driver_attach+0x60/0x60
[    2.840671]  bus_for_each_dev+0x77/0xc0
[    2.840671]  ? klist_add_tail+0x3b/0x70
[    2.840671]  bus_add_driver+0x14d/0x1e0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  driver_register+0x6b/0xb0
[    2.840671]  ? ehci_hcd_init+0xaa/0xaa
[    2.840671]  do_one_initcall+0x46/0x1c3
[    2.840671]  ? do_early_param+0x91/0x91
[    2.840671]  kernel_init_freeable+0x1af/0x258
[    2.840671]  ? rest_init+0xaa/0xaa
[    2.840671]  kernel_init+0xa/0xf9
[    2.840671]  ret_from_fork+0x35/0x40
[    2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[    3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[    3.012551] ehci-pci :00:1a.0: 32bit DMA uses 
non-identity mapping
[    3.018537] ehci-pci :00:1a.0: cache line size of 64 is 
not supported

[    3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[    3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[    3.030918] usb usb1: New USB device found, idVendor=1d6b, 
idProduct=0002, bcdDevice= 4.18
[    3.033491] usb usb1: New USB device strings: Mfr=3, 
Product=2, SerialNumber=1

[    3.035900] usb usb1: Product: EHCI Host Controller
[    3.037423] usb usb1: Manufacturer: Linux 
4.18.0-170.el8.kdump2.x86_64 ehci_hcd

[    3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs dma mapping instead of the default
passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
calls __intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

one thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_

Re: warning from domain_get_iommu

2020-02-07 Thread Jerry Snitselaar

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Tue Feb 04 20, Jerry Snitselaar wrote:

I'm working on getting a system to reproduce this, and verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on a hp dl360 gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs dma mapping instead of the default
passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
calls __intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

one thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
 goto error;

It is possible to deref the null pointer later otherwise.
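As a rough illustration of the guard being suggested here (hypothetical stubs; the real domain_get_iommu() and __intel_map_single() live in drivers/iommu/intel-iommu.c):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VT-d types. */
struct intel_iommu { int id; };
struct dmar_domain { struct intel_iommu *iommu; };

/* Models domain_get_iommu(), which can return NULL. */
static struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
{
	return domain ? domain->iommu : NULL;
}

/*
 * The defensive check being suggested for __intel_map_single(): take the
 * error path (the "goto error" in the real function) instead of
 * dereferencing a NULL iommu later. Returns 0 on success, -1 on error.
 */
static int map_single(struct dmar_domain *domain)
{
	struct intel_iommu *iommu = domain_get_iommu(domain);

	if (!iommu)
		return -1;

	/* ...the mapping would proceed here using iommu... */
	return 0;
}
```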

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.


Hi Baolu,

I think I understand what is happening here. With the kdump boot
translation is pre-enabled, so in intel_iommu_add_device things are
getting set to DEFER_DEVICE_DOMAIN_INFO. When intel_alloc_coherent
calls iommu_need_mapping it returns true, but doesn't do the dma
domain switch because of DEFER_DEVICE_DOMAIN_INFO. Then
__intel_map_single gets called and it calls deferred_attach_domain,
which sets the domain to the group domain, which in this case is the
identity domain. Then it

Re: Seeing some another issue with mixed domains in the same iommu_group

2020-02-06 Thread Jerry Snitselaar

On Thu Feb 06 20, Jerry Snitselaar wrote:



...

The above cases seem to be avoided by:

9235cb13d7d1 | 2020-01-24 | iommu/vt-d: Allow devices with RMRRs to use 
identity domain (Lu Baolu)

which results in the watchdog device no longer taking a dma domain and 
switching the group default.


Without that patch though when it gets into the iommu_need_mapping code for 
:01:00.4 after
the following:

dmar_remove_one_dev_info(dev);
ret = iommu_request_dma_domain_for_dev(dev);

ret is 0 and dev->archdata.iommu is NULL. Even with 9235cb13d7d1,
device_def_domain_type can still return dma, but I'm not sure how likely it
is for there to be an iommu group like that again where the group default
ends up dma, a device gets removed and added to the identity domain, and
then ends up in that code in iommu_need_mapping.




Hi Baolu,

Would something along these lines makes sense?

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf89..40cc8f5a3ebb 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3614,6 +3614,20 @@ static bool iommu_need_mapping(struct device *dev)
}
dmar_remove_one_dev_info(dev);
get_private_domain_for_dev(dev);
+   } else {
+   if (dev->archdata.iommu == NULL) {
+   struct iommu_domain *domain;
+   struct iommu_group *group;
+   struct dmar_domain *dmar_domain, *tmp;
+
+   group = iommu_group_get_for_dev(dev);
+   domain = iommu_group_default_domain(group);
+   dmar_domain = to_dmar_domain(domain);
+   tmp = set_domain_for_dev(dev, dmar_domain);
+   }
}
 
 		dev_info(dev, "32bit DMA uses non-identity mapping\n");

--

Obviously needs some checks added, but this was just an initial test I
was trying.

Regards,
Jerry



Re: Seeing some another issue with mixed domains in the same iommu_group

2020-02-06 Thread Jerry Snitselaar

On Thu Feb 06 20, Jerry Snitselaar wrote:

On Thu Feb 06 20, Jerry Snitselaar wrote:

Hi Baolu,

I'm seeing another issue with the devices in the HP ilo when the
system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not
run into problems).

first system:

01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard 
Slave Instrumentation & System Support (rev 05)
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH
01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging (rev 05)
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard 
Virtual USB Controller (rev 02)

[   21.208103] pci :01:00.0: Adding to iommu group 24
[   21.210911] pci :01:00.0: Using iommu dma mapping
[   21.212635] pci :01:00.1: Adding to iommu group 24
[   21.214326] pci :01:00.1: Device uses a private identity domain.
[   21.216507] pci :01:00.2: Adding to iommu group 24
[   21.618173] pci :01:00.4: Adding to iommu group 24
[   21.619839] pci :01:00.4: Device uses a private identity domain.

[   26.206832] uhci_hcd: USB Universal Host Controller Interface driver
[   26.209044] uhci_hcd :01:00.4: UHCI Host Controller
[   26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus 
number 3
[   26.213247] uhci_hcd :01:00.4: detected 8 ports
[   26.214810] uhci_hcd :01:00.4: port count misdetected? forcing to 2 ports
[   26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping
[   26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for 
frame list
[   26.223787] uhci_hcd :01:00.4: startup error -16
[   26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered
[   26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   26.229296] uhci_hcd: probe of :01:00.4 failed with error -16


different system with similar issue:

01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Slave Instrumentation & System Support [103c:3306] (rev 07)
01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA 
G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller])
01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging [103c:3307] (rev 07)
01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB 
Controller [103c:22f6] (prog-if 20 [EHCI])

[   13.695663] pci :01:00.0: Adding to iommu group 10
[   13.703667] pci :01:00.0: Using iommu dma mapping
[   13.708871] pci :01:00.1: Adding to iommu group 10
[   13.714033] pci :01:00.1: DMAR: Device uses a private identity domain.
[   13.721033] pci :01:00.2: Adding to iommu group 10
[   13.726290] pci :01:00.4: Adding to iommu group 10
[   13.731453] pci :01:00.4: DMAR: Device uses a private identity domain.

[   17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   17.164348] ehci-pci: EHCI PCI platform driver
[   17.170061] ehci-pci :01:00.4: EHCI Host Controller
[   17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus 
number 1
[   17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping
[   17.189988] ehci-pci :01:00.4: can't setup: -12
[   17.194884] ehci-pci :01:00.4: USB bus 1 deregistered
[   17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12
[   17.206508] ehci-pci: probe of :01:00.4 failed with error -12


I'm looking through the code and trying to debug it, but any thoughts on this?

Regards,
Jerry


In iommu_need_mapping, in a case like the above, does something like
dmar_insert_one_dev_info need to happen to associate the device back with
the group default domain? In intel_iommu_add_device it is going to get
removed and added to the identity domain, and then in iommu_need_mapping it
gets removed from the identity domain, and iommu_request_dma_domain_for_dev
should return 0 because the group default domain at this point is the
correct type.


The above cases seem to be avoided by:

9235cb13d7d1 | 2020-01-24 | iommu/vt-d: Allow devices with RMRRs to use 
identity domain (Lu Baolu)

which results in the watchdog device no longer taking a dma domain and 
switching the group default.


Without that patch though when it gets into the iommu_need_mapping code for 
:01:00.4 after
the following:

dmar_remove_one_dev_info(dev);
ret = iommu_request_dma_domain_for_dev(dev);

ret is 0 and dev->archdata.iommu is NULL. Even with 9235cb13d7d1 
device_def_domain_type can return
return dma, but I'm not sure how likely it is for there to be an iommu group 
like that again where
the group default ends up dma, a device gets removed and added to the identity 
domain, and then
ends up in that code in iommu_n

Re: Seeing some another issue with mixed domains in the same iommu_group

2020-02-06 Thread Jerry Snitselaar

On Thu Feb 06 20, Jerry Snitselaar wrote:

Hi Baolu,

I'm seeing another issue with the devices in the HP iLO when the
system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not
run into problems).

first system:

01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard 
Slave Instrumentation & System Support (rev 05)
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH
01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging (rev 05)
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard 
Virtual USB Controller (rev 02)

[   21.208103] pci :01:00.0: Adding to iommu group 24
[   21.210911] pci :01:00.0: Using iommu dma mapping
[   21.212635] pci :01:00.1: Adding to iommu group 24
[   21.214326] pci :01:00.1: Device uses a private identity domain.
[   21.216507] pci :01:00.2: Adding to iommu group 24
[   21.618173] pci :01:00.4: Adding to iommu group 24
[   21.619839] pci :01:00.4: Device uses a private identity domain.

[   26.206832] uhci_hcd: USB Universal Host Controller Interface driver
[   26.209044] uhci_hcd :01:00.4: UHCI Host Controller
[   26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus 
number 3
[   26.213247] uhci_hcd :01:00.4: detected 8 ports
[   26.214810] uhci_hcd :01:00.4: port count misdetected? forcing to 2 ports
[   26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping
[   26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for 
frame list
[   26.223787] uhci_hcd :01:00.4: startup error -16
[   26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered
[   26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   26.229296] uhci_hcd: probe of :01:00.4 failed with error -16


different system with similar issue:

01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Slave Instrumentation & System Support [103c:3306] (rev 07)
01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA 
G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller])
01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging [103c:3307] (rev 07)
01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB 
Controller [103c:22f6] (prog-if 20 [EHCI])

[   13.695663] pci :01:00.0: Adding to iommu group 10
[   13.703667] pci :01:00.0: Using iommu dma mapping
[   13.708871] pci :01:00.1: Adding to iommu group 10
[   13.714033] pci :01:00.1: DMAR: Device uses a private identity domain.
[   13.721033] pci :01:00.2: Adding to iommu group 10
[   13.726290] pci :01:00.4: Adding to iommu group 10
[   13.731453] pci :01:00.4: DMAR: Device uses a private identity domain.

[   17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   17.164348] ehci-pci: EHCI PCI platform driver
[   17.170061] ehci-pci :01:00.4: EHCI Host Controller
[   17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus 
number 1
[   17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping
[   17.189988] ehci-pci :01:00.4: can't setup: -12
[   17.194884] ehci-pci :01:00.4: USB bus 1 deregistered
[   17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12
[   17.206508] ehci-pci: probe of :01:00.4 failed with error -12


I'm looking through the code and trying to debug it, but any thoughts on this?

Regards,
Jerry


In iommu_need_mapping, in a case like the above does something like 
dmar_insert_one_dev_info need to
happen to associate the device back with the group default domain? In 
intel_iommu_add_device it is
going to get removed and added to the identity domain, and then in 
iommu_need_mapping it gets removed
from the identity domain, and iommu_request_dma_domain_for_dev should return 0 
because the group
default domain at this point is the correct type.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Seeing some another issue with mixed domains in the same iommu_group

2020-02-06 Thread Jerry Snitselaar

Hi Baolu,

I'm seeing another issue with the devices in the HP iLO when the
system is booted with intel_iommu=on and iommu=pt (iommu=nopt does not
run into problems).

first system:

01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard 
Slave Instrumentation & System Support (rev 05)
01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH
01:00.2 System peripheral: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging (rev 05)
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard 
Virtual USB Controller (rev 02)

[   21.208103] pci :01:00.0: Adding to iommu group 24
[   21.210911] pci :01:00.0: Using iommu dma mapping
[   21.212635] pci :01:00.1: Adding to iommu group 24
[   21.214326] pci :01:00.1: Device uses a private identity domain.
[   21.216507] pci :01:00.2: Adding to iommu group 24
[   21.618173] pci :01:00.4: Adding to iommu group 24
[   21.619839] pci :01:00.4: Device uses a private identity domain.

[   26.206832] uhci_hcd: USB Universal Host Controller Interface driver
[   26.209044] uhci_hcd :01:00.4: UHCI Host Controller
[   26.210897] uhci_hcd :01:00.4: new USB bus registered, assigned bus 
number 3
[   26.213247] uhci_hcd :01:00.4: detected 8 ports
[   26.214810] uhci_hcd :01:00.4: port count misdetected? forcing to 2 ports
[   26.217153] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   26.219171] uhci_hcd :01:00.4: 32bit DMA uses non-identity mapping
[   26.221261] uhci_hcd :01:00.4: unable to allocate consistent memory for 
frame list
[   26.223787] uhci_hcd :01:00.4: startup error -16
[   26.225381] uhci_hcd :01:00.4: USB bus 3 deregistered
[   26.227378] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   26.229296] uhci_hcd: probe of :01:00.4 failed with error -16


different system with similar issue:

01:00.0 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Slave Instrumentation & System Support [103c:3306] (rev 07)
01:00.1 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA 
G200eH3 [102b:0538] (rev 02) (prog-if 00 [VGA controller])
01:00.2 System peripheral [0880]: Hewlett-Packard Company Integrated Lights-Out 
Standard Management Processor Support and Messaging [103c:3307] (rev 07)
01:00.4 USB controller [0c03]: Hewlett-Packard Company iLO5 Virtual USB 
Controller [103c:22f6] (prog-if 20 [EHCI])

[   13.695663] pci :01:00.0: Adding to iommu group 10
[   13.703667] pci :01:00.0: Using iommu dma mapping
[   13.708871] pci :01:00.1: Adding to iommu group 10
[   13.714033] pci :01:00.1: DMAR: Device uses a private identity domain.
[   13.721033] pci :01:00.2: Adding to iommu group 10
[   13.726290] pci :01:00.4: Adding to iommu group 10
[   13.731453] pci :01:00.4: DMAR: Device uses a private identity domain.

[   17.157796] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   17.164348] ehci-pci: EHCI PCI platform driver
[   17.170061] ehci-pci :01:00.4: EHCI Host Controller
[   17.175457] ehci-pci :01:00.4: new USB bus registered, assigned bus 
number 1
[   17.182912] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity mapping
[   17.189988] ehci-pci :01:00.4: can't setup: -12
[   17.194884] ehci-pci :01:00.4: USB bus 1 deregistered
[   17.200567] ehci-pci :01:00.4: init :01:00.4 fail, -12
[   17.206508] ehci-pci: probe of :01:00.4 failed with error -12


I'm looking through the code and trying to debug it, but any thoughts on this?

Regards,
Jerry



Re: warning from domain_get_iommu

2020-02-06 Thread Jerry Snitselaar

On Tue Feb 04 20, Jerry Snitselaar wrote:

I'm working on getting a system to reproduce this and to verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on an HP DL360 Gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs dma mapping instead of the default
passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
calls __intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

one thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
  goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry


I reproduced the warning with a 5.5 kernel on an Intel NUC5i5MYBE.



warning from domain_get_iommu

2020-02-04 Thread Jerry Snitselaar

I'm working on getting a system to reproduce this and to verify it also occurs
with 5.5, but I have a report of a case where the kdump kernel gives
warnings like the following on an HP DL360 Gen9:

[2.830589] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[2.832615] ehci-pci: EHCI PCI platform driver
[2.834190] ehci-pci :00:1a.0: EHCI Host Controller
[2.835974] ehci-pci :00:1a.0: new USB bus registered, assigned bus 
number 1
[2.838276] ehci-pci :00:1a.0: debug port 2
[2.839700] WARNING: CPU: 0 PID: 1 at drivers/iommu/intel-iommu.c:598 
domain_get_iommu+0x55/0x60
[2.840671] Modules linked in:
[2.840671] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-170.el8.kdump2.x86_64 #1
[2.840671] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS 
P89 07/21/2019
[2.840671] RIP: 0010:domain_get_iommu+0x55/0x60
[2.840671] Code: c2 01 eb 0b 48 83 c0 01 8b 34 87 85 f6 75 0b 48 63 c8 48 39 c2 
75 ed 31 c0 c3 48 c1 e1 03 48 8b 05 70 f3 91 01 48 8b 04 08 c3 <0f> 0b 31 c0 c3 
31 c9 eb eb 66 90 0f 1f 44 00 00 41 55 40 0f b6 f6
[2.840671] RSP: 0018:c90dfab8 EFLAGS: 00010202
[2.840671] RAX: 88ec7f1c8000 RBX: 006c7c867000 RCX: 
[2.840671] RDX: fff0 RSI:  RDI: 88ec7f1c8000
[2.840671] RBP: 88ec6f7000b0 R08: 88ec7f19d000 R09: 88ec7cbfcd00
[2.840671] R10: 0095 R11: c90df928 R12: 
[2.840671] R13: 88ec7f1c8000 R14: 1000 R15: 
[2.840671] FS:  () GS:88ec7f60() 
knlGS:
[2.840671] CS:  0010 DS:  ES:  CR0: 80050033
[2.840671] CR2: 7ff3e1713000 CR3: 006c7de0a004 CR4: 001606b0
[2.840671] Call Trace:
[2.840671]  __intel_map_single+0x62/0x140
[2.840671]  intel_alloc_coherent+0xa6/0x130
[2.840671]  dma_pool_alloc+0xd8/0x1e0
[2.840671]  e_qh_alloc+0x55/0x130
[2.840671]  ehci_setup+0x284/0x7b0
[2.840671]  ehci_pci_setup+0xa3/0x530
[2.840671]  usb_add_hcd+0x2b6/0x800
[2.840671]  usb_hcd_pci_probe+0x375/0x460
[2.840671]  local_pci_probe+0x41/0x90
[2.840671]  pci_device_probe+0x105/0x1b0
[2.840671]  driver_probe_device+0x12d/0x460
[2.840671]  device_driver_attach+0x50/0x60
[2.840671]  __driver_attach+0x61/0x130
[2.840671]  ? device_driver_attach+0x60/0x60
[2.840671]  bus_for_each_dev+0x77/0xc0
[2.840671]  ? klist_add_tail+0x3b/0x70
[2.840671]  bus_add_driver+0x14d/0x1e0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  driver_register+0x6b/0xb0
[2.840671]  ? ehci_hcd_init+0xaa/0xaa
[2.840671]  do_one_initcall+0x46/0x1c3
[2.840671]  ? do_early_param+0x91/0x91
[2.840671]  kernel_init_freeable+0x1af/0x258
[2.840671]  ? rest_init+0xaa/0xaa
[2.840671]  kernel_init+0xa/0xf9
[2.840671]  ret_from_fork+0x35/0x40
[2.840671] ---[ end trace e87b0d9a1c8135c4 ]---
[3.010848] ehci-pci :00:1a.0: Using iommu dma mapping
[3.012551] ehci-pci :00:1a.0: 32bit DMA uses non-identity mapping
[3.018537] ehci-pci :00:1a.0: cache line size of 64 is not supported
[3.021188] ehci-pci :00:1a.0: irq 18, io mem 0x93002000
[3.029006] ehci-pci :00:1a.0: USB 2.0 started, EHCI 1.00
[3.030918] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, 
bcdDevice= 4.18
[3.033491] usb usb1: New USB device strings: Mfr=3, Product=2, 
SerialNumber=1
[3.035900] usb usb1: Product: EHCI Host Controller
[3.037423] usb usb1: Manufacturer: Linux 4.18.0-170.el8.kdump2.x86_64 
ehci_hcd
[3.039691] usb usb1: SerialNumber: :00:1a.0

It looks like the device finishes initializing once it figures out it
needs dma mapping instead of the default
passthrough. intel_alloc_coherent calls iommu_need_mapping, before it
calls __intel_map_single, so I'm not sure why it is tripping over the
WARN_ON in domain_get_iommu.

one thing I noticed while looking at this is that domain_get_iommu can
return NULL. So should there be something like the following in
__intel_map_single after the domain_get_iommu call?

if (!iommu)
   goto error;

It is possible to deref the null pointer later otherwise.

Regards,
Jerry



[PATCH] iommu/vt-d: call __dmar_remove_one_dev_info with valid pointer

2020-01-21 Thread Jerry Snitselaar
It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.

Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: David Woodhouse 
Cc: sta...@vger.kernel.org # 5.3+
Cc: linux-ker...@vger.kernel.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 1801f0aaf013..932267f49f9a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
	spin_lock_irqsave(&device_domain_lock, flags);
info = dev->archdata.iommu;
-   if (info)
+   if (info && info != DEFER_DEVICE_DOMAIN_INFO
+   && info != DUMMY_DEVICE_DOMAIN_INFO)
__dmar_remove_one_dev_info(info);
	spin_unlock_irqrestore(&device_domain_lock, flags);
 }
-- 
2.24.0



Re: [PATCH v2] iommu/vt-d: Don't reject nvme host due to scope mismatch

2020-01-06 Thread Jerry Snitselaar

On Sun Jan 05 20, jimyan wrote:

On a system with an Intel PCIe port configured as a nvme host device, iommu
initialization fails with

   DMAR: Device scope type does not match for :80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):

but the device has a type 0 PCI header:
80:00.0 Class 0600: Device 8086:2020 (rev 06)
00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Add the class
0x06 ("PCI_BASE_CLASS_BRIDGE") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan 


Reviewed-by: Jerry Snitselaar 



Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices

2020-01-06 Thread Jerry Snitselaar

On Tue Jan 07 20, Lu Baolu wrote:

Hi Jerry,

On 1/7/20 1:05 AM, Jerry Snitselaar wrote:

On Wed Jan 01 20, Roland Dreier via iommu wrote:

We saw more devices with the same mismatch quirk. So maintaining them in
a quirk table will make it more readable and maintainable.


I guess I disagree about the maintainable part, given that this patch
already regresses Broadwell NTB.

I'm not even sure what the DMAR table says about NTB on my Skylake
systems, exactly because the existing code means I did not have any
problems.  But we might need to add device 201Ch too.

Maybe we don't need the mismatch check at all?  Your patch sets the
quirk if any possibly mismatching device is present in the system, so
we'll ignore any scope mismatch on a system with, say, the 8086:2020
NVMe host in it.  So could we just drop the check completely and not
have a quirk to disable the check?

- R.


If the check is removed what happens for cases where there is an actual
problem in the dmar table? I just worked an issue with some Intel
people where a purley system had an rmrr entry pointing to a bridge as
the endpoint device instead of the raid module sitting behind it.


The latest solution is here: https://lkml.org/lkml/2020/1/5/103; does
this work for you?

Best regards,
baolu



Hi Baolu,

They resolved it by updating the rmrr entry in the dmar table to add
the extra path needed for it to point at the raid module. Looking
at the code though I imagine without the firmware update they would
still have the problem because IIRC it was a combo of an endpoint
scope type and a PCI bridge header, so that first check would fail
as it did before. My worry is that if the check is removed
completely, a case like that wouldn't report anything wrong.

Jim's latest patch I think solves the issue for what he was seeing
and the NTB case.


Re: [PATCH 22/22] iommu/vt-d: Add a quirk flag for scope mismatched devices

2020-01-06 Thread Jerry Snitselaar

On Wed Jan 01 20, Roland Dreier via iommu wrote:

We saw more devices with the same mismatch quirk. So maintaining them in
a quirk table will make it more readable and maintainable.


I guess I disagree about the maintainable part, given that this patch
already regresses Broadwell NTB.

I'm not even sure what the DMAR table says about NTB on my Skylake
systems, exactly because the existing code means I did not have any
problems.  But we might need to add device 201Ch too.

Maybe we don't need the mismatch check at all?  Your patch sets the
quirk if any possibly mismatching device is present in the system, so
we'll ignore any scope mismatch on a system with, say, the 8086:2020
NVMe host in it.  So could we just drop the check completely and not
have a quirk to disable the check?

- R.


If the check is removed what happens for cases where there is an actual
problem in the dmar table? I just worked an issue with some Intel
people where a purley system had an rmrr entry pointing to a bridge as
the endpoint device instead of the raid module sitting behind it.







Re: [PATCH 1/1] iommu/vt-d: Add a quirk flag for scope mismatched devices

2019-12-24 Thread Jerry Snitselaar

On Tue Dec 24 19, Lu Baolu wrote:

We expect devices with endpoint scope to have normal PCI headers,
and devices with bridge scope to have bridge PCI headers.  However
Some PCI devices may be listed in the DMAR table with bridge scope,
even though they have a normal PCI header. Add a quirk flag for
those special devices.

Cc: Roland Dreier 
Cc: Jim Yan 
Signed-off-by: Lu Baolu 
---


Reviewed-by: Jerry Snitselaar 



Re: [PATCH] iommu/vt-d: Don't reject nvme host due to scope mismatch

2019-12-20 Thread Jerry Snitselaar

On Fri Dec 20 19, jimyan wrote:

On a system with an Intel PCIe port configured as a nvme host device, iommu
initialization fails with

   DMAR: Device scope type does not match for :80:00.0

This is because the DMAR table reports this device as having scope 2
(ACPI_DMAR_SCOPE_TYPE_BRIDGE):



Isn't that a problem to be fixed in the DMAR table then?


but the device has a type 0 PCI header:
80:00.0 Class 0600: Device 8086:2020 (rev 06)
00: 86 80 20 20 47 05 10 00 06 00 00 06 10 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 00 00
30: 00 00 00 00 90 00 00 00 00 00 00 00 00 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out
on initialization due to this apparent scope mismatch. Add the class
0x600 ("PCI_CLASS_BRIDGE_HOST") as a heuristic for allowing DMAR
initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: jimyan 
---
drivers/iommu/dmar.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index eecd6a421667..9faf2f0e0237 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -244,6 +244,7 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
(scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
 (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
+ info->dev->class >> 8 != PCI_CLASS_BRIDGE_HOST &&
  info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
pr_warn("Device scope type does not match for %s\n",
pci_name(info->dev));
--
2.11.0






Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info

2019-12-17 Thread Jerry Snitselaar

On Tue Dec 17 19, Jerry Snitselaar wrote:

On Tue Dec 17 19, Jerry Snitselaar wrote:

In addition to checking for a null pointer, verify that
info does not have the value DEFER_DEVICE_DOMAIN_INFO or
DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
__dmar_remove_one_dev_info will panic when trying to access
a member of the device_domain_info struct.

  [1.464241] BUG: unable to handle kernel NULL pointer dereference at 
004e
  [1.464241] PGD 0 P4D 0
  [1.464241] Oops:  [#1] SMP PTI
  [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW
- -  - 4.18.0-160.el8.x86_64 #1
  [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, 
BIOS P89 07/21/2019
  [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
  [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 55 
53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 48 
8b 6f 58 $
  [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
  [1.464241] RAX: 0001 RBX: fffe RCX: 

  [1.464241] RDX: 0001 RSI: 0004 RDI: 
fffe
  [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 
0039
  [1.464241] R10:  R11: c90dfa58 R12: 
88ec7a0eec20
  [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: 

  [1.464241] FS:  () GS:88ec7a60() 
knlGS:
  [1.464241] CS:  0010 DS:  ES:  CR0: 80050033
  [1.464241] CR2: 004e CR3: 006c7900a001 CR4: 001606b0
  [1.464241] Call Trace:
  [1.464241]  dmar_remove_one_dev_info.isra.68+0x27/0x40
  [1.464241]  intel_iommu_add_device+0x124/0x180
  [1.464241]  ? iommu_probe_device+0x40/0x40
  [1.464241]  add_iommu_group+0xa/0x20
  [1.464241]  bus_for_each_dev+0x77/0xc0
  [1.464241]  ? down_write+0xe/0x40
  [1.464241]  bus_set_iommu+0x85/0xc0
  [1.464241]  intel_iommu_init+0x4b4/0x777
  [1.464241]  ? e820__memblock_setup+0x63/0x63
  [1.464241]  ? do_early_param+0x91/0x91
  [1.464241]  pci_iommu_init+0x19/0x45
  [1.464241]  do_one_initcall+0x46/0x1c3
  [1.464241]  ? do_early_param+0x91/0x91
  [1.464241]  kernel_init_freeable+0x1af/0x258
  [1.464241]  ? rest_init+0xaa/0xaa
  [1.464241]  kernel_init+0xa/0x107
  [1.464241]  ret_from_fork+0x35/0x40
  [1.464241] Modules linked in:
  [1.464241] CR2: 004e
  [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---

Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: David Woodhouse 
Cc: sta...@vger.kernel.org # v5.3+
Cc: iommu@lists.linux-foundation.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar 
---
drivers/iommu/intel-iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..e42a09794fa2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)

	spin_lock_irqsave(&device_domain_lock, flags);
info = dev->archdata.iommu;
-   if (info)
+   if (info && info != DEFER_DEVICE_DOMAIN_INFO
+   && info != DUMMY_DEVICE_DOMAIN_INFO)
__dmar_remove_one_dev_info(info);
	spin_unlock_irqrestore(&device_domain_lock, flags);
}
--
2.24.0




Nack this.

Apparently the issue is just being seen with the kdump kernel.  I'm
wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn
off translations at shutdown").  Testing a 5.5 build now.


And a minute later I got a response. The 5.5 kernel hits the original
panic when booting into the kdump kernel.

I still need to test this patch on 5.5, but with a test build of our
kernel carrying it, the problem just moves to:

[3.742317] pci :01:00.0: Using iommu dma mapping
[3.744020] pci :01:00.1: Adding to iommu group 86
[3.746697] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0
[3.746697] Modules linked in:
[3.746697] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.18.0-167.el8.iommu6.x86_64 #1
[3.746697] Hardware name: HP ProLiant DL560 Gen9/ProLiant DL560 Gen9, BIOS 
P85 07/21/2019
[3.746697] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0
[3.746697] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 
09 d0 a9 00 01 ff ff 75 47 85 c$
[3.746697] RSP: :c90f3bd8 EFLAGS: 0002
[3.746697] RAX: 0101 RBX: 0046 RCX: 7f17
[3.746697] RDX:  RSI:  RDI: 82e8a600
[3.746697] RB

Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info

2019-12-17 Thread Jerry Snitselaar

On Tue Dec 17 19, Jerry Snitselaar wrote:

In addition to checking for a null pointer, verify that
info does not have the value DEFER_DEVICE_DOMAIN_INFO or
DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
__dmar_remove_one_dev_info will panic when trying to access
a member of the device_domain_info struct.

   [1.464241] BUG: unable to handle kernel NULL pointer dereference at 
004e
   [1.464241] PGD 0 P4D 0
   [1.464241] Oops:  [#1] SMP PTI
   [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW
- -  - 4.18.0-160.el8.x86_64 #1
   [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, 
BIOS P89 07/21/2019
   [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
   [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 41 54 
55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 67 50 
48 8b 6f 58 $
   [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
   [1.464241] RAX: 0001 RBX: fffe RCX: 

   [1.464241] RDX: 0001 RSI: 0004 RDI: 
fffe
   [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 
0039
   [1.464241] R10:  R11: c90dfa58 R12: 
88ec7a0eec20
   [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: 

   [1.464241] FS:  () GS:88ec7a60() 
knlGS:
   [1.464241] CS:  0010 DS:  ES:  CR0: 80050033
   [1.464241] CR2: 004e CR3: 006c7900a001 CR4: 001606b0
   [1.464241] Call Trace:
   [1.464241]  dmar_remove_one_dev_info.isra.68+0x27/0x40
   [1.464241]  intel_iommu_add_device+0x124/0x180
   [1.464241]  ? iommu_probe_device+0x40/0x40
   [1.464241]  add_iommu_group+0xa/0x20
   [1.464241]  bus_for_each_dev+0x77/0xc0
   [1.464241]  ? down_write+0xe/0x40
   [1.464241]  bus_set_iommu+0x85/0xc0
   [1.464241]  intel_iommu_init+0x4b4/0x777
   [1.464241]  ? e820__memblock_setup+0x63/0x63
   [1.464241]  ? do_early_param+0x91/0x91
   [1.464241]  pci_iommu_init+0x19/0x45
   [1.464241]  do_one_initcall+0x46/0x1c3
   [1.464241]  ? do_early_param+0x91/0x91
   [1.464241]  kernel_init_freeable+0x1af/0x258
   [1.464241]  ? rest_init+0xaa/0xaa
   [1.464241]  kernel_init+0xa/0x107
   [1.464241]  ret_from_fork+0x35/0x40
   [1.464241] Modules linked in:
   [1.464241] CR2: 004e
   [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---

Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: David Woodhouse 
Cc: sta...@vger.kernel.org # v5.3+
Cc: iommu@lists.linux-foundation.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar 
---
drivers/iommu/intel-iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..e42a09794fa2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)

spin_lock_irqsave(&device_domain_lock, flags);
info = dev->archdata.iommu;
-   if (info)
+   if (info && info != DEFER_DEVICE_DOMAIN_INFO
+   && info != DUMMY_DEVICE_DOMAIN_INFO)
__dmar_remove_one_dev_info(info);
spin_unlock_irqrestore(&device_domain_lock, flags);
}
--
2.24.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu



Nack this.

Apparently the issue is just being seen with the kdump kernel.  I'm
wondering if it is already solved by 6c3a44ed3c55 ("iommu/vt-d: Turn
off translations at shutdown").  Testing a 5.5 build now.



Re: [RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info

2019-12-17 Thread Jerry Snitselaar
On Tue, Dec 17, 2019 at 10:56 AM Jerry Snitselaar  wrote:
>
> In addition to checking for a null pointer, verify that
> info does not have the value DEFER_DEVICE_DOMAIN_INFO or
> DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
> __dmar_remove_one_dev_info will panic when trying to access
> a member of the device_domain_info struct.
>
> [1.464241] BUG: unable to handle kernel NULL pointer dereference at 
> 004e
> [1.464241] PGD 0 P4D 0
> [1.464241] Oops:  [#1] SMP PTI
> [1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW
> - -  - 4.18.0-160.el8.x86_64 #1
> [1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, 
> BIOS P89 07/21/2019
> [1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
> [1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 
> 55 41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb 
> <4c> 8b 67 50 48 8b 6f 58 $
> [1.464241] RSP: :c90dfd10 EFLAGS: 00010082
> [1.464241] RAX: 0001 RBX: fffe RCX: 
> 
> [1.464241] RDX: 0001 RSI: 0004 RDI: 
> fffe
> [1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 
> 0039
> [1.464241] R10:  R11: c90dfa58 R12: 
> 88ec7a0eec20
> [1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: 
> 
> [1.464241] FS:  () GS:88ec7a60() 
> knlGS:
> [1.464241] CS:  0010 DS:  ES:  CR0: 80050033
> [1.464241] CR2: 004e CR3: 006c7900a001 C 
> 001606b0
> [1.464241] Call Trace:
> [1.464241]  dmar_remove_one_dev_info.isra.68+0x27/0x40
> [1.464241]  intel_iommu_add_device+0x124/0x180
> [1.464241]  ? iommu_probe_device+0x40/0x40
> [1.464241]  add_iommu_group+0xa/0x20
> [1.464241]  bus_for_each_dev+0x77/0xc0
> [1.464241]  ? down_write+0xe/0x40
> [1.464241]  bus_set_iommu+0x85/0xc0
> [1.464241]  intel_iommu_init+0x4b4/0x777
> [1.464241]  ? e820__memblock_setup+0x63/0x63
> [1.464241]  ? do_early_param+0x91/0x91
> [1.464241]  pci_iommu_init+0x19/0x45
> [1.464241]  do_one_initcall+0x46/0x1c3
> [1.464241]  ? do_early_param+0x91/0x91
> [1.464241]  kernel_init_freeable+0x1af/0x258
> [1.464241]  ? rest_init+0xaa/0xaa
> [1.464241]  kernel_init+0xa/0x107
> [1.464241]  ret_from_fork+0x35/0x40
> [1.464241] Modules linked in:
> [1.464241] CR2: 004e
> [1.464241] ---[ end trace 0927d2ba8b8032b5 ]---
>
> Cc: Joerg Roedel 
> Cc: Lu Baolu 
> Cc: David Woodhouse 
> Cc: sta...@vger.kernel.org # v5.3+
> Cc: iommu@lists.linux-foundation.org
> Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
> Signed-off-by: Jerry Snitselaar 
> ---
>  drivers/iommu/intel-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 0c8d81f56a30..e42a09794fa2 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
>
> spin_lock_irqsave(&device_domain_lock, flags);
> info = dev->archdata.iommu;
> -   if (info)
> +   if (info && info != DEFER_DEVICE_DOMAIN_INFO
> +   && info != DUMMY_DEVICE_DOMAIN_INFO)
> __dmar_remove_one_dev_info(info);
> spin_unlock_irqrestore(&device_domain_lock, flags);
>  }
> --
> 2.24.0
>
>

I'm not positive that the DUMMY_DEVICE_DOMAIN_INFO check is needed.
It seemed like there were checks for that most places before
dmar_remove_one_dev_info
would be called, but I wasn't certain.



[RFC PATCH] iommu/vt-d: avoid panic in __dmar_remove_one_dev_info

2019-12-17 Thread Jerry Snitselaar
In addition to checking for a null pointer, verify that
info does not have the value DEFER_DEVICE_DOMAIN_INFO or
DUMMY_DEVICE_DOMAIN_INFO. If info has one of those values
__dmar_remove_one_dev_info will panic when trying to access
a member of the device_domain_info struct.

[1.464241] BUG: unable to handle kernel NULL pointer dereference at 
004e
[1.464241] PGD 0 P4D 0
[1.464241] Oops:  [#1] SMP PTI
[1.464241] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW
- -  - 4.18.0-160.el8.x86_64 #1
[1.464241] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, 
BIOS P89 07/21/2019
[1.464241] RIP: 0010:__dmar_remove_one_dev_info+0x27/0x250
[1.464241] Code: 00 00 00 0f 1f 44 00 00 8b 05 35 ec 75 01 41 56 41 55 
41 54 55 53 85 c0 0f 84 99 01 00 00 48 85 ff 0f 84 92 01 00 00 48 89 fb <4c> 8b 
67 50 48 8b 6f 58 $
[1.464241] RSP: :c90dfd10 EFLAGS: 00010082
[1.464241] RAX: 0001 RBX: fffe RCX: 

[1.464241] RDX: 0001 RSI: 0004 RDI: 
fffe
[1.464241] RBP: 88ec7a72f368 R08: 0457 R09: 
0039
[1.464241] R10:  R11: c90dfa58 R12: 
88ec7a0eec20
[1.464241] R13: 88ec6fd0eab0 R14: 81eae980 R15: 

[1.464241] FS:  () GS:88ec7a60() 
knlGS:
[1.464241] CS:  0010 DS:  ES:  CR0: 80050033
[1.464241] CR2: 004e CR3: 006c7900a001 C 
001606b0
[1.464241] Call Trace:
[1.464241]  dmar_remove_one_dev_info.isra.68+0x27/0x40
[1.464241]  intel_iommu_add_device+0x124/0x180
[1.464241]  ? iommu_probe_device+0x40/0x40
[1.464241]  add_iommu_group+0xa/0x20
[1.464241]  bus_for_each_dev+0x77/0xc0
[1.464241]  ? down_write+0xe/0x40
[1.464241]  bus_set_iommu+0x85/0xc0
[1.464241]  intel_iommu_init+0x4b4/0x777
[1.464241]  ? e820__memblock_setup+0x63/0x63
[1.464241]  ? do_early_param+0x91/0x91
[1.464241]  pci_iommu_init+0x19/0x45
[1.464241]  do_one_initcall+0x46/0x1c3
[1.464241]  ? do_early_param+0x91/0x91
[1.464241]  kernel_init_freeable+0x1af/0x258
[1.464241]  ? rest_init+0xaa/0xaa
[1.464241]  kernel_init+0xa/0x107
[1.464241]  ret_from_fork+0x35/0x40
[1.464241] Modules linked in:
[1.464241] CR2: 004e
[1.464241] ---[ end trace 0927d2ba8b8032b5 ]---

Cc: Joerg Roedel 
Cc: Lu Baolu 
Cc: David Woodhouse 
Cc: sta...@vger.kernel.org # v5.3+
Cc: iommu@lists.linux-foundation.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..e42a09794fa2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
spin_lock_irqsave(&device_domain_lock, flags);
info = dev->archdata.iommu;
-   if (info)
+   if (info && info != DEFER_DEVICE_DOMAIN_INFO
+   && info != DUMMY_DEVICE_DOMAIN_INFO)
__dmar_remove_one_dev_info(info);
spin_unlock_irqrestore(&device_domain_lock, flags);
 }
-- 
2.24.0



Re: panic in dmar_remove_one_dev_info

2019-12-16 Thread Jerry Snitselaar

On Mon Dec 16 19, Jerry Snitselaar wrote:

HP is seeing a panic on gen9 dl360 and dl560 while testing these other
changes we've been working on. I just took an initial look, but have
to run to a dentist appointment so couldn't dig too deep. It looks
like the device sets dev->archdata.iommu to DEFER_DEVICE_DOMAIN_INFO
in intel_iommu_add_device, and then it needs a private domain so
dmar_remove_one_dev_info gets called. That code path ends up trying to
use DEFER_DEVICE_DOMAIN_INFO as a pointer.  I don't know if there just
needs to be a check in there to bail out if it sees
DEFER_DEVICE_DOMAIN_INFO, or if something more is needed. I'll look
at it some more when I get back home.

Regards,
Jerry


Hi Baolu,

Does this look sane?

--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5163,7 +5163,8 @@ static void dmar_remove_one_dev_info(struct device *dev)
 
spin_lock_irqsave(&device_domain_lock, flags);

info = dev->archdata.iommu;
-   if (info)
+   if (info && info != DEFER_DEVICE_DOMAIN_INFO
+   && info != DUMMY_DEVICE_DOMAIN_INFO)
__dmar_remove_one_dev_info(info);
spin_unlock_irqrestore(&device_domain_lock, flags);
 }




Regards,
Jerry



panic in dmar_remove_one_dev_info

2019-12-16 Thread Jerry Snitselaar

HP is seeing a panic on gen9 dl360 and dl560 while testing these other
changes we've been working on. I just took an initial look, but have
to run to a dentist appointment so couldn't dig too deep. It looks
like the device sets dev->archdata.iommu to DEFER_DEVICE_DOMAIN_INFO
in intel_iommu_add_device, and then it needs a private domain so
dmar_remove_one_dev_info gets called. That code path ends up trying to
use DEFER_DEVICE_DOMAIN_INFO as a pointer.  I don't know if there just
needs to be a check in there to bail out if it sees
DEFER_DEVICE_DOMAIN_INFO, or if something more is needed. I'll look
at it some more when I get back home.

Regards,
Jerry



Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-12 Thread Jerry Snitselaar

On Thu Dec 12 19, Lu Baolu wrote:

Hi,

On 12/12/19 9:49 AM, Jerry Snitselaar wrote:

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains 
replaced with private")

Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 


Can you please try this fix and check whether it can fix your problem?
If it helps, do you mind adding a Tested-by?

Best regards,
baolu



Tested-by: Jerry Snitselaar 




---
drivers/iommu/intel-iommu.c | 8 
1 file changed, 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..b73bebea9148 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct 
iommu_domain *domain,

int prot = 0;
int ret;

-    if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-    return -EINVAL;
-
if (iommu_prot & IOMMU_READ)
    prot |= DMA_PTE_READ;
if (iommu_prot & IOMMU_WRITE)
@@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct 
iommu_domain *domain,

/* Cope with horrid API which requires us to unmap more than the
   size argument if it happens to be a large-page mapping. */
BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level));
-    if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-    return 0;

if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
    size = VTD_PAGE_SIZE << level_to_offset_bits(level);
@@ -5556,9 +5551,6 @@ static phys_addr_t 
intel_iommu_iova_to_phys(struct iommu_domain *domain,

int level = 0;
u64 phys = 0;

-    if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-    return 0;
-
pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level);
if (pte)
    phys = dma_pte_addr(pte);
--
2.17.1







Re: [PATCH] iommu/vt-d: Set ISA bridge reserved region as relaxable

2019-12-12 Thread Jerry Snitselaar

On Wed Dec 11 19, Alex Williamson wrote:

Commit d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via
iommu_get_resv_regions") created a direct-mapped reserved memory region
in order to replace the static identity mapping of the ISA address
space, where the latter was then removed in commit df4f3c603aeb
("iommu/vt-d: Remove static identity map code").  According to the
history of this code and the Kconfig option surrounding it, this direct
mapping exists for the benefit of legacy ISA drivers that are not
compatible with the DMA API.

In conjunction with commit 9b77e5c79840 ("vfio/type1: check dma map
request is within a valid iova range") this change introduced a
regression where the vfio IOMMU backend enforces reserved memory regions
per IOMMU group, preventing userspace from creating IOMMU mappings
conflicting with prescribed reserved regions.  A necessary prerequisite
for the vfio change was the introduction of "relaxable" direct mappings
introduced by commit adfd37382090 ("iommu: Introduce
IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions").  These relaxable
direct mappings provide the same identity mapping support in the default
domain, but also indicate that the reservation is software imposed and
may be relaxed under some conditions, such as device assignment.

Convert the ISA bridge direct-mapped reserved region to relaxable to
reflect that the restriction is self imposed and need not be enforced
by drivers such as vfio.

Fixes: d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via 
iommu_get_resv_regions")
Cc: sta...@vger.kernel.org # v5.3+
Link: https://lore.kernel.org/linux-iommu/20191211082304.2d4fa...@x1.home
Reported-by: cprt 
Tested-by: cprt 
Signed-off-by: Alex Williamson 


Tested-by: Jerry Snitselaar 
Reviewed-by: Jerry Snitselaar 


---
drivers/iommu/intel-iommu.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..6eb0dd7489a1 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5737,7 +5737,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,

if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
- IOMMU_RESV_DIRECT);
+  IOMMU_RESV_DIRECT_RELAXABLE);
if (reg)
list_add_tail(&reg->list, head);
}






[PATCH] iommu/vt-d: Allocate reserved region for ISA with correct permission

2019-12-12 Thread Jerry Snitselaar
Currently the reserved region for ISA is allocated with no
permissions. If a dma domain is being used, mapping this region will
fail. Set the permissions to DMA_PTE_READ|DMA_PTE_WRITE.

Cc: Joerg Roedel 
Cc: Lu Baolu  
Cc: iommu@lists.linux-foundation.org
Cc: sta...@vger.kernel.org # v5.3+
Fixes: d850c2ee5fe2 ("iommu/vt-d: Expose ISA direct mapping region via 
iommu_get_resv_regions")
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/intel-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..998529cebcf2 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5736,7 +5736,7 @@ static void intel_iommu_get_resv_regions(struct device 
*device,
struct pci_dev *pdev = to_pci_dev(device);
 
if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
-   reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
+   reg = iommu_alloc_resv_region(0, 1UL << 24, prot,
  IOMMU_RESV_DIRECT);
if (reg)
list_add_tail(&reg->list, head);
-- 
2.24.0



Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-12 Thread Jerry Snitselaar

On Thu Dec 12 19, Jerry Snitselaar wrote:

On Fri Dec 13 19, Lu Baolu wrote:

Hi,

On 12/13/19 8:30 AM, Jerry Snitselaar wrote:

On Thu Dec 12 19, Lu Baolu wrote:

Hi,

On 12/12/19 9:49 AM, Jerry Snitselaar wrote:

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains 
replaced with private")

Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 


Can you please try this fix and check whether it can fix your problem?
If it helps, do you mind adding a Tested-by?

Best regards,
baolu



I'm testing with this patch, my patch that moves the direct mapping call,
and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors
I was seeing with default passthrough, I no longer see all the dmar pte
read access errors, and the system boots allowing me to login. I'm tracking
down 2 issues at the moment. With passthrough I see a problem with 01:00.4
that I mentioned in the earlier email:

[   78.978573] uhci_hcd: USB Universal Host Controller Interface driver
[   78.980842] uhci_hcd :01:00.4: UHCI Host Controller
[   78.982738] uhci_hcd :01:00.4: new USB bus registered, 
assigned bus number 3

[   78.985222] uhci_hcd :01:00.4: detected 8 ports
[   78.986907] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports

[   78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses 
non-identity mapping

[   7 :01:00.4: unable to allocate consistent memory for frame list
[   79.499891] uhci_hcd :01:00.4: startup error -16
[   79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered
[   79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   79.505497] uhci_hcd: probe of :01:00.4 failed with error -16

If I boot the system with iommu=nopt I see an iommu map failure due to
the prot check in __domain_mapping:

[   40.940589] pci :00:1f.0: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   40.943558] pci :00:1f.0: 
iommu_group_create_direct_mappings: iterating through mappings
[   40.946402] pci :00:1f.0: 
iommu_group_create_direct_mappings: calling apply_resv_region
[   40.949184] pci :00:1f.0: 
iommu_group_create_direct_mappings: entry type is direct

[   40.951819] DMAR: intel_iommu_map: enter
[   40.953128] DMAR: __domain_mapping: prot & 
(DMA_PTE_READ|DMA_PTE_WRITE) == 0

[   40.955486] DMAR: domain_mapping: __domain_mapping failed
[   40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22
[   40.959466] DMAR: intel_iommu_map: leave
[   40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 
0x pgsize 0x1000
[   40.963511] pci :00:1f.0: 
iommu_group_create_direct_mappings: iommu_map failed
[   40.966026] pci :00:1f.0: 
iommu_group_create_direct_mappings: leaving func
[   40.968487] pci :00:1f.0: iommu_group_add_device: calling 
__iommu_attach_device

[   40.971016] pci :00:1f.0: Adding to iommu group 19
[   40.972731] pci :00:1f.0: DMAR: domain->type is dma

/sys/kernel/iommu_groups/19
[root@hp-dl388g8-07 19]# cat reserved_regions 0x 
0x00ff direct

0xbdf6e000 0xbdf84fff direct
0xfee0 0xfeef msi

00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC 
Controller


This seems to be another issue?

Best regards,
baolu


In intel_iommu_get_resv_regions this iommu_alloc_resv_region is called
with prot set to 0:

   if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
   reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
 
IOMMU_RESV_DIRECT_RELAXABLE);
   if (reg)



Looking at the older code for the ISA bridge it looks like it called
iommu_prepare_identity_map -> domain_prepare_identity_map ->
iommu_domain_identity_map -> and finally __domain_mapping with 
DMA_PTE_READ|DMA_PTE_WRITE?


I wonder if this is an issue with the region starting at 0x0 and this
bit in iommu_group_create_mappings:

phys_addr = iommu_iova_to_phys(domain, addr);
if (phys_addr)
continue;


Disregard this



Off to stick in some more debugging statements.

Regards,
Jerry



Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-12 Thread Jerry Snitselaar

On Fri Dec 13 19, Lu Baolu wrote:

Hi,

On 12/13/19 8:30 AM, Jerry Snitselaar wrote:

On Thu Dec 12 19, Lu Baolu wrote:

Hi,

On 12/12/19 9:49 AM, Jerry Snitselaar wrote:

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains 
replaced with private")

Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 


Can you please try this fix and check whether it can fix your problem?
If it helps, do you mind adding a Tested-by?

Best regards,
baolu



I'm testing with this patch, my patch that moves the direct mapping call,
and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors
I was seeing with default passthrough, I no longer see all the dmar pte
read access errors, and the system boots allowing me to login. I'm tracking
down 2 issues at the moment. With passthrough I see a problem with 01:00.4
that I mentioned in the earlier email:

[   78.978573] uhci_hcd: USB Universal Host Controller Interface driver
[   78.980842] uhci_hcd :01:00.4: UHCI Host Controller
[   78.982738] uhci_hcd :01:00.4: new USB bus registered, 
assigned bus number 3

[   78.985222] uhci_hcd :01:00.4: detected 8 ports
[   78.986907] uhci_hcd :01:00.4: port count misdetected? 
forcing to 2 ports

[   78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses 
non-identity mapping

[   7 :01:00.4: unable to allocate consistent memory for frame list
[   79.499891] uhci_hcd :01:00.4: startup error -16
[   79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered
[   79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   79.505497] uhci_hcd: probe of :01:00.4 failed with error -16

If I boot the system with iommu=nopt I see an iommu map failure due to
the prot check in __domain_mapping:

[   40.940589] pci :00:1f.0: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   40.943558] pci :00:1f.0: iommu_group_create_direct_mappings: 
iterating through mappings
[   40.946402] pci :00:1f.0: iommu_group_create_direct_mappings: 
calling apply_resv_region
[   40.949184] pci :00:1f.0: iommu_group_create_direct_mappings: 
entry type is direct

[   40.951819] DMAR: intel_iommu_map: enter
[   40.953128] DMAR: __domain_mapping: prot & 
(DMA_PTE_READ|DMA_PTE_WRITE) == 0

[   40.955486] DMAR: domain_mapping: __domain_mapping failed
[   40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22
[   40.959466] DMAR: intel_iommu_map: leave
[   40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 
0x pgsize 0x1000
[   40.963511] pci :00:1f.0: iommu_group_create_direct_mappings: 
iommu_map failed
[   40.966026] pci :00:1f.0: iommu_group_create_direct_mappings: 
leaving func
[   40.968487] pci :00:1f.0: iommu_group_add_device: calling 
__iommu_attach_device

[   40.971016] pci :00:1f.0: Adding to iommu group 19
[   40.972731] pci :00:1f.0: DMAR: domain->type is dma

/sys/kernel/iommu_groups/19
[root@hp-dl388g8-07 19]# cat reserved_regions 0x 
0x00ff direct

0xbdf6e000 0xbdf84fff direct
0xfee0 0xfeef msi

00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC 
Controller


This seems to be another issue?

Best regards,
baolu


In intel_iommu_get_resv_regions this iommu_alloc_resv_region is called
with prot set to 0:

if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
reg = iommu_alloc_resv_region(0, 1UL << 24, 0,
  
IOMMU_RESV_DIRECT_RELAXABLE);
if (reg)

I wonder if this is an issue with the region starting at 0x0 and this
bit in iommu_group_create_mappings:

phys_addr = iommu_iova_to_phys(domain, addr);
if (phys_addr)
continue;

Off to stick in some more debugging statements.

Regards,
Jerry





Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-12 Thread Jerry Snitselaar

On Thu Dec 12 19, Lu Baolu wrote:

Hi,

On 12/12/19 9:49 AM, Jerry Snitselaar wrote:

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains 
replaced with private")

Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 


Can you please try this fix and check whether it can fix your problem?
If it helps, do you mind adding a Tested-by?

Best regards,
baolu



I'm testing with this patch, my patch that moves the direct mapping call,
and Alex's patch for the ISA bridge. It solved the 2 iommu mapping errors
I was seeing with default passthrough, I no longer see all the dmar pte
read access errors, and the system boots allowing me to login. I'm tracking
down 2 issues at the moment. With passthrough I see a problem with 01:00.4
that I mentioned in the earlier email:

[   78.978573] uhci_hcd: USB Universal Host Controller Interface driver
[   78.980842] uhci_hcd :01:00.4: UHCI Host Controller
[   78.982738] uhci_hcd :01:00.4: new USB bus registered, assigned bus 
number 3
[   78.985222] uhci_hcd :01:00.4: detected 8 ports
[   78.986907] uhci_hcd :01:00.4: port count misdetected? forcing to 2 ports
[   78.989316] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   78.994634] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping
[   7 :01:00.4: unable to allocate consistent memory for frame list
[   79.499891] uhci_hcd :01:00.4: startup error -16
[   79.501588] uhci_hcd :01:00.4: USB bus 3 deregistered
[   79.503494] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   79.505497] uhci_hcd: probe of :01:00.4 failed with error -16

If I boot the system with iommu=nopt I see an iommu map failure due to
the prot check in __domain_mapping:

[   40.940589] pci :00:1f.0: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   40.943558] pci :00:1f.0: iommu_group_create_direct_mappings: iterating 
through mappings
[   40.946402] pci :00:1f.0: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   40.949184] pci :00:1f.0: iommu_group_create_direct_mappings: entry type 
is direct
[   40.951819] DMAR: intel_iommu_map: enter
[   40.953128] DMAR: __domain_mapping: prot & (DMA_PTE_READ|DMA_PTE_WRITE) == 0
[   40.955486] DMAR: domain_mapping: __domain_mapping failed
[   40.957348] DMAR: intel_iommu_map: domain_pfn_mapping returned -22
[   40.959466] DMAR: intel_iommu_map: leave
[   40.959468] iommu: iommu_map: ops->map failed iova 0x0 pa 0x 
pgsize 0x1000
[   40.963511] pci :00:1f.0: iommu_group_create_direct_mappings: iommu_map 
failed
[   40.966026] pci :00:1f.0: iommu_group_create_direct_mappings: leaving 
func
[   40.968487] pci :00:1f.0: iommu_group_add_device: calling 
__iommu_attach_device
[   40.971016] pci :00:1f.0: Adding to iommu group 19
[   40.972731] pci :00:1f.0: DMAR: domain->type is dma

/sys/kernel/iommu_groups/19
[root@hp-dl388g8-07 19]# cat reserved_regions 
0x 0x00ff direct

0xbdf6e000 0xbdf84fff direct
0xfee0 0xfeef msi

00:1f.0 ISA bridge: Intel Corporation C600/X79 series chipset LPC Controller




---
drivers/iommu/intel-iommu.c | 8 
1 file changed, 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..b73bebea9148 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct 
iommu_domain *domain,

int prot = 0;
int ret;

-    if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-    return -EINVAL;
-
if (iommu_prot & IOMMU_READ)
    prot |= DMA_PTE_READ;
if (iommu_prot & IOMMU_WRITE)
@@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct 
iommu_domain *domain,

/* Cope with horrid API which requires us to unmap more than the
   size argument if it happens to be a large-page mapping. */
BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level));
-    if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-    return 0;

if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
    size = VTD_PAGE_SIZE << level_to_offset_bits(level);
@@ -5556,9 +5551,6 @@ static phys_addr_t 
intel_iommu_iova_to_phys(struct iommu_domain *domain,

int level = 0;
u64 phys = 0;

-    if (dm

Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-11 Thread Jerry Snitselaar

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.

Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with 
private")
Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 


Reviewed-by: Jerry Snitselaar 


---
drivers/iommu/intel-iommu.c | 8 
1 file changed, 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..b73bebea9148 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain,
int prot = 0;
int ret;

-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return -EINVAL;
-
if (iommu_prot & IOMMU_READ)
prot |= DMA_PTE_READ;
if (iommu_prot & IOMMU_WRITE)
@@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain 
*domain,
/* Cope with horrid API which requires us to unmap more than the
   size argument if it happens to be a large-page mapping. */
BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, ));
-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return 0;

if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
size = VTD_PAGE_SIZE << level_to_offset_bits(level);
@@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
int level = 0;
u64 phys = 0;

-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return 0;
-
pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, );
if (pte)
phys = dma_pte_addr(pte);
--
2.17.1



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] iommu/vt-d: Fix dmar pte read access not set error

2019-12-11 Thread Jerry Snitselaar

On Wed Dec 11 19, Lu Baolu wrote:

If the default DMA domain of a group doesn't fit a device, it
will still sit in the group but use a private identity domain.
When map/unmap/iova_to_phys come through iommu API, the driver
should still serve them, otherwise, other devices in the same
group will be impacted. Since identity domain has been mapped
with the whole available memory space and RMRRs, we don't need
to worry about the impact on it.



Does this pose any potential issues with the reverse case where the
group has a default identity domain, and the first device fits that,
but a later device in the group needs dma and gets a private dma
domain?


Link: https://www.spinics.net/lists/iommu/msg40416.html
Cc: Jerry Snitselaar 
Reported-by: Jerry Snitselaar 
Fixes: 942067f1b6b97 ("iommu/vt-d: Identify default domains replaced with private")
Cc: sta...@vger.kernel.org # v5.3+
Signed-off-by: Lu Baolu 
---
drivers/iommu/intel-iommu.c | 8 
1 file changed, 8 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 0c8d81f56a30..b73bebea9148 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5478,9 +5478,6 @@ static int intel_iommu_map(struct iommu_domain *domain,
int prot = 0;
int ret;

-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return -EINVAL;
-
if (iommu_prot & IOMMU_READ)
prot |= DMA_PTE_READ;
if (iommu_prot & IOMMU_WRITE)
@@ -5523,8 +5520,6 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,
/* Cope with horrid API which requires us to unmap more than the
   size argument if it happens to be a large-page mapping. */
BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, ));
-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return 0;

if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
size = VTD_PAGE_SIZE << level_to_offset_bits(level);
@@ -5556,9 +5551,6 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
int level = 0;
u64 phys = 0;

-   if (dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN)
-   return 0;
-
pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, );
if (pte)
phys = dma_pte_addr(pte);
--
2.17.1



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-10 Thread Jerry Snitselaar

On Tue Dec 10 19, Lu Baolu wrote:

Hi,

On 12/10/19 1:18 PM, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:
[snip]


A call to iommu_map is failing.

[   36.686881] pci :01:00.2: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   36.689843] pci :01:00.2: 
iommu_group_create_direct_mappings: iterating through mappings
[   36.692757] pci :01:00.2: 
iommu_group_create_direct_mappings: calling apply_resv_region

[   36.695526] pci :01:00.2: e_direct_mappings: entry type is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 
pa 0xbddde000 pgsize 0x1000
[   37.201357] pci :01:00.2: 
iommu_group_create_direct_mappings: iommu_map failed
[   37.203973] pci :01:00.2: 
iommu_group_create_direct_mappings: leaving func
[   37.206385] pci :01:00.2: iommu_group_add_device: calling 
__iommu_attach_device

[   37.208950] pci :01:00.2: Adding to iommu group 25
[   37.210660] pci :01:00.2: DMAR: domain->type is dma



It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check
at the beginning of intel_iommu_map.  I will verify, but it looks like
that is getting set when intel_iommu_add_device is called for 01:00.1.
request_default_domain_for_dev for 01:00.1 will return -EBUSY because
iommu_group_device_count(group) != 1.
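
The bail-out described above can be sketched as follows. This is a hypothetical simplification, not the kernel source (the model_* names and the flag value are illustrative): the point is that once the flag is set on the domain, every map request is rejected before any page-table work, which is why iommu_group_create_direct_mappings sees iommu_map fail for the RMRR range.

```c
#include <errno.h>

/* Simplified model of the early return at the top of the map path. */
#define DOMAIN_FLAG_LOSE_CHILDREN (1 << 1)  /* illustrative value */

struct dmar_domain_model { unsigned int flags; };

static int model_intel_iommu_map(struct dmar_domain_model *d)
{
	if (d->flags & DOMAIN_FLAG_LOSE_CHILDREN)
		return -EINVAL;  /* the bail-out observed in the logs above */
	return 0;                /* would otherwise proceed to map the range */
}
```
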


Okay, I will send you a fix patch later. Thanks!

Best regards,
baolu



One issue I see is:

[   38.869182] uhci_hcd :01:00.4: UHCI Host Controller
[   39.371173] uhci_hcd :01:00.4: new USB bus registered, assigned bus 
number 3
[   39.373708] uhci_hcd :01:00.4: detected 8 ports
[   39.375333] uhci_hcd :01:00.4: port count misdetected? forcing to 2 ports
[   39.377820] uhci_hcd :01:00.4: irq 16, io base 0x3c00
[   39.379921] uhci_hcd :01:00.4: DMAR: 32bit DMA uses non-identity mapping
[   39.382269] uhci_hcd :01:00.4: unable to allocate consistent memory for 
frame list
[   39.384920] uhci_hcd :01:00.4: startup error -16
[   39.386619] uhci_hcd :01:00.4: USB bus 3 deregistered
[   39.388640] uhci_hcd :01:00.4: init :01:00.4 fail, -16
[   39.390616] uhci_hcd: probe of :01:00.4 failed with error -16

I'm not sure if this is related to the flag and what is allowed now
by the api. I need to go look at the code to see what it is
doing. I'll try debugging it tonight.

Regards,
Jerry




Also fails for 01:00.4:

[   37.212448] pci :01:00.4: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   37.215382] pci :01:00.4: 
iommu_group_create_direct_mappings: iterating through mappings
[   37.218170] pci :01:00.4: 
iommu_group_create_direct_mappings: calling apply_resv_region
[   37.220933] pci :01:00.4: 
iommu_group_create_direct_mappings: entry type is direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 
pa 0xbddde000 pgsize 0x1000
[   37.226857] pci :01:00.4: 
iommu_group_create_direct_mappings: iommu_map failed
[   37.229300] pci :01:00.4: 
iommu_group_create_direct_mappings: leaving func
[   37.231648] pci :01:00.4: iommu_group_add_device: calling 
__iommu_attach_device

[   37.234194] pci :01:00.4: Adding to iommu group 25
[   37.236192] pci :01:00.4: DMAR: domain->type is dma
[   37.237958] pci :01:00.4: DMAR: device default domain type 
is identity. requesting identity domain
[   37.241061] pci :01:00.4: don't change mappings of existing 
d37.489870] pci :01:00.4: DMAR: Device uses a private identity 
domain.


There is an RMRR for 0xbddde000-0xefff:

[63Ah 1594   2]    Subtable Type : 0001 [Reserved 
Memory Region]

[63Ch 1596   2]   Length : 0036

[63Eh 1598   2] Reserved : 
[640h 1600   2]   PCI Segment Number : 
[642h 1602   8] Base Address : BDDDE000
[64Ah 1610   8]  End Address (limit) : BDDDEFFF

[652h 1618   1]    Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1] Entry Length : 0A
[654h 1620   2] Reserved : 
[656h 1622   1]   Enumeration ID : 00
[657h 1623   1]   PCI Bus Number : 00

[658h 1624   2] PCI Path : 1C,07

[65Ah 1626   2] PCI Path : 00,00


[65Ch 1628   1]    Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1] Entry Length : 0A
[65Eh 1630   2] Reserved : 
[660h 1632   1]   Enumeration ID : 00
[661h 1633   1]   PCI Bus Number : 00

[662h 1634   2] PCI Path : 1C,07

[664h 1636   2] PCI Path : 00,02


[666h 1638   1]    Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1] Entry Length : 0A
[668h 1640   2] Reserved : 
[66Ah 1642   1]   Enumeration ID : 00
[66Bh 1643   1]   

[PATCH] iommu: set group default domain before creating direct mappings

2019-12-10 Thread Jerry Snitselaar
iommu_group_create_direct_mappings uses group->default_domain, but
right after it is called, request_default_domain_for_dev calls
iommu_domain_free for the default domain, and sets the group default
domain to a different domain. Move the
iommu_group_create_direct_mappings call to after the group default
domain is set, so the direct mappings get associated with that domain.

Cc: Joerg Roedel 
Cc: Lu Baolu  
Cc: iommu@lists.linux-foundation.org
Cc: sta...@vger.kernel.org
Fixes: 7423e01741dd ("iommu: Add API to request DMA domain for device")
Signed-off-by: Jerry Snitselaar 
---
 drivers/iommu/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..fa908179b80b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2282,13 +2282,13 @@ request_default_domain_for_dev(struct device *dev, unsigned long type)
goto out;
}
 
-   iommu_group_create_direct_mappings(group, dev);
-
/* Make the domain the default for this group */
if (group->default_domain)
iommu_domain_free(group->default_domain);
group->default_domain = domain;
 
+   iommu_group_create_direct_mappings(group, dev);
+
dev_info(dev, "Using iommu %s mapping\n",
 type == IOMMU_DOMAIN_DMA ? "dma" : "direct");
 
-- 
2.24.0
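
A minimal model of the ordering problem the patch above fixes (hypothetical model_* types, not the kernel API): direct mappings land in whichever domain group->default_domain points at when iommu_group_create_direct_mappings runs, so calling it before the swap leaves them in the soon-to-be-freed domain.

```c
#include <stddef.h>

/* Toy model: mappings are recorded in whatever domain the group's
 * default_domain pointer references at call time. */
struct model_domain { int nr_direct_mappings; };
struct model_group  { struct model_domain *default_domain; };

static void model_create_direct_mappings(struct model_group *g)
{
	if (g->default_domain)
		g->default_domain->nr_direct_mappings++;
}

/* Run the sequence with the call before (0) or after (1) the domain swap
 * and report how many mappings the final default domain ends up with. */
static int mappings_in_final_domain(int call_after_swap)
{
	struct model_domain old_dom = { 0 }, new_dom = { 0 };
	struct model_group g = { &old_dom };

	if (!call_after_swap)
		model_create_direct_mappings(&g);  /* buggy order: lands in old_dom */
	g.default_domain = &new_dom;               /* default domain replaced */
	if (call_after_swap)
		model_create_direct_mappings(&g);  /* patched order: lands in new_dom */
	return new_dom.nr_direct_mappings;
}
```
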

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Tue Dec 10 19, Lu Baolu wrote:

Hi,

On 12/10/19 2:16 PM, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:
[snip]


A call to iommu_map is failing.

[   36.686881] pci :01:00.2: iommu_group_add_device: 
calling iommu_group_create_direct_mappings
[   36.689843] pci :01:00.2: 
iommu_group_create_direct_mappings: iterating through mappings
[   36.692757] pci :01:00.2: 
iommu_group_create_direct_mappings: calling apply_resv_region
[   36.695526] pci :01:00.2: e_direct_mappings: entry type 
is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 
0xbddde000 pa 0xbddde000 pgsize 0x1000
[   37.201357] pci :01:00.2: 
iommu_group_create_direct_mappings: iommu_map failed
[   37.203973] pci :01:00.2: 
iommu_group_create_direct_mappings: leaving func
[   37.206385] pci :01:00.2: iommu_group_add_device: 
calling __iommu_attach_device

[   37.208950] pci :01:00.2: Adding to iommu group 25
[   37.210660] pci :01:00.2: DMAR: domain->type is dma



It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check
at the beginning of intel_iommu_map.  I will verify, but it looks like
that is getting set when intel_iommu_add_device is called for 01:00.1.
request_default_domain_for_dev for 01:00.1 will return -EBUSY because
iommu_group_device_count(group) != 1.



Also I see 01:00.0 and others that are the first in a group 
exiting iommu_group_create_direct_mappings
at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In 
request_default_domain_for_dev default_domain
doesn't get set until after that call. Should the 
iommu_group_create_direct_mappings call be moved below

where group->default_domain gets set?



Doing this, the system boots, and I don't get any dmar pte read 
errors. I still see the map failing because
of the DOMAIN_FLAG_LOSE_CHILDREN in those cases mentioned above, but 
it no longer is spitting out tons of

dmar pte read errors.


You can post a patch if you think this is worthwhile.

Best regards,
baolu



I will send a patch tomorrow. In the case where you have
default passthrough enabled, if the default domain type
for the first device in a group is dma, the call will fail, because
iommu_group_create_direct_mappings uses group->default_domain and
that will have an identity type until group->default_domain gets
set right after the iommu_group_create_direct_mappings call.

Regards,
Jerry




Also fails for 01:00.4:

[   37.212448] pci :01:00.4: iommu_group_add_device: 
calling iommu_group_create_direct_mappings
[   37.215382] pci :01:00.4: 
iommu_group_create_direct_mappings: iterating through mappings
[   37.218170] pci :01:00.4: 
iommu_group_create_direct_mappings: calling apply_resv_region
[   37.220933] pci :01:00.4: 
iommu_group_create_direct_mappings: entry type is 
direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 
0xbddde000 pa 0xbddde000 pgsize 0x1000
[   37.226857] pci :01:00.4: 
iommu_group_create_direct_mappings: iommu_map failed
[   37.229300] pci :01:00.4: 
iommu_group_create_direct_mappings: leaving func
[   37.231648] pci :01:00.4: iommu_group_add_device: 
calling __iommu_attach_device

[   37.234194] pci :01:00.4: Adding to iommu group 25
[   37.236192] pci :01:00.4: DMAR: domain->type is dma
[   37.237958] pci :01:00.4: DMAR: device default domain 
type is identity. requesting identity domain
[   37.241061] pci :01:00.4: don't change mappings of 
existing d37.489870] pci :01:00.4: DMAR: Device uses a 
private identity domain.


There is an RMRR for 0xbddde000-0xefff:

[63Ah 1594   2]    Subtable Type : 0001 [Reserved 
Memory Region]

[63Ch 1596   2]   Length : 0036

[63Eh 1598   2] Reserved : 
[640h 1600   2]   PCI Segment Number : 
[642h 1602   8] Base Address : BDDDE000
[64Ah 1610   8]  End Address (limit) : BDDDEFFF

[652h 1618   1]    Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1] Entry Length : 0A
[654h 1620   2] Reserved : 
[656h 1622   1]   Enumeration ID : 00
[657h 1623   1]   PCI Bus Number : 00

[658h 1624   2] PCI Path : 1C,07

[65Ah 1626   2] PCI Path : 00,00


[65Ch 1628   1]    Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1] Entry Length : 0A
[65Eh 1630   2] Reserved : 
[660h 1632   1]   Enumeration ID : 00
[661h 1633   1]   PCI Bus Number : 00

[662h 1634   2] PCI Path : 1C,07

[664h 1636   2] PCI Path : 00,02


[666h 1638   1]    Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1] Entry Length : 0A
[668h 1640   

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Mon Dec 09 19, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:
[snip]


A call to iommu_map is failing.

[   36.686881] pci :01:00.2: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating 
through mappings
[   36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   36.695526] pci :01:00.2: e_direct_mappings: entry type is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving 
func
[   37.206385] pci :01:00.2: iommu_group_add_device: calling 
__iommu_attach_device
[   37.208950] pci :01:00.2: Adding to iommu group 25
[   37.210660] pci :01:00.2: DMAR: domain->type is dma



It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check
at the beginning of intel_iommu_map.  I will verify, but it looks like
that is getting set when intel_iommu_add_device is called for 01:00.1.
request_default_domain_for_dev for 01:00.1 will return -EBUSY because
iommu_group_device_count(group) != 1.



Also I see 01:00.0 and others that are the first in a group exiting 
iommu_group_create_direct_mappings
at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In 
request_default_domain_for_dev default_domain
doesn't get set until after that call. Should the 
iommu_group_create_direct_mappings call be moved below
where group->default_domain gets set?



Doing this, the system boots, and I don't get any dmar pte read errors. I still 
see the map failing because
of the DOMAIN_FLAG_LOSE_CHILDREN in those cases mentioned above, but it no 
longer is spitting out tons of
dmar pte read errors.


Also fails for 01:00.4:

[   37.212448] pci :01:00.4: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating 
through mappings
[   37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type 
is direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving 
func
[   37.231648] pci :01:00.4: iommu_group_add_device: calling 
__iommu_attach_device
[   37.234194] pci :01:00.4: Adding to iommu group 25
[   37.236192] pci :01:00.4: DMAR: domain->type is dma
[   37.237958] pci :01:00.4: DMAR: device default domain type is identity. 
requesting identity domain
[   37.241061] pci :01:00.4: don't change mappings of existing d37.489870] 
pci :01:00.4: DMAR: Device uses a private identity domain.

There is an RMRR for 0xbddde000-0xefff:

[63Ah 1594   2]Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596   2]   Length : 0036

[63Eh 1598   2] Reserved : 
[640h 1600   2]   PCI Segment Number : 
[642h 1602   8] Base Address : BDDDE000
[64Ah 1610   8]  End Address (limit) : BDDDEFFF

[652h 1618   1]Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1] Entry Length : 0A
[654h 1620   2] Reserved : 
[656h 1622   1]   Enumeration ID : 00
[657h 1623   1]   PCI Bus Number : 00

[658h 1624   2] PCI Path : 1C,07

[65Ah 1626   2] PCI Path : 00,00


[65Ch 1628   1]Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1] Entry Length : 0A
[65Eh 1630   2] Reserved : 
[660h 1632   1]   Enumeration ID : 00
[661h 1633   1]   PCI Bus Number : 00

[662h 1634   2] PCI Path : 1C,07

[664h 1636   2] PCI Path : 00,02


[666h 1638   1]Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1] Entry Length : 0A
[668h 1640   2] Reserved : 
[66Ah 1642   1]   Enumeration ID : 00
[66Bh 1643   1]   PCI Bus Number : 00

[66Ch 1644   2] PCI Path : 1C,07

[66Eh 1646   2] PCI Path : 00,04
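
As a rough illustration of how the device-scope question gets decided for an RMRR like the one dumped above (a simplification with a hypothetical helper; the kernel's real matching walks the bridge hierarchy step by step): each scope entry is a path of (device, function) steps, and the final step names the endpoint, so 01:00.2 falls under this RMRR only if some entry's path ends in 00,02.

```c
/* Hypothetical device-scope matcher: compare only the final path step,
 * which names the endpoint's (device, function) on the secondary bus. */
struct scope_step { unsigned char dev, fn; };

static int scope_matches_endpoint(const struct scope_step *path, int steps,
				  unsigned char dev, unsigned char fn)
{
	if (steps < 1)
		return 0;
	return path[steps - 1].dev == dev && path[steps - 1].fn == fn;
}
```
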



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Mon Dec 09 19, Jerry Snitselaar wrote:

On Mon Dec 09 19, Jerry Snitselaar wrote:
[snip]


A call to iommu_map is failing.

[   36.686881] pci :01:00.2: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating 
through mappings
[   36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   36.695526] pci :01:00.2: e_direct_mappings: entry type is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving 
func
[   37.206385] pci :01:00.2: iommu_group_add_device: calling 
__iommu_attach_device
[   37.208950] pci :01:00.2: Adding to iommu group 25
[   37.210660] pci :01:00.2: DMAR: domain->type is dma



It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check
at the beginning of intel_iommu_map.  I will verify, but it looks like
that is getting set when intel_iommu_add_device is called for 01:00.1.
request_default_domain_for_dev for 01:00.1 will return -EBUSY because
iommu_group_device_count(group) != 1.



Also I see 01:00.0 and others that are the first in a group exiting 
iommu_group_create_direct_mappings
at the (!domain || domain->type != IOMMU_DOMAIN_DMA) check. In 
request_default_domain_for_dev default_domain
doesn't get set until after that call. Should the 
iommu_group_create_direct_mappings call be moved below
where group->default_domain gets set?


Also fails for 01:00.4:

[   37.212448] pci :01:00.4: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating 
through mappings
[   37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type 
is direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving 
func
[   37.231648] pci :01:00.4: iommu_group_add_device: calling 
__iommu_attach_device
[   37.234194] pci :01:00.4: Adding to iommu group 25
[   37.236192] pci :01:00.4: DMAR: domain->type is dma
[   37.237958] pci :01:00.4: DMAR: device default domain type is identity. 
requesting identity domain
[   37.241061] pci :01:00.4: don't change mappings of existing d37.489870] 
pci :01:00.4: DMAR: Device uses a private identity domain.

There is an RMRR for 0xbddde000-0xefff:

[63Ah 1594   2]Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596   2]   Length : 0036

[63Eh 1598   2] Reserved : 
[640h 1600   2]   PCI Segment Number : 
[642h 1602   8] Base Address : BDDDE000
[64Ah 1610   8]  End Address (limit) : BDDDEFFF

[652h 1618   1]Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1] Entry Length : 0A
[654h 1620   2] Reserved : 
[656h 1622   1]   Enumeration ID : 00
[657h 1623   1]   PCI Bus Number : 00

[658h 1624   2] PCI Path : 1C,07

[65Ah 1626   2] PCI Path : 00,00


[65Ch 1628   1]Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1] Entry Length : 0A
[65Eh 1630   2] Reserved : 
[660h 1632   1]   Enumeration ID : 00
[661h 1633   1]   PCI Bus Number : 00

[662h 1634   2] PCI Path : 1C,07

[664h 1636   2] PCI Path : 00,02


[666h 1638   1]Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1] Entry Length : 0A
[668h 1640   2] Reserved : 
[66Ah 1642   1]   Enumeration ID : 00
[66Bh 1643   1]   PCI Bus Number : 00

[66Ch 1644   2] PCI Path : 1C,07

[66Eh 1646   2] PCI Path : 00,04



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Mon Dec 09 19, Jerry Snitselaar wrote:
[snip]


A call to iommu_map is failing.

[   36.686881] pci :01:00.2: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   36.689843] pci :01:00.2: iommu_group_create_direct_mappings: iterating 
through mappings
[   36.692757] pci :01:00.2: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   36.695526] pci :01:00.2: e_direct_mappings: entry type is direct
[   37.198053] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.201357] pci :01:00.2: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.203973] pci :01:00.2: iommu_group_create_direct_mappings: leaving 
func
[   37.206385] pci :01:00.2: iommu_group_add_device: calling 
__iommu_attach_device
[   37.208950] pci :01:00.2: Adding to iommu group 25
[   37.210660] pci :01:00.2: DMAR: domain->type is dma



It bails at the dmar_domain->flags & DOMAIN_FLAG_LOSE_CHILDREN check
at the beginning of intel_iommu_map.  I will verify, but it looks like
that is getting set when intel_iommu_add_device is called for 01:00.1.
request_default_domain_for_dev for 01:00.1 will return -EBUSY because
iommu_group_device_count(group) != 1.


Also fails for 01:00.4:

[   37.212448] pci :01:00.4: iommu_group_add_device: calling 
iommu_group_create_direct_mappings
[   37.215382] pci :01:00.4: iommu_group_create_direct_mappings: iterating 
through mappings
[   37.218170] pci :01:00.4: iommu_group_create_direct_mappings: calling 
apply_resv_region
[   37.220933] pci :01:00.4: iommu_group_create_direct_mappings: entry type 
is direct-relaxable
[   37.223932] iommu: iommu_map: ops->map failed iova 0xbddde000 pa 
0xbddde000 pgsize 0x1000
[   37.226857] pci :01:00.4: iommu_group_create_direct_mappings: iommu_map 
failed
[   37.229300] pci :01:00.4: iommu_group_create_direct_mappings: leaving 
func
[   37.231648] pci :01:00.4: iommu_group_add_device: calling 
__iommu_attach_device
[   37.234194] pci :01:00.4: Adding to iommu group 25
[   37.236192] pci :01:00.4: DMAR: domain->type is dma
[   37.237958] pci :01:00.4: DMAR: device default domain type is identity. 
requesting identity domain
[   37.241061] pci :01:00.4: don't change mappings of existing d37.489870] 
pci :01:00.4: DMAR: Device uses a private identity domain.

There is an RMRR for 0xbddde000-0xefff:

[63Ah 1594   2]Subtable Type : 0001 [Reserved Memory Region]
[63Ch 1596   2]   Length : 0036

[63Eh 1598   2] Reserved : 
[640h 1600   2]   PCI Segment Number : 
[642h 1602   8] Base Address : BDDDE000
[64Ah 1610   8]  End Address (limit) : BDDDEFFF

[652h 1618   1]Device Scope Type : 01 [PCI Endpoint Device]
[653h 1619   1] Entry Length : 0A
[654h 1620   2] Reserved : 
[656h 1622   1]   Enumeration ID : 00
[657h 1623   1]   PCI Bus Number : 00

[658h 1624   2] PCI Path : 1C,07

[65Ah 1626   2] PCI Path : 00,00


[65Ch 1628   1]Device Scope Type : 01 [PCI Endpoint Device]
[65Dh 1629   1] Entry Length : 0A
[65Eh 1630   2] Reserved : 
[660h 1632   1]   Enumeration ID : 00
[661h 1633   1]   PCI Bus Number : 00

[662h 1634   2] PCI Path : 1C,07

[664h 1636   2] PCI Path : 00,02


[666h 1638   1]Device Scope Type : 01 [PCI Endpoint Device]
[667h 1639   1] Entry Length : 0A
[668h 1640   2] Reserved : 
[66Ah 1642   1]   Enumeration ID : 00
[66Bh 1643   1]   PCI Bus Number : 00

[66Ch 1644   2] PCI Path : 1C,07

[66Eh 1646   2] PCI Path : 00,04



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Tue Dec 10 19, Lu Baolu wrote:

Hi,

On 12/10/19 8:52 AM, Jerry Snitselaar wrote:

On Sun Dec 08 19, Lu Baolu wrote:

Hi,

On 12/7/19 10:41 AM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Jerry Snitselaar wrote:

On Sat Dec 07 19, Lu Baolu wrote:

Hi Jerry,

On 12/6/19 3:24 PM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Lu Baolu wrote:
[snip]


Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
			ret = iommu_map(domain, addr, addr, pg_size, entry->prot);
			if (ret)
				goto out;
+
+			dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, group->id);
		}
	}

I suspect that device 01.00.2 is not in the device scope of

[    4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 
0x00bdf7efff


By the way, does device 01.00.2 work well after binding the driver?



When I boot it with passthrough it doesn't get to a point where I can
login. I think the serial console on these systems is tied 
to the ilo,

so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have 
more time now

to focus on debugging this.

Attaching console output for the above patch.


It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access
the address within it, hence faults generated.

You can check it with ACPI/DMAR table.

Best regards,
baolu



I believe it is the 3rd endpoint device entry in dmar data below.
So question about request_default_domain_for_dev. Since a dma mapping
is already done for 1.00.0, and that sets the default_domain for the
group (I think), won't it bail out for 1.00.2 at this check?

if (group->default_domain && group->default_domain->type == type)
    goto out;



Or I guess request_default_domain_for_dev wouldn't even be 
called for 1.00.2.

In intel_iommu_add_device it wouldn't even call one of the request
functions with 1.00.2 since domain->type would be dma from 
1.00.0, and device_def_domain_type

should return dma.


Can you please add some debug messages and check what really happens
here?

Best regards,
baolu



[   25.000544] pci :01:00.0: Adding to iommu group 25
[   25.502243] pci :01:00.0: DMAR: domain->type is identity  << 
intel_iommu_add_device (alloced in iommu_group_get_for_dev)
[   25.504239] pci :01:00.0: DMAR: device default domain type is 
dma. requesting dma domain  << intel_iommu_add_device
[   25.507954] pci :01:00.0: Using iommu dma mapping    << 
request_default_domain_for_dev  (now default domain for group is 
dma)

[   25.509765] pci :01:00.1: Adding to iommu group 25
[   25.511514] pci :01:00.1: DMAR: domain->type is dma  << 
intel_iommu_add_device
[   25.513263] pci :01:00.1: DMAR: device default domain type is 
identity. requesting identity domain  << intel_iommu_add_device
[   25.516435] pci :01:00.1: don't change mappings of existing 
devices.    << request_default_domain_for_dev
[   25.518669] pci :01:00.1: DMAR: Device uses a private 
identity domain.  << intel_iommu_add_device

[   25.521061] pci :01:00.2: Adding to iommu group 25
[   25.522791] pci :01:00.2: DMAR: domain->type is dma  << 
intel_iommu_add_device

[   25.524706] pci :01:00.4: Adding to iommu group 25
[   25.526458] pci :01:00.4: DMAR: domain->type is dma  << 
intel_iommu_add_device
[   25.528213] pci :01:00.4: DMAR: device default domain type is 
identity. requesting identity domain  << intel_iommu_add_device
[   25.531284] pci :01:00.4: don't change mappings of existing 
devices.    << request_default_domain_for_dev
[   25.533500] pci :01:00.4: DMAR: Device uses a private 
identity domain.  << intel_iommu_add_device


So the domain type is dma after 01:00.0 gets added, and when
intel_iommu_add_device is called for 01:00.2 it will go into the if
section. Since the device default domain type for 01:00.2 is dma
nothing happens in there, and it goes on to 01:00.4. Is the "private
identity domain" message really accurate since everyone will use
si_domain? Adding some more debugging.


The facts that we have seen:

1) 01.00.2 uses the default domain in group 25. The domain type of this
  default domain is DMA.

2) iommu_group_create_direct_mappings() *should* be called when adding
  01.00.2 into group 25. As the result, RMRR for this device *should*
  be identity mapped.

3) By checking DMAR table, RMRR (0x00bdf6f000 ~ 0

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-09 Thread Jerry Snitselaar

On Sun Dec 08 19, Lu Baolu wrote:

Hi,

On 12/7/19 10:41 AM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Jerry Snitselaar wrote:

On Sat Dec 07 19, Lu Baolu wrote:

Hi Jerry,

On 12/6/19 3:24 PM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Lu Baolu wrote:
[snip]


Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
			ret = iommu_map(domain, addr, addr, pg_size, entry->prot);
			if (ret)
				goto out;
+
+			dev_info(dev, "Setting identity map [0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, group->id);
		}
	}

I suspect that device 01.00.2 is not in the device scope of

[    4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff

By the way, does device 01.00.2 work well after binding the driver?



When I boot it with passthrough it doesn't get to a point where I can
login. I think the serial console on these systems is tied to the ilo,
so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have more time now
to focus on debugging this.

Attaching console output for the above patch.


It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access
an address within it, hence the faults are generated.

You can check it with ACPI/DMAR table.

Best regards,
baolu



I believe it is the 3rd endpoint device entry in dmar data below.
So question about request_default_domain_for_dev. Since a dma mapping
is already done for 1.00.0, and that sets the default_domain for the
group (I think), won't it bail out for 1.00.2 at this check?

if (group->default_domain && group->default_domain->type == type)
    goto out;



Or I guess request_default_domain_for_dev wouldn't even be called
for 1.00.2.

In intel_iommu_add_device it wouldn't even call one of the request
functions for 1.00.2, since domain->type would be dma from 1.00.0
and device_def_domain_type should return dma.


Can you please add some debug messages and check what really happens
here?

Best regards,
baolu



[   25.000544] pci :01:00.0: Adding to iommu group 25
[   25.502243] pci :01:00.0: DMAR: domain->type is identity  << 
intel_iommu_add_device (alloced in iommu_group_get_for_dev)
[   25.504239] pci :01:00.0: DMAR: device default domain type is dma. requesting 
dma domain  << intel_iommu_add_device
[   25.507954] pci :01:00.0: Using iommu dma mapping<< 
request_default_domain_for_dev  (now default domain for group is dma)
[   25.509765] pci :01:00.1: Adding to iommu group 25
[   25.511514] pci :01:00.1: DMAR: domain->type is dma  << 
intel_iommu_add_device
[   25.513263] pci :01:00.1: DMAR: device default domain type is identity. 
requesting identity domain  << intel_iommu_add_device
[   25.516435] pci :01:00.1: don't change mappings of existing devices.
<< request_default_domain_for_dev
[   25.518669] pci :01:00.1: DMAR: Device uses a private identity domain.  
<< intel_iommu_add_device
[   25.521061] pci :01:00.2: Adding to iommu group 25
[   25.522791] pci :01:00.2: DMAR: domain->type is dma  << 
intel_iommu_add_device
[   25.524706] pci :01:00.4: Adding to iommu group 25
[   25.526458] pci :01:00.4: DMAR: domain->type is dma  << 
intel_iommu_add_device
[   25.528213] pci :01:00.4: DMAR: device default domain type is identity. 
requesting identity domain  << intel_iommu_add_device
[   25.531284] pci :01:00.4: don't change mappings of existing devices.
<< request_default_domain_for_dev
[   25.533500] pci :01:00.4: DMAR: Device uses a private identity domain.  
<< intel_iommu_add_device

So the domain type is dma after 01:00.0 gets added, and when
intel_iommu_add_device is called for 01:00.2 it will go into the if
section. Since the device default domain type for 01:00.2 is dma
nothing happens in there, and it goes on to 01:00.4. Is the "private
identity domain" message really accurate since everyone will use
si_domain? Adding some more debugging.

Regards,
Jerry

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-06 Thread Jerry Snitselaar

On Fri Dec 06 19, Jerry Snitselaar wrote:

On Sat Dec 07 19, Lu Baolu wrote:

Hi Jerry,

On 12/6/19 3:24 PM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Lu Baolu wrote:
[snip]


Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int 
iommu_group_create_direct_mappings(struct iommu_group *group,
   ret = iommu_map(domain, addr, addr, 
pg_size, entry->prot);

   if (ret)
   goto out;
+
+   dev_info(dev, "Setting identity map 
[0x%Lx - 0x%Lx] for group %d\n", addr, addr + pg_size, 
group->id);

   }

   }

I suspect that device 01.00.2 is not in the device scope of

[    4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff

By the way, does device 01.00.2 work well after binding the driver?



When I boot it with passthrough it doesn't get to a point where I can
login. I think the serial console on these systems is tied to the ilo,
so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have more time now
to focus on debugging this.

Attaching console output for the above patch.


It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access
an address within it, hence the faults are generated.

You can check it with ACPI/DMAR table.

Best regards,
baolu



I believe it is the 3rd endpoint device entry in dmar data below.
So question about request_default_domain_for_dev. Since a dma mapping
is already done for 1.00.0, and that sets the default_domain for the
group (I think), won't it bail out for 1.00.2 at this check?

if (group->default_domain && group->default_domain->type == type)
goto out;



Or I guess request_default_domain_for_dev wouldn't even be called for 1.00.2.
In intel_iommu_add_device it wouldn't even call one of the request
functions for 1.00.2, since domain->type would be dma from 1.00.0 and
device_def_domain_type should return dma.



output from lspci -t:

\-[:00]-+-00.0
+-01.0-[08]--
+-01.1-[14]--
+-02.0-[03]--+-00.0
|+-00.1
|+-00.2
|\-00.3
+-02.1-[15]--
+-02.2-[02]00.0
+-02.3-[16]--
+-03.0-[04-07]00.0-[05-07]--+-02.0-[06]--+-00.0
|   |\-00.1
|   \-04.0-[07]--+-00.0
|\-00.1
+-03.1-[17]--
+-03.2-[18]--
+-03.3-[19]--
+-04.0
+-04.1
+-04.2
+-04.3
+-04.4
+-04.5
+-04.6
+-04.7
+-05.0
+-05.2
+-05.4
+-11.0-[1b]--
+-1a.0
+-1c.0-[0b]--
+-1c.7-[01]--+-00.0
|+-00.1
|+-00.2
|\-00.4
+-1d.0
+-1e.0-[1a]--
+-1f.0
\-1f.2

DMAR table entries for the RMRR:

[302h 0770   2]Subtable Type : 0001 [Reserved Memory Region]
[304h 0772   2]   Length : 00CE

[306h 0774   2] Reserved : 
[308h 0776   2]   PCI Segment Number : 
[30Ah 0778   8] Base Address : BDF6F000
[312h 0786   8]  End Address (limit) : BDF7EFFF

[31Ah 0794   1]Device Scope Type : 01 [PCI Endpoint Device]
[31Bh 0795   1] Entry Length : 0A
[31Ch 0796   2] Reserved : 
[31Eh 0798   1]   Enumeration ID : 00
[31Fh 0799   1]   PCI Bus Number : 00

[320h 0800   2] PCI Path : 02,02

[322h 0802   2] PCI Path : 00,00


[324h 0804   1]Device Scope Type : 01 [PCI Endpoint Device]
[325h 0805   1] Entry Length : 0A
[326h 0806   2] Reserved : 
[328h 0808   1]   Enumeration ID : 00
[329h 0809   1]   PCI Bus Number : 00

[32Ah 0810   2] PCI Path : 1C,07

[32Ch 0812   2] PCI Path : 00,00


[32Eh 0814   1]Device Scope Type : 01 [PCI Endpoint Device]
[32Fh 0815   1] Entry Length : 0A
[330h 0816   2] Reserved : 
[332h 0818   1]   Enumeration ID : 00
[333h 0819   1]   PCI Bus Number : 00

[334h 0820   2] PCI Path : 1C,07

[336h 0822   2] PCI Path : 00,02

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-06 Thread Jerry Snitselaar

On Sat Dec 07 19, Lu Baolu wrote:

Hi Jerry,

On 12/6/19 3:24 PM, Jerry Snitselaar wrote:

On Fri Dec 06 19, Lu Baolu wrote:
[snip]


Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int 
iommu_group_create_direct_mappings(struct iommu_group *group,
   ret = iommu_map(domain, addr, addr, 
pg_size, entry->prot);

   if (ret)
   goto out;
+
+   dev_info(dev, "Setting identity map [0x%Lx 
- 0x%Lx] for group %d\n", addr, addr + pg_size, group->id);

   }

   }

I suspect that device 01.00.2 is not in the device scope of

[    4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff

By the way, does device 01.00.2 work well after binding the driver?



When I boot it with passthrough it doesn't get to a point where I can
login. I think the serial console on these systems is tied to the ilo,
so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have more time now
to focus on debugging this.

Attaching console output for the above patch.


It seems that device 01.00.2 isn't in the scope of RMRR [base:
0x00bdf6f000 end: 0x00bdf7efff]. But it still tries to access
an address within it, hence the faults are generated.

You can check it with ACPI/DMAR table.

Best regards,
baolu



I believe it is the 3rd endpoint device entry in dmar data below.
So question about request_default_domain_for_dev. Since a dma mapping
is already done for 1.00.0, and that sets the default_domain for the
group (I think), won't it bail out for 1.00.2 at this check?

if (group->default_domain && group->default_domain->type == type)
goto out;


output from lspci -t:

 \-[:00]-+-00.0
 +-01.0-[08]--
 +-01.1-[14]--
 +-02.0-[03]--+-00.0
 |+-00.1
 |+-00.2
 |\-00.3
 +-02.1-[15]--
 +-02.2-[02]00.0
 +-02.3-[16]--
 +-03.0-[04-07]00.0-[05-07]--+-02.0-[06]--+-00.0
 |   |\-00.1
 |   \-04.0-[07]--+-00.0
 |\-00.1
 +-03.1-[17]--
 +-03.2-[18]--
 +-03.3-[19]--
 +-04.0
 +-04.1
 +-04.2
 +-04.3
 +-04.4
 +-04.5
 +-04.6
 +-04.7
 +-05.0
 +-05.2
 +-05.4
 +-11.0-[1b]--
 +-1a.0
 +-1c.0-[0b]--
 +-1c.7-[01]--+-00.0
 |+-00.1
 |+-00.2
 |\-00.4
 +-1d.0
 +-1e.0-[1a]--
 +-1f.0
 \-1f.2

DMAR table entries for the RMRR:

[302h 0770   2]Subtable Type : 0001 [Reserved Memory Region]
[304h 0772   2]   Length : 00CE

[306h 0774   2] Reserved : 
[308h 0776   2]   PCI Segment Number : 
[30Ah 0778   8] Base Address : BDF6F000
[312h 0786   8]  End Address (limit) : BDF7EFFF

[31Ah 0794   1]Device Scope Type : 01 [PCI Endpoint Device]
[31Bh 0795   1] Entry Length : 0A
[31Ch 0796   2] Reserved : 
[31Eh 0798   1]   Enumeration ID : 00
[31Fh 0799   1]   PCI Bus Number : 00

[320h 0800   2] PCI Path : 02,02

[322h 0802   2] PCI Path : 00,00


[324h 0804   1]Device Scope Type : 01 [PCI Endpoint Device]
[325h 0805   1] Entry Length : 0A
[326h 0806   2] Reserved : 
[328h 0808   1]   Enumeration ID : 00
[329h 0809   1]   PCI Bus Number : 00

[32Ah 0810   2] PCI Path : 1C,07

[32Ch 0812   2] PCI Path : 00,00


[32Eh 0814   1]Device Scope Type : 01 [PCI Endpoint Device]
[32Fh 0815   1] Entry Length : 0A
[330h 0816   2] Reserved : 
[332h 0818   1]   Enumeration ID : 00
[333h 0819   1]   PCI Bus Number : 00

[334h 0820   2] PCI Path : 1C,07

[336h 0822   2] PCI Path : 00,02


[338h 0824   1]Device Scope Type : 01 [PCI Endpoint Device]
[339h 0825   1] Entry Length : 08
[33Ah 0826   2] Reserved : 
[33Ch 0828   1]   Enumeration ID : 00
[33Dh 0829   1]   PC

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-05 Thread Jerry Snitselaar

On Fri Dec 06 19, Lu Baolu wrote:
[snip]


Can you please try the change below? Let's check whether the offending
address has been mapped for device 01.00.2.

$ git diff
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index db7bfd4f2d20..d9daf66be849 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -663,6 +663,8 @@ static int 
iommu_group_create_direct_mappings(struct iommu_group *group,
   ret = iommu_map(domain, addr, addr, pg_size, 
entry->prot);

   if (ret)
   goto out;
+
+   dev_info(dev, "Setting identity map [0x%Lx - 
0x%Lx] for group %d\n", addr, addr + pg_size, group->id);

   }

   }

I suspect that device 01.00.2 is not in the device scope of

[4.485108] DMAR: RMRR base: 0x00bdf6f000 end: 0x00bdf7efff

By the way, does device 01.00.2 work well after binding the driver?



When I boot it with passthrough it doesn't get to a point where I can
login. I think the serial console on these systems is tied to the ilo,
so the conserver connection could be making things
worse. Unfortunately the system is remote. I should have more time now
to focus on debugging this.

Attaching console output for the above patch.

Regards,
Jerry


Best regards,
baolu

[   21.833145] DMAR: dmar0: Using Queued invalidation
[   21.834960] DMAR: dmar1: Using Queued invalidation
[   21.837860] pci :00:00.0: Adding to iommu group 0
[   21.839733] pci :00:01.0: Adding to iommu group 1
[   21.841601] pci :00:01.1: Adding to iommu group 2
[   21.843410] pci :00:02.0: Adding to iommu group 3
[   21.845652] pci :00:02.1: Adding to iommu group 4
[   21.847473] pci :00:02.2: Adding to iommu group 5
[   21.849299] pci :00:02.3: Adding to iommu group 6
[   21.851163] pci :00:03.0: Adding to iommu group 7
[   21.852918] pci :00:03.1: Adding to iommu group 8
[   21.854720] pci :00:03.2: Adding to iommu group 9
[   21.856761] pci :00:03.3: Adding to iommu group 10
[   21.858998] pci :00:04.0: Adding to iommu group 11
[   21.860937] pci :00:04.1: Adding to iommu group 11
[   21.863149] pci :00:04.2: Adding to iommu group 11
[   21.864941] pci :00:04.3: Adding to iommu group 11
[   21.866744] pci :00:04.4: Adding to iommu group 11
[   21.868567] pci :00:04.5: Adding to iommu group 11
[   21.870458] pci :00:04.6: Adding to iommu group 11
[   21.872254] pci :00:04.7: Adding to iommu group 11
[   21.874231] pci :00:05.0: Adding to iommu group 12
[   21.876047] pci :00:05.2: Adding to iommu group 12
[   21.877908] pci :00:05.4: Adding to iommu group 12
[   21.879814] pci :00:11.0: Adding to iommu group 13
[   21.881684] pci :00:1a.0: Adding to iommu group 14
[   21.883516] pci :00:1c.0: Adding to iommu group 15
[   21.885324] pci :00:1c.7: Adding to iommu group 16
[   21.887116] pci :00:1d.0: Adding to iommu group 17
[   21.888907] pci :00:1e.0: Adding to iommu group 18
[   21.891913] pci :00:1e.0: Using iommu dma mapping
[   21.894001] pci :00:1f.0: Adding to iommu group 19
[   21.895979] pci :00:1f.2: Adding to iommu group 19
[   21.898870] pci :00:1f.2: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   21.901428] pci :00:1f.2: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   21.903790] pci :00:1f.2: DMAR: Setting identity map [0xbdf6e000 - 
0xbdf6efff]
[   21.906347] pci :00:1f.2: DMAR: Setting identity map [0xbdf6f000 - 
0xbdf7efff]
[   21.908885] pci :00:1f.2: DMAR: Setting identity map [0xbdf7f000 - 
0xbdf82fff]
[   21.911620] pci :00:1f.2: DMAR: Setting identity map [0xbdf83000 - 
0xbdf84fff]
[   21.914141] pci :00:1f.2: DMAR: Device uses a private dma domain.
[   21.916690] pci :03:00.0: Adding to iommu group 20
[   21.919589] pci :03:00.0: Using iommu dma mapping
[   21.921708] pci :03:00.1: Setting identity map [0xe8000 - 0xe9000] for 
group 20
[   21.924309] pci :03:00.1: Setting identity map [0xf4000 - 0xf5000] for 
group 20
[   21.926854] pci :03:00.1: Setting identity map [0xbdf6e000 - 0xbdf6f000] 
for group 20
[   21.929560] pci :03:00.1: Setting identity map [0xbdf6f000 - 0xbdf70000] 
for group 20
[   21.932366] pci :03:00.1: Setting identity map [0xbdf70000 - 0xbdf71000] 
for group 20
[   22.035075] pci :03:00.1: Setting identity map [0xbdf71000 - 0xbdf72000] for group 20
[   22.437744] pci :03:00.1: Setting identity map [0xbdf72000 - 0xbdf73000] 
[   22.540574] pci :03:00.1: Setting identity map [0xbdf73000 - 0xbdf74000] 
for group 20
[   22.543522] pci :03:00.1: Setting identity map [0xbdf74000 - 0xbdf75000] 
for group 20
[   22.546270] pci :03:00.1: Setting identity map [0xbdf75000 - 0xbdf76000] 
for group 20
[   22.548969] pci :03:00.1: Setting identity map [0xbdf76000 - 0xbdf77000] 
for group 20
[   22.551801] pci :03:00.1: Setting identity map [0xbdf77000 - 0xbdf78000] 
for 

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Jerry Snitselaar

On Thu Dec 05 19, Lu Baolu wrote:

Hi,

On 12/5/19 10:25 AM, Jerry Snitselaar wrote:


It seems that the iommu pci bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when the devices access the memory.

Do these two devices show in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu



They are there in the output, but it seems out of order:



[   23.446201] pci :01:00.0: Adding to iommu group 25
[   23.448949] pci :01:00.0: Using iommu dma mapping
[   23.450807] pci :01:00.1: Adding to iommu group 25
[   23.452666] pci :01:00.1: DMAR: Device uses a private identity 
domain.

[   23.455063] pci :01:00.2: Adding to iommu group 25
[   23.456881] pci :01:00.4: Adding to iommu group 25
[   23.458693] pci :01:00.4: DMAR: Device uses a private identity 
domain.


Oh, yes!

So device 01:00.0 01:00.1 01:00.2 01:00.4 share a single group. The
default domain for this group has been set to DMA although iommu=pt has
been set. As a result, .0 and .2 use DMA, but .1 and .4 use IDENTITY. This is
not a valid configuration since all devices in a group should use the same
domain.

Do you mind posting the "lspci -vvv" output of these devices? I want to
figure out why these devices request different domain types.

Best regards,
baolu



01:00.0 System peripheral: Hewlett-Packard Company Integrated Lights-Out Standard 
Slave Instrumentation & System Support (rev 05)
Subsystem: Hewlett-Packard Company iLO4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
[rest of the lspci -vvv output for 01:00.1, 01:00.2, and 01:00.4 lost to archive mangling]


Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-04 Thread Jerry Snitselaar

On Thu Dec 05 19, Lu Baolu wrote:

Hi,

On 12/5/19 4:53 AM, Jerry Snitselaar wrote:

Attaching console output (can't get to a point to actually log in) and
config that is used to build that kernel.


[...]
[   21.969477] pci :00:00.0: Adding to iommu group 0
[   21.971390] pci :00:01.0: Adding to iommu group 1
[   21.973173] pci :00:01.1: Adding to iommu group 2
[   21.974930] pci :00:02.0: Adding to iommu group 3
[   21.976672] pci :00:02.1: Adding to iommu group 4
[   21.978446] pci :00:02.2: Adding to iommu group 5
[   21.980224] pci :00:02.3: Adding to iommu group 6
[   21.982096] pci :00:03.0: Adding to iommu group 7
[   21.983868] pci :00:03.1: Adding to iommu group 8
[   21.985644] pci :00:03.2: Adding to iommu group 9
[   21.987484] pci :00:03.3: Adding to iommu group 10
[   21.989830] pci :00:04.0: Adding to iommu group 11
[   21.991738] pci :00:04.1: Adding to iommu group 11
[   21.993557] pci :00:04.2: Adding to iommu group 11
[   21.995360] pci :00:04.3: Adding to iommu group 11
[   21.997145] pci :00:04.4: Adding to iommu group 11
[   21.998915] pci :00:04.5: Adding to iommu group 11
[   22.000694] pci :00:04.6: Adding to iommu group 11
[   22.002569] pci :00:04.7: Adding to iommu group 11
[   22.004556] pci :00:05.0: Adding to iommu group 12
[   22.006388] pci :00:05.2: Adding to iommu group 12
[   22.008186] pci :00:05.4: Adding to iommu group 12
[   22.009968] pci :00:11.0: Adding to iommu group 13
[   22.011815] pci :00:1a.0: Adding to iommu group 14
[   22.013605] pci :00:1c.0: Adding to iommu group 15
[   22.015408] pci :00:1c.7: Adding to iommu group 16
[   22.017216] pci :00:1d.0: Adding to iommu group 17
[   22.018991] pci :00:1e.0: Adding to iommu group 18
[   22.021826] pci :00:1e.0: Using iommu dma mapping
[   22.023783] pci :00:1f.0: Adding to iommu group 19
[   22.025667] pci :00:1f.2: Adding to iommu group 19
[   22.346001] pci :03:00.0: Adding to iommu group 20
[   22.348727] pci :03:00.0: Using iommu dma mapping
[   22.350644] pci :03:00.1: Adding to iommu group 20
[   22.352833] pci :03:00.2: Adding to iommu group 20
[...]

It seems that the iommu pci bus probe didn't enumerate devices [01:00.2] and
[02:00.0], so the corresponding context entries were not set up. Hence DMA
faults are generated when the devices access the memory.

Do these two devices show in "lspci" output? How do these devices get
enumerated by the system?

Best regards,
baolu



They are there in the output, but it seems out of order:

[   22.025667] pci :00:1f.2: Adding to iommu group 19
[   22.028569] pci :00:1f.2: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.331183] pci :00:1f.2: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.333546] pci :00:1f.2: DMAR: Setting identity map [0xbdf6e000 - 
0xbdf6efff]
[   22.336099] pci :00:1f.2: DMAR: Setting identity map [0xbdf6f000 - 
0xbdf7efff]
[   22.338604] pci :00:1f.2: DMAR: Setting identity map [0xbdf7f000 - 
0xbdf82fff]
[   22.341189] pci :00:1f.2: DMAR: Setting identity map [0xbdf83000 - 
0xbdf84fff]
[   22.343700] pci :00:1f.2: DMAR: Device uses a private dma domain.
[   22.346001] pci :03:00.0: Adding to iommu group 20
[   22.348727] pci :03:00.0: Using iommu dma mapping
[   22.350644] pci :03:00.1: Adding to iommu group 20
[   22.352833] pci :03:00.2: Adding to iommu group 20
[   22.354619] pci :03:00.3: Adding to iommu group 20


[   22.356423] pci :02:00.0: Adding to iommu group 21
[   22.358999] pci :02:00.0: Using iommu dma mapping


[   22.360785] pci :04:00.0: Adding to iommu group 22
[   22.362623] pci :05:02.0: Adding to iommu group 23
[   22.364412] pci :05:04.0: Adding to iommu group 24
[   22.366172] pci :06:00.0: Adding to iommu group 23
[   22.368762] pci :06:00.0: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.371290] pci :06:00.0: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.373646] pci :06:00.0: DMAR: Setting identity map [0xbdf6e000 - 0xbdf6efff]
[   22.876042] pci :06:00.0: DMAR: Setting identity map [0xbdf6f000 - 
0xbdf7efff]
[   22.878572] pci :06:00.0: DMAR: Setting identity map [0xbdf7f000 - 
0xbdf82fff]
[   22.881167] pci :06:00.0: DMAR: Setting identity map [0xbdf83000 - 
0xbdf84fff]
[   22.883729] pci :06:00.0: DMAR: Device uses a private dma domain.
[   22.885899] pci :06:00.1: Adding to iommu group 23
[   22.888675] pci :06:00.1: DMAR: Setting identity map [0xe8000 - 0xe8fff]
[   22.891216] pci :06:00.1: DMAR: Setting identity map [0xf4000 - 0xf4fff]
[   22.893576] pci :06:00.1: DMAR: Setting identity map [0xbdf6e000 - 
0xbdf6efff]
[   22.896119] pci :06:00.1: DMAR: Setting identity map [0xbdf6f000 - 
0xbdf7efff]
[   22.898620] pci :06:00.1: DMAR: Setting identity map [0xbdf7f000 - 
0xbdf82fff]
[   22.901232] pci :06:00.1: DMAR: Setting identity map [0xbdf83000 - 

Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-03 Thread Jerry Snitselaar

On Tue Dec 03 19, Lu Baolu wrote:

Hi,

On 12/3/19 12:13 AM, Jerry Snitselaar wrote:

On Mon Dec 02 19, Jerry Snitselaar wrote:

On Mon Dec 02 19, Lu Baolu wrote:

Hi,

On 12/2/19 2:34 PM, Jerry Snitselaar wrote:

We are seeing DMAR PTE read access not set errors when booting a
kernel with default passthrough, both with a test kernel and with
a 5.4.0 kernel. Previously we would see a number of identity mappings
being set related to the rmrrs, and now they aren't seen and we get
the dmar pte errors as devices touch those regions. From what 
I can tell

currently df4f3c603aeb ("iommu/vt-d: Remove static identity map code")
removed the bit of code in init_dmars that used to set up those
mappings:

-   /*
-    * For each rmrr
-    *   for each dev attached to rmrr
-    *   do
-    * locate drhd for dev, alloc domain for dev
-    * allocate free domain
-    * allocate page table entries for rmrr
-    * if context not allocated for bus
-    *   allocate and init context
-    *   set present in root table for this bus
-    * init context with domain, translation etc
-    *    endfor
-    * endfor
-    */
-   pr_info("Setting RMRR:\n");
-   for_each_rmrr_units(rmrr) {
-   /* some BIOS lists non-exist devices in DMAR table. */
-   for_each_active_dev_scope(rmrr->devices, 
rmrr->devices_cnt,

- i, dev) {
-   ret = iommu_prepare_rmrr_dev(rmrr, dev);
-   if (ret)
-   pr_err("Mapping reserved 
region failed\n");

-   }
-   }

si_domain_init now has code that sets identity maps for 
devices in rmrrs, but

only for certain devices.


On which device are you seeing this error? Is it an RMRR-locked device?

Best regards,
baolu



Almost all of the messages are for the ilo, but there also is a 
message for

the smart array raid bus controller.



Also seeing it with a dl380 gen9 system, where the raid bus controller
is getting the error.


Does it help if you remove

   if (device_is_rmrr_locked(dev))
   continue;

in si_domain_init()?



Unfortunately it still spits out a bunch of error messages.


Best regards,
baolu




Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-02 Thread Jerry Snitselaar

On Mon Dec 02 19, Jerry Snitselaar wrote:

On Mon Dec 02 19, Lu Baolu wrote:

Hi,

On 12/2/19 2:34 PM, Jerry Snitselaar wrote:

We are seeing DMAR PTE read access not set errors when booting a
kernel with default passthrough, both with a test kernel and with
a 5.4.0 kernel. Previously we would see a number of identity mappings
being set related to the rmrrs, and now they aren't seen and we get
the dmar pte errors as devices touch those regions. From what I can tell
currently df4f3c603aeb ("iommu/vt-d: Remove static identity map code")
removed the bit of code in init_dmars that used to set up those
mappings:

-   /*
-    * For each rmrr
-    *   for each dev attached to rmrr
-    *   do
-    * locate drhd for dev, alloc domain for dev
-    * allocate free domain
-    * allocate page table entries for rmrr
-    * if context not allocated for bus
-    *   allocate and init context
-    *   set present in root table for this bus
-    * init context with domain, translation etc
-    *    endfor
-    * endfor
-    */
-   pr_info("Setting RMRR:\n");
-   for_each_rmrr_units(rmrr) {
-   /* some BIOS lists non-exist devices in DMAR table. */
-   for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, dev) {
-   ret = iommu_prepare_rmrr_dev(rmrr, dev);
-   if (ret)
-   pr_err("Mapping reserved region failed\n");
-   }
-   }

si_domain_init now has code that sets identity maps for devices in 
rmrrs, but

only for certain devices.


On which device are you seeing this error? Is it an RMRR-locked device?

Best regards,
baolu



Almost all of the messages are for the ilo, but there also is a message for
the smart array raid bus controller.



Also seeing it with a dl380 gen9 system, where the raid bus controller
is getting the error.



With iommu=nopt, the system boots up without issue.






Re: dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-01 Thread Jerry Snitselaar

On Mon Dec 02 19, Lu Baolu wrote:

Hi,

On 12/2/19 2:34 PM, Jerry Snitselaar wrote:

We are seeing DMAR PTE read access not set errors when booting a
kernel with default passthrough, both with a test kernel and with
a 5.4.0 kernel. Previously we would see a number of identity mappings
being set related to the rmrrs, and now they aren't seen and we get
the dmar pte errors as devices touch those regions. From what I can tell
currently df4f3c603aeb ("iommu/vt-d: Remove static identity map code")
removed the bit of code in init_dmars that used to set up those
mappings:

-   /*
-    * For each rmrr
-    *   for each dev attached to rmrr
-    *   do
-    * locate drhd for dev, alloc domain for dev
-    * allocate free domain
-    * allocate page table entries for rmrr
-    * if context not allocated for bus
-    *   allocate and init context
-    *   set present in root table for this bus
-    * init context with domain, translation etc
-    *    endfor
-    * endfor
-    */
-   pr_info("Setting RMRR:\n");
-   for_each_rmrr_units(rmrr) {
-   /* some BIOS lists non-exist devices in DMAR table. */
-   for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, dev) {
-   ret = iommu_prepare_rmrr_dev(rmrr, dev);
-   if (ret)
-   pr_err("Mapping reserved region failed\n");
-   }
-   }

si_domain_init now has code that sets identity maps for devices in 
rmrrs, but

only for certain devices.


On which device are you seeing this error? Is it an RMRR-locked device?

Best regards,
baolu



Almost all of the messages are for the ilo, but there also is a message for
the smart array raid bus controller.



With iommu=nopt, the system boots up without issue.






dmar pte read access not set error messages on hp dl388 gen8 systems

2019-12-01 Thread Jerry Snitselaar

We are seeing DMAR PTE read access not set errors when booting a
kernel with default passthrough, both with a test kernel and with
a 5.4.0 kernel. Previously we would see a number of identity mappings
being set related to the rmrrs, and now they aren't seen and we get
the dmar pte errors as devices touch those regions. From what I can tell
currently df4f3c603aeb ("iommu/vt-d: Remove static identity map code")
removed the bit of code in init_dmars that used to set up those
mappings:

-   /*
-* For each rmrr
-*   for each dev attached to rmrr
-*   do
-* locate drhd for dev, alloc domain for dev
-* allocate free domain
-* allocate page table entries for rmrr
-* if context not allocated for bus
-*   allocate and init context
-*   set present in root table for this bus
-* init context with domain, translation etc
-*endfor
-* endfor
-*/
-   pr_info("Setting RMRR:\n");
-   for_each_rmrr_units(rmrr) {
-   /* some BIOS lists non-exist devices in DMAR table. */
-   for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, dev) {
-   ret = iommu_prepare_rmrr_dev(rmrr, dev);
-   if (ret)
-   pr_err("Mapping reserved region failed\n");
-   }
-   }

si_domain_init now has code that sets identity maps for devices in rmrrs, but
only for certain devices.

With iommu=nopt, the system boots up without issue.



Re: [PATCH v3] iommu: fix KASAN use-after-free in iommu_insert_resv_region

2019-11-26 Thread Jerry Snitselaar

On Tue Nov 26 19, Eric Auger wrote:

In case the new region gets merged into another one, the nr
list node is freed. Checking its type while completing the
merge algorithm leads to a use-after-free. Use new->type
instead.

Fixes: 4dbd258ff63e ("iommu: Revisit iommu_insert_resv_region()
implementation")
Signed-off-by: Eric Auger 
Reported-by: Qian Cai 
Cc: Stable  #v5.3+



Minor nit, but should the comment above list_for_each_entry_safe get
updated as well? Other than that, lgtm.

Reviewed-by: Jerry Snitselaar 


---

v2 -> v3:
- directly use new->type

v1 -> v2:
- remove spurious new line
---
drivers/iommu/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d658c7c6a2ab..285ad4a4c7f2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -313,7 +313,7 @@ int iommu_insert_resv_region(struct iommu_resv_region *new,
phys_addr_t top_end, iter_end = iter->start + iter->length - 1;

/* no merge needed on elements of different types than @nr */
-   if (iter->type != nr->type) {
+   if (iter->type != new->type) {
list_move_tail(&iter->list, &stack);
continue;
}
--
2.20.1






Re: [PATCH] iommu/amd: Check PM_LEVEL_SIZE() condition in locked section

2019-10-18 Thread Jerry Snitselaar

On Fri Oct 18 19, Joerg Roedel wrote:

On Thu, Oct 17, 2019 at 07:36:51AM -0400, Qian Cai wrote:



> On Oct 16, 2019, at 6:59 PM, Jerry Snitselaar  wrote:
>
> I guess the mode level 6 check is really for other potential callers
> of increase_address_space; none exist at the moment, and the condition
> of the while loop in alloc_pte should fail if the mode level is 6.

Because there is no locking around iommu_map_page(), if there are
several concurrent callers of it for the same domain, could it be that
it silently corrupt data due to invalid access?


No, that can't happen because increase_address_space locks the domain
before actually doing anything. So the address space can't grow above
domain->mode == 6. But what can happen is that the WARN_ON_ONCE triggers
in there and that the address space is increased multiple times when
only one increase would be sufficient.

To fix this we just need to check the PM_LEVEL_SIZE() condition again
when we hold the lock:

From e930e792a998e89dfd4feef15fbbf289c45124dc Mon Sep 17 00:00:00 2001
From: Joerg Roedel 
Date: Fri, 18 Oct 2019 11:34:22 +0200
Subject: [PATCH] iommu/amd: Check PM_LEVEL_SIZE() condition in locked section

The increase_address_space() function has to check the PM_LEVEL_SIZE()
condition again under the domain->lock to avoid a false trigger of the
WARN_ON_ONCE() and to avoid the address space being increased more
often than necessary.

Reported-by: Qian Cai 
Fixes: 754265bcab78 ("iommu/amd: Fix race in increase_address_space()")
Signed-off-by: Joerg Roedel 
---
drivers/iommu/amd_iommu.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2369b8af81f3..a0639e511ffe 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1463,6 +1463,7 @@ static void free_pagetable(struct protection_domain *domain)
 * to 64 bits.
 */
static bool increase_address_space(struct protection_domain *domain,
+  unsigned long address,
   gfp_t gfp)
{
unsigned long flags;
@@ -1471,8 +1472,8 @@ static bool increase_address_space(struct protection_domain *domain,

spin_lock_irqsave(&domain->lock, flags);

-   if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
-   /* address space already 64 bit large */
+   if (address <= PM_LEVEL_SIZE(domain->mode) ||
+   WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
goto out;

pte = (void *)get_zeroed_page(gfp);
@@ -1505,7 +1506,7 @@ static u64 *alloc_pte(struct protection_domain *domain,
BUG_ON(!is_power_of_2(page_size));

while (address > PM_LEVEL_SIZE(domain->mode))
-   *updated = increase_address_space(domain, gfp) || *updated;
+   *updated = increase_address_space(domain, address, gfp) || *updated;

level   = domain->mode - 1;
pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)];
--
2.16.4



Reviewed-by: Jerry Snitselaar 



Re: [PATCH] iommu/amd: Pass gfp flags to iommu_map_page() in amd_iommu_map()

2019-10-18 Thread Jerry Snitselaar

On Fri Oct 18 19, Joerg Roedel wrote:

From: Joerg Roedel 

A recent commit added a gfp parameter to amd_iommu_map() to make it
callable from atomic context, but forgot to pass it down to
iommu_map_page() and left GFP_KERNEL there. This caused
sleep-while-atomic warnings and needs to be fixed.

Reported-by: Qian Cai 
Reported-by: Dan Carpenter 
Fixes: 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map")
Signed-off-by: Joerg Roedel 
---
drivers/iommu/amd_iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 0d2479546b77..fb54df5c2e11 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2561,7 +2561,7 @@ static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;

-   ret = iommu_map_page(domain, iova, paddr, page_size, prot, GFP_KERNEL);
+   ret = iommu_map_page(domain, iova, paddr, page_size, prot, gfp);

domain_flush_np_cache(domain, iova, page_size);

--
2.16.4



Reviewed-by: Jerry Snitselaar 



Re: [PATCH -next] iommu/amd: fix a warning in increase_address_space

2019-10-16 Thread Jerry Snitselaar

On Wed Oct 16 19, Jerry Snitselaar wrote:

On Wed Oct 16 19, Qian Cai wrote:


BTW, Joerg, this line from the commit "iommu/amd: Remove domain->updated" looks
suspicious. I'm not sure what the purpose of it is.

*updated = increase_address_space(domain, gfp) || *updated;



Looking at it again, I don't think that is actually an issue; it just
avoids losing `updated` having been set in a previous loop iteration. But
now I'm wondering about the loop itself: in the cases where
increase_address_space() returns false, how does the evaluation of the
while-loop condition change?



I guess the mode level 6 check is really for other potential callers
of increase_address_space(); none exist at the moment, and the condition
of the while loop in alloc_pte() should fail if the mode level is 6.



Re: [PATCH -next] iommu/amd: fix a warning in increase_address_space

2019-10-16 Thread Jerry Snitselaar

On Wed Oct 16 19, Qian Cai wrote:

After commit 754265bcab78 ("iommu/amd: Fix race in
increase_address_space()"), it is still possible to trigger a race
condition under heavy memory pressure, as shown below. The race that
triggers the warning is:

CPU0:                           CPU1:
in alloc_pte():                 in increase_address_space():
while (address > PM_LEVEL_SIZE(domain->mode)) [1]

                                spin_lock_irqsave(&domain->lock
                                domain->mode += 1;
                                spin_unlock_irqrestore(&domain->lock

in increase_address_space():
spin_lock_irqsave(&domain->lock
if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))

[1] domain->mode = 5

It is not yet clear whether the triggering of the warning is the root
cause of the smartpqi offline, but let's fix it first by lifting the locking.

WARNING: CPU: 57 PID: 124314 at drivers/iommu/amd_iommu.c:1474
iommu_map_page+0x718/0x7e0
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec flags=0x0010]
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec1000 flags=0x0010]
CPU: 57 PID: 124314 Comm: oom01 Tainted: G   O
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40
07/10/2019
RIP: 0010:iommu_map_page+0x718/0x7e0
Code: 88 a5 70 ff ff ff e9 5d fa ff ff 48 8b b5 70 ff ff ff 4c 89 ef e8
08 32 2f 00 41 80 fc 01 0f 87 b7 3d 00 00 41 83 e4 01 eb be <0f> 0b 48
8b b5 70 ff ff ff 4c 89 ef e8 e7 31 2f 00 eb dd 0f 0b 48
RSP: 0018:888da4816cb8 EFLAGS: 00010046
RAX:  RBX: 8885fe689000 RCX: 96f4a6c4
RDX: 0007 RSI: dc00 RDI: 8885fe689124
RBP: 888da4816da8 R08: ed10bfcd120e R09: ed10bfcd120e
R10: ed10bfcd120d R11: 8885fe68906b R12: 
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec1a00 flags=0x0010]
R13: 8885fe689068 R14: 8885fe689124 R15: 
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec1e00 flags=0x0010]
FS:  7f29722ba700() GS:88902f88()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f27f82d8000 CR3: 00102ed9c000 CR4: 003406e0
Call Trace:
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec2000 flags=0x0010]
map_sg+0x1ce/0x2f0
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec2400 flags=0x0010]
scsi_dma_map+0xd7/0x160
pqi_raid_submit_scsi_cmd_with_io_request+0x1b8/0x420 [smartpqi]
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec2800 flags=0x0010]
pqi_scsi_queue_command+0x8ab/0xe00 [smartpqi]
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec2c00 flags=0x0010]
scsi_queue_rq+0xd19/0x1360
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec3000 flags=0x0010]
__blk_mq_try_issue_directly+0x295/0x3f0
smartpqi :23:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x
address=0xffec3400 flags=0x0010]
AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x
address=0xffec3800 flags=0x0010]
blk_mq_request_issue_directly+0xb5/0x100
AMD-Vi: Event logged [IO_PAGE_FAULT device=23:00.0 domain=0x
address=0xffec3c00 flags=0x0010]
blk_mq_try_issue_list_directly+0xa9/0x160
blk_mq_sched_insert_requests+0x228/0x380
blk_mq_flush_plug_list+0x448/0x7e0
blk_flush_plug_list+0x1eb/0x230
blk_finish_plug+0x43/0x5d
shrink_node_memcg+0x9c5/0x1550
smartpqi :23:00.0: controller is offline: status code 0x14803
smartpqi :23:00.0: controller offline

Fixes: 754265bcab78 ("iommu/amd: Fix race in increase_address_space()")
Signed-off-by: Qian Cai 
---

BTW, Joerg, this line from the commit "iommu/amd: Remove domain->updated" looks
suspicious. I'm not sure what the purpose of it is.

*updated = increase_address_space(domain, gfp) || *updated;



Looking at it again, I don't think that is actually an issue; it just
avoids losing `updated` having been set in a previous loop iteration. But
now I'm wondering about the loop itself: in the cases where
increase_address_space() returns false, how does the evaluation of the
while-loop condition change?


drivers/iommu/amd_iommu.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2369b8af81f3..a5754068aa29 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1465,12 +1465,9 @@ static void free_pagetable(struct protection_domain *domain)
static bool increase_address_space(struct protection_domain *domain,
   gfp_t gfp)
{
-   unsigned long flags;
bool ret = false;
u64 *pte;

-   spin_lock_irqsave(&domain->lock, flags);
-
if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))