Re: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

2019-06-18 Thread Lu Baolu

Hi Chris,

On 6/19/19 5:02 AM, Chris Wilson wrote:

Quoting Lu Baolu (2019-05-21 08:30:15)

From: Dave Jiang 

Lockdep debug reported lock inversion related with the iommu code
caused by dmar_insert_one_dev_info() grabbing the iommu->lock and
the device_domain_lock out of order versus the code path in
iommu_flush_dev_iotlb(). Expanding the scope of the iommu->lock and
reversing the order of lock acquisition fixes the issue.


Which of course violates the property that device_domain_lock is the
outer lock...


Agreed.

I also realized that this might be an incorrect fix. I am looking into
it and will submit a new fix later.

Best regards,
Lu Baolu



<4>[1.252997] ==
<4>[1.252999] WARNING: possible circular locking dependency detected
<4>[1.253002] 5.2.0-rc5-CI-CI_DRM_6299+ #1 Not tainted
<4>[1.253004] --
<4>[1.253006] swapper/0/1 is trying to acquire lock:
<4>[1.253009] 91462475 (&(&iommu->lock)->rlock){+.+.}, at: 
domain_context_mapping_one+0xa0/0x4f0
<4>[1.253015]
   but task is already holding lock:
<4>[1.253017] 69266737 (device_domain_lock){}, at: 
domain_context_mapping_one+0x88/0x4f0
<4>[1.253021]
   which lock already depends on the new lock.

<4>[1.253024]
   the existing dependency chain (in reverse order) is:
<4>[1.253027]
   -> #1 (device_domain_lock){}:
<4>[1.253031]_raw_spin_lock_irqsave+0x33/0x50
<4>[1.253034]dmar_insert_one_dev_info+0xb8/0x520
<4>[1.253036]set_domain_for_dev+0x66/0xf0
<4>[1.253039]iommu_prepare_identity_map+0x48/0x95
<4>[1.253042]intel_iommu_init+0xfd8/0x138d
<4>[1.253045]pci_iommu_init+0x11/0x3a
<4>[1.253048]do_one_initcall+0x58/0x300
<4>[1.253051]kernel_init_freeable+0x2c0/0x359
<4>[1.253054]kernel_init+0x5/0x100
<4>[1.253056]ret_from_fork+0x3a/0x50
<4>[1.253058]
   -> #0 (&(&iommu->lock)->rlock){+.+.}:
<4>[1.253062]lock_acquire+0xa6/0x1c0
<4>[1.253064]_raw_spin_lock+0x2a/0x40
<4>[1.253067]domain_context_mapping_one+0xa0/0x4f0
<4>[1.253070]pci_for_each_dma_alias+0x2b/0x160
<4>[1.253072]dmar_insert_one_dev_info+0x44e/0x520
<4>[1.253075]set_domain_for_dev+0x66/0xf0
<4>[1.253077]iommu_prepare_identity_map+0x48/0x95
<4>[1.253080]intel_iommu_init+0xfd8/0x138d
<4>[1.253082]pci_iommu_init+0x11/0x3a
<4>[1.253084]do_one_initcall+0x58/0x300
<4>[1.253086]kernel_init_freeable+0x2c0/0x359
<4>[1.253089]kernel_init+0x5/0x100
<4>[1.253091]ret_from_fork+0x3a/0x50
<4>[1.253093]
   other info that might help us debug this:

<4>[1.253095]  Possible unsafe locking scenario:

<4>[1.253095]CPU0CPU1
<4>[1.253095]
<4>[1.253095]   lock(device_domain_lock);
<4>[1.253095]lock(&(&iommu->lock)->rlock);
<4>[1.253095]lock(device_domain_lock);
<4>[1.253095]   lock(&(&iommu->lock)->rlock);
<4>[1.253095]
*** DEADLOCK ***

<4>[1.253095] 2 locks held by swapper/0/1:
<4>[1.253095]  #0: 76465a1e (dmar_global_lock){}, at: 
intel_iommu_init+0x1d3/0x138d
<4>[1.253095]  #1: 69266737 (device_domain_lock){}, at: 
domain_context_mapping_one+0x88/0x4f0
<4>[1.253095]
   stack backtrace:
<4>[1.253095] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
5.2.0-rc5-CI-CI_DRM_6299+ #1
<4>[1.253095] Hardware name:  /NUC5i7RYB, BIOS 
RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
<4>[1.253095] Call Trace:
<4>[1.253095]  dump_stack+0x67/0x9b
<4>[1.253095]  print_circular_bug+0x1c8/0x2b0
<4>[1.253095]  __lock_acquire+0x1ce9/0x24c0
<4>[1.253095]  ? lock_acquire+0xa6/0x1c0
<4>[1.253095]  lock_acquire+0xa6/0x1c0
<4>[1.253095]  ? domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  _raw_spin_lock+0x2a/0x40
<4>[1.253095]  ? domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  ? domain_context_mapping_one+0x4f0/0x4f0
<4>[1.253095]  pci_for_each_dma_alias+0x2b/0x160
<4>[1.253095]  dmar_insert_one_dev_info+0x44e/0x520
<4>[1.253095]  set_domain_for_dev+0x66/0xf0
<4>[1.253095]  iommu_prepare_identity_map+0x48/0x95
<4>[1.253095]  intel_iommu_init+0xfd8/0x138d
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  ? e820__memblock_setup+0x5b/0x5b
<4>[1.253095]  ? pci_iommu_init+0x11/0x3a
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  pci_iommu_init+0x11/0x3a
<4>[1.253095]  do_one_initcall+0x58/0x300

Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler

2019-06-18 Thread Jacob Pan
On Tue, 18 Jun 2019 15:04:36 +0100
Jean-Philippe Brucker  wrote:

> On 12/06/2019 19:53, Jacob Pan wrote:
> >>> You are right, the worst case of the spurious PS is to terminate
> >>> the group prematurely. Need to know the scope of the HW damage in
> >>> case of mdev where group IDs can be shared among mdevs belong to
> >>> the same PF.
> >>
> >> But from the IOMMU fault API point of view, the full page request
> >> is identified by both PRGI and PASID. Given that each mdev has its
> >> own set of PASIDs, it should be easy to isolate page responses per
> >> mdev. 
> > On Intel platforms, devices sending a page request with private data
> > must receive a page response with matching private data. If we solely
> > depend on PRGI and PASID, we may send stale private data to the
> > device in an incorrect page response. Since private data may
> > represent PF device-wide context, a page response carrying the
> > wrong private data may affect other mdevs/PASIDs.
> > 
> > One solution we are considering is to inject a sequence # (e.g.
> > ktime raw mono clock) as the vIOMMU private data into the guest.
> > The guest would return this fake private data in the page response,
> > then the host would send a page response back to the device matching
> > PRGI, PASID, and private_data.
> > 
> > This solution does not expose HW context related private data to the
> > guest but needs to extend the page response in the iommu uapi.
> > 
> > /**
> >  * struct iommu_page_response - Generic page response information
> >  * @version: API version of this structure
> >  * @flags: encodes whether the corresponding fields are valid
> >  * (IOMMU_FAULT_PAGE_RESPONSE_* values)
> >  * @pasid: Process Address Space ID
> >  * @grpid: Page Request Group Index
> >  * @code: response code from  iommu_page_response_code
> >  * @private_data: private data for the matching page request
> >  */
> > struct iommu_page_response {
> > #define IOMMU_PAGE_RESP_VERSION_1   1
> > __u32   version;
> > #define IOMMU_PAGE_RESP_PASID_VALID (1 << 0)
> > #define IOMMU_PAGE_RESP_PRIVATE_DATA(1 << 1)
> > __u32   flags;
> > __u32   pasid;
> > __u32   grpid;
> > __u32   code;
> > __u32   padding;
> > __u64   private_data[2];
> > };
> > 
> > There is also the change needed for separating storage for the real
> > and fake private data.
> > 
> > Sorry for the last minute change, did not realize the HW
> > implications.
> > 
> > I see this as a future extension due to limited testing,   
> 
> I'm wondering how we deal with:
> (1) old userspace that won't fill the new private_data field in
> page_response. A new kernel still has to support it.
> (2) old kernel that won't recognize the new PRIVATE_DATA flag.
> Currently iommu_page_response() rejects page responses with unknown
> flags.
> 
> I guess we'll need a two-way negotiation, where userspace queries
> whether the kernel supports the flag (2), and the kernel learns
> whether it should expect the private data to come back (1).
> 
I am not sure case (1) exists, in that there is no existing user space
that supports PRQ without private data. Am I missing something?

For VT-d emulation, private data is always part of the scalable-mode
PASID capability. If the vIOMMU queries the host and it supports PASID
and scalable mode, it will always support private data once PRQ is
enabled.

So I think we only need to negotiate (2), which should be covered by
the VT-d PASID cap.

> > perhaps for
> > now, can you add paddings similar to page request? Make it 64B as
> > well.  
> 
> I don't think padding is necessary, because iommu_page_response is
> sent by userspace to the kernel, unlike iommu_fault which is
> allocated by userspace and filled by the kernel.
> 
> Page response looks a lot more like existing VFIO mechanisms, so I
> suppose we'll wrap the iommu_page_response structure and include an
> argsz parameter at the top:
> 
>   struct vfio_iommu_page_response {
>   u32 argsz;
>   struct iommu_page_response pr;
>   };
> 
>   struct vfio_iommu_page_response vpr = {
>   .argsz = sizeof(vpr),
>   .pr = ...
>   ...
>   };
> 
>   ioctl(devfd, VFIO_IOMMU_PAGE_RESPONSE, &vpr);
> 
> In that case supporting private data can be done by simply appending a
> field at the end (plus the negotiation above).
> 
Do you mean at the end of struct vfio_iommu_page_response{}? Or at
the end of struct iommu_page_response{}?

The consumer of the private data is the iommu driver, not vfio. So I
think you want to add the new field at the end of struct
iommu_page_response, right?
I think that would work, just to clarify.


Re: [RFC PATCH v4 18/21] x86/apic: Add a parameter for the APIC delivery mode

2019-06-18 Thread Thomas Gleixner
On Tue, 18 Jun 2019, Ricardo Neri wrote:

> On Sun, Jun 16, 2019 at 11:55:03AM +0200, Thomas Gleixner wrote:
> > On Thu, 23 May 2019, Ricardo Neri wrote:
> > >  
> > >  struct irq_cfg {
> > > - unsigned intdest_apicid;
> > > - unsigned intvector;
> > > + unsigned intdest_apicid;
> > > + unsigned intvector;
> > > + enum ioapic_irq_destination_types   delivery_mode;
> > 
> > And how is this related to IOAPIC?
> 
> In my view, IOAPICs can also be programmed with a delivery mode. Mode
> values are the same for MSI interrupts.

> > I know this enum exists already, but in
> > connection with MSI this does not make any sense at all.
> 
> Is the issue here the name of the enumeration?
> 
> > 
> > > +
> > > + /*
> > > +  * Initialize the delivery mode of this irq to match the
> > > +  * default delivery mode of the APIC. This is useful for
> > > +  * children irq domains which want to take the delivery
> > > +  * mode from the individual irq configuration rather
> > > +  * than from the APIC.
> > > +  */
> > > +  apicd->hw_irq_cfg.delivery_mode = apic->irq_delivery_mode;
> > 
> > And here it's initialized from apic->irq_delivery_mode, which is an
> > u32. Intuitive and consistent - NOT!
> 
> Yes, this is wrong. Then should the member in the structure above be an
> u32 instead of enum ioapic_irq_destination_types?
> 
> Thanks and BR,
> Ricardo
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v4 03/21] x86/hpet: Calculate ticks-per-second in a separate function

2019-06-18 Thread Thomas Gleixner
On Tue, 18 Jun 2019, Ricardo Neri wrote:
> On Fri, Jun 14, 2019 at 05:54:05PM +0200, Thomas Gleixner wrote:
> > 
> > So we already have ticks per second, aka frequency, right? So why do we
> > need yet another function instead of using the value which is computed
> > once? The frequency of the HPET channels has to be identical no matter
> > what. If it's not HPET is broken beyond repair.
> 
> I don't think it needs to be recomputed. I missed the fact that
> the frequency was already computed here.
> 
> Also, the hpet char driver has its own frequency computation. Perhaps it
> could also obtain it from here, right?

No. It's separate on purpose. Hint: IA64, aka Itanic

Thanks,

tglx


Re: [RFC PATCH v4 04/21] x86/hpet: Add hpet_set_comparator() for periodic and one-shot modes

2019-06-18 Thread Ricardo Neri
On Fri, Jun 14, 2019 at 08:17:14PM +0200, Thomas Gleixner wrote:
> On Thu, 23 May 2019, Ricardo Neri wrote:
> > +/**
> > + * hpet_set_comparator() - Helper function for setting comparator register
> > + * @num:   The timer ID
> > + * @cmp:   The value to be written to the comparator/accumulator
> > + * @period:The value to be written to the period (0 = oneshot mode)
> > + *
> > + * Helper function for updating comparator, accumulator and period values.
> > + *
> > + * In periodic mode, HPET needs HPET_TN_SETVAL to be set before writing
> > + * to the Tn_CMP to update the accumulator. Then, HPET needs a second
> > + * write (with HPET_TN_SETVAL cleared) to Tn_CMP to set the period.
> > + * The HPET_TN_SETVAL bit is automatically cleared after the first write.
> > + *
> > + * For one-shot mode, HPET_TN_SETVAL does not need to be set.
> > + *
> > + * See the following documents:
> > + *   - Intel IA-PC HPET (High Precision Event Timers) Specification
> > + *   - AMD-8111 HyperTransport I/O Hub Data Sheet, Publication # 24674
> > + */
> > +void hpet_set_comparator(int num, unsigned int cmp, unsigned int period)
> > +{
> > +   if (period) {
> > +   unsigned int v = hpet_readl(HPET_Tn_CFG(num));
> > +
> > +   hpet_writel(v | HPET_TN_SETVAL, HPET_Tn_CFG(num));
> > +   }
> > +
> > +   hpet_writel(cmp, HPET_Tn_CMP(num));
> > +
> > +   if (!period)
> > +   return;
> 
> TBH, I hate this conditional handling. What's wrong with two functions?

There is probably nothing wrong with two functions. I can split it into
hpet_set_comparator_periodic() and hpet_set_comparator(). Perhaps the
latter is not needed, as it would be a one-line function; you suggested
earlier avoiding such small functions.
> 
> > +
> > +   /*
> > +* This delay is seldom used: never in one-shot mode and in periodic
> > +* only when reprogramming the timer.
> > +*/
> > +   udelay(1);
> > +   hpet_writel(period, HPET_Tn_CMP(num));
> > +}
> > +EXPORT_SYMBOL_GPL(hpet_set_comparator);
> 
> Why is this exported? Which module user needs this?

It is not used anywhere else. I will remove this export.

Thanks and BR,

Ricardo


Re: [RFC PATCH v4 03/21] x86/hpet: Calculate ticks-per-second in a separate function

2019-06-18 Thread Ricardo Neri
On Fri, Jun 14, 2019 at 05:54:05PM +0200, Thomas Gleixner wrote:
> On Thu, 23 May 2019, Ricardo Neri wrote:
> >  
> > +u64 hpet_get_ticks_per_sec(u64 hpet_caps)
> > +{
> > +   u64 ticks_per_sec, period;
> > +
> > +   period = (hpet_caps & HPET_COUNTER_CLK_PERIOD_MASK) >>
> > +HPET_COUNTER_CLK_PERIOD_SHIFT; /* fs, 10^-15 */
> > +
> > +   /*
> > +* The frequency is the reciprocal of the period. The period is given
> > +* in femtoseconds per second. Thus, prepare a dividend to obtain the
> > +* frequency in ticks per second.
> > +*/
> > +
> > +   /* 10^15 femtoseconds per second */
> > +   ticks_per_sec = 1000000000000000ULL;
> > +   ticks_per_sec += period >> 1; /* round */
> > +
> > +   /* The quotient is put in the dividend. We drop the remainder. */
> > +   do_div(ticks_per_sec, period);
> > +
> > +   return ticks_per_sec;
> > +}
> > +
> >  int hpet_alloc(struct hpet_data *hdp)
> >  {
> > u64 cap, mcfg;
> > @@ -844,7 +867,6 @@ int hpet_alloc(struct hpet_data *hdp)
> > struct hpets *hpetp;
> > struct hpet __iomem *hpet;
> > static struct hpets *last;
> > -   unsigned long period;
> > unsigned long long temp;
> > u32 remainder;
> >  
> > @@ -894,12 +916,7 @@ int hpet_alloc(struct hpet_data *hdp)
> >  
> > last = hpetp;
> >  
> > -   period = (cap & HPET_COUNTER_CLK_PERIOD_MASK) >>
> > -   HPET_COUNTER_CLK_PERIOD_SHIFT; /* fs, 10^-15 */
> > -   temp = 1000000000000000uLL; /* 10^15 femtoseconds per second */
> > -   temp += period >> 1; /* round */
> > -   do_div(temp, period);
> > -   hpetp->hp_tick_freq = temp; /* ticks per second */
> > +   hpetp->hp_tick_freq = hpet_get_ticks_per_sec(cap);
> 
> Why are we actually computing this over and over?
> 
> In hpet_enable() which is the first function invoked we have:
> 
> /*
>  * The period is a femto seconds value. Convert it to a
>  * frequency.
>  */
> freq = FSEC_PER_SEC;
> do_div(freq, hpet_period);
> hpet_freq = freq;
> 
> So we already have ticks per second, aka frequency, right? So why do we
> need yet another function instead of using the value which is computed
> once? The frequency of the HPET channels has to be identical no matter
> what. If it's not HPET is broken beyond repair.

I don't think it needs to be recomputed. I missed the fact that
the frequency was already computed here.

Also, the hpet char driver has its own frequency computation. Perhaps it
could also obtain it from here, right?

Thanks and BR,
Ricardo
> 
> Thanks,
> 
>   tglx
> 
> 


Re: [RFC PATCH v4 05/21] x86/hpet: Reserve timer for the HPET hardlockup detector

2019-06-18 Thread Ricardo Neri
On Fri, Jun 14, 2019 at 06:10:18PM +0200, Thomas Gleixner wrote:
> On Thu, 13 Jun 2019, Ricardo Neri wrote:
> 
> > On Tue, Jun 11, 2019 at 09:54:25PM +0200, Thomas Gleixner wrote:
> > > On Thu, 23 May 2019, Ricardo Neri wrote:
> > > 
> > > > HPET timer 2 will be used to drive the HPET-based hardlockup detector.
> > > > Reserve such timer to ensure it cannot be used by user space programs or
> > > > for clock events.
> > > > 
> > > > When looking for MSI-capable timers for clock events, skip timer 2 if
> > > > the HPET hardlockup detector is selected.
> > > 
> > > Why? Both the changelog and the code change lack an explanation why this
> > > timer is actually touched after it got reserved for the platform. The
> > > reservation should make it inaccessible for other things.
> > 
> > hpet_reserve_platform_timers() will give the HPET char driver a data
> > structure which specifies which timers are reserved. In this manner,
> > they cannot be used by applications via file opens. The timer used by
> > the hardlockup detector should be marked as reserved.
> > 
> > Also, hpet_msi_capability_lookup() populates another data structure
> > which is used when obtaining an unused timer for a HPET clock event.
> > The timer used by the hardlockup detector should not be included in such
> > data structure.
> > 
> > Is this the explanation you would like to see? If yes, I will include it
> > in the changelog.
> 
> Yes, the explanation makes sense. The code still sucks. Not really your
> fault, but this is not making it any better.
> 
> What bothers me most is the fact that CONFIG_X86_HARDLOCKUP_DETECTOR_HPET
> removes one HPET timer unconditionally. It neither checks whether the hpet
> watchdog is actually enabled on the command line, nor does it validate
> upfront whether the HPET supports FSB delivery.
> 
> That wastes an HPET timer unconditionally for no value. Not that I
> personally care much about /dev/hpet, but some older laptops depend on HPET
> per cpu timers as the local APIC timer stops in C2/3. So this unconditional
> reservation will cause regressions for no reason.
> 
> The proper approach here is to:
> 
>  1) Evaluate the command line _before_ hpet_enable() is invoked
> 
>  2) Check the availability of FSB delivery in hpet_enable()
> 
> Reserve an HPET channel for the watchdog only when #1 and #2 are true.

Sure. I will add the explanation to the commit message and only reserve
the timer if both of the conditions above are met.

Thanks and BR,
Ricardo


Re: [RFC PATCH v4 18/21] x86/apic: Add a parameter for the APIC delivery mode

2019-06-18 Thread Ricardo Neri
On Sun, Jun 16, 2019 at 11:55:03AM +0200, Thomas Gleixner wrote:
> On Thu, 23 May 2019, Ricardo Neri wrote:
> >  
> >  struct irq_cfg {
> > -   unsigned intdest_apicid;
> > -   unsigned intvector;
> > +   unsigned intdest_apicid;
> > +   unsigned intvector;
> > +   enum ioapic_irq_destination_types   delivery_mode;
> 
> And how is this related to IOAPIC?

In my view, IOAPICs can also be programmed with a delivery mode. Mode
values are the same for MSI interrupts.

> I know this enum exists already, but in
> connection with MSI this does not make any sense at all.

Is the issue here the name of the enumeration?

> 
> > +
> > +   /*
> > +* Initialize the delivery mode of this irq to match the
> > +* default delivery mode of the APIC. This is useful for
> > +* children irq domains which want to take the delivery
> > +* mode from the individual irq configuration rather
> > +* than from the APIC.
> > +*/
> > +apicd->hw_irq_cfg.delivery_mode = apic->irq_delivery_mode;
> 
> And here it's initialized from apic->irq_delivery_mode, which is an
> u32. Intuitive and consistent - NOT!

Yes, this is wrong. Then should the member in the structure above be an
u32 instead of enum ioapic_irq_destination_types?

Thanks and BR,
Ricardo


Re: [RFC PATCH v4 12/21] watchdog/hardlockup/hpet: Adjust timer expiration on the number of monitored CPUs

2019-06-18 Thread Ricardo Neri
On Tue, Jun 11, 2019 at 10:11:04PM +0200, Thomas Gleixner wrote:
> On Thu, 23 May 2019, Ricardo Neri wrote:
> > @@ -52,10 +59,10 @@ static void kick_timer(struct hpet_hld_data *hdata, 
> > bool force)
> > return;
> >  
> > if (hdata->has_periodic)
> > -   period = watchdog_thresh * hdata->ticks_per_second;
> > +   period = watchdog_thresh * hdata->ticks_per_cpu;
> >  
> > count = hpet_readl(HPET_COUNTER);
> > -   new_compare = count + watchdog_thresh * hdata->ticks_per_second;
> > +   new_compare = count + watchdog_thresh * hdata->ticks_per_cpu;
> > hpet_set_comparator(hdata->num, (u32)new_compare, (u32)period);
> 
> So with this you might get close to the point where you trip over the SMI
> induced madness where CPUs vanish for several milliseconds in some value
> add code. You really want to do a read back of the hpet to detect that. See
> the comment in the hpet code. RHEL 7/8 allow up to 768 logical CPUs

Do you mean adding a readback to check if the new compare value is
greater than the current count? Similar to the check at the end of
hpet_next_event():

return res < HPET_MIN_CYCLES ? -ETIME : 0;

In such a case, should it try to set the comparator again? I think it
should, as otherwise the hardlockup detector would stop working.

Thanks and BR,
Ricardo
> 
> Thanks,
> 
>   tglx


Re: [RFC PATCH v4 20/21] iommu/vt-d: hpet: Reserve an interrupt remampping table entry for watchdog

2019-06-18 Thread Ricardo Neri
On Mon, Jun 17, 2019 at 10:25:35AM +0200, Thomas Gleixner wrote:
> On Sun, 16 Jun 2019, Thomas Gleixner wrote:
> > On Thu, 23 May 2019, Ricardo Neri wrote:
> > > When the hardlockup detector is enabled, the function
> > > hld_hpet_intremapactivate_irq() activates the recently created entry
> > > in the interrupt remapping table via the modify_irte() functions. While
> > > doing this, it specifies which CPU the interrupt must target via its APIC
> > > ID. This function can be called every time the destination ID of the
> > > interrupt needs to be updated; there is no need to allocate or remove
> > > entries in the interrupt remapping table.
> > 
> > Brilliant.
> > 
> > > +int hld_hpet_intremap_activate_irq(struct hpet_hld_data *hdata)
> > > +{
> > > + u32 destid = apic->calc_dest_apicid(hdata->handling_cpu);
> > > + struct intel_ir_data *data;
> > > +
> > > + data = (struct intel_ir_data *)hdata->intremap_data;
> > > + data->irte_entry.dest_id = IRTE_DEST(destid);
> > > + return modify_irte(&data->irq_2_iommu, &data->irte_entry);
> > 
> > This calls modify_irte() which does at the very beginning:
> > 
> >raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
> > 
> > How is that supposed to work from NMI context? Not to talk about the
> > other spinlocks which are taken in the subsequent call chain.
> > 
> > You cannot call in any of that code from NMI context.
> > 
> > The only reason why this never deadlocked in your testing is that nothing
> > else touched that particular iommu where the HPET hangs off concurrently.
> > 
> > But that's just pure luck and not design. 
> 
> And just for the record. I warned you about that problem during the review
> of an earlier version and told you to talk to IOMMU folks whether there is
> a way to update the entry w/o running into that lock problem.

I think I misunderstood your feedback. You did mention issues with locking
between NMI and !NMI contexts. However, that was in the context of using the
generic irq code to do things such as setting the affinity of the interrupt
and requesting an irq. I understood that I should instead program things
directly. I extrapolated this to the IOMMU driver, in which I also added code
directly instead of using the existing layering.

Also, at the time, the question regarding the IOMMU, as I understood it, was
whether it was possible to reserve an IOMMU remapping entry upfront. I believe
my patches achieve that, even if they are hacky and ugly and have locking
issues. I see now that the locking issues are also part of the IOMMU
discussion. Perhaps that was also implicit.
> 
> Can you tell my why am I actually reviewing patches and spending time on
> this when the result is ignored anyway?

Yes, Thomas, I should have checked with the IOMMU maintainers first on
the issues in the paragraph above. It is not my intention to waste your
time; your feedback has been valuable and has contributed to improving
the code.

> 
> I also tried to figure out why you went away from the IPI broadcast
> design. The only information I found is:
> 
> Changes vs. v1:
> 
>  * Brought back the round-robin mechanism proposed in v1 (this time not
>using the interrupt subsystem). This also requires to compute
>expiration times as in v1 (Andi Kleen, Stephane Eranian).
> 
> Great that there is no trace of any mail from Andi or Stephane about this
> on LKML. There is no problem with talking offlist about this stuff, but
> then you should at least provide a rationale for those who were not part of
> the private conversation.

Stephane has already commented on the rationale.

Thanks and BR,

Ricardo


Re: [PATCH v7 16/21] memory: mtk-smi: Add bus_sel for mt8183

2019-06-18 Thread Matthias Brugger



On 18/06/2019 14:10, Yong Wu wrote:
> On Mon, 2019-06-17 at 18:23 +0200, Matthias Brugger wrote:
>>
>> On 10/06/2019 14:17, Yong Wu wrote:
>>> There are 2 mmu cells in a M4U HW. We can adjust some larbs to enter
>>> mmu0 or mmu1 to balance the bandwidth via the smi-common register
>>> SMI_BUS_SEL(0x220) (each larb occupies 2 bits).
>>>
>>> In mt8183, for better performance, we switch larb1/2/5/7 to enter
>>> mmu1 while the others still enter mmu0.
>>>
>>> In mt8173 and mt2712, we don't see the performance issue, so we keep
>>> the default value (0x0), which means all the larbs enter mmu0.
>>>
>>> Note: smi gen1 (mt2701/mt7623) doesn't have this bus_sel.
>>>
>>> And, the base of smi-common is completely different from the
>>> smi_ao_base of gen1, thus I add a new variable for that.
>>>
>>> CC: Matthias Brugger 
>>> Signed-off-by: Yong Wu 
>>> Reviewed-by: Evan Green 
>>> ---
>>>  drivers/memory/mtk-smi.c | 22 --
>>>  1 file changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
>>> index 9790801..08cf40d 100644
>>> --- a/drivers/memory/mtk-smi.c
>>> +++ b/drivers/memory/mtk-smi.c
>>> @@ -49,6 +49,12 @@
>>>  #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
>>>  #define F_MMU_EN   BIT(0)
>>>  
>>> +/* SMI COMMON */
>>> +#define SMI_BUS_SEL0x220
>>> +#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
>>> +/* All larbs use MMU0 by default. Only specify mmu1 here. */
>>> +#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
>>> +
>>>  enum mtk_smi_gen {
>>> MTK_SMI_GEN1,
>>> MTK_SMI_GEN2
>>> @@ -57,6 +63,7 @@ enum mtk_smi_gen {
>>>  struct mtk_smi_common_plat {
>>> enum mtk_smi_gen gen;
>>> bool has_gals;
>>> +   u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
>>>  };
>>>  
>>>  struct mtk_smi_larb_gen {
>>> @@ -72,8 +79,8 @@ struct mtk_smi {
>>> struct clk  *clk_apb, *clk_smi;
>>> struct clk  *clk_gals0, *clk_gals1;
>>> struct clk  *clk_async; /*only needed by mt2701*/
>>> -   void __iomem*smi_ao_base;
>>> -
>>> +   void __iomem*smi_ao_base; /* only for gen1 */
>>> +   void __iomem*base;/* only for gen2 */
>>
>> union {} maybe?
> 
> Yes. Thanks.
> 
> I will add it.
> 
>>
>>> const struct mtk_smi_common_plat *plat;
>>>  };
>>>  
>>> @@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct 
>>> device *dev)
>>>  static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
>>> .gen  = MTK_SMI_GEN2,
>>> .has_gals = true,
>>> +   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
>>> +   F_MMU1_LARB(7),
>>>  };
>>>  
>>>  static const struct of_device_id mtk_smi_common_of_ids[] = {
>>> @@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device 
>>> *pdev)
>>> ret = clk_prepare_enable(common->clk_async);
>>> if (ret)
>>> return ret;
>>> +   } else {
>>> +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>> +   common->base = devm_ioremap_resource(dev, res);
>>> +   if (IS_ERR(common->base))
>>> +   return PTR_ERR(common->base);
>>
>> We must be backwards compatible with DT which does not have the base defined.
> 
> The smi-common node in the previous mt2712 and mt8173 also have the
> "reg" property even though they didn't use this base, Thus, It looks ok
> for all the cases.
> 

Correct, it is defined as a required property in the binding description so we
are good.
Sorry for the noise.

With the union added you can add:
Reviewed-by: Matthias Brugger 

>>
>> Regards,
>> Matthias
>>
>>> }
>>> pm_runtime_enable(dev);
>>> platform_set_drvdata(pdev, common);
>>> @@ -497,6 +511,7 @@ static int mtk_smi_common_remove(struct platform_device 
>>> *pdev)
>>>  static int __maybe_unused mtk_smi_common_resume(struct device *dev)
>>>  {
>>> struct mtk_smi *common = dev_get_drvdata(dev);
>>> +   u32 bus_sel = common->plat->bus_sel;
>>> int ret;
>>>  
>>> ret = mtk_smi_clk_enable(common);
>>> @@ -504,6 +519,9 @@ static int __maybe_unused mtk_smi_common_resume(struct 
>>> device *dev)
>>> dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
>>> return ret;
>>> }
>>> +
>>> +   if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
>>> +   writel(bus_sel, common->base + SMI_BUS_SEL);
>>> return 0;
>>>  }
>>>  
>>>
> 
> 


[PATCH v1 15/22] docs: driver-api: add a chapter for memory-related API

2019-06-18 Thread Mauro Carvalho Chehab
There are some DMA files under the main dir. Move them to the
new chapter and add an index file for them.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/PCI/pci.rst|  6 +++---
 Documentation/block/biodoc.rst   |  2 +-
 .../driver-api/bus-virt-phys-mapping.rst |  2 +-
 Documentation/driver-api/index.rst   |  2 ++
 .../mm/dma-api-howto.rst}|  2 --
 .../{DMA-API.rst => driver-api/mm/dma-api.rst}   |  8 +++-
 .../mm/dma-attributes.rst}   |  2 --
 .../mm/dma-isa-lpc.rst}  |  4 +---
 Documentation/driver-api/mm/index.rst| 11 +++
 Documentation/driver-api/usb/dma.rst |  6 +++---
 Documentation/memory-barriers.txt|  6 +++---
 .../translations/ko_KR/memory-barriers.txt   |  6 +++---
 arch/ia64/hp/common/sba_iommu.c  | 12 ++--
 arch/ia64/sn/pci/pci_dma.c   |  4 ++--
 arch/parisc/kernel/pci-dma.c |  2 +-
 arch/x86/include/asm/dma-mapping.h   |  4 ++--
 arch/x86/kernel/amd_gart_64.c|  2 +-
 drivers/parisc/sba_iommu.c   | 16 
 include/linux/dma-mapping.h  |  2 +-
 include/media/videobuf-dma-sg.h  |  2 +-
 kernel/dma/debug.c   |  2 +-
 21 files changed, 54 insertions(+), 49 deletions(-)
 rename Documentation/{DMA-API-HOWTO.rst => driver-api/mm/dma-api-howto.rst} 
(99%)
 rename Documentation/{DMA-API.rst => driver-api/mm/dma-api.rst} (99%)
 rename Documentation/{DMA-attributes.rst => driver-api/mm/dma-attributes.rst} 
(99%)
 rename Documentation/{DMA-ISA-LPC.rst => driver-api/mm/dma-isa-lpc.rst} (98%)
 create mode 100644 Documentation/driver-api/mm/index.rst

diff --git a/Documentation/PCI/pci.rst b/Documentation/PCI/pci.rst
index 0f52d172c9ac..8665209eda40 100644
--- a/Documentation/PCI/pci.rst
+++ b/Documentation/PCI/pci.rst
@@ -265,7 +265,7 @@ Set the DMA mask size
 -
 .. note::
If anything below doesn't make sense, please refer to
-   Documentation/DMA-API.rst. This section is just a reminder that
+   Documentation/driver-api/mm/dma-api.rst. This section is just a reminder that
drivers need to indicate DMA capabilities of the device and is not
an authoritative source for DMA interfaces.
 
@@ -291,7 +291,7 @@ Many 64-bit "PCI" devices (before PCI-X) and some PCI-X devices are
 Setup shared control data
 -
Once the DMA masks are set, the driver can allocate "consistent" (a.k.a. shared)
-memory.  See Documentation/DMA-API.rst for a full description of
+memory.  See Documentation/driver-api/mm/dma-api.rst for a full description of
 the DMA APIs. This section is just a reminder that it needs to be done
 before enabling DMA on the device.
 
@@ -421,7 +421,7 @@ owners if there is one.
 
 Then clean up "consistent" buffers which contain the control data.
 
-See Documentation/DMA-API.rst for details on unmapping interfaces.
+See Documentation/driver-api/mm/dma-api.rst for details on unmapping interfaces.
 
 
 Unregister from other subsystems
diff --git a/Documentation/block/biodoc.rst b/Documentation/block/biodoc.rst
index 59bd93bec8fc..2206c88e7dee 100644
--- a/Documentation/block/biodoc.rst
+++ b/Documentation/block/biodoc.rst
@@ -195,7 +195,7 @@ a virtual address mapping (unlike the earlier scheme of virtual address
 do not have a corresponding kernel virtual address space mapping) and
 low-memory pages.
 
-Note: Please refer to Documentation/DMA-API-HOWTO.rst for a discussion
+Note: Please refer to Documentation/driver-api/mm/dma-api-howto.rst for a discussion
 on PCI high mem DMA aspects and mapping of scatter gather lists, and support
 for 64 bit PCI.
 
diff --git a/Documentation/driver-api/bus-virt-phys-mapping.rst b/Documentation/driver-api/bus-virt-phys-mapping.rst
index 80972916e88c..18b6fdf618d2 100644
--- a/Documentation/driver-api/bus-virt-phys-mapping.rst
+++ b/Documentation/driver-api/bus-virt-phys-mapping.rst
@@ -8,7 +8,7 @@ How to access I/O mapped memory from within device drivers
 
The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA interface
-   (see Documentation/DMA-API-HOWTO.rst).  They continue
+   (see Documentation/driver-api/mm/dma-api-howto.rst).  They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12
 
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index bb2621b17212..492b96003af2 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -16,6 +16,7 @@ available subsections can be seen below.
 
basics
infrastructure
+   mm/index
pm/index
clk
device-io
@@ -59,6 +60,7 @@ available subsections can be seen 

Re: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

2019-06-18 Thread Chris Wilson
Quoting Lu Baolu (2019-05-21 08:30:15)
> From: Dave Jiang 
> 
> Lockdep debug reported lock inversion related with the iommu code
> caused by dmar_insert_one_dev_info() grabbing the iommu->lock and
> the device_domain_lock out of order versus the code path in
> iommu_flush_dev_iotlb(). Expanding the scope of the iommu->lock and
> reversing the order of lock acquisition fixes the issue.

Which of course violates the property that device_domain_lock is the
outer lock...

<4>[1.252997] ==
<4>[1.252999] WARNING: possible circular locking dependency detected
<4>[1.253002] 5.2.0-rc5-CI-CI_DRM_6299+ #1 Not tainted
<4>[1.253004] --
<4>[1.253006] swapper/0/1 is trying to acquire lock:
<4>[1.253009] 91462475 (&(&iommu->lock)->rlock){+.+.}, at: domain_context_mapping_one+0xa0/0x4f0
<4>[1.253015]
  but task is already holding lock:
<4>[1.253017] 69266737 (device_domain_lock){}, at: domain_context_mapping_one+0x88/0x4f0
<4>[1.253021]
  which lock already depends on the new lock.

<4>[1.253024]
  the existing dependency chain (in reverse order) is:
<4>[1.253027]
  -> #1 (device_domain_lock){}:
<4>[1.253031]_raw_spin_lock_irqsave+0x33/0x50
<4>[1.253034]dmar_insert_one_dev_info+0xb8/0x520
<4>[1.253036]set_domain_for_dev+0x66/0xf0
<4>[1.253039]iommu_prepare_identity_map+0x48/0x95
<4>[1.253042]intel_iommu_init+0xfd8/0x138d
<4>[1.253045]pci_iommu_init+0x11/0x3a
<4>[1.253048]do_one_initcall+0x58/0x300
<4>[1.253051]kernel_init_freeable+0x2c0/0x359
<4>[1.253054]kernel_init+0x5/0x100
<4>[1.253056]ret_from_fork+0x3a/0x50
<4>[1.253058]
   -> #0 (&(&iommu->lock)->rlock){+.+.}:
<4>[1.253062]lock_acquire+0xa6/0x1c0
<4>[1.253064]_raw_spin_lock+0x2a/0x40
<4>[1.253067]domain_context_mapping_one+0xa0/0x4f0
<4>[1.253070]pci_for_each_dma_alias+0x2b/0x160
<4>[1.253072]dmar_insert_one_dev_info+0x44e/0x520
<4>[1.253075]set_domain_for_dev+0x66/0xf0
<4>[1.253077]iommu_prepare_identity_map+0x48/0x95
<4>[1.253080]intel_iommu_init+0xfd8/0x138d
<4>[1.253082]pci_iommu_init+0x11/0x3a
<4>[1.253084]do_one_initcall+0x58/0x300
<4>[1.253086]kernel_init_freeable+0x2c0/0x359
<4>[1.253089]kernel_init+0x5/0x100
<4>[1.253091]ret_from_fork+0x3a/0x50
<4>[1.253093]
  other info that might help us debug this:

<4>[1.253095]  Possible unsafe locking scenario:

<4>[1.253095]CPU0CPU1
<4>[1.253095]
<4>[1.253095]   lock(device_domain_lock);
<4>[1.253095]lock(&(&iommu->lock)->rlock);
<4>[1.253095]lock(device_domain_lock);
<4>[1.253095]   lock(&(&iommu->lock)->rlock);
<4>[1.253095]
   *** DEADLOCK ***

<4>[1.253095] 2 locks held by swapper/0/1:
<4>[1.253095]  #0: 76465a1e (dmar_global_lock){}, at: intel_iommu_init+0x1d3/0x138d
<4>[1.253095]  #1: 69266737 (device_domain_lock){}, at: domain_context_mapping_one+0x88/0x4f0
<4>[1.253095]
  stack backtrace:
<4>[1.253095] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5-CI-CI_DRM_6299+ #1
<4>[1.253095] Hardware name:  /NUC5i7RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
<4>[1.253095] Call Trace:
<4>[1.253095]  dump_stack+0x67/0x9b
<4>[1.253095]  print_circular_bug+0x1c8/0x2b0
<4>[1.253095]  __lock_acquire+0x1ce9/0x24c0
<4>[1.253095]  ? lock_acquire+0xa6/0x1c0
<4>[1.253095]  lock_acquire+0xa6/0x1c0
<4>[1.253095]  ? domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  _raw_spin_lock+0x2a/0x40
<4>[1.253095]  ? domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  domain_context_mapping_one+0xa0/0x4f0
<4>[1.253095]  ? domain_context_mapping_one+0x4f0/0x4f0
<4>[1.253095]  pci_for_each_dma_alias+0x2b/0x160
<4>[1.253095]  dmar_insert_one_dev_info+0x44e/0x520
<4>[1.253095]  set_domain_for_dev+0x66/0xf0
<4>[1.253095]  iommu_prepare_identity_map+0x48/0x95
<4>[1.253095]  intel_iommu_init+0xfd8/0x138d
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  ? e820__memblock_setup+0x5b/0x5b
<4>[1.253095]  ? pci_iommu_init+0x11/0x3a
<4>[1.253095]  ? set_debug_rodata+0xc/0xc
<4>[1.253095]  pci_iommu_init+0x11/0x3a
<4>[1.253095]  do_one_initcall+0x58/0x300
<4>[1.253095]  kernel_init_freeable+0x2c0/0x359
<4>[1.253095]  ? rest_init+0x250/0x250
<4>[1.253095]  kernel_init+0x5/0x100
<4>[1.253095]  ret_from_fork+0x3a/0x50


Re: [PATCH 3/8] iommu/arm-smmu-v3: Support platform SSID

2019-06-18 Thread Will Deacon
On Mon, Jun 10, 2019 at 07:47:09PM +0100, Jean-Philippe Brucker wrote:
> For platform devices that support SubstreamID (SSID), firmware provides
> the number of supported SSID bits. Restrict it to what the SMMU supports
> and cache it into master->ssid_bits.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/arm-smmu-v3.c | 11 +++
>  drivers/iommu/of_iommu.c|  6 +-
>  include/linux/iommu.h   |  1 +
>  3 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 4d5a694f02c2..3254f473e681 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -604,6 +604,7 @@ struct arm_smmu_master {
>   struct list_headdomain_head;
>   u32 *sids;
>   unsigned intnum_sids;
> + unsigned intssid_bits;
>   boolats_enabled :1;
>  };
>  
> @@ -2097,6 +2098,16 @@ static int arm_smmu_add_device(struct device *dev)
>   }
>   }
>  
> + master->ssid_bits = min(smmu->ssid_bits, fwspec->num_pasid_bits);
> +
> + /*
> +  * If the SMMU doesn't support 2-stage CD, limit the linear
> +  * tables to a reasonable number of contexts, let's say
> +  * 64kB / sizeof(ctx_desc) = 1024 = 2^10
> +  */
> + if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB))
> + master->ssid_bits = min(master->ssid_bits, 10U);

Please introduce a #define for the 10, so that it is computed in the way
you describe in the comment (a bit like we do for things like queue sizes).

> +
>   group = iommu_group_get_for_dev(dev);
>   if (!IS_ERR(group)) {
>   iommu_group_put(group);
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index f04a6df65eb8..04f4f6b95d82 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -206,8 +206,12 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
>   if (err)
>   break;
>   }
> - }
>  
> + fwspec = dev_iommu_fwspec_get(dev);
> + if (!err && fwspec)
> + of_property_read_u32(master_np, "pasid-num-bits",
> +  &fwspec->num_pasid_bits);
> + }

Hmm. Do you know if there's anything in ACPI for this?

Otherwise, patch looks fine. Thanks.

Will


Re: [PATCH v3 1/4] firmware: qcom_scm-64: Add atomic version of qcom_scm_call

2019-06-18 Thread Will Deacon
On Wed, Jun 12, 2019 at 12:45:51PM +0530, Vivek Gautam wrote:
> There are scnenarios where drivers are required to make a
> scm call in atomic context, such as in one of the qcom's
> arm-smmu-500 errata [1].
> 
> [1] ("https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/
>   drivers/iommu/arm-smmu.c?h=CogSystems-msm-49/
>   msm-4.9=da765c6c75266b38191b38ef086274943f353ea7")
> 
> Signed-off-by: Vivek Gautam 
> Reviewed-by: Bjorn Andersson 
> ---
>  drivers/firmware/qcom_scm-64.c | 136 
> -
>  1 file changed, 92 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 91d5ad7cf58b..b6dca32c5ac4 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -62,32 +62,71 @@ static DEFINE_MUTEX(qcom_scm_lock);
>  #define FIRST_EXT_ARG_IDX 3
>  #define N_REGISTER_ARGS (MAX_QCOM_SCM_ARGS - N_EXT_QCOM_SCM_ARGS + 1)
>  
> -/**
> - * qcom_scm_call() - Invoke a syscall in the secure world
> - * @dev: device
> - * @svc_id:  service identifier
> - * @cmd_id:  command identifier
> - * @desc:Descriptor structure containing arguments and return values
> - *
> - * Sends a command to the SCM and waits for the command to finish processing.
> - * This should *only* be called in pre-emptible context.
> -*/
> -static int qcom_scm_call(struct device *dev, u32 svc_id, u32 cmd_id,
> -  const struct qcom_scm_desc *desc,
> -  struct arm_smccc_res *res)
> +static void __qcom_scm_call_do(const struct qcom_scm_desc *desc,
> +struct arm_smccc_res *res, u32 fn_id,
> +u64 x5, u32 type)
> +{
> + u64 cmd;
> + struct arm_smccc_quirk quirk = {.id = ARM_SMCCC_QUIRK_QCOM_A6};
> +
> + cmd = ARM_SMCCC_CALL_VAL(type, qcom_smccc_convention,
> +  ARM_SMCCC_OWNER_SIP, fn_id);
> +
> + quirk.state.a6 = 0;
> +
> + do {
> + arm_smccc_smc_quirk(cmd, desc->arginfo, desc->args[0],
> + desc->args[1], desc->args[2], x5,
> + quirk.state.a6, 0, res, &quirk);
> +
> + if (res->a0 == QCOM_SCM_INTERRUPTED)
> + cmd = res->a0;
> +
> + } while (res->a0 == QCOM_SCM_INTERRUPTED);
> +}
> +
> +static void qcom_scm_call_do(const struct qcom_scm_desc *desc,
> +  struct arm_smccc_res *res, u32 fn_id,
> +  u64 x5, bool atomic)
> +{

Maybe pass in the call type (ARM_SMCCC_FAST_CALL vs ARM_SMCCC_STD_CALL)
instead of "bool atomic"? Would certainly make the callsites easier to
understand.

> + int retry_count = 0;
> +
> + if (!atomic) {
> + do {
> + mutex_lock(&qcom_scm_lock);
> +
> + __qcom_scm_call_do(desc, res, fn_id, x5,
> +ARM_SMCCC_STD_CALL);
> +
> + mutex_unlock(&qcom_scm_lock);
> +
> + if (res->a0 == QCOM_SCM_V2_EBUSY) {
> + if (retry_count++ > QCOM_SCM_EBUSY_MAX_RETRY)
> + break;
> + msleep(QCOM_SCM_EBUSY_WAIT_MS);
> + }
> + }  while (res->a0 == QCOM_SCM_V2_EBUSY);
> + } else {
> + __qcom_scm_call_do(desc, res, fn_id, x5, ARM_SMCCC_FAST_CALL);
> + }

Is it safe to make concurrent FAST calls?

Will


Re: [PATCH v3 3/4] iommu/arm-smmu: Add support to handle Qcom's wait-for-safe logic

2019-06-18 Thread Will Deacon
Hi Vivek,

On Fri, Jun 14, 2019 at 02:48:07PM +0530, Vivek Gautam wrote:
> On 6/14/2019 9:35 AM, Bjorn Andersson wrote:
> > On Wed 12 Jun 00:15 PDT 2019, Vivek Gautam wrote:
> > 
> > > Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
> > > to address under-performance issues in real-time clients, such as
> > > Display, and Camera.
> > > On receiving an invalidation request, the SMMU forwards a SAFE request
> > > to these clients and waits for SAFE ack signal from real-time clients.
> > > The SAFE signal from such clients is used to qualify the start of
> > > invalidation.
> > > This logic is controlled by chicken bits, one for each - MDP (display),
> > > IFE0, and IFE1 (camera), that can be accessed only from secure software
> > > on sdm845.
> > > 
> > > This configuration, however, degrades the performance of non-real time
> > > clients, such as USB, and UFS etc. This happens because, with wait-for-safe
> > > logic enabled the hardware tries to throttle non-real time clients while
> > > waiting for SAFE ack signals from real-time clients.
> > > 
> > > On MTP sdm845 devices, with wait-for-safe logic enabled at the boot time
> > > by the bootloaders we see degraded performance of USB and UFS when kernel
> > > enables the smmu stage-1 translations for these clients.
> > > Turning off this wait-for-safe logic from the kernel gets us back the perf
> > > of USB and UFS devices until we re-visit this when we start seeing perf
> > > issues on display/camera on upstream supported SDM845 platforms.

Re-visit what exactly, and how?

> > > Now, different bootloaders with their access control policies allow this
> > > register access differently through secure monitor calls -
> > > 1) With one we can issue io-read/write secure monitor call (qcom-scm)
> > > to update the register, while,
> > > 2) With other, such as one on MTP sdm845 we should use the specific
> > > qcom-scm command to send request to do the complete register
> > > configuration.
> > > Adding a separate device tree flag for arm-smmu to identify which
> > > firmware configuration of the two mentioned above we use.
> > > Not adding code change to allow type-(1) bootloaders to toggle the
> > > safe using io-read/write qcom-scm call.
> > > 
> > > This change is inspired by the downstream change from Patrick Daly
> > > to address performance issues with display and camera by handling
> > > this wait-for-safe within separate io-pagetable ops to do TLB
> > > maintenance. So a big thanks to him for the change.
> > > 
> > > Without this change the UFS reads are pretty slow:
> > > $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
> > > 10+0 records in
> > > 10+0 records out
> > > 10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
> > > real0m 22.39s
> > > user0m 0.00s
> > > sys 0m 0.01s
> > > 
> > > With this change they are back to rock!
> > > $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
> > > 300+0 records in
> > > 300+0 records out
> > > 314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
> > > real0m 1.03s
> > > user0m 0.00s
> > > sys 0m 0.54s
> > > 
> > > Signed-off-by: Vivek Gautam 
> > > ---
> > >   drivers/iommu/arm-smmu.c | 16 
> > >   1 file changed, 16 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 0ad086da399c..3c3ad43eda97 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -39,6 +39,7 @@
> > >   #include 
> > >   #include 
> > >   #include 
> > > +#include 
> > >   #include 
> > >   #include 
> > > @@ -177,6 +178,7 @@ struct arm_smmu_device {
> > >   u32 features;
> > >   #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
> > > +#define ARM_SMMU_OPT_QCOM_FW_IMPL_SAFE_ERRATA (1 << 1)
> > >   u32 options;
> > >   enum arm_smmu_arch_version  version;
> > >   enum arm_smmu_implementationmodel;
> > > @@ -262,6 +264,7 @@ static bool using_legacy_binding, using_generic_binding;
> > >   static struct arm_smmu_option_prop arm_smmu_options[] = {
> > >   { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
> > > + { ARM_SMMU_OPT_QCOM_FW_IMPL_SAFE_ERRATA, "qcom,smmu-500-fw-impl-safe-errata" },
> > This should be added to the DT binding as well.
> 
> Ah right. I missed that. Will add this and respin unless Robin and Will have
> concerns with this change.

My only concern really is whether it's safe for us to turn this off. It's
clear that somebody went to a lot of effort to add this extra goodness to
the IP, but your benchmarks suggest they never actually tried it out after
they finished building it.

Is there some downside I'm not seeing from disabling this stuff?

We probably also need some co-ordination with Andy if we're going to
merge this, since he maintains the firmware calling code.

Will

Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables

2019-06-18 Thread Will Deacon
Hi Bjorn,

On Wed, May 15, 2019 at 04:32:34PM -0700, Bjorn Andersson wrote:
> Describe the memory related to page table walks as non-cachable for iommu
> instances that are not DMA coherent.
> 
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/iommu/io-pgtable-arm.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 4e21efbc4459..68ff22ffd2cb 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
>   return NULL;
>  
>   /* TCR */
> - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> -   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> -   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) {
> + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> +   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> +   (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> + } else {
> + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |

Nit: this should be outer-shareable (ARM_LPAE_TCR_SH_OS).

> +   (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) |
> +   (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT);
> + }

Should we also be doing something similar for the short-descriptor code
in io-pgtable-arm-v7s.c? Looks like you just need to use ARM_V7S_RGN_NC
instead of ARM_V7S_RGN_WBWA when initialising ttbr0 for non-coherent
SMMUs.

Will


Re: [PATCH 1/8] iommu: Add I/O ASID allocator

2019-06-18 Thread Jacob Pan
On Tue, 18 Jun 2019 15:22:20 +0100
Jean-Philippe Brucker  wrote:

> On 11/06/2019 19:13, Jacob Pan wrote:
>  +/**
>  + * ioasid_find - Find IOASID data
>  + * @set: the IOASID set
>  + * @ioasid: the IOASID to find
>  + * @getter: function to call on the found object
>  + *
>  + * The optional getter function allows to take a reference to
>  the found object
>  + * under the rcu lock. The function can also check if the object
>  is still valid:
>  + * if @getter returns false, then the object is invalid and NULL
>  is returned.
>  + *
>  + * If the IOASID has been allocated for this set, return the
>  private pointer
>  + * passed to ioasid_alloc. Private data can be NULL if not set.
>  Return an error
>  + * if the IOASID is not found or does not belong to the set.
> >>>
> >>> Perhaps should make it clear that @set can be null.
> >>
> >> Indeed. But I'm not sure allowing @set to be NULL is such a good
> >> idea, because the data type associated to an ioasid depends on its
> >> set. For example SVA will put an mm_struct in there, and auxiliary
> >> domains use some structure private to the IOMMU domain.
> >>  
> > I am not sure we need to count on @set to decipher data type.
> > Whoever does the allocation and owns the IOASID should knows its
> > own data type. My thought was that @set is only used to group IDs,
> > permission check etc.
> >   
> >> Jacob, could me make @set mandatory, or do you see a use for a
> >> global search? If @set is NULL, then callers can check if the
> >> return pointer is NULL, but will run into trouble if they try to
> >> dereference it. 
> > A global search use case can be for PRQ. IOMMU driver gets a IOASID
> > (first interrupt then retrieve from a queue), it has no idea which
> > @set it belongs to. But the data types are the same for all IOASIDs
> > used by the IOMMU.  
> 
> They aren't when we use a generic SVA handler. Following a call to
> iommu_sva_bind_device(), iommu-sva.c allocates an IOASID and store an
> mm_struct. If auxiliary domains are also enabled for the device,
> following a call to iommu_aux_attach_device() the IOMMU driver
> allocates an IOASID and stores some private object.
> 
> Now for example the IOMMU driver receives a PPR and calls
> ioasid_find() with @set = NULL. ioasid_find() may return either an
> mm_struct or a private object, and the driver cannot know which it is
> so the returned value is unusable.
I think we might have a misunderstanding of what ioasid_set is, or I
have misused your original intention for it :) My understanding is that
an ioasid_set identifies a subset of IOASIDs that
1. shares the same data type
2. belongs to the same permission group, owner.

Our usage is #2: we put a mm_struct as the set to do the permission
check. E.g. VFIO can check guest PASID ownership based on the QEMU
process mm. This makes sure an IOASID allocated by one guest can only be
used by the same guest.

For an IOASID used for SVA bind, be it native or guest SVA, we always
have the same data type. Therefore, when the page request handler gets
called to search with ioasid_find(), it knows the data type. An
additional flag will tell whether it is a guest bind or a native bind.

I was under the assumption that an aux domain and its IOASID are a 1:1
mapping, so there is no need for a search. Even if a search were needed,
it would be done by the same caller that did the allocation, which
therefore knows the private data type.

In addition, an aux domain is used for requests w/o PASID, and PPR for
requests w/o PASID is not to be supported, so there would be no need to
search from the page request handler.

Now if we take the above assumption away and use ioasid_set strictly to
differentiate the data types (usage #1), then I agree we can get rid of
the NULL set and global search.

That would require that we converge on the generic sva_bind for both
native and guest. The additional implication is that the VFIO layer has
to be SVA-struct aware in order to sanitize the mm_struct for guest
bind, i.e. check guest ownership by using the QEMU process mm. Whereas
today, VFIO just uses the IOASID set as the mm to check ownership, with
no need to know the type.

Can we keep the NULL set for now and remove it __after__ we converge on
the sva bind flows?

Thanks,

Jacob


Re: [PATCH v4 12/22] iommu: Add I/O ASID allocator

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:12 -0700
Jacob Pan  wrote:

> From: Jean-Philippe Brucker 
> 
> Some devices might support multiple DMA address spaces, in particular
> those that have the PCI PASID feature. PASID (Process Address Space ID)
> allows sharing process address spaces with devices (SVA), partitioning a
> device into VM-assignable entities (VFIO mdev), or simply providing
> multiple DMA address spaces to kernel drivers. Add a global PASID
> allocator usable by different drivers at the same time. Name it I/O ASID
> to avoid confusion with ASIDs allocated by arch code, which are usually
> a separate ID space.
> 
> The IOASID space is global. Each device can have its own PASID space,
> but by convention the IOMMU ended up having a global PASID space, so
> that with SVA, each mm_struct is associated to a single PASID.
> 
> The allocator is primarily used by IOMMU subsystem but in rare occasions
> drivers would like to allocate PASIDs for devices that aren't managed by
> an IOMMU, using the same ID space as IOMMU.
> 
> There are two types of allocators:
> 1. default allocator - Always available, uses an XArray to track
> 2. custom allocators - Can be registered at runtime, take precedence
> over the default allocator.
> 
> Custom allocators have these attributes:
> - provides platform specific alloc()/free() functions with private data.
> - allocation results lookup are not provided by the allocator, lookup
>   request must be done by the IOASID framework by its own XArray.
> - allocators can be unregistered at runtime, either fallback to the next
>   custom allocator or to the default allocator.

What is the usecase for having a 'stack' of custom allocators?

> - custom allocators can share the same set of alloc()/free() helpers, in
>   this case they also share the same IOASID space, thus the same XArray.
> - switching between allocators requires all outstanding IOASIDs to be
>   freed unless the two allocators share the same alloc()/free() helpers.
In general this approach has a lot of features where the justification is
missing from this particular patch.  It may be useful to add some
more background to this intro?

> 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Jacob Pan 
> Link: https://lkml.org/lkml/2019/4/26/462

Various comments inline.  Given the several cups of coffee this took to
review I may well have misunderstood everything ;)

Jonathan
> ---
>  drivers/iommu/Kconfig  |   8 +
>  drivers/iommu/Makefile |   1 +
>  drivers/iommu/ioasid.c | 427 
> +
>  include/linux/ioasid.h |  74 +
>  4 files changed, 510 insertions(+)
>  create mode 100644 drivers/iommu/ioasid.c
>  create mode 100644 include/linux/ioasid.h
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 83664db..c40c4b5 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -3,6 +3,13 @@
>  config IOMMU_IOVA
>   tristate
>  
> +# The IOASID allocator may also be used by non-IOMMU_API users
> +config IOASID
> + tristate
> + help
> +   Enable the I/O Address Space ID allocator. A single ID space shared
> +   between different users.
> +
>  # IOMMU_API always gets selected by whoever wants it.
>  config IOMMU_API
>   bool
> @@ -207,6 +214,7 @@ config INTEL_IOMMU_SVM
>   depends on INTEL_IOMMU && X86
>   select PCI_PASID
>   select MMU_NOTIFIER
> + select IOASID
>   help
> Shared Virtual Memory (SVM) provides a facility for devices
> to access DMA resources through process address space by
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 8c71a15..0efac6f 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -7,6 +7,7 @@ obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_LPAE) += io-pgtable-arm.o
> +obj-$(CONFIG_IOASID) += ioasid.o
>  obj-$(CONFIG_IOMMU_IOVA) += iova.o
>  obj-$(CONFIG_OF_IOMMU)   += of_iommu.o
>  obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
> diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
> new file mode 100644
> index 000..0919b70
> --- /dev/null
> +++ b/drivers/iommu/ioasid.c
> @@ -0,0 +1,427 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * I/O Address Space ID allocator. There is one global IOASID space, split into
> + * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
> + * free IOASIDs with ioasid_alloc and ioasid_free.
> + */
> +#define pr_fmt(fmt)  KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct ioasid_data {
> + ioasid_t id;
> + struct ioasid_set *set;
> + void *private;
> + struct rcu_head rcu;
> +};
> +
> +/*
> + * struct ioasid_allocator_data - Internal data structure to hold information
> + * about an allocator. There are two types of allocators:
> + *
> + * - 

Re: [PATCH v4 20/22] iommu/vt-d: Add bind guest PASID support

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:20 -0700
Jacob Pan  wrote:

> When supporting guest SVA with emulated IOMMU, the guest PASID
> table is shadowed in VMM. Updates to guest vIOMMU PASID table
> will result in PASID cache flush which will be passed down to
> the host as bind guest PASID calls.
> 
> For the SL page tables, it will be harvested from device's
> default domain (request w/o PASID), or aux domain in case of
> mediated device.
> 
> .-.  .---.
> |   vIOMMU|  | Guest process CR3, FL only|
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |CR3 in GPA
> '-'
> Guest
> --| Shadow |--|
>   vv  v
> Host
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.--.
> | |   |SL for GPA-HPA, default domain|
> | |   '--'
> '-'
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu, Yi L 


A few trivial bits inline.  As far as I can tell looks good but I'm not that
familiar with the hardware.

Jonathan

> ---
>  drivers/iommu/intel-iommu.c |   4 +
>  drivers/iommu/intel-svm.c   | 187 
> 
>  include/linux/intel-iommu.h |  13 ++-
>  include/linux/intel-svm.h   |  17 
>  4 files changed, 219 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 7cfa0eb..3b4d712 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5782,6 +5782,10 @@ const struct iommu_ops intel_iommu_ops = {
>   .dev_enable_feat= intel_iommu_dev_enable_feat,
>   .dev_disable_feat   = intel_iommu_dev_disable_feat,
>   .pgsize_bitmap  = INTEL_IOMMU_PGSIZES,
> +#ifdef CONFIG_INTEL_IOMMU_SVM
> + .sva_bind_gpasid= intel_svm_bind_gpasid,
> + .sva_unbind_gpasid  = intel_svm_unbind_gpasid,
> +#endif
>  };
>  
>  static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 66d98e1..f06a82f 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -229,6 +229,193 @@ static LIST_HEAD(global_svm_list);
>   list_for_each_entry(sdev, &svm->devs, list) \
>   if (dev == sdev->dev)   \
>  
> +int intel_svm_bind_gpasid(struct iommu_domain *domain,
> + struct device *dev,
> + struct gpasid_bind_data *data)
> +{
> + struct intel_iommu *iommu = intel_svm_device_to_iommu(dev);
> + struct intel_svm_dev *sdev;
> + struct intel_svm *svm = NULL;
I think this is set in all the paths that use it..

> + struct dmar_domain *ddomain;
> + int ret = 0;
> +
> + if (WARN_ON(!iommu) || !data)
> + return -EINVAL;
> +
> + if (data->version != IOMMU_GPASID_BIND_VERSION_1 ||
> + data->format != IOMMU_PASID_FORMAT_INTEL_VTD)
> + return -EINVAL;
> +
> + if (dev_is_pci(dev)) {
> + /* VT-d supports devices with full 20 bit PASIDs only */
> + if (pci_max_pasids(to_pci_dev(dev)) != PASID_MAX)
> + return -EINVAL;
> + }
> +
> + /*
> +  * We only check host PASID range, we have no knowledge to check
> +  * guest PASID range nor do we use the guest PASID.
> +  */
> + if (data->hpasid <= 0 || data->hpasid >= PASID_MAX)
> + return -EINVAL;
> +
> + ddomain = to_dmar_domain(domain);
> + /* REVISIT:
> +  * Sanity check address width and paging mode support
> +  * width matching in two dimensions:
> +  * 1. paging mode CPU <= IOMMU
> +  * 2. address width Guest <= Host.
> +  */
> + mutex_lock(&pasid_mutex);
> + svm = ioasid_find(NULL, data->hpasid, NULL);
> + if (IS_ERR(svm)) {
> + ret = PTR_ERR(svm);
> + goto out;
> + }
> + if (svm) {
> + /*
> +  * If we found svm for the PASID, there must be at
> +  * least one device bond, otherwise svm should be freed.
> +  */
> + BUG_ON(list_empty(&svm->devs));
> +
> + for_each_svm_dev() {
> + /* In case of multiple sub-devices of the same pdev
> +  * assigned, we should allow multiple bind calls with
> +  * the same PASID and pdev.
> +  */
> + sdev->users++;
> + 

Re: [PATCH v4 19/22] iommu/vt-d: Clean up for SVM device list

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:19 -0700
Jacob Pan  wrote:

> Use combined macro for_each_svm_dev() to simplify SVM device iteration.
> 
> Suggested-by: Andy Shevchenko 
> Signed-off-by: Jacob Pan 
> Reviewed-by: Eric Auger 
> ---
>  drivers/iommu/intel-svm.c | 79 +++
>  1 file changed, 39 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 9cbcc1f..66d98e1 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -225,6 +225,9 @@ static const struct mmu_notifier_ops intel_mmuops = {
>  
>  static DEFINE_MUTEX(pasid_mutex);
>  static LIST_HEAD(global_svm_list);
> +#define for_each_svm_dev() \
> + list_for_each_entry(sdev, &svm->devs, list) \
> + if (dev == sdev->dev)   \

Could we make this macro less opaque and have it take the svm and dev as
arguments?

>  
>  int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_ops *ops)
>  {
> @@ -271,15 +274,13 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>   goto out;
>   }
>  
> - list_for_each_entry(sdev, >devs, list) {
> - if (dev == sdev->dev) {
> - if (sdev->ops != ops) {
> - ret = -EBUSY;
> - goto out;
> - }
> - sdev->users++;
> - goto success;
> + for_each_svm_dev() {
> + if (sdev->ops != ops) {
> + ret = -EBUSY;
> + goto out;
>   }
> + sdev->users++;
> + goto success;
>   }
>  
>   break;
> @@ -409,40 +410,38 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
>   if (!svm)
>   goto out;
>  
> - list_for_each_entry(sdev, &svm->devs, list) {
> - if (dev == sdev->dev) {
> - ret = 0;
> - sdev->users--;
> - if (!sdev->users) {
> - list_del_rcu(&sdev->list);
> - /* Flush the PASID cache and IOTLB for this device.
> -  * Note that we do depend on the hardware *not* using
> -  * the PASID any more. Just as we depend on other
> -  * devices never using PASIDs that they have no right
> -  * to use. We have a *shared* PASID table, because it's
> -  * large and has to be physically contiguous. So it's
> -  * hard to be as defensive as we might like. */
> - intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
> - intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, !svm->mm);
> - kfree_rcu(sdev, rcu);
> -
> - if (list_empty(&svm->devs)) {
> - ioasid_free(svm->pasid);
> - if (svm->mm)
> - mmu_notifier_unregister(&svm->notifier, svm->mm);
> -
> - list_del(&svm->list);
> -
> - /* We mandate that no page faults may be outstanding
> -  * for the PASID when intel_svm_unbind_mm() is called.
> -  * If that is not obeyed, subtle errors will happen.
> -  * Let's make them less subtle... */
> - memset(svm, 0x6b, sizeof(*svm));
> - kfree(svm);
> - }
> + for_each_svm_dev() {
> + ret = 0;
> + sdev->users--;
> + if (!sdev->users) {
> + list_del_rcu(&sdev->list);
> + /* Flush the PASID cache and IOTLB for this device.
> +  * Note that we do depend on the hardware *not* using
> +  * the PASID any more. Just as we depend on other
> +  * devices never using PASIDs that they have no right
> +  * to use. We have a *shared* PASID table, because it's
> +  * large and has to be physically contiguous. So it's
> +  * hard to be as defensive as we might like. */
> + intel_pasid_tear_down_entry(iommu, dev, svm->pasid);
> + intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, 

Re: [PATCH v7 20/21] iommu/mediatek: Fix iova_to_phys PA start for 4GB mode

2019-06-18 Thread Matthias Brugger



On 10/06/2019 14:17, Yong Wu wrote:
> In the 4GB mode, the physical address is remapped,
> 
> Here is the detailed remap relationship.
> CPU PA -> HW PA
> 0x4000_0000  0x1_4000_0000 (Add bit32)
> 0x8000_0000  0x1_8000_0000 ...
> 0xc000_0000  0x1_c000_0000 ...
> 0x1_0000_0000  0x1_0000_0000 (No change)
> 
> Thus, we always add bit32 for PA when entering mtk_iommu_map.
> But in the iova_to_phys, the CPU doesn't need this bit32 if the
> PA is from 0x1_4000_0000 to 0x1_ffff_ffff.
> This patch discards the bit32 in iova_to_phys in the 4GB mode.
> 
> Fixes: 30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range
> for 4GB mode")
> Signed-off-by: Yong Wu 
> ---
>  drivers/iommu/mtk_iommu.c | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 67cab2d..34f2e40 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -119,6 +119,19 @@ struct mtk_iommu_domain {
>  
>  static const struct iommu_ops mtk_iommu_ops;
>  
> +/*
> + * In M4U 4GB mode, the physical address is remapped as below:
> + *  CPU PA ->   M4U HW PA
> + *  0x4000_0000 0x1_4000_0000 (Add bit32)
> + *  0x8000_0000 0x1_8000_0000 ...
> + *  0xc000_0000 0x1_c000_0000 ...
> + *  0x1_0000_0000   0x1_0000_0000 (No change)
> + *
> + * Thus, we always add BIT32 in the iommu_map and disable BIT32 if PA is >=
> + * 0x1_4000_0000 in the iova_to_phys.
> + */
> +#define MTK_IOMMU_4GB_MODE_PA_140000000 0x140000000UL
> +
>  static LIST_HEAD(m4ulist);   /* List all the M4U HWs */
>  
>  #define for_each_m4u(data)   list_for_each_entry(data, , list)
> @@ -415,6 +428,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
> dma_addr_t iova)
>  {
>   struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> + struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
>   unsigned long flags;
>   phys_addr_t pa;
>  
> @@ -422,6 +436,10 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
>   pa = dom->iop->iova_to_phys(dom->iop, iova);
>   spin_unlock_irqrestore(&dom->pgtlock, flags);
>  
> + if (data->plat_data->has_4gb_mode && data->dram_is_4gb &&
> + pa >= MTK_IOMMU_4GB_MODE_PA_140000000)
> + pa &= ~BIT_ULL(32);
> +

Hm, I wonder if we could fix this as first patch in the series, especially 
before:
"[PATCH 06/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode"

This would make it easier for the stable maintainer to cherry-pick the fix.
Without 100% understanding the code, it seems suspicious to me that you first
move the setting of bit32 and bit33 into v7s and later explicitly clear the
bits here.

So my take on this is that patch 6/21 introduced the regression you are trying
to fix here. As said, that is speculation, as I don't understand the code in
its entirety.

Any clarification would be useful.

Regards,
Matthias

>   return pa;
>  }
>  
> 
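The remap and its inverse round-trip cleanly, which can be checked with a small userspace model. The helper names are illustrative stand-ins for the map/iova_to_phys paths, and the macro value is reconstructed from the remap table in the commit message, not copied from the driver:

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

/* First HW PA with no low-CPU-PA alias in 4GB mode (0x1_4000_0000). */
#define MTK_IOMMU_4GB_MODE_PA_140000000 0x140000000ULL

/* map path: in 4GB mode the M4U cannot use the low remap of DRAM,
 * so bit 32 is always set on the PA handed to the hardware. */
static uint64_t cpu_pa_to_hw_pa(uint64_t pa)
{
        return pa | BIT_ULL(32);
}

/* iova_to_phys path: drop bit 32 again, except for PAs that only
 * exist above 0x1_4000_0000 and therefore were never remapped. */
static uint64_t hw_pa_to_cpu_pa(uint64_t pa)
{
        if (pa >= MTK_IOMMU_4GB_MODE_PA_140000000)
                pa &= ~BIT_ULL(32);
        return pa;
}
```

Note that a HW PA of 0x1_0000_0000 (the "No change" row) falls below the threshold and keeps bit 32, which is why the threshold test rather than an unconditional clear is needed.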


Re: [PATCH v7 19/21] iommu/mediatek: Rename enable_4GB to dram_is_4gb

2019-06-18 Thread Matthias Brugger



On 10/06/2019 14:17, Yong Wu wrote:
> This patch only renames the variable from enable_4GB to
> dram_is_4gb for readability.

From my understanding this is true when available RAM > 4GB, so I think the name
should be something like dram_bigger_4gb, otherwise it may create confusion
again.

Also from my point of view this patch should be done before
"[PATCH 06/21] iommu/io-pgtable-arm-v7s: Extend MediaTek 4GB Mode"

Regards,
Matthias

> 
> Signed-off-by: Yong Wu 
> Reviewed-by: Evan Green 
> ---
>  drivers/iommu/mtk_iommu.c | 10 +-
>  drivers/iommu/mtk_iommu.h |  2 +-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 86158d8..67cab2d 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -382,7 +382,7 @@ static int mtk_iommu_map(struct iommu_domain *domain, unsigned long iova,
>   int ret;
>  
>   /* The "4GB mode" M4U physically can not use the lower remap of Dram. */
> - if (data->plat_data->has_4gb_mode && data->enable_4GB)
> + if (data->plat_data->has_4gb_mode && data->dram_is_4gb)
>   paddr |= BIT_ULL(32);
>  
>   spin_lock_irqsave(>pgtlock, flags);
> @@ -554,13 +554,13 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
>   writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
>  
>   if (data->plat_data->m4u_plat == M4U_MT8173)
> - regval = (data->protect_base >> 1) | (data->enable_4GB << 31);
> + regval = (data->protect_base >> 1) | (data->dram_is_4gb << 31);
>   else
>   regval = lower_32_bits(data->protect_base) |
>upper_32_bits(data->protect_base);
>   writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>  
> - if (data->enable_4GB && data->plat_data->has_vld_pa_rng) {
> + if (data->dram_is_4gb && data->plat_data->has_vld_pa_rng) {
>   /*
>* If 4GB mode is enabled, the validate PA range is from
> +  * 0x1_0000_0000 to 0x1_ffff_ffff. Here record bit[32:30].
> @@ -611,8 +611,8 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   return -ENOMEM;
>   data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
>  
> - /* Whether the current dram is over 4GB */
> - data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
> + /* Whether the current dram is 4GB. */
> + data->dram_is_4gb = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   data->base = devm_ioremap_resource(dev, res);
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 753266b..e8114b2 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -65,7 +65,7 @@ struct mtk_iommu_data {
>   struct mtk_iommu_domain *m4u_dom;
>   struct iommu_group  *m4u_group;
>   struct mtk_smi_iommusmi_imu;  /* SMI larb iommu info */
> - boolenable_4GB;
> + booldram_is_4gb;
>   booltlb_flush_active;
>  
>   struct iommu_device iommu;
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 17/22] iommu/vt-d: Avoid duplicated code for PASID setup

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:17 -0700
Jacob Pan  wrote:

> After each setup for PASID entry, related translation caches must be flushed.
> We can combine duplicated code into one function which is less error prone.
> 
> Signed-off-by: Jacob Pan 
Formatting nitpick below ;)

Otherwise it's cut and paste
> ---
>  drivers/iommu/intel-pasid.c | 48 +
>  1 file changed, 18 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 1e25539..1ff2ecc 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -522,6 +522,21 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
>   devtlb_invalidation_with_pasid(iommu, dev, pasid);
>  }
>  
> +static inline void pasid_flush_caches(struct intel_iommu *iommu,
> + struct pasid_entry *pte,
> + int pasid, u16 did)
> +{
> + if (!ecap_coherent(iommu->ecap))
> + clflush_cache_range(pte, sizeof(*pte));
> +
> + if (cap_caching_mode(iommu->cap)) {
> + pasid_cache_invalidation_with_pasid(iommu, did, pasid);
> + iotlb_invalidation_with_pasid(iommu, did, pasid);
> + } else
> + iommu_flush_write_buffer(iommu);

I have some vague recollection that kernel style says to use braces around
single lines if other branches in an if/else chain have multiple lines...

I checked, this case is specifically called out

https://www.kernel.org/doc/html/v5.1/process/coding-style.html
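For reference, a userspace sketch of the helper with that brace rule applied. The flush functions here are stubbed counters so the control flow actually runs; they are stand-ins, not the kernel symbols:

```c
#include <assert.h>

/* Stubbed flush primitives; names mirror the patch but are not the
 * kernel functions. */
static int cache_flushes, iotlb_flushes, wbuf_flushes;
static void pasid_cache_invalidation(void) { cache_flushes++; }
static void iotlb_invalidation(void)       { iotlb_flushes++; }
static void iommu_flush_write_buffer(void) { wbuf_flushes++; }

/* Per the kernel coding style: if one branch of an if/else needs
 * braces, every branch gets them. */
static void pasid_flush_caches(int caching_mode)
{
        if (caching_mode) {
                pasid_cache_invalidation();
                iotlb_invalidation();
        } else {
                iommu_flush_write_buffer();
        }
}
```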
> +
This blank line doesn't add anything either ;)
> +}
> +
>  /*
>   * Set up the scalable mode pasid table entry for first only
>   * translation type.
> @@ -567,16 +582,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
>   /* Setup Present and PASID Granular Transfer Type: */
>   pasid_set_translation_type(pte, 1);
>   pasid_set_present(pte);
> -
> - if (!ecap_coherent(iommu->ecap))
> - clflush_cache_range(pte, sizeof(*pte));
> -
> - if (cap_caching_mode(iommu->cap)) {
> - pasid_cache_invalidation_with_pasid(iommu, did, pasid);
> - iotlb_invalidation_with_pasid(iommu, did, pasid);
> - } else {
> - iommu_flush_write_buffer(iommu);
> - }
> + pasid_flush_caches(iommu, pte, pasid, did);
>  
>   return 0;
>  }
> @@ -640,16 +646,7 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
>*/
>   pasid_set_sre(pte);
>   pasid_set_present(pte);
> -
> - if (!ecap_coherent(iommu->ecap))
> - clflush_cache_range(pte, sizeof(*pte));
> -
> - if (cap_caching_mode(iommu->cap)) {
> - pasid_cache_invalidation_with_pasid(iommu, did, pasid);
> - iotlb_invalidation_with_pasid(iommu, did, pasid);
> - } else {
> - iommu_flush_write_buffer(iommu);
> - }
> + pasid_flush_caches(iommu, pte, pasid, did);
>  
>   return 0;
>  }
> @@ -683,16 +680,7 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
>*/
>   pasid_set_sre(pte);
>   pasid_set_present(pte);
> -
> - if (!ecap_coherent(iommu->ecap))
> - clflush_cache_range(pte, sizeof(*pte));
> -
> - if (cap_caching_mode(iommu->cap)) {
> - pasid_cache_invalidation_with_pasid(iommu, did, pasid);
> - iotlb_invalidation_with_pasid(iommu, did, pasid);
> - } else {
> - iommu_flush_write_buffer(iommu);
> - }
> + pasid_flush_caches(iommu, pte, pasid, did);
>  
>   return 0;
>  }




Re: [PATCH v4 15/22] iommu/vt-d: Replace Intel specific PASID allocator with IOASID

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:15 -0700
Jacob Pan  wrote:

> Make use of generic IOASID code to manage PASID allocation,
> free, and lookup. Replace Intel specific code.
> 
> Signed-off-by: Jacob Pan 
Hi Jacob,

One question inline.

Jonathan

> ---
>  drivers/iommu/intel-iommu.c | 11 +--
>  drivers/iommu/intel-pasid.c | 36 
>  drivers/iommu/intel-svm.c   | 37 +
>  3 files changed, 26 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 5b84994..39b63fe 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5167,7 +5167,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
>   domain->auxd_refcnt--;
>  
>   if (!domain->auxd_refcnt && domain->default_pasid > 0)
> - intel_pasid_free_id(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>  }
>  
>  static int aux_domain_add_dev(struct dmar_domain *domain,
> @@ -5185,10 +5185,9 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
>   if (domain->default_pasid <= 0) {
>   int pasid;
>  
> - pasid = intel_pasid_alloc_id(domain, PASID_MIN,
> -  pci_max_pasids(to_pci_dev(dev)),
> -  GFP_KERNEL);
> - if (pasid <= 0) {
> + pasid = ioasid_alloc(NULL, PASID_MIN,
> +  pci_max_pasids(to_pci_dev(dev)) - 1,
> +  domain);

Is there any point in passing the domain in as the private pointer here?
I can't immediately see anywhere it is read back?

It is also rather confusing as the same driver stashes two different types of
data in the same xarray.

> + if (pasid == INVALID_IOASID) {
>   pr_err("Can't allocate default pasid\n");
>   return -ENODEV;
>   }
> @@ -5224,7 +5223,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
>   spin_unlock(&iommu->lock);
>   spin_unlock_irqrestore(&device_domain_lock, flags);
>   if (!domain->auxd_refcnt && domain->default_pasid > 0)
> - intel_pasid_free_id(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>  
>   return ret;
>  }
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 69fddd3..1e25539 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -26,42 +26,6 @@
>   */
>  static DEFINE_SPINLOCK(pasid_lock);
>  u32 intel_pasid_max_id = PASID_MAX;
> -static DEFINE_IDR(pasid_idr);
> -
> -int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
> -{
> - int ret, min, max;
> -
> - min = max_t(int, start, PASID_MIN);
> - max = min_t(int, end, intel_pasid_max_id);
> -
> - WARN_ON(in_interrupt());
> - idr_preload(gfp);
> - spin_lock(&pasid_lock);
> - ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
> - spin_unlock(&pasid_lock);
> - idr_preload_end();
> -
> - return ret;
> -}
> -
> -void intel_pasid_free_id(int pasid)
> -{
> - spin_lock(&pasid_lock);
> - idr_remove(&pasid_idr, pasid);
> - spin_unlock(&pasid_lock);
> -}
> -
> -void *intel_pasid_lookup_id(int pasid)
> -{
> - void *p;
> -
> - spin_lock(&pasid_lock);
> - p = idr_find(&pasid_idr, pasid);
> - spin_unlock(&pasid_lock);
> -
> - return p;
> -}
>  
>  int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
>  {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 8f87304..9cbcc1f 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "intel-pasid.h"
> @@ -332,16 +333,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>   if (pasid_max > intel_pasid_max_id)
>   pasid_max = intel_pasid_max_id;
>  
> - /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
> - ret = intel_pasid_alloc_id(svm,
> -!!cap_caching_mode(iommu->cap),
> -pasid_max - 1, GFP_KERNEL);
> - if (ret < 0) {
> + /* Do not use PASID 0, reserved for RID to PASID */
> + svm->pasid = ioasid_alloc(NULL, PASID_MIN,
> + pasid_max - 1, svm);
> + if (svm->pasid == INVALID_IOASID) {
>   kfree(svm);
>   kfree(sdev);
> + ret = -ENOSPC;
>   goto out;
>   }
> - svm->pasid = ret;
> + svm->notifier.ops = &intel_mmuops;
>   svm->mm = mm;
>   svm->flags = flags;
> @@ -351,7 +351,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
>   if (mm) {
>  

Re: [PATCH v4 04/22] iommu: Add recoverable fault reporting

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:04 -0700
Jacob Pan  wrote:

> From: Jean-Philippe Brucker 
> 
> Some IOMMU hardware features, for example PCI's PRI and Arm SMMU's Stall,
> enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page
> Requests and Stall events through the new fault reporting API. The
> consumer of the fault can be either an I/O page fault handler in the host,
> or a guest OS.
> 
> Once handled, the fault must be completed by sending a page response back
> to the IOMMU. Add an iommu_page_response() function to complete a page
> fault.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Jean-Philippe Brucker 
One totally trivial ordering of docs comment in here.  Otherwise good.

Jonathan
> ---
>  drivers/iommu/iommu.c | 77 ++-
>  include/linux/iommu.h | 51 ++
>  2 files changed, 127 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 7955184..13b301c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -869,7 +869,14 @@ EXPORT_SYMBOL_GPL(iommu_group_unregister_notifier);
>   * @data: private data passed as argument to the handler
>   *
>   * When an IOMMU fault event is received, this handler gets called with the
> - * fault event and data as argument.
> + * fault event and data as argument. The handler should return 0 on success. If
> + * the fault is recoverable (IOMMU_FAULT_PAGE_REQ), the handler should also
> + * complete the fault by calling iommu_page_response() with one of the following
> + * response codes:
> + * - IOMMU_PAGE_RESP_SUCCESS: retry the translation
> + * - IOMMU_PAGE_RESP_INVALID: terminate the fault
> + * - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop reporting
> + *   page faults if possible.
>   *
>   * Return 0 if the fault handler was installed successfully, or an error.
>   */
> @@ -904,6 +911,8 @@ int iommu_register_device_fault_handler(struct device *dev,
>   }
>   param->fault_param->handler = handler;
>   param->fault_param->data = data;
> + mutex_init(&param->fault_param->lock);
> + INIT_LIST_HEAD(&param->fault_param->faults);
>  
>  done_unlock:
>   mutex_unlock(&param->lock);
> @@ -934,6 +943,12 @@ int iommu_unregister_device_fault_handler(struct device *dev)
>   if (!param->fault_param)
>   goto unlock;
>  
> + /* we cannot unregister handler if there are pending faults */
> + if (!list_empty(&param->fault_param->faults)) {
> + ret = -EBUSY;
> + goto unlock;
> + }
> +
>   kfree(param->fault_param);
>   param->fault_param = NULL;
>   put_device(dev);
> @@ -958,6 +973,7 @@ EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler);
>  int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
>  {
>   struct iommu_param *param = dev->iommu_param;
> + struct iommu_fault_event *evt_pending;
>   struct iommu_fault_param *fparam;
>   int ret = 0;
>  
> @@ -972,6 +988,20 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
>   ret = -EINVAL;
>   goto done_unlock;
>   }
> +
> + if (evt->fault.type == IOMMU_FAULT_PAGE_REQ &&
> + (evt->fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
> + evt_pending = kmemdup(evt, sizeof(struct iommu_fault_event),
> +   GFP_KERNEL);
> + if (!evt_pending) {
> + ret = -ENOMEM;
> + goto done_unlock;
> + }
> + mutex_lock(&fparam->lock);
> + list_add_tail(&evt_pending->list, &fparam->faults);
> + mutex_unlock(&fparam->lock);
> + }
> +
>   ret = fparam->handler(evt, fparam->data);
>  done_unlock:
>   mutex_unlock(&param->lock);
> @@ -1513,6 +1543,51 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_attach_device);
>  
> +int iommu_page_response(struct device *dev,
> + struct page_response_msg *msg)
> +{
> + struct iommu_param *param = dev->iommu_param;
> + int ret = -EINVAL;
> + struct iommu_fault_event *evt;
> + struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +
> + if (!domain || !domain->ops->page_response)
> + return -ENODEV;
> +
> + /*
> +  * Device iommu_param should have been allocated when device is
> +  * added to its iommu_group.
> +  */
> + if (!param || !param->fault_param)
> + return -EINVAL;
> +
> + /* Only send response if there is a fault report pending */
> + mutex_lock(&param->fault_param->lock);
> + if (list_empty(&param->fault_param->faults)) {
> + pr_warn("no pending PRQ, drop response\n");
> + goto done_unlock;
> + }
> + /*
> +  * Check if we have a matching page request pending to respond,
> +  * otherwise return -EINVAL
> +  */
> + 
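The report/response pairing above — queue a page request when it is reported, accept a response only while a matching request is still pending — can be modeled in a few lines of userspace C. Names are illustrative; the kernel uses a mutex-protected list_head rather than this singly linked list:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal model of the pending page-request list. */
struct fault {
        int id;
        struct fault *next;
};

static struct fault *pending;

/* Reported page requests are queued until responded to. */
static int report_fault(int id)
{
        struct fault *f = malloc(sizeof(*f));

        if (!f)
                return -1;
        f->id = id;
        f->next = pending;
        pending = f;
        return 0;
}

/* A response is only accepted if a matching request is pending,
 * mirroring the "no pending PRQ, drop response" check. */
static int page_response(int id)
{
        struct fault **pp;

        for (pp = &pending; *pp; pp = &(*pp)->next) {
                if ((*pp)->id == id) {
                        struct fault *f = *pp;

                        *pp = f->next;
                        free(f);
                        return 0;
                }
        }
        return -22;     /* -EINVAL: nothing pending for this request */
}
```

This also shows why unregistering the handler must fail with -EBUSY while the list is non-empty: a queued request would otherwise never be completed.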

Re: [PATCH v4 02/22] iommu: Introduce device fault data

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:02 -0700
Jacob Pan  wrote:

> Device faults detected by IOMMU can be reported outside the IOMMU
> subsystem for further processing. This patch introduces
> a generic device fault data structure.
> 
> The fault can be either an unrecoverable fault or a page request,
> also referred to as a recoverable fault.
> 
> We only care about non internal faults that are likely to be reported
> to an external subsystem.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Eric Auger 

A few trivial nitpicks in here.

Otherwise looks straight forward and sensible to me.

Jonathan
> ---
>  include/linux/iommu.h  |  44 +
>  include/uapi/linux/iommu.h | 118 +
>  2 files changed, 162 insertions(+)
>  create mode 100644 include/uapi/linux/iommu.h
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index a815cf6..7890a92 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define IOMMU_READ   (1 << 0)
>  #define IOMMU_WRITE  (1 << 1)
> @@ -49,6 +50,7 @@ struct device;
>  struct iommu_domain;
>  struct notifier_block;
>  struct iommu_sva;
> +struct iommu_fault_event;
>  
>  /* iommu fault flags */
>  #define IOMMU_FAULT_READ 0x0
> @@ -58,6 +60,7 @@ typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
>   struct device *, unsigned long, int, void *);
>  typedef int (*iommu_mm_exit_handler_t)(struct device *dev, struct iommu_sva *,
>  void *);
> +typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault_event *, void *);
>  
>  struct iommu_domain_geometry {
>   dma_addr_t aperture_start; /* First address that can be mapped*/
> @@ -301,6 +304,46 @@ struct iommu_device {
>   struct device *dev;
>  };
>  
> +/**
> + * struct iommu_fault_event - Generic fault event
> + *
> + * Can represent recoverable faults such as a page requests or
> + * unrecoverable faults such as DMA or IRQ remapping faults.
> + *
> + * @fault: fault descriptor
> + * @iommu_private: used by the IOMMU driver for storing fault-specific
> + * data. Users should not modify this field before
> + * sending the fault response.
> + */
> +struct iommu_fault_event {
> + struct iommu_fault fault;
> + u64 iommu_private;
> +};
> +
> +/**
> + * struct iommu_fault_param - per-device IOMMU fault data
> + * @dev_fault_handler: Callback function to handle IOMMU faults at device level
> + * @data: handler private data
> + *

No need for this blank line.  Seems inconsistent with other docs in here.

Also, there is a docs fixup in patch 3 which should be pulled back to here.

> + */
> +struct iommu_fault_param {
> + iommu_dev_fault_handler_t handler;
> + void *data;
> +};
> +
> +/**
> + * struct iommu_param - collection of per-device IOMMU data
> + *
> + * @fault_param: IOMMU detected device fault reporting data
> + *
> + * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
> + *   struct iommu_group  *iommu_group;
> + *   struct iommu_fwspec *iommu_fwspec;
> + */
> +struct iommu_param {
> + struct iommu_fault_param *fault_param;
Is it actually worth having the indirection of a pointer here as opposed
to just embedding the actual structure?  The null value is used in places
but can just use the handler being null for the same job I think...

It reduces the code needed in patch 3 a bit.

It gets a bit bigger in patch 4, but is still only about 16 bytes.
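A sketch of the suggested alternative — embedding the struct and using a NULL handler as the "not registered" marker. Type and field names follow the patch, but the helpers are hypothetical, not the proposed kernel API:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*iommu_dev_fault_handler_t)(void *evt, void *data);

struct iommu_fault_param {
        iommu_dev_fault_handler_t handler;
        void *data;
};

/* Embedded rather than pointed-to: no separate allocation, and
 * "is a handler registered?" becomes a NULL check on .handler. */
struct iommu_param {
        struct iommu_fault_param fault_param;
};

static int register_handler(struct iommu_param *p,
                            iommu_dev_fault_handler_t h, void *data)
{
        if (p->fault_param.handler)
                return -16;     /* -EBUSY: a handler is already installed */
        p->fault_param.handler = h;
        p->fault_param.data = data;
        return 0;
}

static int dummy_handler(void *evt, void *data)
{
        (void)evt;
        (void)data;
        return 0;
}
```

The trade-off is the ~16 bytes always present in struct iommu_param versus the pointer indirection and kzalloc/kfree in the registration path.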

> +};
> +
>  int  iommu_device_register(struct iommu_device *iommu);
>  void iommu_device_unregister(struct iommu_device *iommu);
>  int  iommu_device_sysfs_add(struct iommu_device *iommu,
> @@ -504,6 +547,7 @@ struct iommu_ops {};
>  struct iommu_group {};
>  struct iommu_fwspec {};
>  struct iommu_device {};
> +struct iommu_fault_param {};
>  
>  static inline bool iommu_present(struct bus_type *bus)
>  {
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> new file mode 100644
> index 000..aaa3b6a
> --- /dev/null
> +++ b/include/uapi/linux/iommu.h
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * IOMMU user API definitions
> + */
> +
> +#ifndef _UAPI_IOMMU_H
> +#define _UAPI_IOMMU_H
> +
> +#include 
> +
> +#define IOMMU_FAULT_PERM_READ    (1 << 0) /* read */
> +#define IOMMU_FAULT_PERM_WRITE   (1 << 1) /* write */
> +#define IOMMU_FAULT_PERM_EXEC    (1 << 2) /* exec */
> +#define IOMMU_FAULT_PERM_PRIV    (1 << 3) /* privileged */
> +
> +/* Generic fault types, can be expanded IRQ remapping fault */
> +enum iommu_fault_type {
> + IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */
> + IOMMU_FAULT_PAGE_REQ,   /* page request fault */
> +};
> +
> +enum 

Re: [PATCH v4 08/22] iommu: Introduce attach/detach_pasid_table API

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:08 -0700
Jacob Pan  wrote:

> In virtualization use case, when a guest is assigned
> a PCI host device, protected by a virtual IOMMU on the guest,
> the physical IOMMU must be programmed to be consistent with
> the guest mappings. If the physical IOMMU supports two
> translation stages it makes sense to program guest mappings
> onto the first stage/level (ARM/Intel terminology) while the host
> owns the stage/level 2.
> 
> In that case, it is mandated to trap on guest configuration
> settings and pass those to the physical iommu driver.
> 
> This patch adds a new API to the iommu subsystem that allows
> to set/unset the pasid table information.
> 
> A generic iommu_pasid_table_config struct is introduced in
> a new iommu.h uapi header. This is going to be used by the VFIO
> user API.

Another case where strictly speaking stuff is introduced that this series
doesn't use.  I don't know what the plans are to merge the various
related series though so this might make sense in general. Right now
it just bloats this series a bit..
> 
> Signed-off-by: Jean-Philippe Brucker 
> Signed-off-by: Liu, Yi L 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Eric Auger 
> Reviewed-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/iommu.c  | 19 +
>  include/linux/iommu.h  | 18 
>  include/uapi/linux/iommu.h | 52 ++
>  3 files changed, 89 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 166adb8..4496ccd 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1619,6 +1619,25 @@ int iommu_page_response(struct device *dev,
>  }
>  EXPORT_SYMBOL_GPL(iommu_page_response);
>  
> +int iommu_attach_pasid_table(struct iommu_domain *domain,
> +  struct iommu_pasid_table_config *cfg)
> +{
> + if (unlikely(!domain->ops->attach_pasid_table))
> + return -ENODEV;
> +
> + return domain->ops->attach_pasid_table(domain, cfg);
> +}
> +EXPORT_SYMBOL_GPL(iommu_attach_pasid_table);
> +
> +void iommu_detach_pasid_table(struct iommu_domain *domain)
> +{
> + if (unlikely(!domain->ops->detach_pasid_table))
> + return;
> +
> + domain->ops->detach_pasid_table(domain);
> +}
> +EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
> struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 950347b..d3edb10 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -264,6 +264,8 @@ struct page_response_msg {
>   * @sva_unbind: Unbind process address space from device
>   * @sva_get_pasid: Get PASID associated to a SVA handle
>   * @page_response: handle page request response
> + * @attach_pasid_table: attach a pasid table
> + * @detach_pasid_table: detach the pasid table
>   * @pgsize_bitmap: bitmap of all possible supported page sizes
>   */
>  struct iommu_ops {
> @@ -323,6 +325,9 @@ struct iommu_ops {
> void *drvdata);
>   void (*sva_unbind)(struct iommu_sva *handle);
>   int (*sva_get_pasid)(struct iommu_sva *handle);
> + int (*attach_pasid_table)(struct iommu_domain *domain,
> +   struct iommu_pasid_table_config *cfg);
> + void (*detach_pasid_table)(struct iommu_domain *domain);
>  
>   int (*page_response)(struct device *dev, struct page_response_msg *msg);
>  
> @@ -434,6 +439,9 @@ extern int iommu_attach_device(struct iommu_domain *domain,
>  struct device *dev);
>  extern void iommu_detach_device(struct iommu_domain *domain,
>   struct device *dev);
> +extern int iommu_attach_pasid_table(struct iommu_domain *domain,
> + struct iommu_pasid_table_config *cfg);
> +extern void iommu_detach_pasid_table(struct iommu_domain *domain);
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
> @@ -947,6 +955,13 @@ iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
>   return -ENODEV;
>  }
>  
> +static inline
> +int iommu_attach_pasid_table(struct iommu_domain *domain,
> +  struct iommu_pasid_table_config *cfg)
> +{
> + return -ENODEV;
> +}
> +
>  static inline struct iommu_sva *
>  iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void *drvdata)
>  {
> @@ -968,6 +983,9 @@ static inline int iommu_sva_get_pasid(struct iommu_sva *handle)
>   return IOMMU_PASID_INVALID;
>  }
>  
> +static inline
> +void iommu_detach_pasid_table(struct iommu_domain *domain) {}
> +
>  #endif /* CONFIG_IOMMU_API */
>  
>  #ifdef CONFIG_IOMMU_DEBUGFS
> diff --git 

Re: [PATCH v4 09/22] iommu: Introduce cache_invalidate API

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:09 -0700
Jacob Pan  wrote:

> From: Liu Yi L 
> 
> In any virtualization use case, when the first translation stage
> is "owned" by the guest OS, the host IOMMU driver has no knowledge
> of caching structure updates unless the guest invalidation activities
> are trapped by the virtualizer and passed down to the host.
> 
> Since the invalidation data are obtained from user space and will be
> written into the physical IOMMU, we must allow security checks at various
> layers. Therefore, a generic invalidation data format is proposed here;
> model-specific IOMMU drivers need to convert it into their own format.
> 
> Signed-off-by: Liu Yi L 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Ashok Raj 
> Signed-off-by: Eric Auger 
> Signed-off-by: Jean-Philippe Brucker 
Some comment ordering nitpicks.  Nothing important.

Jonathan

> ---
>  drivers/iommu/iommu.c  |  10 +
>  include/linux/iommu.h  |  14 ++
>  include/uapi/linux/iommu.h | 110 
> +
>  3 files changed, 134 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 4496ccd..1758b57 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1638,6 +1638,16 @@ void iommu_detach_pasid_table(struct iommu_domain 
> *domain)
>  }
>  EXPORT_SYMBOL_GPL(iommu_detach_pasid_table);
>  
> +int iommu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
> +struct iommu_cache_invalidate_info *inv_info)
> +{
> + if (unlikely(!domain->ops->cache_invalidate))
> + return -ENODEV;
> +
> + return domain->ops->cache_invalidate(domain, dev, inv_info);
> +}
> +EXPORT_SYMBOL_GPL(iommu_cache_invalidate);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
> struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index d3edb10..7a37336 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -266,6 +266,7 @@ struct page_response_msg {
>   * @page_response: handle page request response
>   * @attach_pasid_table: attach a pasid table
>   * @detach_pasid_table: detach the pasid table
> + * @cache_invalidate: invalidate translation caches
>   * @pgsize_bitmap: bitmap of all possible supported page sizes
>   */
>  struct iommu_ops {
> @@ -330,6 +331,8 @@ struct iommu_ops {
>   void (*detach_pasid_table)(struct iommu_domain *domain);
>  
>   int (*page_response)(struct device *dev, struct page_response_msg *msg);
> + int (*cache_invalidate)(struct iommu_domain *domain, struct device *dev,
> + struct iommu_cache_invalidate_info *inv_info);
>  
>   unsigned long pgsize_bitmap;
>  };
> @@ -442,6 +445,9 @@ extern void iommu_detach_device(struct iommu_domain 
> *domain,
>  extern int iommu_attach_pasid_table(struct iommu_domain *domain,
>   struct iommu_pasid_table_config *cfg);
>  extern void iommu_detach_pasid_table(struct iommu_domain *domain);
> +extern int iommu_cache_invalidate(struct iommu_domain *domain,
> +   struct device *dev,
> +   struct iommu_cache_invalidate_info *inv_info);
>  extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
>  extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
>  extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
> @@ -986,6 +992,14 @@ static inline int iommu_sva_get_pasid(struct iommu_sva 
> *handle)
>  static inline
>  void iommu_detach_pasid_table(struct iommu_domain *domain) {}
>  
> +static inline int
> +iommu_cache_invalidate(struct iommu_domain *domain,
> +struct device *dev,
> +struct iommu_cache_invalidate_info *inv_info)
> +{
> + return -ENODEV;
> +}
> +
>  #endif /* CONFIG_IOMMU_API */
>  
>  #ifdef CONFIG_IOMMU_DEBUGFS
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 3976767..ca4b753 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -167,4 +167,114 @@ struct iommu_pasid_table_config {
>   };
>  };
>  
> +/* defines the granularity of the invalidation */
> +enum iommu_inv_granularity {
> + IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */
> + IOMMU_INV_GRANU_PASID,  /* PASID-selective invalidation */
> + IOMMU_INV_GRANU_ADDR,   /* page-selective invalidation */
> + IOMMU_INV_GRANU_NR, /* number of invalidation granularities */
> +};
> +
> +/**
> + * struct iommu_inv_addr_info - Address Selective Invalidation Structure
> + *
> + * @flags: indicates the granularity of the address-selective invalidation
> + * - If the PASID bit is set, the @pasid field is populated and the 
> invalidation
> + *   relates to cache entries tagged with this PASID and matching the address
> + *   range.
> + * - If ARCHID bit is set, @archid is populated and the 

Re: [PATCH v4 11/22] iommu: Introduce guest PASID bind function

2019-06-18 Thread Jean-Philippe Brucker
On 09/06/2019 14:44, Jacob Pan wrote:
> Guest shared virtual address (SVA) may require host to shadow guest
> PASID tables. Guest PASID can also be allocated from the host via
> enlightened interfaces. In this case, guest needs to bind the guest
> mm, i.e. cr3 in guest physical address to the actual PASID table in
> the host IOMMU. Nesting will be turned on such that guest virtual
> address can go through a two level translation:
> - 1st level translates GVA to GPA
> - 2nd level translates GPA to HPA
> This patch introduces APIs to bind guest PASID data to the assigned
> device entry in the physical IOMMU. See the diagram below for usage
> explaination.

explanation

> 
> .-.  .---.
> |   vIOMMU|  | Guest process mm, FL only |
> | |  '---'
> ./
> | PASID Entry |--- PASID cache flush -
> '-'   |
> | |   V
> | |  GP
> '-'
> Guest
> --| Shadow |--- GP->HP* -
>   vv  |
> Host  v
> .-.  .--.
> |   pIOMMU|  | Bind FL for GVA-GPA  |
> | |  '--'
> ./  |
> | PASID Entry | V (Nested xlate)
> '\.-.
> | |   |Set SL to GPA-HPA|
> | |   '-'
> '-'
> 
> Where:
>  - FL = First level/stage one page tables
>  - SL = Second level/stage two page tables
>  - GP = Guest PASID
>  - HP = Host PASID
> * Conversion needed if non-identity GP-HP mapping option is chosen.
> 
> Signed-off-by: Jacob Pan 
> Signed-off-by: Liu Yi L 
> ---
>  drivers/iommu/iommu.c  | 20 
>  include/linux/iommu.h  | 21 +
>  include/uapi/linux/iommu.h | 58 
> ++
>  3 files changed, 99 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 1758b57..d0416f60 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1648,6 +1648,26 @@ int iommu_cache_invalidate(struct iommu_domain 
> *domain, struct device *dev,
>  }
>  EXPORT_SYMBOL_GPL(iommu_cache_invalidate);
>  
> +int iommu_sva_bind_gpasid(struct iommu_domain *domain,
> + struct device *dev, struct gpasid_bind_data *data)

I'm curious about the VFIO side of this. Is the ioctl on the device or
on the container fd? For bind_pasid_table, it's on the container and we
only pass the iommu_domain to the IOMMU driver, not the device (since
devices in a domain share the same PASID table).

> +{
> + if (unlikely(!domain->ops->sva_bind_gpasid))
> + return -ENODEV;
> +
> + return domain->ops->sva_bind_gpasid(domain, dev, data);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_bind_gpasid);
> +
> +int iommu_sva_unbind_gpasid(struct iommu_domain *domain, struct device *dev,
> + ioasid_t pasid)
> +{
> + if (unlikely(!domain->ops->sva_unbind_gpasid))
> + return -ENODEV;
> +
> + return domain->ops->sva_unbind_gpasid(dev, pasid);
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_unbind_gpasid);
> +
>  static void __iommu_detach_device(struct iommu_domain *domain,
> struct device *dev)
>  {
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 8d766a8..560c8c8 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #define IOMMU_READ   (1 << 0)
> @@ -267,6 +268,8 @@ struct page_response_msg {
>   * @detach_pasid_table: detach the pasid table
>   * @cache_invalidate: invalidate translation caches
>   * @pgsize_bitmap: bitmap of all possible supported page sizes
> + * @sva_bind_gpasid: bind guest pasid and mm
> + * @sva_unbind_gpasid: unbind guest pasid and mm
>   */
>  struct iommu_ops {
>   bool (*capable)(enum iommu_cap);
> @@ -332,6 +335,10 @@ struct iommu_ops {
>   int (*page_response)(struct device *dev, struct page_response_msg *msg);
>   int (*cache_invalidate)(struct iommu_domain *domain, struct device *dev,
>   struct iommu_cache_invalidate_info *inv_info);
> + int (*sva_bind_gpasid)(struct iommu_domain *domain,
> + struct device *dev, struct gpasid_bind_data *data);
> +
> + int (*sva_unbind_gpasid)(struct device *dev, int pasid);
>  
>   unsigned long pgsize_bitmap;
>  };
> @@ -447,6 +454,10 @@ extern void iommu_detach_pasid_table(struct iommu_domain 
> *domain);
>  extern int iommu_cache_invalidate(struct iommu_domain *domain,
> struct device *dev,
> struct iommu_cache_invalidate_info *inv_info);
> +extern 

Re: [PATCH -next] iommu/intel: silence a variable set but not used

2019-06-18 Thread Joerg Roedel
On Mon, Jun 03, 2019 at 10:05:19AM -0400, Qian Cai wrote:
> The commit "iommu/vt-d: Probe DMA-capable ACPI name space devices"
> introduced a compilation warning due to the "iommu" variable in
> for_each_active_iommu() but never used the for each element, i.e,
> "drhd->iommu".
> 
> drivers/iommu/intel-iommu.c: In function 'probe_acpi_namespace_devices':
> drivers/iommu/intel-iommu.c:4639:22: warning: variable 'iommu' set but
> not used [-Wunused-but-set-variable]
>   struct intel_iommu *iommu;
> 
> Silence the warning the same way as in the commit d3ed71e5cc50
> ("drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used")
> 
> Signed-off-by: Qian Cai 
> ---
>  drivers/iommu/intel-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied, thanks.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/intel: remove an unused variable "length"

2019-06-18 Thread Joerg Roedel
On Mon, Jun 17, 2019 at 09:20:27AM -0400, Qian Cai wrote:
> The linux-next commit "iommu/vt-d: Duplicate iommu_resv_region objects
> per device list" [1] left out an unused variable,
> 
> drivers/iommu/intel-iommu.c: In function 'dmar_parse_one_rmrr':
> drivers/iommu/intel-iommu.c:4014:9: warning: variable 'length' set but
> not used [-Wunused-but-set-variable]
> 
> [1] https://lore.kernel.org/patchwork/patch/1083073/
> 
> Signed-off-by: Qian Cai 
> ---
>  drivers/iommu/intel-iommu.c | 3 ---
>  1 file changed, 3 deletions(-)

Applied, thanks.



Re: How to resolve an issue in swiotlb environment?

2019-06-18 Thread shuah

On 6/14/19 8:44 AM, Alan Stern wrote:

On Thu, 13 Jun 2019, shuah wrote:


Great!  So all we have to do is fix vhci-hcd.  Then we can remove all
the virt_boundary_mask stuff from usb-storage and uas entirely.

(I'm assuming wireless USB isn't a genuine issue.  As far as I know, it
is pretty much abandoned at this point.)

Valentina and Shua: Adding SG support to vhci-hcd shouldn't be too
hard.  It ought to be possible even without changing the network
protocol.



I will start taking a look at this. Is there a target release in plan
to drop virt_boundary_mask stuff?


Not yet.  But since it doesn't do what we want anyway, this should be
fixed quickly.



Sounds good. I am working on it.

thanks,
-- Shuah


Re: [PATCH] iommu: fix integer truncation

2019-06-18 Thread Joerg Roedel
On Mon, Jun 17, 2019 at 03:30:54PM +0200, Arnd Bergmann wrote:
> On 32-bit architectures, phys_addr_t may be different from dma_addr_t,
> both smaller and bigger. This can lead to an overflow during an assignment
> that clang warns about:
> 
> drivers/iommu/dma-iommu.c:230:10: error: implicit conversion from 
> 'dma_addr_t' (aka 'unsigned long long') to
>   'phys_addr_t' (aka 'unsigned int') changes value from 
> 18446744073709551615 to 4294967295 [-Werror,-Wconstant-conversion]
> 
> Use phys_addr_t here because that is the type that the variable was
> declared as.
> 
> Fixes: aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible DMA 
> address")
> Signed-off-by: Arnd Bergmann 

Applied, thanks.


Re: [PATCH] iommu: Add padding to struct iommu_fault

2019-06-18 Thread Joerg Roedel
On Wed, Jun 12, 2019 at 06:59:38PM +0100, Jean-Philippe Brucker wrote:
> Ease future extensions of struct iommu_fault_page_request and struct
> iommu_fault_unrecoverable by adding a few bytes of padding. That way, a
> new field can be added to either of these structures by simply introducing
> a new flag. To extend it after the size limit is reached, a new fault
> reporting structure will have to be negotiated with userspace.
> 
> With 56 bytes of padding, the total size of iommu_fault is 64 bytes and
> fits in a cache line on a lot of contemporary machines, while providing 16
> and 24 bytes of extension to structures iommu_fault_page_request and
> iommu_fault_unrecoverable respectively.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  include/uapi/linux/iommu.h | 2 ++
>  1 file changed, 2 insertions(+)

Applied, thanks.


Re: [PATCH 1/8] iommu: Add I/O ASID allocator

2019-06-18 Thread Jean-Philippe Brucker
On 11/06/2019 19:13, Jacob Pan wrote:
 +/**
 + * ioasid_find - Find IOASID data
 + * @set: the IOASID set
 + * @ioasid: the IOASID to find
 + * @getter: function to call on the found object
 + *
 + * The optional getter function allows to take a reference to the
 found object
 + * under the rcu lock. The function can also check if the object
 is still valid:
 + * if @getter returns false, then the object is invalid and NULL
 is returned.
 + *
 + * If the IOASID has been allocated for this set, return the
 private pointer
 + * passed to ioasid_alloc. Private data can be NULL if not set.
 Return an error
 + * if the IOASID is not found or does not belong to the set.  
>>>
>>> Perhaps should make it clear that @set can be null.  
>>
>> Indeed. But I'm not sure allowing @set to be NULL is such a good idea,
>> because the data type associated to an ioasid depends on its set. For
>> example SVA will put an mm_struct in there, and auxiliary domains use
>> some structure private to the IOMMU domain.
>>
> I am not sure we need to count on @set to decipher data type. Whoever
> does the allocation and owns the IOASID should know its own data type.
> My thought was that @set is only used to group IDs, permission check
> etc.
> 
>> Jacob, could me make @set mandatory, or do you see a use for a global
>> search? If @set is NULL, then callers can check if the return pointer
>> is NULL, but will run into trouble if they try to dereference it.
>>
> A global search use case can be for PRQ. The IOMMU driver gets an IOASID
> (first interrupt then retrieve from a queue), it has no idea which
> @set it belongs to. But the data types are the same for all IOASIDs
> used by the IOMMU.

They aren't when we use a generic SVA handler. Following a call to
iommu_sva_bind_device(), iommu-sva.c allocates an IOASID and stores an
mm_struct. If auxiliary domains are also enabled for the device,
following a call to iommu_aux_attach_device() the IOMMU driver allocates
an IOASID and stores some private object.

Now for example the IOMMU driver receives a PPR and calls ioasid_find()
with @set = NULL. ioasid_find() may return either an mm_struct or a
private object, and the driver cannot know which it is so the returned
value is unusable.

Thanks,
Jean


Re: [PATCH v4 10/22] iommu: Fix compile error without IOMMU_API

2019-06-18 Thread Jonathan Cameron
On Sun, 9 Jun 2019 06:44:10 -0700
Jacob Pan  wrote:

> struct page_response_msg needs to be defined outside CONFIG_IOMMU_API.

What was the error? 

If this is a fix for an earlier patch in this series, roll it in there
(or put it before it). If more general we should add a fixes tag.

Jonathan
> 
> Signed-off-by: Jacob Pan 
> ---
>  include/linux/iommu.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 7a37336..8d766a8 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -189,8 +189,6 @@ struct iommu_sva_ops {
>   iommu_mm_exit_handler_t mm_exit;
>  };
>  
> -#ifdef CONFIG_IOMMU_API
> -
>  /**
>   * enum page_response_code - Return status of fault handlers, telling the 
> IOMMU
>   * driver how to proceed with the fault.
> @@ -227,6 +225,7 @@ struct page_response_msg {
>   u64 iommu_data;
>  };
>  
> +#ifdef CONFIG_IOMMU_API
>  /**
>   * struct iommu_ops - iommu ops and capabilities
>   * @capable: check capability




Re: [PATCH v7 14/21] iommu/mediatek: Add mmu1 support

2019-06-18 Thread Tomasz Figa
On Tue, Jun 18, 2019 at 9:09 PM Yong Wu  wrote:
>
> On Tue, 2019-06-18 at 15:19 +0900, Tomasz Figa wrote:
> > On Mon, Jun 10, 2019 at 9:21 PM Yong Wu  wrote:
> > >
> > > Normally the M4U HW connect EMI with smi. the diagram is like below:
> > >   EMI
> > >|
> > >   M4U
> > >|
> > > smi-common
> > >|
> > >-
> > >||| |...
> > > larb0 larb1  larb2 larb3
> > >
> > > Actually there are 2 mmu cells in the M4U HW, like this diagram:
> > >
> > >   EMI
> > >-
> > > | |
> > >mmu0  mmu1 <- M4U
> > > | |
> > >-
> > >|
> > > smi-common
> > >|
> > >-
> > >||| |...
> > > larb0 larb1  larb2 larb3
> > >
> > > This patch add support for mmu1. In order to get better performance,
> > > we could adjust some larbs go to mmu1 while the others still go to
> > > mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
> > >
> > > mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> > > value of that register is 0 which means all the larbs go to mmu0
> > > defaultly.
> > >
> > > This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
> > >
> > > Signed-off-by: Yong Wu 
> > > Reviewed-by: Evan Green 
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 46 
> > > +-
> > >  1 file changed, 29 insertions(+), 17 deletions(-)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 3a14301..ec4ce74 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -72,26 +72,32 @@
> > >  #define F_INT_CLR_BIT  BIT(12)
> > >
> > >  #define REG_MMU_INT_MAIN_CONTROL   0x124
> > > -#define F_INT_TRANSLATION_FAULTBIT(0)
> > > -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> > > -#define F_INT_INVALID_PA_FAULT BIT(2)
> > > -#define F_INT_ENTRY_REPLACEMENT_FAULT  BIT(3)
> > > -#define F_INT_TLB_MISS_FAULT   BIT(4)
> > > -#define F_INT_MISS_TRANSACTION_FIFO_FAULT  BIT(5)
> > > -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   BIT(6)
> > > +   /* mmu0 | mmu1 */
> > > +#define F_INT_TRANSLATION_FAULT(BIT(0) | BIT(7))
> > > +#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
> > > +#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
> > > +#define F_INT_ENTRY_REPLACEMENT_FAULT  (BIT(3) | BIT(10))
> > > +#define F_INT_TLB_MISS_FAULT   (BIT(4) | BIT(11))
> > > +#define F_INT_MISS_TRANSACTION_FIFO_FAULT  (BIT(5) | BIT(12))
> > > +#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   (BIT(6) | BIT(13))
> >
> > If there are two IOMMUs, shouldn't we have two driver instances handle
> > them, instead of making the driver combine them two internally?
>
> Actually it means only one IOMMU(M4U) HW here. Each a M4U HW has two
> small iommu cells which have independent MTLB. As the diagram above, M4U
> contain mmu0 and mmu1.
>
> MT8173 and MT8183 have only one M4U HW while MT2712 have 2 M4U HWs(two
> driver instances).
>
> >
> > And, what is even more important from security point of view actually,
> > have two separate page tables (aka IOMMU groups) for them?
>
> Each a IOMMU(M4U) have its own pagetable, thus, mt8183 have only one
> pagetable while mt2712 have two.

I see, thanks for clarifying.

Best regards,
Tomasz


Re: [PATCH v8 26/29] vfio-pci: Register an iommu fault handler

2019-06-18 Thread Jean-Philippe Brucker
On 12/06/2019 19:53, Jacob Pan wrote:
>>> You are right, the worst case of the spurious PS is to terminate the
>>> group prematurely. Need to know the scope of the HW damage in case
>>> of mdev where group IDs can be shared among mdevs belong to the
>>> same PF.  
>>
>> But from the IOMMU fault API point of view, the full page request is
>> identified by both PRGI and PASID. Given that each mdev has its own
>> set of PASIDs, it should be easy to isolate page responses per mdev.
>>
> On Intel platform, devices sending page request with private data must
> receive page response with matching private data. If we solely depend
> on PRGI and PASID, we may send stale private data to the device in
> those incorrect page response. Since private data may represent PF
> device wide contexts, the consequence of sending page response with
> wrong private data may affect other mdev/PASID.
> 
> One solution we are thinking to do is to inject the sequence #(e.g.
> ktime raw mono clock) as vIOMMU private data into to the guest. Guest
> would return this fake private data in page response, then host will
> send page response back to the device that matches PRG1 and PASID and
> private_data.
> 
> This solution does not expose HW context related private data to the
> guest but need to extend page response in iommu uapi.
> 
> /**
>  * struct iommu_page_response - Generic page response information
>  * @version: API version of this structure
>  * @flags: encodes whether the corresponding fields are valid
>  * (IOMMU_FAULT_PAGE_RESPONSE_* values)
>  * @pasid: Process Address Space ID
>  * @grpid: Page Request Group Index
>  * @code: response code from  iommu_page_response_code
>  * @private_data: private data for the matching page request
>  */
> struct iommu_page_response {
> #define IOMMU_PAGE_RESP_VERSION_1 1
>   __u32   version;
> #define IOMMU_PAGE_RESP_PASID_VALID   (1 << 0)
> #define IOMMU_PAGE_RESP_PRIVATE_DATA  (1 << 1)
>   __u32   flags;
>   __u32   pasid;
>   __u32   grpid;
>   __u32   code;
>   __u32   padding;
>   __u64   private_data[2];
> };
> 
> There is also the change needed for separating storage for the real and
> fake private data.
> 
> Sorry for the last minute change, did not realize the HW implications.
> 
> I see this as a future extension due to limited testing, 

I'm wondering how we deal with:
(1) old userspace that won't fill the new private_data field in
page_response. A new kernel still has to support it.
(2) old kernel that won't recognize the new PRIVATE_DATA flag. Currently
iommu_page_response() rejects page responses with unknown flags.

I guess we'll need a two-way negotiation, where userspace queries
whether the kernel supports the flag (2), and the kernel learns whether
it should expect the private data to come back (1).

> perhaps for
> now, can you add paddings similar to page request? Make it 64B as well.

I don't think padding is necessary, because iommu_page_response is sent
by userspace to the kernel, unlike iommu_fault which is allocated by
userspace and filled by the kernel.

Page response looks a lot more like existing VFIO mechanisms, so I
suppose we'll wrap the iommu_page_response structure and include an
argsz parameter at the top:

struct vfio_iommu_page_response {
u32 argsz;
struct iommu_page_response pr;
};

struct vfio_iommu_page_response vpr = {
.argsz = sizeof(vpr),
.pr = ...
...
};

ioctl(devfd, VFIO_IOMMU_PAGE_RESPONSE, &vpr);

In that case supporting private data can be done by simply appending a
field at the end (plus the negotiation above).

Thanks,
Jean


Re: [PATCH v7 17/21] memory: mtk-smi: Get rid of need_larbid

2019-06-18 Thread Matthias Brugger



On 10/06/2019 14:17, Yong Wu wrote:
> The "mediatek,larb-id" has already been parsed in MTK IOMMU driver.
> It's no need to parse it again in SMI driver. Only clean some codes.
> This patch is fit for all the current mt2701, mt2712, mt7623, mt8173
> and mt8183.
> 
> After this patch, the "mediatek,larb-id" only be needed for mt2712
> which have 2 M4Us. In the other SoCs, we can get the larb-id from M4U
> in which the larbs in the "mediatek,larbs" always are ordered.
> 
> Correspondingly, the larb_nr in the "struct mtk_smi_iommu" could also
> be deleted.
> 

I think we can get rid of struct mtk_smi_iommu and just add the
struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX] directly to mtk_iommu_data,
passing just that array to the components bind function.

Nevertheless, this patch looks fine:
Reviewed-by: Matthias Brugger 

> CC: Matthias Brugger 
> Signed-off-by: Yong Wu 
> Reviewed-by: Evan Green 
> ---
>  drivers/iommu/mtk_iommu.c|  1 -
>  drivers/iommu/mtk_iommu_v1.c |  2 --
>  drivers/memory/mtk-smi.c | 26 ++
>  include/soc/mediatek/smi.h   |  1 -
>  4 files changed, 2 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index ec4ce74..6053b8b 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -634,7 +634,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>"mediatek,larbs", NULL);
>   if (larb_nr < 0)
>   return larb_nr;
> - data->smi_imu.larb_nr = larb_nr;
>  
>   for (i = 0; i < larb_nr; i++) {
>   struct device_node *larbnode;
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index 52b01e3..73308ad 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -624,8 +624,6 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   larb_nr++;
>   }
>  
> - data->smi_imu.larb_nr = larb_nr;
> -
>   platform_set_drvdata(pdev, data);
>  
>   ret = mtk_iommu_hw_init(data);
> diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> index 08cf40d..10e6493 100644
> --- a/drivers/memory/mtk-smi.c
> +++ b/drivers/memory/mtk-smi.c
> @@ -67,7 +67,6 @@ struct mtk_smi_common_plat {
>  };
>  
>  struct mtk_smi_larb_gen {
> - bool need_larbid;
>   int port_in_larb[MTK_LARB_NR_MAX + 1];
>   void (*config_port)(struct device *);
>   unsigned int larb_direct_to_common_mask;
> @@ -153,18 +152,9 @@ void mtk_smi_larb_put(struct device *larbdev)
>   struct mtk_smi_iommu *smi_iommu = data;
>   unsigned int i;
>  
> - if (larb->larb_gen->need_larbid) {
> - larb->mmu = &smi_iommu->larb_imu[larb->larbid].mmu;
> - return 0;
> - }
> -
> - /*
> -  * If there is no larbid property, Loop to find the corresponding
> -  * iommu information.
> -  */
> - for (i = 0; i < smi_iommu->larb_nr; i++) {
> + for (i = 0; i < MTK_LARB_NR_MAX; i++) {
>   if (dev == smi_iommu->larb_imu[i].dev) {
> - /* The 'mmu' may be updated in iommu-attach/detach. */
> + larb->larbid = i;
>   larb->mmu = &smi_iommu->larb_imu[i].mmu;
>   return 0;
>   }
> @@ -243,7 +233,6 @@ static void mtk_smi_larb_config_port_gen1(struct device 
> *dev)
>  };
>  
>  static const struct mtk_smi_larb_gen mtk_smi_larb_mt2701 = {
> - .need_larbid = true,
>   .port_in_larb = {
>   LARB0_PORT_OFFSET, LARB1_PORT_OFFSET,
>   LARB2_PORT_OFFSET, LARB3_PORT_OFFSET
> @@ -252,7 +241,6 @@ static void mtk_smi_larb_config_port_gen1(struct device 
> *dev)
>  };
>  
>  static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
> - .need_larbid = true,
>   .config_port= mtk_smi_larb_config_port_gen2_general,
>   .larb_direct_to_common_mask = BIT(8) | BIT(9),  /* bdpsys */
>  };
> @@ -291,7 +279,6 @@ static int mtk_smi_larb_probe(struct platform_device 
> *pdev)
>   struct device *dev = >dev;
>   struct device_node *smi_node;
>   struct platform_device *smi_pdev;
> - int err;
>  
>   larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
>   if (!larb)
> @@ -321,15 +308,6 @@ static int mtk_smi_larb_probe(struct platform_device 
> *pdev)
>   }
>   larb->smi.dev = dev;
>  
> - if (larb->larb_gen->need_larbid) {
> - err = of_property_read_u32(dev->of_node, "mediatek,larb-id",
> - &larb->larbid);
> - if (err) {
> - dev_err(dev, "missing larbid property\n");
> - return err;
> - }
> - }
> -
>   smi_node = of_parse_phandle(dev->of_node, "mediatek,smi", 0);
>   if (!smi_node)
>   return -EINVAL;
> diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
> index 5201e90..a65324d 100644
> 

Re: [PATCH v2 08/12] drm/mediatek: Get rid of mtk_smi_larb_get/put

2019-06-18 Thread Yong Wu
On Tue, 2019-06-18 at 14:35 +0800, CK Hu wrote:
> Hi, Yong:
> 
> On Mon, 2019-06-10 at 20:55 +0800, Yong Wu wrote:
> > MediaTek IOMMU has already added the device_link between the consumer
> > and smi-larb device. If the drm device call the pm_runtime_get_sync,
> > the smi-larb's pm_runtime_get_sync also be called automatically.
> > 
> > CC: CK Hu 
> > CC: Philipp Zabel 
> > Signed-off-by: Yong Wu 
> > Reviewed-by: Evan Green 
> > ---
> >  drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 11 ---
> >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 26 --
> >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
> >  3 files changed, 38 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
> > b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > index acad088..3a21a48 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > @@ -18,7 +18,6 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> >  
> >  #include "mtk_drm_drv.h"
> >  #include "mtk_drm_crtc.h"
> > @@ -371,20 +370,12 @@ static void mtk_drm_crtc_atomic_enable(struct 
> > drm_crtc *crtc,
> >struct drm_crtc_state *old_state)
> >  {
> > struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
> > -   struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
> > int ret;
> >  
> > DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
> >  
> > -   ret = mtk_smi_larb_get(comp->larb_dev);
> > -   if (ret) {
> > -   DRM_ERROR("Failed to get larb: %d\n", ret);
> > -   return;
> > -   }
> > -
> > ret = mtk_crtc_ddp_hw_init(mtk_crtc);
> > if (ret) {
> > -   mtk_smi_larb_put(comp->larb_dev);
> > return;
> > }
> 
> Remove {}.

Thanks. I will fix in next version.

> 
> Regards,
> CK
> 
> >  
> > @@ -396,7 +387,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
> > *crtc,
> > struct drm_crtc_state *old_state)
> >  {
> > struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
> > -   struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
> > int i;
> >  
> > DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
> > @@ -419,7 +409,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
> > *crtc,
> >  
> > drm_crtc_vblank_off(crtc);
> > mtk_crtc_ddp_hw_fini(mtk_crtc);
> > -   mtk_smi_larb_put(comp->larb_dev);
> >  
> > mtk_crtc->enabled = false;
> >  }
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > index 54ca794..ede15c9 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > @@ -265,8 +265,6 @@ int mtk_ddp_comp_init(struct device *dev, struct 
> > device_node *node,
> >   const struct mtk_ddp_comp_funcs *funcs)
> >  {
> > enum mtk_ddp_comp_type type;
> > -   struct device_node *larb_node;
> > -   struct platform_device *larb_pdev;
> >  
> > if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)
> > return -EINVAL;
> > @@ -296,30 +294,6 @@ int mtk_ddp_comp_init(struct device *dev, struct 
> > device_node *node,
> > if (IS_ERR(comp->clk))
> > return PTR_ERR(comp->clk);
> >  
> > -   /* Only DMA capable components need the LARB property */
> > -   comp->larb_dev = NULL;
> > -   if (type != MTK_DISP_OVL &&
> > -   type != MTK_DISP_RDMA &&
> > -   type != MTK_DISP_WDMA)
> > -   return 0;
> > -
> > -   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
> > -   if (!larb_node) {
> > -   dev_err(dev,
> > -   "Missing mediadek,larb phandle in %pOF node\n", node);
> > -   return -EINVAL;
> > -   }
> > -
> > -   larb_pdev = of_find_device_by_node(larb_node);
> > -   if (!larb_pdev) {
> > -   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
> > -   of_node_put(larb_node);
> > -   return -EPROBE_DEFER;
> > -   }
> > -   of_node_put(larb_node);
> > -
> > -   comp->larb_dev = &larb_pdev->dev;
> > -
> > return 0;
> >  }
> >  
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h 
> > b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > index 8399229..b8dc17e 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > @@ -91,7 +91,6 @@ struct mtk_ddp_comp {
> > struct clk *clk;
> > void __iomem *regs;
> > int irq;
> > -   struct device *larb_dev;
> > enum mtk_ddp_comp_id id;
> > const struct mtk_ddp_comp_funcs *funcs;
> >  };
> 
> 




Re: [PATCH v7 16/21] memory: mtk-smi: Add bus_sel for mt8183

2019-06-18 Thread Yong Wu
On Mon, 2019-06-17 at 18:23 +0200, Matthias Brugger wrote:
> 
> On 10/06/2019 14:17, Yong Wu wrote:
> > There are 2 mmu cells in a M4U HW. we could adjust some larbs entering
> > mmu0 or mmu1 to balance the bandwidth via the smi-common register
> > SMI_BUS_SEL(0x220)(Each larb occupy 2 bits).
> > 
> > In mt8183, For better performance, we switch larb1/2/5/7 to enter
> > mmu1 while the others still keep enter mmu0.
> > 
> > In mt8173 and mt2712, we don't get the performance issue,
> > Keep its default value(0x0), that means all the larbs enter mmu0.
> > 
> > Note: smi gen1(mt2701/mt7623) don't have this bus_sel.
> > 
> > And, the base of smi-common is completely different with smi_ao_base
> > of gen1, thus I add new variable for that.
> > 
> > CC: Matthias Brugger 
> > Signed-off-by: Yong Wu 
> > Reviewed-by: Evan Green 
> > ---
> >  drivers/memory/mtk-smi.c | 22 --
> >  1 file changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index 9790801..08cf40d 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -49,6 +49,12 @@
> >  #define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
> >  #define F_MMU_EN   BIT(0)
> >  
> > +/* SMI COMMON */
> > +#define SMI_BUS_SEL0x220
> > +#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
> > +/* All are MMU0 defaultly. Only specialize mmu1 here. */
> > +#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
> > +
> >  enum mtk_smi_gen {
> > MTK_SMI_GEN1,
> > MTK_SMI_GEN2
> > @@ -57,6 +63,7 @@ enum mtk_smi_gen {
> >  struct mtk_smi_common_plat {
> > enum mtk_smi_gen gen;
> > bool has_gals;
> > +   u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
> >  };
> >  
> >  struct mtk_smi_larb_gen {
> > @@ -72,8 +79,8 @@ struct mtk_smi {
> > struct clk  *clk_apb, *clk_smi;
> > struct clk  *clk_gals0, *clk_gals1;
> > struct clk  *clk_async; /*only needed by mt2701*/
> > -   void __iomem*smi_ao_base;
> > -
> > +   void __iomem*smi_ao_base; /* only for gen1 */
> > +   void __iomem*base;/* only for gen2 */
> 
> union {} maybe?

Yes. Thanks.

I will add it.

> 
> > const struct mtk_smi_common_plat *plat;
> >  };
> >  
> > @@ -410,6 +417,8 @@ static int __maybe_unused mtk_smi_larb_suspend(struct 
> > device *dev)
> >  static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
> > .gen  = MTK_SMI_GEN2,
> > .has_gals = true,
> > +   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
> > +   F_MMU1_LARB(7),
> >  };
> >  
> >  static const struct of_device_id mtk_smi_common_of_ids[] = {
> > @@ -482,6 +491,11 @@ static int mtk_smi_common_probe(struct platform_device 
> > *pdev)
> > ret = clk_prepare_enable(common->clk_async);
> > if (ret)
> > return ret;
> > +   } else {
> > +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > +   common->base = devm_ioremap_resource(dev, res);
> > +   if (IS_ERR(common->base))
> > +   return PTR_ERR(common->base);
> 
> We must be backwards compatible with DT which does not have the base defined.

The smi-common nodes of the existing mt2712 and mt8173 device trees also
have the "reg" property even though they didn't use this base. Thus, it
looks OK for all the cases.

> 
> Regards,
> Matthias
> 
> > }
> > pm_runtime_enable(dev);
> > platform_set_drvdata(pdev, common);
> > @@ -497,6 +511,7 @@ static int mtk_smi_common_remove(struct platform_device 
> > *pdev)
> >  static int __maybe_unused mtk_smi_common_resume(struct device *dev)
> >  {
> > struct mtk_smi *common = dev_get_drvdata(dev);
> > +   u32 bus_sel = common->plat->bus_sel;
> > int ret;
> >  
> > ret = mtk_smi_clk_enable(common);
> > @@ -504,6 +519,9 @@ static int __maybe_unused mtk_smi_common_resume(struct 
> > device *dev)
> > dev_err(common->dev, "Failed to enable clock(%d).\n", ret);
> > return ret;
> > }
> > +
> > +   if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
> > +   writel(bus_sel, common->base + SMI_BUS_SEL);
> > return 0;
> >  }
> >  
> > 




Re: [PATCH v7 14/21] iommu/mediatek: Add mmu1 support

2019-06-18 Thread Yong Wu
On Tue, 2019-06-18 at 15:19 +0900, Tomasz Figa wrote:
> On Mon, Jun 10, 2019 at 9:21 PM Yong Wu  wrote:
> >
> > Normally the M4U HW connect EMI with smi. the diagram is like below:
> >   EMI
> >|
> >   M4U
> >|
> > smi-common
> >|
> >-
> >||| |...
> > larb0 larb1  larb2 larb3
> >
> > Actually there are 2 mmu cells in the M4U HW, like this diagram:
> >
> >   EMI
> >-
> > | |
> >mmu0  mmu1 <- M4U
> > | |
> >-
> >|
> > smi-common
> >|
> >-
> >||| |...
> > larb0 larb1  larb2 larb3
> >
> > This patch add support for mmu1. In order to get better performance,
> > we could adjust some larbs go to mmu1 while the others still go to
> > mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
> >
> > mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> > value of that register is 0 which means all the larbs go to mmu0
> > defaultly.
> >
> > This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
> >
> > Signed-off-by: Yong Wu 
> > Reviewed-by: Evan Green 
> > ---
> >  drivers/iommu/mtk_iommu.c | 46 
> > +-
> >  1 file changed, 29 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 3a14301..ec4ce74 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -72,26 +72,32 @@
> >  #define F_INT_CLR_BIT  BIT(12)
> >
> >  #define REG_MMU_INT_MAIN_CONTROL   0x124
> > -#define F_INT_TRANSLATION_FAULTBIT(0)
> > -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> > -#define F_INT_INVALID_PA_FAULT BIT(2)
> > -#define F_INT_ENTRY_REPLACEMENT_FAULT  BIT(3)
> > -#define F_INT_TLB_MISS_FAULT   BIT(4)
> > -#define F_INT_MISS_TRANSACTION_FIFO_FAULT  BIT(5)
> > -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   BIT(6)
> > +   /* mmu0 | mmu1 */
> > +#define F_INT_TRANSLATION_FAULT(BIT(0) | BIT(7))
> > +#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
> > +#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
> > +#define F_INT_ENTRY_REPLACEMENT_FAULT  (BIT(3) | BIT(10))
> > +#define F_INT_TLB_MISS_FAULT   (BIT(4) | BIT(11))
> > +#define F_INT_MISS_TRANSACTION_FIFO_FAULT  (BIT(5) | BIT(12))
> > +#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   (BIT(6) | BIT(13))
> 
> If there are two IOMMUs, shouldn't we have two driver instances handle
> them, instead of making the driver combine them two internally?

Actually it means only one IOMMU (M4U) HW here. Each M4U HW has two
small IOMMU cells which have independent MTLBs. As in the diagram above,
the M4U contains mmu0 and mmu1.

MT8173 and MT8183 have only one M4U HW while MT2712 has 2 M4U HWs (two
driver instances).

> 
> And, what is even more important from security point of view actually,
> have two separate page tables (aka IOMMU groups) for them?

Each IOMMU (M4U) has its own pagetable; thus, mt8183 has only one
pagetable while mt2712 has two.

> 
> Best regards,
> Tomasz
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek




Re: [RFC PATCH 2/2] soc: ti: Add Support for the TI Page-based Address Translator (PAT)

2019-06-18 Thread Tero Kristo via iommu

On 07/06/2019 22:35, Andrew F. Davis wrote:

This patch adds a driver for the Page-based Address Translator (PAT)
present on various TI SoCs. A PAT device performs address translation
using tables stored in an internal SRAM. Each PAT supports a set number
of pages, each occupying a programmable 4KB, 16KB, 64KB, or 1MB of
addresses in a window for which an incoming transaction will be
translated.

Signed-off-by: Andrew F. Davis 
---
  drivers/soc/ti/Kconfig  |   9 +
  drivers/soc/ti/Makefile |   1 +
  drivers/soc/ti/ti-pat.c | 569 
  include/uapi/linux/ti-pat.h |  44 +++
  4 files changed, 623 insertions(+)
  create mode 100644 drivers/soc/ti/ti-pat.c
  create mode 100644 include/uapi/linux/ti-pat.h

diff --git a/drivers/soc/ti/Kconfig b/drivers/soc/ti/Kconfig
index f0be35d3dcba..b838ae74d01f 100644
--- a/drivers/soc/ti/Kconfig
+++ b/drivers/soc/ti/Kconfig
@@ -86,4 +86,13 @@ config TI_SCI_INTA_MSI_DOMAIN
help
  Driver to enable Interrupt Aggregator specific MSI Domain.
  
+config TI_PAT
+   tristate "TI PAT DMA-BUF exporter"
+   select REGMAP


What is the reasoning for using regmap for internal register access? Why 
not just use direct readl/writel for everything? To me it seems this is 
only used during probe time also...



+   help
+ Driver for TI Page-based Address Translator (PAT). This driver
+ provides the an API allowing the remapping of a non-contiguous
+ DMA-BUF into a contiguous one that is sutable for devices needing
+ coniguous memory.


Minor typo: contiguous.


+
  endif # SOC_TI
diff --git a/drivers/soc/ti/Makefile b/drivers/soc/ti/Makefile
index b3868d392d4f..1369642b40c3 100644
--- a/drivers/soc/ti/Makefile
+++ b/drivers/soc/ti/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_AMX3_PM)   += pm33xx.o
  obj-$(CONFIG_WKUP_M3_IPC) += wkup_m3_ipc.o
  obj-$(CONFIG_TI_SCI_PM_DOMAINS)   += ti_sci_pm_domains.o
  obj-$(CONFIG_TI_SCI_INTA_MSI_DOMAIN)  += ti_sci_inta_msi.o
+obj-$(CONFIG_TI_PAT)   += ti-pat.o
diff --git a/drivers/soc/ti/ti-pat.c b/drivers/soc/ti/ti-pat.c
new file mode 100644
index ..7359ea0f7ccf
--- /dev/null
+++ b/drivers/soc/ti/ti-pat.c
@@ -0,0 +1,569 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TI PAT mapped DMA-BUF memory re-exporter
+ *
+ * Copyright (C) 2018-2019 Texas Instruments Incorporated - http://www.ti.com/
+ * Andrew F. Davis 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define TI_PAT_DRIVER_NAME "ti-pat"


Why do you have a define for this seeing it is only used in single location?


+
+/* TI PAT MMRS registers */
+#define TI_PAT_MMRS_PID0x0 /* Revision Register */
+#define TI_PAT_MMRS_CONFIG 0x4 /* Config Register */
+#define TI_PAT_MMRS_CONTROL0x10 /* Control Register */
+
+/* TI PAT CONTROL register field values */
+#define TI_PAT_CONTROL_ARB_MODE_UF 0x0 /* Updates first */
+#define TI_PAT_CONTROL_ARB_MODE_RR 0x2 /* Round-robin */
+
+#define TI_PAT_CONTROL_PAGE_SIZE_4KB   0x0
+#define TI_PAT_CONTROL_PAGE_SIZE_16KB  0x1
+#define TI_PAT_CONTROL_PAGE_SIZE_64KB  0x2
+#define TI_PAT_CONTROL_PAGE_SIZE_1MB   0x3
+
+static unsigned int ti_pat_page_sizes[] = {
+   [TI_PAT_CONTROL_PAGE_SIZE_4KB]  = 4 * 1024,
+   [TI_PAT_CONTROL_PAGE_SIZE_16KB] = 16 * 1024,
+   [TI_PAT_CONTROL_PAGE_SIZE_64KB] = 64 * 1024,
+   [TI_PAT_CONTROL_PAGE_SIZE_1MB]  = 1024 * 1024,
+};
+
+enum ti_pat_mmrs_fields {
+   /* Revision */
+   F_PID_MAJOR,
+   F_PID_MINOR,
+
+   /* Controls */
+   F_CONTROL_ARB_MODE,
+   F_CONTROL_PAGE_SIZE,
+   F_CONTROL_REPLACE_OID_EN,
+   F_CONTROL_EN,
+
+   /* sentinel */
+   F_MAX_FIELDS
+};
+
+static const struct reg_field ti_pat_mmrs_reg_fields[] = {
+   /* Revision */
+   [F_PID_MAJOR]   = REG_FIELD(TI_PAT_MMRS_PID, 8, 10),
+   [F_PID_MINOR]   = REG_FIELD(TI_PAT_MMRS_PID, 0, 5),
+   /* Controls */
+   [F_CONTROL_ARB_MODE]= REG_FIELD(TI_PAT_MMRS_CONTROL, 6, 7),
+   [F_CONTROL_PAGE_SIZE]   = REG_FIELD(TI_PAT_MMRS_CONTROL, 4, 5),
+   [F_CONTROL_REPLACE_OID_EN]  = REG_FIELD(TI_PAT_MMRS_CONTROL, 1, 1),
+   [F_CONTROL_EN]  = REG_FIELD(TI_PAT_MMRS_CONTROL, 0, 0),
+};
+
+/**
+ * struct ti_pat_data - PAT device instance data
+ * @dev: PAT device structure
+ * @mdev: misc device
+ * @mmrs_map: Register map of MMRS region
+ * @table_base: Base address of TABLE region


Please add kerneldoc comments for all fields.
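A possible completion of that kerneldoc, inferred from the visible field names (the descriptions for @mmrs_fields, @page_count, @page_size and @window_base are guesses the author would need to confirm):

```c
/**
 * struct ti_pat_data - PAT device instance data
 * @dev: PAT device structure
 * @mdev: misc device
 * @mmrs_map: Register map of MMRS region
 * @mmrs_fields: Field accessors into the MMRS registers
 * @table_base: Base address of TABLE region
 * @page_count: Number of translation pages this PAT supports
 * @page_size: Size in bytes of each translated page
 * @window_base: Physical base address of the translation window
 */
```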


+ */
+struct ti_pat_data {
+   struct device *dev;
+   struct miscdevice mdev;
+   struct regmap *mmrs_map;
+   struct regmap_field *mmrs_fields[F_MAX_FIELDS];
+   void __iomem *table_base;
+   unsigned int page_count;
+   unsigned int page_size;
+   phys_addr_t window_base;
+ 

Re: [PATCH v2 08/12] drm/mediatek: Get rid of mtk_smi_larb_get/put

2019-06-18 Thread CK Hu
Hi, Yong:

On Mon, 2019-06-10 at 20:55 +0800, Yong Wu wrote:
> MediaTek IOMMU has already added the device_link between the consumer
> and smi-larb device. If the drm device call the pm_runtime_get_sync,
> the smi-larb's pm_runtime_get_sync also be called automatically.
> 
> CC: CK Hu 
> CC: Philipp Zabel 
> Signed-off-by: Yong Wu 
> Reviewed-by: Evan Green 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 11 ---
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 26 --
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
>  3 files changed, 38 deletions(-)
> 
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> index acad088..3a21a48 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> @@ -18,7 +18,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  
>  #include "mtk_drm_drv.h"
>  #include "mtk_drm_crtc.h"
> @@ -371,20 +370,12 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
> *crtc,
>  struct drm_crtc_state *old_state)
>  {
>   struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
> - struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
>   int ret;
>  
>   DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
>  
> - ret = mtk_smi_larb_get(comp->larb_dev);
> - if (ret) {
> - DRM_ERROR("Failed to get larb: %d\n", ret);
> - return;
> - }
> -
>   ret = mtk_crtc_ddp_hw_init(mtk_crtc);
>   if (ret) {
> - mtk_smi_larb_put(comp->larb_dev);
>   return;
>   }

Remove {}.

Regards,
CK

>  
> @@ -396,7 +387,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
> *crtc,
>   struct drm_crtc_state *old_state)
>  {
>   struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
> - struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
>   int i;
>  
>   DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
> @@ -419,7 +409,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
> *crtc,
>  
>   drm_crtc_vblank_off(crtc);
>   mtk_crtc_ddp_hw_fini(mtk_crtc);
> - mtk_smi_larb_put(comp->larb_dev);
>  
>   mtk_crtc->enabled = false;
>  }
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> index 54ca794..ede15c9 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> @@ -265,8 +265,6 @@ int mtk_ddp_comp_init(struct device *dev, struct 
> device_node *node,
> const struct mtk_ddp_comp_funcs *funcs)
>  {
>   enum mtk_ddp_comp_type type;
> - struct device_node *larb_node;
> - struct platform_device *larb_pdev;
>  
>   if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)
>   return -EINVAL;
> @@ -296,30 +294,6 @@ int mtk_ddp_comp_init(struct device *dev, struct 
> device_node *node,
>   if (IS_ERR(comp->clk))
>   return PTR_ERR(comp->clk);
>  
> - /* Only DMA capable components need the LARB property */
> - comp->larb_dev = NULL;
> - if (type != MTK_DISP_OVL &&
> - type != MTK_DISP_RDMA &&
> - type != MTK_DISP_WDMA)
> - return 0;
> -
> - larb_node = of_parse_phandle(node, "mediatek,larb", 0);
> - if (!larb_node) {
> - dev_err(dev,
> - "Missing mediadek,larb phandle in %pOF node\n", node);
> - return -EINVAL;
> - }
> -
> - larb_pdev = of_find_device_by_node(larb_node);
> - if (!larb_pdev) {
> - dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
> - of_node_put(larb_node);
> - return -EPROBE_DEFER;
> - }
> - of_node_put(larb_node);
> -
> - comp->larb_dev = &larb_pdev->dev;
> -
>   return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h 
> b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> index 8399229..b8dc17e 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> @@ -91,7 +91,6 @@ struct mtk_ddp_comp {
>   struct clk *clk;
>   void __iomem *regs;
>   int irq;
> - struct device *larb_dev;
>   enum mtk_ddp_comp_id id;
>   const struct mtk_ddp_comp_funcs *funcs;
>  };




Re: [PATCH v7 14/21] iommu/mediatek: Add mmu1 support

2019-06-18 Thread Tomasz Figa via iommu
On Mon, Jun 10, 2019 at 9:21 PM Yong Wu  wrote:
>
> Normally the M4U HW connect EMI with smi. the diagram is like below:
>   EMI
>|
>   M4U
>|
> smi-common
>|
>-
>||| |...
> larb0 larb1  larb2 larb3
>
> Actually there are 2 mmu cells in the M4U HW, like this diagram:
>
>   EMI
>-
> | |
>mmu0  mmu1 <- M4U
> | |
>-
>|
> smi-common
>|
>-
>||| |...
> larb0 larb1  larb2 larb3
>
> This patch add support for mmu1. In order to get better performance,
> we could adjust some larbs go to mmu1 while the others still go to
> mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
>
> mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> value of that register is 0 which means all the larbs go to mmu0
> defaultly.
>
> This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
>
> Signed-off-by: Yong Wu 
> Reviewed-by: Evan Green 
> ---
>  drivers/iommu/mtk_iommu.c | 46 +-
>  1 file changed, 29 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 3a14301..ec4ce74 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -72,26 +72,32 @@
>  #define F_INT_CLR_BIT  BIT(12)
>
>  #define REG_MMU_INT_MAIN_CONTROL   0x124
> -#define F_INT_TRANSLATION_FAULTBIT(0)
> -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> -#define F_INT_INVALID_PA_FAULT BIT(2)
> -#define F_INT_ENTRY_REPLACEMENT_FAULT  BIT(3)
> -#define F_INT_TLB_MISS_FAULT   BIT(4)
> -#define F_INT_MISS_TRANSACTION_FIFO_FAULT  BIT(5)
> -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   BIT(6)
> +   /* mmu0 | mmu1 */
> +#define F_INT_TRANSLATION_FAULT(BIT(0) | BIT(7))
> +#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
> +#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
> +#define F_INT_ENTRY_REPLACEMENT_FAULT  (BIT(3) | BIT(10))
> +#define F_INT_TLB_MISS_FAULT   (BIT(4) | BIT(11))
> +#define F_INT_MISS_TRANSACTION_FIFO_FAULT  (BIT(5) | BIT(12))
> +#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   (BIT(6) | BIT(13))

If there are two IOMMUs, shouldn't we have two driver instances handle
them, instead of making the driver combine them two internally?

And, what is even more important from security point of view actually,
have two separate page tables (aka IOMMU groups) for them?

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu