Re: [PATCH v2 5/7] iommu/vt-d: Save prq descriptors in an internal list

2020-04-16 Thread Lu Baolu

Hi Kevin,

On 2020/4/16 9:46, Lu Baolu wrote:

On 2020/4/15 17:30, Tian, Kevin wrote:

From: Lu Baolu
Sent: Wednesday, April 15, 2020 1:26 PM

Currently, the page request interrupt thread handles the page
requests in the queue in this way (sketched in code below):

- Clear the PPR bit to ensure that a new interrupt can come in;
- Read and record the head and tail registers;
- Handle all descriptors between head and tail;
- Write tail to the head register.
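
In code, that flow is roughly the following (a simplified sketch pieced
together from the hunks quoted later in this thread; not the exact
driver source):

	static irqreturn_t prq_event_thread(int irq, void *d)
	{
		struct intel_iommu *iommu = d;
		struct page_req_dsc *req;
		int head, tail;

		/* 1) Clear the PPR bit so a new interrupt can be raised. */
		writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);

		/* 2) Record the head and tail registers. */
		tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
		head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;

		/* 3) Handle all descriptors between head and tail. */
		while (head != tail) {
			req = &iommu->prq[head / sizeof(*req)];
			process_single_prq(iommu, req);
			head = (head + sizeof(*req)) & PRQ_RING_MASK;
		}

		/* 4) Write the recorded tail to the head register. */
		dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);

		return IRQ_HANDLED;
	}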

This might cause some descriptors to be handled multiple times.
An example sequence:

- Thread A gets scheduled with PRQ_1 and PRQ_2 in the queue;
- Thread A clears the PPR bit and records the head and tail;
- A new PRQ_3 comes and Thread B gets scheduled;
- Thread B records the head and tail, which include PRQ_1
   and PRQ_2.

I may be overlooking something, but isn't the prq interrupt thread
per iommu? Then why would two prq threads contend here?


The prq interrupt could be masked by the PPR (Pending Page Request) bit
in the Page Request Status Register. Once the interrupt handling thread
clears this bit, new prq interrupts are allowed to be generated.

So, if a page request is in process and the PPR bit is cleared, another
page request from any device under the same iommu could trigger another
interrupt thread.


Rechecked the code. You are right. As long as the interrupt thread is
per iommu, there will only be a single prq thread scheduled at a time.
I will change this accordingly in the new version. Thank you for
pointing this out.
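
For reference, the prq interrupt is requested as a threaded handler
with no primary handler, and the irq core runs a given irq's handler
thread strictly one instance at a time. A simplified sketch of the
registration in intel_svm_enable_prq() (from memory, not verbatim):

	/* One dedicated handler thread per iommu; IRQF_ONESHOT keeps
	 * the line masked until prq_event_thread() returns.
	 */
	ret = request_threaded_irq(irq, NULL, prq_event_thread,
				   IRQF_ONESHOT, iommu->prq_name, iommu);

So a second prq thread can never run against the same iommu while one
is already in flight.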

Best regards,
baolu

RE: [PATCH v2 5/7] iommu/vt-d: Save prq descriptors in an internal list

2020-04-15 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Wednesday, April 15, 2020 1:26 PM
> 
> Currently, the page request interrupt thread handles the page
> requests in the queue in this way:
> 
> - Clear the PPR bit to ensure that a new interrupt can come in;
> - Read and record the head and tail registers;
> - Handle all descriptors between head and tail;
> - Write tail to the head register.
> 
> This might cause some descriptors to be handled multiple times.
> An example sequence:
> 
> - Thread A gets scheduled with PRQ_1 and PRQ_2 in the queue;
> - Thread A clears the PPR bit and records the head and tail;
> - A new PRQ_3 comes and Thread B gets scheduled;
> - Thread B records the head and tail, which include PRQ_1
>   and PRQ_2.

I may be overlooking something, but isn't the prq interrupt thread
per iommu? Then why would two prq threads contend here?

Thanks,
Kevin

> 
> As a result, PRQ_1 and PRQ_2 are handled twice, in Thread_A and
> Thread_B.
> 
>          Thread_A                  Thread_B
>        .--------.                .--------.
>        |        |                |        |
>        .--------.                .--------.
>   head | PRQ_1  |           head | PRQ_1  |
>        .--------.                .--------.
>        | PRQ_2  |                | PRQ_2  |
>        .--------.                .--------.
>   tail |        |                | PRQ_3  |
>        .--------.                .--------.
>        |        |           tail |        |
>        '--------'                '--------'
> 
> To avoid this, we probably need to apply a spinlock to ensure
> that PRQs are handled in a serialized way. But that means
> intel_svm_process_prq() will be called with a spinlock held,
> which adds extra complexity to intel_svm_process_prq().
> 
> This patch aims to make PRQ descriptors be handled in a serialized
> way, while removing the requirement of holding the spin lock in
> intel_svm_process_prq(), by saving the descriptors in a list.
> 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel-svm.c   | 58 +++++++++++++++++++++++++++++++++++-----------
>  include/linux/intel-iommu.h |  2 ++
>  2 files changed, 49 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index a1921b462783..05aeb8ea51c4 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -50,6 +50,8 @@ int intel_svm_enable_prq(struct intel_iommu *iommu)
>   return ret;
>   }
>   iommu->pr_irq = irq;
> + INIT_LIST_HEAD(&iommu->prq_list);
> + spin_lock_init(&iommu->prq_lock);
> 
>   snprintf(iommu->prq_name, sizeof(iommu->prq_name), "dmar%d-prq", iommu->seq_id);
> 
> @@ -698,6 +700,14 @@ struct page_req_dsc {
> 
>  #define PRQ_RING_MASK ((0x1000 << PRQ_ORDER) - 0x20)
> 
> +struct page_req {
> + struct list_head list;
> + struct page_req_dsc desc;
> + unsigned int processing:1;
> + unsigned int drained:1;
> + unsigned int completed:1;
> +};
> +
>  static bool access_error(struct vm_area_struct *vma, struct page_req_dsc *req)
>  {
>   unsigned long requested = 0;
> @@ -842,34 +852,60 @@ static void process_single_prq(struct intel_iommu *iommu,
>   }
>  }
> 
> -static void intel_svm_process_prq(struct intel_iommu *iommu,
> -   struct page_req_dsc *prq,
> -   int head, int tail)
> +static void intel_svm_process_prq(struct intel_iommu *iommu)
>  {
> - struct page_req_dsc *req;
> -
> - while (head != tail) {
> - req = &iommu->prq[head / sizeof(*req)];
> - process_single_prq(iommu, req);
> - head = (head + sizeof(*req)) & PRQ_RING_MASK;
> + struct page_req *req;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&iommu->prq_lock, flags);
> + while (!list_empty(&iommu->prq_list)) {
> +     req = list_first_entry(&iommu->prq_list, struct page_req, list);
> +     if (!req->processing) {
> +         req->processing = true;
> +         spin_unlock_irqrestore(&iommu->prq_lock, flags);
> +         process_single_prq(iommu, &req->desc);
> +         spin_lock_irqsave(&iommu->prq_lock, flags);
> +         req->completed = true;
> +     } else if (req->completed) {
> +         list_del(&req->list);
> +         kfree(req);
> +     } else {
> +         break;
> +     }
>   }
> + spin_unlock_irqrestore(&iommu->prq_lock, flags);
>  }
> 
>  static irqreturn_t prq_event_thread(int irq, void *d)
>  {
>   struct intel_iommu *iommu = d;
> + unsigned long flags;
>   int head, tail;
> 
> + spin_lock_irqsave(&iommu->prq_lock, flags);
>   /*
>* Clear PPR bit before reading head/tail registers, to
>* ensure that we get a new interrupt if needed.
>*/
>   writel(DMA_PRS_PPR, iommu->reg + DMAR_PRS_REG);
> -
>   tail = dmar_readq(iommu->reg + DMAR_PQT_REG) & PRQ_RING_MASK;
>   head = dmar_readq(iommu->reg + DMAR_PQH_REG) & PRQ_RING_MASK;
> - intel_svm_process_prq(iommu, iommu->prq, head, tail);
> + while (head != tail) {
> +
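
The hunk is truncated in the archive. Given the commit message and the
drain loop above, the rest of prq_event_thread() plausibly copies each
descriptor into a struct page_req, queues it on iommu->prq_list,
advances the head register, and then calls intel_svm_process_prq(). A
hedged sketch of that shape (not the actual patch text):

	while (head != tail) {
		struct page_req *req = kzalloc(sizeof(*req), GFP_ATOMIC);

		if (req) {
			/* Copy the hardware descriptor into the list entry. */
			req->desc = iommu->prq[head / sizeof(struct page_req_dsc)];
			list_add_tail(&req->list, &iommu->prq_list);
		}
		head = (head + sizeof(struct page_req_dsc)) & PRQ_RING_MASK;
	}

	/* Let the hardware reuse the consumed slots, then drain the list. */
	dmar_writeq(iommu->reg + DMAR_PQH_REG, tail);
	spin_unlock_irqrestore(&iommu->prq_lock, flags);

	intel_svm_process_prq(iommu);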