Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-05 Thread Lu Baolu

Hi Joerg,

On 12/5/18 11:56 PM, Joerg Roedel wrote:

On Tue, Dec 04, 2018 at 02:13:31PM +0800, Lu Baolu wrote:

The existing code uses GFP_ATOMIC, this patch only changes the size of
the allocated desc_page.

I don't think we really need GFP_ATOMIC here (and also for some other
places). I will clean up them in a separated patch.


Okay, thanks.


In this patch, there is some code like the code below. It calculates
destination address of memcpy with qi->desc. If it's still struct qi_desc
pointer, the calculation result would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The change of the calculation method is to support 128 bits invalidation
descriptors and 256 invalidation descriptors in this unified code logic.

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/


Yes. We need to support different descriptor size.


Okay, pointer arithmetic on void* isn't well defined in the C standard,
afaik. But it should work with GCC, so it's probably fine.

Unrelated to this patch-set, the whole qi management code needs some
cleanups, it queues a sync after every command and has very tricky
locking. This patch-set further complicates matters there, so it is
probably time for a clean re-write of that part?


Sure. I will start the cleanups(including the unnecessary ATOMIC page
allocation) and submit them in a separated patch set after well tested.

Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-05 Thread Joerg Roedel
On Tue, Dec 04, 2018 at 02:13:31PM +0800, Lu Baolu wrote:
> The existing code uses GFP_ATOMIC, this patch only changes the size of
> the allocated desc_page.
> 
> I don't think we really need GFP_ATOMIC here (and also for some other
> places). I will clean up them in a separated patch.

Okay, thanks.

> > In this patch, there is some code like the code below. It calculates
> > destination address of memcpy with qi->desc. If it's still struct qi_desc
> > pointer, the calculation result would be wrong.
> > 
> > +   memcpy(desc, qi->desc + (wait_index << shift),
> > +  1 << shift);
> > 
> > The change of the calculation method is to support 128 bits invalidation
> > descriptors and 256 invalidation descriptors in this unified code logic.
> > 
> > Also, the conversation between Baolu and me may help.
> > 
> > https://lore.kernel.org/patchwork/patch/1006756/
> 
> Yes. We need to support different descriptor size.

Okay, pointer arithmetic on void* isn't well defined in the C standard,
afaik. But it should work with GCC, so it's probably fine.

Unrelated to this patch-set, the whole qi management code needs some
cleanups, it queues a sync after every command and has very tricky
locking. This patch-set further complicates matters there, so it is
probably time for a clean re-write of that part?

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Lu Baolu

Hi,

On 12/4/18 1:23 AM, Liu, Yi L wrote:

Hi Joerg,


From: Joerg Roedel [mailto:j...@8bytes.org]
Sent: Monday, December 3, 2018 5:49 AM
To: Lu Baolu 
Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor
support

On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:

-
-   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,

0);

+   /*
+* Need two pages to accommodate 256 descriptors of 256 bits each
+* if the remapping hardware supports scalable mode translation.
+*/
+   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
+!!ecap_smts(iommu->ecap));



Same here, does the allocation really need GFP_ATOMIC?


still leave to Baolu.


The existing code uses GFP_ATOMIC, this patch only changes the size of
the allocated desc_page.

I don't think we really need GFP_ATOMIC here (and also for some other
places). I will clean up them in a separated patch.






  struct q_inval {
raw_spinlock_t  q_lock;
-   struct qi_desc  *desc;  /* invalidation queue */
+   void*desc;  /* invalidation queue */
int *desc_status;   /* desc status */
int free_head;  /* first free entry */
int free_tail;  /* last free entry */


Why do you switch the pointer to void* ?


In this patch, there is some code like the code below. It calculates
destination address of memcpy with qi->desc. If it's still struct qi_desc
pointer, the calculation result would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The change of the calculation method is to support 128 bits invalidation
descriptors and 256 invalidation descriptors in this unified code logic.

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/


Yes. We need to support different descriptor size.

Best regards,
Lu Baolu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Liu, Yi L
Hi Joerg,

> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Monday, December 3, 2018 5:49 AM
> To: Lu Baolu 
> Subject: Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor
> support
> 
> On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:
> > -
> > -   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> 0);
> > +   /*
> > +* Need two pages to accommodate 256 descriptors of 256 bits each
> > +* if the remapping hardware supports scalable mode translation.
> > +*/
> > +   desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> > +!!ecap_smts(iommu->ecap));
> 
> 
> Same here, does the allocation really need GFP_ATOMIC?

still leave to Baolu.

> 
> >  struct q_inval {
> > raw_spinlock_t  q_lock;
> > -   struct qi_desc  *desc;  /* invalidation queue */
> > +   void*desc;  /* invalidation queue */
> > int *desc_status;   /* desc status */
> > int free_head;  /* first free entry */
> > int free_tail;  /* last free entry */
> 
> Why do you switch the pointer to void* ?

In this patch, there is some code like the code below. It calculates
destination address of memcpy with qi->desc. If it's still struct qi_desc
pointer, the calculation result would be wrong.

+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);

The change of the calculation method is to support 128 bits invalidation
descriptors and 256 invalidation descriptors in this unified code logic.

Also, the conversation between Baolu and me may help.

https://lore.kernel.org/patchwork/patch/1006756/

> 
>   Joerg

Thanks,
Yi Liu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-12-03 Thread Joerg Roedel
On Wed, Nov 28, 2018 at 11:54:41AM +0800, Lu Baolu wrote:
> -
> - desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO, 0);
> + /*
> +  * Need two pages to accommodate 256 descriptors of 256 bits each
> +  * if the remapping hardware supports scalable mode translation.
> +  */
> + desc_page = alloc_pages_node(iommu->node, GFP_ATOMIC | __GFP_ZERO,
> +  !!ecap_smts(iommu->ecap));


Same here, does the allocation really need GFP_ATOMIC?

>  struct q_inval {
>   raw_spinlock_t  q_lock;
> - struct qi_desc  *desc;  /* invalidation queue */
> + void*desc;  /* invalidation queue */
>   int *desc_status;   /* desc status */
>   int free_head;  /* first free entry */
>   int free_tail;  /* last free entry */

Why do you switch the pointer to void* ?


Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v5 04/12] iommu/vt-d: Add 256-bit invalidation descriptor support

2018-11-27 Thread Lu Baolu
Intel vt-d spec rev3.0 requires software to use 256-bit
descriptors in invalidation queue. As the spec reads in
section 6.5.2:

Remapping hardware supporting Scalable Mode Translations
(ECAP_REG.SMTS=1) allow software to additionally program
the width of the descriptors (128-bits or 256-bits) that
will be written into the Queue. Software should setup the
Invalidation Queue for 256-bit descriptors before progra-
mming remapping hardware for scalable-mode translation as
128-bit descriptors are treated as invalid descriptors
(see Table 21 in Section 6.5.2.10) in scalable-mode.

This patch adds 256-bit invalidation descriptor support
if the hardware presents scalable mode capability.

Cc: Ashok Raj 
Cc: Jacob Pan 
Cc: Kevin Tian 
Signed-off-by: Sanjay Kumar 
Signed-off-by: Liu Yi L 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/dmar.c| 91 +++--
 drivers/iommu/intel-svm.c   | 76 +++-
 drivers/iommu/intel_irq_remapping.c |  6 +-
 include/linux/intel-iommu.h |  9 ++-
 4 files changed, 121 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index d9c748b6f9e4..9511f9aeb77c 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1160,6 +1160,7 @@ static int qi_check_fault(struct intel_iommu *iommu, int 
index)
int head, tail;
struct q_inval *qi = iommu->qi;
int wait_index = (index + 1) % QI_LENGTH;
+   int shift = qi_shift(iommu);
 
if (qi->desc_status[wait_index] == QI_ABORT)
return -EAGAIN;
@@ -1173,13 +1174,19 @@ static int qi_check_fault(struct intel_iommu *iommu, 
int index)
 */
if (fault & DMA_FSTS_IQE) {
head = readl(iommu->reg + DMAR_IQH_REG);
-   if ((head >> DMAR_IQ_SHIFT) == index) {
-   pr_err("VT-d detected invalid descriptor: "
-   "low=%llx, high=%llx\n",
-   (unsigned long long)qi->desc[index].low,
-   (unsigned long long)qi->desc[index].high);
-   memcpy(>desc[index], >desc[wait_index],
-   sizeof(struct qi_desc));
+   if ((head >> shift) == index) {
+   struct qi_desc *desc = qi->desc + head;
+
+   /*
+* desc->qw2 and desc->qw3 are either reserved or
+* used by software as private data. We won't print
+* out these two qw's for security consideration.
+*/
+   pr_err("VT-d detected invalid descriptor: qw0 = %llx, 
qw1 = %llx\n",
+  (unsigned long long)desc->qw0,
+  (unsigned long long)desc->qw1);
+   memcpy(desc, qi->desc + (wait_index << shift),
+  1 << shift);
writel(DMA_FSTS_IQE, iommu->reg + DMAR_FSTS_REG);
return -EINVAL;
}
@@ -1191,10 +1198,10 @@ static int qi_check_fault(struct intel_iommu *iommu, 
int index)
 */
if (fault & DMA_FSTS_ITE) {
head = readl(iommu->reg + DMAR_IQH_REG);
-   head = ((head >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+   head = ((head >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
head |= 1;
tail = readl(iommu->reg + DMAR_IQT_REG);
-   tail = ((tail >> DMAR_IQ_SHIFT) - 1 + QI_LENGTH) % QI_LENGTH;
+   tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
 
writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
 
@@ -1222,15 +1229,14 @@ int qi_submit_sync(struct qi_desc *desc, struct 
intel_iommu *iommu)
 {
int rc;
struct q_inval *qi = iommu->qi;
-   struct qi_desc *hw, wait_desc;
+   int offset, shift, length;
+   struct qi_desc wait_desc;
int wait_index, index;
unsigned long flags;
 
if (!qi)
return 0;
 
-   hw = qi->desc;
-
 restart:
rc = 0;
 
@@ -1243,16 +1249,21 @@ int qi_submit_sync(struct qi_desc *desc, struct 
intel_iommu *iommu)
 
index = qi->free_head;
wait_index = (index + 1) % QI_LENGTH;
+   shift = qi_shift(iommu);
+   length = 1 << shift;
 
qi->desc_status[index] = qi->desc_status[wait_index] = QI_IN_USE;
 
-   hw[index] = *desc;
-
-   wait_desc.low = QI_IWD_STATUS_DATA(QI_DONE) |
+   offset = index << shift;
+   memcpy(qi->desc + offset, desc, length);
+   wait_desc.qw0 = QI_IWD_STATUS_DATA(QI_DONE) |
QI_IWD_STATUS_WRITE | QI_IWD_TYPE;
-   wait_desc.high = virt_to_phys(>desc_status[wait_index]);
+   wait_desc.qw1 = virt_to_phys(>desc_status[wait_index]);
+   wait_desc.qw2 = 0;
+   wait_desc.qw3 = 0;
 
-   hw[wait_index] = wait_desc;
+   offset = wait_index <<