Re: [PATCH] swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm

2022-04-29 Thread Boris Ostrovsky



On 4/28/22 6:49 PM, Stefano Stabellini wrote:

On Thu, 28 Apr 2022, Boris Ostrovsky wrote:

On 4/28/22 5:49 PM, Stefano Stabellini wrote:

On Thu, 28 Apr 2022, Christoph Hellwig wrote:

On Tue, Apr 26, 2022 at 04:07:45PM -0700, Stefano Stabellini wrote:

Reported-by: Rahul Singh 
Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

Do you want to take this through the Xen tree or should I pick it up?
Either way I'd love to see some testing on x86 as well.

I agree on the x86 testing. Juergen, Boris?

I'd say to take this patch via the Xen tree, but Juergen just sent a
Xen pull request to Linux last Saturday. Juergen, do you plan to send
another one? Do you have something else lined up? If not, it might be
better to let Christoph pick it up.


We don't have anything pending.


I can test it, but tomorrow at the earliest, so I'm not sure we can get this
into rc5. Do you consider this an urgent fix, or can we wait until 5.19?
Because it's a bit too much IMO for rc6.

On one hand, Linux doesn't boot on a platform without this fix. On the
other hand, I totally see that this patch could introduce regressions on
x86, so I think it is fair that we are careful with it.

From my point of view, it might be better to wait for 5.19 and mark it
for backport.



No problems uncovered during testing.


-boris



Re: [PATCH] swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm

2022-04-28 Thread Boris Ostrovsky



On 4/28/22 5:49 PM, Stefano Stabellini wrote:

On Thu, 28 Apr 2022, Christoph Hellwig wrote:

On Tue, Apr 26, 2022 at 04:07:45PM -0700, Stefano Stabellini wrote:

Reported-by: Rahul Singh 
Signed-off-by: Christoph Hellwig 

Reviewed-by: Stefano Stabellini 

Do you want to take this through the Xen tree or should I pick it up?
Either way I'd love to see some testing on x86 as well.

I agree on the x86 testing. Juergen, Boris?

I'd say to take this patch via the Xen tree, but Juergen just sent a
Xen pull request to Linux last Saturday. Juergen, do you plan to send
another one? Do you have something else lined up? If not, it might be
better to let Christoph pick it up.



We don't have anything pending.


I can test it, but tomorrow at the earliest, so I'm not sure we can get this
into rc5. Do you consider this an urgent fix, or can we wait until 5.19?
Because it's a bit too much IMO for rc6.


-boris



Re: cleanup swiotlb initialization v8

2022-04-05 Thread Boris Ostrovsky



On 4/4/22 1:05 AM, Christoph Hellwig wrote:

Hi all,

this series tries to clean up the swiotlb initialization, including
that of swiotlb-xen.  To get there it also removes the x86 IOMMU table
infrastructure that massively obfuscates the initialization path.

Git tree:

 git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup

Gitweb:

 http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb-init-cleanup





Tested-by: Boris Ostrovsky 




Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer

2022-03-15 Thread Boris Ostrovsky




On 3/15/22 2:36 AM, Christoph Hellwig wrote:


@@ -271,12 +273,23 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
 * allow to pick a location everywhere for hypervisors with guest
 * memory encryption.
 */
+retry:
+   bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
if (flags & SWIOTLB_ANY)
tlb = memblock_alloc(bytes, PAGE_SIZE);
else
tlb = memblock_alloc_low(bytes, PAGE_SIZE);
if (!tlb)
goto fail;
+   if (remap && remap(tlb, nslabs) < 0) {
+   memblock_free(tlb, PAGE_ALIGN(bytes));
+
+   if (nslabs <= IO_TLB_MIN_SLABS)
+   panic("%s: Failed to remap %zu bytes\n",
+ __func__, bytes);
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));



I spoke with Konrad (who wrote the original patch ---
f4b2f07b2ed9b469ead87e06fc2fc3d12663a725) and apparently the reason for 2MB was
to optimize for Xen's slab allocator; it had nothing to do with
IO_TLB_MIN_SLABS. Since this is now common code, we should not expose
Xen-specific optimizations here, and smaller values will still work, so
IO_TLB_MIN_SLABS is fine.
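
For reference, assuming the mainline values IO_TLB_SHIFT == 11 (i.e. 2KB per
slab) and IO_TLB_MIN_SLABS == 512, the sizes work out as:

    1024 << 11 = 1024 * 2048 bytes = 2MB   (Xen's historical floor)
     512 << 11 =  512 * 2048 bytes = 1MB   (IO_TLB_MIN_SLABS)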

I think this should be mentioned in the commit message though, probably best in 
the next patch where you switch to this code.

As far as the hunk above goes, I don't think we need the max() here: with
IO_TLB_MIN_SLABS being 512, we may get stuck in an infinite loop (nslabs can
never drop below the 1024 floor, so the panic condition is never reached).
Something like

	nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
	if (nslabs <= IO_TLB_MIN_SLABS)
		panic()

should be sufficient.
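
A minimal sketch of the retry path with that change applied (same identifiers
as in the hunk above; the panic message is illustrative):

retry:
	bytes = PAGE_ALIGN(nslabs << IO_TLB_SHIFT);
	if (flags & SWIOTLB_ANY)
		tlb = memblock_alloc(bytes, PAGE_SIZE);
	else
		tlb = memblock_alloc_low(bytes, PAGE_SIZE);
	if (!tlb)
		goto fail;
	if (remap && remap(tlb, nslabs) < 0) {
		memblock_free(tlb, PAGE_ALIGN(bytes));

		/* Halve first, then check: nslabs can now actually reach
		 * the minimum, so the loop is guaranteed to terminate. */
		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
		if (nslabs <= IO_TLB_MIN_SLABS)
			panic("%s: Failed to remap %zu bytes\n",
			      __func__, bytes);
		goto retry;
	}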



+   goto retry;
+   }
if (swiotlb_init_with_tbl(tlb, default_nslabs, flags))
goto fail_free_mem;
return;
@@ -287,12 +300,18 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
pr_warn("Cannot allocate buffer");
  }
  
+void __init swiotlb_init(bool addressing_limit, unsigned int flags)
+{
+   return swiotlb_init_remap(addressing_limit, flags, NULL);
+}
+
  /*
   * Systems with larger DMA zones (those that don't support ISA) can
   * initialize the swiotlb later using the slab allocator if needed.
   * This should be just like above, but with some error catching.
   */
-int swiotlb_init_late(size_t size, gfp_t gfp_mask)
+int swiotlb_init_late(size_t size, gfp_t gfp_mask,
+   int (*remap)(void *tlb, unsigned long nslabs))
  {
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
unsigned long bytes;
@@ -303,6 +322,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
if (swiotlb_force_disable)
return 0;
  
+retry:
	order = get_order(nslabs << IO_TLB_SHIFT);
nslabs = SLABS_PER_PAGE << order;
bytes = nslabs << IO_TLB_SHIFT;
@@ -317,6 +337,16 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)

	if (!vstart)
		return -ENOMEM;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
+
+   if (IO_TLB_MIN_SLABS <= 1024)
+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));



Same here. (The 'if' check above is wrong anyway: it compares the constant
IO_TLB_MIN_SLABS against 1024 rather than testing nslabs, so it either always
or never triggers.)

Patches 13 and 14 look good.


-boris




+   goto retry;
+   }
  
	if (order != get_order(bytes)) {
		pr_warn("only able to allocate %ld MB\n",



Re: [PATCH 14/15] swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl

2022-03-14 Thread Boris Ostrovsky



On 3/14/22 3:31 AM, Christoph Hellwig wrote:

@@ -314,6 +293,7 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
  int swiotlb_init_late(size_t size, gfp_t gfp_mask,
int (*remap)(void *tlb, unsigned long nslabs))
  {
+	struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
unsigned long bytes;
unsigned char *vstart = NULL;
@@ -355,33 +335,28 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
(PAGE_SIZE << order) >> 20);
nslabs = SLABS_PER_PAGE << order;
}
-   rc = swiotlb_late_init_with_tbl(vstart, nslabs);
-   if (rc)
-   free_pages((unsigned long)vstart, order);
-
-   return rc;
-}
-
-int
-swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
-{
-	struct io_tlb_mem *mem = &io_tlb_default_mem;
-   unsigned long bytes = nslabs << IO_TLB_SHIFT;
  
-	if (swiotlb_force_disable)
-		return 0;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
  
-	/* protect against double initialization */
-	if (WARN_ON_ONCE(mem->nslabs))
-		return -ENOMEM;
+   /* Min is 2MB */
+   if (nslabs <= 1024)
+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }



We now end up with two attempts to remap. I think this second one is what we
want, since it solves the problem I pointed out in the previous patch.


-boris




Re: [PATCH 13/15] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-14 Thread Boris Ostrovsky


On 3/14/22 3:31 AM, Christoph Hellwig wrote:

-
  static void __init pci_xen_swiotlb_init(void)
  {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;
x86_swiotlb_enable = true;
-   xen_swiotlb = true;
-   xen_swiotlb_init_early();
+   swiotlb_init_remap(true, x86_swiotlb_flags, xen_swiotlb_fixup);



I think we need to have SWIOTLB_ANY set in x86_swiotlb_flags here.
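
That is, something along the lines of (a sketch):

	/* sketch: with the Xen fixup the buffer's machine address is what
	 * matters, so the kernel-physical placement need not be low memory */
	swiotlb_init_remap(true, x86_swiotlb_flags | SWIOTLB_ANY,
			   xen_swiotlb_fixup);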




	dma_ops = &xen_swiotlb_dma_ops;
if (IS_ENABLED(CONFIG_PCI))
pci_request_acs();
@@ -88,14 +85,16 @@ static void __init pci_xen_swiotlb_init(void)

 int pci_xen_swiotlb_init_late(void)
 {
-   int rc;
-
-   if (xen_swiotlb)
+	if (dma_ops == &xen_swiotlb_dma_ops)
return 0;
  
-	rc = xen_swiotlb_init();
-	if (rc)
-		return rc;
+   /* we can work with the default swiotlb */
+   if (!io_tlb_default_mem.nslabs) {
+   int rc = swiotlb_init_late(swiotlb_size_or_default(),
+  GFP_KERNEL, xen_swiotlb_fixup);



This may be a comment for the previous patch, but looking at
swiotlb_init_late():

retry:
    order = get_order(nslabs << IO_TLB_SHIFT);
    nslabs = SLABS_PER_PAGE << order;
    bytes = nslabs << IO_TLB_SHIFT;

    while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
    vstart = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
  order);
    if (vstart)
    break;
    order--;
    }

    if (!vstart)
    return -ENOMEM;
    if (remap)
    rc = remap(vstart, nslabs);
    if (rc) {
    free_pages((unsigned long)vstart, order);

    /* Min is 2MB */
    if (nslabs <= 1024)
    return rc;
    nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
    goto retry;
    }

    if (order != get_order(bytes)) {
    pr_warn("only able to allocate %ld MB\n",
    (PAGE_SIZE << order) >> 20);
    nslabs = SLABS_PER_PAGE << order; <===
    }

    rc = swiotlb_late_init_with_tbl(vstart, nslabs);

Notice that we don't do remap() after the final update to nslabs. We should.
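
A sketch of one way to fix that, retrying the remap after the order-based
fallback shrinks the buffer (illustrative only):

	if (order != get_order(bytes)) {
		pr_warn("only able to allocate %ld MB\n",
			(PAGE_SIZE << order) >> 20);
		nslabs = SLABS_PER_PAGE << order;
	}
	if (remap)
		rc = remap(vstart, nslabs);	/* covers the final nslabs too */
	if (rc) {
		free_pages((unsigned long)vstart, order);
		nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
		if (nslabs <= IO_TLB_MIN_SLABS)
			return rc;
		goto retry;
	}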



-boris

Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer

2022-03-14 Thread Boris Ostrovsky



On 3/14/22 3:31 AM, Christoph Hellwig wrote:

-void __init swiotlb_init(bool addressing_limit, unsigned int flags)
+void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+   int (*remap)(void *tlb, unsigned long nslabs))
  {
-   size_t bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
+   unsigned long nslabs = default_nslabs;
+   size_t bytes;
void *tlb;
  
  	if (!addressing_limit && !swiotlb_force_bounce)

@@ -271,12 +273,24 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
 * allow to pick a location everywhere for hypervisors with guest
 * memory encryption.
 */
+retry:
+   bytes = PAGE_ALIGN(default_nslabs << IO_TLB_SHIFT);
if (flags & SWIOTLB_ANY)
tlb = memblock_alloc(bytes, PAGE_SIZE);
else
tlb = memblock_alloc_low(bytes, PAGE_SIZE);
if (!tlb)
goto fail;
+   if (remap && remap(tlb, nslabs) < 0) {
+   memblock_free(tlb, PAGE_ALIGN(bytes));
+
+   /* Min is 2MB */
+   if (nslabs <= 1024)



This is IO_TLB_MIN_SLABS, isn't it? (The Xen code didn't say so, but that's
what it meant to say, I believe.)



+   panic("%s: Failed to remap %zu bytes\n",
+ __func__, bytes);
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }

@@ -303,6 +323,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
if (swiotlb_force_disable)
return 0;
  
+retry:
	order = get_order(nslabs << IO_TLB_SHIFT);
nslabs = SLABS_PER_PAGE << order;
bytes = nslabs << IO_TLB_SHIFT;
@@ -317,6 +338,17 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)

	if (!vstart)
		return -ENOMEM;
+   if (remap)
+   rc = remap(vstart, nslabs);
+   if (rc) {
+   free_pages((unsigned long)vstart, order);
+
+   /* Min is 2MB */
+   if (nslabs <= 1024)



Same here.


-boris



+   return rc;
+   nslabs = max(1024UL, ALIGN(nslabs >> 1, IO_TLB_SEGSIZE));
+   goto retry;
+   }
  
	if (order != get_order(bytes)) {
		pr_warn("only able to allocate %ld MB\n",



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-09 Thread Boris Ostrovsky



On 3/9/22 1:18 AM, Christoph Hellwig wrote:

On Tue, Mar 08, 2022 at 04:38:21PM -0500, Boris Ostrovsky wrote:

On 3/1/22 5:53 AM, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.


Any chance this patch could be split? Lots of things are happening here and 
it's somewhat hard to review. (Patch 7 too BTW but I think I managed to get 
through it)

What would be your preferred split?



swiotlb_init() rework to be done separately?





diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index e0def4b1c3181..2f2c468acb955 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
   #endif /* CONFIG_SWIOTLB */
 #ifdef CONFIG_SWIOTLB_XEN
-static bool xen_swiotlb;
-
   static void __init pci_xen_swiotlb_init(void)
   {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;


Now that there is a single call site for this routine I think this check can be 
dropped. We are only called here for xen_initial_domain()==true.

The callsite just checks xen_pv_domain() and itself is called
unconditionally during initialization.



Oh, right, nevermind. *pv* domain.


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-08 Thread Boris Ostrovsky



On 3/1/22 5:53 AM, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.



Any chance this patch could be split? Lots of things are happening here and 
it's somewhat hard to review. (Patch 7 too BTW but I think I managed to get 
through it)



diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index e0def4b1c3181..2f2c468acb955 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -71,15 +71,12 @@ static inline void __init pci_swiotlb_detect(void)
  #endif /* CONFIG_SWIOTLB */
  
  #ifdef CONFIG_SWIOTLB_XEN

-static bool xen_swiotlb;
-
  static void __init pci_xen_swiotlb_init(void)
  {
if (!xen_initial_domain() && !x86_swiotlb_enable)
return;



Now that there is a single call site for this routine I think this check can be 
dropped. We are only called here for xen_initial_domain()==true.


-boris


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-04 Thread Boris Ostrovsky



On 3/4/22 12:43 PM, Christoph Hellwig wrote:

On Fri, Mar 04, 2022 at 12:36:17PM -0500, Boris Ostrovsky wrote:

I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

That looks like the swiotlb buffer did not get initialized at all, but I
can't really explain why.

Can you stick in a printk and see if xen_swiotlb_init_early gets called
at all?



Actually, that's the only thing I did do so far and yes, it does get called.

So, specifically for "x86: remove the IOMMU table infrastructure" I
think we need the one-liner below so that swiotlb_exit doesn't get called
for the Xen case.  But that should have been fixed up by the next
patch already.



This indeed allows dom0 to boot. Not sure I see where in the next patch this 
would have been fixed?


(BTW, just noticed in iommu_setup() you set this variable to 1. Should be 
'true')


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-04 Thread Boris Ostrovsky



On 3/4/22 12:28 PM, Christoph Hellwig wrote:

On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:

Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

That looks like the swiotlb buffer did not get initialized at all, but I
can't really explain why.

Can you stick in a printk and see if xen_swiotlb_init_early gets called
at all?




Actually, that's the only thing I did do so far and yes, it does get called.


-boris



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-03 Thread Boris Ostrovsky


On 3/3/22 5:57 AM, Christoph Hellwig wrote:

On Wed, Mar 02, 2022 at 08:15:03AM -0500, Boris Ostrovsky wrote:

Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.

Thanks. Looks like the sizing is going wrong.  Just to confirm, this is
dom0 on x86 and no special command line options?



Right.


module2 /boot/vmlinuz-5.17.0-rc6swiotlb placeholder 
root=UUID=dbef1262-8c8a-43db-8055-7d9bec7bece0 ro crashkernel=auto 
LANG=en_US.UTF-8 rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 
netroot=iscsi:169.254.0.2:::1:iqn.2015-02.oracle.boot:uefi 
iscsi_param=node.session.timeo.replacement_timeout=6000 net.ifnames=1 
nvme_core.shutdown_timeout=10 ipmi_si.tryacpi=0 ipmi_si.trydmi=0 
ipmi_si.trydefaults=0 libiscsi.debug_libiscsi_eh=1  panic=20 nokaslr 
earlyprintk=xen console=hvc0 loglevel=8 4


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky


On 3/2/22 8:15 AM, Boris Ostrovsky wrote:



On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.



Again, this is as dom0. Baremetal is fine.


-boris


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky




On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.


-boris


Re: cleanup swiotlb initialization

2022-02-25 Thread Boris Ostrovsky



On 2/25/22 3:47 AM, Christoph Hellwig wrote:

On Thu, Feb 24, 2022 at 12:07:26PM -0500, Boris Ostrovsky wrote:

Just to be clear: this crashes only as dom0. Boots fine as baremetal.

Ah.  I can guess what this might be.  On Xen the hypervisor controls the
IOMMU and we should never end up initializing it in Linux, right?


Right, we shouldn't be in that code path.

Can you try the patch below on top of the series?



Yes, this makes dom0 boot fine.


(It also addresses something I wanted to mention while looking at the patches,
which was to remove the Xen initialization code from pci_swiotlb_detect_4gb(),
since it had nothing to do with what the routine's name implies.)


-boris



Re: cleanup swiotlb initialization

2022-02-24 Thread Boris Ostrovsky



On 2/24/22 11:39 AM, Christoph Hellwig wrote:

On Thu, Feb 24, 2022 at 11:18:33AM -0500, Boris Ostrovsky wrote:

On 2/24/22 10:58 AM, Christoph Hellwig wrote:

Thanks.

This looks really strange, as early_amd_iommu_init should not interact much
with the changes.  I'll see if I can find an AMD system to test on.


Just to be clear: this crashes only as dom0. Boots fine as baremetal.

Ah.  I can guess what this might be.  On Xen the hypervisor controls the
IOMMU and we should never end up initializing it in Linux, right?



Right, we shouldn't be in that code path.


-boris



Re: cleanup swiotlb initialization

2022-02-24 Thread Boris Ostrovsky


On 2/24/22 10:58 AM, Christoph Hellwig wrote:

Thanks.

This looks really strange, as early_amd_iommu_init should not interact much
with the changes.  I'll see if I can find an AMD system to test on.



Just to be clear: this crashes only as dom0. Boots fine as baremetal.


-boris




On Wed, Feb 23, 2022 at 07:57:49PM -0500, Boris Ostrovsky wrote:

[   37.377313] BUG: unable to handle page fault for address: c90042880018
[   37.378219] #PF: supervisor read access in kernel mode
[   37.378219] #PF: error_code(0x) - not-present page
[   37.378219] PGD 7c2f2ee067 P4D 7c2f2ee067 PUD 7bf019b067 PMD 105a30067 PTE 0
[   37.378219] Oops:  [#1] PREEMPT SMP NOPTI
[   37.378219] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc5swiotlb #9
[   37.378219] Hardware name: Oracle Corporation ORACLE SERVER 
E1-2c/ASY,Generic,SM,E1-2c, BIOS 49004900 12/23/2020
[   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
[   37.378219] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   37.378219] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   37.378219] RAX: c9004288 RBX: 888107260800 RCX: 
[   37.378219] RDX: 8000 RSI: ea00041cab80 RDI: 
[   37.378219] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   37.378219] R10: 0002 R11:  R12: c90040435008
[   37.378219] R13: 0008 R14: efa0 R15: 
[   37.378219] FS:  () GS:88fef418() 
knlGS:
[   37.378219] CS:  e030 DS:  ES:  CR0: 80050033
[   37.378219] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   37.378219] Call Trace:
[   37.378219]  
[   37.378219]  early_amd_iommu_init+0x3c5/0x72d
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  state_next+0x158/0x68f
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  iommu_go_to_state+0x28/0x2d
[   37.378219]  amd_iommu_init+0x15/0x4b
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  pci_iommu_init+0x12/0x37
[   37.378219]  do_one_initcall+0x48/0x210
[   37.378219]  kernel_init_freeable+0x229/0x28c
[   37.378219]  ? rest_init+0xe0/0xe0
[   37.963966]  kernel_init+0x1a/0x130
[   37.979415]  ret_from_fork+0x22/0x30
[   37.991436]  
[   37.999465] Modules linked in:
[   38.007413] CR2: c90042880018
[   38.019416] ---[ end trace  ]---
[   38.023418] RIP: e030:init_iommu_one+0x248/0x2f0
[   38.023418] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   38.023418] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   38.023418] RAX: c9004288 RBX: 888107260800 RCX: 
[   38.155413] RDX: 8000 RSI: ea00041cab80 RDI: 
[   38.175965] Freeing initrd memory: 62640K
[   38.155413] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   38.155413] R10: 0002 R11:  R12: c90040435008
[   38.155413] R13: 0008 R14: efa0 R15: 
[   38.155413] FS:  () GS:88fef418() 
knlGS:
[   38.287414] CS:  e030 DS:  ES:  CR0: 80050033
[   38.309557] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   38.332403] Kernel panic - not syncing: Fatal exception
[   38.351414] Rebooting in 20 seconds..



-boris

---end quoted text---



Re: cleanup swiotlb initialization

2022-02-23 Thread Boris Ostrovsky


On 2/22/22 10:35 AM, Christoph Hellwig wrote:

Hi all,

this series tries to clean up the swiotlb initialization, including
that of swiotlb-xen.  To get there it also removes the x86 IOMMU table
infrastructure that massively obfuscates the initialization path.

Git tree:

 git://git.infradead.org/users/hch/misc.git swiotlb-init-cleanup



I haven't had a chance to look at this yet but this crashes as dom0:


[   37.377313] BUG: unable to handle page fault for address: c90042880018
[   37.378219] #PF: supervisor read access in kernel mode
[   37.378219] #PF: error_code(0x) - not-present page
[   37.378219] PGD 7c2f2ee067 P4D 7c2f2ee067 PUD 7bf019b067 PMD 105a30067 PTE 0
[   37.378219] Oops:  [#1] PREEMPT SMP NOPTI
[   37.378219] CPU: 14 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc5swiotlb #9
[   37.378219] Hardware name: Oracle Corporation ORACLE SERVER 
E1-2c/ASY,Generic,SM,E1-2c, BIOS 49004900 12/23/2020
[   37.378219] RIP: e030:init_iommu_one+0x248/0x2f0
[   37.378219] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   37.378219] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   37.378219] RAX: c9004288 RBX: 888107260800 RCX: 
[   37.378219] RDX: 8000 RSI: ea00041cab80 RDI: 
[   37.378219] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   37.378219] R10: 0002 R11:  R12: c90040435008
[   37.378219] R13: 0008 R14: efa0 R15: 
[   37.378219] FS:  () GS:88fef418() 
knlGS:
[   37.378219] CS:  e030 DS:  ES:  CR0: 80050033
[   37.378219] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   37.378219] Call Trace:
[   37.378219]  
[   37.378219]  early_amd_iommu_init+0x3c5/0x72d
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  state_next+0x158/0x68f
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  iommu_go_to_state+0x28/0x2d
[   37.378219]  amd_iommu_init+0x15/0x4b
[   37.378219]  ? iommu_setup+0x284/0x284
[   37.378219]  pci_iommu_init+0x12/0x37
[   37.378219]  do_one_initcall+0x48/0x210
[   37.378219]  kernel_init_freeable+0x229/0x28c
[   37.378219]  ? rest_init+0xe0/0xe0
[   37.963966]  kernel_init+0x1a/0x130
[   37.979415]  ret_from_fork+0x22/0x30
[   37.991436]  
[   37.999465] Modules linked in:
[   38.007413] CR2: c90042880018
[   38.019416] ---[ end trace  ]---
[   38.023418] RIP: e030:init_iommu_one+0x248/0x2f0
[   38.023418] Code: 48 89 43 68 48 85 c0 74 c4 be 00 20 00 00 48 89 df e8 ea ee ff 
ff 48 89 43 78 48 85 c0 74 ae c6 83 98 00 00 00 00 48 8b 43 38 <48> 8b 40 18 a8 
01 74 07 83 8b a8 04 00 00 01 f6 83 a8 04 00 00 01
[   38.023418] RSP: e02b:c9004044bd18 EFLAGS: 00010286
[   38.023418] RAX: c9004288 RBX: 888107260800 RCX: 
[   38.155413] RDX: 8000 RSI: ea00041cab80 RDI: 
[   38.175965] Freeing initrd memory: 62640K
[   38.155413] RBP: c9004044bd38 R08: 0901 R09: ea00041cab00
[   38.155413] R10: 0002 R11:  R12: c90040435008
[   38.155413] R13: 0008 R14: efa0 R15: 
[   38.155413] FS:  () GS:88fef418() 
knlGS:
[   38.287414] CS:  e030 DS:  ES:  CR0: 80050033
[   38.309557] CR2: c90042880018 CR3: 0260a000 CR4: 00050660
[   38.332403] Kernel panic - not syncing: Fatal exception
[   38.351414] Rebooting in 20 seconds..



-boris


Re: [GIT PULL] dma-mapping fix for Linux 5.14

2021-07-26 Thread Boris Ostrovsky


On 7/25/21 12:50 PM, Linus Torvalds wrote:
> On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig  wrote:
>
>>   - handle vmalloc addresses in dma_common_{mmap,get_sgtable}
>> (Roman Skakun)
> I've pulled this, but my reaction is that we've tried to avoid this in
> the past. Why is Xen using vmalloc'ed addresses and passing those in
> to the dma mapping routines?
>
> It *smells* to me like a Xen-swiotlb bug, and it would have been
> better to try to fix it there. Was that just too painful?


Stefano will probably know better, but this appears to have something to do
with how the Pi (and possibly more ARM systems?) manages DMA memory:
https://lore.kernel.org/xen-devel/CADz_WD5Ln7Pe1WAFp73d2Mz9wxspzTE3WgAJusp5S8LX4=8...@mail.gmail.com/.




-boris





Re: [PATCH v1 13/16] xen: swiotlb: return error code from xen_swiotlb_map_sg()

2021-07-19 Thread Boris Ostrovsky


On 7/15/21 12:45 PM, Logan Gunthorpe wrote:
> From: Martin Oliveira 
>
> The .map_sg() op now expects an error code instead of zero on failure.
>
> xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but
> xen_swiotlb_map_page() only supports returning errors as
> DMA_MAPPING_ERROR. So coalesce all errors into EINVAL.


Reviewed-by: Boris Ostrovsky 
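
For reference, the failure path after the conversion would look roughly like
this (a sketch based on the description above, not the verbatim patch):

out_unmap:
	xen_swiotlb_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC);
	sg_dma_len(sgl) = 0;
	return -EINVAL;		/* previously: return 0; */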





Re: [PATCH v2] dma-mapping: use vmalloc_to_page for vmalloc addresses

2021-07-15 Thread Boris Ostrovsky

On 7/15/21 3:39 AM, Roman Skakun wrote:
>> This looks like it wasn't picked up? Should it go in rc1?
> Hi, Konrad!
>
> This looks like an unambiguous bug, and should be in rc1.


Looks like you didn't copy Christoph which could be part of the problem. Adding 
him.


-boris



>
> Cheers!
>
> ср, 14 июл. 2021 г. в 03:15, Konrad Rzeszutek Wilk :
>> On Tue, Jun 22, 2021 at 04:34:14PM +0300, Roman Skakun wrote:
>>> This commit is dedicated to fixing an incorrect conversion from
>>> cpu_addr to page address in cases when we get a virtual
>>> address that was allocated in the vmalloc range.
>>> As a result, virt_to_page() cannot convert this address
>>> properly and returns an incorrect page address.
>>>
>>> We need to detect such cases and obtain the page address using
>>> vmalloc_to_page() instead.
>>>
>>> Signed-off-by: Roman Skakun 
>>> Reviewed-by: Andrii Anisov 
>>> ---
>>> Hey!
>>> Thanks for suggestions, Christoph!
>>> I updated the patch according to your advice.
>>> But I'm surprised that nobody caught this problem
>>> in the common code before. It looks a bit strange to me.
>> This looks like it wasn't picked up? Should it go in rc1?
>>>
>>>  kernel/dma/ops_helpers.c | 12 ++--
>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
>>> index 910ae69cae77..782728d8a393 100644
>>> --- a/kernel/dma/ops_helpers.c
>>> +++ b/kernel/dma/ops_helpers.c
>>> @@ -5,6 +5,14 @@
>>>   */
>>>  #include 
>>>
>>> +static struct page *cpu_addr_to_page(void *cpu_addr)
>>> +{
>>> + if (is_vmalloc_addr(cpu_addr))
>>> + return vmalloc_to_page(cpu_addr);
>>> + else
>>> + return virt_to_page(cpu_addr);
>>> +}
>>> +
>>>  /*
>>>   * Create scatter-list for the already allocated DMA buffer.
>>>   */
>>> @@ -12,7 +20,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
>>>void *cpu_addr, dma_addr_t dma_addr, size_t size,
>>>unsigned long attrs)
>>>  {
>>> - struct page *page = virt_to_page(cpu_addr);
>>> + struct page *page = cpu_addr_to_page(cpu_addr);
>>>   int ret;
>>>
>>>   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
>>> @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
>>>   return -ENXIO;
>>>
>>>   return remap_pfn_range(vma, vma->vm_start,
>>> - page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
>>> + page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
>>>   user_count << PAGE_SHIFT, vma->vm_page_prot);
>>>  #else
>>>   return -ENXIO;
>>> --
>>> 2.25.1
>>>
>
>

Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky


On 6/16/21 11:35 AM, Christoph Hellwig wrote:
> On Wed, Jun 16, 2021 at 11:33:50AM -0400, Boris Ostrovsky wrote:
>> Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd 
>> addresses? (This is not a rhetorical question, I actually don't know).
> Yes.  Thus is why I'd suggest to just do the vmalloc_to_page or
> virt_to_page dance in ops_helpers.c and just continue using that.


Ah, OK, so something along the lines of what I suggested. (I thought by 
"helpers" you meant virt_to_page()).


-boris



Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky

On 6/16/21 10:21 AM, Christoph Hellwig wrote:
> On Wed, Jun 16, 2021 at 10:12:55AM -0400, Boris Ostrovsky wrote:
>> I wonder now whether we could avoid code duplication between here and 
>> dma_common_mmap()/dma_common_get_sgtable() and use your helper there.
>>
>>
>> Christoph, would that work?  I.e. something like
> You should not duplicate the code at all, and just make the common
> helpers work with vmalloc addresses.


Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd 
addresses? (This is not a rhetorical question, I actually don't know).



-boris


Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky

On 6/16/21 7:42 AM, Roman Skakun wrote:
> This commit is dedicated to fixing an incorrect conversion from
> cpu_addr to page address in cases when we get a virtual
> address that was allocated through xen_swiotlb_alloc_coherent()
> and can be mapped in the vmalloc range.
> As a result, virt_to_page() cannot convert this address
> properly and returns an incorrect page address.
>
> We need to detect such cases and obtain the page address using
> vmalloc_to_page() instead.
>
> The reference code for mmap() and get_sgtable() was copied
> from kernel/dma/ops_helpers.c and modified to provide
> the additional detection described above.
>
> In order to simplify the code, a new
> dma_cpu_addr_to_page() helper was added.
>
> Signed-off-by: Roman Skakun 
> Reviewed-by: Andrii Anisov 
> ---
>  drivers/xen/swiotlb-xen.c | 42 +++
>  1 file changed, 34 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 90bc5fc321bc..9331a8500547 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -118,6 +118,14 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr)
>   return 0;
>  }
>  
> +static struct page *cpu_addr_to_page(void *cpu_addr)
> +{
> + if (is_vmalloc_addr(cpu_addr))
> + return vmalloc_to_page(cpu_addr);
> + else
> + return virt_to_page(cpu_addr);
> +}
> +
>  static int
>  xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  {
> @@ -337,7 +345,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>   int order = get_order(size);
>   phys_addr_t phys;
>   u64 dma_mask = DMA_BIT_MASK(32);
> - struct page *page;
> + struct page *page = cpu_addr_to_page(vaddr);
>  
>   if (hwdev && hwdev->coherent_dma_mask)
>   dma_mask = hwdev->coherent_dma_mask;
> @@ -349,11 +357,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>  
> - if (is_vmalloc_addr(vaddr))
> - page = vmalloc_to_page(vaddr);
> - else
> - page = virt_to_page(vaddr);
> -
>   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
>range_straddles_page_boundary(phys, size)) &&
>   TestClearPageXenRemapped(page))
> @@ -573,7 +576,23 @@ xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>void *cpu_addr, dma_addr_t dma_addr, size_t size,
>unsigned long attrs)
>  {
> - return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
> + unsigned long user_count = vma_pages(vma);
> + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> + unsigned long off = vma->vm_pgoff;
> + struct page *page = cpu_addr_to_page(cpu_addr);
> + int ret;
> +
> + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> +
> +	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> + return ret;
> +
> + if (off >= count || user_count > count - off)
> + return -ENXIO;
> +
> + return remap_pfn_range(vma, vma->vm_start,
> + page_to_pfn(page) + vma->vm_pgoff,
> + user_count << PAGE_SHIFT, vma->vm_page_prot);
>  }


I wonder now whether we could avoid code duplication between here and 
dma_common_mmap()/dma_common_get_sgtable() and use your helper there.


Christoph, would that work?  I.e. something like


diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
index 910ae69cae77..43411c2fa47b 100644
--- a/kernel/dma/ops_helpers.c
+++ b/kernel/dma/ops_helpers.c
@@ -12,7 +12,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   struct page *page = virt_to_page(cpu_addr);
+   struct page *page = cpu_addr_to_page(cpu_addr);
    int ret;
 
    ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
@@ -43,7 +43,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
    return -ENXIO;
 
    return remap_pfn_range(vma, vma->vm_start,
-   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
+   page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
    user_count << PAGE_SHIFT, vma->vm_page_prot);
 #else
    return -ENXIO;


-boris


>  
>  /*
> @@ -585,7 +604,14 @@ xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
>   void *cpu_addr, dma_addr_t handle, size_t size,
>   unsigned long attrs)
>  {
> - return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
> + struct page *page = cpu_addr_to_page(cpu_addr);
> + int ret;
> +
> + ret = sg_alloc_table(sgt, 1, 

Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-14 Thread Boris Ostrovsky


On 6/14/21 8:47 AM, Roman Skakun wrote:
> Hey, Boris!
> Thanks for the review.
>
> I have an additional question about the current implementation that bothers me.
> I think that we can have cases where the mapped memory is not
> physically contiguous.
> To handle this correctly, some additional steps need to be applied
> to the current implementation as well.
>
> In mmap():
> 1. Is the memory region physically contiguous?
> 2. Remap multiple ranges if it is not.
>
> In get_sgtable():
> 1. Is the memory region physically contiguous?
> 2. Create an sgt that includes multiple contiguous ranges if it is not.
>
> What do you think about it?


We make sure that we allocate contiguous memory in xen_swiotlb_alloc_coherent().


-boris




Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-11 Thread Boris Ostrovsky


On 6/11/21 5:55 AM, Roman Skakun wrote:
>  
> +static int
> +xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
> + void *cpu_addr, dma_addr_t dma_addr, size_t size,
> + unsigned long attrs)
> +{
> + unsigned long user_count = vma_pages(vma);
> + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> + unsigned long off = vma->vm_pgoff;
> + struct page *page;
> +
> + if (is_vmalloc_addr(cpu_addr))
> + page = vmalloc_to_page(cpu_addr);
> + else
> + page = virt_to_page(cpu_addr);
> +
> + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> +
> +	if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> + return -ENXIO;
> +
> + if (off >= count || user_count > count - off)
> + return -ENXIO;
> +
> + return remap_pfn_range(vma, vma->vm_start,
> + page_to_pfn(page) + vma->vm_pgoff,
> + user_count << PAGE_SHIFT, vma->vm_page_prot);
> +}


I suggest you create a helper for computing the page value, then revert
922659ea771b3fd728149262c5ea15608fab9719 and pass the result of the helper
instead of cpu_addr. Here and in xen_swiotlb_dma_get_sgtable().
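
For example (this matches the cpu_addr_to_page() helper that appears in the
later version of the patch quoted earlier in this archive):

static struct page *cpu_addr_to_page(void *cpu_addr)
{
	if (is_vmalloc_addr(cpu_addr))
		return vmalloc_to_page(cpu_addr);
	else
		return virt_to_page(cpu_addr);
}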


And use this new helper in xen_swiotlb_free_coherent() too. I am curious though 
why this was not a problem when Stefano was looking at the problem that 
introduced this vmalloc check (i.e. 8b1e868f66076490189a36d984fcce286cdd6295). 
Stefano?


-boris


Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM

2021-06-03 Thread Boris Ostrovsky


On 6/3/21 11:37 AM, Tianyu Lan wrote:
>
> Yes, the dependency is between hyperv_swiotlb_detect() and
> pci_swiotlb_detect_override()/pci_swiotlb_detect_4gb(). Now
> pci_swiotlb_detect_override() and pci_swiotlb_detect_4gb() depends on
> pci_xen_swiotlb_detect(). To keep dependency between
> hyperv_swiotlb_detect() and pci_swiotlb_detect_override/4gb(), make 
> pci_xen_swiotlb_detect() depends on hyperv_swiotlb_detect() and just to
> keep order in the IOMMU table. Current iommu_table_entry only has one
> depend callback and this is why I put xen depends on hyperv detect function.
>

Ah, ok. Thanks.



-boris



Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM

2021-06-02 Thread Boris Ostrovsky

On 6/2/21 11:01 AM, Tianyu Lan wrote:
> Hi Boris:
> Thanks for your review.
>
> On 6/2/2021 9:16 AM, Boris Ostrovsky wrote:
>>
>> On 5/30/21 11:06 AM, Tianyu Lan wrote:
>>> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>>>   EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>>>     IOMMU_INIT_FINISH(2,
>>> -  NULL,
>>> +  hyperv_swiotlb_detect,
>>>     pci_xen_swiotlb_init,
>>>     NULL);
>>
>>
>> Could you explain this change?
>
> Hyper-V allocates its own swiotlb bounce buffer and the default
> swiotlb buffer should not be allocated. swiotlb_init() in pci_swiotlb_init()
> is what allocates the default swiotlb buffer.
> To achieve this, put hyperv_swiotlb_detect() as the first entry in the
> iommu_table_entry list. The detect loop in pci_iommu_alloc() will exit
> once hyperv_swiotlb_detect() is called in a Hyper-V VM, and the other
> iommu_table_entry callbacks will not be called.



Right. But pci_xen_swiotlb_detect() will only do something for Xen PV guests,
and those guests don't run on Hyper-V. It's either xen_pv_domain() (i.e.
hypervisor_is_type(X86_HYPER_XEN_PV)) or
hypervisor_is_type(X86_HYPER_MS_HYPERV), but never both. So I don't think
there needs to be a dependency between the two callbacks.
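
In other words (a sketch; hypervisor_is_type() and the X86_HYPER_* constants
are the existing arch/x86 guest-type checks):

/* sketch: the two detect callbacks key off mutually exclusive guest types */
static bool detects_overlap(void)
{
	bool xen_pv = hypervisor_is_type(X86_HYPER_XEN_PV);     /* Xen path */
	bool hyperv = hypervisor_is_type(X86_HYPER_MS_HYPERV);  /* Hyper-V path */

	return xen_pv && hyperv;	/* always false: a guest is one type */
}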



-boris


Re: [RFC PATCH V3 09/11] HV/IOMMU: Enable swiotlb bounce buffer for Isolation VM

2021-06-01 Thread Boris Ostrovsky


On 5/30/21 11:06 AM, Tianyu Lan wrote:
> @@ -91,6 +92,6 @@ int pci_xen_swiotlb_init_late(void)
>  EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
>  
>  IOMMU_INIT_FINISH(2,
> -   NULL,
> +   hyperv_swiotlb_detect,
> pci_xen_swiotlb_init,
> NULL);


Could you explain this change?


-boris





Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays

2021-02-19 Thread Boris Ostrovsky


On 2/19/21 3:32 PM, Konrad Rzeszutek Wilk wrote:
> On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
>> On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
>>> So one thing that has been on my mind for a while:  I'd really like
>>> to kill the separate dma ops in Xen swiotlb.  If we compare xen-swiotlb
>>> to swiotlb the main difference seems to be:
>>>
>>>  - additional reasons to bounce I/O vs the plain DMA capable
>>>  - the possibility to do a hypercall on arm/arm64
>>>  - an extra translation layer before doing the phys_to_dma and vice
>>>versa
>>>  - an special memory allocator
>>>
>>> I wonder if inbetween a few jump labels or other no overhead enablement
>>> options and possibly better use of the dma_range_map we could kill
>>> off most of swiotlb-xen instead of maintaining all this code duplication?
>> So I looked at this a bit more.
>>
>> For x86 with XENFEAT_auto_translated_physmap (how common is that?)
> Juergen, Boris please correct me if I am wrong, but that 
> XENFEAT_auto_translated_physmap
> only works for PVH guests?


That's both HVM and PVH (for dom0 it's only PVH).


-boris



>
>> pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
>>
>> xen_arch_need_swiotlb always returns true for x86, and
>> range_straddles_page_boundary should never be true for the
>> XENFEAT_auto_translated_physmap case.
> Correct. The kernel should have no clue of what the real MFNs are
> for PFNs.
>> So as far as I can tell the mapping fast path for the
>> XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
>>
>> That leaves us with the next more complicated case, x86 or fully cache
>> coherent arm{,64} without XENFEAT_auto_translated_physmap.  In that case
>> we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
>> lookup, which could be done using alternatives or jump labels.
>> I think if that is done right we should also be able to let that cover
>> the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
>> in that worst case that would need another alternative / jump label.
>>
>> For non-coherent arm{,64} we'd also need to use alternatives or jump
>> labels for the cache maintenance ops, but that isn't a hard problem
>> either.
>>
>>
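
A rough sketch of the jump-label idea for the MFN-translating case (the
static-key API is the existing kernel one; the xen_* lookup name here is
hypothetical):

DEFINE_STATIC_KEY_FALSE(xen_mfn_translation);	/* enabled on PV boots */

static inline dma_addr_t example_phys_to_dma(struct device *dev,
					     phys_addr_t paddr)
{
	/* Patched to a near-zero-cost branch once the key is flipped. */
	if (static_branch_unlikely(&xen_mfn_translation))
		return xen_phys_to_dma_mfn(dev, paddr);	/* hypothetical MFN lookup */
	return phys_to_dma(dev, paddr);			/* common direct path */
}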


Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance

2019-09-06 Thread Boris Ostrovsky
On 9/6/19 10:43 AM, Konrad Rzeszutek Wilk wrote:
> On Fri, Sep 06, 2019 at 10:19:01AM -0400, Boris Ostrovsky wrote:
>> On 9/6/19 10:01 AM, Christoph Hellwig wrote:
>>> On Fri, Sep 06, 2019 at 09:52:12AM -0400, Boris Ostrovsky wrote:
>>>> We need nop definitions of these two for x86.
>>>>
>>>> Everything builds now but that's probably because the calls are under
>>>> 'if (!dev_is_dma_coherent(dev))' which is always false, so the compiler
>>>> optimizes them out. I don't think we should rely on that.
>>> That is how a lot of the kernel works.  Provide prototypes only for code
>>> that is semantically compiled, but can't ever be called due to
>>> IS_ENABLED() checks.  It took me a while to get used to it, but it
>>> actually is pretty nice, as the linker does the work for you to check
>>> that it really is never called.  Much better than, say, a BUILD_BUG_ON().
>>
>> (with corrected Juergen's email)
>>
>> I know about IS_ENABLED() but I didn't realize that this is allowed for
>> compile-time inlines and such as well.
>>
>> Anyway, for non-ARM bits
>>
>> Reviewed-by: Boris Ostrovsky 
> Acked-by: Konrad Rzeszutek Wilk 
>
> as well.
>
> Albeit folks have tested this under x86 Xen with 'swiotlb=force' right?


Yes, I did.

-boris


>
> I can test it myself but it will take a couple of days.
>> If this goes via Xen tree then the first couple of patches need an ack
>> from ARM maintainers.
>>
>> -boris



Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance

2019-09-06 Thread Boris Ostrovsky
On 9/6/19 10:01 AM, Christoph Hellwig wrote:
> On Fri, Sep 06, 2019 at 09:52:12AM -0400, Boris Ostrovsky wrote:
>> We need nop definitions of these two for x86.
>>
>> Everything builds now but that's probably because the calls are under
>> 'if (!dev_is_dma_coherent(dev))' which is always false, so the compiler
>> optimizes them out. I don't think we should rely on that.
> That is how a lot of the kernel works.  Provide prototypes only for code
> that is semantically compiled, but can't ever be called due to
> IS_ENABLED() checks.  It took me a while to get used to it, but it
> actually is pretty nice, as the linker does the work for you to check
> that it really is never called.  Much better than, say, a BUILD_BUG_ON().


(with corrected Juergen's email)

I know about IS_ENABLED() but I didn't realize that this is allowed for
compile-time inlines and such as well.

Anyway, for non-ARM bits

Reviewed-by: Boris Ostrovsky 

If this goes via Xen tree then the first couple of patches need an ack
from ARM maintainers.

-boris


Re: [PATCH 09/11] swiotlb-xen: simplify cache maintainance

2019-09-06 Thread Boris Ostrovsky
On 9/5/19 7:34 AM, Christoph Hellwig wrote:
> diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
> index 5e4b83f83dbc..d71380f6ed0b 100644
> --- a/include/xen/swiotlb-xen.h
> +++ b/include/xen/swiotlb-xen.h
> @@ -4,6 +4,11 @@
>  
>  #include 
>  
> +void xen_dma_sync_for_cpu(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> +void xen_dma_sync_for_device(struct device *dev, dma_addr_t handle,
> + phys_addr_t paddr, size_t size, enum dma_data_direction dir);
> +
>  extern int xen_swiotlb_init(int verbose, bool early);
>  extern const struct dma_map_ops xen_swiotlb_dma_ops;


We need nop definitions of these two for x86.

Everything builds now, but that's probably because the calls are under
'if (!dev_is_dma_coherent(dev))', which is always false, so the compiler
optimizes them out. I don't think we should rely on that.

-boris
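
A minimal, self-contained illustration of the pattern under discussion (all
names hypothetical):

/* Prototype only -- no definition exists in this configuration. */
void arch_dma_sync(void);

#define ARCH_HAS_DMA_SYNC 0	/* stand-in for an IS_ENABLED() check */

void caller(void)
{
	/* The branch is compile-time false, so the call is dead code:
	 * the compiler drops it and the linker never needs the symbol,
	 * which is why the build links without a nop definition. */
	if (ARCH_HAS_DMA_SYNC)
		arch_dma_sync();
}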



Re: [PATCH v2] swiotlb-xen: Convert to use macro

2019-09-06 Thread Boris Ostrovsky
On 9/6/19 8:27 AM, Souptick Joarder wrote:
> On Mon, Sep 2, 2019 at 2:04 PM Souptick Joarder  wrote:
>> Rather than using a static int max_dma_bits, this
>> can be converted to a macro.
>>
>> Signed-off-by: Souptick Joarder 
>> Reviewed-by: Juergen Gross 
> If it is not too late, can we get this patch queued for 5.4?


Yes, I will queue it later today.

-boris


Re: [PATCH 08/13] swiotlb-xen: always use dma-direct helpers to alloc coherent pages

2019-09-03 Thread Boris Ostrovsky
(Now with correct address for Juergen)

On 9/3/19 6:15 PM, Boris Ostrovsky wrote:
> On 9/2/19 9:03 AM, Christoph Hellwig wrote:
>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>> index b8808677ae1d..f9dd4cb6e4b3 100644
>> --- a/drivers/xen/swiotlb-xen.c
>> +++ b/drivers/xen/swiotlb-xen.c
>> @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>>   * address. In fact on ARM virt_to_phys only works for kernel direct
>>   * mapped RAM memory. Also see comment below.
>>   */
>> -ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
>> -
>> +ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);
>
> If I am reading __dma_direct_alloc_pages() correctly there is a path
> that will force us to use GFP_DMA32 and as Juergen pointed out in
> another message that may not be desirable.
>
> -boris
>
>
>



Re: [PATCH 08/13] swiotlb-xen: always use dma-direct helpers to alloc coherent pages

2019-09-03 Thread Boris Ostrovsky
On 9/2/19 9:03 AM, Christoph Hellwig wrote:
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b8808677ae1d..f9dd4cb6e4b3 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -299,8 +299,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>* address. In fact on ARM virt_to_phys only works for kernel direct
>* mapped RAM memory. Also see comment below.
>*/
> - ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs);
> -
> + ret = dma_direct_alloc(hwdev, size, dma_handle, flags, attrs);


If I am reading __dma_direct_alloc_pages() correctly there is a path
that will force us to use GFP_DMA32 and as Juergen pointed out in
another message that may not be desirable.

-boris





Re: [PATCH v2 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()

2019-05-30 Thread Boris Ostrovsky
On 5/30/19 5:04 AM, Christoph Hellwig wrote:
> Please don't add your private flag to page-flags.h.  The whole point of
> the private flag is that you can use it in any way you want without
> touching the common code.


There is already a bunch of aliases for various sub-components
(including Xen) in page-flags.h for private flags, which is why I
suggested we do the same for the new use case. Adding this new alias
will keep flag usage consistent.

-boris


Re: [PATCH 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()

2019-04-23 Thread Boris Ostrovsky
On 4/23/19 6:54 AM, Juergen Gross wrote:
> Instead of always calling xen_destroy_contiguous_region() in case the
> memory is DMA-able for the used device, do so only in case it has been
> made DMA-able via xen_create_contiguous_region() before.
>
> This will avoid a lot of xen_destroy_contiguous_region() calls for
> 64-bit capable devices.
>
> As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
> flag of the first allocated page can be used for remembering.

I think a new enum in pageflags would be useful, and be consistent with
other flag uses.

-boris
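
Illustratively, such an alias in include/linux/page-flags.h could look like
this (the names are a sketch, modeled on the existing PG_owner_priv_1
aliases):

/* in enum pageflags, next to the other PG_owner_priv_1 aliases: */
	PG_xen_remapped = PG_owner_priv_1,

/* and the accessors, alongside the other PAGEFLAG() users: */
PAGEFLAG(XenRemapped, xen_remapped, PF_NO_COMPOUND)
	TESTCLEARFLAG(XenRemapped, xen_remapped, PF_NO_COMPOUND)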


>
> Signed-off-by: Juergen Gross 
> ---
>  drivers/xen/swiotlb-xen.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 43b6e65ae256..a72f181d8e20 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -321,6 +321,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
>   xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>   return NULL;
>   }
> + SetPageOwnerPriv1(virt_to_page(ret));
>   }
>   memset(ret, 0, size);
>   return ret;
> @@ -344,9 +345,11 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>  
> - if ((dev_addr + size - 1 <= dma_mask) &&
> - !WARN_ON(range_straddles_page_boundary(phys, size)))
> - xen_destroy_contiguous_region(phys, order);
> + if (PageOwnerPriv1(virt_to_page(vaddr))) {
> + if (!WARN_ON(range_straddles_page_boundary(phys, size)))
> + xen_destroy_contiguous_region(phys, order);
> + ClearPageOwnerPriv1(virt_to_page(vaddr));
> + }
>  
>   xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
>  }



Re: [PATCH 2/3] xen/swiotlb: simplify range_straddles_page_boundary()

2019-04-23 Thread Boris Ostrovsky
On 4/23/19 6:54 AM, Juergen Gross wrote:
> range_straddles_page_boundary() is open coding several macros from
> include/xen/page.h. Use those instead. Additionally there is no need
> to have check_pages_physically_contiguous() as a separate function as
> it is used only once, so merge it into range_straddles_page_boundary().
>
> Signed-off-by: Juergen Gross 

Reviewed-by: Boris Ostrovsky 




Re: [PATCH 1/3] xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()

2019-04-23 Thread Boris Ostrovsky
On 4/23/19 6:54 AM, Juergen Gross wrote:
> The condition in xen_swiotlb_free_coherent() for deciding whether to
> call xen_destroy_contiguous_region() is wrong: in case the region to
> be freed is not contiguous calling xen_destroy_contiguous_region() is
> the wrong thing to do: it would result in inconsistent mappings of
> multiple PFNs to the same MFN. This will lead to various strange
> crashes or data corruption.
>
> Instead of calling xen_destroy_contiguous_region() in that case a
> warning should be issued as that situation should never occur.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Juergen Gross 

Reviewed-by: Boris Ostrovsky 
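
For readers following along: after this fix the free path only destroys
regions that were actually made contiguous, and the check reads (as can
be seen in the context lines of patch 3/3 above):

	if ((dev_addr + size - 1 <= dma_mask) &&
	    !WARN_ON(range_straddles_page_boundary(phys, size)))
		xen_destroy_contiguous_region(phys, order);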




Re: [PATCH -next] xen-swiotlb: Make two functions static

2019-04-16 Thread Boris Ostrovsky
On 4/16/19 10:49 AM, Yue Haibing wrote:
> From: YueHaibing 
>
> Fix sparse warnings:
>
> drivers/xen/swiotlb-xen.c:489:1: warning:
>  symbol 'xen_swiotlb_sync_single_for_cpu' was not declared. Should it be static?
> drivers/xen/swiotlb-xen.c:496:1: warning:
>  symbol 'xen_swiotlb_sync_single_for_device' was not declared. Should it be static?
>
> Reported-by: Hulk Robot 
> Signed-off-by: YueHaibing 


I think the latest patches from Christoph take care of this.

-boris


> ---
>  drivers/xen/swiotlb-xen.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 877baf2..e741df4 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -485,14 +485,14 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
>   xen_dma_sync_single_for_device(hwdev, dev_addr, size, dir);
>  }
>  
> -void
> +static void
>  xen_swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr,
>   size_t size, enum dma_data_direction dir)
>  {
>   xen_swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU);
>  }
>  
> -void
> +static void
>  xen_swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr,
>  size_t size, enum dma_data_direction dir)
>  {



Re: [PATCH 0/9] Use vm_insert_range

2018-11-21 Thread Boris Ostrovsky
On 11/21/18 2:56 PM, Souptick Joarder wrote:
> On Thu, Nov 22, 2018 at 1:08 AM Boris Ostrovsky
>  wrote:
>> On 11/21/18 1:24 AM, Souptick Joarder wrote:
>>> On Thu, Nov 15, 2018 at 9:09 PM Souptick Joarder  
>>> wrote:
>>>> Previously drivers had their own way of mapping a range of
>>>> kernel pages/memory into a user vma, and this was done by
>>>> invoking vm_insert_page() within a loop.
>>>>
>>>> As this pattern is common across different drivers, it can
>>>> be generalized by creating a new function and using it across
>>>> the drivers.
>>>>
>>>> vm_insert_range is the new API which will be used to map a
>>>> range of kernel memory/pages to a user vma.
>>>>
>>>> All the applicable places are converted to use new vm_insert_range
>>>> in this patch series.
>>>>
>>>> Souptick Joarder (9):
>>>>   mm: Introduce new vm_insert_range API
>>>>   arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range
>>>>   drivers/firewire/core-iso.c: Convert to use vm_insert_range
>>>>   drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range
>>>>   drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range
>>>>   iommu/dma-iommu.c: Convert to use vm_insert_range
>>>>   videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range
>>>>   xen/gntdev.c: Convert to use vm_insert_range
>>>>   xen/privcmd-buf.c: Convert to use vm_insert_range
>>> Any further comment on driver changes ?
>> Xen drivers (the last two patches) look fine to me.
> Thanks, can I consider this as Reviewed-by?


Reviewed-by: Boris Ostrovsky 
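
For context, the proposed helper is essentially the open-coded loop
moved into mm/; a minimal sketch (signature per the cover letter, body
illustrative):

	int vm_insert_range(struct vm_area_struct *vma, unsigned long addr,
			    struct page **pages, unsigned long page_count)
	{
		unsigned long i;

		for (i = 0; i < page_count; i++) {
			int ret = vm_insert_page(vma, addr, pages[i]);

			if (ret < 0)
				return ret;
			addr += PAGE_SIZE;	/* map the pages back to back */
		}
		return 0;
	}

Callers such as gntdev.c and privcmd-buf.c then replace their per-page
loops with a single call, which is what the two Xen patches do.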


Re: [PATCH 0/9] Use vm_insert_range

2018-11-21 Thread Boris Ostrovsky
On 11/21/18 1:24 AM, Souptick Joarder wrote:
> On Thu, Nov 15, 2018 at 9:09 PM Souptick Joarder  wrote:
>> Previously drivers had their own way of mapping a range of
>> kernel pages/memory into a user vma, and this was done by
>> invoking vm_insert_page() within a loop.
>>
>> As this pattern is common across different drivers, it can
>> be generalized by creating a new function and using it across
>> the drivers.
>>
>> vm_insert_range is the new API which will be used to map a
>> range of kernel memory/pages to a user vma.
>>
>> All the applicable places are converted to use new vm_insert_range
>> in this patch series.
>>
>> Souptick Joarder (9):
>>   mm: Introduce new vm_insert_range API
>>   arch/arm/mm/dma-mapping.c: Convert to use vm_insert_range
>>   drivers/firewire/core-iso.c: Convert to use vm_insert_range
>>   drm/rockchip/rockchip_drm_gem.c: Convert to use vm_insert_range
>>   drm/xen/xen_drm_front_gem.c: Convert to use vm_insert_range
>>   iommu/dma-iommu.c: Convert to use vm_insert_range
>>   videobuf2/videobuf2-dma-sg.c: Convert to use vm_insert_range
>>   xen/gntdev.c: Convert to use vm_insert_range
>>   xen/privcmd-buf.c: Convert to use vm_insert_range
> Any further comment on driver changes ?

Xen drivers (the last two patches) look fine to me.

-boris


>>  arch/arm/mm/dma-mapping.c | 21 ++---
>>  drivers/firewire/core-iso.c   | 15 ++--
>>  drivers/gpu/drm/rockchip/rockchip_drm_gem.c   | 20 ++--
>>  drivers/gpu/drm/xen/xen_drm_front_gem.c   | 20 +---
>>  drivers/iommu/dma-iommu.c | 12 ++
>>  drivers/media/common/videobuf2/videobuf2-dma-sg.c | 23 ++-
>>  drivers/xen/gntdev.c  | 11 -
>>  drivers/xen/privcmd-buf.c |  8 ++-
>>  include/linux/mm_types.h  |  3 +++
>>  mm/memory.c   | 28 +++
>>  mm/nommu.c|  7 ++
>>  11 files changed, 70 insertions(+), 98 deletions(-)
>>
>> --
>> 1.9.1
>>



Re: [Xen-devel] [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-09 Thread Boris Ostrovsky

>>
>> PV guests don't go through Linux x86 early boot code. They start at
>> xen_start_kernel() (well, xen-head.S:startup_xen(), really) and  merge
>> with baremetal path at x86_64_start_reservations() (for 64-bit).
>>
>
> Ok, I don't think anything needs to be done then. The sme_me_mask is set
> in sme_enable() which is only called from head_64.S. If the sme_me_mask
> isn't set then SME won't be active. The feature will just report the
> capability of the processor, but that doesn't mean it is active. If you
> still want the feature to be clobbered we can do that, though.

I'd prefer to explicitly clear to avoid any ambiguity.

-boris


Re: [Xen-devel] [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-09 Thread Boris Ostrovsky
On 06/09/2017 02:36 PM, Tom Lendacky wrote:
> On 6/8/2017 5:01 PM, Andrew Cooper wrote:
>> On 08/06/2017 22:17, Boris Ostrovsky wrote:
>>> On 06/08/2017 05:02 PM, Tom Lendacky wrote:
>>>> On 6/8/2017 3:51 PM, Boris Ostrovsky wrote:
>>>>>>> What may be needed is making sure X86_FEATURE_SME is not set for PV
>>>>>>> guests.
>>>>>> And that may be something that Xen will need to control through
>>>>>> either
>>>>>> CPUID or MSR support for the PV guests.
>>>>>
>>>>> Only on newer versions of Xen. On earlier versions (2-3 years old)
>>>>> leaf 0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.
>>>> The SME feature is in leaf 0x8000001f, is that leaf passed to the
>>>> guest unchanged?
>>> Oh, I misread the patch where X86_FEATURE_SME is defined. Then all
>>> versions, including the current one, pass it unchanged.
>>>
>>> All that's needed is setup_clear_cpu_cap(X86_FEATURE_SME) in
>>> xen_init_capabilities().
>>
>> AMD processors still don't support CPUID Faulting (or at least, I
>> couldn't find any reference to it in the latest docs), so we cannot
>> actually hide SME from a guest which goes looking at native CPUID.
>> Furthermore, I'm not aware of any CPUID masking support covering that
>> leaf.
>>
>> However, if Linux is using the paravirtual cpuid hook, things are
>> slightly better.
>>
>> On Xen 4.9 and later, no guests will see the feature.  On earlier
>> versions of Xen (before I fixed the logic), plain domUs will not see the
>> feature, while dom0 will.
>>
>> For safety, I'd recommend unilaterally clobbering the feature as Boris
>> suggested.  There is no way SME will be supportable on a per-PV guest
>
> That may be too late. Early boot support in head_64.S will make calls to
> check for the feature (through CPUID and MSR), set the sme_me_mask and
> encrypt the kernel in place. Is there another way to approach this?


PV guests don't go through Linux x86 early boot code. They start at
xen_start_kernel() (well, xen-head.S:startup_xen(), really) and  merge
with baremetal path at x86_64_start_reservations() (for 64-bit).


-boris

>
>> basis, although (as far as I am aware) Xen as a whole would be able to
>> encompass itself and all of its PV guests inside one single SME
>> instance.
>
> Yes, that is correct.
>
> Thanks,
> Tom
>
>>
>> ~Andrew
>>



Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-08 Thread Boris Ostrovsky
On 06/08/2017 05:02 PM, Tom Lendacky wrote:
> On 6/8/2017 3:51 PM, Boris Ostrovsky wrote:
>>
>>>
>>>> What may be needed is making sure X86_FEATURE_SME is not set for PV
>>>> guests.
>>>
>>> And that may be something that Xen will need to control through either
>>> CPUID or MSR support for the PV guests.
>>
>>
>> Only on newer versions of Xen. On earlier versions (2-3 years old) leaf
>> 0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.
>
> The SME feature is in leaf 0x8000001f, is that leaf passed to the guest
> unchanged?

Oh, I misread the patch where X86_FEATURE_SME is defined. Then all
versions, including the current one, pass it unchanged.

All that's needed is setup_clear_cpu_cap(X86_FEATURE_SME) in
xen_init_capabilities().
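
I.e., something like this next to the other setup_clear_cpu_cap() calls
(a sketch; only the SME line is the actual suggestion):

	/* arch/x86/xen/enlighten_pv.c */
	static void __init xen_init_capabilities(void)
	{
		setup_force_cpu_cap(X86_FEATURE_XENPV);
		/* ... existing clears (DCA, MTRR, ...) ... */

		/* SME is not usable by a PV guest; never advertise it. */
		setup_clear_cpu_cap(X86_FEATURE_SME);
	}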


-boris


Re: [PATCH v6 10/34] x86, x86/mm, x86/xen, olpc: Use __va() against just the physical address in cr3

2017-06-08 Thread Boris Ostrovsky

>
>> What may be needed is making sure X86_FEATURE_SME is not set for PV
>> guests.
>
> And that may be something that Xen will need to control through either
> CPUID or MSR support for the PV guests.


Only on newer versions of Xen. On earlier versions (2-3 years old) leaf
0x80000007 is passed to the guest unchanged. And so is MSR_K8_SYSCFG.

-boris