Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-28 Thread Alexey Kardashevskiy




On 28/10/2021 08:30, Brian King wrote:

On 10/26/21 12:39 AM, Alexey Kardashevskiy wrote:



On 10/26/21 01:40, Brian King wrote:

On 10/23/21 7:18 AM, Alexey Kardashevskiy wrote:



On 23/10/2021 07:18, Brian King wrote:

On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:



On 22/10/2021 04:44, Brian King wrote:

If ibm,pmemory is installed in the system, it can appear anywhere
in the address space. This patch enhances how we handle DMA for devices when
ibm,pmemory is present. In the case where we have enough DMA space to
direct map all of RAM, but not ibm,pmemory, we use direct DMA for
I/O to RAM and use the default window to dynamically map ibm,pmemory.
In the case where we only have a single DMA window, this won't work, so if
the window is not big enough to map the entire address range,
we cannot direct map.


but we want the pmem range to be mapped into the huge DMA window too if we can, 
why skip it?


This patch should simply do what the comment in this commit mentioned below 
suggests, which says that
ibm,pmemory can appear anywhere in the address space. If the DMA window is 
large enough
to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
everything,
including the pmem. If we do not have a big enough window to do that, we will do
direct DMA for DRAM and dynamic mapping for pmem.



Right, and this is what we do already, do we not? Am I missing something here?


The upstream code does not work correctly, as far as I can see. If I boot an upstream kernel
with an nvme device and vpmem assigned to the LPAR, and enable dev_dbg in
arch/powerpc/platforms/pseries/iommu.c, I see the following in the logs:

[2.157549] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 2121 returned 0
[2.157561] nvme 0121:50:00.0: Skipping ibm,pmemory
[2.157567] nvme 0121:50:00.0: can't map partition max 0x8 with 16777216 65536-sized pages
[2.170150] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 2121 10 28 returned 0 (liobn = 0x7121 starting addr = 800 0)
[2.170170] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for /pci@8002121/pci1014,683@0
[2.356260] nvme 0121:50:00.0: node is /pci@8002121/pci1014,683@0

This means we are heading down the leg in enable_ddw where we do not set direct_mapping
to true. We still create the DDW window, but don't do any direct DMA. This is because the
window is not large enough to map 2PB of memory, which is what ddw_memory_hotplug_max
returns without my patch.

With my patch applied, I get this in the logs:

[2.204866] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 2121 returned 0
[2.204875] nvme 0121:50:00.0: Skipping ibm,pmemory
[2.205058] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 2121 10 21 returned 0 (liobn = 0x7121 starting addr = 800 0)
[2.205068] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for /pci@8002121/pci1014,683@0
[2.215898] nvme 0121:50:00.0: iommu: 64-bit OK but direct DMA is limited by 802




ah I see. then...




Thanks,

Brian






https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48


Thanks,

Brian







Signed-off-by: Brian King 
---
    arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
    1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c 
b/arch/powerpc/platforms/pseries/iommu.c
index 269f61d519c2..d9ae985d10a4 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
    phys_addr_t max_addr = memory_hotplug_max();
    struct device_node *memory;
    -    /*
- * The "ibm,pmemory" can appear anywhere in the address space.
- * Assuming it is still backed by page structs, set the upper limit
- * for the huge DMA window as MAX_PHYSMEM_BITS.
- */
-    if (of_find_node_by_type(NULL, "ibm,pmemory"))
-    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
-    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
-
    for_each_node_by_type(memory, "memory") {
    unsigned long start, size;
    int n_mem_addr_cells, n_mem_size_cells, len;
@@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
     */
    len = max_ram_len;
    if (pmem_present) {
+    if (default_win_removed) {
+    /*
+ * If we only have one DMA window and have pmem present,
+ * then we need to be able to map the entire address
+ * range in order to be able to do direct DMA to RAM.
+ */
+    len = order_base_2((sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
+    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS));




Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-27 Thread Brian King
On 10/26/21 12:39 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 10/26/21 01:40, Brian King wrote:
>> On 10/23/21 7:18 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 23/10/2021 07:18, Brian King wrote:
 On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:
>
>
> On 22/10/2021 04:44, Brian King wrote:
>> If ibm,pmemory is installed in the system, it can appear anywhere
>> in the address space. This patch enhances how we handle DMA for devices 
>> when
>> ibm,pmemory is present. In the case where we have enough DMA space to
>> direct map all of RAM, but not ibm,pmemory, we use direct DMA for
>> I/O to RAM and use the default window to dynamically map ibm,pmemory.
>> In the case where we only have a single DMA window, this won't work,
>> so if the window is not big enough to map the entire address range,
>> we cannot direct map.
>
> but we want the pmem range to be mapped into the huge DMA window too if 
> we can, why skip it?

 This patch should simply do what the comment in this commit mentioned 
 below suggests, which says that
 ibm,pmemory can appear anywhere in the address space. If the DMA window is 
 large enough
 to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
 everything,
 including the pmem. If we do not have a big enough window to do that, we 
 will do
 direct DMA for DRAM and dynamic mapping for pmem.
>>>
>>>
>>> Right, and this is what we do already, do we not? Am I missing something here?
>>
>> The upstream code does not work correctly, as far as I can see. If I boot an
>> upstream kernel
>> with an nvme device and vpmem assigned to the LPAR, and enable dev_dbg in 
>> arch/powerpc/platforms/pseries/iommu.c,
>> I see the following in the logs:
>>
>> [2.157549] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 
>> 800 2121 returned 0
>> [2.157561] nvme 0121:50:00.0: Skipping ibm,pmemory
>> [2.157567] nvme 0121:50:00.0: can't map partition max 0x8 
>> with 16777216 65536-sized pages
>> [2.170150] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 
>> 800 2121 10 28 returned 0 (liobn = 0x7121 starting addr = 
>> 800 0)
>> [2.170170] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for 
>> /pci@8002121/pci1014,683@0
>> [2.356260] nvme 0121:50:00.0: node is /pci@8002121/pci1014,683@0
>>
>> This means we are heading down the leg in enable_ddw where we do not set
>> direct_mapping to true. We still create the DDW window, but don't do any direct
>> DMA. This is because the window is not large enough to map 2PB of memory, which
>> is what ddw_memory_hotplug_max returns without my patch.
>>
>> With my patch applied, I get this in the logs:
>>
>> [2.204866] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 
>> 800 2121 returned 0
>> [2.204875] nvme 0121:50:00.0: Skipping ibm,pmemory
>> [2.205058] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 
>> 800 2121 10 21 returned 0 (liobn = 0x7121 starting addr = 
>> 800 0)
>> [2.205068] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for 
>> /pci@8002121/pci1014,683@0
>> [2.215898] nvme 0121:50:00.0: iommu: 64-bit OK but direct DMA is limited 
>> by 802
>>
> 
> 
> ah I see. then...
> 
> 
>>
>> Thanks,
>>
>> Brian
>>
>>
>>>

 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48


 Thanks,

 Brian


>
>
>>
>> Signed-off-by: Brian King 
>> ---
>>    arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
>>    1 file changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c 
>> b/arch/powerpc/platforms/pseries/iommu.c
>> index 269f61d519c2..d9ae985d10a4 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
>>    phys_addr_t max_addr = memory_hotplug_max();
>>    struct device_node *memory;
>>    -    /*
>> - * The "ibm,pmemory" can appear anywhere in the address space.
>> - * Assuming it is still backed by page structs, set the upper limit
>> - * for the huge DMA window as MAX_PHYSMEM_BITS.
>> - */
>> -    if (of_find_node_by_type(NULL, "ibm,pmemory"))
>> -    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
>> -    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
>> -
>>    for_each_node_by_type(memory, "memory") {
>>    unsigned long start, size;
>>    int n_mem_addr_cells, n_mem_size_cells, len;
>> @@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, 
>> struct device_node 

Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-25 Thread Alexey Kardashevskiy



On 10/26/21 01:40, Brian King wrote:
> On 10/23/21 7:18 AM, Alexey Kardashevskiy wrote:
>>
>>
>> On 23/10/2021 07:18, Brian King wrote:
>>> On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:


 On 22/10/2021 04:44, Brian King wrote:
> If ibm,pmemory is installed in the system, it can appear anywhere
> in the address space. This patch enhances how we handle DMA for devices 
> when
> ibm,pmemory is present. In the case where we have enough DMA space to
> direct map all of RAM, but not ibm,pmemory, we use direct DMA for
> I/O to RAM and use the default window to dynamically map ibm,pmemory.
> In the case where we only have a single DMA window, this won't work, so
> if the window is not big enough to map the entire address range,
> we cannot direct map.

 but we want the pmem range to be mapped into the huge DMA window too if we 
 can, why skip it?
>>>
>>> This patch should simply do what the comment in this commit mentioned below 
>>> suggests, which says that
>>> ibm,pmemory can appear anywhere in the address space. If the DMA window is 
>>> large enough
>>> to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
>>> everything,
>>> including the pmem. If we do not have a big enough window to do that, we 
>>> will do
>>> direct DMA for DRAM and dynamic mapping for pmem.
>>
>>
>> Right, and this is what we do already, do we not? Am I missing something here?
> 
> The upstream code does not work correctly, as far as I can see. If I boot an
> upstream kernel
> with an nvme device and vpmem assigned to the LPAR, and enable dev_dbg in 
> arch/powerpc/platforms/pseries/iommu.c,
> I see the following in the logs:
> 
> [2.157549] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 
> 2121 returned 0
> [2.157561] nvme 0121:50:00.0: Skipping ibm,pmemory
> [2.157567] nvme 0121:50:00.0: can't map partition max 0x8 
> with 16777216 65536-sized pages
> [2.170150] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 
> 2121 10 28 returned 0 (liobn = 0x7121 starting addr = 800 0)
> [2.170170] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for 
> /pci@8002121/pci1014,683@0
> [2.356260] nvme 0121:50:00.0: node is /pci@8002121/pci1014,683@0
> 
> This means we are heading down the leg in enable_ddw where we do not set
> direct_mapping to true. We still create the DDW window, but don't do any direct
> DMA. This is because the window is not large enough to map 2PB of memory, which
> is what ddw_memory_hotplug_max returns without my patch.
> 
> With my patch applied, I get this in the logs:
> 
> [2.204866] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 
> 2121 returned 0
> [2.204875] nvme 0121:50:00.0: Skipping ibm,pmemory
> [2.205058] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 
> 2121 10 21 returned 0 (liobn = 0x7121 starting addr = 800 0)
> [2.205068] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for 
> /pci@8002121/pci1014,683@0
> [2.215898] nvme 0121:50:00.0: iommu: 64-bit OK but direct DMA is limited 
> by 802
> 


ah I see. then...


> 
> Thanks,
> 
> Brian
> 
> 
>>
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48
>>>
>>>
>>> Thanks,
>>>
>>> Brian
>>>
>>>


>
> Signed-off-by: Brian King 
> ---
>    arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
>    1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/iommu.c 
> b/arch/powerpc/platforms/pseries/iommu.c
> index 269f61d519c2..d9ae985d10a4 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
>    phys_addr_t max_addr = memory_hotplug_max();
>    struct device_node *memory;
>    -    /*
> - * The "ibm,pmemory" can appear anywhere in the address space.
> - * Assuming it is still backed by page structs, set the upper limit
> - * for the huge DMA window as MAX_PHYSMEM_BITS.
> - */
> -    if (of_find_node_by_type(NULL, "ibm,pmemory"))
> -    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
> -    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
> -
>    for_each_node_by_type(memory, "memory") {
>    unsigned long start, size;
>    int n_mem_addr_cells, n_mem_size_cells, len;
> @@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
> device_node *pdn)
>     */
>    len = max_ram_len;
>    if (pmem_present) {
> +    if (default_win_removed) {
> +    /*
> + * If we 

Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-25 Thread Brian King
On 10/23/21 7:18 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 23/10/2021 07:18, Brian King wrote:
>> On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 22/10/2021 04:44, Brian King wrote:
 If ibm,pmemory is installed in the system, it can appear anywhere
 in the address space. This patch enhances how we handle DMA for devices 
 when
 ibm,pmemory is present. In the case where we have enough DMA space to
 direct map all of RAM, but not ibm,pmemory, we use direct DMA for
 I/O to RAM and use the default window to dynamically map ibm,pmemory.
 In the case where we only have a single DMA window, this won't work, so
 if the window is not big enough to map the entire address range,
 we cannot direct map.
>>>
>>> but we want the pmem range to be mapped into the huge DMA window too if we 
>>> can, why skip it?
>>
>> This patch should simply do what the comment in this commit mentioned below 
>> suggests, which says that
>> ibm,pmemory can appear anywhere in the address space. If the DMA window is 
>> large enough
>> to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
>> everything,
>> including the pmem. If we do not have a big enough window to do that, we 
>> will do
>> direct DMA for DRAM and dynamic mapping for pmem.
> 
> 
> Right, and this is what we do already, do we not? Am I missing something here?

The upstream code does not work correctly, as far as I can see. If I boot an upstream kernel
with an nvme device and vpmem assigned to the LPAR, and enable dev_dbg in
arch/powerpc/platforms/pseries/iommu.c, I see the following in the logs:

[2.157549] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 2121 returned 0
[2.157561] nvme 0121:50:00.0: Skipping ibm,pmemory
[2.157567] nvme 0121:50:00.0: can't map partition max 0x8 with 16777216 65536-sized pages
[2.170150] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 2121 10 28 returned 0 (liobn = 0x7121 starting addr = 800 0)
[2.170170] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for /pci@8002121/pci1014,683@0
[2.356260] nvme 0121:50:00.0: node is /pci@8002121/pci1014,683@0

This means we are heading down the leg in enable_ddw where we do not set direct_mapping
to true. We still create the DDW window, but don't do any direct DMA. This is because the
window is not large enough to map 2PB of memory, which is what ddw_memory_hotplug_max
returns without my patch.
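
To put rough numbers on that (a back-of-the-envelope sketch, assuming MAX_PHYSMEM_BITS
is 51 here, which matches the 2PB figure, and using the 65536-byte pages and the
16777216-entry largest block reported in the query output above):

	#include <stdio.h>

	/* Sketch of why the direct-mapping check fails without the patch */
	int main(void)
	{
		unsigned long long partition_max = 1ULL << 51;  /* ~2PB, assumes MAX_PHYSMEM_BITS == 51 */
		unsigned long long tce_page      = 1ULL << 16;  /* 65536-byte IOMMU page */
		unsigned long long tces_needed   = partition_max / tce_page;  /* 2^35 */
		unsigned long long tces_avail    = 16777216ULL;  /* 2^24, from the query above */

		printf("need %llu TCEs, largest available block %llu\n",
		       tces_needed, tces_avail);
		/* 2^35 > 2^24, so enable_ddw() cannot direct map the whole
		 * partition and only sets up a dynamically mapped window. */
		return 0;
	}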

With my patch applied, I get this in the logs:

[2.204866] nvme 0121:50:00.0: ibm,query-pe-dma-windows(53) 50 800 2121 returned 0
[2.204875] nvme 0121:50:00.0: Skipping ibm,pmemory
[2.205058] nvme 0121:50:00.0: ibm,create-pe-dma-window(54) 50 800 2121 10 21 returned 0 (liobn = 0x7121 starting addr = 800 0)
[2.205068] nvme 0121:50:00.0: created tce table LIOBN 0x7121 for /pci@8002121/pci1014,683@0
[2.215898] nvme 0121:50:00.0: iommu: 64-bit OK but direct DMA is limited by 802


Thanks,

Brian


> 
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48
>>
>>
>> Thanks,
>>
>> Brian
>>
>>
>>>
>>>

 Signed-off-by: Brian King 
 ---
    arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
    1 file changed, 10 insertions(+), 9 deletions(-)

 diff --git a/arch/powerpc/platforms/pseries/iommu.c 
 b/arch/powerpc/platforms/pseries/iommu.c
 index 269f61d519c2..d9ae985d10a4 100644
 --- a/arch/powerpc/platforms/pseries/iommu.c
 +++ b/arch/powerpc/platforms/pseries/iommu.c
 @@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
    phys_addr_t max_addr = memory_hotplug_max();
    struct device_node *memory;
    -    /*
 - * The "ibm,pmemory" can appear anywhere in the address space.
 - * Assuming it is still backed by page structs, set the upper limit
 - * for the huge DMA window as MAX_PHYSMEM_BITS.
 - */
 -    if (of_find_node_by_type(NULL, "ibm,pmemory"))
 -    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
 -    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
 -
    for_each_node_by_type(memory, "memory") {
    unsigned long start, size;
    int n_mem_addr_cells, n_mem_size_cells, len;
 @@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
 device_node *pdn)
     */
    len = max_ram_len;
    if (pmem_present) {
 +    if (default_win_removed) {
 +    /*
 + * If we only have one DMA window and have pmem present,
 + * then we need to be able to map the entire address
 + * range in order to be able to do direct DMA to RAM.
 + */
 +

Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-23 Thread Alexey Kardashevskiy




On 23/10/2021 07:18, Brian King wrote:

On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:



On 22/10/2021 04:44, Brian King wrote:

If ibm,pmemory is installed in the system, it can appear anywhere
in the address space. This patch enhances how we handle DMA for devices when
ibm,pmemory is present. In the case where we have enough DMA space to
direct map all of RAM, but not ibm,pmemory, we use direct DMA for
I/O to RAM and use the default window to dynamically map ibm,pmemory.
In the case where we only have a single DMA window, this won't work, so if
the window is not big enough to map the entire address range,
we cannot direct map.


but we want the pmem range to be mapped into the huge DMA window too if we can, 
why skip it?


This patch should simply do what the comment in this commit mentioned below 
suggests, which says that
ibm,pmemory can appear anywhere in the address space. If the DMA window is 
large enough
to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
everything,
including the pmem. If we do not have a big enough window to do that, we will do
direct DMA for DRAM and dynamic mapping for pmem.



Right, and this is what we do already, do we not? Am I missing something here?



https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48


Thanks,

Brian







Signed-off-by: Brian King 
---
   arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
   1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c 
b/arch/powerpc/platforms/pseries/iommu.c
index 269f61d519c2..d9ae985d10a4 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
   phys_addr_t max_addr = memory_hotplug_max();
   struct device_node *memory;
   -    /*
- * The "ibm,pmemory" can appear anywhere in the address space.
- * Assuming it is still backed by page structs, set the upper limit
- * for the huge DMA window as MAX_PHYSMEM_BITS.
- */
-    if (of_find_node_by_type(NULL, "ibm,pmemory"))
-    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
-    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
-
   for_each_node_by_type(memory, "memory") {
   unsigned long start, size;
   int n_mem_addr_cells, n_mem_size_cells, len;
@@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
    */
   len = max_ram_len;
   if (pmem_present) {
+    if (default_win_removed) {
+    /*
+ * If we only have one DMA window and have pmem present,
+ * then we need to be able to map the entire address
+ * range in order to be able to do direct DMA to RAM.
+ */
+    len = order_base_2((sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
+    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS));
+    }
+
   if (query.largest_available_block >=
   (1ULL << (MAX_PHYSMEM_BITS - page_shift)))
   len = MAX_PHYSMEM_BITS;








--
Alexey


Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-22 Thread Brian King
On 10/22/21 7:24 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 22/10/2021 04:44, Brian King wrote:
>> If ibm,pmemory is installed in the system, it can appear anywhere
>> in the address space. This patch enhances how we handle DMA for devices when
>> ibm,pmemory is present. In the case where we have enough DMA space to
>> direct map all of RAM, but not ibm,pmemory, we use direct DMA for
>> I/O to RAM and use the default window to dynamically map ibm,pmemory.
>> In the case where we only have a single DMA window, this won't work, so if
>> the window is not big enough to map the entire address range,
>> we cannot direct map.
> 
> but we want the pmem range to be mapped into the huge DMA window too if we 
> can, why skip it?

This patch should simply do what the comment in this commit mentioned below 
suggests, which says that
ibm,pmemory can appear anywhere in the address space. If the DMA window is 
large enough
to map all of MAX_PHYSMEM_BITS, we will indeed simply do direct DMA for 
everything,
including the pmem. If we do not have a big enough window to do that, we will do
direct DMA for DRAM and dynamic mapping for pmem.
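
Roughly, the decision that commit describes looks like this (a simplified sketch of
the enable_ddw() logic in arch/powerpc/platforms/pseries/iommu.c, not the exact
upstream code):

	len = max_ram_len;	/* window big enough to direct map all of RAM */
	if (pmem_present) {
		if (query.largest_available_block >=
		    (1ULL << (MAX_PHYSMEM_BITS - page_shift)))
			/* window can cover pmem as well: direct map everything */
			len = MAX_PHYSMEM_BITS;
		else
			/* direct map RAM only; ibm,pmemory stays dynamically mapped */
			dev_info(&dev->dev, "Skipping ibm,pmemory\n");
	}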


https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/platforms/pseries/iommu.c?id=bf6e2d562bbc4d115cf322b0bca57fe5bbd26f48


Thanks,

Brian


> 
> 
>>
>> Signed-off-by: Brian King 
>> ---
>>   arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
>>   1 file changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c 
>> b/arch/powerpc/platforms/pseries/iommu.c
>> index 269f61d519c2..d9ae985d10a4 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
>>   phys_addr_t max_addr = memory_hotplug_max();
>>   struct device_node *memory;
>>   -    /*
>> - * The "ibm,pmemory" can appear anywhere in the address space.
>> - * Assuming it is still backed by page structs, set the upper limit
>> - * for the huge DMA window as MAX_PHYSMEM_BITS.
>> - */
>> -    if (of_find_node_by_type(NULL, "ibm,pmemory"))
>> -    return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
>> -    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
>> -
>>   for_each_node_by_type(memory, "memory") {
>>   unsigned long start, size;
>>   int n_mem_addr_cells, n_mem_size_cells, len;
>> @@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
>> device_node *pdn)
>>    */
>>   len = max_ram_len;
>>   if (pmem_present) {
>> +    if (default_win_removed) {
>> +    /*
>> + * If we only have one DMA window and have pmem present,
>> + * then we need to be able to map the entire address
>> + * range in order to be able to do direct DMA to RAM.
>> + */
>> +    len = order_base_2((sizeof(phys_addr_t) * 8 <= 
>> MAX_PHYSMEM_BITS) ?
>> +    (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS));
>> +    }
>> +
>>   if (query.largest_available_block >=
>>   (1ULL << (MAX_PHYSMEM_BITS - page_shift)))
>>   len = MAX_PHYSMEM_BITS;
>>
> 


-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-22 Thread Alexey Kardashevskiy




On 22/10/2021 04:44, Brian King wrote:

If ibm,pmemory is installed in the system, it can appear anywhere
in the address space. This patch enhances how we handle DMA for devices when
ibm,pmemory is present. In the case where we have enough DMA space to
direct map all of RAM, but not ibm,pmemory, we use direct DMA for
I/O to RAM and use the default window to dynamically map ibm,pmemory.
In the case where we only have a single DMA window, this won't work, so if
the window is not big enough to map the entire address range,
we cannot direct map.


but we want the pmem range to be mapped into the huge DMA window too if 
we can, why skip it?





Signed-off-by: Brian King 
---
  arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
  1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c 
b/arch/powerpc/platforms/pseries/iommu.c
index 269f61d519c2..d9ae985d10a4 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
phys_addr_t max_addr = memory_hotplug_max();
struct device_node *memory;
  
-	/*

-* The "ibm,pmemory" can appear anywhere in the address space.
-* Assuming it is still backed by page structs, set the upper limit
-* for the huge DMA window as MAX_PHYSMEM_BITS.
-*/
-   if (of_find_node_by_type(NULL, "ibm,pmemory"))
-   return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
-   (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
-
for_each_node_by_type(memory, "memory") {
unsigned long start, size;
int n_mem_addr_cells, n_mem_size_cells, len;
@@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
 */
len = max_ram_len;
if (pmem_present) {
+   if (default_win_removed) {
+   /*
+* If we only have one DMA window and have pmem present,
+* then we need to be able to map the entire address
+* range in order to be able to do direct DMA to RAM.
+*/
+   len = order_base_2((sizeof(phys_addr_t) * 8 <= 
MAX_PHYSMEM_BITS) ?
+   (phys_addr_t) -1 : (1ULL << 
MAX_PHYSMEM_BITS));
+   }
+
if (query.largest_available_block >=
(1ULL << (MAX_PHYSMEM_BITS - page_shift)))
len = MAX_PHYSMEM_BITS;



--
Alexey


[PATCH] powerpc: Enhance pmem DMA bypass handling

2021-10-21 Thread Brian King
If ibm,pmemory is installed in the system, it can appear anywhere
in the address space. This patch enhances how we handle DMA for devices when
ibm,pmemory is present. In the case where we have enough DMA space to
direct map all of RAM, but not ibm,pmemory, we use direct DMA for
I/O to RAM and use the default window to dynamically map ibm,pmemory.
In the case where we only have a single DMA window, this won't work,
so if the window is not big enough to map the entire address range,
we cannot direct map.

Signed-off-by: Brian King 
---
 arch/powerpc/platforms/pseries/iommu.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 269f61d519c2..d9ae985d10a4 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
 	phys_addr_t max_addr = memory_hotplug_max();
 	struct device_node *memory;
 
-	/*
-	 * The "ibm,pmemory" can appear anywhere in the address space.
-	 * Assuming it is still backed by page structs, set the upper limit
-	 * for the huge DMA window as MAX_PHYSMEM_BITS.
-	 */
-	if (of_find_node_by_type(NULL, "ibm,pmemory"))
-		return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
-			(phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
-
 	for_each_node_by_type(memory, "memory") {
 		unsigned long start, size;
 		int n_mem_addr_cells, n_mem_size_cells, len;
@@ -1341,6 +1332,16 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
 	 */
 	len = max_ram_len;
 	if (pmem_present) {
+		if (default_win_removed) {
+			/*
+			 * If we only have one DMA window and have pmem present,
+			 * then we need to be able to map the entire address
+			 * range in order to be able to do direct DMA to RAM.
+			 */
+			len = order_base_2((sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
+					(phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS));
+		}
+
 		if (query.largest_available_block >=
 		    (1ULL << (MAX_PHYSMEM_BITS - page_shift)))
 			len = MAX_PHYSMEM_BITS;
-- 
2.27.0
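
For reference, a worked example of what the new default_win_removed branch computes
on a typical 64-bit pseries config (a sketch assuming MAX_PHYSMEM_BITS == 51; the
exact values depend on the kernel configuration):

	/* sizeof(phys_addr_t) * 8 == 64, which is not <= 51, so the ternary
	 * picks (1ULL << MAX_PHYSMEM_BITS) and:
	 *
	 *	len = order_base_2(1ULL << 51) = 51
	 *
	 * i.e. with only the single (default) window available, direct mapping
	 * is only attempted if that window can cover the full 2^51-byte range;
	 * otherwise, as the commit message says, we cannot direct map. */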