Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-21 Thread Nicolas Saenz Julienne
On Thu, 2019-11-21 at 08:31 +0100, Christoph Hellwig wrote:
> On Tue, Nov 19, 2019 at 05:17:03PM +, Robin Murphy wrote:
> > TBH I can't see it being a massive problem even if the DMA patch, driver 
> > and DTS patch went entirely separately via the respective DMA, PCI, and 
> > arm-soc trees in the same cycle. Bisecting over a merge window is a big 
> > enough pain in the bum as it is, and if the worst case is that someone 
> > trying to do that on a Pi4 has a wonky PCI controller appear for a couple 
> > of commits, they may as well just disable that driver for their bisection, 
> > because it wasn't there at the start so can't possibly be the thing they're 
> > looking for regressions in ;)
> 
> Agreed.
> 
> Nicolas, can you send a respin?  That way I can still queue it up
> for 5.5.

Oh, I thought it was too late for that already. I'll send it in a minute.

Regards,
Nicolas



signature.asc
Description: This is a digitally signed message part


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-21 Thread Christoph Hellwig
On Tue, Nov 19, 2019 at 05:17:03PM +, Robin Murphy wrote:
> TBH I can't see it being a massive problem even if the DMA patch, driver 
> and DTS patch went entirely separately via the respective DMA, PCI, and 
> arm-soc trees in the same cycle. Bisecting over a merge window is a big 
> enough pain in the bum as it is, and if the worst case is that someone 
> trying to do that on a Pi4 has a wonky PCI controller appear for a couple 
> of commits, they may as well just disable that driver for their bisection, 
> because it wasn't there at the start so can't possibly be the thing they're 
> looking for regressions in ;)

Agreed.

Nicolas, can you send a respin?  That way I can still queue it up
for 5.5.


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-19 Thread Robin Murphy

On 19/11/2019 5:00 pm, Christoph Hellwig wrote:

On Tue, Nov 19, 2019 at 01:57:43PM +0100, Nicolas Saenz Julienne wrote:

Hi Rob & Christoph,
do you mind if I append v2 of this into my upcoming v3 RPi4 PCIe support
series, I didn't do it initially as I thought this was going to be a
contentious patch.  But as it turned out better than expected, I think it
should go into the PCIe series. In the end it's the first explicit user of the
bus DMA limit.

Here's v2 in case you don't know what I'm talking about:
https://www.spinics.net/lists/arm-kernel/msg768459.html


In principle I wouldn't mind, but I think this is going to conflict
quite badly with other changes in the dma-mapping tree (including
yours).  So I think we'll need a shared tree or I'll need to pull
in the whole series through the dma-mapping tree if there are not
other conflicts and the other maintainers are fine with it.


TBH I can't see it being a massive problem even if the DMA patch, driver 
and DTS patch went entirely separately via the respective DMA, PCI, and 
arm-soc trees in the same cycle. Bisecting over a merge window is a big 
enough pain in the bum as it is, and if the worst case is that someone 
trying to do that on a Pi4 has a wonky PCI controller appear for a 
couple of commits, they may as well just disable that driver for their 
bisection, because it wasn't there at the start so can't possibly be the 
thing they're looking for regressions in ;)


Robin.


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-19 Thread Christoph Hellwig
On Tue, Nov 19, 2019 at 01:57:43PM +0100, Nicolas Saenz Julienne wrote:
> Hi Rob & Christoph,
> do you mind if I append v2 of this into my upcoming v3 RPi4 PCIe support
> series, I didn't do it initially as I thought this was going to be a
> contentious patch.  But as it turned out better than expected, I think it
> should go into the PCIe series. In the end it's the first explicit user of the
> bus DMA limit.
> 
> Here's v2 in case you don't know what I'm talking about:
> https://www.spinics.net/lists/arm-kernel/msg768459.html

In principle I wouldn't mind, but I think this is going to conflict
quite badly with other changes in the dma-mapping tree (including
yours).  So I think we'll need a shared tree or I'll need to pull
in the whole series through the dma-mapping tree if there are not
other conflicts and the other maintainers are fine with it.


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-19 Thread Nicolas Saenz Julienne
On Wed, 2019-11-13 at 17:13 +0100, Nicolas Saenz Julienne wrote:
> Using a mask to represent bus DMA constraints has a set of limitations.
> The biggest one being it can only hold a power of two (minus one). The
> DMA mapping code is already aware of this and treats dev->bus_dma_mask
> as a limit. This quirk is already used by some architectures although
> still rare.
> 
> With the introduction of the Raspberry Pi 4 we've found a new contender
> for the use of bus DMA limits, as its PCIe bus can only address the
> lower 3GB of memory (of a total of 4GB). This is impossible to represent
> with a mask. To make things worse the device-tree code rounds non power
> of two bus DMA limits to the next power of two, which is unacceptable in
> this case.
> 
> In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
> over the tree and treat it as such. Note that dev->bus_dma_limit is
> meant to contain the higher accesible DMA address.
> 
> Signed-off-by: Nicolas Saenz Julienne 
> 

Hi Rob & Christoph,
do you mind if I append v2 of this into my upcoming v3 RPi4 PCIe support
series, I didn't do it initially as I thought this was going to be a
contentious patch.  But as it turned out better than expected, I think it
should go into the PCIe series. In the end it's the first explicit user of the
bus DMA limit.

Here's v2 in case you don't know what I'm talking about:
https://www.spinics.net/lists/arm-kernel/msg768459.html

Regards,
Nicolas



signature.asc
Description: This is a digitally signed message part


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-14 Thread Nicolas Saenz Julienne
On Wed, 2019-11-13 at 20:34 +, Robin Murphy wrote:
> On 13/11/2019 4:13 pm, Nicolas Saenz Julienne wrote:
> > Using a mask to represent bus DMA constraints has a set of limitations.
> > The biggest one being it can only hold a power of two (minus one). The
> > DMA mapping code is already aware of this and treats dev->bus_dma_mask
> > as a limit. This quirk is already used by some architectures although
> > still rare.
> > 
> > With the introduction of the Raspberry Pi 4 we've found a new contender
> > for the use of bus DMA limits, as its PCIe bus can only address the
> > lower 3GB of memory (of a total of 4GB). This is impossible to represent
> > with a mask. To make things worse the device-tree code rounds non power
> > of two bus DMA limits to the next power of two, which is unacceptable in
> > this case.
> > 
> > In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
> > over the tree and treat it as such. Note that dev->bus_dma_limit is
> > meant to contain the higher accesible DMA address.
> 
> Neat, you win a "why didn't I do it that way in the first place?" :)

:)

> Looking at it without all the history of previous attempts, this looks 
> entirely reasonable, and definitely a step in the right direction.
> 
> [...]
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 5a7551d060f2..f18827cf96df 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -1097,7 +1097,7 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
> > u64 *dma_size)
> >  * Limit coherent and dma mask based on size
> >  * retrieved from firmware.
> >  */
> > -   dev->bus_dma_mask = mask;
> > +   dev->bus_dma_limit = mask;
> 
> Although this preserves the existing behaviour, as in of_dma_configure() 
> we can do better here since we have the original address range to hand. 
> I think it's worth keeping the ACPI and OF paths in sync for minor 
> tweaks like this, rather than letting them diverge unnecessarily.

I figure you mean something like this:

@@ -1085,19 +1085,15 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr,
u64 *dma_size)
}

if (!ret) {
-   msb = fls64(dmaaddr + size - 1);
-   /*
-* Round-up to the power-of-two mask or set
-* the mask to the whole 64-bit address space
-* in case the DMA region covers the full
-* memory window.
-*/
-   mask = msb == 64 ? U64_MAX : (1ULL << msb) - 1;
+   /* Round-up to the power-of-two */
+   end = dmaddr + size - 1;
+   mask = DMA_BIT_MASK(ilog2(end) + 1);
+
/*
 * Limit coherent and dma mask based on size
 * retrieved from firmware.
 */
-   dev->bus_dma_limit = mask;
+   dev->bus_dma_limit = end;
dev->coherent_dma_mask = mask;
*dev->dma_mask = mask;
}

> Otherwise, the rest looks OK to me - in principle we could store it as 
> an exclusive limit such that we could then streamline the min_not_zero() 
> tests to just min(mask, limit - 1), but that's probably too clever for 
> its own good.

Yes, that was my first intuition and in a perfect world I'd prefer it like
that. But as you say, it's probably going to cause more trouble than anything.

Regards,
Nicolas



signature.asc
Description: This is a digitally signed message part


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-13 Thread Christoph Hellwig
On Wed, Nov 13, 2019 at 05:13:39PM +0100, Nicolas Saenz Julienne wrote:
> Using a mask to represent bus DMA constraints has a set of limitations.
> The biggest one being it can only hold a power of two (minus one). The
> DMA mapping code is already aware of this and treats dev->bus_dma_mask
> as a limit. This quirk is already used by some architectures although
> still rare.
> 
> With the introduction of the Raspberry Pi 4 we've found a new contender
> for the use of bus DMA limits, as its PCIe bus can only address the
> lower 3GB of memory (of a total of 4GB). This is impossible to represent
> with a mask. To make things worse the device-tree code rounds non power
> of two bus DMA limits to the next power of two, which is unacceptable in
> this case.
> 
> In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
> over the tree and treat it as such. Note that dev->bus_dma_limit is
> meant to contain the higher accesible DMA address.

This looks sensible modulo the minor comments in this thread.

> 
> Signed-off-by: Nicolas Saenz Julienne 
> 
> ---
> 
> Note this is rebased on top of Christoph's latest DMA series:
> https://www.spinics.net/lists/arm-kernel/msg768600.html

FYI, I'll plan to merge those tonight unless anyone screams.


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-13 Thread Robin Murphy

On 2019-11-13 8:41 pm, Florian Fainelli wrote:

On 11/13/19 12:34 PM, Robin Murphy wrote:

On 13/11/2019 4:13 pm, Nicolas Saenz Julienne wrote:

Using a mask to represent bus DMA constraints has a set of limitations.
The biggest one being it can only hold a power of two (minus one). The
DMA mapping code is already aware of this and treats dev->bus_dma_mask
as a limit. This quirk is already used by some architectures although
still rare.

With the introduction of the Raspberry Pi 4 we've found a new contender
for the use of bus DMA limits, as its PCIe bus can only address the
lower 3GB of memory (of a total of 4GB). This is impossible to represent
with a mask. To make things worse the device-tree code rounds non power
of two bus DMA limits to the next power of two, which is unacceptable in
this case.

In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
over the tree and treat it as such. Note that dev->bus_dma_limit is
meant to contain the higher accesible DMA address.


Neat, you win a "why didn't I do it that way in the first place?" :)

Looking at it without all the history of previous attempts, this looks
entirely reasonable, and definitely a step in the right direction.


And while you are changing those, would it make sense to not only rename
the structure member but introduce a getter and setter in order to ease
future work where this would no longer be a scalar?


I doubt it - once we get as a far as supporting multiple DMA ranges, 
there will be a whole load of infrastructure churn anyway if only to 
replace dma_pfn_offset, and I'm not sure a simple get/set paradigm would 
even be viable, so it's probably better to save that until clearly 
necessary.


Robin.


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-13 Thread Florian Fainelli
On 11/13/19 12:34 PM, Robin Murphy wrote:
> On 13/11/2019 4:13 pm, Nicolas Saenz Julienne wrote:
>> Using a mask to represent bus DMA constraints has a set of limitations.
>> The biggest one being it can only hold a power of two (minus one). The
>> DMA mapping code is already aware of this and treats dev->bus_dma_mask
>> as a limit. This quirk is already used by some architectures although
>> still rare.
>>
>> With the introduction of the Raspberry Pi 4 we've found a new contender
>> for the use of bus DMA limits, as its PCIe bus can only address the
>> lower 3GB of memory (of a total of 4GB). This is impossible to represent
>> with a mask. To make things worse the device-tree code rounds non power
>> of two bus DMA limits to the next power of two, which is unacceptable in
>> this case.
>>
>> In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
>> over the tree and treat it as such. Note that dev->bus_dma_limit is
>> meant to contain the higher accesible DMA address.
> 
> Neat, you win a "why didn't I do it that way in the first place?" :)
> 
> Looking at it without all the history of previous attempts, this looks
> entirely reasonable, and definitely a step in the right direction.

And while you are changing those, would it make sense to not only rename
the structure member but introduce a getter and setter in order to ease
future work where this would no longer be a scalar?
-- 
Florian


Re: [PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-13 Thread Robin Murphy

On 13/11/2019 4:13 pm, Nicolas Saenz Julienne wrote:

Using a mask to represent bus DMA constraints has a set of limitations.
The biggest one being it can only hold a power of two (minus one). The
DMA mapping code is already aware of this and treats dev->bus_dma_mask
as a limit. This quirk is already used by some architectures although
still rare.

With the introduction of the Raspberry Pi 4 we've found a new contender
for the use of bus DMA limits, as its PCIe bus can only address the
lower 3GB of memory (of a total of 4GB). This is impossible to represent
with a mask. To make things worse the device-tree code rounds non power
of two bus DMA limits to the next power of two, which is unacceptable in
this case.

In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
over the tree and treat it as such. Note that dev->bus_dma_limit is
meant to contain the higher accesible DMA address.


Neat, you win a "why didn't I do it that way in the first place?" :)

Looking at it without all the history of previous attempts, this looks 
entirely reasonable, and definitely a step in the right direction.


[...]

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 5a7551d060f2..f18827cf96df 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -1097,7 +1097,7 @@ void iort_dma_setup(struct device *dev, u64 *dma_addr, 
u64 *dma_size)
 * Limit coherent and dma mask based on size
 * retrieved from firmware.
 */
-   dev->bus_dma_mask = mask;
+   dev->bus_dma_limit = mask;


Although this preserves the existing behaviour, as in of_dma_configure() 
we can do better here since we have the original address range to hand. 
I think it's worth keeping the ACPI and OF paths in sync for minor 
tweaks like this, rather than letting them diverge unnecessarily.


Otherwise, the rest looks OK to me - in principle we could store it as 
an exclusive limit such that we could then streamline the min_not_zero() 
tests to just min(mask, limit - 1), but that's probably too clever for 
its own good.


Robin.


dev->coherent_dma_mask = mask;
*dev->dma_mask = mask;
}


[PATCH] dma-mapping: treat dev->bus_dma_mask as a DMA limit

2019-11-13 Thread Nicolas Saenz Julienne
Using a mask to represent bus DMA constraints has a set of limitations.
The biggest one being it can only hold a power of two (minus one). The
DMA mapping code is already aware of this and treats dev->bus_dma_mask
as a limit. This quirk is already used by some architectures although
still rare.

With the introduction of the Raspberry Pi 4 we've found a new contender
for the use of bus DMA limits, as its PCIe bus can only address the
lower 3GB of memory (of a total of 4GB). This is impossible to represent
with a mask. To make things worse the device-tree code rounds non power
of two bus DMA limits to the next power of two, which is unacceptable in
this case.

In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
over the tree and treat it as such. Note that dev->bus_dma_limit is
meant to contain the higher accesible DMA address.

Signed-off-by: Nicolas Saenz Julienne 

---

Note this is rebased on top of Christoph's latest DMA series:
https://www.spinics.net/lists/arm-kernel/msg768600.html

 arch/mips/pci/fixup-sb1250.c  | 16 
 arch/powerpc/sysdev/fsl_pci.c |  6 +++---
 arch/x86/kernel/pci-dma.c |  2 +-
 arch/x86/mm/mem_encrypt.c |  2 +-
 arch/x86/pci/sta2x11-fixup.c  |  2 +-
 drivers/acpi/arm64/iort.c |  2 +-
 drivers/ata/ahci.c|  2 +-
 drivers/iommu/dma-iommu.c |  3 +--
 drivers/of/device.c   |  9 +
 include/linux/device.h|  6 +++---
 include/linux/dma-direct.h|  2 +-
 include/linux/dma-mapping.h   |  2 +-
 kernel/dma/direct.c   | 27 +--
 13 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/arch/mips/pci/fixup-sb1250.c b/arch/mips/pci/fixup-sb1250.c
index 8a41b359cf90..40efc990cdce 100644
--- a/arch/mips/pci/fixup-sb1250.c
+++ b/arch/mips/pci/fixup-sb1250.c
@@ -21,22 +21,22 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SIBYTE, 
PCI_DEVICE_ID_BCM1250_PCI,
 
 /*
  * The BCM1250, etc. PCI host bridge does not support DAC on its 32-bit
- * bus, so we set the bus's DMA mask accordingly.  However the HT link
+ * bus, so we set the bus's DMA limit accordingly.  However the HT link
  * down the artificial PCI-HT bridge supports 40-bit addressing and the
  * SP1011 HT-PCI bridge downstream supports both DAC and a 64-bit bus
  * width, so we record the PCI-HT bridge's secondary and subordinate bus
- * numbers and do not set the mask for devices present in the inclusive
+ * numbers and do not set the limit for devices present in the inclusive
  * range of those.
  */
-struct sb1250_bus_dma_mask_exclude {
+struct sb1250_bus_dma_limit_exclude {
bool set;
unsigned char start;
unsigned char end;
 };
 
-static int sb1250_bus_dma_mask(struct pci_dev *dev, void *data)
+static int sb1250_bus_dma_limit(struct pci_dev *dev, void *data)
 {
-   struct sb1250_bus_dma_mask_exclude *exclude = data;
+   struct sb1250_bus_dma_limit_exclude *exclude = data;
bool exclude_this;
bool ht_bridge;
 
@@ -55,7 +55,7 @@ static int sb1250_bus_dma_mask(struct pci_dev *dev, void 
*data)
exclude->start, exclude->end);
} else {
dev_dbg(>dev, "disabling DAC for device");
-   dev->dev.bus_dma_mask = DMA_BIT_MASK(32);
+   dev->dev.bus_dma_limit = DMA_BIT_MASK(32);
}
 
return 0;
@@ -63,9 +63,9 @@ static int sb1250_bus_dma_mask(struct pci_dev *dev, void 
*data)
 
 static void quirk_sb1250_pci_dac(struct pci_dev *dev)
 {
-   struct sb1250_bus_dma_mask_exclude exclude = { .set = false };
+   struct sb1250_bus_dma_limit_exclude exclude = { .set = false };
 
-   pci_walk_bus(dev->bus, sb1250_bus_dma_mask, );
+   pci_walk_bus(dev->bus, sb1250_bus_dma_limit, );
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SIBYTE, PCI_DEVICE_ID_BCM1250_PCI,
quirk_sb1250_pci_dac);
diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
index ff0e2b156cb5..617a443d673d 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -115,8 +115,8 @@ static void pci_dma_dev_setup_swiotlb(struct pci_dev *pdev)
 {
struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 
-   pdev->dev.bus_dma_mask =
-   hose->dma_window_base_cur + hose->dma_window_size;
+   pdev->dev.bus_dma_limit =
+   hose->dma_window_base_cur + hose->dma_window_size - 1;
 }
 
 static void setup_swiotlb_ops(struct pci_controller *hose)
@@ -135,7 +135,7 @@ static void fsl_pci_dma_set_mask(struct device *dev, u64 
dma_mask)
 * mapping that allows addressing any RAM address from across PCI.
 */
if (dev_is_pci(dev) && dma_mask >= pci64_dma_offset * 2 - 1) {
-   dev->bus_dma_mask = 0;
+   dev->bus_dma_limit = 0;
dev->archdata.dma_offset = pci64_dma_offset;
}
 }
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index fa4352dce491..3a75d665d43c 100644
---