Re: issue on rcar-du and/or vsp suspend-resume paths

2017-12-07 Thread Nikita Yushchenko

Hello Kieran.

Many thanks for that information.
It will save me a lot of time.

Nikita



Hello Nikita

On 07/12/17 08:48, Nikita Yushchenko wrote:

Hello.

Writing this mail to the committers to the rcar-du and vsp1 drivers in kernel branch
v4.9/rcar-3.5.9 of
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git - which
AFAIU is the latest BSP kernel for Renesas Gen3 SoCs.


Suspend/resume is known not to work on this kernel version for VSP1 and rcar-du
on Gen3. You could potentially back-port the relevant patches, but I have not
looked at the dependencies required for this:


I believe the VSP1 was fixed with the following patch series:

[PATCH v5 0/2] v4l: vsp1: Fix suspend/resume and race on M2M pipelines
[PATCH v5 1/2] v4l: vsp1: Move vsp1_video_setup_pipeline()
[PATCH v5 2/2] v4l: vsp1: Repair suspend resume operations for video pipelines

And the RCar DU was fixed with the following patches:

[PATCH v2 0/3] drm/media: Implement DU Suspend and Resume on VSP pipelines
  (1/3 is v3 "media: vsp1" below)
[PATCH v2 2/3] drm: rcar-du: Add suspend resume helpers
[PATCH v2 3/3] drm: rcar-du: Remove unused CRTC suspend/resume functions

[PATCH v3] media: vsp1: Prevent suspending and resuming DRM pipelines

All of the above patches were posted to the linux-renesas-soc mailing list, and
should be available in the archives.

The best source of the patches would be the tag:
   renesas-drivers-2017-11-28-v4.15-rc1
from
   git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git,

which should be functional for suspend/resume on both VSP1 and rcar-du.


I'm trying to make swsusp (i.e. echo disk > /sys/power/state) work on the
r8a7796-m3ulcb board.

I'm facing an issue with the suspend/resume routines of the rcar-du driver.

Currently, suspend-to-mem works on this target. I can run
# i2cset -f -y 7 0x30 0x20 0x0F
# echo mem > /sys/power/state
and the target suspends. Then I can press the power button and the target resumes,
including the image on the HDMI-connected display.

However, if I try

# echo devices > /sys/power/pm_test
# echo mem > /sys/power/state

which should terminate the suspend process right after calling the drivers'
suspend methods and then resume immediately, rcar-du resume fails:

[   75.233956] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:53:crtc-1] flip_done timed out
[   85.474009] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
[CRTC:53:crtc-1] flip_done timed out


Yes, these look like the expected errors.


The same happens with any other variant of /sys/power/pm_test.
Resume of rcar-du only succeeds if the CPU really goes to sleep after rcar-du
has been suspended.

I've tried a test that calls rcar-du's suspend method, sleeps a bit, and calls
rcar-du's resume method - and that breaks the display. In particular, VSP interrupts
no longer arrive, although the VSP hardware programming sequence looks exactly
the same as on normal driver startup or on the blank/unblank path.

For swsusp (and also for suspend-to-mem error paths) it is required that a
driver's resume method undoes what its suspend method did, so this
behavior is definitely a bug.

Could you please help debug this?


Backporting the patches, or moving to the latest kernel version, would be the
resolution here.



Nikita Yushchenko,
system sw engineer at cogentembedded.com

--
Regards

Kieran



Re: [PATCH 1/2] dma-mapping: let arch know origin of dma range passed to arch_setup_dma_ops()

2017-01-12 Thread Nikita Yushchenko


>>>> Hmm, I think when the dma-ranges are missing, we should either enforce
>>>> a 32-bit mask, or disallow DMA completely. It's probably too late for
>>>> the latter, I wish we had done this earlier in order to force everyone
>>>> on ARM64 to have a valid dma-ranges property for any DMA master.
>>>
>>> This can be done over time.
>>>
>>> However the very idea of this version of patch is - keep working pieces
>>> as-is, thus for now setting enforce_range to false in case of no defined
>>> dma-ranges is intentional.
>>
>> What we can do is - check bus width (as it is defined in DT) and set
>> enforce_range to true if bus is 32-bit
>>
>>> What I should re-check is - does rcar dtsi set dma-ranges, and add it if
>>> it does not.
>>
>> It does not, will have to add.
>>
>> In DT bus is defined as 64-bit. But looks like physically it is 32-bit.
>> Maybe DT needs fixing.
> 
> I think we always assumed that the lack of a dma-ranges property
> implied a 32-bit width, as that is the safe fallback as well as the
> most common case.

Yes, we assumed that, but that was combined with blindly accepting a wider
dma_mask per the driver's request.  The latter is being changed.

> AFAICT, this means you are actually fine on rcar, and all other
> platforms will keep working as we enforce it, but might get slowed
> down if they relied on the unintended behavior of allowing 64-bit
> DMA.

Yesterday Robin raised the issue that a change starting to enforce the default
dma_mask will break existing setups - i.e. those that depend on 64-bit
DMA being implicitly supported without manually declaring such support.

In reply to that, I suggested this version of the patchset, which should keep
the existing behavior by default.

I'm fine with either approach regarding behavior on hw that I don't have
- but I'm not in a position to make any decisions on that.


Isn't gen3 peripheral bus 32-bit?

2017-01-11 Thread Nikita Yushchenko
Hi

R-Car Gen3 PCIe modules have a 32-bit connection to the internal AXI bus.

Could somebody with low-level hardware knowledge please answer whether this
is a PCIe module limitation or an internal bus limitation.

Depending on that, a different modification of the renesas/r8a779[56].dtsi
files is needed: either
- wrap only the pcie nodes into a 32-bit sub-node of /soc, or
- turn the entire /soc subtree into 32-bit.


The problem I'm trying to solve is related to DMA ranges handling.
The architecture code needs explicit information on DMA range
limitations caused by the interconnect. See [1] thread for details.

Nikita

[1] https://www.mail-archive.com/linux-renesas-soc@vger.kernel.org/msg10449.html


Re: [PATCH 1/2] dma-mapping: let arch know origin of dma range passed to arch_setup_dma_ops()

2017-01-11 Thread Nikita Yushchenko


12.01.2017 08:52, Nikita Yushchenko wrote:
>>> diff --git a/drivers/staging/fsl-mc/bus/fsl-mc-bus.c 
>>> b/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>>> index 5ac373c..480b644 100644
>>> --- a/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>>> +++ b/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>>> @@ -540,7 +540,7 @@ int fsl_mc_device_add(struct dprc_obj_desc *obj_desc,
>>>  
>>> /* Objects are coherent, unless 'no shareability' flag set. */
>>> if (!(obj_desc->flags & DPRC_OBJ_FLAG_NO_MEM_SHAREABILITY))
>>> -   arch_setup_dma_ops(&mc_dev->dev, 0, 0, NULL, true);
>>> +   arch_setup_dma_ops(&mc_dev->dev, 0, 0, false, NULL, true);
>>>  
>>> /*
>>>  * The device-specific probe callback will get invoked by device_add()
>>
>> Why are these actually calling arch_setup_dma_ops() here in the first
>> place? Are these all devices that are DMA masters without an OF node?
> 
> I don't know, but that's a different topic. This patch just adds
> argument and sets it to false everywhere but in the location when range
> should be definitely enforced.
> 
>>> @@ -126,6 +127,8 @@ void of_dma_configure(struct device *dev, struct 
>>> device_node *np)
>>> return;
>>> }
>>> dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>>> +
>>> +   enforce_range = true;
>>> }
>>>  
>>> dev->dma_pfn_offset = offset;
>>
>> Hmm, I think when the dma-ranges are missing, we should either enforce
>> a 32-bit mask, or disallow DMA completely. It's probably too late for
>> the latter, I wish we had done this earlier in order to force everyone
>> on ARM64 to have a valid dma-ranges property for any DMA master.
> 
> This can be done over time.
> 
> However the very idea of this version of patch is - keep working pieces
> as-is, thus for now setting enforce_range to false in case of no defined
> dma-ranges is intentional.

What we can do is check the bus width (as it is defined in the DT) and set
enforce_range to true if the bus is 32-bit; see the sketch below.
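A minimal sketch of that idea (purely illustrative: of_node_bus_is_32bit() is a
hypothetical helper, and equating #address-cells with the physical bus width is
an assumption of this sketch, not established kernel behaviour):

#include <linux/of.h>

/*
 * Hypothetical helper: treat a bus whose DT addresses fit in a single
 * cell as 32-bit, and use that to decide whether the default range
 * should be enforced for its children.
 */
static bool of_node_bus_is_32bit(struct device_node *bus)
{
	u32 cells;

	if (of_property_read_u32(bus, "#address-cells", &cells))
		return false;	/* unknown width - don't enforce anything */

	return cells == 1;
}

With something like this, of_dma_configure() could set enforce_range even when
no dma-ranges property is present, for devices whose bus cannot express
addresses above 32 bits.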

> What I should re-check is - does rcar dtsi set dma-ranges, and add it if
> it does not.

It does not, will have to add.

In the DT the bus is defined as 64-bit, but it looks like physically it is
32-bit. Maybe the DT needs fixing.


Re: [PATCH 1/2] dma-mapping: let arch know origin of dma range passed to arch_setup_dma_ops()

2017-01-11 Thread Nikita Yushchenko
>> diff --git a/drivers/staging/fsl-mc/bus/fsl-mc-bus.c 
>> b/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>> index 5ac373c..480b644 100644
>> --- a/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>> +++ b/drivers/staging/fsl-mc/bus/fsl-mc-bus.c
>> @@ -540,7 +540,7 @@ int fsl_mc_device_add(struct dprc_obj_desc *obj_desc,
>>  
>>  /* Objects are coherent, unless 'no shareability' flag set. */
>>  if (!(obj_desc->flags & DPRC_OBJ_FLAG_NO_MEM_SHAREABILITY))
>> -arch_setup_dma_ops(&mc_dev->dev, 0, 0, NULL, true);
>> +arch_setup_dma_ops(&mc_dev->dev, 0, 0, false, NULL, true);
>>  
>>  /*
>>   * The device-specific probe callback will get invoked by device_add()
> 
> Why are these actually calling arch_setup_dma_ops() here in the first
> place? Are these all devices that are DMA masters without an OF node?

I don't know, but that's a different topic. This patch just adds the
argument and sets it to false everywhere except in the location where the
range should definitely be enforced.

>> @@ -126,6 +127,8 @@ void of_dma_configure(struct device *dev, struct 
>> device_node *np)
>>  return;
>>  }
>>  dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>> +
>> +enforce_range = true;
>>  }
>>  
>>  dev->dma_pfn_offset = offset;
> 
> Hmm, I think when the dma-ranges are missing, we should either enforce
> a 32-bit mask, or disallow DMA completely. It's probably too late for
> the latter, I wish we had done this earlier in order to force everyone
> on ARM64 to have a valid dma-ranges property for any DMA master.

This can be done over time.

However, the very idea of this version of the patch is to keep working pieces
as-is; thus, for now, setting enforce_range to false when no dma-ranges is
defined is intentional.

What I should re-check is whether the rcar dtsi sets dma-ranges, and add it if
it does not.

Nikita


Re: [PATCH 2/2] arm64: avoid increasing DMA masks above what hardware supports

2017-01-11 Thread Nikita Yushchenko
>> @@ -959,6 +990,15 @@ void arch_setup_dma_ops(struct device *dev, u64 
>> dma_base, u64 size,
>> if (!dev->archdata.dma_ops)
>> dev->archdata.dma_ops = &swiotlb_dma_ops;
>>  
>> +   /*
>> +* Whatever the parent bus can set. A device must not set
>> +* a DMA mask larger than this.
>> +*/
>> +   if (enforce_range)
>> +   dev->archdata.parent_dma_mask = size - 1;
>> +   else
>> +   dev->archdata.parent_dma_mask = DMA_BIT_MASK(64);
>> +
>> dev->archdata.dma_coherent = coherent;
>> __iommu_setup_dma_ops(dev, dma_base, size, iommu);
>>
> 
> Could we just pass the mask instead of the size here?

We don't want to change the API now.

Nikita


[PATCH 1/2] dma-mapping: let arch know origin of dma range passed to arch_setup_dma_ops()

2017-01-11 Thread Nikita Yushchenko
There are cases when a device is capable of a wide DMA mask (and the driver
issues the corresponding dma_set_mask() call), but the bus the device sits on
can't support wide addresses. Example: an NVMe device behind a PCIe
controller sitting on a 32-bit SoC bus.

To support such a case, the architecture needs information about such
limitations. That information can originate from the dma-ranges property
in the device tree, and is passed to the architecture via the
arch_setup_dma_ops() call.

The problem is that in the wide majority of cases, no dma range is defined.
E.g. ACPI has no means to define it. Thus a default range (usually the
full 32-bit range, i.e. 4G starting at address zero) is passed instead.

If the architecture enforces this range, all setups currently using
wide DMA addresses without explicitly declaring support for that via the
device tree will break. This is bad, especially for ACPI-based
platforms.

To avoid that, this patch adds an additional boolean argument to
arch_setup_dma_ops() to show whether the range originates from an
authoritative source and thus should be enforced, or is just a guess and
should be handled as such.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
---
 arch/arm/include/asm/dma-mapping.h | 1 +
 arch/arm/mm/dma-mapping.c  | 3 ++-
 arch/arm64/include/asm/dma-mapping.h   | 3 ++-
 arch/arm64/mm/dma-mapping.c| 3 ++-
 arch/mips/include/asm/dma-mapping.h| 3 ++-
 drivers/acpi/scan.c| 2 +-
 drivers/iommu/rockchip-iommu.c | 2 +-
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 2 +-
 drivers/of/device.c| 5 -
 drivers/staging/fsl-mc/bus/fsl-mc-bus.c| 2 +-
 10 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index bf02dbd..2a3863e 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -117,6 +117,7 @@ static inline unsigned long dma_max_pfn(struct device *dev)
 
 #define arch_setup_dma_ops arch_setup_dma_ops
 extern void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
+  bool enforce_range,
   const struct iommu_ops *iommu, bool coherent);
 
 #define arch_teardown_dma_ops arch_teardown_dma_ops
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ab77100..b8b11f8 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -2380,7 +2380,8 @@ static struct dma_map_ops *arm_get_dma_map_ops(bool 
coherent)
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool enforce_range, const struct iommu_ops *iommu,
+   bool coherent)
 {
struct dma_map_ops *dma_ops;
 
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index ccea82c..ae1c23f 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -48,7 +48,8 @@ static inline struct dma_map_ops *get_dma_ops(struct device 
*dev)
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent);
+   bool enforce_range, const struct iommu_ops *iommu,
+   bool coherent);
 #define arch_setup_dma_ops arch_setup_dma_ops
 
 #ifdef CONFIG_IOMMU_DMA
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e040827..ea295f1 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -953,7 +953,8 @@ static void __iommu_setup_dma_ops(struct device *dev, u64 
dma_base, u64 size,
 #endif  /* CONFIG_IOMMU_DMA */
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
-   const struct iommu_ops *iommu, bool coherent)
+   bool enforce_range, const struct iommu_ops *iommu,
+   bool coherent)
 {
if (!dev->archdata.dma_ops)
dev->archdata.dma_ops = &swiotlb_dma_ops;
diff --git a/arch/mips/include/asm/dma-mapping.h 
b/arch/mips/include/asm/dma-mapping.h
index 7aa71b9..6af4d87 100644
--- a/arch/mips/include/asm/dma-mapping.h
+++ b/arch/mips/include/asm/dma-mapping.h
@@ -34,7 +34,8 @@ extern void dma_cache_sync(struct device *dev, void *vaddr, 
size_t size,
 
 #define arch_setup_dma_ops arch_setup_dma_ops
 static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
- u64 size, const struct iommu_ops *iommu,
+ u64 size, bool enforce_range,
+ const struct iommu_ops *iommu,
  bool coherent)
 {
 #ifdef CONFIG_DMA_PERDEV_COHERENT
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 1926918..dea17a5 100644
--- a/drivers/acpi/

[PATCH 0/2] arm64: fix handling of DMA masks wider than bus supports

2017-01-11 Thread Nikita Yushchenko
> I reckon the easiest way forward would be to pass in some flag to
> arch_setup_dma_ops to indicate whether it's an explicitly-configured
> range or not - then simply initialising parent_dma_mask to ~0 for the
> default case *should* keep things working as before.

Tried to do that.

---

Nikita Yushchenko (2):
  dma-mapping: let arch know origin of dma range passed to arch_setup_dma_ops()
  arm64: avoid increasing DMA masks above what hardware supports

 arch/arm/include/asm/dma-mapping.h |  1 +
 arch/arm/mm/dma-mapping.c  |  3 +-
 arch/arm64/Kconfig |  3 ++
 arch/arm64/include/asm/device.h|  1 +
 arch/arm64/include/asm/dma-mapping.h   |  6 +++-
 arch/arm64/mm/dma-mapping.c| 43 +-
 arch/mips/include/asm/dma-mapping.h|  3 +-
 drivers/acpi/scan.c|  2 +-
 drivers/iommu/rockchip-iommu.c |  2 +-
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c |  2 +-
 drivers/of/device.c|  5 ++-
 drivers/staging/fsl-mc/bus/fsl-mc-bus.c|  2 +-
 12 files changed, 64 insertions(+), 9 deletions(-)

-- 
2.1.4



[PATCH 2/2] arm64: avoid increasing DMA masks above what hardware supports

2017-01-11 Thread Nikita Yushchenko
There are cases when a device supports DMA addresses wider than its
connection to the system supports.

In this case the driver sets the DMA mask based on knowledge of the device's
capabilities. That must succeed to allow the driver to initialize.

However, swiotlb or iommu still need knowledge of the actual device
capabilities. To avoid breakage, the actual mask must not be set wider than
the device's connection allows.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
CC: Arnd Bergmann <a...@arndb.de>
CC: Robin Murphy <robin.mur...@arm.com>
CC: Will Deacon <will.dea...@arm.com>
---
 arch/arm64/Kconfig   |  3 +++
 arch/arm64/include/asm/device.h  |  1 +
 arch/arm64/include/asm/dma-mapping.h |  3 +++
 arch/arm64/mm/dma-mapping.c  | 40 
 4 files changed, 47 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..afb2c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 243ef25..a57e7bb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -22,6 +22,7 @@ struct dev_archdata {
void *iommu;/* private IOMMU data */
 #endif
bool dma_coherent;
+   u64 parent_dma_mask;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index ae1c23f..46b53e6 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -52,6 +52,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 
size,
bool coherent);
 #define arch_setup_dma_ops arch_setup_dma_ops
 
+#define HAVE_ARCH_DMA_SET_MASK 1
+extern int dma_set_mask(struct device *dev, u64 dma_mask);
+
 #ifdef CONFIG_IOMMU_DMA
 void arch_teardown_dma_ops(struct device *dev);
 #define arch_teardown_dma_ops  arch_teardown_dma_ops
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index ea295f1..31b96fd 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -203,6 +203,37 @@ static void __dma_free(struct device *dev, size_t size,
__dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs);
 }
 
+int dma_set_mask(struct device *dev, u64 dma_mask)
+{
+   const struct dma_map_ops *ops = get_dma_ops(dev);
+
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   if (ops->set_dma_mask)
+   return ops->set_dma_mask(dev, mask);
+
+   if (!dev->dma_mask || !dma_supported(dev, mask))
+   return -EIO;
+
+   *dev->dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_mask);
+
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   if (!dma_supported(dev, mask))
+   return -EIO;
+
+   dev->coherent_dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
 static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
 unsigned long offset, size_t size,
 enum dma_data_direction dir,
@@ -959,6 +990,15 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
if (!dev->archdata.dma_ops)
dev->archdata.dma_ops = &swiotlb_dma_ops;
 
+   /*
+* Whatever the parent bus can set. A device must not set
+* a DMA mask larger than this.
+*/
+   if (enforce_range)
+   dev->archdata.parent_dma_mask = size - 1;
+   else
+   dev->archdata.parent_dma_mask = DMA_BIT_MASK(64);
+
dev->archdata.dma_coherent = coherent;
__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 }
-- 
2.1.4



Re: [PATCH] arm64: avoid increasing DMA masks above what hardware supports

2017-01-11 Thread Nikita Yushchenko
> Yes, I think that ought to work, although the __iommu_setup_dma_ops()
> call will still want a real size reflecting the default mask

I see iommu_dma_ops does not define set_dma_mask.

So what if setup was done for a size reflecting one mask and then the driver
changes the mask?  Will things still operate correctly?


Re: [PATCH v2] arm64: do not set dma masks that device connection can't handle

2017-01-11 Thread Nikita Yushchenko
> I actually have a third variation of this problem involving a PCI root
> complex which *could* drive full-width (40-bit) addresses, but won't,
> due to the way its PCI<->AXI interface is programmed. That would require
> even more complicated dma-ranges handling to describe the windows of
> valid physical addresses which it *will* pass, so I'm not pressing the
> issue - let's just get the basic DMA mask case fixed first.

R-Car + NVMe is actually not a "basic case".

It has a PCI<->AXI interface involved.
PCI addresses are 64-bit, and the controller does handle 64-bit addresses
there. The mapping between PCI addresses and AXI addresses is defined. But
the AXI bus is 32-bit.

The SoC has an iommu that could probably be used between the PCIe module and
RAM, although AFAIK nobody has made that work yet.

The board I work with has 4G of RAM, in 4 banks, located at different parts
of the wide address space, and only one of them is below 4G. But if the iommu
is capable of translating addresses such that the four banks map to the first
4 gigabytes of address space, then all memory becomes available for
DMA from the PCIe device.


Re: [PATCH] arm64: avoid increasing DMA masks above what hardware supports

2017-01-11 Thread Nikita Yushchenko
>> +/*
>> + * we don't yet support buses that have a non-zero mapping.
>> + *  Let's hope we won't need it
>> + */
>> +WARN_ON(dma_base != 0);
> 
> I believe we now accomodate the bus remap bits on BCM2837 as a DMA
> offset, so unfortunately I think this is no longer true.

Arnd, this check is from you. Any updates? Perhaps this check can
just be dropped?

In the swiotlb code, the dma address (i.e. with the offset already applied) is
checked against the mask.  Not sure what 'dma_base' means in the iommu case.

>> +/*
>> + * Whatever the parent bus can set. A device must not set
>> + * a DMA mask larger than this.
>> + */
>> +dev->archdata.parent_dma_mask = size - 1;
> 
> This will effectively constrain *all* DMA masks to be 32-bit, since for
> 99% of devices we're going to see a size derived from the default mask
> passed in here. I worry that that's liable to lead to performance and
> stability regressions

That was exactly my concern when I first tried to address this issue. My
first attempt was to alter, very locally, the exact configuration where the
problem shows, while ensuring that everything else stays as-is. See
https://lkml.org/lkml/2016/12/29/218

But it looks like people want a generic solution.

> I reckon the easiest way forward would be to pass in some flag to
> arch_setup_dma_ops to indicate whether it's an explicitly-configured
> range or not - then simply initialising parent_dma_mask to ~0 for the
> default case *should* keep things working as before.

Currently only arm, arm64 and mips define arch_setup_dma_ops().
The mips version only checks the 'coherent' argument; 'size' is used only by
arm and arm64.

Maybe move setting the default from the caller to the callee?
I.e. pass size=0 if no explicit information exists, and let the architecture
handle that?


Re: [PATCH v2] arm64: do not set dma masks that device connection can't handle

2017-01-10 Thread Nikita Yushchenko
>> What issue "IOMMU doesn't solve"?
>>
>> Issue I'm trying to address is - inconsistency within swiotlb
>> dma_map_ops, where (1) any wide mask is silently accepted, but (2) then
>> mask is used to decide if bounce buffers are needed or not. This
>> inconsistency causes NVMe+R-Car combo not working (and breaking memory
>> instead).
> 
> The fundamental underlying problem is the "any wide mask is silently
> accepted" part, and that applies equally to IOMMU ops as well.

Is the just-posted version better?

It should cover the iommu case as well.

Nikita


[PATCH] arm64: avoid increasing DMA masks above what hardware supports

2017-01-10 Thread Nikita Yushchenko
There are cases when a device supports DMA addresses wider than its
connection to the system supports.

In this case the driver sets the DMA mask based on knowledge of the device's
capabilities. That must succeed to allow the driver to initialize.

However, swiotlb or iommu still need knowledge of the actual device
capabilities. To avoid breakage, the actual mask must not be set wider than
the device's connection allows.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
CC: Arnd Bergmann <a...@arndb.de>
CC: Robin Murphy <robin.mur...@arm.com>
CC: Will Deacon <will.dea...@arm.com>
---
 arch/arm64/Kconfig   |  3 +++
 arch/arm64/include/asm/device.h  |  1 +
 arch/arm64/include/asm/dma-mapping.h |  3 +++
 arch/arm64/mm/dma-mapping.c  | 43 
 4 files changed, 50 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..afb2c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 243ef25..a57e7bb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -22,6 +22,7 @@ struct dev_archdata {
void *iommu;/* private IOMMU data */
 #endif
bool dma_coherent;
+   u64 parent_dma_mask;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/include/asm/dma-mapping.h 
b/arch/arm64/include/asm/dma-mapping.h
index ccea82c..eab36d2 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -51,6 +51,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 
size,
const struct iommu_ops *iommu, bool coherent);
 #define arch_setup_dma_ops arch_setup_dma_ops
 
+#define HAVE_ARCH_DMA_SET_MASK 1
+extern int dma_set_mask(struct device *dev, u64 dma_mask);
+
 #ifdef CONFIG_IOMMU_DMA
 void arch_teardown_dma_ops(struct device *dev);
 #define arch_teardown_dma_ops  arch_teardown_dma_ops
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e040827..7b1bb87 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -203,6 +203,37 @@ static void __dma_free(struct device *dev, size_t size,
__dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs);
 }
 
+int dma_set_mask(struct device *dev, u64 dma_mask)
+{
+   const struct dma_map_ops *ops = get_dma_ops(dev);
+
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   if (ops->set_dma_mask)
+   return ops->set_dma_mask(dev, mask);
+
+   if (!dev->dma_mask || !dma_supported(dev, mask))
+   return -EIO;
+
+   *dev->dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_mask);
+
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   if (!dma_supported(dev, mask))
+   return -EIO;
+
+   dev->coherent_dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
 static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
 unsigned long offset, size_t size,
 enum dma_data_direction dir,
@@ -958,6 +989,18 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
if (!dev->archdata.dma_ops)
dev->archdata.dma_ops = &swiotlb_dma_ops;
 
+   /*
+* we don't yet support buses that have a non-zero mapping.
+*  Let's hope we won't need it
+*/
+   WARN_ON(dma_base != 0);
+
+   /*
+* Whatever the parent bus can set. A device must not set
+* a DMA mask larger than this.
+*/
+   dev->archdata.parent_dma_mask = size - 1;
+
dev->archdata.dma_coherent = coherent;
__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 }
-- 
2.1.4



Re: [PATCH v2] arm64: do not set dma masks that device connection can't handle

2017-01-10 Thread Nikita Yushchenko
Hi

> The point here is that an IOMMU doesn't solve your issue, and the
> IOMMU-backed DMA ops need the same treatment. In light of that, it really
> feels to me like the DMA masks should be restricted in of_dma_configure
> so that the parent mask is taken into account there, rather than hook
> into each set of DMA ops to intercept set_dma_mask. We'd still need to
> do something to stop dma_set_mask widening the mask if it was restricted
> by of_dma_configure, but I think Robin (cc'd) was playing with that.

What issue "IOMMU doesn't solve"?

The issue I'm trying to address is an inconsistency within the swiotlb
dma_map_ops, where (1) any wide mask is silently accepted, but (2) the
mask is then used to decide whether bounce buffers are needed or not. This
inconsistency causes the NVMe + R-Car combo to not work (and to break memory
instead).

I just can't figure out what similar issue the iommu can have.
Do you mean that in the iommu case, the mask also must not be set to anything
wider than the initial value? Why? What is the use of the mask in the iommu
case? Is there any real case where the iommu can't address all memory existing
in the system?

The NVMe maintainer has just stated that they expect
set_dma_mask(DMA_BIT_MASK(64)) to always succeed, and are going to error
out driver probe if that call fails.  They claim that the architecture must
always be able to dma_map() whatever memory exists in the system - via
iommu or swiotlb or whatever. Their direction is to remove bounce
buffers from the block and other layers.

With this direction, the semantics of the dma mask become even more
questionable. I'd say dma_mask is a candidate for removal (or for moving into
swiotlb's or iommu's local area).

Nikita


Re: NVMe vs DMA addressing limitations

2017-01-09 Thread Nikita Yushchenko
Christoph, thanks for the clear input.

Arnd, I think that given this discussion, the best short-term solution is
indeed the patch I submitted yesterday. That is, your version +
coherent mask support.  With that, set_dma_mask(DMA_BIT_MASK(64)) will
succeed and the hardware will work with swiotlb.

A possible next step is to teach swiotlb to dynamically allocate bounce
buffers within the entirety of arm64's ZONE_DMA.

Also there is some hope that R-Car *can* iommu-translate the addresses that
the PCIe module issues to the system bus, although previous attempts to make
that work failed. Additional research is needed here.

Nikita

> On Tue, Jan 10, 2017 at 09:47:21AM +0300, Nikita Yushchenko wrote:
>> I'm now working with HW that:
>> - is in no way "low end" or "obsolete", it has 4G of RAM and 8 CPU cores,
>> and is being manufactured and developed,
>> - has 75% of its RAM located beyond the first 4G of address space,
>> - can't physically handle incoming PCIe transactions addressed to memory
>> beyond 4G.
> 
> It might not be low end or obsolete, but it's absolutely braindead.
> Your I/O performance will suffer badly for the life of the platform
> because someone tries to save 2 cents, and there is not much we can do
> about it.
> 
>> (1) it constantly runs out of swiotlb space, logs are full of warnings
>> despite rate limiting,
> 
>> Per my current understanding, blk-level bounce buffering will at least
>> help with (1) - if done properly it will allocate bounce buffers within
>> entire memory below 4G, not within dedicated swiotlb space (that is
>> small and enlarging it makes memory permanently unavailable for other
>> use).  This looks simple and safe (in sense of not anyhow breaking
>> unrelated use cases).
> 
> Yes.  Although there is absolutely no reason why swiotlb could not
> do the same.
> 
>> (2) it runs far suboptimally due to bounce-buffering almost all i/o,
>> despite lots of free memory in the area where direct DMA is possible.
> 
>> Addressing (2) looks much more difficult because different memory
>> allocation policy is required for that.
> 
> It's basically not possible.  Every piece of memory in a Linux
> kernel is a possible source of I/O, and depending on the workload
> type it might even be the prime source of I/O.
> 
>>> NVMe should never bounce, the fact that it currently possibly does
>>> for highmem pages is a bug.
>>
>> The entire topic is absolutely not related to highmem (i.e. memory not
>> directly addressable by 32-bit kernel).
> 
> I did not say this affects you, but thanks to your mail I noticed that
> NVMe has a suboptimal setting there.  Also note that highmem does not
> have to imply a 32-bit kernel, just physical memory that is not in the
> kernel mapping.
> 
>> What we are discussing is hw-originated restriction on where DMA is
>> possible.
> 
> Yes, where hw means the SOC, and not the actual I/O device, which is an
> important distinction.
> 
>>> Or even better remove the call to dma_set_mask_and_coherent with
>>> DMA_BIT_MASK(32).  NVMe is designed around having proper 64-bit DMA
>>> addressing, there is not point in trying to pretent it works without that
>>
>> Are you claiming that NVMe driver in mainline is intentionally designed
>> to not work on HW that can't do DMA to entire 64-bit space?
> 
> It is not intended to handle the case where the SOC / chipset
> can't handle DMA to all physical memory, yes.
> 
>> Such setups do exist and there is interest to make them working.
> 
> Sure, but it's not the job of the NVMe driver to work around such a broken
> system.  It's something your architecture code needs to do, maybe with
> a bit of core kernel support.
> 
>> Quite a few pages used for block I/O are allocated by filemap code - and
>> at allocation point it is known what inode page is being allocated for.
>> If this inode is from filesystem located on a known device with known
>> DMA limitations, this knowledge can be used to allocate page that can be
>> DMAed directly.
> 
> But in other cases we might never DMA to it.  Or we rarely DMA to it, say
> for a machine running databases or qemu and using lots of direct I/O. Or
> a storage target using its local alloc_pages buffers.
> 
>> Sure there are lots of cases when at allocation time there is no idea
>> what device will run DMA on page being allocated, or perhaps page is
>> going to be shared, or whatever. Such cases unavoidably require bounce
>> buffers if page ends to be used with device with DMA limitations. But
>> still there are cases when better allocation can remove need for bounce
>> buffers - without any hurt for other cases.
> 
> It takes your max 1GB DMA addressable memory away from other uses,
> and introduce the crazy highmem VM tuning issues we had with big
> 32-bit x86 systems in the past.
> 


NVMe vs DMA addressing limitations

2017-01-09 Thread Nikita Yushchenko
>> I believe the bounce buffering code you refer to is not in SATA/SCSI/MMC
>> but in block layer, in particular it should be controlled by
>> blk_queue_bounce_limit().  [Yes there is CONFIG_MMC_BLOCK_BOUNCE but it
>> is something completely different, namely it is for request merging for
>> hw not supporting scatter-gather].  And NVMe also uses block layer and
>> thus should get same support.
> 
> NVMe shouldn't have to call blk_queue_bounce_limit - 
> blk_queue_bounce_limit is to set the DMA addressing limit of the device.
> NVMe devices must support unlimited 64-bit addressing and thus calling
> blk_queue_bounce_limit from NVMe does not make sense.

I'm now working with HW that:
- is in no way "low end" or "obsolete", it has 4G of RAM and 8 CPU cores,
and is being manufactured and developed,
- has 75% of its RAM located beyond the first 4G of address space,
- can't physically handle incoming PCIe transactions addressed to memory
beyond 4G.

Swiotlb is used there, sure (once a bug in arm64 arch is patched). But
that setup still has at least two issues.

(1) it constantly runs out of swiotlb space; logs are full of warnings
despite rate limiting,

(2) it runs far suboptimally due to bounce-buffering almost all I/O,
despite lots of free memory in the area where direct DMA is possible.

I'm looking for a proper way to address these. Shooting the HW designer, as
you suggested elsewhere, doesn't look like a practical solution. Any better
ideas?


Per my current understanding, blk-level bounce buffering will at least
help with (1) - if done properly it will allocate bounce buffers within
the entire memory below 4G, not within the dedicated swiotlb space (which is
small, and enlarging it makes memory permanently unavailable for other
use).  This looks simple and safe (in the sense of not breaking
unrelated use cases in any way).

Addressing (2) looks much more difficult because a different memory
allocation policy is required for that.


> That being said currently the default for a queue without a call
> to blk_queue_make_request which does the wrong thing on highmem
> setups, so we should fix it.  In fact BLK_BOUNCE_HIGH as-is doesn't
> really make much sense these days as no driver should ever dereference
> pages passed to it directly.
> 
>> Maybe fixing that, together with making NVMe use this API, could stop it
>> from issuing dma_map()s of addresses beyond mask.
> 
> NVMe should never bounce, the fact that it currently possibly does
> for highmem pages is a bug.

The entire topic is absolutely not related to highmem (i.e. memory not
directly addressable by 32-bit kernel).

What we are discussing is hw-originated restriction on where DMA is
possible.


> Or even better remove the call to dma_set_mask_and_coherent with
> DMA_BIT_MASK(32).  NVMe is designed around having proper 64-bit DMA
> addressing, there is not point in trying to pretent it works without that

Are you claiming that the NVMe driver in mainline is intentionally designed
to not work on HW that can't do DMA to the entire 64-bit space?

Such setups do exist and there is interest in making them work.


> We need to kill off BLK_BOUNCE_HIGH, it just doesn't make sense to
> mix the highmem aspect with the addressing limits.  In fact the whole
> block bouncing scheme doesn't make much sense at all these days, we
> should rely on swiotlb instead.

I agree that centralized bounce buffering is better than
subsystem-implemented bounce buffering.

I still claim that even better - especially from a performance point of
view - is a memory allocation policy that is aware of HW limitations
and avoids bounce buffering at least when it is possible.


>> What I mean is some API to allocate memory for use with streaming DMA in
>> such way that bounce buffers won't be needed. There are many cases when
>> at buffer allocation time, it is already known that buffer will be used
>> for DMA with particular device. Bounce buffers will still be needed
>> cases when no such information is available at allocation time, or when
>> there is no directly-DMAable memory available at allocation time.
> 
> For block I/O that is never the case.

Quite a few pages used for block I/O are allocated by the filemap code - and
at the allocation point it is known which inode the page is being allocated
for. If this inode is from a filesystem located on a known device with known
DMA limitations, this knowledge can be used to allocate a page that can be
DMAed directly.
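As a rough illustration of that idea (this is not existing behaviour; wiring a
mapping's gfp mask to a device's DMA limits is an assumption of this sketch,
although mapping_set_gfp_mask() and the zone flags themselves are real
interfaces):

#include <linux/pagemap.h>
#include <linux/gfp.h>

/*
 * Illustration only: if the backing device is known to be limited to
 * 32-bit DMA, bias page-cache allocations for this inode's mapping
 * towards memory the device can address directly.  On arm64 of this
 * era ZONE_DMA covers the first 4G, so GFP_DMA is the relevant flag
 * there; other architectures would use __GFP_DMA32.
 */
static void bias_mapping_for_32bit_dma(struct address_space *mapping)
{
	mapping_set_gfp_mask(mapping,
			     mapping_gfp_mask(mapping) | GFP_DMA);
}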

Sure, there are lots of cases where at allocation time there is no idea
which device will run DMA on the page being allocated, or perhaps the page is
going to be shared, or whatever. Such cases unavoidably require bounce
buffers if the page ends up being used with a device with DMA limitations. But
still there are cases where better allocation can remove the need for bounce
buffers - without any hurt to other cases.


Nikita


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-09 Thread Nikita Yushchenko
[CCing NVMe maintainers since we are discussing issues in that driver]

>> With my patch applied and thus 32bit dma_mask set for NVMe device, I do
>> see high addresses passed to dma_map_*() routines and handled by
>> swiotlb. Thus your statement that behavior "succeed 64bit dma_set_mask()
>> operation but silently replace mask behind the scene" is required for
>> swiotlb to be used, does not match reality.
> 
> See my point about drivers that don't implement bounce buffering.
> Apparently NVMe is one of them, unlike the SATA/SCSI/MMC storage
> drivers that do their own thing.

I believe the bounce buffering code you refer to is not in SATA/SCSI/MMC
but in the block layer; in particular it should be controlled by
blk_queue_bounce_limit().  [Yes, there is CONFIG_MMC_BLOCK_BOUNCE, but it
is something completely different, namely it is for request merging for
hw not supporting scatter-gather].  And NVMe also uses the block layer and
thus should get the same support.

But blk_queue_bounce_limit() is somewhat broken: it has very strange
code under #if BITS_PER_LONG == 64 that makes setting max_addr to
0xffffffff not work if max_low_pfn is above 4G.

Maybe fixing that, together with making NVMe use this API, could stop it
from issuing dma_map()s of addresses beyond the mask.
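The 64-bit branch in question looks roughly like this (paraphrased from memory
of the v4.9-era block/blk-settings.c, so treat the details as approximate
rather than a verbatim quote):

/* Approximate shape of blk_queue_bounce_limit() on 64-bit at that time */
void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr)
{
	unsigned long b_pfn = max_addr >> PAGE_SHIFT;

#if BITS_PER_LONG == 64
	/*
	 * The clamp below is the problem: even if the caller asks for a
	 * 4G limit, bounce_pfn never goes below max_low_pfn, so on a
	 * machine whose directly-mapped memory extends above 4G the
	 * requested limit is effectively ignored.
	 */
	q->limits.bounce_pfn = max(max_low_pfn, b_pfn);
#else
	q->limits.bounce_pfn = b_pfn;
#endif
}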


> What I think happened here in chronological order is:
> 
> - In the old days, 64-bit architectures tended to use an IOMMU 
>   all the time to work around 32-bit limitations on DMA masters
> - Some architectures had no IOMMU that fully solved this and the
>   dma-mapping API required drivers to set the right mask and check
>   the return code. If this failed, the driver needed to use its
>   own bounce buffers as network and scsi do. See also the
>   grossly misnamed "PCI_DMA_BUS_IS_PHYS" macro.
> - As we never had support for bounce buffers in all drivers, and
>   early 64-bit Intel machines had no IOMMU, the swiotlb code was
>   introduced as a workaround, so we can use the IOMMU case without
>   driver specific bounce buffers everywhere
> - As most of the important 64-bit architectures (x86, arm64, powerpc)
>   now always have either IOMMU or swiotlb enabled, drivers like
>   NVMe started relying on it, and no longer handle a dma_set_mask
>   failure properly.

... and architectures started to add to this breakage, not handling
dma_set_mask() as documented.


As for PCI_DMA_BUS_IS_PHYS - ironically, what all current usages of this
macro in the kernel do is *disable* bounce buffers in the block layer if
PCI_DMA_BUS_IS_PHYS is zero.  Defining it to zero (as arm64 currently
does) on a system with memory above 4G makes all block drivers depend
on swiotlb (or iommu). The affected drivers are SCSI and IDE.

> We may need to audit how drivers typically handle dma_set_mask()
> failure. The NVMe driver in its current state will probably cause
> silent data corruption when used on a 64-bit architecture that has
> a 32-bit bus but neither swiotlb nor iommu enabled at runtime.

With the current code NVMe causes system memory breakage even if swiotlb is
there - because its dma_set_mask_and_coherent(DMA_BIT_MASK(64)) call
has the effect of silently disabling swiotlb.


> I would argue that the driver should be fixed to either refuse
> working in that configuration to avoid data corruption, or that
> it should implement bounce buffering like SCSI does.

Difference from "SCSI" (actually - from block drivers that work) is in
that dma_set_mask_and_coherent(DMA_BIT_MASK(64)) call: driver that does
not do it works, driver that does it fails.

Per documentation, driver *should* do it if it's hardware supports
64-bit dma, and platform *should* either fail this call, or ensure that
64-bit addresses can be dma_map()ed successfully.

So what we have on arm64 is - drivers that follow documented procedure
fail, drivers that don't follow it work, That's nonsense.
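For reference, the procedure documented in DMA-API-HOWTO.txt is roughly the
following fallback sequence (a sketch; error handling details vary per driver):

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* The documented pattern: try 64-bit, fall back to 32-bit, else give up */
static int example_setup_dma_masks(struct device *dev)
{
	if (!dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
		return 0;	/* full 64-bit DMA is usable */

	if (!dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)))
		return 0;	/* fall back to 32-bit DMA */

	dev_warn(dev, "No suitable DMA available\n");
	return -EIO;		/* do not use DMA on this device */
}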

> If we make it
> simply not work, then your suggestion of making dma_set_mask()
> fail will break your system in a different way.

A proper fix should fix *both* the architecture and NVMe.

- the architecture should stop breaking 64-bit DMA when a driver attempts to
set a 64-bit dma mask,

- NVMe should issue a proper blk_queue_bounce_limit() call based on the
actually-set mask,

- and blk_queue_bounce_limit() should also be fixed to actually set a
0xffffffff limit, instead of replacing it with (max_low_pfn <<
PAGE_SHIFT) as it does now.


>> Still current code does not work, thus fix is needed.
>>
>> Perhaps need to introduce some generic API to "allocate memory best
>> suited for DMA to particular device", and fix allocation points (in
>> drivers, filesystems, etc) to use it. Such an API could try to allocate
>> area that can be DMAed by hardware, and fallback to other memory that
>> can be used via swiotlb or other bounce buffer implementation.
> 
> The DMA mapping API is meant to do this, but we can definitely improve
> it or clarify some of the rules.

DMA mapping API can't help here, it's about mapping, not about allocation.

What I 

[PATCH v2] arm64: do not set dma masks that device connection can't handle

2017-01-08 Thread Nikita Yushchenko
It is possible that a device is capable of 64-bit DMA addresses and the
device driver tries to set a wide DMA mask, but the bridge or bus used to
connect the device to the system can't handle wide addresses.

With swiotlb, memory above 4G can still be used by drivers for streaming
DMA, but *dev->dma_mask and dev->coherent_dma_mask must still keep values
that the hardware handles physically.

This patch enforces that. Based on the original version by
Arnd Bergmann <a...@arndb.de>, extended with coherent mask handling.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
CC: Arnd Bergmann <a...@arndb.de>
---
Changes since v1:
- fixed issues noted by Sergei Shtylyov <sergei.shtyl...@cogentembedded.com>
  - save mask, not size
  - remove double empty line

 arch/arm64/Kconfig  |  3 +++
 arch/arm64/include/asm/device.h |  1 +
 arch/arm64/mm/dma-mapping.c | 51 +
 3 files changed, 55 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..afb2c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 243ef25..a57e7bb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -22,6 +22,7 @@ struct dev_archdata {
void *iommu;/* private IOMMU data */
 #endif
bool dma_coherent;
+   u64 parent_dma_mask;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e040827..5ab15ce 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -352,6 +352,30 @@ static int __swiotlb_dma_supported(struct device *hwdev, 
u64 mask)
return 1;
 }
 
+static int __swiotlb_set_dma_mask(struct device *dev, u64 mask)
+{
+   /* device is not DMA capable */
+   if (!dev->dma_mask)
+   return -EIO;
+
+   /* mask is below swiotlb bounce buffer, so fail */
+   if (!swiotlb_dma_supported(dev, mask))
+   return -EIO;
+
+   /*
+* because of the swiotlb, we can return success for
+* larger masks, but need to ensure that bounce buffers
+* are used above parent_dma_mask, so set that as
+* the effective mask.
+*/
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   *dev->dma_mask = mask;
+
+   return 0;
+}
+
 static struct dma_map_ops swiotlb_dma_ops = {
.alloc = __dma_alloc,
.free = __dma_free,
@@ -367,8 +391,23 @@ static struct dma_map_ops swiotlb_dma_ops = {
.sync_sg_for_device = __swiotlb_sync_sg_for_device,
.dma_supported = __swiotlb_dma_supported,
.mapping_error = swiotlb_dma_mapping_error,
+   .set_dma_mask = __swiotlb_set_dma_mask,
 };
 
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (!dma_supported(dev, mask))
+   return -EIO;
+
+   if (get_dma_ops(dev) == &swiotlb_dma_ops &&
+   mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   dev->coherent_dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
 static int __init atomic_pool_init(void)
 {
pgprot_t prot = __pgprot(PROT_NORMAL_NC);
@@ -958,6 +997,18 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
if (!dev->archdata.dma_ops)
dev->archdata.dma_ops = &swiotlb_dma_ops;
 
+   /*
+* we don't yet support buses that have a non-zero mapping.
+*  Let's hope we won't need it
+*/
+   WARN_ON(dma_base != 0);
+
+   /*
+* Whatever the parent bus can set. A device must not set
+* a DMA mask larger than this.
+*/
+   dev->archdata.parent_dma_mask = size - 1;
+
dev->archdata.dma_coherent = coherent;
__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 }
-- 
2.1.4



Re: [PATCH] arm64: do not set dma masks that device connection can't handle

2017-01-08 Thread Nikita Yushchenko
>> +if (mask > dev->archdata.parent_dma_mask)
>> +mask = dev->archdata.parent_dma_mask;
>> +
>> +
> 
>One empty line is enough...

Ok

>> +/*
>> + * Whatever the parent bus can set. A device must not set
>> + * a DMA mask larger than this.
>> + */
>> +dev->archdata.parent_dma_mask = size;
> 
>Not 'size - 1'?

Good question.

Indeed, of_dma_configure() calls arch_setup_dma_ops() with a size, not a
mask, which implies the '-1' is needed here. Although a better fix may be to
change the caller side - to make the DMA_BIT_MASK(64) case cleaner.
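A one-line worked example of the size/mask relation being discussed
(illustrative values only):

	/* a window described by its length vs. by its highest addressable byte */
	u64 size = 1ULL << 32;	/* 4G window, as passed by of_dma_configure() */
	u64 mask = size - 1;	/* 0xffffffff, i.e. DMA_BIT_MASK(32) */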

Will repost the patch.

Nikita


[PATCH] arm64: do not set dma masks that device connection can't handle

2017-01-06 Thread Nikita Yushchenko
It is possible that a device is capable of 64-bit DMA addresses and the
device driver tries to set a wide DMA mask, but the bridge or bus used to
connect the device to the system can't handle wide addresses.

With swiotlb, memory above 4G can still be used by drivers for streaming
DMA, but *dev->dma_mask and dev->coherent_dma_mask must still keep values
that the hardware handles physically.

This patch enforces that. Based on the original version by
Arnd Bergmann <a...@arndb.de>, extended with coherent mask handling.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
CC: Arnd Bergmann <a...@arndb.de>
---

... now with initially missed change in arch_setup_dma_ops() ...

 arch/arm64/Kconfig  |  3 +++
 arch/arm64/include/asm/device.h |  1 +
 arch/arm64/mm/dma-mapping.c | 52 +
 3 files changed, 56 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..afb2c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 243ef25..a57e7bb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -22,6 +22,7 @@ struct dev_archdata {
void *iommu;/* private IOMMU data */
 #endif
bool dma_coherent;
+   u64 parent_dma_mask;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 290a84f..09c7900 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -352,6 +352,31 @@ static int __swiotlb_dma_supported(struct device *hwdev, 
u64 mask)
return 1;
 }
 
+static int __swiotlb_set_dma_mask(struct device *dev, u64 mask)
+{
+   /* device is not DMA capable */
+   if (!dev->dma_mask)
+   return -EIO;
+
+   /* mask is below swiotlb bounce buffer, so fail */
+   if (!swiotlb_dma_supported(dev, mask))
+   return -EIO;
+
+   /*
+* because of the swiotlb, we can return success for
+* larger masks, but need to ensure that bounce buffers
+* are used above parent_dma_mask, so set that as
+* the effective mask.
+*/
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+
+   *dev->dma_mask = mask;
+
+   return 0;
+}
+
 static struct dma_map_ops swiotlb_dma_ops = {
.alloc = __dma_alloc,
.free = __dma_free,
@@ -367,8 +392,23 @@ static struct dma_map_ops swiotlb_dma_ops = {
.sync_sg_for_device = __swiotlb_sync_sg_for_device,
.dma_supported = __swiotlb_dma_supported,
.mapping_error = swiotlb_dma_mapping_error,
+   .set_dma_mask = __swiotlb_set_dma_mask,
 };
 
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (!dma_supported(dev, mask))
+   return -EIO;
+
+   if (get_dma_ops(dev) == &swiotlb_dma_ops &&
+   mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   dev->coherent_dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
 static int __init atomic_pool_init(void)
 {
pgprot_t prot = __pgprot(PROT_NORMAL_NC);
@@ -957,6 +997,18 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
u64 size,
if (!dev->archdata.dma_ops)
dev->archdata.dma_ops = &swiotlb_dma_ops;
 
+   /*
+* we don't yet support buses that have a non-zero mapping.
+*  Let's hope we won't need it
+*/
+   WARN_ON(dma_base != 0);
+
+   /*
+* Whatever the parent bus can set. A device must not set
+* a DMA mask larger than this.
+*/
+   dev->archdata.parent_dma_mask = size;
+
dev->archdata.dma_coherent = coherent;
__iommu_setup_dma_ops(dev, dma_base, size, iommu);
 }
-- 
2.1.4



[PATCH] arm64: do not set dma masks that device connection can't handle

2017-01-06 Thread Nikita Yushchenko
It is possible that a device is capable of 64-bit DMA addresses and the
device driver tries to set a wide DMA mask, but the bridge or bus used to
connect the device to the system can't handle wide addresses.

With swiotlb, memory above 4G can still be used by drivers for streaming
DMA, but *dev->dma_mask and dev->coherent_dma_mask must still keep values
that the hardware handles physically.

This patch enforces that. Based on the original version by
Arnd Bergmann <a...@arndb.de>, extended with coherent mask handling.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
CC: Arnd Bergmann <a...@arndb.de>
---
 arch/arm64/Kconfig  |  3 +++
 arch/arm64/include/asm/device.h |  1 +
 arch/arm64/mm/dma-mapping.c | 40 
 3 files changed, 44 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1117421..afb2c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -216,6 +216,9 @@ config NEED_DMA_MAP_STATE
 config NEED_SG_DMA_LENGTH
def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+   def_bool y
+
 config SMP
def_bool y
 
diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
index 243ef25..a57e7bb 100644
--- a/arch/arm64/include/asm/device.h
+++ b/arch/arm64/include/asm/device.h
@@ -22,6 +22,7 @@ struct dev_archdata {
void *iommu;/* private IOMMU data */
 #endif
bool dma_coherent;
+   u64 parent_dma_mask;
 };
 
 struct pdev_archdata {
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 290a84f..be3632e 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -352,6 +352,31 @@ static int __swiotlb_dma_supported(struct device *hwdev, 
u64 mask)
return 1;
 }
 
+static int __swiotlb_set_dma_mask(struct device *dev, u64 mask)
+{
+   /* device is not DMA capable */
+   if (!dev->dma_mask)
+   return -EIO;
+
+   /* mask is below swiotlb bounce buffer, so fail */
+   if (!swiotlb_dma_supported(dev, mask))
+   return -EIO;
+
+   /*
+* because of the swiotlb, we can return success for
+* larger masks, but need to ensure that bounce buffers
+* are used above parent_dma_mask, so set that as
+* the effective mask.
+*/
+   if (mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+
+   *dev->dma_mask = mask;
+
+   return 0;
+}
+
 static struct dma_map_ops swiotlb_dma_ops = {
.alloc = __dma_alloc,
.free = __dma_free,
@@ -367,8 +392,23 @@ static struct dma_map_ops swiotlb_dma_ops = {
.sync_sg_for_device = __swiotlb_sync_sg_for_device,
.dma_supported = __swiotlb_dma_supported,
.mapping_error = swiotlb_dma_mapping_error,
+   .set_dma_mask = __swiotlb_set_dma_mask,
 };
 
+int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+   if (!dma_supported(dev, mask))
+   return -EIO;
+
+   if (get_dma_ops(dev) == &swiotlb_dma_ops &&
+   mask > dev->archdata.parent_dma_mask)
+   mask = dev->archdata.parent_dma_mask;
+
+   dev->coherent_dma_mask = mask;
+   return 0;
+}
+EXPORT_SYMBOL(dma_set_coherent_mask);
+
 static int __init atomic_pool_init(void)
 {
pgprot_t prot = __pgprot(PROT_NORMAL_NC);
-- 
2.1.4



Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-06 Thread Nikita Yushchenko
>>> Just a guess, but if the inbound translation windows in the host
>>> bridge are wider than 32-bit, the reason for setting up a single
>>> 32-bit window is probably because that is what the parent bus supports.

I've re-checked the rcar-pcie hardware documentation.

It indeed mentions that the AXI bus it sits on is 32-bit.


>> Well anyway, applying a patch similar to yours will fix the pcie-rcar + nvme
>> case - thus I don't object :)   But it can break other cases ...
>>
>> But why do you hook at set_dma_mask() and overwrite the mask inside, instead
>> of hooking at dma_supported() and rejecting the unsupported mask?
>>
>> I think the latter is better, because it lets drivers handle the unsupported
>> high-dma case, as documented in DMA-API-HOWTO.
> 
> I think the behavior I put in there is required for swiotlb to make
> sense, otherwise you would rely on the driver to handle dma_set_mask()
> failure gracefully with its own bounce buffers (as network and
> scsi drivers do but others don't).
> 
> Having swiotlb or iommu enabled should result in dma_set_mask() always
> succeeding unless the mask is too small to cover the swiotlb
> bounce buffer area or the iommu virtual address space. This behavior
> is particularly important in case the bus address space is narrower
> than 32-bit, as we have to guarantee that the fallback to 32-bit
> DMA always succeeds. There are also a lot of drivers that try to
> set a 64-bit mask but don't implement bounce buffers for streaming
> mappings if that fails, and swiotlb is what we use to make those
> drivers work.
> 
> And yes, the API is a horrible mess.

With my patch applied, and thus a 32-bit dma_mask set for the NVMe device, I
do see high addresses passed to dma_map_*() routines and handled by swiotlb.
So your statement that the behaviour "succeed the 64-bit dma_set_mask()
operation but silently replace the mask behind the scenes" is required for
swiotlb to be used does not match reality.
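
For example (hypothetical snippet, function name made up; assumes
<linux/dma-mapping.h>, <linux/slab.h> and <linux/sizes.h>): a buffer that
kmalloc() placed above 4G still maps fine with the 32-bit mask - swiotlb
bounces it and the returned bus address honours the mask:

	static int check_high_buffer(struct device *dev)
	{
		void *buf = kmalloc(SZ_4K, GFP_KERNEL);	/* may land above 4G */
		dma_addr_t bus;

		if (!buf)
			return -ENOMEM;

		bus = dma_map_single(dev, buf, SZ_4K, DMA_TO_DEVICE);
		if (dma_mapping_error(dev, bus)) {
			kfree(buf);
			return -EIO;
		}

		/* bounced by swiotlb: the bus address fits under *dev->dma_mask */
		WARN_ON(bus + SZ_4K - 1 > *dev->dma_mask);

		dma_unmap_single(dev, bus, SZ_4K, DMA_TO_DEVICE);
		kfree(buf);
		return 0;
	}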

It can be interpreted as breakage elsewhere, but it's hard to point at a
particular "root cause". The entire infrastructure for allocating and using
DMA memory is messy.

Still, the current code does not work, so a fix is needed.

Perhaps we need to introduce some generic API to "allocate memory best
suited for DMA to a particular device", and fix the allocation points (in
drivers, filesystems, etc.) to use it. Such an API could try to allocate an
area that the hardware can DMA to directly, and fall back to other memory
that has to be used via swiotlb or some other bounce buffer implementation.
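
A rough sketch of what such a helper might look like - the name and the
policy are made up, this is only to illustrate the idea:

	/* Prefer memory the device can reach directly; anything else will
	 * have to go through swiotlb (or another bounce implementation). */
	static void *alloc_for_dma(struct device *dev, size_t size, gfp_t gfp)
	{
		unsigned int order = get_order(size);
		struct page *page = NULL;

		if (dma_get_mask(dev) <= DMA_BIT_MASK(32))
			page = alloc_pages(gfp | GFP_DMA32, order);

		if (!page)	/* no low memory, or a wide mask: take anything */
			page = alloc_pages(gfp, order);

		return page ? page_address(page) : NULL;
	}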

But for now, we have to stay with DMA masks. I will follow up with a patch
based on yours, but with coherent mask handling added.

Nikita


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-04 Thread Nikita Yushchenko
 For OF platforms, this is called via of_dma_configure(), that checks
 dma-ranges of node that is *parent* for host bridge. Host bridge
 currently does not control this at all.
>>>
>>> We need to think about this a bit. Is it actually the PCI host
>>> bridge that limits the ranges here, or the bus that it is connected
>>> to. In the latter case, the caller needs to be adapted to handle
>>> both.
>>
>> In r-car case, I'm not sure what is the source of limitation at physical
>> level.
>>
>> pcie-rcar driver configures ranges for PCIe inbound transactions based
>> on dma-ranges property in it's device tree node. In the current device
>> tree for this platform, that only contains one range and it is in lower
>> memory.
>>
>> NVMe driver tries i/o to kmalloc()ed area. That returns 0x5
>> addresses here. As a quick experiment, I tried to add second range to
>> pcie-rcar's dma-ranges to cover 0x5 area - but that did not make
>> DMA to high addresses working.
>>
>> My current understanding is that host bridge hardware module can't
>> handle inbound transactions to PCI addresses above 4G - and this
>> limitations comes from host bridge itself.
>>
>> I've read somewhere in the lists that pcie-rcar hardware is "32-bit" -
>> but I don't remember where, and don't know lowlevel details. Maybe
>> somebody from linux-renesas can elaborate?
> 
> Just a guess, but if the inbound translation windows in the host
> bridge are wider than 32-bit, the reason for setting up a single
> 32-bit window is probably because that is what the parent bus supports.

Well anyway, applying a patch similar to yours will fix the pcie-rcar + nvme
case - thus I don't object :)   But it can break other cases ...

But why do you hook into set_dma_mask() and overwrite the mask there, instead
of hooking into dma_supported() and rejecting the unsupported mask?

I think the latter is better, because it lets drivers handle the unsupported
high-DMA case, as documented in DMA-API-HOWTO.
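
I.e. a driver that gets -EIO for the wide mask can still fall back and keep
working, as DMA-API-HOWTO describes. Roughly (sketch, helper name made up):

	static int example_dma_init(struct device *dev)
	{
		if (!dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
			return 0;	/* wide DMA is fine */

		if (!dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)))
			return 0;	/* fall back, possibly with driver-side bouncing */

		dev_err(dev, "no usable DMA configuration\n");
		return -EIO;
	}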


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-04 Thread Nikita Yushchenko
>> For OF platforms, this is called via of_dma_configure(), that checks
>> dma-ranges of node that is *parent* for host bridge. Host bridge
>> currently does not control this at all.
> 
> We need to think about this a bit. Is it actually the PCI host
> bridge that limits the ranges here, or the bus that it is connected
> to. In the latter case, the caller needs to be adapted to handle
> both.

In the R-Car case, I'm not sure what the source of the limitation is at the
physical level.

The pcie-rcar driver configures ranges for PCIe inbound transactions based
on the dma-ranges property in its device tree node. In the current device
tree for this platform, that property contains only one range, and it is in
lower memory.

The NVMe driver tries I/O to a kmalloc()ed area. That returns 0x5
addresses here. As a quick experiment, I tried to add a second range to
pcie-rcar's dma-ranges to cover the 0x5 area - but that did not make
DMA to high addresses work.

My current understanding is that the host bridge hardware module can't
handle inbound transactions to PCI addresses above 4G - and this
limitation comes from the host bridge itself.

I've read somewhere in the lists that the pcie-rcar hardware is "32-bit" -
but I don't remember where, and don't know the low-level details. Maybe
somebody from linux-renesas can elaborate?


>> In current device trees no dma-ranges is defined for nodes that are
>> parents to pci host bridges. This will make of_dma_configure() to fall
>> back to 32-bit size for all devices on all current platforms.  Thus
>> applying this patch will immediately break 64-bit dma masks on all
>> hardware that supports it.
> 
> No, it won't break it, it will just fall back to swiotlb for all the
> ones that are lacking the dma-ranges property. I think this is correct
> behavior.

I'd say - for all the ones whose parent lacks a dma-ranges property.

As of 4.10-rc2, I see only two definitions of wide parent dma-ranges
under arch/arm64/boot/dts/ - in amd/amd-seattle-soc.dtsi and
apm/apm-storm.dtsi

Are these the only arm64 platforms that can do DMA to high addresses?
I'm not an arm64 expert, but I'd be surprised if that's the case.


>> Also related: dma-ranges property used by several pci host bridges is
>> *not* compatible with "legacy" dma-ranges parsed by of_get_dma_range() -
>> former uses additional flags word at beginning.
> 
> Can you elaborate? Do we have PCI host bridges that use wrongly formatted
> dma-ranges properties?

of_dma_get_range() expects the <dma_addr cpu_addr size> format.

pcie-rcar.c, pci-rcar-gen2.c, pci-xgene.c and pcie-iproc.c from
drivers/pci/host/ all parse dma-ranges using of_pci_range_parser, which
uses the <flags pci_addr cpu_addr size> format - i.e. something different
from what of_dma_get_range() uses.


Nikita


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-03 Thread Nikita Yushchenko
> commit 9a57d58d116800a535510053136c6dd7a9c26e25
> Author: Arnd Bergmann 
> Date:   Tue Nov 17 14:06:55 2015 +0100
> 
> [EXPERIMENTAL] ARM64: check implement dma_set_mask
> 
> Needs work for coherent mask
> 
> Signed-off-by: Arnd Bergmann 

Unfortunately this is far from complete.

> @@ -957,6 +983,18 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>   if (!dev->archdata.dma_ops)
>   dev->archdata.dma_ops = &swiotlb_dma_ops;
>  
> + /*
> +  * we don't yet support buses that have a non-zero mapping.
> +  *  Let's hope we won't need it
> +  */
> + WARN_ON(dma_base != 0);
> +
> + /*
> +  * Whatever the parent bus can set. A device must not set
> +  * a DMA mask larger than this.
> +  */
> + dev->archdata.parent_dma_mask = size;
> +

... because the size/mask passed here for PCI devices is meaningless.

For OF platforms, this is called via of_dma_configure(), which checks the
dma-ranges of the node that is the *parent* of the host bridge. The host
bridge currently does not control this at all.

In current device trees, no dma-ranges property is defined for nodes that
are parents of PCI host bridges. This will make of_dma_configure() fall
back to a 32-bit size for all devices on all current platforms.  Thus
applying this patch will immediately break 64-bit DMA masks on all
hardware that supports them.
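
For reference, the fallback in question looks roughly like this (condensed
from of_dma_configure(), quoting from memory, so details may differ):

	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
	if (ret < 0) {
		/* parent has no usable dma-ranges: assume a 32-bit window */
		dma_addr = offset = 0;
		size = dev->coherent_dma_mask + 1;
	} else {
		offset = PFN_DOWN(paddr - dma_addr);
	}
	...
	arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);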


Also related: the dma-ranges property used by several PCI host bridges is
*not* compatible with the "legacy" dma-ranges parsed by of_dma_get_range() -
the former uses an additional flags word at the beginning.


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-03 Thread Nikita Yushchenko
>> It is possible that PCI device supports 64-bit DMA addressing, and thus
>> it's driver sets device's dma_mask to DMA_BIT_MASK(64), however PCI host
>> bridge has limitations on inbound transactions addressing. Example of
>> such setup is NVME SSD device connected to RCAR PCIe controller.
>>
>> Previously there was attempt to handle this via bus notifier: after
>> driver is attached to PCI device, bridge driver gets notifier callback,
>> and resets dma_mask from there. However, this is racy: PCI device driver
>> could already allocate buffers and/or start i/o in probe routine.
>> In NVME case, i/o is started in workqueue context, and this race gives
>> "sometimes works, sometimes not" effect.
>>
>> Proper solution should make driver's dma_set_mask() call to fail if host
>> bridge can't support mask being set.
>>
>> This patch makes __swiotlb_dma_supported() to check mask being set for
>> PCI device against dma_mask of struct device corresponding to PCI host
>> bridge (one with name "pci:YY"), if that dma_mask is set.
>>
>> This is the least destructive approach: currently dma_mask of that device
>> object is not used anyhow, thus all existing setups will work as before,
>> and modification is required only in actually affected components -
>> driver of particular PCI host bridge, and dma_map_ops of particular
>> platform.
>>
>> Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
>> ---
>>  arch/arm64/mm/dma-mapping.c | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
>> index 290a84f..49645277 100644
>> --- a/arch/arm64/mm/dma-mapping.c
>> +++ b/arch/arm64/mm/dma-mapping.c
>> @@ -28,6 +28,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  
>> @@ -347,6 +348,16 @@ static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
>>  
>>  static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
>>  {
>> +#ifdef CONFIG_PCI
>> +if (dev_is_pci(hwdev)) {
>> +struct pci_dev *pdev = to_pci_dev(hwdev);
>> +struct pci_host_bridge *br = pci_find_host_bridge(pdev->bus);
>> +
>> +if (br->dev.dma_mask && (*br->dev.dma_mask) &&
>> +(mask & (*br->dev.dma_mask)) != mask)
>> +return 0;
>> +}
>> +#endif
> 
> Hmm, but this makes it look like the problem is both arm64 and swiotlb
> specific, when in reality it's not. Perhaps another hack you could try
> would be to register a PCI bus notifier in the host bridge looking for
> BUS_NOTIFY_BIND_DRIVER, then you could proxy the DMA ops for each child
> device before the driver has probed, but adding a dma_set_mask callback
> to limit the mask to what you need?

This is what the Renesas BSP tries to do, and it does not work.

BUS_NOTIFY_BIND_DRIVER arrives after the driver's probe routine exits, but
I/O can be started before that.
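
For reference, the notifier-based workaround has roughly this shape (a
sketch, not the actual BSP code; the names are made up):

	static int limit_pci_dma_mask(struct notifier_block *nb,
				      unsigned long action, void *data)
	{
		struct device *dev = data;

		/* clamp the mask once the core reports a driver binding */
		if (action == BUS_NOTIFY_BIND_DRIVER)
			dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));

		return NOTIFY_DONE;
	}

	static struct notifier_block limit_nb = {
		.notifier_call = limit_pci_dma_mask,
	};

	/* in the host bridge driver: */
	bus_register_notifier(&pci_bus_type, &limit_nb);

Whatever the exact event used, the clamp happens outside the device driver's
own control flow, which is where the race comes from.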


[PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2016-12-29 Thread Nikita Yushchenko
It is possible that a PCI device supports 64-bit DMA addressing, and thus
its driver sets the device's dma_mask to DMA_BIT_MASK(64), while the PCI host
bridge has limitations on inbound transaction addressing. An example of
such a setup is an NVMe SSD device connected to the R-Car PCIe controller.

Previously there was an attempt to handle this via a bus notifier: after a
driver is attached to the PCI device, the bridge driver gets a notifier
callback and resets the dma_mask from there. However, this is racy: the PCI
device driver could already allocate buffers and/or start I/O in its probe
routine. In the NVMe case, I/O is started in workqueue context, and this
race gives a "sometimes works, sometimes not" effect.

A proper solution should make the driver's dma_set_mask() call fail if the
host bridge can't support the mask being set.

This patch makes __swiotlb_dma_supported() check the mask being set for a
PCI device against the dma_mask of the struct device corresponding to the
PCI host bridge (the one named "pci:YY"), if that dma_mask is set.

This is the least destructive approach: currently the dma_mask of that device
object is not used anyway, so all existing setups will work as before,
and modification is required only in the actually affected components -
the driver of the particular PCI host bridge, and the dma_map_ops of the
particular platform.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
---
 arch/arm64/mm/dma-mapping.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 290a84f..49645277 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -347,6 +348,16 @@ static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
 
 static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
+#ifdef CONFIG_PCI
+   if (dev_is_pci(hwdev)) {
+   struct pci_dev *pdev = to_pci_dev(hwdev);
+   struct pci_host_bridge *br = pci_find_host_bridge(pdev->bus);
+
+   if (br->dev.dma_mask && (*br->dev.dma_mask) &&
+   (mask & (*br->dev.dma_mask)) != mask)
+   return 0;
+   }
+#endif
if (swiotlb)
return swiotlb_dma_supported(hwdev, mask);
return 1;
-- 
2.1.4



[PATCH 2/2] rcar-pcie: set host bridge's DMA mask

2016-12-29 Thread Nikita Yushchenko
This gives the platform DMA mapping code a chance to disallow setting a
device's DMA mask to something that the host bridge can't support.

Signed-off-by: Nikita Yushchenko <nikita.yo...@cogentembedded.com>
---
 drivers/pci/host/pcie-rcar.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/host/pcie-rcar.c b/drivers/pci/host/pcie-rcar.c
index aca85be..b1edc3c 100644
--- a/drivers/pci/host/pcie-rcar.c
+++ b/drivers/pci/host/pcie-rcar.c
@@ -451,6 +451,7 @@ static int rcar_pcie_enable(struct rcar_pcie *pcie)
 {
struct device *dev = pcie->dev;
struct pci_bus *bus, *child;
+   struct pci_host_bridge *bridge;
LIST_HEAD(res);
 
/* Try setting 5 GT/s link speed */
@@ -480,6 +481,10 @@ static int rcar_pcie_enable(struct rcar_pcie *pcie)
	list_for_each_entry(child, &bus->children, node)
pcie_bus_configure_settings(child);
 
+   bridge = pci_find_host_bridge(bus);
+   bridge->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+   bridge->dev.dma_mask = &bridge->dev.coherent_dma_mask;
+
pci_bus_add_devices(bus);
 
return 0;
-- 
2.1.4