Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-30 Thread Oza Oza
On Tue, Mar 28, 2017 at 7:43 PM, Rob Herring  wrote:
> On Tue, Mar 28, 2017 at 12:27 AM, Oza Oza  wrote:
>> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
>>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  
>>> wrote:
 it is possible that PCI device supports 64-bit DMA addressing,
 and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
 however PCI host bridge may have limitations on the inbound
 transaction addressing. As an example, consider NVME SSD device
 connected to iproc-PCIe controller.

 Currently, the IOMMU DMA ops only considers PCI device dma_mask
 when allocating an IOVA. This is particularly problematic on
 ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
 PA for in-bound transactions only after PCI Host has forwarded
 these transactions on SOC IO bus. This means on such ARM/ARM64
 SOCs the IOVA of in-bound transactions has to honor the addressing
 restrictions of the PCI Host.

 current pcie framework and of framework integration assumes dma-ranges
 in a way where memory-mapped devices define their dma-ranges.
 dma-ranges: (child-bus-address, parent-bus-address, length).

 but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
 dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
>>>
>>> If you implement a common function, then I expect to see other users
>>> converted to use it. There's also PCI hosts in arch/powerpc that parse
>>> dma-ranges.
>>
>> the common function should be similar to what
>> of_pci_get_host_bridge_resources is doing right now.
>> it parses the ranges property right now.
>>
>> the new function would look like the following.
>>
>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
>> where resources would return the dma-ranges.
>>
>> but right now if you see the patch, of_dma_configure calls the new
>> function, which actually returns the largest possible size.
>> so this new function has to be generic in a way where other PCI hosts
>> can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
>> use it for sure.
>>
>> although having powerpc use it is a separate exercise, since I do
>> not have any access to other PCI hosts such as powerpc. but we can
>> work it out with them on this forum if required.
>
> You don't need h/w. You can analyze what parts are common, write
> patches to convert to common implementation, and build test. The PPC
> and rcar folks can test on h/w.
>
> Rob


Hi Rob,

I have addressed your comment and made a generic function.
Gentle request to have a look at the following approach and patches.

[RFC PATCH 2/2] of/pci: call pci specific dma-ranges instead of memory-mapped.
[RFC PATCH 1/2] of/pci: implement inbound dma-ranges for PCI

I have tested this on our platform, with and without iommu, and it seems to work.

let me know your view on this.

Regards,
Oza.


Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-29 Thread Oza Oza
On Wed, Mar 29, 2017 at 10:13 AM, Oza Oza  wrote:
> On Tue, Mar 28, 2017 at 7:59 PM, Robin Murphy  wrote:
>> On 28/03/17 06:27, Oza Oza wrote:
>>> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
 On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  
 wrote:
> it is possible that PCI device supports 64-bit DMA addressing,
> and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
> however PCI host bridge may have limitations on the inbound
> transaction addressing. As an example, consider NVME SSD device
> connected to iproc-PCIe controller.
>
> Currently, the IOMMU DMA ops only considers PCI device dma_mask
> when allocating an IOVA. This is particularly problematic on
> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
> PA for in-bound transactions only after PCI Host has forwarded
> these transactions on SOC IO bus. This means on such ARM/ARM64
> SOCs the IOVA of in-bound transactions has to honor the addressing
> restrictions of the PCI Host.
>
> current pcie framework and of framework integration assumes dma-ranges
> in a way where memory-mapped devices define their dma-ranges.
> dma-ranges: (child-bus-address, parent-bus-address, length).
>
> but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
> dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

 If you implement a common function, then I expect to see other users
 converted to use it. There's also PCI hosts in arch/powerpc that parse
 dma-ranges.
>>>
>>> the common function should be similar to what
>>> of_pci_get_host_bridge_resources is doing right now.
>>> it parses the ranges property right now.
>>>
>>> the new function would look like the following.
>>>
>>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
>>> where resources would return the dma-ranges.
>>>
>>> but right now if you see the patch, of_dma_configure calls the new
>>> function, which actually returns the largest possible size.
>>> so this new function has to be generic in a way where other PCI hosts
>>> can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
>>> use it for sure.
>>>
>>> although having powerpc use it is a separate exercise, since I do
>>> not have any access to other PCI hosts such as powerpc. but we can
>>> work it out with them on this forum if required.
>>>
>>> so overall, of_pci_get_dma_ranges has to serve the following 2 purposes.
>>>
>>> 1) it has to return largest possible size to of_dma_configure to
>>> generate largest possible dma_mask.
>>>
>>> 2) it also has to return resources (dma-ranges) parsed, to the users.
>>>
>>> so to address the above needs:
>>>
>>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head
>>> *resources, u64 *size)
>>>
>>> dev -> device node.
>>> resources -> dma-ranges in allocated list.
>>> size -> highest possible size to generate possible dma_mask for
>>> of_dma_configure.
>>>
>>> let me know how this sounds.
>>
>> Note that the point of passing PCI host bridges into of_dma_configure()
>> in the first place was to avoid having some separate PCI-specific path
>> for DMA configuration. I worry that introducing bus-specific dma-ranges
>> parsing largely defeats that, since we end up with the worst of both
>> worlds; effectively-duplicated code, and/or a load of extra complexity
>> to then attempt to reconverge the divergent paths (there really
>> shouldn't be any need to allocate a list of anything). Given that
>> of_translate_dma_address() is already bus-agnostic, it hardly seems
>> justifiable for its caller not to be so as well, especially when it
>> mostly just comes down to getting the right #address-cells value.
>>
>> The patch below is actually enough to make typical cases work, but is
>> vile, so I'm not seriously considering it (hence I've not bothered
>> making IOMMU configuration handle all circumstances). What it has served
>> to do, though, is give me a clear idea of how to properly sort out the
>> not-quite-right device/parent assumptions between of_dma_configure() and
>> of_dma_get_range() rather than bodging around them any further - stay tuned.
>>
>> Robin.
>>
>> ->8-
>> From: Robin Murphy 
>> Subject: [PATCH] of/pci: Use child node for DMA configuration
>>
>> of_dma_configure() expects to be passed an OF node representing the
>> device being configured - for PCI devices we currently pass the node of
>> the appropriate host controller, which sort of works for inherited
>> properties which may appear at any level, like "dma-coherent", but falls
>> apart for properties which actually care about specific device-parent
>> relationships, like "dma-ranges".
>>
>> Solve this by attempting to find a suitable child node if the PCI
>> hierarchy is actually represented in DT, and if not then faking one up
>> as a last resort, to make all of DMA configuration work as expected.
>>
>> Signed-off-by: Robin Murphy 
>> ---
>>  

Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-28 Thread Oza Oza
On Tue, Mar 28, 2017 at 7:59 PM, Robin Murphy  wrote:
> On 28/03/17 06:27, Oza Oza wrote:
>> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
>>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  
>>> wrote:
 it is possible that PCI device supports 64-bit DMA addressing,
 and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
 however PCI host bridge may have limitations on the inbound
 transaction addressing. As an example, consider NVME SSD device
 connected to iproc-PCIe controller.

 Currently, the IOMMU DMA ops only considers PCI device dma_mask
 when allocating an IOVA. This is particularly problematic on
 ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
 PA for in-bound transactions only after PCI Host has forwarded
 these transactions on SOC IO bus. This means on such ARM/ARM64
 SOCs the IOVA of in-bound transactions has to honor the addressing
 restrictions of the PCI Host.

 current pcie framework and of framework integration assumes dma-ranges
 in a way where memory-mapped devices define their dma-ranges.
 dma-ranges: (child-bus-address, parent-bus-address, length).

 but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
 dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
>>>
>>> If you implement a common function, then I expect to see other users
>>> converted to use it. There's also PCI hosts in arch/powerpc that parse
>>> dma-ranges.
>>
>> the common function should be similar to what
>> of_pci_get_host_bridge_resources is doing right now.
>> it parses the ranges property right now.
>>
>> the new function would look like the following.
>>
>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
>> where resources would return the dma-ranges.
>>
>> but right now if you see the patch, of_dma_configure calls the new
>> function, which actually returns the largest possible size.
>> so this new function has to be generic in a way where other PCI hosts
>> can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
>> use it for sure.
>>
>> although having powerpc use it is a separate exercise, since I do
>> not have any access to other PCI hosts such as powerpc. but we can
>> work it out with them on this forum if required.
>>
>> so overall, of_pci_get_dma_ranges has to serve the following 2 purposes.
>>
>> 1) it has to return largest possible size to of_dma_configure to
>> generate largest possible dma_mask.
>>
>> 2) it also has to return resources (dma-ranges) parsed, to the users.
>>
>> so to address the above needs:
>>
>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head
>> *resources, u64 *size)
>>
>> dev -> device node.
>> resources -> dma-ranges in allocated list.
>> size -> highest possible size to generate possible dma_mask for
>> of_dma_configure.
>>
>> let me know how this sounds.
>
> Note that the point of passing PCI host bridges into of_dma_configure()
> in the first place was to avoid having some separate PCI-specific path
> for DMA configuration. I worry that introducing bus-specific dma-ranges
> parsing largely defeats that, since we end up with the worst of both
> worlds; effectively-duplicated code, and/or a load of extra complexity
> to then attempt to reconverge the divergent paths (there really
> shouldn't be any need to allocate a list of anything). Given that
> of_translate_dma_address() is already bus-agnostic, it hardly seems
> justifiable for its caller not to be so as well, especially when it
> mostly just comes down to getting the right #address-cells value.
>
> The patch below is actually enough to make typical cases work, but is
> vile, so I'm not seriously considering it (hence I've not bothered
> making IOMMU configuration handle all circumstances). What it has served
> to do, though, is give me a clear idea of how to properly sort out the
> not-quite-right device/parent assumptions between of_dma_configure() and
> of_dma_get_range() rather than bodging around them any further - stay tuned.
>
> Robin.
>
> ->8-
> From: Robin Murphy 
> Subject: [PATCH] of/pci: Use child node for DMA configuration
>
> of_dma_configure() expects to be passed an OF node representing the
> device being configured - for PCI devices we currently pass the node of
> the appropriate host controller, which sort of works for inherited
> properties which may appear at any level, like "dma-coherent", but falls
> apart for properties which actually care about specific device-parent
> relationships, like "dma-ranges".
>
> Solve this by attempting to find a suitable child node if the PCI
> hierarchy is actually represented in DT, and if not then faking one up
> as a last resort, to make all of DMA configuration work as expected.
>
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/of_iommu.c |  3 ++-
>  drivers/pci/of.c | 24 
>  drivers/pci/probe.c  | 14 +-
>  include/linux/pci.h  |  

Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-28 Thread Robin Murphy
On 28/03/17 06:27, Oza Oza wrote:
> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  wrote:
>>> it is possible that PCI device supports 64-bit DMA addressing,
>>> and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
>>> however PCI host bridge may have limitations on the inbound
>>> transaction addressing. As an example, consider NVME SSD device
>>> connected to iproc-PCIe controller.
>>>
>>> Currently, the IOMMU DMA ops only considers PCI device dma_mask
>>> when allocating an IOVA. This is particularly problematic on
>>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>>> PA for in-bound transactions only after PCI Host has forwarded
>>> these transactions on SOC IO bus. This means on such ARM/ARM64
>>> SOCs the IOVA of in-bound transactions has to honor the addressing
>>> restrictions of the PCI Host.
>>>
>>> current pcie framework and of framework integration assumes dma-ranges
>>> in a way where memory-mapped devices define their dma-ranges.
>>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>>
>>> but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
>>> dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
>>
>> If you implement a common function, then I expect to see other users
>> converted to use it. There's also PCI hosts in arch/powerpc that parse
>> dma-ranges.
> 
> the common function should be similar to what
> of_pci_get_host_bridge_resources is doing right now.
> it parses the ranges property right now.
> 
> the new function would look like the following.
> 
> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
> where resources would return the dma-ranges.
> 
> but right now if you see the patch, of_dma_configure calls the new
> function, which actually returns the largest possible size.
> so this new function has to be generic in a way where other PCI hosts
> can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
> use it for sure.
> 
> although having powerpc use it is a separate exercise, since I do
> not have any access to other PCI hosts such as powerpc. but we can
> work it out with them on this forum if required.
> 
> so overall, of_pci_get_dma_ranges has to serve the following 2 purposes.
> 
> 1) it has to return largest possible size to of_dma_configure to
> generate largest possible dma_mask.
> 
> 2) it also has to return resources (dma-ranges) parsed, to the users.
> 
> so to address the above needs:
> 
> of_pci_get_dma_ranges(struct device_node *dev, struct list_head
> *resources, u64 *size)
> 
> dev -> device node.
> resources -> dma-ranges in allocated list.
> size -> highest possible size to generate possible dma_mask for
> of_dma_configure.
> 
> let me know how this sounds.

Note that the point of passing PCI host bridges into of_dma_configure()
in the first place was to avoid having some separate PCI-specific path
for DMA configuration. I worry that introducing bus-specific dma-ranges
parsing largely defeats that, since we end up with the worst of both
worlds; effectively-duplicated code, and/or a load of extra complexity
to then attempt to reconverge the divergent paths (there really
shouldn't be any need to allocate a list of anything). Given that
of_translate_dma_address() is already bus-agnostic, it hardly seems
justifiable for its caller not to be so as well, especially when it
mostly just comes down to getting the right #address-cells value.
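
As a rough illustration of that point (a sketch only, with an assumed
function name and no error handling - not taken from any posted patch),
a dma-ranges walk driven purely by the cell counts advertised by the
bridge node needs nothing PCI-specific at all:

#include <linux/of.h>
#include <linux/of_address.h>

/*
 * Return the size of the largest inbound window described by the
 * bridge's dma-ranges, walking entries purely by cell count:
 * child address (na cells) | parent address (pna cells) | size (ns cells).
 */
static u64 dma_ranges_largest_size(struct device_node *bridge)
{
	const __be32 *prop;
	int len, pna, na = 1, ns = 1;
	u64 largest = 0;

	pna = of_n_addr_cells(bridge);				/* CPU-side address cells */
	of_property_read_u32(bridge, "#address-cells", &na);	/* child side: 3 for PCI */
	of_property_read_u32(bridge, "#size-cells", &ns);

	prop = of_get_property(bridge, "dma-ranges", &len);
	if (!prop)
		return 0;

	for (len /= 4; len >= na + pna + ns; len -= na + pna + ns) {
		u64 size = of_read_number(prop + na + pna, ns);

		if (size > largest)
			largest = size;
		prop += na + pna + ns;
	}
	return largest;
}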

The patch below is actually enough to make typical cases work, but is
vile, so I'm not seriously considering it (hence I've not bothered
making IOMMU configuration handle all circumstances). What it has served
to do, though, is give me a clear idea of how to properly sort out the
not-quite-right device/parent assumptions between of_dma_configure() and
of_dma_get_range() rather than bodging around them any further - stay tuned.

Robin.

->8-
From: Robin Murphy 
Subject: [PATCH] of/pci: Use child node for DMA configuration

of_dma_configure() expects to be passed an OF node representing the
device being configured - for PCI devices we currently pass the node of
the appropriate host controller, which sort of works for inherited
properties which may appear at any level, like "dma-coherent", but falls
apart for properties which actually care about specific device-parent
relationships, like "dma-ranges".

Solve this by attempting to find a suitable child node if the PCI
hierarchy is actually represented in DT, and if not then faking one up
as a last resort, to make all of DMA configuration work as expected.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/of_iommu.c |  3 ++-
 drivers/pci/of.c | 24 
 drivers/pci/probe.c  | 14 +-
 include/linux/pci.h  |  3 +++
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 2683e9fc0dcf..35c97b945c15 100644
--- 

Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-28 Thread Rob Herring
On Tue, Mar 28, 2017 at 12:27 AM, Oza Oza  wrote:
> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  wrote:
>>> it is possible that PCI device supports 64-bit DMA addressing,
>>> and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
>>> however PCI host bridge may have limitations on the inbound
>>> transaction addressing. As an example, consider NVME SSD device
>>> connected to iproc-PCIe controller.
>>>
>>> Currently, the IOMMU DMA ops only considers PCI device dma_mask
>>> when allocating an IOVA. This is particularly problematic on
>>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>>> PA for in-bound transactions only after PCI Host has forwarded
>>> these transactions on SOC IO bus. This means on such ARM/ARM64
>>> SOCs the IOVA of in-bound transactions has to honor the addressing
>>> restrictions of the PCI Host.
>>>
>>> current pcie framework and of framework integration assumes dma-ranges
>>> in a way where memory-mapped devices define their dma-ranges.
>>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>>
>>> but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
>>> dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
>>
>> If you implement a common function, then I expect to see other users
>> converted to use it. There's also PCI hosts in arch/powerpc that parse
>> dma-ranges.
>
> the common function should be similar to what
> of_pci_get_host_bridge_resources is doing right now.
> it parses the ranges property right now.
>
> the new function would look like the following.
>
> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
> where resources would return the dma-ranges.
>
> but right now if you see the patch, of_dma_configure calls the new
> function, which actually returns the largest possible size.
> so this new function has to be generic in a way where other PCI hosts
> can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
> use it for sure.
>
> although having powerpc use it is a separate exercise, since I do
> not have any access to other PCI hosts such as powerpc. but we can
> work it out with them on this forum if required.

You don't need h/w. You can analyze what parts are common, write
patches to convert to common implementation, and build test. The PPC
and rcar folks can test on h/w.

Rob


Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-27 Thread Oza Oza
On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring  wrote:
> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  wrote:
>> it is possible that PCI device supports 64-bit DMA addressing,
>> and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
>> however PCI host bridge may have limitations on the inbound
>> transaction addressing. As an example, consider NVME SSD device
>> connected to iproc-PCIe controller.
>>
>> Currently, the IOMMU DMA ops only considers PCI device dma_mask
>> when allocating an IOVA. This is particularly problematic on
>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>> PA for in-bound transactions only after PCI Host has forwarded
>> these transactions on SOC IO bus. This means on such ARM/ARM64
>> SOCs the IOVA of in-bound transactions has to honor the addressing
>> restrictions of the PCI Host.
>>
>> current pcie framework and of framework integration assumes dma-ranges
>> in a way where memory-mapped devices define their dma-ranges.
>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>
>> but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
>> dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
>
> If you implement a common function, then I expect to see other users
> converted to use it. There's also PCI hosts in arch/powerpc that parse
> dma-ranges.

the common function should be similar to what
of_pci_get_host_bridge_resources is doing right now.
it parses the ranges property right now.

the new function would look like the following.

of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
where resources would return the dma-ranges.

but right now if you see the patch, of_dma_configure calls the new
function, which actually returns the largest possible size.
so this new function has to be generic in a way where other PCI hosts
can use it, but certainly iproc (Broadcom SOC) and Rcar based SOCs can
use it for sure.

although having powerpc use it is a separate exercise, since I do
not have any access to other PCI hosts such as powerpc. but we can
work it out with them on this forum if required.

so overall, of_pci_get_dma_ranges has to serve the following 2 purposes.

1) it has to return largest possible size to of_dma_configure to
generate largest possible dma_mask.

2) it also has to return resources (dma-ranges) parsed, to the users.

so to address the above needs:

of_pci_get_dma_ranges(struct device_node *dev, struct list_head
*resources, u64 *size)

dev -> device node.
resources -> dma-ranges in allocated list.
size -> highest possible size to generate possible dma_mask for
of_dma_configure.

let me know how this sounds.
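
For illustration, the caller side in of_dma_configure could then look
roughly like the sketch below (variable names and the list handling are
assumptions, not a posted patch; the mask clamping mirrors what
of_dma_configure already does for memory-mapped dma-ranges):

	LIST_HEAD(dma_resources);
	u64 size = 0;
	int ret;

	ret = of_pci_get_dma_ranges(np, &dma_resources, &size);
	if (ret < 0)
		size = dev->coherent_dma_mask + 1;	/* no dma-ranges: keep the default mask */

	/*
	 * Largest inbound window -> widest usable mask (assuming the
	 * window starts at PCI address 0).
	 */
	dev->coherent_dma_mask = min(dev->coherent_dma_mask,
				     DMA_BIT_MASK(ilog2(size)));
	*dev->dma_mask = min(*dev->dma_mask, dev->coherent_dma_mask);

	/* host bridge drivers would consume the parsed list instead */
	pci_free_resource_list(&dma_resources);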

Regards,
Oza.


Re: [RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-27 Thread Rob Herring
On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep  wrote:
> it is possible that PCI device supports 64-bit DMA addressing,
> and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
> however PCI host bridge may have limitations on the inbound
> transaction addressing. As an example, consider NVME SSD device
> connected to iproc-PCIe controller.
>
> Currently, the IOMMU DMA ops only considers PCI device dma_mask
> when allocating an IOVA. This is particularly problematic on
> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
> PA for in-bound transactions only after PCI Host has forwarded
> these transactions on SOC IO bus. This means on such ARM/ARM64
> SOCs the IOVA of in-bound transactions has to honor the addressing
> restrictions of the PCI Host.
>
> current pcie framework and of framework integration assumes dma-ranges
> in a way where memory-mapped devices define their dma-ranges.
> dma-ranges: (child-bus-address, parent-bus-address, length).
>
> but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
> dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;

If you implement a common function, then I expect to see other users
converted to use it. There's also PCI hosts in arch/powerpc that parse
dma-ranges.

Rob


[RFC PATCH 1/3] of/pci: dma-ranges to account highest possible host bridge dma_mask

2017-03-24 Thread Oza Pawandeep
it is possible that PCI device supports 64-bit DMA addressing,
and thus its driver sets device's dma_mask to DMA_BIT_MASK(64),
however PCI host bridge may have limitations on the inbound
transaction addressing. As an example, consider NVME SSD device
connected to iproc-PCIe controller.

Currently, the IOMMU DMA ops only considers PCI device dma_mask
when allocating an IOVA. This is particularly problematic on
ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
PA for in-bound transactions only after PCI Host has forwarded
these transactions on SOC IO bus. This means on such ARM/ARM64
SOCs the IOVA of in-bound transactions has to honor the addressing
restrictions of the PCI Host.

current pcie framework and of framework integration assumes dma-ranges
in a way where memory-mapped devices define their dma-ranges.
dma-ranges: (child-bus-address, parent-bus-address, length).

but iproc based SOCs and even Rcar based SOCs have PCI world dma-ranges.
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
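
(Illustrative decode only, not part of this patch: with 3 PCI address
cells, a 2-cell parent address and 2 size cells, each dma-ranges entry
is 7 cells, read roughly as below; "entry" is an assumed pointer to one
such entry.)

	const __be32 *entry;				/* assumed: one 7-cell dma-ranges entry */
	u32 pci_hi   = be32_to_cpup(entry);		/* PCI address space/flag bits */
	u64 pci_addr = of_read_number(entry + 1, 2);	/* child (PCI bus) address */
	u64 cpu_addr = of_read_number(entry + 3, 2);	/* parent (CPU) address */
	u64 size     = of_read_number(entry + 5, 2);	/* length of the inbound window */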

of_dma_configure is specifically written to take care of memory-mapped devices.
but no implementation exists for pci to take care of pcie based memory ranges.
in fact the pci world doesn't seem to define standard dma-ranges.

this patch implements of_pci_get_dma_ranges to cater to pci world dma-ranges.
so then the returned size gets the best possible (largest) dma_mask.
for example:
dma-ranges = <0x4300 0x00 0x00 0x00 0x00 0x80 0x00>;
we should get dev->coherent_dma_mask=0x7f.

Reviewed-by: Anup Patel 
Reviewed-by: Scott Branden 
Signed-off-by: Oza Pawandeep 

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/of/device.c b/drivers/of/device.c
index b1e6beb..d362a98 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include "of_private.h"
@@ -104,7 +105,11 @@ void of_dma_configure(struct device *dev, struct device_node *np)
 if (!dev->dma_mask)
 dev->dma_mask = &dev->coherent_dma_mask;
 
-   ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+   if (dev_is_pci(dev))
+   ret = of_pci_get_dma_ranges(np, &dma_addr, &paddr, &size);
+   else
+   ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+
 if (ret < 0) {
 dma_addr = offset = 0;
 size = dev->coherent_dma_mask + 1;
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..c7f8626 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,52 @@ int of_pci_get_host_bridge_resources(struct device_node *dev,
return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+   struct device_node *node = of_node_get(np);
+   int rlen, ret = 0;
+   const int na = 3, ns = 2;
+   struct of_pci_range_parser parser;
+   struct of_pci_range range;
+
+   if (!node)
+   return -EINVAL;
+
+   parser.node = node;
+   parser.pna = of_n_addr_cells(node);
+   parser.np = parser.pna + na + ns;
+
+   parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+   if (!parser.range) {
+   pr_debug("pcie device has no dma-ranges defined for 
node(%s)\n", np->full_name);
+   ret = -ENODEV;
+   goto out;
+   }
+
+   parser.end = parser.range + rlen / sizeof(__be32);
+   *size = 0;
+
+   for_each_of_pci_range(&parser, &range) {
+   if (*size < range.size) {
+   *dma_addr = range.pci_addr;
+   *size = range.size;
+   *paddr = range.cpu_addr;
+   }
+   }
+
+   pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+*dma_addr, *paddr, *size);
+*dma_addr = range.pci_addr;
+*size = range.size;
+
+out:
+   of_node_put(node);
+   return ret;
+
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..907ace0 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,11 @@ static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
 {
return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+   return -EINVAL;
+}
 #endif
 
 #if