RE: [PATCH v2] dt: platform driver: Fill the resources before probe and defer if needed
Hi Jean-Jacques, Sorry for top posting. As I know, there have been several attempts to solve the same problem already:) [1] https://lkml.org/lkml/2013/9/18/216 [2] https://lkml.org/lkml/2013/11/22/520 [3] https://lkml.org/lkml/2014/1/8/240 There are some questions related to your approach: 1) How to distinguish between cases "IRQ domain not ready" and "wrong IRQ data in DT" or other IRQ parsing errors? Otherwise, Driver's probe will be deffered wrongly and forever, Thierry Reding has tried to solve this in [1]. 2) How will be handled driver reloading situation? The worst case (sparse IRQ enabled): - remove driver A - remove driver B (irq-controller) - load driver B <--- different set of Linux IRQ numbers can be assigned - load driver A <--- oops. IRQ resources table contains invalid data Best regards, Grygorii Strashko = The goal of this patch is to allow drivers to be probed even if at the time of the DT parsing some of their ressources are not available yet. In the current situation, the resource of a platform device are filled from the DT at the time the device is created (of_device_alloc()). The drawbackof this is that a device sitting close to the top of the DT (ahb for example) but depending on ressources that are initialized later (IRQ domain dynamically created for example) will fail to probe because the ressources don't exist at this time. This patch fills the resource structure only before the device is probed and will defer the probe if the resource are not available yet. Signed-off-by: Jean-Jacques Hiblot Reviewed-by: Gregory CLEMENT --- Hi Grant, I reworked the patch as you proposed. To keep the overhead minimum, nirq and nreg are computed only the first time. In this implementation, only the missing IRQ ressources are re-tried for. It could easily be changed to re-parse all the IRQs though (replace if (!res->flags) with if ((!res->flags) || (res->flags & IORESOURCE_IRQ)). drivers/base/platform.c | 5 +++ drivers/of/platform.c | 100 +--- include/linux/of_platform.h | 10 + 3 files changed, 90 insertions(+), 25 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index bc78848..cee9b8d 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -481,6 +481,10 @@ static int platform_drv_probe(struct device *_dev) struct platform_device *dev = to_platform_device(_dev); int ret; + ret = of_platform_device_prepare(dev); + if (ret) + goto error; + if (ACPI_HANDLE(_dev)) acpi_dev_pm_attach(_dev, true); @@ -488,6 +492,7 @@ static int platform_drv_probe(struct device *_dev) if (ret && ACPI_HANDLE(_dev)) acpi_dev_pm_detach(_dev, true); +error: if (drv->prevent_deferred_probe && ret == -EPROBE_DEFER) { dev_warn(_dev, "probe deferral not supported\n"); ret = -ENXIO; diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 404d1da..a4e2602 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -141,36 +141,11 @@ struct platform_device *of_device_alloc(struct device_node *np, struct device *parent) { struct platform_device *dev; - int rc, i, num_reg = 0, num_irq; - struct resource *res, temp_res; dev = platform_device_alloc("", -1); if (!dev) return NULL; - /* count the io and irq resources */ - if (of_can_translate_address(np)) - while (of_address_to_resource(np, num_reg, _res) == 0) - num_reg++; - num_irq = of_irq_count(np); - - /* Populate the resource table */ - if (num_irq || num_reg) { - res = kzalloc(sizeof(*res) * (num_irq + num_reg), GFP_KERNEL); - if (!res) { - platform_device_put(dev); - return NULL; - } - - dev->num_resources = num_reg + num_irq; - dev->resource = res; - for (i = 0; i < num_reg; i++, res++) { - rc = of_address_to_resource(np, i, res); - WARN_ON(rc); - } - WARN_ON(of_irq_to_resource_table(np, res, num_irq) != num_irq); - } - dev->dev.of_node = of_node_get(np); #if defined(CONFIG_MICROBLAZE) dev->dev.dma_mask = >archdata.dma_mask; @@ -233,6 +208,81 @@ static struct platform_device *of_platform_device_create_pdata( return dev; } +static int of_reg_count(struct device_node *np) +{ + int nreg = 0; + if (of_can_translate_address(np)) { + struct resource temp_res; + while (of_address_to_resource(np, nreg, _res) == 0) + nreg++; + } + return nreg; +} + +int of_platform_device_prepare(struct platform_device *dev) +{ +
RE: [PATCH v2] dt: platform driver: Fill the resources before probe and defer if needed
Hi Jean-Jacques, Sorry for top posting. As I know, there have been several attempts to solve the same problem already:) [1] https://lkml.org/lkml/2013/9/18/216 [2] https://lkml.org/lkml/2013/11/22/520 [3] https://lkml.org/lkml/2014/1/8/240 There are some questions related to your approach: 1) How to distinguish between cases IRQ domain not ready and wrong IRQ data in DT or other IRQ parsing errors? Otherwise, Driver's probe will be deffered wrongly and forever, Thierry Reding has tried to solve this in [1]. 2) How will be handled driver reloading situation? The worst case (sparse IRQ enabled): - remove driver A - remove driver B (irq-controller) - load driver B --- different set of Linux IRQ numbers can be assigned - load driver A --- oops. IRQ resources table contains invalid data Best regards, Grygorii Strashko = The goal of this patch is to allow drivers to be probed even if at the time of the DT parsing some of their ressources are not available yet. In the current situation, the resource of a platform device are filled from the DT at the time the device is created (of_device_alloc()). The drawbackof this is that a device sitting close to the top of the DT (ahb for example) but depending on ressources that are initialized later (IRQ domain dynamically created for example) will fail to probe because the ressources don't exist at this time. This patch fills the resource structure only before the device is probed and will defer the probe if the resource are not available yet. Signed-off-by: Jean-Jacques Hiblot jjhib...@traphandler.com Reviewed-by: Gregory CLEMENT gregory.clem...@free-electrons.com --- Hi Grant, I reworked the patch as you proposed. To keep the overhead minimum, nirq and nreg are computed only the first time. In this implementation, only the missing IRQ ressources are re-tried for. It could easily be changed to re-parse all the IRQs though (replace if (!res-flags) with if ((!res-flags) || (res-flags IORESOURCE_IRQ)). drivers/base/platform.c | 5 +++ drivers/of/platform.c | 100 +--- include/linux/of_platform.h | 10 + 3 files changed, 90 insertions(+), 25 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index bc78848..cee9b8d 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -481,6 +481,10 @@ static int platform_drv_probe(struct device *_dev) struct platform_device *dev = to_platform_device(_dev); int ret; + ret = of_platform_device_prepare(dev); + if (ret) + goto error; + if (ACPI_HANDLE(_dev)) acpi_dev_pm_attach(_dev, true); @@ -488,6 +492,7 @@ static int platform_drv_probe(struct device *_dev) if (ret ACPI_HANDLE(_dev)) acpi_dev_pm_detach(_dev, true); +error: if (drv-prevent_deferred_probe ret == -EPROBE_DEFER) { dev_warn(_dev, probe deferral not supported\n); ret = -ENXIO; diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 404d1da..a4e2602 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -141,36 +141,11 @@ struct platform_device *of_device_alloc(struct device_node *np, struct device *parent) { struct platform_device *dev; - int rc, i, num_reg = 0, num_irq; - struct resource *res, temp_res; dev = platform_device_alloc(, -1); if (!dev) return NULL; - /* count the io and irq resources */ - if (of_can_translate_address(np)) - while (of_address_to_resource(np, num_reg, temp_res) == 0) - num_reg++; - num_irq = of_irq_count(np); - - /* Populate the resource table */ - if (num_irq || num_reg) { - res = kzalloc(sizeof(*res) * (num_irq + num_reg), GFP_KERNEL); - if (!res) { - platform_device_put(dev); - return NULL; - } - - dev-num_resources = num_reg + num_irq; - dev-resource = res; - for (i = 0; i num_reg; i++, res++) { - rc = of_address_to_resource(np, i, res); - WARN_ON(rc); - } - WARN_ON(of_irq_to_resource_table(np, res, num_irq) != num_irq); - } - dev-dev.of_node = of_node_get(np); #if defined(CONFIG_MICROBLAZE) dev-dev.dma_mask = dev-archdata.dma_mask; @@ -233,6 +208,81 @@ static struct platform_device *of_platform_device_create_pdata( return dev; } +static int of_reg_count(struct device_node *np) +{ + int nreg = 0; + if (of_can_translate_address(np)) { + struct resource temp_res; + while (of_address_to_resource(np, nreg, temp_res) == 0) + nreg++; + } + return nreg; +} + +int
RE: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()
Hi all, Sorry, for the invalid mail & patch format - have no way to send it properly now. Suppose there is another way to fix this issue (more generic) - Correct memblock_virt_allocX() API to limit allocations below memblock.current_limit (patch attached). Then the code behavior will become more similar to _alloc_memory_core_early. Not tested. Best regards, - grygorii From: Konrad Rzeszutek Wilk [konrad.w...@oracle.com] Sent: Tuesday, January 28, 2014 8:56 PM To: Shilimkar, Santosh Cc: Russell King - ARM Linux; Yinghai Lu; Kevin Hilman; Olof Johansson; Linus Torvalds; Andrew Morton; Ingo Molnar; H. Peter Anvin; Dave Hansen; linux-kernel@vger.kernel.org; Strashko, Grygorii; xen-de...@lists.xensource.com Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low() On Tue, Jan 28, 2014 at 01:36:28PM -0500, Santosh Shilimkar wrote: > + Gryagorii, > On Tuesday 28 January 2014 01:22 PM, Russell King - ARM Linux wrote: > > On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote: > >> On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote: > >>> Index: linux-2.6/include/linux/bootmem.h > >>> === > >>> --- linux-2.6.orig/include/linux/bootmem.h > >>> +++ linux-2.6/include/linux/bootmem.h > >>> @@ -179,6 +179,9 @@ static inline void * __init memblock_vir > >>> NUMA_NO_NODE); > >>> } > >>> > >>> +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/ > >>> +#include > >>> + > >>> #ifndef ARCH_LOW_ADDRESS_LIMIT > >>> #define ARCH_LOW_ADDRESS_LIMIT 0xUL > >>> #endif > >> > >> This won't help mostly since the ARM 32 arch don't set > >> ARCH_LOW_ADDRESS_LIMIT. > >> Sorry i couldn't respond to the thread earlier because of travel and > >> don't have access to my board to try out the patches. > > > > Let's think about this for a moment, shall we... > > > > What does memblock_alloc_virt*() return? It returns a virtual address. > > > > How is that virtual address obtained? ptr = phys_to_virt(alloc); > > > > What is the valid address range for passing into phys_to_virt() ? Only > > lowmem addresses. > > > > Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be > > completely rediculous - and presumably this also fails on x86_32 if it > > returns memory up at 4GB. > > > > So... yes, I think reverting the arch/arm part of this patch is the right > > solution, whether the rest of it should be reverted is something I can't > > comment on. > > > Grygorri mentioned an alternate to update the memblock_find_in_range_node() so > that it takes into account the limit. This patch breaks also Xen and 32-bit guests (see http://lists.xen.org/archives/html/xen-devel/2014-01/msg02476.html) Reverting it fixes it. > > Regards, > Santosh From ee31afb9c5c0e78819ce624e3a930d31b97527cd Mon Sep 17 00:00:00 2001 From: grygoriis Date: Tue, 28 Jan 2014 21:59:30 +0200 Subject: [PATCH] mm/memblock: fix upper boundary of allocating region Correct memblock_virt_allocX() API to limit allocations below memblock.current_limit. Signed-off-by: grygoriis --- mm/memblock.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/mm/memblock.c b/mm/memblock.c index 87d21a6..e93d669 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1077,6 +1077,9 @@ static void * __init memblock_virt_alloc_internal( if (!align) align = SMP_CACHE_BYTES; +if (max_addr > memblock.current_limit) +max_addr = memblock.current_limit; + again: alloc = memblock_find_in_range_node(size, align, min_addr, max_addr, nid); -- 1.7.4.1
RE: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low()
Hi all, Sorry, for the invalid mail patch format - have no way to send it properly now. Suppose there is another way to fix this issue (more generic) - Correct memblock_virt_allocX() API to limit allocations below memblock.current_limit (patch attached). Then the code behavior will become more similar to _alloc_memory_core_early. Not tested. Best regards, - grygorii From: Konrad Rzeszutek Wilk [konrad.w...@oracle.com] Sent: Tuesday, January 28, 2014 8:56 PM To: Shilimkar, Santosh Cc: Russell King - ARM Linux; Yinghai Lu; Kevin Hilman; Olof Johansson; Linus Torvalds; Andrew Morton; Ingo Molnar; H. Peter Anvin; Dave Hansen; linux-kernel@vger.kernel.org; Strashko, Grygorii; xen-de...@lists.xensource.com Subject: Re: [PATCH 1/3] memblock, nobootmem: Add memblock_virt_alloc_low() On Tue, Jan 28, 2014 at 01:36:28PM -0500, Santosh Shilimkar wrote: + Gryagorii, On Tuesday 28 January 2014 01:22 PM, Russell King - ARM Linux wrote: On Tue, Jan 28, 2014 at 12:23:02PM -0500, Santosh Shilimkar wrote: On Tuesday 28 January 2014 12:12 PM, Yinghai Lu wrote: Index: linux-2.6/include/linux/bootmem.h === --- linux-2.6.orig/include/linux/bootmem.h +++ linux-2.6/include/linux/bootmem.h @@ -179,6 +179,9 @@ static inline void * __init memblock_vir NUMA_NO_NODE); } +/* Take arch's ARCH_LOW_ADDRESS_LIMIT at first*/ +#include asm/processor.h + #ifndef ARCH_LOW_ADDRESS_LIMIT #define ARCH_LOW_ADDRESS_LIMIT 0xUL #endif This won't help mostly since the ARM 32 arch don't set ARCH_LOW_ADDRESS_LIMIT. Sorry i couldn't respond to the thread earlier because of travel and don't have access to my board to try out the patches. Let's think about this for a moment, shall we... What does memblock_alloc_virt*() return? It returns a virtual address. How is that virtual address obtained? ptr = phys_to_virt(alloc); What is the valid address range for passing into phys_to_virt() ? Only lowmem addresses. Hence, having ARCH_LOW_ADDRESS_LIMIT set to 4GB-1 by default seems to be completely rediculous - and presumably this also fails on x86_32 if it returns memory up at 4GB. So... yes, I think reverting the arch/arm part of this patch is the right solution, whether the rest of it should be reverted is something I can't comment on. Grygorri mentioned an alternate to update the memblock_find_in_range_node() so that it takes into account the limit. This patch breaks also Xen and 32-bit guests (see http://lists.xen.org/archives/html/xen-devel/2014-01/msg02476.html) Reverting it fixes it. Regards, Santosh From ee31afb9c5c0e78819ce624e3a930d31b97527cd Mon Sep 17 00:00:00 2001 From: grygoriis grygorii.stras...@globallogic.com Date: Tue, 28 Jan 2014 21:59:30 +0200 Subject: [PATCH] mm/memblock: fix upper boundary of allocating region Correct memblock_virt_allocX() API to limit allocations below memblock.current_limit. Signed-off-by: grygoriis grygorii.stras...@globallogic.com --- mm/memblock.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/mm/memblock.c b/mm/memblock.c index 87d21a6..e93d669 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1077,6 +1077,9 @@ static void * __init memblock_virt_alloc_internal( if (!align) align = SMP_CACHE_BYTES; +if (max_addr memblock.current_limit) +max_addr = memblock.current_limit; + again: alloc = memblock_find_in_range_node(size, align, min_addr, max_addr, nid); -- 1.7.4.1
RE: [PATCH V3 2/2] mm/memblock: Add support for excluded memory areas
Hi Philipp, On 01/14/2014 08:52 PM, Philipp Hachtmann wrote: > Hello Grygorii, > > thank you for your comments. > > To clarify we have the following requirements for memblock: > > (1) Reserved areas can be declared before memory is added. > (2) The physical memory is detected once only. > (3) The free memory (i.e. not reserved) memory can be iterated to add > it to the buddy allocator. > (4) Memory designated to be mapped into the kernel address space can be > iterated. > (5) Kdump on s390 requires knowledge about the full system memory > layout. > > The s390 kdump implementation works a bit different from the > implementation on other architectures: The layout is not taken from the > production system and saved for the kdump kernel. Instead the kdump > kernel needs to gather information about the whole memory without > respect to locked out areas (like mem= and OLDMEM etc.). > > Without kdump's requirement it would of course be suitable and easy > just to remove memory from memblock.memory. But then this information > is lost for later use by kdump. > > The patch does not change any behaviour of the current API - whether it > is enabled or not. Sorry, for the delayed reply. My main concern here was that you are introducing new *generic* API, but in fact it is not generic, because it can't be re-used without huge rework of existing code. (at least as of wide usage of for_each_memblock(memory,...), because (if ARCH_MEMBLOCK_NOMAP=y) the meaning of "memory" ranges will be changed form "mapped memory" to "real phys memory"). And therefore, I've proposed to keep things as is and introduce phys_memory ranges instead, to store real phys memory configuration. > > The current patch seems to be overly complicated. > The following patch contains only the nomap functionality without any > cleanup and refactoring. I will post a V4 patch set which will contain > this patch. Regards, -grygorii -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH V3 2/2] mm/memblock: Add support for excluded memory areas
Hi Philipp, On 01/14/2014 08:52 PM, Philipp Hachtmann wrote: Hello Grygorii, thank you for your comments. To clarify we have the following requirements for memblock: (1) Reserved areas can be declared before memory is added. (2) The physical memory is detected once only. (3) The free memory (i.e. not reserved) memory can be iterated to add it to the buddy allocator. (4) Memory designated to be mapped into the kernel address space can be iterated. (5) Kdump on s390 requires knowledge about the full system memory layout. The s390 kdump implementation works a bit different from the implementation on other architectures: The layout is not taken from the production system and saved for the kdump kernel. Instead the kdump kernel needs to gather information about the whole memory without respect to locked out areas (like mem= and OLDMEM etc.). Without kdump's requirement it would of course be suitable and easy just to remove memory from memblock.memory. But then this information is lost for later use by kdump. The patch does not change any behaviour of the current API - whether it is enabled or not. Sorry, for the delayed reply. My main concern here was that you are introducing new *generic* API, but in fact it is not generic, because it can't be re-used without huge rework of existing code. (at least as of wide usage of for_each_memblock(memory,...), because (if ARCH_MEMBLOCK_NOMAP=y) the meaning of memory ranges will be changed form mapped memory to real phys memory). And therefore, I've proposed to keep things as is and introduce phys_memory ranges instead, to store real phys memory configuration. The current patch seems to be overly complicated. The following patch contains only the nomap functionality without any cleanup and refactoring. I will post a V4 patch set which will contain this patch. Regards, -grygorii -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH v2 08/23] mm/memblock: Add memblock memory allocation apis
Hi Tejun, >On Thu, Dec 05, 2013 at 06:35:00PM +0200, Grygorii Strashko wrote: >> >> +#define memblock_virt_alloc_align(x, align) \ >> >> + memblock_virt_alloc_try_nid(x, align, BOOTMEM_LOW_LIMIT, \ >> >> + BOOTMEM_ALLOC_ACCESSIBLE, MAX_NUMNODES) >> > >> > Also, do we really need this align variant separate when the caller >> > can simply specify 0 for the default? >> >> Unfortunately Yes. >> We need it to keep compatibility with bootmem/nobootmem >> which don't handle 0 as default align value. > >Hmm... why wouldn't just interpreting 0 to SMP_CACHE_BYTES in the >memblock_virt*() function work? > Problem is not with memblock_virt*(). The issue will happen in case if memblock or nobootmem are disabled in below code (memblock_virt*() is disabled). +/* Fall back to all the existing bootmem APIs */ +#define memblock_virt_alloc(x) \ + __alloc_bootmem(x, SMP_CACHE_BYTES, BOOTMEM_LOW_LIMIT) which will be transformed to +/* Fall back to all the existing bootmem APIs */ +#define memblock_virt_alloc(x, align) \ + __alloc_bootmem(x, align, BOOTMEM_LOW_LIMIT) and used as memblock_virt_alloc(size, 0); so, by default bootmem code will use 0 as default alignment and not SMP_CACHE_BYTES and that is wrong. Regards, -grygorii-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH v2 08/23] mm/memblock: Add memblock memory allocation apis
Hi Tejun, On Thu, Dec 05, 2013 at 06:35:00PM +0200, Grygorii Strashko wrote: +#define memblock_virt_alloc_align(x, align) \ + memblock_virt_alloc_try_nid(x, align, BOOTMEM_LOW_LIMIT, \ + BOOTMEM_ALLOC_ACCESSIBLE, MAX_NUMNODES) Also, do we really need this align variant separate when the caller can simply specify 0 for the default? Unfortunately Yes. We need it to keep compatibility with bootmem/nobootmem which don't handle 0 as default align value. Hmm... why wouldn't just interpreting 0 to SMP_CACHE_BYTES in the memblock_virt*() function work? Problem is not with memblock_virt*(). The issue will happen in case if memblock or nobootmem are disabled in below code (memblock_virt*() is disabled). +/* Fall back to all the existing bootmem APIs */ +#define memblock_virt_alloc(x) \ + __alloc_bootmem(x, SMP_CACHE_BYTES, BOOTMEM_LOW_LIMIT) which will be transformed to +/* Fall back to all the existing bootmem APIs */ +#define memblock_virt_alloc(x, align) \ + __alloc_bootmem(x, align, BOOTMEM_LOW_LIMIT) and used as memblock_virt_alloc(size, 0); so, by default bootmem code will use 0 as default alignment and not SMP_CACHE_BYTES and that is wrong. Regards, -grygorii-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 7/9] of/platform: Resolve interrupt references at probe time
Hi Thierry, On 09/16/2013 11:32 AM, Thierry Reding wrote:> Interrupt references are currently resolved very early (when a device is > created). This has the disadvantage that it will fail in cases where the > interrupt parent hasn't been probed and no IRQ domain for it has been > registered yet. To work around that various drivers use explicit > initcall ordering to force interrupt parents to be probed before devices > that need them are created. That's error prone and doesn't always work. > If a platform device uses an interrupt line connected to a different > platform device (such as a GPIO controller), both will be created in the > same batch, and the GPIO controller won't have been probed by its driver > when the depending platform device is created. Interrupt resolution will > fail in that case. > > Another common workaround is for drivers to explicitly resolve interrupt > references at probe time. This is suboptimal, however, because it will > require every driver to duplicate the code. > > This patch adds support for late interrupt resolution to the platform > driver core, by resolving the references right before a device driver's > .probe() function will be called. This not only delays the resolution > until a much later time (giving interrupt parents a better chance of > being probed in the meantime), but it also allows the platform driver > core to queue the device for deferred probing if the interrupt parent > hasn't registered its IRQ domain yet. > > Signed-off-by: Thierry Reding > --- > drivers/base/platform.c | 4 > drivers/of/platform.c | 43 +-- > include/linux/of_platform.h | 7 +++ > 3 files changed, 48 insertions(+), 6 deletions(-) > > diff --git a/drivers/base/platform.c b/drivers/base/platform.c > index 4f8bef3..8dcf835 100644 > --- a/drivers/base/platform.c > +++ b/drivers/base/platform.c > @@ -481,6 +481,10 @@ static int platform_drv_probe(struct device *_dev) Should it be the part of really_probe()? Isn't it? > struct platform_device *dev = to_platform_device(_dev); > int ret; > > + ret = of_platform_probe(dev); > + if (ret) > + return ret; > + > if (ACPI_HANDLE(_dev)) > acpi_dev_pm_attach(_dev, true); > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > index 9b439ac..edaef70 100644 > --- a/drivers/of/platform.c > +++ b/drivers/of/platform.c > @@ -142,7 +142,7 @@ struct platform_device *of_device_alloc(struct > device_node *np, > struct device *parent) > { > struct platform_device *dev; > - int rc, i, num_reg = 0, num_irq; > + int rc, i, num_reg = 0; > struct resource *res, temp_res; > > dev = platform_device_alloc("", -1); > @@ -153,23 +153,21 @@ struct platform_device *of_device_alloc(struct > device_node *np, > if (of_can_translate_address(np)) > while (of_address_to_resource(np, num_reg, _res) == 0) > num_reg++; > - num_irq = of_irq_count(np); > > /* Populate the resource table */ > - if (num_irq || num_reg) { > - res = kzalloc(sizeof(*res) * (num_irq + num_reg), GFP_KERNEL); > + if (num_reg) { > + res = kzalloc(sizeof(*res) * num_reg, GFP_KERNEL); > if (!res) { > platform_device_put(dev); > return NULL; > } > > - dev->num_resources = num_reg + num_irq; > + dev->num_resources = num_reg; > dev->resource = res; > for (i = 0; i < num_reg; i++, res++) { > rc = of_address_to_resource(np, i, res); > WARN_ON(rc); > } > - WARN_ON(of_irq_to_resource_table(np, res, num_irq) != num_irq); > } > > dev->dev.of_node = of_node_get(np); > @@ -490,4 +488,37 @@ int of_platform_populate(struct device_node *root, > return rc; > } > EXPORT_SYMBOL_GPL(of_platform_populate); > + > +int of_platform_probe(struct platform_device *pdev) > +{ > + struct device_node *np = pdev->dev.of_node; > + int num_irq, ret = 0; > + > + if (!pdev->dev.of_node) > + return 0; > + > + num_irq = of_irq_count(pdev->dev.of_node); > + if (num_irq > 0) { > + struct resource *res = pdev->resource; > + int num_reg = pdev->num_resources; > + int num = num_reg + num_irq; > + > + res = krealloc(res, num * sizeof(*res), GFP_KERNEL); > + if (!res) > + return -ENOMEM; > + > + pdev->num_resources = num; > + pdev->resource = res; > + res += num_reg; What will happen if Driver probe is failed or deferred? Seems resource table size will grow each time the Driver probe is deferred or failed. > + > + ret = of_irq_to_resource_table(np, res, num_irq); > + if
RE: [PATCH 7/9] of/platform: Resolve interrupt references at probe time
Hi Thierry, On 09/16/2013 11:32 AM, Thierry Reding wrote: Interrupt references are currently resolved very early (when a device is created). This has the disadvantage that it will fail in cases where the interrupt parent hasn't been probed and no IRQ domain for it has been registered yet. To work around that various drivers use explicit initcall ordering to force interrupt parents to be probed before devices that need them are created. That's error prone and doesn't always work. If a platform device uses an interrupt line connected to a different platform device (such as a GPIO controller), both will be created in the same batch, and the GPIO controller won't have been probed by its driver when the depending platform device is created. Interrupt resolution will fail in that case. Another common workaround is for drivers to explicitly resolve interrupt references at probe time. This is suboptimal, however, because it will require every driver to duplicate the code. This patch adds support for late interrupt resolution to the platform driver core, by resolving the references right before a device driver's .probe() function will be called. This not only delays the resolution until a much later time (giving interrupt parents a better chance of being probed in the meantime), but it also allows the platform driver core to queue the device for deferred probing if the interrupt parent hasn't registered its IRQ domain yet. Signed-off-by: Thierry Reding tred...@nvidia.com --- drivers/base/platform.c | 4 drivers/of/platform.c | 43 +-- include/linux/of_platform.h | 7 +++ 3 files changed, 48 insertions(+), 6 deletions(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 4f8bef3..8dcf835 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -481,6 +481,10 @@ static int platform_drv_probe(struct device *_dev) Should it be the part of really_probe()? Isn't it? struct platform_device *dev = to_platform_device(_dev); int ret; + ret = of_platform_probe(dev); + if (ret) + return ret; + if (ACPI_HANDLE(_dev)) acpi_dev_pm_attach(_dev, true); diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 9b439ac..edaef70 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -142,7 +142,7 @@ struct platform_device *of_device_alloc(struct device_node *np, struct device *parent) { struct platform_device *dev; - int rc, i, num_reg = 0, num_irq; + int rc, i, num_reg = 0; struct resource *res, temp_res; dev = platform_device_alloc(, -1); @@ -153,23 +153,21 @@ struct platform_device *of_device_alloc(struct device_node *np, if (of_can_translate_address(np)) while (of_address_to_resource(np, num_reg, temp_res) == 0) num_reg++; - num_irq = of_irq_count(np); /* Populate the resource table */ - if (num_irq || num_reg) { - res = kzalloc(sizeof(*res) * (num_irq + num_reg), GFP_KERNEL); + if (num_reg) { + res = kzalloc(sizeof(*res) * num_reg, GFP_KERNEL); if (!res) { platform_device_put(dev); return NULL; } - dev-num_resources = num_reg + num_irq; + dev-num_resources = num_reg; dev-resource = res; for (i = 0; i num_reg; i++, res++) { rc = of_address_to_resource(np, i, res); WARN_ON(rc); } - WARN_ON(of_irq_to_resource_table(np, res, num_irq) != num_irq); } dev-dev.of_node = of_node_get(np); @@ -490,4 +488,37 @@ int of_platform_populate(struct device_node *root, return rc; } EXPORT_SYMBOL_GPL(of_platform_populate); + +int of_platform_probe(struct platform_device *pdev) +{ + struct device_node *np = pdev-dev.of_node; + int num_irq, ret = 0; + + if (!pdev-dev.of_node) + return 0; + + num_irq = of_irq_count(pdev-dev.of_node); + if (num_irq 0) { + struct resource *res = pdev-resource; + int num_reg = pdev-num_resources; + int num = num_reg + num_irq; + + res = krealloc(res, num * sizeof(*res), GFP_KERNEL); + if (!res) + return -ENOMEM; + + pdev-num_resources = num; + pdev-resource = res; + res += num_reg; What will happen if Driver probe is failed or deferred? Seems resource table size will grow each time the Driver probe is deferred or failed. + + ret = of_irq_to_resource_table(np, res, num_irq); + if (ret 0) + return ret; + + WARN_ON(ret != num_irq); + ret = 0; +
RE: [PATCH v3] gpio: interrupt consistency check for OF GPIO IRQs
Hi Lars, Linus, I've few comments regarding this patch and problem in general. On 08/26/2013 05:07 PM, Lars Poeschel wrote: > From: Linus Walleij > > Currently the kernel is ambigously treating GPIOs and interrupts > from a GPIO controller: GPIOs and interrupts are treated as > orthogonal. This unfortunately makes it unclear how to actually > retrieve and request a GPIO line or interrupt from a GPIO > controller in the device tree probe path. > > In the non-DT probe path it is clear that the GPIO number has to > be passed to the consuming device, and if it is to be used as > an interrupt, the consumer shall performa a gpio_to_irq() mapping > and request the resulting IRQ number. > > In the DT boot path consumers may have been given one or more > interrupts from the interrupt-controller side of the GPIO chip > in an abstract way, such that the driver is not aware of the > fact that a GPIO chip is backing this interrupt, and the GPIO > side of the same line is never requested with gpio_request(). > A typical case for this is ethernet chips which just take some > interrupt line which may be a "hard" interrupt line (such as an > external interrupt line from an SoC) or a cascaded interrupt > connected to a GPIO line. > > This has the following undesired effects: > > - The GPIOlib subsystem is not aware that the line is in use >and willingly lets some other consumer perform gpio_request() >on it, leading to a complex resource conflict if it occurs. > > - The GPIO debugfs file claims this GPIO line is "free". > > - The line direction of the interrupt GPIO line is not >explicitly set as input, even though it is obvious that such >a line need to be set up in this way, often making the system >depend on boot-on defaults for this kind of settings. > > To solve this dilemma, perform an interrupt consistency check > when adding a GPIO chip: if the chip is both gpio-controller and > interrupt-controller, walk all children of the device tree, > check if these in turn reference the interrupt-controller, and > if they do, loop over the interrupts used by that child and > perform gpio_request() and gpio_direction_input() on these, > making them unreachable from the GPIO side. > > The patch has been devised by Linus Walleij and Lars Poeschel. > > Changelog V3: > - define of_gpiochip_reserve_irq_lines as empty if >CONFIG_OF_IRQ is not defined. Walking the devicetree simply >makes no sense in this case and caused a compile time error >on SPARC. > - exit of_gpiochip_reserve_irq_lines if no irq_domain can be >found. > - convert the irqspec to cpu byte order before invoking the >irq_domains xlate function. > > Changelog V2: > - To be able to parse custom interrupts properties from the >device tree, get a reference to the drivers irq_domain >and use the xlate function to parse the proptery and >get the irq number. This is tested with >#interrupt-cells = 1, 2, and 3 and multiple interrupts >per property. > > Cc: devicet...@vger.kernel.org > Cc: Javier Martinez Canillas > Cc: Enric Balletbo i Serra > Cc: Grant Likely > Cc: Jean-Christophe PLAGNIOL-VILLARD > Cc: Santosh Shilimkar > Cc: Kevin Hilman > Cc: Balaji T K > Cc: Tony Lindgren > Cc: Jon Hunter > Signed-off-by: Lars Poeschel > Signed-off-by: Linus Walleij > --- > drivers/gpio/gpiolib-of.c | 201 > - > 1 file changed, 200 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c > index 665f953..d328c5d 100644 > --- a/drivers/gpio/gpiolib-of.c > +++ b/drivers/gpio/gpiolib-of.c > @@ -10,7 +10,6 @@ >* the Free Software Foundation; either version 2 of the License, or >* (at your option) any later version. >*/ > - > #include > #include > #include > @@ -19,6 +18,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -126,6 +126,201 @@ int of_gpio_simple_xlate(struct gpio_chip *gc, > } > EXPORT_SYMBOL(of_gpio_simple_xlate); > > +#if defined(CONFIG_OF_IRQ) > +/** > + * of_gpio_scan_irq_lines() - internal function to recursively scan the > device > + * tree and request or free the GPIOs that are to be used as IRQ lines > + * @node:node to start the scanning at > + * @gcn: device node of the GPIO chip > + * @irq_domain: the irq_domain for the GPIO chip > + * @intsize: size of one single interrupt in the device tree for the GPIO > + * chip. It is the same as #interrupt-cells. > + * @gc: GPIO chip instantiated from same node > + * @request: wheter the function should request(true) or free(false) the > + * irq lines > + * > + * This is a internal function that calls itself to recursively scan the > device > + * tree. It scans for uses of the device_node gcn as an interrupt-controller. > + * If it finds some, it requests the corresponding gpio lines that are to be > + * used as irq lines and sets them as input. > + *
RE: [PATCH v3] gpio: interrupt consistency check for OF GPIO IRQs
Hi Lars, Linus, I've few comments regarding this patch and problem in general. On 08/26/2013 05:07 PM, Lars Poeschel wrote: From: Linus Walleij linus.wall...@linaro.org Currently the kernel is ambigously treating GPIOs and interrupts from a GPIO controller: GPIOs and interrupts are treated as orthogonal. This unfortunately makes it unclear how to actually retrieve and request a GPIO line or interrupt from a GPIO controller in the device tree probe path. In the non-DT probe path it is clear that the GPIO number has to be passed to the consuming device, and if it is to be used as an interrupt, the consumer shall performa a gpio_to_irq() mapping and request the resulting IRQ number. In the DT boot path consumers may have been given one or more interrupts from the interrupt-controller side of the GPIO chip in an abstract way, such that the driver is not aware of the fact that a GPIO chip is backing this interrupt, and the GPIO side of the same line is never requested with gpio_request(). A typical case for this is ethernet chips which just take some interrupt line which may be a hard interrupt line (such as an external interrupt line from an SoC) or a cascaded interrupt connected to a GPIO line. This has the following undesired effects: - The GPIOlib subsystem is not aware that the line is in use and willingly lets some other consumer perform gpio_request() on it, leading to a complex resource conflict if it occurs. - The GPIO debugfs file claims this GPIO line is free. - The line direction of the interrupt GPIO line is not explicitly set as input, even though it is obvious that such a line need to be set up in this way, often making the system depend on boot-on defaults for this kind of settings. To solve this dilemma, perform an interrupt consistency check when adding a GPIO chip: if the chip is both gpio-controller and interrupt-controller, walk all children of the device tree, check if these in turn reference the interrupt-controller, and if they do, loop over the interrupts used by that child and perform gpio_request() and gpio_direction_input() on these, making them unreachable from the GPIO side. The patch has been devised by Linus Walleij and Lars Poeschel. Changelog V3: - define of_gpiochip_reserve_irq_lines as empty if CONFIG_OF_IRQ is not defined. Walking the devicetree simply makes no sense in this case and caused a compile time error on SPARC. - exit of_gpiochip_reserve_irq_lines if no irq_domain can be found. - convert the irqspec to cpu byte order before invoking the irq_domains xlate function. Changelog V2: - To be able to parse custom interrupts properties from the device tree, get a reference to the drivers irq_domain and use the xlate function to parse the proptery and get the irq number. This is tested with #interrupt-cells = 1, 2, and 3 and multiple interrupts per property. Cc: devicet...@vger.kernel.org Cc: Javier Martinez Canillas javier.marti...@collabora.co.uk Cc: Enric Balletbo i Serra eballe...@gmail.com Cc: Grant Likely grant.lik...@linaro.org Cc: Jean-Christophe PLAGNIOL-VILLARD plagn...@jcrosoft.com Cc: Santosh Shilimkar santosh.shilim...@ti.com Cc: Kevin Hilman khil...@linaro.org Cc: Balaji T K balaj...@ti.com Cc: Tony Lindgren t...@atomide.com Cc: Jon Hunter jgchun...@gmail.com Signed-off-by: Lars Poeschel poesc...@lemonage.de Signed-off-by: Linus Walleij linus.wall...@linaro.org --- drivers/gpio/gpiolib-of.c | 201 - 1 file changed, 200 insertions(+), 1 deletion(-) diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c index 665f953..d328c5d 100644 --- a/drivers/gpio/gpiolib-of.c +++ b/drivers/gpio/gpiolib-of.c @@ -10,7 +10,6 @@ * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. */ - #include linux/device.h #include linux/errno.h #include linux/module.h @@ -19,6 +18,7 @@ #include linux/of.h #include linux/of_address.h #include linux/of_gpio.h +#include linux/of_irq.h #include linux/pinctrl/pinctrl.h #include linux/slab.h @@ -126,6 +126,201 @@ int of_gpio_simple_xlate(struct gpio_chip *gc, } EXPORT_SYMBOL(of_gpio_simple_xlate); +#if defined(CONFIG_OF_IRQ) +/** + * of_gpio_scan_irq_lines() - internal function to recursively scan the device + * tree and request or free the GPIOs that are to be used as IRQ lines + * @node:node to start the scanning at + * @gcn: device node of the GPIO chip + * @irq_domain: the irq_domain for the GPIO chip + * @intsize: size of one single interrupt in the device tree for the GPIO + * chip. It is the same as #interrupt-cells. + * @gc: GPIO chip instantiated from same node + * @request: wheter the function should request(true) or free(false) the + * irq lines + * + * This is a internal function
RE: [PATCH] mfd: twl-core: convert to module_init()
Hi Kevin, It's done alreday here: https://patchwork.kernel.org/patch/2477541/ - mfd: twl-core: convert to module_i2c_driver() (Samuel has taken this one) - and - https://patchwork.kernel.org/patch/2477561/ - i2c: omap: convert to module_platform_driver() (you can vote here - TWL can't be shifted without I2C) (Sent from Web Outlook - sorry if format is wrong) Best regards, - Grygorii Strashko From: linux-omap-ow...@vger.kernel.org [linux-omap-ow...@vger.kernel.org] on behalf of Kevin Hilman [khil...@linaro.org] Sent: Friday, May 31, 2013 11:56 PM To: Samuel Ortiz; Tony Lindgren Cc: linux-o...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; open list Subject: [PATCH] mfd: twl-core: convert to module_init() There's no reason for twl-core to be subsys_inicall anymore. Anything using twl that early needs a rethink, or deferred probe changes. Boot tested and RTC wake tested (via TWL RTC) on OMAP platforms: 3530/Beagle, 3730/Beagle-xM, 3530/Overo, 3730/Overo-STORM, 4430/Panda, 4460/Panda-ES. Cc: Tony Lindgren Signed-off-by: Kevin Hilman --- drivers/mfd/twl-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mfd/twl-core.c b/drivers/mfd/twl-core.c index 89ab4d9..7749b9e 100644 --- a/drivers/mfd/twl-core.c +++ b/drivers/mfd/twl-core.c @@ -1309,7 +1309,7 @@ static int __init twl_init(void) { return i2c_add_driver(_driver); } -subsys_initcall(twl_init); +module_init(twl_init); static void __exit twl_exit(void) { -- 1.8.2 -- To unsubscribe from this list: send the line "unsubscribe linux-omap" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] mfd: twl-core: convert to module_init()
Hi Kevin, It's done alreday here: https://patchwork.kernel.org/patch/2477541/ - mfd: twl-core: convert to module_i2c_driver() (Samuel has taken this one) - and - https://patchwork.kernel.org/patch/2477561/ - i2c: omap: convert to module_platform_driver() (you can vote here - TWL can't be shifted without I2C) (Sent from Web Outlook - sorry if format is wrong) Best regards, - Grygorii Strashko From: linux-omap-ow...@vger.kernel.org [linux-omap-ow...@vger.kernel.org] on behalf of Kevin Hilman [khil...@linaro.org] Sent: Friday, May 31, 2013 11:56 PM To: Samuel Ortiz; Tony Lindgren Cc: linux-o...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; open list Subject: [PATCH] mfd: twl-core: convert to module_init() There's no reason for twl-core to be subsys_inicall anymore. Anything using twl that early needs a rethink, or deferred probe changes. Boot tested and RTC wake tested (via TWL RTC) on OMAP platforms: 3530/Beagle, 3730/Beagle-xM, 3530/Overo, 3730/Overo-STORM, 4430/Panda, 4460/Panda-ES. Cc: Tony Lindgren t...@atomide.com Signed-off-by: Kevin Hilman khil...@linaro.org --- drivers/mfd/twl-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mfd/twl-core.c b/drivers/mfd/twl-core.c index 89ab4d9..7749b9e 100644 --- a/drivers/mfd/twl-core.c +++ b/drivers/mfd/twl-core.c @@ -1309,7 +1309,7 @@ static int __init twl_init(void) { return i2c_add_driver(twl_driver); } -subsys_initcall(twl_init); +module_init(twl_init); static void __exit twl_exit(void) { -- 1.8.2 -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/