Re: [PATCH v3 5/5] mm: enable CONFIG_MOVABLE_NODE on powerpc
On Tue, Sep 27, 2016 at 07:15:41AM +1000, Benjamin Herrenschmidt wrote:
> What is that business with a command line argument? Does that mean that
> we'll need some magic command line argument to properly handle LPC
> memory on CAPI devices or GPUs? If yes that's bad ... kernel arguments
> should be a last resort.

Well, movable_node is just a boolean, meaning "allow nodes which contain
only movable memory". It's _not_ like "movable_node=10,13-15,17", if
that's what you were thinking.

> We should have all the information we need from the device-tree.
>
> Note also that we shouldn't need to create those nodes at boot time,
> we need to add the ability to create the whole thing at runtime, we may
> know that there's an NPU with an LPC window in the system but we won't
> know if it's used until it is and for CAPI we just simply don't know
> until some PCI device gets turned into CAPI mode and starts claiming
> LPC memory...

Yes, this is what is planned for, if I'm understanding you correctly.

In the dt, the PCI device node has a phandle pointing to the memory
node. The memory node describes the window into which we can hotplug at
runtime.

--
Reza Arbab

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 4/5] powerpc/mm: restore top-down allocation when using movable_node
On Tue, Sep 27, 2016 at 07:12:31AM +1000, Benjamin Herrenschmidt wrote:
> In any case, if the memory hasn't been hotplug, this shouldn't be
> necessary as we shouldn't be considering it for allocation.

Right. To be clear, the background info I put in the commit log refers
to x86, where the SRAT can describe movable nodes which exist at boot.
They're trying to avoid allocations from those nodes before they've been
identified.

On power, movable nodes can only exist via hotplug, so that scenario
can't happen. We can immediately go back to top-down allocation. That is
the missing call being added in the patch.

--
Reza Arbab
Re: [RFC PATCH v2 3/5] futex: Throughput-optimized (TO) futexes
On 09/23/2016 09:02 AM, Thomas Gleixner wrote:
> On Thu, 22 Sep 2016, Waiman Long wrote:
>> Locking was done mostly by lock stealing. This is where most of the
>> performance benefit comes from, not optimistic spinning.
> How does the lock latency distribution of all this look like and how
> fair is the whole thing?

The TO futexes are unfair, as can be seen from the min/max thread times
listed above. It took the fastest thread 0.07s to complete all the
locking operations, whereas the slowest one needed 2.65s.

>> However, the situation reverses when I changed the critical section to
>> a 1us sleep. In this case,
> 1us sleep is going to add another syscall and therefor scheduling, so
> what? Or did you just extend the critical section busy time?

The 1us sleep will cause the spinning to stop and make all the waiters
sleep. This is to simulate the extreme case where TO futexes may not
have the performance advantage.

>> there will be no optimistic spinning. The performance results for 100k
>> locking operations were listed below.
>>
>>                 wait-wake futex     PI futex        TO futex
>>                 ---------------     --------        --------
>> max time            0.06s            9.32s           4.76s

Yes, wait-wake futex is the unfair one in this case.

>> min time            5.59s            9.36s           5.62s
>> average time        3.25s            9.35s           5.41s
>>
>> In this case, the TO futexes are fairer but perform worse than the
>> wait-wake futexes. That is because the lock handoff mechanism limit
>> the amount of lock stealing in the TO futexes while the wait-wake
>> futexes have no such restriction. When I disabled lock handoff, the TO
>> futexes would then perform similar to the wait-wake futexes.
> So the benefit of these new fangled futexes is only there for extreme
> short critical sections and a gazillion of threads fighting for the
> same futex, right?

Not really. Lock stealing will help performance when a gazillion threads
are fighting for the same futex. Optimistic spinning will help to reduce
the lock transfer latency because the waiter isn't sleeping, no matter
the number of threads.

One set of data that I haven't shown so far is that the performance
delta between wait-wake and TO futexes actually increases as the
critical section is lengthened. This is because, for a short critical
section, the waiters of a wait-wake futex may not actually go to sleep
because of the latency introduced by the code that has to be run before
they do a final check to see if the futex value has changed before going
to sleep. The longer the critical section, the higher the chance that
they actually sleep, and hence their performance gets worse relative to
the TO futexes.

For example, with a critical section of 50 pause instructions instead of
5, the performance gain is about 5X instead of about 1.6X in the latter
case.

> I really wonder how the average programmer should pick the right
> flavour, not to talk about any useful decision for something like glibc
> to pick the proper one.

I would say that TO futexes will have better performance in most cases.
Of course, I still need to run some real world benchmarks to quantify
the effect of the new futexes. I am hoping to get suggestions on a good
set of benchmarks to run.

Cheers,
Longman
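To put a number on "unfair" in the message above, one can compare the
slowest thread's completion time with the fastest one's. The helper
below is only an illustrative sketch: spread_ratio() is a made-up name
and not part of the futex patch, and it is meant to be fed the smallest
and largest per-thread times quoted in the thread, whichever row label
they appear under.

```c
#include <assert.h>

/*
 * Ratio of the slowest thread's completion time to the fastest one's.
 * 1.0 means every thread finished at the same time; larger values mean
 * a wider spread between threads, i.e. less fairness.
 * Returns -1.0 if the fastest time is not positive.
 */
double spread_ratio(double fastest, double slowest)
{
	assert(slowest >= fastest);	/* caller passes (small, large) */

	if (fastest <= 0.0)
		return -1.0;
	return slowest / fastest;
}
```

With the figures from the message, the spinning case gives about 38 for
TO futexes (2.65s / 0.07s). In the 1us-sleep case the ratios are about
93 for wait-wake (5.59s / 0.06s), about 1.004 for PI (9.36s / 9.32s) and
about 1.18 for TO (5.62s / 4.76s).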
Re: [PATCH -tip] locking/rtmutex: Reduce top-waiter blocking on a lock
On 09/23/2016 09:28 PM, Davidlohr Bueso wrote:
> +#ifdef CONFIG_RT_MUTEX_SPIN_ON_OWNER
> +static bool rt_mutex_spin_on_owner(struct rt_mutex *lock,
> +				     struct task_struct *owner)
> +{
> +	bool ret = true;
> +
> +	/*
> +	 * The last owner could have just released the lock,
> +	 * immediately try taking it again.
> +	 */
> +	if (!owner)
> +		goto done;
> +
> +	rcu_read_lock();
> +	while (rt_mutex_owner(lock) == owner) {
> +		/*
> +		 * Ensure we emit the owner->on_cpu, dereference _after_
> +		 * checking lock->owner still matches owner. If that fails,
> +		 * owner might point to freed memory. If it still matches,
> +		 * the rcu_read_lock() ensures the memory stays valid.
> +		 */
> +		barrier();
> +		if (!owner->on_cpu || need_resched()) {
> +			ret = false;
> +			break;
> +		}
> +
> +		cpu_relax_lowlatency();
> +	}
> +	rcu_read_unlock();
> +done:
> +	return ret;
> +}
> +

One issue that I saw is that the spinner may no longer be the top waiter
while spinning. Should we also check this condition in the spin loop?

Cheers,
Longman
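The question above can be made concrete with a small user-space model.
This is only a sketch under stated assumptions: the toy_* types and
toy_spin_step() are hypothetical stand-ins for the kernel structures,
locking and RCU are ignored entirely, and treating "no longer the top
waiter" as a reason to stop spinning and block is just one possible
answer, not something the quoted patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; field names are illustrative. */
struct toy_task {
	bool on_cpu;			/* task is currently running on a CPU */
};

struct toy_waiter {
	struct toy_task *task;
};

struct toy_rt_mutex {
	struct toy_task *owner;		/* NULL when the lock is free */
	struct toy_waiter *top_waiter;	/* highest-priority waiter, or NULL */
};

enum toy_spin {
	TOY_SPIN_AGAIN,	/* keep spinning */
	TOY_TRY_LOCK,	/* owner changed or released: try to take the lock */
	TOY_BLOCK,	/* stop spinning and go to sleep */
};

/* The checks of one loop iteration, plus the extra top-waiter check. */
enum toy_spin toy_spin_step(const struct toy_rt_mutex *lock,
			    const struct toy_task *owner,
			    const struct toy_waiter *self,
			    bool need_resched)
{
	assert(lock != NULL && self != NULL);

	/* Mirrors "if (!owner)" and the while-loop condition in the patch. */
	if (owner == NULL || lock->owner != owner)
		return TOY_TRY_LOCK;

	/* Mirrors "!owner->on_cpu || need_resched()". */
	if (!owner->on_cpu || need_resched)
		return TOY_BLOCK;

	/* Assumed addition: give up if another waiter displaced us. */
	if (lock->top_waiter != self)
		return TOY_BLOCK;

	return TOY_SPIN_AGAIN;
}
```

A caller would invoke toy_spin_step() once per iteration and leave the
loop on anything other than TOY_SPIN_AGAIN. In the real kernel, reading
the top waiter safely would need the appropriate locking, which this
model leaves out.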
Re: [PATCH v3 5/5] mm: enable CONFIG_MOVABLE_NODE on powerpc
On Sun, 2016-09-25 at 13:36 -0500, Reza Arbab wrote:
> To create a movable node, we need to hotplug all of its memory into
> ZONE_MOVABLE.
>
> Note that to do this, auto_online_blocks should be off. Since the memory
> will first be added to the default zone, we must explicitly use
> online_movable to online.
>
> Because such a node contains no normal memory, can_online_high_movable()
> will only allow us to do the onlining if CONFIG_MOVABLE_NODE is set.
> Enable the use of this config option on PPC64 platforms.

What is that business with a command line argument? Does that mean that
we'll need some magic command line argument to properly handle LPC
memory on CAPI devices or GPUs? If yes that's bad ... kernel arguments
should be a last resort.

We should have all the information we need from the device-tree.

Note also that we shouldn't need to create those nodes at boot time,
we need to add the ability to create the whole thing at runtime, we may
know that there's an NPU with an LPC window in the system but we won't
know if it's used until it is and for CAPI we just simply don't know
until some PCI device gets turned into CAPI mode and starts claiming
LPC memory...

Ben.

> Signed-off-by: Reza Arbab
> ---
>  Documentation/kernel-parameters.txt | 2 +-
>  mm/Kconfig                          | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index a4f4d69..3d8460d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			that the amount of memory usable for all allocations
>  			is not too small.
>
> -	movable_node	[KNL,X86] Boot-time switch to enable the effects
> +	movable_node	[KNL,X86,PPC] Boot-time switch to enable the effects
>  			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>
>  	MTD_Partition=	[MTD]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index be0ee11..4b19cd3 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -153,7 +153,7 @@ config MOVABLE_NODE
>  	bool "Enable to assign a node which has only movable memory"
>  	depends on HAVE_MEMBLOCK
>  	depends on NO_BOOTMEM
> -	depends on X86_64
> +	depends on X86_64 || PPC64
>  	depends on NUMA
>  	default n
>  	help
Re: [PATCH v3 4/5] powerpc/mm: restore top-down allocation when using movable_node
On Mon, Sep 26, 2016 at 09:17:43PM +0530, Aneesh Kumar K.V wrote:
>> +	/* bottom-up allocation may have been set by movable_node */
>> +	memblock_set_bottom_up(false);
>> +
>
> By then we have done few memblock allocation right ?

Yes, some allocations do occur while bottom-up is set.

> IMHO, we should do this early enough in prom.c after we do
> parse_early_param, with a comment there explaining that, we don't
> really support hotplug memblock and when we do that, this should be
> moved to a place where we can handle memblock allocation such that we
> avoid spreading memblock allocation to movable node.

Sure, we can do it earlier. The only consideration is that any potential
calls to memblock_mark_hotplug() happen before we reset to top-down.
Since we don't do that at all on power, the call can go anywhere.

--
Reza Arbab
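For readers unfamiliar with the two allocation directions discussed
here, the difference can be shown with a toy model. This is a sketch
only: struct toy_range and toy_alloc() are hypothetical names, the model
has a single free range, and it is not how the kernel's memblock
allocator is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one free physical range [base, end). */
struct toy_range {
	uint64_t base;
	uint64_t end;
};

/*
 * Carve `size` bytes out of the free range.
 * bottom_up == true  : take from the lowest free address.
 * bottom_up == false : take from the highest free address (top-down).
 * Returns the start address, or UINT64_MAX if the request does not fit.
 */
uint64_t toy_alloc(struct toy_range *avail, uint64_t size, bool bottom_up)
{
	uint64_t addr;

	assert(avail->base <= avail->end);

	if (size == 0 || avail->end - avail->base < size)
		return UINT64_MAX;

	if (bottom_up) {
		addr = avail->base;
		avail->base += size;
	} else {
		avail->end -= size;
		addr = avail->end;
	}
	return addr;
}
```

In this model, every allocation made while bottom_up is true lands at
the low end of the range, and flipping the flag afterwards, as the
memblock_set_bottom_up(false) call in the patch does for the real
allocator, sends later allocations to the high end without moving the
earlier ones.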
Re: [PATCH 0/2] Moving runnable code from Documentation (last 2 patches)
On Mon, Sep 26, 2016 at 11:40 AM, Shuah Khan wrote:
> This patch series contains the last 2 patches to complete moving runnable
> code from Documentation to selftests, samples, and tools.
>
> The first patch moves blackfin gptimers-example to samples and removes
> CONFIG_BUILD_DOCSRC.
>
> The second one updates 00-INDEX files under Documentation to reflect the
> move of runnable code from Documentation.

Looks good to me!

Reviewed-by: Kees Cook

-Kees

--
Kees Cook
Nexus Security
[PATCH 2/2] Doc: update 00-INDEX files to reflect the runnable code move
Update 00-INDEX files with the current file list to reflect the runnable
code move.

Signed-off-by: Shuah Khan
---
 Documentation/00-INDEX             | 2 --
 Documentation/arm/00-INDEX         | 2 --
 Documentation/filesystems/00-INDEX | 2 --
 Documentation/networking/00-INDEX  | 2 --
 Documentation/spi/00-INDEX         | 2 --
 Documentation/timers/00-INDEX      | 2 --
 6 files changed, 12 deletions(-)

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index cb9a6c6..b79d661 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -45,8 +45,6 @@ IRQ.txt
 	- description of what an IRQ is.
 Intel-IOMMU.txt
 	- basic info on the Intel IOMMU virtualization support.
-Makefile
-	- some files in Documentation dir are actually sample code to build
 ManagementStyle
 	- how to (attempt to) manage kernel hackers.
 RCU/
diff --git a/Documentation/arm/00-INDEX b/Documentation/arm/00-INDEX
index dea011c..b6e69fd 100644
--- a/Documentation/arm/00-INDEX
+++ b/Documentation/arm/00-INDEX
@@ -8,8 +8,6 @@ Interrupts
 	- ARM Interrupt subsystem documentation
 IXP4xx
 	- Intel IXP4xx Network processor.
-Makefile
-	- Build sourcefiles as part of the Documentation-build for arm
 Netwinder
 	- Netwinder specific documentation
 Porting
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 9922939..f66e748 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -2,8 +2,6 @@
 	- this file (info on some of the filesystems supported by linux).
 Locking
 	- info on locking rules as they pertain to Linux VFS.
-Makefile
-	- Makefile for building the filsystems-part of DocBook.
 9p.txt
 	- 9p (v9fs) is an implementation of the Plan 9 remote fs protocol.
 adfs.txt
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index 415154a..98f3d4b 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -10,8 +10,6 @@ LICENSE.qlge
 	- GPLv2 for QLogic Linux qlge NIC Driver
 LICENSE.qlcnic
 	- GPLv2 for QLogic Linux qlcnic NIC Driver
-Makefile
-	- Makefile for docsrc.
 PLIP.txt
 	- PLIP: The Parallel Line Internet Protocol device driver
 README.ipw2100
diff --git a/Documentation/spi/00-INDEX b/Documentation/spi/00-INDEX
index 4644bf0..8e4bb17 100644
--- a/Documentation/spi/00-INDEX
+++ b/Documentation/spi/00-INDEX
@@ -1,7 +1,5 @@
 00-INDEX
 	- this file.
-Makefile
-	- Makefile for the example sourcefiles.
 butterfly
 	- AVR Butterfly SPI driver overview and pin configuration.
 ep93xx_spi
diff --git a/Documentation/timers/00-INDEX b/Documentation/timers/00-INDEX
index ee212a2..6ee117b 100644
--- a/Documentation/timers/00-INDEX
+++ b/Documentation/timers/00-INDEX
@@ -8,8 +8,6 @@ hpet_example.c
 	- sample hpet timer test program
 hrtimers.txt
 	- subsystem for high-resolution kernel timers
-Makefile
-	- Build and link hpet_example
 NO_HZ.txt
 	- Summary of the different methods for the scheduler clock-interrupts management.
 timekeeping.txt
--
2.7.4
[PATCH 1/2] samples: move blackfin gptimers-example from Documentation
Move blackfin gptimers-example to samples and remove it from Documentation
Makefile. Update samples Kconfig and Makefile to build gptimers-example.

blackfin is the last CONFIG_BUILD_DOCSRC target in Documentation/Makefile,
hence this patch also includes changes to remove CONFIG_BUILD_DOCSRC from
Makefile and lib/Kconfig.debug.

Signed-off-by: Shuah Khan
---
 Documentation/Makefile                    |  1 -
 Documentation/blackfin/00-INDEX           |  4 --
 Documentation/blackfin/Makefile           |  5 --
 Documentation/blackfin/gptimers-example.c | 91 ---
 Makefile                                  |  3 -
 lib/Kconfig.debug                         |  9 ---
 samples/Kconfig                           |  6 ++
 samples/Makefile                          |  2 +-
 samples/blackfin/Makefile                 |  1 +
 samples/blackfin/gptimers-example.c       | 91 +++
 10 files changed, 99 insertions(+), 114 deletions(-)
 delete mode 100644 Documentation/Makefile
 delete mode 100644 Documentation/blackfin/Makefile
 delete mode 100644 Documentation/blackfin/gptimers-example.c
 create mode 100644 samples/blackfin/Makefile
 create mode 100644 samples/blackfin/gptimers-example.c

diff --git a/Documentation/Makefile b/Documentation/Makefile
deleted file mode 100644
index 8435965..000
--- a/Documentation/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-subdir-y := blackfin
diff --git a/Documentation/blackfin/00-INDEX b/Documentation/blackfin/00-INDEX
index c54fcdd..265a1ef 100644
--- a/Documentation/blackfin/00-INDEX
+++ b/Documentation/blackfin/00-INDEX
@@ -1,10 +1,6 @@
 00-INDEX
 	- This file
-Makefile
-	- Makefile for gptimers example file.
 bfin-gpio-notes.txt
 	- Notes in developing/using bfin-gpio driver.
 bfin-spi-notes.txt
 	- Notes for using bfin spi bus driver.
-gptimers-example.c
-	- gptimers example
diff --git a/Documentation/blackfin/Makefile b/Documentation/blackfin/Makefile
deleted file mode 100644
index 6782c58..000
--- a/Documentation/blackfin/Makefile
+++ /dev/null
@@ -1,5 +0,0 @@
-ifneq ($(CONFIG_BLACKFIN),)
-ifneq ($(CONFIG_BFIN_GPTIMERS),)
-obj-m := gptimers-example.o
-endif
-endif
diff --git a/Documentation/blackfin/gptimers-example.c b/Documentation/blackfin/gptimers-example.c
deleted file mode 100644
index 283eba9..000
--- a/Documentation/blackfin/gptimers-example.c
+++ /dev/null
@@ -1,91 +0,0 @@
-/*
- * Simple gptimers example
- *	http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:drivers:gptimers
- *
- * Copyright 2007-2009 Analog Devices Inc.
- *
- * Licensed under the GPL-2 or later.
- */
-
-#include
-#include
-
-#include
-#include
-
-/* ... random driver includes ... */
-
-#define DRIVER_NAME "gptimer_example"
-
-#ifdef IRQ_TIMER5
-#define SAMPLE_IRQ_TIMER IRQ_TIMER5
-#else
-#define SAMPLE_IRQ_TIMER IRQ_TIMER2
-#endif
-
-struct gptimer_data {
-	uint32_t period, width;
-};
-static struct gptimer_data data;
-
-/* ... random driver state ... */
-
-static irqreturn_t gptimer_example_irq(int irq, void *dev_id)
-{
-	struct gptimer_data *data = dev_id;
-
-	/* make sure it was our timer which caused the interrupt */
-	if (!get_gptimer_intr(TIMER5_id))
-		return IRQ_NONE;
-
-	/* read the width/period values that were captured for the waveform */
-	data->width = get_gptimer_pwidth(TIMER5_id);
-	data->period = get_gptimer_period(TIMER5_id);
-
-	/* acknowledge the interrupt */
-	clear_gptimer_intr(TIMER5_id);
-
-	/* tell the upper layers we took care of things */
-	return IRQ_HANDLED;
-}
-
-/* ... random driver code ... */
-
-static int __init gptimer_example_init(void)
-{
-	int ret;
-
-	/* grab the peripheral pins */
-	ret = peripheral_request(P_TMR5, DRIVER_NAME);
-	if (ret) {
-		printk(KERN_NOTICE DRIVER_NAME ": peripheral request failed\n");
-		return ret;
-	}
-
-	/* grab the IRQ for the timer */
-	ret = request_irq(SAMPLE_IRQ_TIMER, gptimer_example_irq,
-			IRQF_SHARED, DRIVER_NAME, );
-	if (ret) {
-		printk(KERN_NOTICE DRIVER_NAME ": IRQ request failed\n");
-		peripheral_free(P_TMR5);
-		return ret;
-	}
-
-	/* setup the timer and enable it */
-	set_gptimer_config(TIMER5_id,
-			WDTH_CAP | PULSE_HI | PERIOD_CNT | IRQ_ENA);
-	enable_gptimers(TIMER5bit);
-
-	return 0;
-}
-module_init(gptimer_example_init);
-
-static void __exit gptimer_example_exit(void)
-{
-	disable_gptimers(TIMER5bit);
-	free_irq(SAMPLE_IRQ_TIMER, );
-	peripheral_free(P_TMR5);
-}
-module_exit(gptimer_example_exit);
-
-MODULE_LICENSE("BSD");
diff --git a/Makefile b/Makefile
index 1a8c8dd..de5136a 100644
--- a/Makefile
+++ b/Makefile
@@ -926,9 +926,6 @@ vmlinux_prereq: $(vmlinux-deps) FORCE
 ifdef CONFIG_HEADERS_CHECK
[PATCH 0/2] Moving runnable code from Documentation (last 2 patches)
This patch series contains the last 2 patches to complete moving runnable
code from Documentation to selftests, samples, and tools.

The first patch moves blackfin gptimers-example to samples and removes
CONFIG_BUILD_DOCSRC.

The second one updates 00-INDEX files under Documentation to reflect the
move of runnable code from Documentation.

Shuah Khan (2):
  samples: move blackfin gptimers-example from Documentation
  Doc: update 00-INDEX files to reflect the runnable code move

 Documentation/00-INDEX                    |  2 -
 Documentation/Makefile                    |  1 -
 Documentation/arm/00-INDEX                |  2 -
 Documentation/blackfin/00-INDEX           |  4 --
 Documentation/blackfin/Makefile           |  5 --
 Documentation/blackfin/gptimers-example.c | 91 ---
 Documentation/filesystems/00-INDEX        |  2 -
 Documentation/networking/00-INDEX         |  2 -
 Documentation/spi/00-INDEX                |  2 -
 Documentation/timers/00-INDEX             |  2 -
 Makefile                                  |  3 -
 lib/Kconfig.debug                         |  9 ---
 samples/Kconfig                           |  6 ++
 samples/Makefile                          |  2 +-
 samples/blackfin/Makefile                 |  1 +
 samples/blackfin/gptimers-example.c       | 91 +++
 16 files changed, 99 insertions(+), 126 deletions(-)
 delete mode 100644 Documentation/Makefile
 delete mode 100644 Documentation/blackfin/Makefile
 delete mode 100644 Documentation/blackfin/gptimers-example.c
 create mode 100644 samples/blackfin/Makefile
 create mode 100644 samples/blackfin/gptimers-example.c

--
2.7.4
Re: [PATCH v3 5/5] mm: enable CONFIG_MOVABLE_NODE on powerpc
Reza Arbab writes:

> To create a movable node, we need to hotplug all of its memory into
> ZONE_MOVABLE.
>
> Note that to do this, auto_online_blocks should be off. Since the memory
> will first be added to the default zone, we must explicitly use
> online_movable to online.
>
> Because such a node contains no normal memory, can_online_high_movable()
> will only allow us to do the onlining if CONFIG_MOVABLE_NODE is set.
> Enable the use of this config option on PPC64 platforms.
>

Reviewed-by: Aneesh Kumar K.V

> Signed-off-by: Reza Arbab
> ---
>  Documentation/kernel-parameters.txt | 2 +-
>  mm/Kconfig                          | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index a4f4d69..3d8460d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			that the amount of memory usable for all allocations
>  			is not too small.
>
> -	movable_node	[KNL,X86] Boot-time switch to enable the effects
> +	movable_node	[KNL,X86,PPC] Boot-time switch to enable the effects
>  			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>
>  	MTD_Partition=	[MTD]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index be0ee11..4b19cd3 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -153,7 +153,7 @@ config MOVABLE_NODE
>  	bool "Enable to assign a node which has only movable memory"
>  	depends on HAVE_MEMBLOCK
>  	depends on NO_BOOTMEM
> -	depends on X86_64
> +	depends on X86_64 || PPC64
>  	depends on NUMA
>  	default n
>  	help
> --
> 1.8.3.1
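The onlining step described in the commit log is driven from user space
through the memory hotplug sysfs files. The following is a minimal
sketch, assuming the usual /sys/devices/system/memory/memoryN/state
layout; the helper names are illustrative and not part of the patch, and
which block numbers exist depends on the system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs "state" path for memory block `block`. 0 on success. */
int memory_block_state_path(char *buf, size_t len, unsigned int block)
{
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "/sys/devices/system/memory/memory%u/state",
		     block);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write `value` to the file at `path`. 0 on success, -1 on any error. */
int write_string(const char *path, const char *value)
{
	size_t len = strlen(value);
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (!f)
		return -1;
	if (fwrite(value, 1, len, f) != len)
		rc = -1;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/* Ask the kernel to online one hotplugged block into ZONE_MOVABLE. */
int online_block_movable(unsigned int block)
{
	char path[64];

	if (memory_block_state_path(path, sizeof(path), block))
		return -1;
	return write_string(path, "online_movable");
}
```

Calling online_block_movable() needs root and a block that is currently
offline. Per the commit log, automatic onlining would have to be off
first, which on kernels that provide the knob can be done by writing
"offline" to /sys/devices/system/memory/auto_online_blocks with the same
write_string() helper.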
Re: [RFC PATCH 00/11] pci: support for configurable PCI endpoint
Hi Arnd,

On Thursday 22 September 2016 07:04 PM, Arnd Bergmann wrote:
> On Thursday, September 15, 2016 2:03:05 PM CEST Kishon Vijay Abraham I wrote:
>> On Wednesday 14 September 2016 06:55 PM, Arnd Bergmann wrote:
>>> On Wednesday, September 14, 2016 10:41:56 AM CEST Kishon Vijay Abraham I wrote:
>>>
>>> I've added the drivers/ntb maintainers to Cc, given that there is
>>> a certain degree of overlap between your work and the existing
>>> code, I think they should be part of the discussion.
>>>
>>>> Known Limitation:
>>>>	*) Does not support multi-function devices
>>>
>>> If I understand it right, this was a problem for USB and adding
>>> it later made it somewhat inconsistent. Maybe we can at least
>>> try to come up with an idea of how multi-function devices
>>> could be handled even if we don't implement it until someone
>>> actually needs it.
>>
>> Actually IMO multi-function device in PCI should be much simpler than
>> it is for USB. In the case of USB, all the functions in a
>> multi-function device will share the same *usb configuration* . (USB
>> device can have multiple configuration but only one can be enabled at
>> a time). A multi-function USB device will still have a single
>> vendor-id/product-id/class... So I think a separate library
>> (composite.c) in USB makes sense.
>
> Ok, makes sense.
>
>> But in the case of PCI, every function can be treated independently
>> since all the functions have it's own 4KB configuration space. Each
>> function can be configured independently. Each can have it's own
>> vendor-id/product-id/class.. I'm not sure if we'll need a separate
>> library for PCI like we have for USB.
>
> I think it depends on whether we want to add the software multi-function
> support you mention.
>
>> Now the restriction for not allowing multi-function device is because
>> of the following structure definition.
>>
>> struct pci_epc {
>>	..
>>	struct pci_epf *epf;
>>	..
>> };
>>
>> EPC has a single reference to EPF and it is used *only* to notify the
>> function driver when the link is up. (If this can be changed to use
>> notification mechanism, multi-function devices can be supported here)
>>
>> One more place where this restriction arises is in designware driver
>>
>> struct dw_pcie_ep {
>>	..
>>	u8 bar_to_atu[6];
>>	..
>> };
>>
>> We use single ATU window to configure a BAR (in BAR). If there are
>> multiple functions, then this should also be modified since each
>> function has 6 BARs.
>>
>> This can be fixed without much effort unless some other issue props up.
>
> Ok.
>
>>>
>>> Is your hardware able to make the PCIe endpoint look like
>>> a device with multiple PCI functions, or would one have to
>>> do this in software inside of a single PCI function if we
>>> ever need it?
>>
>> The hardware I have doesn't support multiple PCI functions (like
>> having a separate configuration space for each function). It has a
>> dedicated space for configuration space supporting only one function.
>> [Section 24.9.7.3.2 PCIe_SS_EP_CFG_DBICS Register Description in [1]].
>>
>> yeah, it has to be done in software (but that won't be multi-function
>> device in PCI terms).
>>
>> [1] -> http://www.ti.com/lit/ug/spruhz6g/spruhz6g.pdf
>
> Ok, so in theory there can be other hardware (and quite likely is)
> that supports multiple functions, and we can extend the framework
> to support them without major obstacles, but your hardware doesn't,
> so you kept it simple with one hardcoded function, right?

Right, PCIe can have up to 8 functions. So the issues with the current
framework have to be fixed. I don't expect major obstacles with this as
of now.

>
> Seems completely reasonable to me.
>
>>>> TODO:
>>>>	*) access buffers in RC
>>>>	*) raise MSI interrupts
>>>>	*) Enable user space control for the RC side PCI driver
>>>
>>> The user space control would end up just being one of several
>>> gadget drivers, right? E.g. gadget drivers for standard hardware
>>> (8250 uart, ATA, NVMe, some ethernet) could be done as kernel
>>> drivers while a user space driver can be used for things that
>>> are more unusual and that don't need to interface to another
>>> part of the kernel?
>>
>> Actually I didn't mean that. It was more with respect to the host side
>> PCI test driver (drivers/misc/pci_endpoint_test.c). Right now it
>> validates BAR, irq itself. I wanted to change this so that the user
>> controls which tests to run. (Like for USB gadget zero tests,
>> testusb.c invokes ioctls to perform various tests). Similarly I want
>> to have a userspace program invoke pci_endpoint_test to perform
>> various PCI tests.
>
> Ok, I see. So what I described above would be yet another function
> driver that can be implemented, but so far, you have not planned
> to do that because there was not need, right?

Right. I felt pci_endpoint_test is the generic function that would be of
interest to all the vendors. Any new function can be added by taking