Re: Why is the deferred initcall patch not mainline?
On 21.10.2014 21:37, Bird, Tim wrote:

I'm going to respond to several comments in this one message (sorry for the likely confusion)

On Tuesday, October 21, 2014 9:31 AM, Nicolas Pitre [n...@fluxnic.net] wrote:
On Tue, 21 Oct 2014, Grant Likely wrote:
On Sat, Oct 18, 2014 at 9:11 AM, Bird, Tim <tim.b...@sonymobile.com> wrote:

The answer is pretty easy, I think. I tried to mainline it once but failed, and didn't really try again. If it is being found useful, we should try to mainline it again, this time with more persistence.

The reason it got rejected before IIRC was that you can accomplish a similar thing with modules, with no changes to the kernel. But that doesn't cover the case where the loadable modules feature of the kernel is turned off, which is common in very small systems.

It is a rather clumsy approach though since it requires changes to modules and it makes the configuration static per build. Could it instead be done by the kernel accepting a list of initcalls that should be deferred? It would depend I suppose on the cost of finding the initcalls to defer at boot time.

Yeah, I'm not a big fan of having to change kernel code in order to use the feature. I am quite intrigued by Geert Uytterhoeven's idea to add a 'D' option to the config system, so that the record of which modules to defer could be stored there. This is much better than hand-altering code. I don't know how difficult this would be to add to the kbuild system, but the mechanism for altering the macro would be, IMHO, very straightforward.

I should say that it's been quite some time since I worked on this, so some of my recollections may be fuzzy.

With regard to doing it dynamically, I'd have to think about how to do that. Having text-based lists of things to do at runtime seems to fit with how we're using device tree these days, but I'm not sure how that would work.
The code as it stands now is quite simple, just creating a new linker section to hold the list of deferred function pointers, re-using all existing routines for processing such lists, doing a few code changes to handle actually deferring the initialization and memory freeing, and finally creating a /proc entry to trigger the whole thing. In a modern kernel, the /proc trigger should definitely be moved to /sys.

Other than this, though, if you move to some other system of processing the list, you will have to create new infrastructure for working through the deferred module list, or make a change in the way the items are handled in the generic init function pointer processing.

A simple solution would be to just compare each item from each ...initcall.init section with a list of deferred functions, and not process them, until doing the deferred init. Note that the current technique uses the compiler and linker to do some of the work for list aggregation and processing, so that would have to be replaced with something else if you do it differently.

I missed the session unfortunately, are there some measurements available that I could look at? Which subsystems are typically the problem?

I, too, would like to know more about the problem. Any pointers?

Here is the elinux wiki page with some historical measurements:

http://elinux.org/Deferred_Initcalls

The example on the wiki page defers 2 USB modules, and it saved 530 milliseconds on an x86 system. This is consistent with what we saw on cameras at Sony.

This patch predated Arjan Van de Ven's fastboot work. I don't know if some of his parallelization (asynchronous module loading), and optimizations for USB loading made things substantially better than this. The USB spec makes it impossible to avoid a certain amount of delay in probing the USB busses.

USB was the main culprit, but we sometimes deferred other modules, if they were not in the fastpath for taking a picture.
Sony cameras had a goal of booting in .5 seconds, but I think the best we ever achieved was about 1.1 seconds, using deferred initcalls and a variety of other techniques.

To extend the list of usage examples, e.g.

-late_initcall(clk_debug_init);
+deferred_initcall(clk_debug_init);

I.e. you might want to have some debug features enabled, but you don't want to spend the time needed for initializing them in the time critical boot phase.

Best regards

Dirk

--
To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
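The deferral idea discussed in this message (skip the listed initcalls during boot, run them later on a trigger) can be sketched in userspace C. This is only an illustration under stated assumptions: the plain array demo_table stands in for the linker-built initcall section, matching is done by name as in the "list of initcalls that should be deferred" suggestion, and every identifier (struct initcall_entry, run_boot_initcalls, run_deferred_initcalls, the demo_* functions) is hypothetical and not taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for one initcall table entry. */
typedef int (*initcall_fn)(void);

struct initcall_entry {
	const char *name;	/* symbol name, matched against the defer list */
	initcall_fn fn;
	int done;		/* set once the initcall has run */
};

/* Demo bookkeeping: records the order in which initcalls actually ran. */
const char *call_log[8];
size_t call_count;

static void record(const char *name)
{
	assert(call_count < sizeof(call_log) / sizeof(call_log[0]));
	call_log[call_count++] = name;
}

static int demo_fs_init(void)        { record("demo_fs_init"); return 0; }
static int demo_usb_init(void)       { record("demo_usb_init"); return 0; }
static int demo_clk_debug_init(void) { record("demo_clk_debug_init"); return 0; }

/* Plain array standing in for the linker-aggregated initcall section. */
struct initcall_entry demo_table[] = {
	{ "demo_fs_init",        demo_fs_init,        0 },
	{ "demo_usb_init",       demo_usb_init,       0 },
	{ "demo_clk_debug_init", demo_clk_debug_init, 0 },
};

/* Returns 1 if name appears in the NULL-terminated defer list. */
int is_deferred(const char *name, const char *const *defer_list)
{
	for (; *defer_list; defer_list++)
		if (strcmp(*defer_list, name) == 0)
			return 1;
	return 0;
}

/* Boot-time pass: run every entry that is not on the defer list.
 * Returns the number of initcalls that were run. */
int run_boot_initcalls(struct initcall_entry *tbl, size_t n,
		       const char *const *defer_list)
{
	int ran = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].done || is_deferred(tbl[i].name, defer_list))
			continue;
		(void)tbl[i].fn();
		tbl[i].done = 1;
		ran++;
	}
	return ran;
}

/* Later trigger (the /proc or /sys write in the thread): run what was skipped.
 * Returns the number of initcalls that were run. */
int run_deferred_initcalls(struct initcall_entry *tbl, size_t n)
{
	int ran = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].done)
			continue;
		(void)tbl[i].fn();
		tbl[i].done = 1;
		ran++;
	}
	return ran;
}
```

With a defer list containing only "demo_usb_init", the boot pass runs the other two entries and a later call to run_deferred_initcalls() runs the USB one. Matching by name costs one string compare per initcall per list entry, which is the boot-time lookup cost raised in the thread.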
Re: Why is the deferred initcall patch not mainline?
On 18.10.2014 10:11, Bird, Tim wrote:

The answer is pretty easy, I think. I tried to mainline it once but failed, and didn't really try again. If it is being found useful, we should try to mainline it again, this time with more persistence.

The reason it got rejected before IIRC was that you can accomplish a similar thing with modules, with no changes to the kernel. But that doesn't cover the case where the loadable modules feature of the kernel is turned off, which is common in very small systems.

Just some other use cases:

You want to avoid the overhead of ELF module loading, even if module loading is on. We've seen a lot of cases where the overall boot time is a lot faster with the driver built into the kernel than with the driver loaded as a module, even though the kernel size, and therefore its load time, increases.

You want to have the driver available quite early, earlier than user space loads the modules, but you want the delay/wait time of that driver to run _after_ you have mounted the rootfs.

Thanks

Dirk

Btw.: Does anybody have the correct mail address of Chris? Maybe he has some opinions on this, too, as his talk is the starting point of this discussion ;)

Dirk Behme wrote:

Hi,

During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there was the unanswered question why the deferred initcall patch [2] isn't mainline yet. Does anybody remember?

Best regards

Dirk

[1] http://sched.co/1yG5fmY
[2] http://elinux.org/Deferred_Initcalls
Why is the deferred initcall patch not mainline?
Hi,

During the ELCE 2014 in Duesseldorf in Chris Hallinan's talk [1] there was the unanswered question why the deferred initcall patch [2] isn't mainline yet. Does anybody remember?

Best regards

Dirk

[1] http://sched.co/1yG5fmY
[2] http://elinux.org/Deferred_Initcalls
Boot time: Initial main memory initialization optimizations?
Hi,

regarding boot time optimization, on an embedded ARM Cortex-A9 based system with 512MB or 1GB main memory, we found that initializing this main memory takes a somewhat large amount of time. Initializing 512MB takes ~100ms, and the additional 512MB on the 1GB system takes another ~100ms. So in sum ~200ms for 1GB.

From a short look at this, it seems that most of the time is spent in arch/arm/mm/init.c in bootmem_init()/arm_bootmem_init()/arm_bootmem_free().

Has anybody already looked into whether any optimizations are possible here? Maybe even some hacks, if the main memory size (512MB/1GB) is always known?

Any pointers? I'm looking to reduce (a) the overall init time and maybe (b) the dependency on the memory size.

Thanks

Dirk
Boot time: Optimize CPU bring up?
Hi,

on an ARMv7 Freescale i.MX6 based system we are looking at optimizing the kernel boot time. Booting a 3.5.7 kernel with SMP=y and the kernel option 'nosmp' (the i.MX6 has single, dual and quad CPU versions) we get

[0.255927] hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
[0.256033] Setting up static identity map for 0x10426a28 - 0x10426a80
[0.260204] initcall spawn_ksoftirqd+0x0/0x58 returned 0 after 9765 usecs
[0.270363] initcall init_workqueues+0x0/0x39c returned 0 after 9765 usecs
[0.290265] initcall cpu_stop_init+0x0/0xd0 returned 0 after 19531 usecs
[0.310449] initcall rcu_spawn_kthreads+0x0/0xc0 returned 0 after 19531 usecs
[0.310699] Brought up 1 CPUs
[0.310712] SMP: Total of 1 processors activated (1581.05 BogoMIPS).

I.e. ~55ms just for bringing up the 1 CPU.

Looking into some details, e.g. cpu_stop_init(), the ~19531 usecs are there because the system 'hangs' 2 jiffies (CONFIG_HZ=100) in cpu_v7_do_idle().

For testing purposes, switching to CONFIG_HZ=1000 reduces the above 54ms to just ~4ms. But we are hesitant to switch the whole system to CONFIG_HZ=1000 just to optimize this part of the boot process.

Does anybody know why all the above parts are idling for some jiffies? Is there any other optimization than CONFIG_HZ=1000 possible?

In case there are any patches floating around or this was already discussed, any link would be nice.

Many thanks and best regards

Dirk
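The initcall durations in the log above line up with whole jiffies, which a little arithmetic makes visible. The sketch below is a hedged illustration, not kernel code: it assumes that the initcall_debug output approximates nanoseconds-to-microseconds with a right shift by 10 (a division by 1024), which is my recollection of init/main.c in kernels of that era and should be checked against the tree in use. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Length of a number of jiffies in nanoseconds for a given CONFIG_HZ. */
uint64_t jiffies_to_ns(unsigned int jiffies, unsigned int hz)
{
	assert(hz > 0);
	return (uint64_t)jiffies * 1000000000ULL / hz;
}

/* ASSUMPTION: the "returned 0 after N usecs" value is duration_ns >> 10,
 * i.e. a cheap divide-by-1024 instead of an exact divide-by-1000. */
uint64_t initcall_debug_usecs(uint64_t duration_ns)
{
	return duration_ns >> 10;
}
```

Under that assumption, 1 jiffy at CONFIG_HZ=100 prints as 9765 usecs and 2 jiffies as 19531 usecs, which are exactly the values reported for the four initcalls. The same waits at CONFIG_HZ=1000 would print as 976 and 1953 usecs.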
[PATCH 2/2] mmc: block: remove unused name_idx
With the previous patch

  mmc: block: mmcblkN: use slot index instead of dynamic name index

name_idx is not needed any more.

Signed-off-by: Dirk Behme <dirk.be...@de.bosch.com>
CC: Jassi Brar <jaswinder.si...@linaro.org>
CC: Chris Ball <c...@laptop.org>
---
 drivers/mmc/card/block.c |   16 ----------------
 1 files changed, 0 insertions(+), 16 deletions(-)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index a01d306..555d840 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -74,7 +74,6 @@ static int max_devices;
 
 /* 256 minors, so at most 256 separate devices */
 static DECLARE_BITMAP(dev_use, 256);
-static DECLARE_BITMAP(name_use, 256);
 
 /*
  * There is one mmc_blk_data per slot.
@@ -92,7 +91,6 @@ struct mmc_blk_data {
 	unsigned int	usage;
 	unsigned int	read_only;
 	unsigned int	part_type;
-	unsigned int	name_idx;
 	unsigned int	reset_done;
 #define MMC_BLK_READ		BIT(0)
 #define MMC_BLK_WRITE		BIT(1)
@@ -1458,19 +1456,6 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 		goto out;
 	}
 
-	/*
-	 * !subname implies we are creating main mmc_blk_data that will be
-	 * associated with mmc_card with mmc_set_drvdata. Due to device
-	 * partitions, devidx will not coincide with a per-physical card
-	 * index anymore so we keep track of a name index.
-	 */
-	if (!subname) {
-		md->name_idx = find_first_zero_bit(name_use, max_devices);
-		__set_bit(md->name_idx, name_use);
-	} else
-		md->name_idx = ((struct mmc_blk_data *)
-			dev_to_disk(parent)->private_data)->name_idx;
-
 	md->area_type = area_type;
 
 	/*
@@ -1660,7 +1645,6 @@ static void mmc_blk_remove_parts(struct mmc_card *card,
 	struct list_head *pos, *q;
 	struct mmc_blk_data *part_md;
 
-	__clear_bit(md->name_idx, name_use);
 	list_for_each_safe(pos, q, &md->part) {
 		part_md = list_entry(pos, struct mmc_blk_data, part);
 		list_del(pos);
--
1.7.0.4
Re: [PATCH] module: Use binary search in lookup_symbol()
On 17.05.2011 22:56, Alessio Igor Bogani wrote:

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abog...@kernel.org>
---
 kernel/module.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 1e2b657..795bdc7 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2055,11 +2055,8 @@ static const struct kernel_symbol *lookup_symbol(const char *name,
 	const struct kernel_symbol *start,
 	const struct kernel_symbol *stop)
 {
-	const struct kernel_symbol *ks = start;
-	for (; ks < stop; ks++)
-		if (strcmp(ks->name, name) == 0)
-			return ks;
-	return NULL;
+	return bsearch(name, start, stop - start,
+			sizeof(struct kernel_symbol), cmp_name);
 }
 
 static int is_exported(const char *name, unsigned long value,

The old version with the warning is in linux-next now

http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=903996de9b35213aaa4162c24351a2cb2931d9ac

Best regards

Dirk
Re: [PATCH 0/4] Speed up the symbols' resolution process V4
On 16.04.2011 15:26, Alessio Igor Bogani wrote:

The intent of this patch is to speed up the symbols resolution process.

This objective is achieved by sorting all ksymtab* and kcrctab* symbols (those which reside both in the kernel and in the modules) and thus using the fast binary search.

To avoid adding lots of code for symbol sorting I rely on the linker, which can easily do the job thanks to a little trick. The trick isn't really beautiful to see but permits minimal changes to the code and build process. Indeed the patch is very simple and short.

In the first place I changed the code to place every symbol in a different section (for example: "___ksymtab" sec "__" #sym) at compile time (this is the above-mentioned trick!). Then I ask the linker to sort and merge all these sections into the appropriate ones (for example: __ksymtab) at link time using the linker scripts. Once all symbols are sorted we can use binary search instead of the linear one.

I'm fairly sure that this is a good speed improvement even though I haven't done any comprehensive benchmarking (but a simple one follows). In any case I would be very happy to receive suggestions about how to do it. As a side effect, the boot time should be reduced as well (in proportion to the number of modules and symbols involved at boot stage).

I hope that you find this interesting!

This work was supported by a hardware donation from the CE Linux Forum.

Thanks to Ian Lance Taylor for help about how the linker works.
Changes since V3:
*) Please ignore this version completely

Changes since V2:
*) Fix a bug in each_symbol() semantics by Anders Kaseorg
*) Split the work in three patches as requested by Rusty Russell
*) Add a generic binary search implementation made by Tim Abbott
*) Remove CONFIG_SYMBOLS_BSEARCH kernel option

Changes since V1:
*) Merge all patches into only one
*) Remove few useless things
*) Introduce CONFIG_SYMBOLS_BSEARCH kernel option

Alessio Igor Bogani (3):
  module: Restructure each_symbol() code
  module: Sort exported symbols
  module: Use the binary search for symbols resolution

Tim Abbott (1):
  lib: Add generic binary search function to the kernel.

 include/asm-generic/vmlinux.lds.h |   20
 include/linux/bsearch.h           |    9
 include/linux/module.h            |    4 +-
 kernel/module.c                   |   84 -
 lib/Makefile                      |    3 +-
 lib/bsearch.c                     |   53 +++
 scripts/module-common.lds         |   11 +
 7 files changed, 151 insertions(+), 33 deletions(-)
 create mode 100644 include/linux/bsearch.h
 create mode 100644 lib/bsearch.c

Tested-by: Dirk Behme <dirk.be...@googlemail.com>

On an embedded ARM system that insmods a large number of modules, the overall module load time improves by up to ~1s. Great! :)

It would be nice to get these patches into mainline asap.

Many thanks

Dirk
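The core of the series (sort the exported symbols once, then resolve names by binary search) can be illustrated with a small userspace sketch. It makes several assumptions that are not from the patches: struct ksym is a hypothetical stand-in for struct kernel_symbol, qsort() at run time stands in for the sorting that the linker does at link time, the C library bsearch() stands in for the kernel's lib/bsearch.c, and cmp_name here is only a guess at the shape of the comparator that the posted diff references.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct kernel_symbol. */
struct ksym {
	unsigned long value;
	const char *name;
};

/* bsearch() comparator: the key is a name string, the element a table entry. */
static int cmp_name(const void *key, const void *elem)
{
	const struct ksym *ks = elem;

	return strcmp((const char *)key, ks->name);
}

/* qsort() comparator: orders two table entries by name. */
static int cmp_entry(const void *a, const void *b)
{
	const struct ksym *ka = a;
	const struct ksym *kb = b;

	return strcmp(ka->name, kb->name);
}

/* Stand-in for the link-time sort done by the linker script. */
void sort_symbols(struct ksym *start, struct ksym *stop)
{
	assert(start <= stop);
	qsort(start, (size_t)(stop - start), sizeof(*start), cmp_entry);
}

/* Binary-search lookup; only valid on a table sorted by name. */
const struct ksym *lookup_symbol_sorted(const char *name,
					const struct ksym *start,
					const struct ksym *stop)
{
	assert(start <= stop);
	return bsearch(name, start, (size_t)(stop - start),
		       sizeof(struct ksym), cmp_name);
}
```

The lookup needs on the order of log2(N) string comparisons instead of N, which is where the module load time saving comes from when many symbols are resolved. The precondition matters: calling the lookup on an unsorted table may miss symbols that are present.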
LinuxCon Europe 2011 == ELCE 2011?
Is the LinuxCon Europe October 26 - 28, 2011, Prague http://events.linuxfoundation.org/events/linuxcon-europe now the same as the Embedded Linux Conference Europe (ELCE) 2011? There is some rumor that these were merged. But the above page doesn't even mention the string 'embedded'. Thanks Dirk
Re: New fast(?)-boot results on ARM
Sascha Hauer wrote:
On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:

Hi,

On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:

That's bad :-) So there is no room for improvement any more in our ARM boot sequences ...

on x86 we're doing pretty well ;-)

On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from power-on through the kernel up to starting init. This is with
- no delay in u-boot-v2
- rootfs on NAND (UBIFS)
- quiet
- precalculated loops-per-jiffy
- zImage kernel instead of uImage

Here's a little video of our demo system booting:

http://www.youtube.com/watch?v=xDbUnNsj0cI

As you can see there, it needs about 15 s from the release of the reset button up to the moment where the application shows its Qt 4.5.2 based GUI (which is when we fade over from the initial framebuffer to the final one, in order to hide the qt application startup noise).

And below is the boot log (after turning quiet off again). The numbers are the timestamp and the delta to the last timestamp, measured on the controlling PC by looking at the serial console output. The ptx_ts script starts when the regexp is found, so the numbers start basically at the moment when u-boot-v2 has initialized the system up to the point where we can see something.

Result:
- 2.4 s up from u-boot to the end of "Uncompressing Linux"
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until mounted root

So we basically have 7 s for the kernel. The rest is userspace, which hasn't seen much optimization yet, other than trying to start the GUI application as early as possible, while doing all other init stuff in parallel. Adding quiet brings us another 300 ms.

That's a factor of 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.
Robert

r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
[ 13.522625] 0.043189
[ 13.546627] 0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
[ 13.558613] 0.011986
[ 13.690643] 0.132030 _ ___ _
[ 13.690731] 0.000088 _ __ | |__ _ _ / ___/ _ \| _ \| |
[ 13.698595] 0.007864 | '_ \| '_ \| | | | | | | | | |_) | _|
[ 13.698654] 0.000059 | |_) | | | | |_| | |__| |_| | _ | |___
[ 13.702581] 0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
[ 13.706573] 0.003992 |_| |___/
[ 13.706622] 0.000049
[ 13.725043] 0.018421
[ 14.742608] 1.017565

I made some changes suggested in this thread:
- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not know what exactly I changed)

Already looks much better:

[ 0.000005] 0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[ 0.000026] 0.000021
[ 0.000041] 0.000015 Board: Phytec phyCORE-i.MX27
[ 0.000054] 0.000013 cfi_probe: cfi_flash base: 0xc000 size: 0x0200
[ 0.000067] 0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[ 0.000080] 0.000013 im...@imxfb0: i.MX Framebuffer driver
[ 0.000092] 0.000012 dma_alloc: 0xa6f56e40 0x1000
[ 0.000105] 0.000013 dma_alloc: 0xa6f57088 0x1000
[ 0.000118] 0.000013 dev_protect: currently broken
[ 0.000129] 0.000011 Using environment in NOR Flash
[ 0.000141] 0.000012 initialising PLLs
[ 0.128972] 0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
[ 0.128995] 0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
[ 0.129008] 0.000013 running /env/bin/init...
[ 0.224963] 0.095955
[ 0.224984] 0.000021 Hit any key to stop autoboot: 0
[ 0.224999] 0.000015 copy
[ 0.592964] 0.367965 done
[ 0.652010] 0.059046 Linux version 2.6.31-rc4-4-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009

So these are ~0.6 s in the boot loader and kernel copy until the kernel starts, correct? What's the size of the uncompressed kernel copied here?

Best regards

Dirk

Btw.: I tried to summarize some hints given in this thread in

http://elinux.org/Boot_Time#Boot_time_check_list

Please feel free to add and correct stuff!

[ 0.652030] 0.000020 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[ 0.652044] 0.000014 CPU: VIVT data cache, VIVT instruction cache
[ 0.652057] 0.000013 Machine: phyCORE-i.MX27
[ 0.652069] 0.000012 Memory policy: ECC disabled, Data cache writeback
[ 0.652082] 0.000013 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32512
[ 0.706012] 0.053930