Re: New fast(?)-boot results on ARM
On Wednesday 09 September 2009 16:33, Johannes Stezenbach wrote:
Sorry for slow reply.

On Fri, Sep 04, 2009 at 06:16:26PM +0200, Wolfram Sang wrote:
Now that microcom is in Debian sid (thanks!), where can I find ptx_ts? It seems to be quite useful.

Back from the holidays, so here it is: http://pengutronix.de/software/ptx_ts/index_en.html
Hope it can be useful...

Yes, it is. Thanks!

BTW, some feedback about microcom:
- the choice of ^\ as an escape character is unfortunate since that is usually mapped to send SIGQUIT in the tty; a better choice would be ^] (like telnet) or ^A (like minicom)
- typing the escape character immediately causes the menu to be displayed, so one cannot send a break sequence for SysRq without cluttering up the screen

Would you take patches for that?

Sure.
--
vda
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: New fast(?)-boot results on ARM
On Wed, Aug 19, 2009 at 06:20:13PM +0200, Dirk Behme wrote:
Yes, correct. The copying itself is between 'copy' and 'done' so it takes about 0.4s.

What's the size of the uncompressed kernel copied here?

The image is about 2.8MB, but I copied the whole partition of 3MB because with raw images you can't detect the image size.

With 3MB copied in ~0.4s you get ~8MB/s. This really depends on your HW, but I would think with standard NOR flashes you should be able to do at least two (three?) times better. Have you already checked the memory (NOR flash) timings configured in your SoC?

It's NAND flash, so there's not much timing to optimize. What's interesting about this is that the kernel NAND driver is much slower than the one in U-Boot. Looking at it, it turned out that the kernel driver uses interrupts to wait for the controller to become ready. Switching this to polling nearly doubles the NAND performance. UBI mounts much faster now and this cuts another few seconds off the boot process :)

Sascha
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
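The ~8MB/s estimate in this message can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 3MB partition size and the ~0.4s copy time come from the thread, the more precise 0.367965s is the 'copy' to 'done' delta from the posted boot log, and the function name is made up for illustration.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Return the average copy rate in MB/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


partition_mb = 3.0  # whole kernel partition, as stated in the thread

# Rounded copy time quoted in the mail.
rounded = throughput_mb_per_s(partition_mb, 0.4)

# Delta between 'copy' and 'done' in the posted boot log.
measured = throughput_mb_per_s(partition_mb, 0.367965)

print(f"rounded:  {rounded:.1f} MB/s")
print(f"measured: {measured:.1f} MB/s")
```

Both figures land close to the ~8MB/s quoted above, so the estimate is consistent with the log.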
Re: New fast(?)-boot results on ARM
On Tue, Aug 18, 2009 at 05:31:42PM +0200, Dirk Behme wrote:
Sascha Hauer wrote:
On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
Hi,

On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
That's bad :-) So there is no room for improvement any more in our ARM boot sequences ... on x86 we're doing pretty well ;-)

On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from power-on through the kernel up to starting init. This is with
- no delay in u-boot-v2
- rootfs on NAND (UBIFS)
- quiet
- precalculated loops-per-jiffy
- zImage kernel instead of uImage

Here's a little video of our demo system booting: http://www.youtube.com/watch?v=xDbUnNsj0cI

As you can see there, it needs about 15 s from the release of the reset button up to the moment where the application shows its Qt 4.5.2 based GUI (which is when we fade over from the initial framebuffer to the final one, in order to hide the Qt application startup noise).

And below is the boot log (after turning quiet off again). The numbers are the timestamp and the delta to the last timestamp, measured on the controlling PC by looking at the serial console output. The ptx_ts script starts when the regexp was found, so the numbers start basically in the moment when u-boot-v2 has initialized the system up to the point where we can see something.

Result:
- 2.4 s up from u-boot to the end of Uncompressing Linux
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until mounted root

So we basically have 7 s for the kernel. The rest is userspace, which hasn't seen much optimization yet, other than trying to start the GUI application as early as possible, while doing all other init stuff in parallel. Adding quiet brings us another 300 ms.

That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

Robert

r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
[ 13.522625] 0.043189
[ 13.546627] 0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
[ 13.558613] 0.011986
[ 13.690643] 0.132030 _ ___ _
[ 13.690731] 0.000088 _ __ | |__ _ _ / ___/ _ \| _ \| |
[ 13.698595] 0.007864 | '_ \| '_ \| | | | | | | | | |_) | _|
[ 13.698654] 0.000059 | |_) | | | | |_| | |__| |_| | _ | |___
[ 13.702581] 0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
[ 13.706573] 0.003992 |_| |___/
[ 13.706622] 0.000049
[ 13.725043] 0.018421
[ 14.742608] 1.017565

I made some changes suggested in this thread:
- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not know what exactly I changed)

Already looks much better:

[ 0.000005] 0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[ 0.000026] 0.000021
[ 0.000041] 0.000015 Board: Phytec phyCORE-i.MX27
[ 0.000054] 0.000013 cfi_probe: cfi_flash base: 0xc000 size: 0x0200
[ 0.000067] 0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[ 0.000080] 0.000013 im...@imxfb0: i.MX Framebuffer driver
[ 0.000092] 0.000012 dma_alloc: 0xa6f56e40 0x1000
[ 0.000105] 0.000013 dma_alloc: 0xa6f57088 0x1000
[ 0.000118] 0.000013 dev_protect: currently broken
[ 0.000129] 0.000011 Using environment in NOR Flash
[ 0.000141] 0.000012 initialising PLLs
[ 0.128972] 0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
[ 0.128995] 0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
[ 0.129008] 0.000013 running /env/bin/init...
[ 0.224963] 0.095955
[ 0.224984] 0.000021 Hit any key to stop autoboot: 0
[ 0.224999] 0.000015 copy
[ 0.592964] 0.367965 done
[ 0.652010] 0.059046 Linux version 2.6.31-rc4-4-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009

So, this is ~0.6 s in boot loader and kernel copy until kernel starts, correct?

Yes, correct. The copying itself is between 'copy' and 'done' so it takes about 0.4s.

What's the size of the uncompressed kernel copied here?

The image is about 2.8MB, but I copied the whole partition of 3MB because with raw images you can't detect the image size.

Btw.: I tried to summarize some hints given in this thread in http://elinux.org/Boot_Time#Boot_time_check_list

Nice work!

Regards
Sascha
--
Pengutronix e.K. |
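For readers who want to reproduce this kind of log, the timestamping described in the quoted mail (time since the first line matching a regexp, plus the delta to the previous line) can be sketched in Python. This is not the ptx_ts implementation; the function name stamp_lines, its parameters and the output format are assumptions made for illustration only.

```python
import re
import time


def stamp_lines(lines, start_pattern, clock=time.monotonic):
    """Yield lines prefixed with elapsed time and delta to the previous line.

    Nothing is emitted until a line matches start_pattern; that line
    becomes time zero.
    """
    regex = re.compile(start_pattern)
    start = None
    last = None
    for line in lines:
        now = clock()
        if start is None:
            if not regex.search(line):
                continue  # ignore everything before the trigger line
            start = last = now
        yield f"[{now - start:10.6f}] {now - last:9.6f} {line.rstrip()}"
        last = now


# Demonstration with a fake clock so that the result is reproducible.
ticks = iter([0.0, 1.0, 1.5, 1.75])
demo_input = ["noise\n", "U-Boot 2.0.0\n", "copy\n", "done\n"]
out = list(stamp_lines(demo_input, r"U-Boot", clock=lambda: next(ticks)))
for stamped in out:
    print(stamped)
```

With a real serial console one would pass the line stream coming from the terminal program instead of the demo list. Because the timestamps are taken on the host, they include serial transfer and host scheduling latency.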
Re: New fast(?)-boot results on ARM
Marco,

On Tue, Aug 18, 2009 at 12:06:48PM +0200, Marco Stornelli wrote:
Yeah, I agree, do you really need udevd, device file creation at every start-up in /dev? Usually static device nodes and mdev for hotplug are enough, or at least you could use a simple script to create the device files only once (mdev -s). About the fs, do you really need a rootfs with ubifs? I mean, you could split your fs. You could use a read-only fs (SquashFS for example) for your root-fs, ubifs for permanent storage data (mounted under /data for example) and a ram fs for volatile data.

Well, we try to find out what is possible with a fast booting Linux system which *still* is as vanilla as possible. All the boot-in-one-second systems out there are highly squeezed, which is surely good if you have a scenario with high production volumes. You can do the optimization in the last steps then, and it doesn't really matter how much time you spend on testing to get from a system that works for a developer to a production system.

For most of our use cases here at Pengutronix, we see that:
- Customers want in-system upgradability on a per-package basis; so the flash filesystems should normally be r/o, but may be remounted r/w.
- Development systems should be close to production systems, in order to be able to have more early testing; so things like printk-ripout or special non-mainline patches/tweaks should be avoided as far as possible.
- In general we want to have our systems close to what the mainline does; Automation Embedded is only a small market, and anything which is *not* specific to these markets but mainline is good.

So let's see what we'll reach while trying what people have suggested.

Thanks,
rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
Re: New fast(?)-boot results on ARM
On Tue, Aug 18, 2009 at 12:21, Robert Schwebel <r.schwe...@pengutronix.de> wrote:
- In general we want to have our systems close to what the mainline does; Automation Embedded is only a small market, and anything which is *not* specific to these markets but mainline is good.

BTW, what is your mainline (or it looks like you mean upstream)?
Re: New fast(?)-boot results on ARM
On Tue, Aug 18, 2009 at 12:34:52PM +0200, Alex Riesen wrote:
On Tue, Aug 18, 2009 at 12:21, Robert Schwebel <r.schwe...@pengutronix.de> wrote:
- In general we want to have our systems close to what the mainline does; Automation Embedded is only a small market, and anything which is *not* specific to these markets but mainline is good.

BTW, what is your mainline (or it looks like you mean upstream)?

unpatched kernel.org

rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
Re: New fast(?)-boot results on ARM
On Tue, Aug 18, 2009 at 12:48:50PM +0200, Alex Riesen wrote:
But many of the problems you described and suggested solutions point at userspace, right? (like pre-defined static /dev, mdev script, or using of specially designed rootfs)

Yes, right. But even there, mdev is more in the embedded special league than udev, for example, and highly specialized read-only root filesystems as well.

rsc
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
Re: New fast(?)-boot results on ARM
On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
[...]
That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

I made some changes suggested in this thread:
- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not know what exactly I changed)

Already looks much better:

[ 0.000005] 0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[ 0.000026] 0.000021
[ 0.000041] 0.000015 Board: Phytec phyCORE-i.MX27
[ 0.000054] 0.000013 cfi_probe: cfi_flash base: 0xc000 size: 0x0200
[ 0.000067] 0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[ 0.000080] 0.000013 im...@imxfb0: i.MX Framebuffer driver
[ 0.000092] 0.000012 dma_alloc: 0xa6f56e40 0x1000
[ 0.000105] 0.000013 dma_alloc: 0xa6f57088 0x1000
[ 0.000118] 0.000013 dev_protect: currently broken
[ 0.000129] 0.000011 Using environment in NOR Flash
[ 0.000141] 0.000012 initialising PLLs
[ 0.128972] 0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
[ 0.128995] 0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
[ 0.129008] 0.000013 running /env/bin/init...
[ 0.224963] 0.095955
[ 0.224984] 0.000021 Hit any key to stop autoboot: 0
[ 0.224999] 0.000015 copy
[ 0.592964] 0.367965 done
[ 0.652010] 0.059046 Linux version 2.6.31-rc4-4-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
[ 0.652030] 0.000020 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[ 0.652044] 0.000014 CPU: VIVT data cache, VIVT instruction cache
[ 0.652057] 0.000013 Machine: phyCORE-i.MX27
[ 0.652069] 0.000012 Memory policy: ECC disabled, Data cache writeback
[ 0.652082] 0.000013 Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32512
[ 0.706012] 0.053930 Kernel command line: console=ttymxc0,115200 earlyprintk lpj=995328 mt9v022.sensor_type=color ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0::: ubi.mtd=7 root=ubi0:root rootfstype=ubifs
Re: New fast(?)-boot results on ARM
Sascha Hauer wrote:
[...]
I made some changes suggested in this thread:
[...]
Already looks much better:
[...]
[ 0.224999] 0.000015 copy
[ 0.592964] 0.367965 done
[ 0.652010] 0.059046 Linux version 2.6.31-rc4-4-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009

So, this is ~0.6 s in boot loader and kernel copy until kernel starts, correct?

What's the size of the uncompressed kernel copied here?

Best regards
Dirk

Btw.: I tried to summarize some hints given in this thread in http://elinux.org/Boot_Time#Boot_time_check_list
Please feel free to add and correct stuff!
Re: New fast(?)-boot results on ARM
Dirk Behme wrote:
Sascha Hauer wrote:
[...]
So, this is ~0.6 s in boot loader and kernel copy until kernel starts, correct?

What's the size of the uncompressed kernel copied here?

Best regards
Dirk

Btw.: I tried to summarize some hints given in this thread in http://elinux.org/Boot_Time#Boot_time_check_list
Please feel free to add and correct stuff!

It's good documentation, good work. Going from 14 s to 5 s is, I think, a very good result. In reference to Robert's previous response, I think that it's a good thing to use a vanilla kernel and avoid strange and specific or not mature solutions, but it needs to use the right tool for
Re: New fast(?)-boot results on ARM
Robert Schwebel wrote:
On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
[ 5.082616] 0.007992 RPC: Registered tcp transport module.
[ 5.605159] 0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
[ 6.602621] 0.997462 IP-Config: Complete:
[ 6.606638] 0.004017 device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
[ 6.614588] 0.007950 host=192.168.23.197, domain=, nis-domain=(none),
[ 6.618652] 0.004064 bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=

Well, this ~1 second is not really kernel's fault, it's DHCP delay. But, do you need to do it at this moment? You do not seem to be using networking filesystems. You can run DHCP client in userspace.

The board has ip autoconfig configured in, because we also use tftp/nfs boot for development. But it had been disabled on the commandline: ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
That shouldn't do dhcp, right?

I think it doesn't, but I'm not positive. The DHCP transmissions themselves don't take very long. There are some very long timeouts in the network code paths, which appear to be used whether you specify a static address or not.

See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN in net/ipv4/ipconfig.c. They are set to ridiculously long values. In my experience, you can cut them down considerably with no dangerous side effects (but I haven't asked the network guys about the possible downsides).

Here's a patch which I've used in the past. (Sorry if it doesn't apply cleanly, I just extracted it from a PDF and the whitespace may have gotten messed up. It's short enough that you can hand-edit the files if there's a problem.)

I'd like to hear back, if you apply this, whether it shortens the network startup time for you.

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 42065ff..e42d83f 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -86,8 +86,10 @@
 #endif

 /* Define the friendly delay before and after opening net devices */
-#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
-#define CONF_POST_OPEN 1 /* After opening: 1 second */
+/*#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
+/*#define CONF_POST_OPEN 1 /* After opening: 1 second */
+#define CONF_PRE_OPEN 5 /* Before opening: 5 milli seconds */
+#define CONF_POST_OPEN 10 /* After opening: 10 milli seconds */

 /* Define the timeout for waiting for a DHCP/BOOTP/RARP reply */
 #define CONF_OPEN_RETRIES 2 /* (Re)open devices twice */
@@ -1292,7 +1294,7 @@ static int __init ip_auto_config(void)
 		return -1;

 	/* Give drivers a chance to settle */
-	ssleep(CONF_POST_OPEN);
+	msleep(CONF_POST_OPEN);

 	/*
 	 * If the config information is insufficient (e.g., our IP address or

=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================
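As a rough check of what the patch could save, the fixed delays before and after it can be added up. The values and units below are taken from the comments in the diff (1/2 second and 1 second before, 5 and 10 milliseconds after); the variable names are made up for this sketch, and the actual gain on a given board will also depend on link negotiation and on anything else that runs during IP configuration.

```python
# Fixed delays in ip_auto_config(), as described by the patch comments.
OLD_PRE_OPEN_MS = 500    # "Before opening: 1/2 second"
OLD_POST_OPEN_S = 1      # "After opening: 1 second" (slept with ssleep)
NEW_PRE_OPEN_MS = 5      # "Before opening: 5 milli seconds"
NEW_POST_OPEN_MS = 10    # "After opening: 10 milli seconds" (slept with msleep)

old_total_ms = OLD_PRE_OPEN_MS + OLD_POST_OPEN_S * 1000
new_total_ms = NEW_PRE_OPEN_MS + NEW_POST_OPEN_MS
saved_ms = old_total_ms - new_total_ms

print(f"before: {old_total_ms} ms, after: {new_total_ms} ms, saved: {saved_ms} ms")
```

Note that the second hunk has to change ssleep() to msleep() together with the new value; otherwise CONF_POST_OPEN 10 would be interpreted as 10 seconds.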
Re: New fast(?)-boot results on ARM
On Fri, Aug 14, 2009 at 10:43:05PM +0200, Robert Schwebel wrote:
On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9

Now that microcom is in Debian sid (thanks!), where can I find ptx_ts? It seems to be quite useful.

[ 0.874559] 0.003967 Hit any key to stop autoboot: 0

boot loader is not fast. considering its simple task, it can be made faster.

Yup, will check. Almost 1 s seems really long.

I'm working on a SoC with a 200MHz ARM926EJ-S. We managed to get to 1.5sec from power-on to starting init. The main difference to your platform seems to be that we use NOR flash. The kernel is not optimized, it still has some debug options turned on and is used during development. (However, the 1.5sec is with quiet.) The root fs is cramfs. The kernel version is 2.6.20.

For u-boot we enabled the D-cache, which gave a decent speed up (on ARM926EJ-S this requires one to set up page tables and enable the MMU, but it's not that difficult). I don't have the numbers here but I think it still takes ~300ms in u-boot, and ~1.2s for the kernel boot.

[ 1.326621] 0.452062 loaded zImage from /dev/nand0.kernel.bb with size 1679656
[ 2.009996] 0.683375 Uncompressing Linux... done, booting the kernel.
[ 2.416999] 0.407003 Linux version 2.6.31-rc4-g056f82f-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009

Other people already commented on this (kernel is too big)

Not sure (the kernel is already customized for the board), but I'll take a look again.

We are booting an uncompressed kernel (~2.8MB). Uncompressing (running the uncompressor XIP in NOR flash) took ~0.5s longer than copying 2.8MB from flash to RAM. BTW, we are using uImage and set verify=no in u-boot. We use u-boot-1.3.0.

[ 5.082616] 0.007992 RPC: Registered tcp transport module.
[ 5.605159] 0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.

What is happening here? Waiting for eth link negotiation?

[ 6.602621] 0.997462 IP-Config: Complete:
[ 6.606638] 0.004017 device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
[ 6.614588] 0.007950 host=192.168.23.197, domain=, nis-domain=(none),
[ 6.618652] 0.004064 bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=

Well, this ~1 second is not really kernel's fault, it's DHCP delay. But, do you need to do it at this moment? You do not seem to be using networking filesystems. You can run DHCP client in userspace.

The board has ip autoconfig configured in, because we also use tftp/nfs boot for development. But it had been disabled on the commandline: ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
That shouldn't do dhcp, right?

Try to boot with the eth cable unplugged, see if it hangs in IP-config. If it were doing static configuration it would be faster. However, unless you need ethernet to boot (NFS root) I'd suggest doing eth config in userspace.

[ 7.137924] 0.059316 starting udev
[ 7.147925] 0.010001 mounting tmpfs at /dev
[ 7.182299] 0.034374 creating static nodes
[ 7.410613] 0.228314 starting udevd...done
[ 8.811097] 1.400484 waiting for devices...done

And suddenly devtmpfs sounds like a good idea ;-)

We use static device nodes during boot, and later set up busybox mdev for hotplug.

Johannes
Re: New fast(?)-boot results on ARM
Robert Schwebel wrote:
- 2.4 s up from u-boot to the end of Uncompressing Linux
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until mounted root

So we basically have 7 s for the kernel. The rest is userspace, which hasn't seen much optimization yet, other than trying to start the GUI application as early as possible, while doing all other init stuff in parallel. Adding quiet brings us another 300 ms.

That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

2.4s in uncompression? That seems like an obvious target for improvement.

Your kernel seems awfully large. 3104K code? You should definitely find out what is making it that big and cut out everything you do not need. You might even try some of the embedded system scripts that rip out all the printk strings.

If you get the kernel size way down then use an uncompressed kernel and it should boot a lot faster if the bottleneck is CPU speed.

However, it is probably IO speed. There could be something really wrong and slow with your MTD. Does it DMA or is it doing something crazy like using the CPU to read a byte at a time?

Or maybe it's cheap and slow flash. In that case I think your only hope is to make all the code as small as possible and/or find a different flash filesystem that does not have to read so much of the device to mount.

Perhaps use a read-only compressed filesystem for the system binaries and reflash it for software upgrades. Only init and mount the writable flash for user-storable data well after system boot has finished.

--
Zan Lynx
zl...@acm.org

Knowledge is Power. Power Corrupts. Study Hard. Be Evil.
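The figures quoted at the top of this message can be put side by side with a short calculation. This is only a sketch of the arithmetic: the phase durations, the 7 s total and the 110 ms reference are the numbers from the thread, and the dictionary keys are labels invented here.

```python
# Phase durations from the quoted boot log summary, in milliseconds.
phases_ms = {
    "u-boot to end of 'Uncompressing Linux'": 2400,
    "until ubifs initialization starts": 300,
    "ubifs, until mounted root": 3700,
}

total_ms = sum(phases_ms.values())

# Reference points mentioned in the thread.
kernel_boot_ms = 7000   # "we basically have 7 s for the kernel"
reference_ms = 110      # the 110 ms boot time attributed to Tim

factor = kernel_boot_ms / reference_ms

for name, ms in phases_ms.items():
    print(f"{ms / total_ms:5.1%}  {name}")
print(f"sum of phases: {total_ms / 1000:.1f} s, factor vs. 110 ms: {factor:.0f}")
```

The three phases add up to 6.4 s, which the mail rounds to 7 s, and the ratio to 110 ms comes out at about 64, in line with the "factor 70" quoted. UBIFS mounting is the largest single share, which matches the later focus on NAND and UBI performance in this thread.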
Re: New fast(?)-boot results on ARM
Zan,

On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:

That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

2.4s in uncompression? That seems like an obvious target for improvement.

Indeed, we'll check that. However, I have the impression that most systems hyped as fast boot out there are optimized so aggressively that they are not really usable in real-life applications any more. So we try to configure the systems in a realistic way. I know that we won't get the last milliseconds that way, but I'd like to find out how far we can go.

Your kernel seems awfully large. 3104K code? You should definitely find out what is making it that big and cut out everything you do not need.

Definitely, will audit again.

You might even try some of the embedded system scripts that rip out all the printk strings.

Hmm, that's definitely in the last-minute-before-product category.

If you get the kernel size way down then use an uncompressed kernel and it should boot a lot faster if the bottleneck is CPU speed.

I'll try that.

However, it is probably IO speed. There could be something really wrong and slow with your MTD. Does it DMA or is it doing something crazy like using the CPU to read a byte at a time?

Will check.

Or maybe it's cheap and slow flash. In that case I think your only hope is to make all the code as small as possible and/or find a different flash filesystem that does not have to read so much of the device to mount. Perhaps use a read-only compressed filesystem for the system binaries and reflash it for software upgrades. Only init and mount the writable flash for user-storable data well after system boot has finished.

That would also be a last-minute change, but surely worth evaluating. We recently changed from jffs2 to ubifs and hoped to gain speed during that step.
Thanks for your feedback!

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-     |
Re: New fast(?)-boot results on ARM
On Fri, Aug 14, 2009 at 7:02 PM, Robert Schwebel r.schwe...@pengutronix.de wrote:

So we basically have 7 s for the kernel. The rest is userspace, which hasn't seen much optimization yet, other than trying to start the GUI application as early as possible, while doing all other init stuff in parallel. Adding quiet brings us another 300 ms. That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

Robert

r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
[ 2.395740] 2.395740
[ 2.395860] 0.000120
[ 0.000011] 0.000011 U-Boot 2.0.0-rc9 (Aug 5 2009 - 10:05:58)
[ 0.000059] 0.000048
[ 0.003823] 0.003764 Board: Phytec phyCORE-i.MX27
[ 0.010753] 0.006930 cfi_probe: cfi_flash base: 0xc000 size: 0x0200
[ 0.018711] 0.007958 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[ 0.026592] 0.007881 im...@imxfb0: i.MX Framebuffer driver
[ 0.178655] 0.152063 dev_protect: currently broken
[ 0.178736] 0.000081 Using environment in NOR Flash
[ 0.182577] 0.003841 initialising PLLs
[ 0.367142] 0.184565 Malloc space: 0xa3f0 - 0xa7f0 (size 64 MB)
[ 0.370568] 0.003426 Stack space : 0xa3ef8000 - 0xa3f0 (size 32 kB)
[ 0.445993] 0.075425 running /env/bin/init...
[ 0.870592] 0.424599
[ 0.874559] 0.003967 Hit any key to stop autoboot: 0

The boot loader is not fast. Considering its simple task, it can be made faster.

[ 1.326621] 0.452062 loaded zImage from /dev/nand0.kernel.bb with size 1679656
[ 2.009996] 0.683375 Uncompressing Linux... done, booting the kernel.
[ 2.416999] 0.407003 Linux version 2.6.31-rc4-g056f82f-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009

Other people already commented on this (kernel is too big)

[ 2.418729] 0.001730 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[ 2.423081] 0.004352 CPU: VIVT data cache, VIVT instruction cache
[ 2.426592] 0.003511 Machine: phyCORE-i.MX27
...
[ 2.742628] 0.016050 0x0036-0x0400 : root
[ 3.058610] 0.315982 UBI: attaching mtd7 to ubi0
[ 3.062878] 0.004268 UBI: physical eraseblock size: 16384 bytes (16 KiB)
[ 3.070601] 0.007723 UBI: logical eraseblock size: 15360 bytes
[ 3.070665] 0.000064 UBI: smallest flash I/O unit: 512
[ 3.078564] 0.007899 UBI: VID header offset: 512 (aligned 512)
[ 3.078609] 0.000045 UBI: data offset: 1024
[ 5.006609] 1.928000 UBI: attached mtd7 to ubi0
[ 5.013157] 0.006548 UBI: MTD device name: root

As others commented, ubi looks slow and you probably need to find out why.

[ 5.014566] 0.001409 UBI: MTD device size: 60 MiB
[ 5.018660] 0.004094 UBI: number of good PEBs: 3880
[ 5.022585] 0.003925 UBI: number of bad PEBs: 0
[ 5.026797] 0.004212 UBI: max. allowed volumes: 89
[ 5.026849] 0.000052 UBI: wear-leveling threshold: 4096
[ 5.030779] 0.003930 UBI: number of internal volumes: 1
[ 5.034583] 0.003804 UBI: number of user volumes: 1
[ 5.046572] 0.011989 UBI: available PEBs: 0
[ 5.046622] 0.000050 UBI: total number of reserved PEBs: 3880
[ 5.046657] 0.000035 UBI: number of PEBs reserved for bad PEB handling: 38
[ 5.050606] 0.003949 UBI: max/mean erase counter: 2/0
[ 5.050668] 0.000062 UBI: image sequence number: 0
[ 5.058619] 0.007951 UBI: background thread ubi_bgt0d started, PID 215
[ 5.062620] 0.004001 oprofile: using timer interrupt.
[ 5.070584] 0.007964 TCP cubic registered
[ 5.070637] 0.000053 NET: Registered protocol family 17
[ 5.074624] 0.003987 RPC: Registered udp transport module.
[ 5.082616] 0.007992 RPC: Registered tcp transport module.
[ 5.605159] 0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
[ 6.602621] 0.997462 IP-Config: Complete:
[ 6.606638] 0.004017 device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
[ 6.614588] 0.007950 host=192.168.23.197, domain=, nis-domain=(none),
[ 6.618652] 0.004064 bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=

Well, this ~1 second is not really the kernel's fault, it's DHCP delay. But do you need to do it at this moment? You do not seem to be using network filesystems. You can run a DHCP client in userspace.

[ 6.630579] 0.011927 UBIFS: recovery needed
[ 6.662655] 0.032076 UBIFS: recovery completed
[ 6.666587] 0.003932 UBIFS: mounted UBI device 0, volume 1, name root
[ 6.670570] 0.003983 UBIFS: file system size: 58490880 bytes (57120 KiB, 55 MiB, 3808 LEBs)
[ 6.678572]
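
Finding the expensive steps in a timestamped log like this one can be done mechanically by sorting on the delta column. The following is only a minimal Python sketch, assuming every useful line has the shape `[ absolute] delta message` seen in this thread; the names `parse_line`, `slowest_steps` and `SAMPLE` are made up for illustration and are not part of ptx_ts. The sample lines are copied from the log quoted in this message.

```python
import re

# Assumed line shape: "[ <absolute seconds>] <delta seconds> <message>".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(\d+\.\d+)\s*(.*)$")


def parse_line(line):
    """Return (absolute, delta, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2)), match.group(3)


def slowest_steps(lines, top=3):
    """Return the `top` parsed entries with the largest delta, largest first."""
    entries = [entry for entry in map(parse_line, lines) if entry is not None]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)[:top]


# A few lines taken from the boot log in this thread.
SAMPLE = [
    "[ 1.326621] 0.452062 loaded zImage from /dev/nand0.kernel.bb with size 1679656",
    "[ 2.009996] 0.683375 Uncompressing Linux... done, booting the kernel.",
    "[ 3.058610] 0.315982 UBI: attaching mtd7 to ubi0",
    "[ 5.006609] 1.928000 UBI: attached mtd7 to ubi0",
    "[ 6.602621] 0.997462 IP-Config: Complete:",
]

if __name__ == "__main__":
    for absolute, delta, message in slowest_steps(SAMPLE):
        print(f"{delta:9.6f} s  (at {absolute:9.6f})  {message}")
```

Note that the delta on a line is the time elapsed since the previous line was printed, so the delay belongs to whatever happened before the message, not to the message itself.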
Re: New fast(?)-boot results on ARM
2009/8/14 Robert Schwebel r.schwe...@pengutronix.de:

On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:

That's factor 70 away from the 110 ms boot time Tim has talked about some days ago (and he measured on an ARM cpu which had almost half the speed of this one), and I'm wondering what we can do to improve the boot time.

2.4s in uncompression? That seems like an obvious target for improvement.

Indeed, we'll check that.

We got rid of uncompression on a flash-based system, vastly improving boot time.

The reason is that compressed kernels are faster only when the throughput to the persistent storage is lower than the decompression throughput, and on typical embedded systems with DMA the throughput to memory outperforms the CPU-based decompression.

Of course it depends on a lot of things, such as the performance of the flash controller, kernel storage filesystem performance, DMA controller performance, cache architecture etc., so it has to be evaluated per system.

Linus Walleij
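
The trade-off between a compressed and an uncompressed kernel can be put into a very rough model. The Python sketch below is an illustration under stated assumptions, not a measurement: it assumes the image is first read completely and then decompressed (no overlap), that decompression throughput is counted in bytes of output, and that `ratio` is compressed size divided by uncompressed size. All function names and the numbers in the example are hypothetical. Under this model the compressed image wins only if the decompression throughput exceeds storage throughput / (1 - ratio).

```python
def load_time_uncompressed(image_bytes, storage_bps):
    """Seconds to read the raw image straight from storage."""
    return image_bytes / storage_bps


def load_time_compressed(image_bytes, ratio, storage_bps, decompress_bps):
    """Seconds to read the compressed image and then decompress it.

    ratio is compressed size / uncompressed size (0 < ratio < 1);
    decompress_bps counts bytes of decompressed output per second.
    """
    read_time = image_bytes * ratio / storage_bps
    decompress_time = image_bytes / decompress_bps
    return read_time + decompress_time


def compression_pays_off(image_bytes, ratio, storage_bps, decompress_bps):
    """True if the compressed image is ready sooner than the raw one."""
    compressed = load_time_compressed(image_bytes, ratio, storage_bps, decompress_bps)
    return compressed < load_time_uncompressed(image_bytes, storage_bps)


if __name__ == "__main__":
    image = 3_000_000  # hypothetical 3 MB uncompressed image
    ratio = 0.5        # hypothetical compression ratio
    decompress = 10_000_000  # hypothetical 10 MB/s decompression
    for storage in (2_000_000, 20_000_000):  # slow vs. fast storage
        verdict = compression_pays_off(image, ratio, storage, decompress)
        print(f"storage {storage / 1e6:.0f} MB/s: compression pays off = {verdict}")
```

With slow storage the saved read time dominates; with fast, DMA-driven storage the CPU-bound decompression becomes the bottleneck, which matches the experience described above.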