Re: New fast(?)-boot results on ARM

2009-09-09 Thread Denys Vlasenko
On Wednesday 09 September 2009 16:33, Johannes Stezenbach wrote:
 Sorry for slow reply.
 
 On Fri, Sep 04, 2009 at 06:16:26PM +0200, Wolfram Sang wrote:
  
   Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
   It seems to be quite useful.
  
  Back from the holidays, so here it is:
  
  http://pengutronix.de/software/ptx_ts/index_en.html
  
  Hope it can be useful...
 
 Yes, it is.  Thanks!
 
 BTW, some feedback about microcom:
 
 - the choice of ^\ as an escape character is unfortunate since that
   is usually mapped to send SIGQUIT in the tty; a better choice would
   be ^] (like telnet) or ^A (like minicom)
 - typing the escape character immediately causes the menu to be displayed,
   so one cannot send a break sequence for SysRq without cluttering up the
   screen
 
 Would you take patches for that?

Sure.
--
vda
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: New fast(?)-boot results on ARM

2009-08-20 Thread Sascha Hauer
On Wed, Aug 19, 2009 at 06:20:13PM +0200, Dirk Behme wrote:

 Yes, correct. The copying itself is between 'copy' and 'done' so it
 takes about 0.4s.

 What's the size of the uncompressed kernel copied here?

 The image is about 2.8MB, but I copied the whole partition of 3MB
 because with raw images you can't detect the image size.

 With 3MB copied in ~0.4s you get ~8MB/s. This really depends on your HW, 
 but I would think with standard NOR flashes you should be able to do at 
 least two (three?) times better. Have you already checked the memory (NOR 
 flash) timings configured in your SoC?

It's NAND flash, so there's not much timing to optimize. What's
interesting about this is that the kernel NAND driver is much slower
than the one in U-Boot. Looking at it, it turned out that the kernel
driver uses interrupts to wait for the controller to get ready.
Switching this to polling nearly doubles the NAND performance. UBI
mounts much faster now and this cuts off another few seconds from the
boot process :)
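
For reference, the copy rate discussed above can be checked with a few lines
of Python. This is only a sketch of the arithmetic: the 3 MB size and the
~0.4 s figure come from this thread, the 0.367965 s value is the 'copy' to
'done' delta from the boot log posted earlier in the thread, and the function
name is made up for illustration.

```python
def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Average copy rate in MB/s for a transfer of size_mb taking seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb / seconds


# 3 MB partition copied; ~0.4 s as stated, 0.367965 s as measured in the log.
rounded = throughput_mb_per_s(3.0, 0.4)
measured = throughput_mb_per_s(3.0, 0.367965)
print(f"{rounded:.1f} MB/s (rounded), {measured:.1f} MB/s (from log delta)")
```

Both numbers land close to the ~8 MB/s mentioned in the question.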

Sascha


-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-     |


Re: New fast(?)-boot results on ARM

2009-08-19 Thread Sascha Hauer
On Tue, Aug 18, 2009 at 05:31:42PM +0200, Dirk Behme wrote:
 Sascha Hauer wrote:
 On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
 Hi,

 On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
 On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
 That's bad :-) So there is no room for improvement any more in our
 ARM boot sequences ...
 on x86 we're doing pretty well ;-)
 On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
 power-on through the kernel up to starting init. This is with

 - no delay in u-boot-v2
 - rootfs on NAND (UBIFS)
 - quiet
 - precalculated loops-per-jiffy
 - zImage kernel instead of uImage
 Here's a little video of our demo system booting:
 http://www.youtube.com/watch?v=xDbUnNsj0cI

 As you can see there, it needs about 15 s from the release of the reset button
 up to the moment where the application shows its Qt 4.5.2 based GUI (which is
 when we fade over from the initial framebuffer to the final one, in order to
 hide the qt application startup noise).

 And below is the boot log (after turning quiet off again). The numbers are
 the timestamp and the delta to the last timestamp, measured on the controlling
 PC by looking at the serial console output. The ptx_ts script starts when the
 regexp was found, so the numbers start basically in the moment when u-boot-v2
 has initialized the system up to the point where we can see something.

 Result:

 - 2.4 s up from u-boot to the end of Uncompressing Linux
 - 300 ms until ubifs initialization starts
 - 3.7 s for ubifs, until mounted root

 So we basically have 7 s for the kernel. The rest is userspace, which hasn't
 seen much optimization yet, other than trying to start the GUI application as
 early as possible, while doing all other init stuff in parallel. Adding quiet
 brings us another 300 ms.

 That's factor 70 away from the 110 ms boot time Tim has talked about some days
 ago (and he measured on an ARM cpu which had almost half the speed of this
 one), and I'm wondering what we can do to improve the boot time.

 Robert

 r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
 [ 13.522625]   0.043189
 [ 13.546627]   0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
 [ 13.558613]   0.011986
 [ 13.690643]   0.132030_ ___    _
 [ 13.690731]   0.000088  _ __ | |__  _   _ / ___/ _ \|  _ \| |
 [ 13.698595]   0.007864 | '_ \| '_ \| | | | |  | | | | |_) |  _|
 [ 13.698654]   0.000059 | |_) | | | | |_| | |__| |_| |  _ | |___
 [ 13.702581]   0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
 [ 13.706573]   0.003992 |_|  |___/
 [ 13.706622]   0.000049
 [ 13.725043]   0.018421
 [ 14.742608]   1.017565

 I made some changes suggested in this thread:

 - enable MMU in the bootloader
 - use assembler optimized memcpy/memset in the bootloader
 - start an uncompressed image
 - disable IP autoconfiguration in the Kernel
 - use lpj= command line parameter
 - use static device nodes instead of udev
 - skip some init scripts
 - made the kernel smaller (I do not have both configs handy, so I do not
   know what exactly I changed)

 Already looks much better:

 [  0.000005]   0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
 [  0.000026]   0.000021
 [  0.000041]   0.000015 Board: Phytec phyCORE-i.MX27
 [  0.000054]   0.000013 cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
 [  0.000067]   0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
 [  0.000080]   0.000013 im...@imxfb0: i.MX Framebuffer driver
 [  0.000092]   0.000012 dma_alloc: 0xa6f56e40 0x1000
 [  0.000105]   0.000013 dma_alloc: 0xa6f57088 0x1000
 [  0.000118]   0.000013 dev_protect: currently broken
 [  0.000129]   0.000011 Using environment in NOR Flash
 [  0.000141]   0.000012 initialising PLLs
 [  0.128972]   0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
 [  0.128995]   0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
 [  0.129008]   0.000013 running /env/bin/init...
 [  0.224963]   0.095955
 [  0.224984]   0.000021 Hit any key to stop autoboot:  0
 [  0.224999]   0.000015 copy
 [  0.592964]   0.367965 done
 [  0.652010]   0.059046 Linux version 2.6.31-rc4-00004-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009

 So, this is ~0.6 s in boot loader and kernel copy until the kernel starts,
 correct?

Yes, correct. The copying itself is between 'copy' and 'done' so it
takes about 0.4s.


 What's the size of the uncompressed kernel copied here?

The image is about 2.8MB, but I copied the whole partition of 3MB
because with raw images you can't detect the image size.


 Btw.: I tried to summarize some hints given in this thread in

 http://elinux.org/Boot_Time#Boot_time_check_list

Nice work!

Regards
  Sascha


Re: New fast(?)-boot results on ARM

2009-08-18 Thread Robert Schwebel
Marco,

On Tue, Aug 18, 2009 at 12:06:48PM +0200, Marco Stornelli wrote:
 Yeah, I agree, do you really need udevd, device file creation at every
 start-up in /dev? Usually static device nodes and mdev for hotplug are
 enough, or at least you could use a simple script to create the device
 files only once (mdev -s). About the fs, do you really need a
 rootfs with ubifs? I mean, you could split your fs. You could use a
 read-only fs (SquashFS for example) for your root-fs, ubifs for
 permanent storage data (mounted under /data for example) and a ram fs
 for volatile data.

Well, we try to find out what is possible with a fast booting Linux
system which *still* is as vanilla as possible.

All the boot-in-one-second systems out there are highly squeezed,
which is surely good if you have a scenario with high production
volumes. You can do the optimization in the last steps then and it
doesn't really matter how much time you spend with testing to come from
a system that works for a developer to a production system.

For most of our use cases here at Pengutronix, we see that:

- Customers want in-system upgradability on a per-package basis; so the
  flash filesystems should be normally r/o, but may be remounted r/w.

- Development systems should be close to production systems, in order to
  be able to have more early testing; so things like printk-ripout or
  special non-mainline patches/tweaks should be avoided as far as
  possible.

- In general we want to have our systems close to what the mainline
  does; Automation & Embedded is only a small market, and anything
  which is *not* specific to these markets but mainline is good.

So let's see what we'll reach while trying what people have suggested.

Thanks,
rsc


Re: New fast(?)-boot results on ARM

2009-08-18 Thread Alex Riesen
On Tue, Aug 18, 2009 at 12:21, Robert Schwebel <r.schwe...@pengutronix.de> wrote:
 - In general we want to have our systems close to what the mainline
  does; Automation & Embedded is only a small market, and anything
  which is *not* specific to these markets but mainline is good.

BTW, what is your mainline (or it looks like you mean upstream)?


Re: New fast(?)-boot results on ARM

2009-08-18 Thread Robert Schwebel
On Tue, Aug 18, 2009 at 12:34:52PM +0200, Alex Riesen wrote:
 On Tue, Aug 18, 2009 at 12:21, Robert Schwebel <r.schwe...@pengutronix.de> wrote:
  - In general we want to have our systems close to what the mainline
   does; Automation & Embedded is only a small market, and anything
   which is *not* specific to these markets but mainline is good.

 BTW, what is your mainline (or it looks like you mean upstream)?

unpatched kernel.org

rsc


Re: New fast(?)-boot results on ARM

2009-08-18 Thread Robert Schwebel
On Tue, Aug 18, 2009 at 12:48:50PM +0200, Alex Riesen wrote:
 But many of the problems you described and suggested solutions
 point at userspace, right? (like pre-defined static /dev, mdev script,
 or using of specially designed rootfs)

Yes, right. But even there, mdev is more in the embedded special
league than udev, for example, and highly specialized read-only root
filesystems as well.

rsc 


Re: New fast(?)-boot results on ARM

2009-08-18 Thread Sascha Hauer
On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
 Hi,
 
 On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
  On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
That's bad :-) So there is no room for improvement any more in our
ARM boot sequences ...
  
   on x86 we're doing pretty well ;-)
 
  On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
  power-on through the kernel up to starting init. This is with
 
  - no delay in u-boot-v2
  - rootfs on NAND (UBIFS)
  - quiet
  - precalculated loops-per-jiffy
  - zImage kernel instead of uImage
 
 Here's a little video of our demo system booting:
 http://www.youtube.com/watch?v=xDbUnNsj0cI
 
 As you can see there, it needs about 15 s from the release of the reset button
 up to the moment where the application shows its Qt 4.5.2 based GUI (which is
 when we fade over from the initial framebuffer to the final one, in order to
 hide the qt application startup noise).
 
 And below is the boot log (after turning quiet off again). The numbers are
 the timestamp and the delta to the last timestamp, measured on the controlling
 PC by looking at the serial console output. The ptx_ts script starts when the
 regexp was found, so the numbers start basically in the moment when u-boot-v2
 has initialized the system up to the point where we can see something.
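
 The timestamping described above (start on a regexp match, then print the
 elapsed time and the delta to the previous line) can be approximated with a
 small filter. This is not the ptx_ts source, only a sketch of the idea in
 Python; the function name, the output format and the injectable clock are
 assumptions made for illustration.

```python
import re
import sys
import time
from typing import Callable, Iterable, Iterator


def stamp_lines(lines: Iterable[str], start_regex: str,
                clock: Callable[[], float] = time.monotonic) -> Iterator[str]:
    """Prefix each line with '[elapsed] delta' once start_regex has matched."""
    pattern = re.compile(start_regex)
    start = last = None
    for line in lines:
        now = clock()
        if start is None:
            if not pattern.search(line):
                continue            # stay silent until the trigger line shows up
            start = last = now      # the trigger line becomes time zero
        yield f"[{now - start:10.6f}] {now - last:10.6f} {line.rstrip()}"
        last = now


def main() -> None:
    # Hypothetical usage: some-serial-tool | python stamp.py "U-Boot 2.0.0"
    trigger = sys.argv[1] if len(sys.argv) > 1 else "U-Boot"
    for stamped in stamp_lines(sys.stdin, trigger):
        print(stamped, flush=True)
```

 Because the time is taken on the host when a complete line arrives, the
 deltas include serial transfer time for that line, which matters at 115200
 baud when lines are long.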
 
 Result:
 
 - 2.4 s up from u-boot to the end of Uncompressing Linux
 - 300 ms until ubifs initialization starts
 - 3.7 s for ubifs, until mounted root
 
 So we basically have 7 s for the kernel. The rest is userspace, which hasn't
 seen much optimization yet, other than trying to start the GUI application as
 early as possible, while doing all other init stuff in parallel. Adding quiet
 brings us another 300 ms.
 
 That's factor 70 away from the 110 ms boot time Tim has talked about some days
 ago (and he measured on an ARM cpu which had almost half the speed of this
 one), and I'm wondering what we can do to improve the boot time.
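
 As a quick sanity check of the numbers above, the three phases can be summed
 and compared against the 110 ms reference. The phase durations are the ones
 listed under "Result"; the dictionary keys are only labels chosen for this
 sketch.

```python
# Phase durations in milliseconds, taken from the "Result" list above.
KERNEL_PHASES_MS = {
    "u-boot to end of 'Uncompressing Linux'": 2400,
    "until ubifs initialization starts": 300,
    "ubifs until root is mounted": 3700,
}

total_ms = sum(KERNEL_PHASES_MS.values())
reference_ms = 110                 # boot time quoted for the reference system
ratio_for_7s = 7000 / reference_ms # using the rounded "7 s" figure

print(f"sum of phases: {total_ms / 1000:.1f} s")
print(f"7 s is about {ratio_for_7s:.0f}x the {reference_ms} ms reference")
```

 The listed phases add up to 6.4 s, so "7 s" is a rounded figure, and the
 ratio to 110 ms comes out in the 60 to 70 range depending on which total is
 used.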
 
 Robert
 
 r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
 [ 13.522625]   0.043189
 [ 13.546627]   0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
 [ 13.558613]   0.011986
 [ 13.690643]   0.132030_ ___    _
 [ 13.690731]   0.000088  _ __ | |__  _   _ / ___/ _ \|  _ \| |
 [ 13.698595]   0.007864 | '_ \| '_ \| | | | |  | | | | |_) |  _|
 [ 13.698654]   0.000059 | |_) | | | | |_| | |__| |_| |  _ | |___
 [ 13.702581]   0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
 [ 13.706573]   0.003992 |_|  |___/
 [ 13.706622]   0.000049
 [ 13.725043]   0.018421
 [ 14.742608]   1.017565

I made some changes suggested in this thread:

- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not
  know what exactly I changed)
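
On the lpj= item: the value can be read from the calibration line of a normal
boot and then passed on the command line so the calibration loop is skipped.
The sketch below (Python) extracts it and also shows the BogoMIPS relation the
kernel prints, lpj / (500000 / HZ). The sample line is illustrative, not
copied from this board's log, and HZ=100 is an assumption about the kernel
configuration.

```python
import re


def bogomips_from_lpj(lpj: int, hz: int) -> float:
    """BogoMIPS as the kernel reports it: lpj / (500000 / HZ)."""
    return lpj * hz / 500_000


def lpj_param_from_dmesg(text: str) -> str:
    """Build an 'lpj=<n>' command line fragment from a calibration message."""
    match = re.search(r"\(lpj=(\d+)\)", text)
    if match is None:
        raise ValueError("no lpj value found in the given text")
    return f"lpj={match.group(1)}"


# Illustrative line only; the lpj value matches the command line in this thread.
ILLUSTRATIVE_LINE = "Calibrating delay loop... 199.06 BogoMIPS (lpj=995328)"

param = lpj_param_from_dmesg(ILLUSTRATIVE_LINE)
bogomips = bogomips_from_lpj(995328, hz=100)   # HZ=100 is assumed
print(param, f"{bogomips:.2f} BogoMIPS")
```

The value is only valid for the same CPU clock and the same HZ setting, so it
has to be re-measured when either changes.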

Already looks much better:

[  0.000005]   0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[  0.000026]   0.000021
[  0.000041]   0.000015 Board: Phytec phyCORE-i.MX27
[  0.000054]   0.000013 cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
[  0.000067]   0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  0.000080]   0.000013 im...@imxfb0: i.MX Framebuffer driver
[  0.000092]   0.000012 dma_alloc: 0xa6f56e40 0x1000
[  0.000105]   0.000013 dma_alloc: 0xa6f57088 0x1000
[  0.000118]   0.000013 dev_protect: currently broken
[  0.000129]   0.000011 Using environment in NOR Flash
[  0.000141]   0.000012 initialising PLLs
[  0.128972]   0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
[  0.128995]   0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
[  0.129008]   0.000013 running /env/bin/init...
[  0.224963]   0.095955
[  0.224984]   0.000021 Hit any key to stop autoboot:  0
[  0.224999]   0.000015 copy
[  0.592964]   0.367965 done
[  0.652010]   0.059046 Linux version 2.6.31-rc4-00004-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
[  0.652030]   0.000020 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[  0.652044]   0.000014 CPU: VIVT data cache, VIVT instruction cache
[  0.652057]   0.000013 Machine: phyCORE-i.MX27
[  0.652069]   0.000012 Memory policy: ECC disabled, Data cache writeback
[  0.652082]   0.000013 Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
[  0.706012]   0.053930 Kernel command line: console=ttymxc0,115200 
earlyprintk lpj=995328 mt9v022.sensor_type=color 
ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0::: ubi.mtd=7 
root=ubi0:root rootfstype=ubifs 

Re: New fast(?)-boot results on ARM

2009-08-18 Thread Dirk Behme

Sascha Hauer wrote:

On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:

Hi,

On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:

On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:

That's bad :-) So there is no room for improvement any more in our
ARM boot sequences ...

on x86 we're doing pretty well ;-)

On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
power-on through the kernel up to starting init. This is with

- no delay in u-boot-v2
- rootfs on NAND (UBIFS)
- quiet
- precalculated loops-per-jiffy
- zImage kernel instead of uImage

Here's a little video of our demo system booting:
http://www.youtube.com/watch?v=xDbUnNsj0cI

As you can see there, it needs about 15 s from the release of the reset button
up to the moment where the application shows its Qt 4.5.2 based GUI (which is
when we fade over from the initial framebuffer to the final one, in order to
hide the qt application startup noise).

And below is the boot log (after turning quiet off again). The numbers are
the timestamp and the delta to the last timestamp, measured on the controlling
PC by looking at the serial console output. The ptx_ts script starts when the
regexp was found, so the numbers start basically in the moment when u-boot-v2
has initialized the system up to the point where we can see something.

Result:

- 2.4 s up from u-boot to the end of Uncompressing Linux
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until mounted root

So we basically have 7 s for the kernel. The rest is userspace, which hasn't
seen much optimization yet, other than trying to start the GUI application as
early as possible, while doing all other init stuff in parallel. Adding quiet
brings us another 300 ms.

That's factor 70 away from the 110 ms boot time Tim has talked about some days
ago (and he measured on an ARM cpu which had almost half the speed of this
one), and I'm wondering what we can do to improve the boot time.

Robert

r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
[ 13.522625]   0.043189
[ 13.546627]   0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
[ 13.558613]   0.011986
[ 13.690643]   0.132030_ ___    _
[ 13.690731]   0.000088  _ __ | |__  _   _ / ___/ _ \|  _ \| |
[ 13.698595]   0.007864 | '_ \| '_ \| | | | |  | | | | |_) |  _|
[ 13.698654]   0.000059 | |_) | | | | |_| | |__| |_| |  _ | |___
[ 13.702581]   0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
[ 13.706573]   0.003992 |_|  |___/
[ 13.706622]   0.000049
[ 13.725043]   0.018421
[ 14.742608]   1.017565


I made some changes suggested in this thread:

- enable MMU in the bootloader
- use assembler optimized memcpy/memset in the bootloader
- start an uncompressed image
- disable IP autoconfiguration in the Kernel
- use lpj= command line parameter
- use static device nodes instead of udev
- skip some init scripts
- made the kernel smaller (I do not have both configs handy, so I do not
  know what exactly I changed)

Already looks much better:

[  0.000005]   0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
[  0.000026]   0.000021
[  0.000041]   0.000015 Board: Phytec phyCORE-i.MX27
[  0.000054]   0.000013 cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
[  0.000067]   0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
[  0.000080]   0.000013 im...@imxfb0: i.MX Framebuffer driver
[  0.000092]   0.000012 dma_alloc: 0xa6f56e40 0x1000
[  0.000105]   0.000013 dma_alloc: 0xa6f57088 0x1000
[  0.000118]   0.000013 dev_protect: currently broken
[  0.000129]   0.000011 Using environment in NOR Flash
[  0.000141]   0.000012 initialising PLLs
[  0.128972]   0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
[  0.128995]   0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
[  0.129008]   0.000013 running /env/bin/init...
[  0.224963]   0.095955
[  0.224984]   0.000021 Hit any key to stop autoboot:  0
[  0.224999]   0.000015 copy
[  0.592964]   0.367965 done
[  0.652010]   0.059046 Linux version 2.6.31-rc4-00004-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009


So, this is ~0.6 s in boot loader and kernel copy until the kernel
starts, correct?


What's the size of the uncompressed kernel copied here?

Best regards

Dirk

Btw.: I tried to summarize some hints given in this thread in

http://elinux.org/Boot_Time#Boot_time_check_list

Please feel free to add and correct stuff!


[  0.652030]   0.000020 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), cr=00053177
[  0.652044]   0.000014 CPU: VIVT data cache, VIVT instruction cache
[  0.652057]   0.000013 Machine: phyCORE-i.MX27
[  0.652069]   0.000012 Memory policy: ECC disabled, Data cache writeback
[  0.652082]   0.000013 Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 32512
[  0.706012]   0.053930 

Re: New fast(?)-boot results on ARM

2009-08-18 Thread Marco Stornelli
Dirk Behme wrote:
 Sascha Hauer wrote:
 On Fri, Aug 14, 2009 at 07:02:28PM +0200, Robert Schwebel wrote:
 Hi,

 On Thu, Aug 13, 2009 at 05:33:26PM +0200, Robert Schwebel wrote:
 On Thu, Aug 13, 2009 at 08:28:26AM -0700, Arjan van de Ven wrote:
 That's bad :-) So there is no room for improvement any more in our
 ARM boot sequences ...
 on x86 we're doing pretty well ;-)
 On i.MX27 (400 MHz ARM926EJ-S) we currently need 7 s, measured from
 power-on through the kernel up to starting init. This is with

 - no delay in u-boot-v2
 - rootfs on NAND (UBIFS)
 - quiet
 - precalculated loops-per-jiffy
 - zImage kernel instead of uImage
 Here's a little video of our demo system booting:
 http://www.youtube.com/watch?v=xDbUnNsj0cI

 As you can see there, it needs about 15 s from the release of the reset button
 up to the moment where the application shows its Qt 4.5.2 based GUI (which is
 when we fade over from the initial framebuffer to the final one, in order to
 hide the qt application startup noise).

 And below is the boot log (after turning quiet off again). The numbers are
 the timestamp and the delta to the last timestamp, measured on the controlling
 PC by looking at the serial console output. The ptx_ts script starts when the
 regexp was found, so the numbers start basically in the moment when u-boot-v2
 has initialized the system up to the point where we can see something.

 Result:

 - 2.4 s up from u-boot to the end of Uncompressing Linux
 - 300 ms until ubifs initialization starts
 - 3.7 s for ubifs, until mounted root

 So we basically have 7 s for the kernel. The rest is userspace, which hasn't
 seen much optimization yet, other than trying to start the GUI application as
 early as possible, while doing all other init stuff in parallel. Adding quiet
 brings us another 300 ms.

 That's factor 70 away from the 110 ms boot time Tim has talked about some days
 ago (and he measured on an ARM cpu which had almost half the speed of this
 one), and I'm wondering what we can do to improve the boot time.

 Robert

 r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
 [ 13.522625]   0.043189
 [ 13.546627]   0.024002 OSELAS(R)-phyCORE-trunk (PTXdist-1.99.svn/2009-08-06T08:37:25+0200)
 [ 13.558613]   0.011986
 [ 13.690643]   0.132030_ ___    _
 [ 13.690731]   0.000088  _ __ | |__  _   _ / ___/ _ \|  _ \| |
 [ 13.698595]   0.007864 | '_ \| '_ \| | | | |  | | | | |_) |  _|
 [ 13.698654]   0.000059 | |_) | | | | |_| | |__| |_| |  _ | |___
 [ 13.702581]   0.003927 | .__/|_| |_|\__, |\\___/|_| \_\_|
 [ 13.706573]   0.003992 |_|  |___/
 [ 13.706622]   0.000049
 [ 13.725043]   0.018421
 [ 14.742608]   1.017565

 I made some changes suggested in this thread:

 - enable MMU in the bootloader
 - use assembler optimized memcpy/memset in the bootloader
 - start an uncompressed image
 - disable IP autoconfiguration in the Kernel
 - use lpj= command line parameter
 - use static device nodes instead of udev
 - skip some init scripts
 - made the kernel smaller (I do not have both configs handy, so I do not
   know what exactly I changed)

 Already looks much better:

 [  0.000005]   0.000005 U-Boot 2.0.0-rc10-00241-g3f10fe9-dirty (Aug 18 2009 - 13:29:25)
 [  0.000026]   0.000021
 [  0.000041]   0.000015 Board: Phytec phyCORE-i.MX27
 [  0.000054]   0.000013 cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
 [  0.000067]   0.000013 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
 [  0.000080]   0.000013 im...@imxfb0: i.MX Framebuffer driver
 [  0.000092]   0.000012 dma_alloc: 0xa6f56e40 0x1000
 [  0.000105]   0.000013 dma_alloc: 0xa6f57088 0x1000
 [  0.000118]   0.000013 dev_protect: currently broken
 [  0.000129]   0.000011 Using environment in NOR Flash
 [  0.000141]   0.000012 initialising PLLs
 [  0.128972]   0.128831 Malloc space: 0xa6f00000 - 0xa7f00000 (size 16 MB)
 [  0.128995]   0.000023 Stack space : 0xa6ef8000 - 0xa6f00000 (size 32 kB)
 [  0.129008]   0.000013 running /env/bin/init...
 [  0.224963]   0.095955
 [  0.224984]   0.000021 Hit any key to stop autoboot:  0
 [  0.224999]   0.000015 copy
 [  0.592964]   0.367965 done
 [  0.652010]   0.059046 Linux version 2.6.31-rc4-00004-g05786f8-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #206 PREEMPT Tue Aug 18 14:08:51 CEST 2009
 
 So, this is ~0.6 s in boot loader and kernel copy until the kernel starts,
 correct?
 
 What's the size of the uncompressed kernel copied here?
 
 Best regards
 
 Dirk
 
 Btw.: I tried to summarize some hints given in this thread in
 
 http://elinux.org/Boot_Time#Boot_time_check_list
 
 Please feel free to add and correct stuff!
 

It's good documentation, good work. From 14s to 5s I think it's a very
good result. In reference to the previous response of Robert, I think
that it's a good thing to use a vanilla kernel and avoid strange and
specific or not mature solutions, but it needs to use the right tool
for

Re: New fast(?)-boot results on ARM

2009-08-17 Thread Tim Bird
Robert Schwebel wrote:
 On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
 [  5.082616]   0.007992 RPC: Registered tcp transport module.
 [  5.605159]   0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
 [  6.602621]   0.997462 IP-Config: Complete:
 [  6.606638]   0.004017      device=eth0, addr=192.168.23.197, mask=255.255.0.0, gw=192.168.23.2,
 [  6.614588]   0.007950      host=192.168.23.197, domain=, nis-domain=(none),
 [  6.618652]   0.004064      bootserver=192.168.23.2, rootserver=192.168.23.2, rootpath=
 Well, this ~1 second is not really kernel's fault, it's DHCP delay.
 But, do you need to do it at this moment?
 You do not seem to be using networking filesystems.
 You can run DHCP client in userspace.
 
 The board has ip autoconfig configured in, because we also use tftp/nfs
 boot for development. But it had been disabled on the commandline:
 
 ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
 
 That shouldn't do dhcp, right?

I think it doesn't, but I'm not positive.  The DHCP transmissions
themselves don't take very long.  There are some very long timeouts
in the network code paths, which appear to be used whether you specify
a static address or not.

See the definitions of CONF_PRE_OPEN and CONF_POST_OPEN
in net/ipv4/ipconfig.c

They are set to ridiculously long values.  In my experience,
you can cut them down considerably with no dangerous side
effects (but I haven't asked the network guys about the
possible downsides).

Here's a patch which I've used in the past.  (Sorry
if it doesn't apply cleanly, I just extracted it from
a PDF and the whitespace may have gotten messed up.
It's short enough that you can hand-edit the files if
there's a problem.)

I'd like to hear back, if you apply this, whether it shortens
the network startup time for you.

diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
index 42065ff..e42d83f 100644
--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -86,8 +86,10 @@
 #endif

 /* Define the friendly delay before and after opening net devices */
-#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
-#define CONF_POST_OPEN 1 /* After opening: 1 second */
+/*#define CONF_PRE_OPEN 500 /* Before opening: 1/2 second */
+/*#define CONF_POST_OPEN 1 /* After opening: 1 second */
+#define CONF_PRE_OPEN 5 /* Before opening: 5 milli seconds */
+#define CONF_POST_OPEN 10 /* After opening: 10 milli seconds */

 /* Define the timeout for waiting for a DHCP/BOOTP/RARP reply */
 #define CONF_OPEN_RETRIES 2 /* (Re)open devices twice */
@@ -1292,7 +1294,7 @@ static int __init ip_auto_config(void)
return -1;
/* Give drivers a chance to settle */
-   ssleep(CONF_POST_OPEN);
+   msleep(CONF_POST_OPEN);

/*
 * If the config information is insufficient (e.g., our IP address or
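
To put a number on what the patch above changes: the old code waits 500 ms
before opening the devices and 1 s (via ssleep) afterwards, the patched code
waits 5 ms and 10 ms (via msleep). The short Python sketch below just adds
these up. It assumes each delay is hit exactly once on the boot path, and the
dictionary keys are labels chosen for illustration.

```python
# Delay values in milliseconds, read off the patch above.
OLD_DELAYS_MS = {
    "CONF_PRE_OPEN": 500,        # msleep(500)
    "CONF_POST_OPEN": 1 * 1000,  # ssleep(1) sleeps for whole seconds
}
NEW_DELAYS_MS = {
    "CONF_PRE_OPEN": 5,          # msleep(5)
    "CONF_POST_OPEN": 10,        # msleep(10) after the ssleep -> msleep change
}

old_total_ms = sum(OLD_DELAYS_MS.values())
new_total_ms = sum(NEW_DELAYS_MS.values())
saved_ms = old_total_ms - new_total_ms

print(f"fixed delays: {old_total_ms} ms -> {new_total_ms} ms, saving {saved_ms} ms")
```

Note that the unit changes along with the value for CONF_POST_OPEN, which is
why the ssleep() call has to become msleep() in the same patch.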

=
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=



Re: New fast(?)-boot results on ARM

2009-08-15 Thread Johannes Stezenbach
On Fri, Aug 14, 2009 at 10:43:05PM +0200, Robert Schwebel wrote:
 On Fri, Aug 14, 2009 at 10:04:57PM +0200, Denys Vlasenko wrote:
   r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9

Now that microcom is in Debian sid (thanks!), where can I find ptx_ts?
It seems to be quite useful.


   [  0.874559]   0.003967 Hit any key to stop autoboot:  0
 
  boot loader is not fast. considering its simple task, it can be made
  faster.
 
 Yup, will check. Almost 1 s seems really long.


I'm working on a SoC with a 200MHz ARM926EJ-S.  We managed to get
to 1.5sec from power-on to starting init. The main difference to
your platform seems to be that we use NOR flash.  The kernel is
not optimized, it still has some debug options turned on and
is used during development. (however, the 1.5sec is with quiet)
The root fs is cramfs. The kernel version is 2.6.20.

For u-boot we enabled the D-cache which gave a decent speed up
(on ARM926EJ-S this requires one to set up page tables and enable
MMU, but it's not that difficult). I don't have the numbers here
but I think it still takes ~300ms in u-boot, and ~1.2s for the kernel boot.


   [  1.326621]   0.452062 loaded zImage from /dev/nand0.kernel.bb with size 1679656
   [  2.009996]   0.683375 Uncompressing Linux... done, booting the kernel.
   [  2.416999]   0.407003 Linux version 2.6.31-rc4-g056f82f-dirty (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu Aug 6 08:37:19 CEST 2009
  
  Other people already commented on this (kernel is too big)
 
 Not sure (the kernel is already customized for the board), but I'll take
 a look again.

We are booting an uncompressed kernel (~2.8MB).  Uncompressing (running the
uncompressor XIP in NOR flash) took ~0.5s longer than copying 2.8MB from flash to RAM.
BTW, we are using uImage and set verify=no in u-boot. We use u-boot-1.3.0.


   [  5.082616]   0.007992 RPC: Registered tcp transport module.
   [  5.605159]   0.522543 eth0: config: auto-negotiation on, 100FDX, 
   100HDX, 10FDX, 10HDX.

What is happening here? Waiting for eth link negotiation?

   [  6.602621]   0.997462 IP-Config: Complete:
   [  6.606638]   0.004017      device=eth0, addr=192.168.23.197, 
   mask=255.255.0.0, gw=192.168.23.2,
   [  6.614588]   0.007950      host=192.168.23.197, domain=, 
   nis-domain=(none),
   [  6.618652]   0.004064      bootserver=192.168.23.2, 
   rootserver=192.168.23.2, rootpath=
  
  Well, this ~1 second is not really kernel's fault, it's DHCP delay.
  But, do you need to do it at this moment?
  You do not seem to be using networking filesystems.
  You can run DHCP client in userspace.
 
 The board has ip autoconfig configured in, because we also use tftp/nfs
 boot for development. But it had been disabled on the commandline:
 
 ip=192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::
 
 That shouldn't do dhcp, right?

Try to boot with eth cable unplugged, see if it hangs in IP-config.
If it were doing static configuration it would be faster.

However, unless you need ethernet to boot (NFS root) I'd suggest
doing eth config in userspace.
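
To see which fields the ip= string above actually fills in, here is a
small Python sketch that splits it. The field order
(client-ip:server-ip:gw-ip:netmask:hostname:device:autoconf) is taken from
the kernel's nfsroot documentation as I remember it, so treat the names as
an assumption; parse_ip_param is a made-up helper, not a kernel interface.

```python
IP_FIELDS = ("client_ip", "server_ip", "gateway_ip", "netmask",
             "hostname", "device", "autoconf")


def parse_ip_param(value):
    """Split the value of an ip= kernel parameter into named fields.

    Empty or missing fields are returned as None.
    """
    parts = value.split(":")
    # Pad short strings so every field name gets a value.
    parts += [""] * (len(IP_FIELDS) - len(parts))
    return {name: (part or None) for name, part in zip(IP_FIELDS, parts)}


fields = parse_ip_param(
    "192.168.23.197:192.168.23.2:192.168.23.2:255.255.0.0:::")
for name in IP_FIELDS:
    print("%-10s %s" % (name, fields[name] or "(empty)"))
```

The four addresses match the IP-Config lines in the log (addr, bootserver,
gw, mask), while hostname, device and autoconf are left empty. The sketch
only shows what is on the command line; what the kernel does with an empty
autoconf field still has to be checked on the board.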


   [  7.137924]   0.059316 starting udev
   [  7.147925]   0.010001 mounting tmpfs at /dev
   [  7.182299]   0.034374 creating static nodes
   [  7.410613]   0.228314 starting udevd...done
   [  8.811097]   1.400484 waiting for devices...done

And suddenly devtmpfs sounds like a good idea ;-)

We use static device nodes during boot, and later
set up busybox mdev for hotplug.


Johannes


Re: New fast(?)-boot results on ARM

2009-08-14 Thread Zan Lynx

Robert Schwebel wrote:


- 2.4 s up from u-boot to the end of Uncompressing Linux
- 300 ms until ubifs initialization starts
- 3.7 s for ubifs, until mounted root

So we basically have 7 s for the kernel. The rest is userspace, which hasn't
seen much optimization yet, other than trying to start the GUI application as
early as possible, while doing all other init stuff in parallel. Adding quiet
brings us another 300 ms.

That's factor 70 away from the 110 ms boot time Tim has talked about some days
ago (and he measured on an ARM cpu which had almost half the speed of this
one), and I'm wondering what we can do to improve the boot time.


2.4s in uncompression? That seems like an obvious target for improvement.
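
To put the three figures quoted above side by side, here is a quick
back-of-the-envelope calculation in Python. The phase labels are shortened
from the quoted list; nothing here is measured anew.

```python
# Kernel-side phases as quoted above, in seconds.
phases = [
    ("U-Boot through 'Uncompressing Linux'", 2.4),
    ("until ubifs initialization starts", 0.3),
    ("ubifs, until root is mounted", 3.7),
]

total = sum(seconds for _, seconds in phases)
for name, seconds in phases:
    print("%-40s %4.1f s  %5.1f %%" % (name, seconds, 100 * seconds / total))

# Compare against the 110 ms boot time mentioned in the thread.
ratio = total / 0.110
print("total %.1f s, about %.0f times 110 ms" % (total, ratio))
```

The three phases add up to about 6.4 s (the "basically 7 s"), which is
somewhat under 60 times the 110 ms figure. More than half of that is the
ubifs phase, so the flash side deserves at least as much attention as the
first 2.4 s.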

Your kernel seems awfully large. 3104K code? You should definitely find 
out what is making it that big and cut out everything you do not need. 
You might even try some of the embedded system scripts that rip out all 
the printk strings.


If you get the kernel size way down then use an uncompressed kernel and 
it should boot a lot faster if the bottleneck is CPU speed.


However, it is probably IO speed. There could be something really wrong 
and slow with your MTD. Does it DMA or is it doing something crazy like 
using the CPU to read a byte at a time?


Or maybe it's cheap and slow flash. In that case I think your only hope 
is to make all the code as small as possible and/or find a different 
flash filesystem that does not have to read so much of the device to 
mount. Perhaps use a read-only compressed filesystem for the system 
binaries and reflash it for software upgrades. Only init and mount the 
writable flash for user-storable data well after system boot has finished.

--
Zan Lynx
zl...@acm.org

Knowledge is Power.  Power Corrupts.  Study Hard.  Be Evil.


Re: New fast(?)-boot results on ARM

2009-08-14 Thread Robert Schwebel
Zan,

On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:
  That's factor 70 away from the 110 ms boot time Tim has talked about
  some days ago (and he measured on an ARM cpu which had almost half
  the speed of this one), and I'm wondering what we can do to improve
  the boot time.

 2.4s in uncompression? That seems like an obvious target for
 improvement.

Indeed, we'll check that.

However, I have the impression that most systems hyped as "fast boot"
out there are optimized so aggressively that they are no longer really
usable in real-life applications. So we try to configure our systems in
a realistic way. I know that we won't get the last milliseconds that
way - but I'd like to find out how far we can go.

 Your kernel seems awfully large. 3104K code? You should definitely find
 out what is making it that big and cut out everything you do not need.

Definitely, will audit again.

 You might even try some of the embedded system scripts that rip out
 all the printk strings.

Hmm, that's definitely in the last-minute-before-product category.

 If you get the kernel size way down then use an uncompressed kernel and
 it should boot a lot faster if the bottleneck is CPU speed.

I'll try that.

 However, it is probably IO speed. There could be something really wrong
 and slow with your MTD. Does it DMA or is it doing something crazy like
 using the CPU to read a byte at a time?

Will check.

 Or maybe it's cheap and slow flash. In that case I think your only hope
 is to make all the code as small as possible and/or find a different
 flash filesystem that does not have to read so much of the device to
 mount. Perhaps use a read-only compressed filesystem for the system
 binaries and reflash it for software upgrades. Only init and mount the
 writable flash for user-storable data well after system boot has
 finished.

That would also be a last-minute change, but it is certainly worth
evaluating.

We recently changed from jffs2 to ubifs and hoped to gain speed during
that step.

Thanks for your feedback!

rsc
-- 
Pengutronix e.K.                           |                             |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-     |


Re: New fast(?)-boot results on ARM

2009-08-14 Thread Denys Vlasenko
On Fri, Aug 14, 2009 at 7:02 PM, Robert Schwebel r.schwe...@pengutronix.de wrote:
 So we basically have 7 s for the kernel. The rest is userspace, which hasn't
 seen much optimization yet, other than trying to start the GUI application as
 early as possible, while doing all other init stuff in parallel. Adding quiet
 brings us another 300 ms.

 That's factor 70 away from the 110 ms boot time Tim has talked about some days
 ago (and he measured on an ARM cpu which had almost half the speed of this
 one), and I'm wondering what we can do to improve the boot time.

 Robert

 r...@thebe:~$ microcom | ptx_ts U-Boot 2.0.0-rc9
 [  2.395740]   2.395740
 [  2.395860]   0.000120
 [  0.000011]   0.000011 U-Boot 2.0.0-rc9 (Aug  5 2009 - 10:05:58)
 [  0.000059]   0.000048
 [  0.003823]   0.003764 Board: Phytec phyCORE-i.MX27
 [  0.010753]   0.006930 cfi_probe: cfi_flash base: 0xc0000000 size: 0x02000000
 [  0.018711]   0.007958 NAND device: Manufacturer ID: 0x20, Chip ID: 0x36 (ST Micro NAND 64MiB 1,8V 8-bit)
 [  0.026592]   0.007881 im...@imxfb0: i.MX Framebuffer driver
 [  0.178655]   0.152063 dev_protect: currently broken
 [  0.178736]   0.000081 Using environment in NOR Flash
 [  0.182577]   0.003841 initialising PLLs
 [  0.367142]   0.184565 Malloc space: 0xa3f00000 - 0xa7f00000 (size 64 MB)
 [  0.370568]   0.003426 Stack space : 0xa3ef8000 - 0xa3f00000 (size 32 kB)
 [  0.445993]   0.075425 running /env/bin/init...
 [  0.870592]   0.424599
 [  0.874559]   0.003967 Hit any key to stop autoboot:  0

The boot loader is not fast. Considering its simple task,
it can be made faster.

 [  1.326621]   0.452062 loaded zImage from /dev/nand0.kernel.bb with size 
 1679656
 [  2.009996]   0.683375 Uncompressing 
 Linux...
  done, booting the kernel.
 [  2.416999]   0.407003 Linux version 2.6.31-rc4-g056f82f-dirty 
 (s...@octopus) (gcc version 4.3.2 (OSELAS.Toolchain-1.99.3) ) #1 PREEMPT Thu 
 Aug 6 08:37:19 CEST 2009

Other people already commented on this (kernel is too big)

 [  2.418729]   0.001730 CPU: ARM926EJ-S [41069264] revision 4 (ARMv5TEJ), 
 cr=00053177
 [  2.423081]   0.004352 CPU: VIVT data cache, VIVT instruction cache
 [  2.426592]   0.003511 Machine: phyCORE-i.MX27
...
 [  2.742628]   0.016050 0x00360000-0x04000000 : root
 [  3.058610]   0.315982 UBI: attaching mtd7 to ubi0
 [  3.062878]   0.004268 UBI: physical eraseblock size:   16384 bytes (16 
 KiB)
 [  3.070601]   0.007723 UBI: logical eraseblock size:    15360 bytes
 [  3.070665]   0.000064 UBI: smallest flash I/O unit:    512
 [  3.078564]   0.007899 UBI: VID header offset:          512 (aligned 512)
 [  3.078609]   0.000045 UBI: data offset:                1024
 [  5.006609]   1.928000 UBI: attached mtd7 to ubi0
 [  5.013157]   0.006548 UBI: MTD device name:            root

As others commented, ubi looks slow and you probably need to find out why.
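
A quick way to find such spots in a long capture is to sort the lines by
the delta column. The sketch below assumes the "[absolute] delta message"
layout shown in this log; the function name slowest_steps and the sample
lines (copied from the log in this mail, with wrapped lines joined) are
only for illustration.

```python
import re

# "[  5.006609]   1.928000 UBI: attached mtd7 to ubi0"
LINE_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s+(\d+\.\d+)\s?(.*)$")


def slowest_steps(log_text, count=5):
    """Return the `count` largest (delta_seconds, message) pairs."""
    steps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            steps.append((float(match.group(2)), match.group(3).strip()))
    return sorted(steps, key=lambda step: step[0], reverse=True)[:count]


sample = """\
 [  3.058610]   0.315982 UBI: attaching mtd7 to ubi0
 [  3.062878]   0.004268 UBI: physical eraseblock size:   16384 bytes (16 KiB)
 [  5.006609]   1.928000 UBI: attached mtd7 to ubi0
 [  5.605159]   0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
 [  6.602621]   0.997462 IP-Config: Complete:
"""
top = slowest_steps(sample, 3)
for delta, message in top:
    print("%9.6f  %s" % (delta, message))
```

Keep in mind that the delta is printed on the line that ends the wait, so
the 1.9 s shown next to "attached mtd7 to ubi0" was spent during the
attach, before that line appeared.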

 [  5.014566]   0.001409 UBI: MTD device size:            60 MiB
 [  5.018660]   0.004094 UBI: number of good PEBs:        3880
 [  5.022585]   0.003925 UBI: number of bad PEBs:         0
 [  5.026797]   0.004212 UBI: max. allowed volumes:       89
 [  5.026849]   0.000052 UBI: wear-leveling threshold:    4096
 [  5.030779]   0.003930 UBI: number of internal volumes: 1
 [  5.034583]   0.003804 UBI: number of user volumes:     1
 [  5.046572]   0.011989 UBI: available PEBs:             0
 [  5.046622]   0.000050 UBI: total number of reserved PEBs: 3880
 [  5.046657]   0.000035 UBI: number of PEBs reserved for bad PEB handling: 38
 [  5.050606]   0.003949 UBI: max/mean erase counter: 2/0
 [  5.050668]   0.000062 UBI: image sequence number: 0
 [  5.058619]   0.007951 UBI: background thread ubi_bgt0d started, PID 215
 [  5.062620]   0.004001 oprofile: using timer interrupt.
 [  5.070584]   0.007964 TCP cubic registered
 [  5.070637]   0.000053 NET: Registered protocol family 17
 [  5.074624]   0.003987 RPC: Registered udp transport module.
 [  5.082616]   0.007992 RPC: Registered tcp transport module.
 [  5.605159]   0.522543 eth0: config: auto-negotiation on, 100FDX, 100HDX, 
 10FDX, 10HDX.
 [  6.602621]   0.997462 IP-Config: Complete:
 [  6.606638]   0.004017      device=eth0, addr=192.168.23.197, 
 mask=255.255.0.0, gw=192.168.23.2,
 [  6.614588]   0.007950      host=192.168.23.197, domain=, 
 nis-domain=(none),
 [  6.618652]   0.004064      bootserver=192.168.23.2, 
 rootserver=192.168.23.2, rootpath=

Well, this ~1 second is not really kernel's fault, it's DHCP delay.
But, do you need to do it at this moment?
You do not seem to be using networking filesystems.
You can run DHCP client in userspace.

 [  6.630579]   0.011927 UBIFS: recovery needed
 [  6.662655]   0.032076 UBIFS: recovery completed
 [  6.666587]   0.003932 UBIFS: mounted UBI device 0, volume 1, name root
 [  6.670570]   0.003983 UBIFS: file system size:   58490880 bytes (57120 
 KiB, 55 MiB, 3808 LEBs)
 [  6.678572]   

Re: New fast(?)-boot results on ARM

2009-08-14 Thread Linus Walleij
2009/8/14 Robert Schwebel r.schwe...@pengutronix.de:
 On Fri, Aug 14, 2009 at 12:19:48PM -0600, Zan Lynx wrote:

  That's factor 70 away from the 110 ms boot time Tim has talked about
  some days ago (and he measured on an ARM cpu which had almost half
  the speed of this one), and I'm wondering what we can do to improve
  the boot time.

 2.4s in uncompression? That seems like an obvious target for
 improvement.

 Indeed, we'll check that.

We got rid of uncompression on a flash-based system, vastly improving
boot time. The reason is that compressed kernels are faster only when
the throughput to the persistent storage is lower than the decompression
throughput, and on typical embedded systems with DMA the throughput to
memory outperforms the CPU-based decompression.

Of course it depends on many things, such as the performance of the flash
controller, kernel storage filesystem performance, DMA controller
performance, cache architecture etc., so it varies from system to system.
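
The trade-off can be put into a very simple model, sketched here in
Python. It assumes that reading and decompressing happen one after the
other with no overlap, and the sizes and throughputs in the example are
made-up round numbers, not measurements from any board in this thread.

```python
def load_times(uncompressed_bytes, compressed_bytes, read_bps, decompress_bps):
    """Return (seconds_uncompressed, seconds_compressed) for loading a kernel.

    read_bps:       sustained storage-to-RAM throughput in bytes/s
    decompress_bps: decompressor output rate in bytes/s; decompression is
                    assumed to start only after the read has finished
    """
    t_plain = uncompressed_bytes / read_bps
    t_packed = compressed_bytes / read_bps + uncompressed_bytes / decompress_bps
    return t_plain, t_packed


MB = 10**6
# Hypothetical image: 3 MB uncompressed, 1.6 MB compressed,
# decompressor producing 4 MB/s of output.
for read_bps in (8 * MB, 1 * MB):
    plain, packed = load_times(3 * MB, 1.6 * MB, read_bps, 4 * MB)
    winner = "uncompressed" if plain < packed else "compressed"
    print("read %d MB/s: plain %.3f s, compressed %.3f s -> %s"
          % (read_bps // MB, plain, packed, winner))
```

In this model the compressed image only wins when read_bps is below
decompress_bps * (1 - compressed/uncompressed), so with fast DMA-driven
flash reads the uncompressed image comes out ahead, and with slow flash
the compressed one does.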

Linus Walleij