Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-24 Thread Jagan Teki
On Wed, Apr 25, 2018 at 1:46 AM, Tom Rini  wrote:
> On Tue, Apr 24, 2018 at 09:57:58PM +0200, Maxime Ripard wrote:
>> Hi Jagan,
>>
>> On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
>> > On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
>> >  wrote:
>> > > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
>> > >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
>> > >>  wrote:
>> > >> > From: Philipp Tomsich 
>> > >> >
>> > >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > >> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > >> > loops.
>> > >> >
>> > >> > This commit restructures the main polling loops to use get_timer(...)
>> > >> > to determine whether a (millisecond) timeout has expired.  We choose
>> > >> > not to use the wait_bit function, as we don't need interruptability
>> > >> > with ctrl-c and have at least one case where two bits (one for an
>> > >> > error condition and another one for completion) need to be read and
>> > >> > using wait_bit would have not added to the clarity.
>> > >> >
>> > >> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
>> > >> > 10MB write decreases from 9.302s to 0.884s).
>> > >>
>> > >> FYI: I've seen a significant improvement, but not 10x, on A64
>> > >> (bananapi-m64) with reads
>> > >>
>> > >> Before this change:
>> > >>
>> > >> => mmc dev 0
>> > >> switch to partitions #0, OK
>> > >> mmc0 is current device
>> > >> => fatload mmc 0:1 $kernel_addr_r Image
>> > >> reading Image
>> > >> 16310784 bytes read in 821 ms (18.9 MiB/s)
>> > >> => mmc dev 1
>> > >> switch to partitions #0, OK
>> > >> mmc1(part 0) is current device
>> > >> => ext4load mmc 1:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 1109 ms (14 MiB/s)
>> > >>
>> > >>
>> > >> After this change:
>> > >>
>> > >> => mmc dev 0
>> > >> switch to partitions #0, OK
>> > >> mmc0 is current device
>> > >> => fatload mmc 0:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 784 ms (19.8 MiB/s)
>> > >> => mmc dev 1
>> > >> switch to partitions #0, OK
>> > >> mmc1(part 0) is current device
>> > >> => ext4load mmc 1:1 $kernel_addr_r Image
>> > >> 16310784 bytes read in 793 ms (19.6 MiB/s)
>> > >
>> > > Yeah, the smaller the file is, the bigger the gain is. Since your
>> > > file is almost twice as big, the gains are probably just noise at that
>> > > point and the bottleneck starts to be your MMC.
>> >
>> > Acked-by: Jagan Teki 
>>
>> Jaehoon doesn't seem to reply at all, can we merge this through the
>> sunxi tree?

Applied to u-boot-sunxi/master
___
U-Boot mailing list
U-Boot@lists.denx.de
https://lists.denx.de/listinfo/u-boot


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-24 Thread Tom Rini
On Tue, Apr 24, 2018 at 09:57:58PM +0200, Maxime Ripard wrote:
> Hi Jagan,
> 
> On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
> > On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
> >  wrote:
> > > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> > >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
> > >>  wrote:
> > >> > From: Philipp Tomsich 
> > >> >
> > >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > >> > read 10MB from a fast eMMC device due to excessive delays in polling
> > >> > loops.
> > >> >
> > >> > This commit restructures the main polling loops to use get_timer(...)
> > >> > to determine whether a (millisecond) timeout has expired.  We choose
> > >> > not to use the wait_bit function, as we don't need interruptability
> > >> > with ctrl-c and have at least one case where two bits (one for an
> > >> > error condition and another one for completion) need to be read and
> > >> > using wait_bit would have not added to the clarity.
> > >> >
> > >> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
> > >> > 10MB write decreases from 9.302s to 0.884s).
> > >>
> > >> FYI: I've seen a significant improvement, but not 10x, on A64
> > >> (bananapi-m64) with reads
> > >>
> > >> Before this change:
> > >>
> > >> => mmc dev 0
> > >> switch to partitions #0, OK
> > >> mmc0 is current device
> > >> => fatload mmc 0:1 $kernel_addr_r Image
> > >> reading Image
> > >> 16310784 bytes read in 821 ms (18.9 MiB/s)
> > >> => mmc dev 1
> > >> switch to partitions #0, OK
> > >> mmc1(part 0) is current device
> > >> => ext4load mmc 1:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 1109 ms (14 MiB/s)
> > >>
> > >>
> > >> After this change:
> > >>
> > >> => mmc dev 0
> > >> switch to partitions #0, OK
> > >> mmc0 is current device
> > >> => fatload mmc 0:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 784 ms (19.8 MiB/s)
> > >> => mmc dev 1
> > >> switch to partitions #0, OK
> > >> mmc1(part 0) is current device
> > >> => ext4load mmc 1:1 $kernel_addr_r Image
> > >> 16310784 bytes read in 793 ms (19.6 MiB/s)
> > >
> > > Yeah, the smaller the file is, the bigger the gain is. Since your
> > > file is almost twice as big, the gains are probably just noise at that
> > > point and the bottleneck starts to be your MMC.
> > 
> > Acked-by: Jagan Teki 
> 
> Jaehoon doesn't seem to reply at all, can we merge this through the
> sunxi tree?

Yes.

Reviewed-by: Tom Rini 

-- 
Tom




Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-24 Thread Maxime Ripard
Hi Jagan,

On Fri, Apr 06, 2018 at 11:36:59AM +0530, Jagan Teki wrote:
> On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
>  wrote:
> > On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> >> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
> >>  wrote:
> >> > From: Philipp Tomsich 
> >> >
> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> >> > read 10MB from a fast eMMC device due to excessive delays in polling
> >> > loops.
> >> >
> >> > This commit restructures the main polling loops to use get_timer(...)
> >> > to determine whether a (millisecond) timeout has expired.  We choose
> >> > not to use the wait_bit function, as we don't need interruptability
> >> > with ctrl-c and have at least one case where two bits (one for an
> >> > error condition and another one for completion) need to be read and
> >> > using wait_bit would have not added to the clarity.
> >> >
> >> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
> >> > 10MB write decreases from 9.302s to 0.884s).
> >>
> >> FYI: I've seen a significant improvement, but not 10x, on A64
> >> (bananapi-m64) with reads
> >>
> >> Before this change:
> >>
> >> => mmc dev 0
> >> switch to partitions #0, OK
> >> mmc0 is current device
> >> => fatload mmc 0:1 $kernel_addr_r Image
> >> reading Image
> >> 16310784 bytes read in 821 ms (18.9 MiB/s)
> >> => mmc dev 1
> >> switch to partitions #0, OK
> >> mmc1(part 0) is current device
> >> => ext4load mmc 1:1 $kernel_addr_r Image
> >> 16310784 bytes read in 1109 ms (14 MiB/s)
> >>
> >>
> >> After this change:
> >>
> >> => mmc dev 0
> >> switch to partitions #0, OK
> >> mmc0 is current device
> >> => fatload mmc 0:1 $kernel_addr_r Image
> >> 16310784 bytes read in 784 ms (19.8 MiB/s)
> >> => mmc dev 1
> >> switch to partitions #0, OK
> >> mmc1(part 0) is current device
> >> => ext4load mmc 1:1 $kernel_addr_r Image
> >> 16310784 bytes read in 793 ms (19.6 MiB/s)
> >
> > Yeah, the smaller the file is, the bigger the gain is. Since your
> > file is almost twice as big, the gains are probably just noise at that
> > point and the bottleneck starts to be your MMC.
> 
> Acked-by: Jagan Teki 

Jaehoon doesn't seem to reply at all, can we merge this through the
sunxi tree?

Thanks!
Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-20 Thread Michael Nazzareno Trimarchi
Hi

On Fri, Apr 20, 2018 at 10:10 PM, Maxime Ripard
 wrote:
> On Mon, Apr 16, 2018 at 10:37:11PM +0200, Michael Nazzareno Trimarchi wrote:
>> Hi
>>
>> On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
>>  wrote:
>> > On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
>> >> Hi Jaehoon,
>> >>
>> >> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
>> >> > From: Philipp Tomsich 
>> >> >
>> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> >> > read 10MB from a fast eMMC device due to excessive delays in polling
>> >> > loops.
>> >> >
>> >> > This commit restructures the main polling loops to use get_timer(...)
>> >> > to determine whether a (millisecond) timeout has expired.  We choose
>> >> > not to use the wait_bit function, as we don't need interruptability
>> >> > with ctrl-c and have at least one case where two bits (one for an
>> >> > error condition and another one for completion) need to be read and
>> >> > using wait_bit would have not added to the clarity.
>> >> >
>> >> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
>> >> > 10MB write decreases from 9.302s to 0.884s).
>> >> >
>> >> > Signed-off-by: Philipp Tomsich 
>> >> > Signed-off-by: Maxime Ripard 
>> >>
>> >> Any chance we can merge this for the next release?
>> >
>> > Ping?
>> >
>>
>> Just curious, but what is the result if %s/udelay(1000)/udelay(1)/g in
>> the driver?
>
> This will probably speed up the transfer as well, but we don't need
> that udelay in the first place. We don't have any application or OS to
> be nice to, so we can just busy loop in order to achieve the higher
> throughput. Or am I missing something?
>

One reason was to try to keep the code change smaller; the second was to
ping the thread in another way so that the patch gets included.

Michael

> Maxime
>
> --
> Maxime Ripard, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> https://bootlin.com



-- 
| Michael Nazzareno Trimarchi Amarula Solutions BV |
| COO  -  Founder  Cruquiuskade 47 |
| +31(0)851119172 Amsterdam 1018 AM NL |
|  [`as] http://www.amarulasolutions.com   |


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-20 Thread Maxime Ripard
On Mon, Apr 16, 2018 at 10:37:11PM +0200, Michael Nazzareno Trimarchi wrote:
> Hi
> 
> On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
>  wrote:
> > On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
> >> Hi Jaehoon,
> >>
> >> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> >> > From: Philipp Tomsich 
> >> >
> >> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> >> > read 10MB from a fast eMMC device due to excessive delays in polling
> >> > loops.
> >> >
> >> > This commit restructures the main polling loops to use get_timer(...)
> >> > to determine whether a (millisecond) timeout has expired.  We choose
> >> > not to use the wait_bit function, as we don't need interruptability
> >> > with ctrl-c and have at least one case where two bits (one for an
> >> > error condition and another one for completion) need to be read and
> >> > using wait_bit would have not added to the clarity.
> >> >
> >> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
> >> > 10MB write decreases from 9.302s to 0.884s).
> >> >
> >> > Signed-off-by: Philipp Tomsich 
> >> > Signed-off-by: Maxime Ripard 
> >>
> >> Any chance we can merge this for the next release?
> >
> > Ping?
> >
> 
> Just curious, but what is the result if %s/udelay(1000)/udelay(1)/g in
> the driver?

This will probably speed up the transfer as well, but we don't need
that udelay in the first place. We don't have any application or OS to
be nice to, so we can just busy loop in order to achieve the higher
throughput. Or am I missing something?
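
For readers following along, the difference between the two loop styles can
be sketched outside of U-Boot. This is only an illustrative simulation, not
the driver code: sim_udelay(), sim_get_timer(), sim_busy() and elapsed_us()
are hypothetical stand-ins invented for this example, and the cost of 1 us
per register read is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated clock in microseconds; stands in for the hardware timer. */
static uint64_t sim_now_us;
/* Simulated time at which the fake "busy" bit clears. */
static uint64_t sim_ready_at_us;

/* Hypothetical stand-in for udelay(): only advances simulated time. */
void sim_udelay(unsigned long usec)
{
	sim_now_us += usec;
}

/* Hypothetical stand-in for get_timer(): milliseconds elapsed since base. */
unsigned long sim_get_timer(unsigned long base)
{
	return (unsigned long)(sim_now_us / 1000) - base;
}

/* Fake register read: costs 1 us (assumed), true while the device is busy. */
bool sim_busy(void)
{
	sim_now_us += 1;
	return sim_now_us < sim_ready_at_us;
}

/* Old style: one counter tick per iteration, with a 1 ms delay each time. */
int poll_with_udelay(unsigned timeout_msecs)
{
	while (sim_busy()) {
		if (!timeout_msecs--)
			return -1;
		sim_udelay(1000);
	}
	return 0;
}

/* New style: busy-poll and compare elapsed time against the timeout. */
int poll_with_get_timer(unsigned timeout_msecs)
{
	unsigned long start = sim_get_timer(0);

	while (sim_busy()) {
		if (sim_get_timer(start) > timeout_msecs)
			return -1;
	}
	return 0;
}

/* Reset the simulation, run one poll with a 2000 ms timeout, and return
 * the simulated microseconds it took. The poll result goes into *ret. */
uint64_t elapsed_us(int (*poll)(unsigned), uint64_t ready_after_us, int *ret)
{
	assert(poll && ret);
	sim_now_us = 0;
	sim_ready_at_us = ready_after_us;
	*ret = poll(2000);
	return sim_now_us;
}
```

With a device that becomes ready after 50 simulated microseconds,
elapsed_us(poll_with_udelay, 50, &ret) comes out at 1002 us, while
elapsed_us(poll_with_get_timer, 50, &ret) comes out at 50 us. Both loops
still give up with -1 if the device stays busy past the 2000 ms timeout.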

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-16 Thread Michael Nazzareno Trimarchi
Hi

On Mon, Apr 16, 2018 at 9:55 PM, Maxime Ripard
 wrote:
> On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
>> Hi Jaehoon,
>>
>> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
>> > From: Philipp Tomsich 
>> >
>> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > loops.
>> >
>> > This commit restructures the main polling loops to use get_timer(...)
>> > to determine whether a (millisecond) timeout has expired.  We choose
>> > not to use the wait_bit function, as we don't need interruptability
>> > with ctrl-c and have at least one case where two bits (one for an
>> > error condition and another one for completion) need to be read and
>> > using wait_bit would have not added to the clarity.
>> >
>> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
>> > 10MB write decreases from 9.302s to 0.884s).
>> >
>> > Signed-off-by: Philipp Tomsich 
>> > Signed-off-by: Maxime Ripard 
>>
>> Any chance we can merge this for the next release?
>
> Ping?
>

Just curious, but what is the result if %s/udelay(1000)/udelay(1)/g in the driver?

Michael

> Maxime
>
> --
> Maxime Ripard, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> https://bootlin.com
>
>



-- 
| Michael Nazzareno Trimarchi Amarula Solutions BV |
| COO  -  Founder  Cruquiuskade 47 |
| +31(0)851119172 Amsterdam 1018 AM NL |
|  [`as] http://www.amarulasolutions.com   |


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-16 Thread Maxime Ripard
On Fri, Apr 06, 2018 at 07:54:47AM +0200, Maxime Ripard wrote:
> Hi Jaehoon,
> 
> On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> > From: Philipp Tomsich 
> > 
> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > read 10MB from a fast eMMC device due to excessive delays in polling
> > loops.
> > 
> > This commit restructures the main polling loops to use get_timer(...)
> > to determine whether a (millisecond) timeout has expired.  We choose
> > not to use the wait_bit function, as we don't need interruptability
> > with ctrl-c and have at least one case where two bits (one for an
> > error condition and another one for completion) need to be read and
> > using wait_bit would have not added to the clarity.
> > 
> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
> > 10MB write decreases from 9.302s to 0.884s).
> > 
> > Signed-off-by: Philipp Tomsich 
> > Signed-off-by: Maxime Ripard 
> 
> Any chance we can merge this for the next release?

Ping?

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com




Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-06 Thread Jagan Teki
On Wed, Apr 4, 2018 at 12:36 PM, Maxime Ripard
 wrote:
> On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
>> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
>>  wrote:
>> > From: Philipp Tomsich 
>> >
>> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
>> > read 10MB from a fast eMMC device due to excessive delays in polling
>> > loops.
>> >
>> > This commit restructures the main polling loops to use get_timer(...)
>> > to determine whether a (millisecond) timeout has expired.  We choose
>> > not to use the wait_bit function, as we don't need interruptability
>> > with ctrl-c and have at least one case where two bits (one for an
>> > error condition and another one for completion) need to be read and
>> > using wait_bit would have not added to the clarity.
>> >
>> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
>> > 10MB write decreases from 9.302s to 0.884s).
>>
>> FYI: I've seen a significant improvement, but not 10x, on A64
>> (bananapi-m64) with reads
>>
>> Before this change:
>>
>> => mmc dev 0
>> switch to partitions #0, OK
>> mmc0 is current device
>> => fatload mmc 0:1 $kernel_addr_r Image
>> reading Image
>> 16310784 bytes read in 821 ms (18.9 MiB/s)
>> => mmc dev 1
>> switch to partitions #0, OK
>> mmc1(part 0) is current device
>> => ext4load mmc 1:1 $kernel_addr_r Image
>> 16310784 bytes read in 1109 ms (14 MiB/s)
>>
>>
>> After this change:
>>
>> => mmc dev 0
>> switch to partitions #0, OK
>> mmc0 is current device
>> => fatload mmc 0:1 $kernel_addr_r Image
>> 16310784 bytes read in 784 ms (19.8 MiB/s)
>> => mmc dev 1
>> switch to partitions #0, OK
>> mmc1(part 0) is current device
>> => ext4load mmc 1:1 $kernel_addr_r Image
>> 16310784 bytes read in 793 ms (19.6 MiB/s)
>
> Yeah, the smaller the file is, the bigger the gain is. Since your
> file is almost twice as big, the gains are probably just noise at that
> point and the bottleneck starts to be your MMC.

Acked-by: Jagan Teki 


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-05 Thread Maxime Ripard
Hi Jaehoon,

On Wed, Mar 21, 2018 at 12:18:58PM +0100, Maxime Ripard wrote:
> From: Philipp Tomsich 
> 
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
> 
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would have not added to the clarity.
> 
> The observed speedup in testing on a A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).
> 
> Signed-off-by: Philipp Tomsich 
> Signed-off-by: Maxime Ripard 

Any chance we can merge this for the next release?

Thanks!
Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-04 Thread Maxime Ripard
On Wed, Apr 04, 2018 at 12:13:01PM +0530, Jagan Teki wrote:
> On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
>  wrote:
> > From: Philipp Tomsich 
> >
> > Throughput tests have shown the sunxi_mmc driver to take over 10s to
> > read 10MB from a fast eMMC device due to excessive delays in polling
> > loops.
> >
> > This commit restructures the main polling loops to use get_timer(...)
> > to determine whether a (millisecond) timeout has expired.  We choose
> > not to use the wait_bit function, as we don't need interruptability
> > with ctrl-c and have at least one case where two bits (one for an
> > error condition and another one for completion) need to be read and
> > using wait_bit would have not added to the clarity.
> >
> > The observed speedup in testing on a A31 is greater than 10x (e.g. a
> > 10MB write decreases from 9.302s to 0.884s).
> 
> FYI: I've seen a significant improvement, but not 10x, on A64
> (bananapi-m64) with reads
> 
> Before this change:
> 
> => mmc dev 0
> switch to partitions #0, OK
> mmc0 is current device
> => fatload mmc 0:1 $kernel_addr_r Image
> reading Image
> 16310784 bytes read in 821 ms (18.9 MiB/s)
> => mmc dev 1
> switch to partitions #0, OK
> mmc1(part 0) is current device
> => ext4load mmc 1:1 $kernel_addr_r Image
> 16310784 bytes read in 1109 ms (14 MiB/s)
> 
> 
> After this change:
> 
> => mmc dev 0
> switch to partitions #0, OK
> mmc0 is current device
> => fatload mmc 0:1 $kernel_addr_r Image
> 16310784 bytes read in 784 ms (19.8 MiB/s)
> => mmc dev 1
> switch to partitions #0, OK
> mmc1(part 0) is current device
> => ext4load mmc 1:1 $kernel_addr_r Image
> 16310784 bytes read in 793 ms (19.6 MiB/s)

Yeah, the smaller the file is, the bigger the gain is. Since your
file is almost twice as big, the gains are probably just noise at that
point and the bottleneck starts to be your MMC.

Maxime

-- 
Maxime Ripard, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com




Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-04-04 Thread Jagan Teki
On Wed, Mar 21, 2018 at 4:48 PM, Maxime Ripard
 wrote:
> From: Philipp Tomsich 
>
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
>
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would have not added to the clarity.
>
> The observed speedup in testing on a A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).

FYI: I've seen a significant improvement, but not 10x, on A64
(bananapi-m64) with reads

Before this change:

=> mmc dev 0
switch to partitions #0, OK
mmc0 is current device
=> fatload mmc 0:1 $kernel_addr_r Image
reading Image
16310784 bytes read in 821 ms (18.9 MiB/s)
=> mmc dev 1
switch to partitions #0, OK
mmc1(part 0) is current device
=> ext4load mmc 1:1 $kernel_addr_r Image
16310784 bytes read in 1109 ms (14 MiB/s)


After this change:

=> mmc dev 0
switch to partitions #0, OK
mmc0 is current device
=> fatload mmc 0:1 $kernel_addr_r Image
16310784 bytes read in 784 ms (19.8 MiB/s)
=> mmc dev 1
switch to partitions #0, OK
mmc1(part 0) is current device
=> ext4load mmc 1:1 $kernel_addr_r Image
16310784 bytes read in 793 ms (19.6 MiB/s)
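
The figures above can be cross-checked with a little arithmetic. The sketch
below is not part of U-Boot; mib_per_sec() and speedup() are hypothetical
helper names used only to reproduce the numbers printed by the load
commands.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in MiB/s for `bytes` transferred in `ms` milliseconds. */
double mib_per_sec(uint64_t bytes, uint64_t ms)
{
	assert(ms > 0);
	return ((double)bytes / (1024.0 * 1024.0)) / ((double)ms / 1000.0);
}

/* Ratio of the old duration to the new one; above 1.0 means faster. */
double speedup(uint64_t before_ms, uint64_t after_ms)
{
	assert(after_ms > 0);
	return (double)before_ms / (double)after_ms;
}
```

For the 16310784-byte Image, speedup(1109, 793) is roughly 1.40 for the
ext4load case on mmc1 and speedup(821, 784) is roughly 1.05 for the fatload
case on mmc0. The A31 example from the commit message, speedup(9302, 884),
is roughly 10.5.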

Jagan.


Re: [U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-03-29 Thread Mylène Josserand
Hello,

On Wed, 21 Mar 2018 12:18:58 +0100
Maxime Ripard  wrote:

> From: Philipp Tomsich 
> 
> Throughput tests have shown the sunxi_mmc driver to take over 10s to
> read 10MB from a fast eMMC device due to excessive delays in polling
> loops.
> 
> This commit restructures the main polling loops to use get_timer(...)
> to determine whether a (millisecond) timeout has expired.  We choose
> not to use the wait_bit function, as we don't need interruptability
> with ctrl-c and have at least one case where two bits (one for an
> error condition and another one for completion) need to be read and
> using wait_bit would have not added to the clarity.
> 
> The observed speedup in testing on a A31 is greater than 10x (e.g. a
> 10MB write decreases from 9.302s to 0.884s).
> 
> Signed-off-by: Philipp Tomsich 
> Signed-off-by: Maxime Ripard 

Tested-by: Mylène Josserand 

Thanks,

-- 
Mylène Josserand, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
http://bootlin.com


[U-Boot] [PATCH] sunxi: improve throughput in the sunxi_mmc driver

2018-03-21 Thread Maxime Ripard
From: Philipp Tomsich 

Throughput tests have shown the sunxi_mmc driver to take over 10s to
read 10MB from a fast eMMC device due to excessive delays in polling
loops.

This commit restructures the main polling loops to use get_timer(...)
to determine whether a (millisecond) timeout has expired.  We choose
not to use the wait_bit function, as we don't need interruptability
with ctrl-c and have at least one case where two bits (one for an
error condition and another one for completion) need to be read and
using wait_bit would have not added to the clarity.

The observed speedup in testing on a A31 is greater than 10x (e.g. a
10MB write decreases from 9.302s to 0.884s).

Signed-off-by: Philipp Tomsich 
Signed-off-by: Maxime Ripard 
---
 drivers/mmc/sunxi_mmc.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/mmc/sunxi_mmc.c b/drivers/mmc/sunxi_mmc.c
index 4edb4be46c81..d36c1689e7b1 100644
--- a/drivers/mmc/sunxi_mmc.c
+++ b/drivers/mmc/sunxi_mmc.c
@@ -187,15 +187,16 @@ static int mmc_update_clk(struct sunxi_mmc_priv *priv)
 {
 	unsigned int cmd;
 	unsigned timeout_msecs = 2000;
+	unsigned long start = get_timer(0);
 
 	cmd = SUNXI_MMC_CMD_START |
 	      SUNXI_MMC_CMD_UPCLK_ONLY |
 	      SUNXI_MMC_CMD_WAIT_PRE_OVER;
+
 	writel(cmd, &priv->reg->cmd);
 	while (readl(&priv->reg->cmd) & SUNXI_MMC_CMD_START) {
-		if (!timeout_msecs--)
+		if (get_timer(start) > timeout_msecs)
 			return -1;
-		udelay(1000);
 	}
 
 	/* clock update sets various irq status bits, clear these */
@@ -276,18 +277,21 @@ static int mmc_trans_data_by_cpu(struct sunxi_mmc_priv *priv, struct mmc *mmc,
 	unsigned i;
 	unsigned *buff = (unsigned int *)(reading ? data->dest : data->src);
 	unsigned byte_cnt = data->blocksize * data->blocks;
-	unsigned timeout_usecs = (byte_cnt >> 8) * 1000;
-	if (timeout_usecs < 200)
-		timeout_usecs = 200;
+	unsigned timeout_msecs = byte_cnt >> 8;
+	unsigned long  start;
+
+	if (timeout_msecs < 2000)
+		timeout_msecs = 2000;
 
 	/* Always read / write data through the CPU */
 	setbits_le32(&priv->reg->gctrl, SUNXI_MMC_GCTRL_ACCESS_BY_AHB);
 
+	start = get_timer(0);
+
 	for (i = 0; i < (byte_cnt >> 2); i++) {
 		while (readl(&priv->reg->status) & status_bit) {
-			if (!timeout_usecs--)
+			if (get_timer(start) > timeout_msecs)
 				return -1;
-			udelay(1);
 		}
 
 		if (reading)
@@ -303,16 +307,16 @@ static int mmc_rint_wait(struct sunxi_mmc_priv *priv, struct mmc *mmc,
 			 uint timeout_msecs, uint done_bit, const char *what)
 {
 	unsigned int status;
+	unsigned long start = get_timer(0);
 
 	do {
 		status = readl(&priv->reg->rint);
-		if (!timeout_msecs-- ||
+		if ((get_timer(start) > timeout_msecs) ||
 		    (status & SUNXI_MMC_RINT_INTERRUPT_ERROR_BIT)) {
 			debug("%s timeout %x\n", what,
 			      status & SUNXI_MMC_RINT_INTERRUPT_ERROR_BIT);
 			return -ETIMEDOUT;
 		}
-		udelay(1000);
 	} while (!(status & done_bit));
 
 	return 0;
@@ -404,15 +408,16 @@ static int sunxi_mmc_send_cmd_common(struct sunxi_mmc_priv *priv,
 	}
 
 	if (cmd->resp_type & MMC_RSP_BUSY) {
+		unsigned long start = get_timer(0);
 		timeout_msecs = 2000;
+
 		do {
 			status = readl(&priv->reg->status);
-			if (!timeout_msecs--) {
+			if (get_timer(start) > timeout_msecs) {
 				debug("busy timeout\n");
 				error = -ETIMEDOUT;
 				goto out;
 			}
-			udelay(1000);
 		} while (status & SUNXI_MMC_STATUS_CARD_DATA_BUSY);
 	}
 
-- 
2.14.3
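
As a reading aid for the mmc_trans_data_by_cpu() hunk, the new timeout
budget can be worked through in isolation. This is a minimal sketch that
only mirrors the clamp shown in the patch; data_timeout_msecs() is a
hypothetical name, and the non-zero check on its arguments is an assumption
added for the example.

```c
#include <assert.h>

/* Mirrors the clamp in the patch: 1 ms per 256 bytes, never below 2000 ms. */
unsigned data_timeout_msecs(unsigned blocksize, unsigned blocks)
{
	unsigned byte_cnt;
	unsigned timeout_msecs;

	/* Assumed sanity check, not present in the driver. */
	assert(blocksize != 0 && blocks != 0);

	byte_cnt = blocksize * blocks;
	timeout_msecs = byte_cnt >> 8;
	if (timeout_msecs < 2000)
		timeout_msecs = 2000;
	return timeout_msecs;
}
```

A single 512-byte block works out to 2 ms before the clamp and so gets the
2000 ms minimum; 2048 blocks of 512 bytes (1 MiB) get 4096 ms.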
