On Sat, Jun 27, 2009 at 04:32:35PM +0900, Kyungmin Park wrote:
> >> +/**
> >> + * onenand_read_burst
> >> + *
> >> + * 16 Burst read: performance is improved up to 40%.
> >> + */
> >> +static void onenand_read_burst(void *dest, const void *src, size_t len)
> >> +{
> >> +     int count;
> >> +
> >> +     if (len % 16 != 0)
> >> +             return;
> >> +
> >> +     count = len / 16;
> >> +
> >> +     __asm__ __volatile__(
> >> +             "       stmdb   r13!, {r0-r3,r9-r12}\n"
> >> +             "       mov     r2, %0\n"
> >> +             "1:\n"
> >> +             "       ldmia   r1, {r9-r12}\n"
> >> +             "       stmia   r0!, {r9-r12}\n"
> >> +             "       subs    r2, r2, #0x1\n"
> >> +             "       bne     1b\n"
> >> +             "       ldmia   r13!, {r0-r3,r9-r12}\n"::"r" (count));
> >> +}
> >
> > What is this doing that we couldn't generically make memcpy do?
> 
> Even though it looks a bit strange, it gives some performance gain, but
> it is not general.

I guess that's because you're reading from the same 16 bytes each loop
iteration.  Perhaps repeated 16-byte calls to memcpy could be used,
combined with a suitably optimized memcpy (possibly with inline asm in
the arch headers for certain constant sizes).
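
As a rough sketch of the repeated-memcpy idea (the function name
read_fixed_window16 is hypothetical, and this assumes the source really is
a fixed 16-byte window that must be re-read on every iteration, which is
only my guess from the quoted asm):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fill dest by re-reading the same 16-byte source
 * window, one memcpy() call per 16 bytes. A suitably optimized memcpy
 * could turn each constant-size call into a single burst.
 */
static void read_fixed_window16(void *dest, const void *src, size_t len)
{
	unsigned char *d = dest;

	/* Same restriction as the quoted patch: whole 16-byte units only. */
	if (len % 16 != 0)
		return;

	while (len) {
		memcpy(d, src, 16);
		/*
		 * GCC-style compiler barrier (an assumption about the
		 * toolchain): keeps the compiler from reusing the previous
		 * iteration's load of src.
		 */
		__asm__ __volatile__("" ::: "memory");
		d += 16;
		len -= 16;
	}
}
```

Whether this gets anywhere near the quoted 40% depends entirely on how the
arch's memcpy handles a constant 16-byte size, so it would need measuring.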

Also, relying on r0/r1 to still contain dest/src after the compiler has
had a chance to mess with things is dangerous.  Better to use the asm
constraints properly.  I also don't see why you need to save r3.
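
For illustration, here is roughly what I mean by using the constraints,
as an unmeasured sketch. The name burst_copy16 is hypothetical, the choice
of r4-r7 as scratch registers is mine, and it assumes ARM (not Thumb) mode,
4-byte-aligned pointers, and the same fixed source window as the quoted
code. The non-ARM branch exists only so the sketch builds elsewhere:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rewrite: operands are passed via constraints, not r0/r1. */
static void burst_copy16(void *dest, const void *src, size_t len)
{
	uint32_t *d = dest;
	const volatile uint32_t *s = src;
	size_t count = len / 16;

	if (len % 16 != 0)
		return;

	while (count--) {
#if defined(__arm__) && !defined(__thumb__)
		/*
		 * The compiler picks the registers for d and s; the clobber
		 * list tells it r4-r7 are overwritten, so no manual
		 * stmdb/ldmia save and restore is needed.
		 */
		__asm__ __volatile__(
			"ldmia	%[s], {r4-r7}\n\t"
			"stmia	%[d]!, {r4-r7}\n\t"
			: [d] "+r" (d)
			: [s] "r" (s)
			: "r4", "r5", "r6", "r7", "memory");
#else
		/* Portable fallback: four 32-bit reads of the same window. */
		d[0] = s[0];
		d[1] = s[1];
		d[2] = s[2];
		d[3] = s[3];
		d += 4;
#endif
	}
}
```

Keeping the loop counter in C also means there is no count register to
set up or preserve inside the asm.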

Is there any chance that this driver could be applicable to something
that isn't ARM?  Is this programming interface part of a host controller,
or is it embedded in the OneNAND chip?

-Scott
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot
