On Sat, Jun 27, 2009 at 04:32:35PM +0900, Kyungmin Park wrote:
> >> +/**
> >> + * onenand_read_burst
> >> + *
> >> + * 16 Burst read: performance is improved up to 40%.
> >> + */
> >> +static void onenand_read_burst(void *dest, const void *src, size_t len)
> >> +{
> >> +	int count;
> >> +
> >> +	if (len % 16 != 0)
> >> +		return;
> >> +
> >> +	count = len / 16;
> >> +
> >> +	__asm__ __volatile__(
> >> +	" stmdb r13!, {r0-r3,r9-r12}\n"
> >> +	" mov r2, %0\n"
> >> +	"1:\n"
> >> +	" ldmia r1, {r9-r12}\n"
> >> +	" stmia r0!, {r9-r12}\n"
> >> +	" subs r2, r2, #0x1\n"
> >> +	" bne 1b\n"
> >> +	" ldmia r13!, {r0-r3,r9-r12}\n"::"r" (count));
> >> +}
> >
> > What is this doing that we couldn't generically make memcpy do?
>
> Even though It looks some strange. it has some performance gain. but
> not general.

I guess that's because you're reading from the same 16 bytes each loop
iteration. Perhaps repeated 16-byte calls to memcpy could be used, combined
with a suitably optimized memcpy (possibly with inline asm in the arch
headers for certain constant sizes).

Also, relying on r0/r1 to still contain dest/src after the compiler has had
a chance to mess with things is dangerous. Better to use the asm
constraints properly. I also don't see why you need to save r3.

Is there any chance that this driver could be applicable to something that
isn't ARM? Is this programming interface part of a host controller, or is
it embedded in the OneNAND chip?

-Scott
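
To illustrate the two suggestions above (asm constraints instead of hard-coded r0/r1, and repeated 16-byte memcpy calls elsewhere), something along these lines might work. This is only a rough sketch, not a tested replacement for the patch. The name burst_read16, the int return value, the choice of r4-r7 as scratch registers and the early return for len == 0 are all my own assumptions. The memcpy fallback also assumes the compiler really re-reads the source on every iteration, which would need checking for a real device window.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical rework of the quoted onenand_read_burst(); the name and the
 * return convention are illustrative only.  As in the patch, the source
 * pointer does not advance: the same 16-byte window at src is read once
 * per iteration while dest moves forward.
 *
 * Returns 0 on success, -1 if len is not a multiple of 16.
 */
static int burst_read16(void *dest, const void *src, size_t len)
{
	size_t count = len / 16;

	if (len % 16 != 0)
		return -1;
	if (count == 0)
		return 0;	/* the asm loop must not be entered with count == 0 */

#if defined(__arm__) && !defined(__thumb__)
	/*
	 * Operands are passed through constraints, so the compiler picks the
	 * registers and nothing has to be saved on the stack by hand.
	 * "+&r" marks dest and count as read-write and early-clobbered, so
	 * they never share a register with src.  r4-r7 are an assumed choice
	 * of scratch registers, declared in the clobber list.
	 */
	__asm__ __volatile__(
		"1:\n"
		"	ldmia	%[s], {r4-r7}\n"
		"	stmia	%[d]!, {r4-r7}\n"
		"	subs	%[n], %[n], #1\n"
		"	bne	1b\n"
		: [d] "+&r" (dest), [n] "+&r" (count)
		: [s] "r" (src)
		: "r4", "r5", "r6", "r7", "cc", "memory");
#else
	{
		/* Portable fallback: one 16-byte memcpy per iteration. */
		unsigned char *d = dest;

		while (count--) {
			memcpy(d, src, 16);
			d += 16;
		}
	}
#endif
	return 0;
}
```

With the constraints in place the stmdb/ldmia save and restore around the loop goes away, and the non-ARM branch answers at least part of the portability question.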