On Tue, January 11, 2011 09:33, Nori, Sekhar wrote:
>> > http://processors.wiki.ti.com/index.php/DaVinci_PSP_03.20.00.14_Device_Driver_Features_and_Performance_Guide#SATA
>>
>> I've seen that page, and I get similar results. The main reason the folks
>> I'm working with asked me to help them look into things is that they are
>> under the impression that the OMAP-L138 should be able to easily do 50 MB/s.
>>
>> For instance, in this thread, a TI employee ("clam") forwarded a claim that
>> raw SATA tests on the OMAP-L138 could do 120 MB/s.
>
> Yes, I think that's correct. SATA driver for TI's DSP/BIOS for the
> pin-compatible DSP-only chip C6748 is able to do 120 MBps on read
> and 84 MBps on write. Of course, the two operating systems are very
> different; DSP/BIOS being more lightweight. Also, I guess buffer
> copies would be avoided in DSP/BIOS.
>
> Please see the DSP/BIOS driver's datasheet here:
>
> http://software-dl.ti.com/dsps/dsps_public_sw/psp/BIOSPSP/01_30_01/content/C6748_BIOSPSP_Datasheet.pdf

In that document, I have a concern about the SATA numbers. They say the tests
were done with 100 MB of data per test. However, in my own experiments, I have
found that to get good sustained write performance numbers, I need to write at
least 2 GiB; otherwise the numbers are too optimistic because of caches and
various other effects (e.g. the difference between when the benchmark is
"done" and when the data is fully committed to the disk).

On the other hand, if the DSP can really get those performance numbers at such
a low CPU load compared to the ARM (84 MB/s @ 20% load, vs. 25 MB/s @ 99%
load), then perhaps a solution in my application is to make the DSP (which is
currently unused) control the SATA instead of the ARM (running Linux). Any
thoughts here?

>> Unfortunately, I have scoured the internet and have only found claims of
>> people getting the similar 25 MB/s number ... I've yet to see anyone
>> actually claim that they have gotten sustained SATA write performance
>> better than that, which makes me wonder if it's not a Linux kernel
>> problem, but just a limitation of the device.
>
> I think it is a Linux kernel limitation.

Any ideas (even conjecture?) where exactly the bottleneck might be? 84 MB/s @
20% load (DSP+BIOS) vs 25 MB/s @ 99% load (ARM+Linux) seems like a major
difference, even given that the parts are different architectures.

I've ruled out the effects of the Linux fs layer, as I've benchmarked against
the raw device (/dev/sda), a raw partition (/dev/sda1), ext2, ext3, and ext4.
Choice of filesystem or raw device only makes a slight (~10% to ~20%)
difference.

BTW, many of the major errors (as opposed to performance issues) seemed to be
caused by having older SOMs. Here is a thread about that in the TI forums:

http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/p/87625/309509.aspx#309509

Again, thanks for the info & discussion!
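
A minimal sketch of the kind of sustained-write measurement discussed above,
in Python. It is not the benchmark actually used for the numbers in this
thread; the function names, the 1 MiB chunk size, and the fill byte are
illustrative assumptions. It reports two times: when the last write() call
returned (data may still sit in the page cache) and when fsync() returned
(data handed to the device), which is the "done" vs. "committed" gap
mentioned above.

```python
import os
import time


def sustained_write(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes to path in chunks.

    Returns (bytes_written, seconds_until_last_write_returned,
    seconds_until_fsync_returned). All names here are hypothetical.
    """
    chunk = b"\xa5" * chunk_bytes  # arbitrary fill pattern (assumption)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    start = time.monotonic()
    try:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            # os.write may write fewer than n bytes; count what it reports.
            written += os.write(fd, chunk[:n])
        t_buffered = time.monotonic() - start
        # Force cached data out before taking the "committed" timestamp.
        os.fsync(fd)
        t_synced = time.monotonic() - start
    finally:
        os.close(fd)
    return written, t_buffered, t_synced


def mb_per_s(nbytes, seconds):
    """Throughput in MB/s (10**6 bytes per second)."""
    if seconds <= 0:
        return float("inf")
    return nbytes / 1e6 / seconds
```

For a run comparable to the 2 GiB figure above, total_bytes would be
2 * 1024**3 and the two rates would be mb_per_s(written, t_buffered) and
mb_per_s(written, t_synced). Pointing path at a raw device such as /dev/sda
overwrites whatever is on it.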
_______________________________________________
Davinci-linux-open-source mailing list
[email protected]
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source
