Hi Marc,

Thanks for your reply.

I have done some optimization myself, but all of it has been on TI DSP
processors, and TI has *very good* compiler tools (and also general
development tools). Now, the question is: is anyone aware of such tools
(or libraries) for Android?

Judging from my code (which I cannot post, for obvious reasons), I'm
pretty certain it is ugly on the memory usage side. It's really killing
the processor :(

I'll share what I can: I am basically loading two images (in float) and
filling in a result image, so that is three float images, the
equivalent of 4 * 3 = 12 byte images (assuming 4 bytes to a float,
4 bytes to an int, etc. -- standard type widths). So I have not one but
*three* images being accessed, and cache misses and swaps are happening.

What I am looking to do now is redesign the algorithm/data structures
to minimize cache misses by favoring locality, both temporal and
spatial. I wanted to see whether someone has "been there, done that"
and could provide some pointers.

> Java tends to be an allocate-as-you-need memory model, but for
> cache optimizations you need tighter control of memory usage.

Exactly! How do I get that level of control while developing for
Android, with either Java or native C?

Thanks so much,
Amit

On Aug 20, 12:40 am, Marc wrote:
> I have spent a lot of time optimizing signal processing algorithms
> using both ARM assembly and memory optimizations. The memory
> optimizations can be huge. I remember one project where the speed more
> than doubled once we were only using cache memory. That is, the entire
> working data set was less than 8k.
>
> The process is fairly straightforward: Determine the size of your
> cache, then use only that much memory. This typically requires doing
> tricks like making the output buffer overlap the input buffer.
> Basically you need to know exactly where every byte of memory is being
> used.
> Java tends to be an allocate-as-you-need memory model, but for
> cache optimizations you need tighter control of memory usage.
>
> On Aug 19, 11:47 am, DanH wrote:
> > "Hate to sound like I'm harping on the same stuff, but then (assuming
> > that the JVM/JIT compiler is doing good enough), the memory bottleneck
> > still remains."
> >
> > Yep, much of our effort on iSeries went into the memory bottleneck
> > area. E.g., we got fairly astounding improvements (ca. 20%) when we
> > "packed" objects so that the fields of "SubclassOfA" filled in the
> > "holes" left from aligning the fields of "A". And even more
> > improvement by packing the char array owned by a String into the
> > String and arranging it so that the two shared a single header.
> >
> > (BTW, with regard to alignment, note that most processors can handle,
> > e.g., unaligned ints and longs, but often the storage accesses are
> > several times longer if unaligned, so alignment may be very important,
> > even if "unnecessary".)
> >
> > On Aug 19, 12:54 pm, Amit wrote:
> > > Hi Dan,
> > >
> > > Thanks for the response
> > >
> > > > In general, JITed Java code is as fast as or faster than the
> > > > equivalent native code, if the JIT is reasonably good, and if the
> > > > specific application can be coded efficiently in Java.
> > >
> > > I was actually banking on this. I don't know too much of the hairy
> > > details (am not really a compiler person), but from what I have read
> > > recent improvements by Google to the Dalvik VM make it *comparable* if
> > > not equal in performance to native code ...
> > >
> > > Hate to sound like I'm harping on the same stuff, but then (assuming
> > > that the JVM/JIT compiler is doing good enough), the memory bottleneck
> > > still remains.
> > >
> > > Thanks,
> > > Amit
> > >
> > > On Aug 19, 10:11 pm, DanH wrote:
> > > > In general, JITed Java code is as fast as or faster than the
> > > > equivalent native code, if the JIT is reasonably good, and if the
> > > > specific application can be coded efficiently in Java. The problem is
> > > > that some specific data processing patterns are not easy to code
> > > > efficiently in Java, and I suspect that certain of the bit-bashing
> > > > algorithms used in image processing fall into this category.
> > > >
> > > > In such cases the most efficient approach is "native Java", but I only
> > > > know of one JVM (the IBM iSeries "classic" JVM) that permits this, and
> > > > then only for system code. Otherwise it's a bit of a tradeoff to get
> > > > the right partitioning between Java and native, since crossing the
> > > > Java/native boundary tends to be relatively expensive.
> > > >
> > > > On Aug 19, 7:03 am, Fabrizio Giudici wrote:
> > > > > -----BEGIN PGP SIGNED MESSAGE-----
> > > > > Hash: SHA1
> > > > >
> > > > > On 8/19/10 13:35 , Amit wrote:
> > > > > > Now, I know that native code will *not* yield any significant
> > > > > > performance improvement over Java code
> > > > >
> > > > > Well, specifically for image processing this won't be true, for sure
> > > > > up to 2.1 included (as the bytecode is purely interpreted); in 2.2 we
> > > > > have JIT, but can't speak as I haven't seen it yet.
> > > > >
> > > > > - --
> > > > > Fabrizio Giudici - Java Architect, Project Manager
> > > > > Tidalwave s.a.s. - "We make Java work. Everywhere."
> > > > > java.net/blog/fabriziogiudici - www.tidalwave.it/people
> > > > > -----BEGIN PGP SIGNATURE-----
> > > > > Version: GnuPG/MacGPG2 v2.0.14 (Darwin)
> > > > > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> > > > >
> > > > > iEYEARECAAYFAkxtHakACgkQeDweFqgUGxe83wCfSDP1NEN+TLD0iOCZ/zSvQDRw
> > > > > I5cAoJOEoC7eREU5KuPU7m93/GDj9VUr
> > > > > =2ZDf
> > > > > -----END PGP SIGNATURE-----
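
For anyone reading this thread later, here is a minimal Java sketch of the idea discussed above: size the working set to the cache, and let the output buffer overlap an input buffer. It is only an illustration under stated assumptions, not the code from this thread. The class name StripProcessor, the method names, the 8 KB budget (borrowed from Marc's example) and the two per-pixel passes (difference, then absolute value) are all hypothetical stand-ins for whatever the real algorithm does.

```java
// Hypothetical sketch: process two float images strip by strip so that the
// data touched by both passes stays within an assumed cache budget.
final class StripProcessor {
    // Assumption: 4 bytes per float, as stated in the post above.
    static final int BYTES_PER_FLOAT = 4;
    // Assumption: cache budget taken from Marc's "less than 8k" example.
    static final int DEFAULT_CACHE_BYTES = 8 * 1024;

    // Number of floats per strip such that one strip from each of
    // `imagesInFlight` images fits in `cacheBytes` (at least 1).
    static int stripLength(int cacheBytes, int imagesInFlight) {
        int floats = cacheBytes / (BYTES_PER_FLOAT * imagesInFlight);
        return Math.max(1, floats);
    }

    // Computes |a[i] - b[i]| and stores the result in a, so the output
    // overlaps an input and only two buffers are touched instead of three.
    static void combineInPlace(float[] a, float[] b, int cacheBytes) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("images must have the same size");
        }
        int strip = stripLength(cacheBytes, 2);
        for (int start = 0; start < a.length; start += strip) {
            int end = Math.min(start + strip, a.length);
            // Pass 1: difference, written over the strip of a.
            for (int i = start; i < end; i++) {
                a[i] = a[i] - b[i];
            }
            // Pass 2: runs while the same strip of a is still likely cached.
            for (int i = start; i < end; i++) {
                a[i] = Math.abs(a[i]);
            }
        }
    }
}
```

A call such as StripProcessor.combineInPlace(imageA, imageB, StripProcessor.DEFAULT_CACHE_BYTES) reuses arrays allocated once up front, which avoids per-frame allocation. Whether strips actually help depends on the real algorithm: for a single pointwise pass a plain linear loop is already sequential, so the benefit only shows up when several passes revisit the same data.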

