On Friday, September 30, 2011, Jim Kukunas <james.t.kuku...@linux.intel.com> wrote:
> On Fri, Sep 30, 2011 at 12:08:03AM -0300, Gustavo Sverzut Barbieri wrote:
>> On Thursday, September 29, 2011, Jim Kukunas <james.t.kuku...@linux.intel.com> wrote:
>> > Hi Folks,
>> >
>> > This patch series introduces an SSE3 implementation of Evas's common
>> > engine blending routines.
>> >
>> > Why SSE3?:
>> > The lddqu instruction, introduced in SSE3, is faster than a typical
>> > unaligned load in the situation where we load from, but not store to,
>> > an unaligned address which crosses a cache line. This lends itself well
>> > to the blending functions which operate on two separate arrays. We single
>> > step until we obtain an aligned address for the destination array, and use
>> > lddqu to load the other unaligned array.
>> >
>> > Why do we need an SSE implementation?:
>> > GCC does perform some auto-vectorization, but misses a lot of
>> > opportunities for leveraging SSE, specifically when operating on
>> > packed integers, as opposed to floating-point. With GCC 4.6.0 and
>> > the CFLAGS listed below, the C implementation isn't vectorized, and
>> > the MMX implementation performance is suboptimal.
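
For anyone following along, the "single step until the destination is aligned, then lddqu the source" idea from the "Why SSE3?" paragraph above could be sketched roughly as below. This is not code from the patch series: add_span and add_pixel are hypothetical names, and a saturating per-channel add stands in for the real Evas blend math. The only thing it tries to show is the loop structure, and it assumes GCC or Clang for the target attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <pmmintrin.h> /* SSE3: _mm_lddqu_si128 */
#define HAVE_SSE3_SKETCH 1
#define SSE3_TARGET __attribute__((target("sse3")))
#else
#define SSE3_TARGET
#endif

/* Placeholder per-pixel operation (NOT the Evas blend):
 * saturating add of each 8-bit channel. */
static uint32_t add_pixel(uint32_t s, uint32_t d)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = ((s >> shift) & 0xff) + ((d >> shift) & 0xff);
        out |= (c > 0xff ? 0xffu : c) << shift;
    }
    return out;
}

/* Hypothetical helper: dst[i] = saturating_add(src[i], dst[i]). */
SSE3_TARGET
static void add_span(const uint32_t *src, uint32_t *dst, size_t len)
{
    size_t i = 0;

    assert(len == 0 || (src != NULL && dst != NULL));

    /* Single-step until the destination is 16-byte aligned. */
    while (i < len && ((uintptr_t)(dst + i) & 15) != 0) {
        dst[i] = add_pixel(src[i], dst[i]);
        i++;
    }

#ifdef HAVE_SSE3_SKETCH
    /* Four pixels per iteration: aligned load/store on dst,
     * lddqu for the (possibly unaligned) source. */
    for (; i + 4 <= len; i += 4) {
        __m128i s = _mm_lddqu_si128((const __m128i *)(src + i));
        __m128i d = _mm_load_si128((const __m128i *)(dst + i));
        _mm_store_si128((__m128i *)(dst + i), _mm_adds_epu8(s, d));
    }
#endif

    /* Scalar tail (or the whole span on non-x86 builds). */
    for (; i < len; i++)
        dst[i] = add_pixel(src[i], dst[i]);
}
```

Only the destination gets the aligned load and store; the source is read but never written, which is the case the original mail says lddqu is meant for.
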
>> >
>> > A few tests which demonstrate the performance impact:
>> >
>> > Setup:
>> > Intel Atom N270, Intel 945GME, Expedite Xlib engine
>> > GCC 4.5.1 CFLAGS=-m32 -mtune=atom -O2 -msse3
>> >
>> > Rect Blend:
>> > C:    21.80 FPS +/- 0.028674
>> > MMX:  27.41 FPS +/- 0.021344
>> > SSE3: 46.90 FPS +/- 0.376106
>> >
>> > Image Blend Fade Unscaled:
>> > C:    15.46 FPS +/- 0.031314
>> > MMX:  24.92 FPS +/- 0.055902
>> > SSE3: 34.28 FPS +/- 0.099457
>> >
>> > Image Blend Solid Fade Unscaled:
>> > C:    22.03 FPS +/- 0.097125
>> > MMX:  33.78 FPS +/- 0.190351
>> > SSE3: 46.86 FPS +/- 0.437874
>> >
>> > Setup:
>> > Intel Atom N455, Intel GMA 3150, Expedite Xlib engine
>> > GCC 4.6.0 CFLAGS=-m32 -mtune=atom -O2 -msse3
>> >
>> > Rect Blend:
>> > C:    32.68 FPS +/- 0.218510
>> > MMX:  29.75 FPS +/- 0.527105
>> > SSE3: 54.24 FPS +/- 0.870486
>> >
>> > Image Blend Unscaled:
>> > C:    32.73 FPS +/- 0.359036
>> > MMX:  35.00 FPS +/- 1.099517
>> > SSE3: 50.93 FPS +/- 0.990806
>> >
>> > Image Blend Occlude 3 Many:
>> > C:    24.25 FPS +/- 0.213135
>> > MMX:  25.87 FPS +/- 0.470124
>> > SSE3: 36.96 FPS +/- 0.505757
>> >
>> > I'm sure there is further room for improvement.
>> >
>> > Let me know what you guys think.
>>
>> I think it is amazing! We were already very fast but it was improved and can
>> be improved even more. Excellent to have Intel folks hacking EFL :-)
>
> Thanks.
>
>> Now I wonder whether you'll try with icc and if it's supposed to yield
>> better performance than gcc
>
> I wasn't planning on trying with icc. There is definitely room for GCC
> to generate better code for the SSE3 routines, and I'm not sure if ICC
> does or not. Either way, optimizing for GCC reaches a wider audience.

Sure, just wondering about the results and whether Intel had plans to make
EFL work with ICC :-) Most people will likely still use gcc anyway, but it's
good to know.

>> Last but not least what's your target driver for gl/composite? Is it powervr
>> based? Or the intel one with open drivers?
>
> All of my tests were conducted with Intel integrated graphics running
> the open source drivers.

But you ran the software engine, not gl.

>> > Thanks.
>> >
>> > ------------------------------------------------------------------------------
>> > All the data continuously generated in your IT infrastructure contains a
>> > definitive record of customers, application performance, security
>> > threats, fraudulent activity and more. Splunk takes this data and makes
>> > sense of it. Business sense. IT sense. Common sense.
>> > http://p.sf.net/sfu/splunk-d2dcopy1
>> > _______________________________________________
>> > enlightenment-devel mailing list
>> > enlightenment-devel@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>
> --
> Jim Kukunas
> Intel Open Source Technology Center

--
Gustavo Sverzut Barbieri
http://profusion.mobi embedded systems
--------------------------------------
MSN: barbi...@gmail.com
Skype: gsbarbieri
Mobile: +55 (19) 9225-2202