Re: [Freerdp-devel] RemoteFX Profiler: First results...

S. Erisman Thu, 09 Jun 2011 09:03:20 -0700

Martin,

On 6/9/2011 7:09 AM, Martin Fleisz wrote:

One thing that will definitely hurt performance is if our memory is not 16-byte aligned. We should also have a possibility to overload the memory allocation in rfx_pool to use _mm_malloc/_mm_free to have correctly aligned buffers.

We should already be 16-byte memory aligned. I already modified the buffers to be aligned (look in rfx_context_init), and GCC automatically aligns the local __m128 variables. Looking at the disassembled code, GCC is emitting the aligned versions of the instructions. In fact, if we weren't aligned (and still used aligned instructions), we would be crashing with a seg fault or other exception (I have seen this in testing).
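For illustration, here is a minimal sketch of the aligned-buffer idea discussed above. It is not the actual rfx_pool or rfx_context_init code; the helper names alloc_aligned_floats and add_bias_aligned are hypothetical and only _mm_malloc, _mm_free and the SSE load/store intrinsics are real. It assumes an x86 target with SSE enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_load_ps, _mm_store_ps, _mm_malloc, _mm_free */

/* Hypothetical helper: allocate `count` floats on a 16-byte boundary.
 * Memory obtained this way must be released with _mm_free(), not free(). */
static float *alloc_aligned_floats(size_t count)
{
    return (float *)_mm_malloc(count * sizeof(float), 16);
}

/* Hypothetical helper: add `bias` to every element, four floats at a time.
 * `buf` must be 16-byte aligned and `count` a multiple of 4, because
 * _mm_load_ps/_mm_store_ps are the aligned forms and fault on a
 * misaligned address. */
static void add_bias_aligned(float *buf, size_t count, float bias)
{
    assert(((uintptr_t)buf & 15) == 0);
    assert(count % 4 == 0);

    const __m128 vbias = _mm_set1_ps(bias);
    for (size_t i = 0; i < count; i += 4) {
        __m128 v = _mm_load_ps(buf + i);              /* aligned load  */
        _mm_store_ps(buf + i, _mm_add_ps(v, vbias));  /* aligned store */
    }
}
```

If a buffer could not be guaranteed to be aligned, _mm_loadu_ps/_mm_storeu_ps would be the safe (but potentially slower) substitutes.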

I will make an attempt to implement an integer version of the code ... (I noticed that there seems to be no max/min instructions for 32-bit integers so it might not be that straightforward to get it working)

I actually worked on an integer version of the code last night (err, this morning). It is definitely faster than the floating point version on my machine, but (so far) has its own problems. The first problem, as you mentioned, is that there is no 32-bit integer min/max instruction until you get to SSE4, which I feel is too new to rely on (at least for my purposes). The approach I took is to use the 16-bit version of all instructions (available in SSE2). This has the advantage of half the memory requirement for the buffers and twice the throughput (because it can process 8 operations at a time instead of just 4). It also currently has a big disadvantage, however, in that we have to convert the buffers and supporting decoding routines to be uint16 based (from uint32). I must still have a bug in my attempt to do this conversion, as I am now getting some weird color artifacts (regardless of whether the original or the SSE version of the code is used). So, I either have a bug in the decoding routines that needs to be found, or 16-bit ints aren't big enough to hold all the information prior to color conversion.
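As a rough sketch of the two options mentioned above (not the code I am working on; the function names clamp_epi16 and min_epi32_sse2 are made up for this example): SSE2 has signed 16-bit min/max that handle 8 lanes per instruction, while a 32-bit min has to be pieced together from a compare and a mask select. Unaligned loads are used here only to keep the example independent of how the buffer was allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 integer intrinsics */

/* Hypothetical helper: clamp signed 16-bit values to [lo, hi],
 * 8 lanes per iteration. `count` must be a multiple of 8. */
static void clamp_epi16(int16_t *buf, size_t count, int16_t lo, int16_t hi)
{
    assert(count % 8 == 0);

    const __m128i vlo = _mm_set1_epi16(lo);
    const __m128i vhi = _mm_set1_epi16(hi);
    for (size_t i = 0; i < count; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        v = _mm_max_epi16(v, vlo);  /* raise values below lo */
        v = _mm_min_epi16(v, vhi);  /* lower values above hi */
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }
}

/* Hypothetical helper: signed 32-bit min using only SSE2, since the
 * native 32-bit min instruction is not available before SSE4.
 * Selects a where a < b, otherwise b. */
static __m128i min_epi32_sse2(__m128i a, __m128i b)
{
    __m128i a_lt_b = _mm_cmplt_epi32(a, b);  /* all-ones in lanes where a < b */
    return _mm_or_si128(_mm_and_si128(a_lt_b, a),
                        _mm_andnot_si128(a_lt_b, b));
}
```

The 16-bit path is one instruction per min or max; the emulated 32-bit path costs a compare plus three logical operations and still covers only 4 lanes.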

Since Vic wrote the original decoding routines (I think), maybe he can weigh in on whether 16-bit ints should be big enough for our buffers, or if they actually have to be 32-bit ints?

I will check in my integer version when I can verify that my approach will actually work. I probably won't be able to look at it again until later tonight.


Thanks,
 Steve


