Hi Eric,

On Wed, Jun 8, 2011 at 2:16 AM, S. Erisman <seris...@serisman.com> wrote:

> Marc,
>
> I took your suggestions into account, revised my earlier patch, and
> committed my changes to a new fork:
>    https://github.com/serisman/FreeRDP
>
> ... more comments below ...
>
>
> On 6/7/2011 9:29 PM, Marc-André Moreau wrote:
>
>> Hi Steve,
>>
>> Well, that was fast :) I had started thinking of the different ways we can
>> integrate this SSE acceleration within the rest in a clean way. I see two
>> major options:
>>
>> 1) The SSE code is part of the library, and can be disabled with a
>> compile-time option. Methods for SSE and non-SSE have the same signature. At
>> initialization, detection of SSE support level is performed, and a callback
>> is registered to point to one of the methods (SSE or non-SSE).
>>
> This is basically what my original patch was doing (minus having the
> compile-time option).
>
>
>  2) The SSE code is part of a loadable sub-module, just like a plugin.
>> detection is done in the "main" code and SSE support functions are loaded
>> and registered dynamically. The only issue here is that there is not much
>> value to doing it this way, since SSE support level can be detected and used
>> dynamically anyway, and SSE is only available on the intel architecture. On
>> systems where SSE is not available at all, such as ARM, the compile-time
>> option should discard it completely.
>>
> I agree with you that this seems like the wrong option.
>
>
>  I think the part which needs to be abstracted is the "SIMD" system. The
>> decoder should perform a check for available SIMD (SSE on intel, NEON on
>> ARM) and use it if available, but when compiling for either intel or arm,
>> only the relevant implementation should actually be compiled in. I think it
>> would make sense to put the SSE code in separate files which are only
>> included if FreeRDP is compiled for the architecture which makes sense for
>> it. An analogy could be made with the way we currently deal with multiple
>> cryptographic libraries. We have one file for abstracting each crypto
>> library for our use, and at compile time only one of then is compiled, just
>> like SSE would be compiled for intel only and NEON for arm only.
>>
>> What would you think of option 1), done in a similar way to the current
>> cryptographic abstraction layer code?
>>
> I have most of this done in my new fork (see above).  There is a configure
> option (--with-sse2) that is enabled by default that controls adding the
> sse2 .c/.h files and defining WITH_SSE2.  The _RFX_CONTEXT has a 'delegate'
> for the 1 method that has been optimized (so far).  The rfx_context_new
> method will set the delegate to the default non-sse software routine, and
> then call rfx_sse2_init if WITH_SSE2 is defined.  the rfx_sse2_init method
> then will check whether the CPU actually supports sse2 (future) and changes
> the delegate(s) to the sse optimized version(s). Eventually we could
> probably have the --with-sse2 config option smart enough to disable itself
> automatically based on which CPU arch was being compiled.  On the code side
> I separated all the SSE code into files in a new 'sse2' directory under
> libfreerdp-rfx.  What was done with SSE could easily be duplicated for NEON
> for the ARM architecture as well.
>
> Is this a good enough structure to be able to continue adding more
> optimizations?
>

I was about to implement similar changes, you're on the good track. Yes,
please keep going, I will likely improve on your code soon or I will at
least code review it before it gets merged.

I would suggest the following for detection:

Apparently there's a "cpuinfo" instruction which can be used to detect which
version of SSE is supported by the processor, if any. I think we should wrap
such a call to get the supported SSE level at run-time and register the
proper callback, instead of relying on a compile-time option for enabling or
disabling it.

Is SSE2 the best version of SSE available on your platform? I see there's
SSE3 and SSE4 as well, maybe we should implement SSE2, SSE3 and SSE4.

>
> Thanks,
>  Steve
>
------------------------------------------------------------------------------
EditLive Enterprise is the world's most technically advanced content
authoring tool. Experience the power of Track Changes, Inline Image
Editing and ensure content is compliant with Accessibility Checking.
http://p.sf.net/sfu/ephox-dev2dev
_______________________________________________
Freerdp-devel mailing list
Freerdp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freerdp-devel

Reply via email to