Hi Steve,
I just tried your patch - awesome!
I can almost watch a YouTube video in fullscreen now at a resolution of
1680x1050.
For testing performance, I think a relative comparison is enough for now,
unless someone wants to implement a proper way to measure it.
It's easy to find ways to stress the implementation and see whether it gets
faster or slower. In my case, I play YouTube videos of different qualities
in fullscreen at increasingly large resolutions.
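
If someone does want actual numbers, here is a minimal sketch of the kind of
timing loop Steve suggested: run a routine for a fixed number of iterations
and compare its CPU time against a baseline. Every name here (bench_fn,
bench_seconds, bench_speedup, count_calls) is hypothetical and not part of
FreeRDP; a real test would call the decoder on known RFX data instead of the
placeholder workload.

```c
#include <time.h>

/* Hypothetical signature for a unit of work to be timed. */
typedef void (*bench_fn)(void *ctx);

/* Placeholder workload: counts how often it was called.
 * A real test would decode a known RFX tile here instead. */
static void count_calls(void *ctx)
{
    unsigned long *counter = (unsigned long *)ctx;
    (*counter)++;
}

/* Returns the CPU time, in seconds, spent in `iterations` calls of fn(ctx). */
static double bench_seconds(bench_fn fn, void *ctx, unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++)
        fn(ctx);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Relative comparison: baseline time divided by candidate time.
 * Returns 0.0 if the candidate ran too quickly to be measured. */
static double bench_speedup(bench_fn baseline, bench_fn candidate,
                            void *ctx, unsigned long iterations)
{
    double t_base = bench_seconds(baseline, ctx, iterations);
    double t_cand = bench_seconds(candidate, ctx, iterations);
    return (t_cand > 0.0) ? t_base / t_cand : 0.0;
}
```

A ratio above 1.0 would mean the candidate is faster than the baseline.
clock() measures CPU time, so very short runs can round down to zero; use
enough iterations for the total to be measurable.
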
Non-fullscreen YouTube is already very smooth, but fullscreen is not smooth
yet, especially as resolutions get larger. The ideal goal would be to play a
Full HD video smoothly over RemoteFX in a Full HD RDP session.
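
As for option 1 quoted below, here is a rough sketch of what the
registration could look like: two functions with the same signature, a CPUID
check at initialization, and a function pointer set to whichever one
applies. The names (clamp_fn, clamp_generic, clamp_sse2, cpu_has_sse2,
select_clamp) are made up for illustration, and clamping 16-bit values to
bytes only stands in for the real conversion code. It assumes GCC or Clang,
which define __SSE2__ when SSE2 code generation is enabled (for example with
-msse2) and which ship <cpuid.h>.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __SSE2__
#include <cpuid.h>     /* __get_cpuid (GCC/Clang, x86 only) */
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical common signature shared by both implementations. */
typedef void (*clamp_fn)(const int16_t *src, uint8_t *dst, size_t n);

/* Portable version: clamp each 16-bit value to the range 0..255. */
static void clamp_generic(const int16_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int16_t v = src[i];
        dst[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}

#ifdef __SSE2__
/* SSE2 version: _mm_packus_epi16 saturates eight signed 16-bit values
 * to unsigned 8-bit values in a single instruction. */
static void clamp_sse2(const int16_t *src, uint8_t *dst, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i packed = _mm_packus_epi16(v, v);
        _mm_storel_epi64((__m128i *)(dst + i), packed); /* low 8 bytes */
    }
    clamp_generic(src + i, dst + i, n - i); /* remaining 0..7 values */
}

/* Runtime check: CPUID leaf 1 reports SSE2 in bit 26 of EDX. */
static int cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 26) & 1u);
}
#endif

/* Called once at initialization; the decoder keeps the returned pointer. */
static clamp_fn select_clamp(void)
{
#ifdef __SSE2__
    if (cpu_has_sse2())
        return clamp_sse2;
#endif
    return clamp_generic;
}
```

In this single-file form the #ifdef already implies SSE2 is allowed
everywhere, so the runtime check only matters once the SSE file is built
with -msse2 on its own and the rest of the library is not.
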
On Tue, Jun 7, 2011 at 10:29 PM, Marc-André Moreau <
marcandre.mor...@gmail.com> wrote:
> Hi Steve,
>
> Well, that was fast :) I had started thinking of the different ways we can
> integrate this SSE acceleration within the rest in a clean way. I see two
> major options:
>
> 1) The SSE code is part of the library, and can be disabled with a
> compile-time option. Methods for SSE and non-SSE have the same signature. At
> initialization, detection of SSE support level is performed, and a callback
> is registered to point to one of the methods (SSE or non-SSE).
>
> 2) The SSE code is part of a loadable sub-module, just like a plugin.
> Detection is done in the "main" code and SSE support functions are loaded
> and registered dynamically. The only issue here is that there is not much
> value to doing it this way, since SSE support level can be detected and used
> dynamically anyway, and SSE is only available on the Intel architecture. On
> systems where SSE is not available at all, such as ARM, the compile-time
> option should discard it completely.
>
> I think the part which needs to be abstracted is the "SIMD" system. The
> decoder should perform a check for available SIMD (SSE on Intel, NEON on
> ARM) and use it if available, but when compiling for either Intel or ARM,
> only the relevant implementation should actually be compiled in. I think it
> would make sense to put the SSE code in separate files which are only
> included if FreeRDP is compiled for the architecture which makes sense for
> it. An analogy could be made with the way we currently deal with multiple
> cryptographic libraries. We have one file for abstracting each crypto
> library for our use, and at compile time only one of them is compiled, just
> like SSE would be compiled for Intel only and NEON for ARM only.
>
> What would you think of option 1), done in a similar way to the current
> cryptographic abstraction layer code?
>
> On Tue, Jun 7, 2011 at 9:46 PM, S. Erisman <seris...@serisman.com> wrote:
>
>> Marc,
>>
>> On 6/6/2011 9:20 AM, Marc-André Moreau wrote:
>>
>>> I read more about SSE, and then about NEON, which is the equivalent for
>>> ARM.
>>>
>>> My first impression is damn, how could I not see this before? This thing
>>> looks very well suited not only for acceleration of RemoteFX decoding, but
>>> there's a chance that more GDI operations could be accelerated with it than
>>> the current implementation in xfreerdp. Color conversion also appears to be
>>> possible with it. If someone wants to work on something like this, let me
>>> know.
>>>
>>
>> I started working on adding SSE/SSE2 decoding support to the RemoteFX
>> library.
>>
>> I think there are several questions that still need to be answered on how
>> to best wire this up, but please review the attached .patch file to see what
>> I have working so far. This .patch file is based off of your recent changes
>> in the awakecoding/FreeRDP branch.
>>
>> As a starting place, I broke the YCbCr to RGB conversion code out of
>> rfx_decode_rgb and into a separate function. I then added an SSE
>> 'optimized' version of it. Also included is a file with the disassembly of
>> the rfx_decode.o file that clearly shows the difference between the 2
>> functions.
>>
>> One note... I had to use a ./configure CFLAGS="-O2 -msse2" command to get
>> this code to compile (the -O2 isn't actually needed, but cleans up the
>> assembled code). I think we would need to find a better way of
>> automatically handling this. Maybe a --with-sse flag that can be passed to
>> ./configure with #ifdef lines around SSE code? Help with how to set this
>> up would be appreciated.
>>
>> Then there are questions about structure. Should we break out SSE
>> optimizations into their own files and/or libraries, or leave them alongside
>> their non-SSE cousins?
>>
>> Lastly, is there a good way to test if and how much better these
>> optimizations actually are? I started messing around with gprof, sprof, and
>> oprofile, but I can't seem to get debug info out of the libfreerdp-rfx
>> static library. gprof works, but only records info on the xfreerdp
>> application and not on static libraries. I can't seem to get sprof or
>> oprofile working either. Maybe it is just the way I was using them, but is
>> there a better/easier way to profile this library? Or... maybe we could set
>> up a unit test with known RFX data that can be run through a number of
>> iterations and then time it?
>>
>> Any other thoughts?
>>
>> -Steve
>>
>> ------------------------------------------------------------------------------
>> EditLive Enterprise is the world's most technically advanced content
>> authoring tool. Experience the power of Track Changes, Inline Image
>> Editing and ensure content is compliant with Accessibility Checking.
>> http://p.sf.net/sfu/ephox-dev2dev
>> _______________________________________________
>> Freerdp-devel mailing list
>> Freerdp-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/freerdp-devel
>>
>>
>