FYI

-------- Original Message --------
Subject: {DirectFB} Core: Experimental gfxcard.c replacement code with huge optimization
Date: 13 Nov 2011 14:07:05 +0100
From: d...@directfb.org
To: directfb-...@directfb.org

New branch 'accel1' available with the following commits:
http://git.directfb.org/?p=core/DirectFB.git;a=commit;h=92c15f0612c55df9b95a83fb0d19804e4718fa48
commit 92c15f0612c55df9b95a83fb0d19804e4718fa48
Author: Denis Oliver Kropp <d...@directfb.org>
Date:   Mon Oct 31 23:44:07 2011 +0100

    Core: Experimental gfxcard.c replacement code with huge optimization

    This implementation does not lock/unlock buffers for each operation,
    but does a lazy state and lock management. If one calls SetColor and
    FillRectangles in a row it will only call SetState() of the driver,
    no unlock/lock of buffers etc.

    To achieve this there's a new dispatch cleanup handler added that
    is called before the next read() from Fusion device, to unlock the
    currently locked buffers and exit the currently active state.
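
The lazy scheme described above can be sketched roughly as follows. This is only an illustration, not the code from the commit: LazyGfx and the lazy_* functions are hypothetical names, and the counters stand in for the real buffer lock/unlock and driver SetState() calls. It assumes a batch of calls is processed before the dispatcher goes back to read(), which is where the cleanup handler runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the DirectFB API. */
typedef struct {
    bool     locked;          /* destination buffer currently locked?    */
    bool     state_dirty;     /* state change not yet sent to driver?    */
    bool     cleanup_armed;   /* cleanup handler registered for batch?   */
    unsigned color;
    int      lock_calls;      /* stand-in for real buffer locks          */
    int      unlock_calls;    /* stand-in for real buffer unlocks        */
    int      set_state_calls; /* stand-in for the driver's SetState()    */
    int      fills;           /* rendering operations issued             */
} LazyGfx;

/* State setters only record the change; nothing is locked or flushed. */
static void lazy_set_color(LazyGfx *g, unsigned color)
{
    assert(g != NULL);
    if (g->color != color) {
        g->color = color;
        g->state_dirty = true;
    }
}

/* Rendering locks at most once per batch and pushes state only if dirty. */
static void lazy_fill_rectangle(LazyGfx *g)
{
    assert(g != NULL);
    if (!g->locked) {
        g->locked = true;
        g->lock_calls++;
        g->cleanup_armed = true;  /* make sure we unlock before next read */
    }
    if (g->state_dirty) {
        g->set_state_calls++;
        g->state_dirty = false;
    }
    g->fills++;
}

/* Called by the dispatcher before it blocks in the next read(). */
static void lazy_dispatch_cleanup(LazyGfx *g)
{
    assert(g != NULL);
    if (!g->cleanup_armed)
        return;
    if (g->locked) {
        g->locked = false;
        g->unlock_calls++;
    }
    g->cleanup_armed = false;
}

/* One batch: 'ops' SetColor/FillRectangle pairs, then the cleanup. */
static LazyGfx lazy_demo(int ops)
{
    LazyGfx g = {0};
    for (int i = 0; i < ops; i++) {
        lazy_set_color(&g, 0xff000000u + (unsigned)i);
        lazy_fill_rectangle(&g);
    }
    lazy_dispatch_cleanup(&g);
    return g;
}
```

Calling lazy_demo(10) from a small main() would show the intended effect: ten fills and ten state updates, but only a single lock and a single unlock for the whole batch.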

    There's still a lot of code to move from gfxcard.c, namely the actual
    rendering, but it may be worth thinking about a rework to support all kinds
    of cases with/without hardware matrix/clipping, different primitives etc.

    The performance boost is awesome, up to 20x for some tests I ran.

    Here are a few results:

    Benchmarking 100x100 on 852x464 RGB32 (32bit)...

    Anti-aliased Text                              3.001 secs ( 3637.187 KChars/sec) [ 19.6%]
    Anti-aliased Text (blend)                      3.015 secs (  489.552 KChars/sec) [  4.9%]
    Fill Rectangle                                 3.003 secs ( 3303.030 MPixel/sec) [ 24.0%]
    Fill Rectangle (blend)                         3.066 secs (  181.343 MPixel/sec) [  1.6%]
    Fill Rectangles [10]                           3.018 secs ( 3479.125 MPixel/sec) [  4.6%]
    Fill Rectangles [10] (blend)                   3.351 secs (  182.035 MPixel/sec) [  0.2%]
    Blit                                           3.005 secs ( 3346.422 MPixel/sec) [ 16.3%]
    Blit 180                                       3.014 secs ( 1379.230 MPixel/sec) [  7.9%]
    Blit colorkeyed                                3.015 secs ( 1271.973 MPixel/sec) [  7.3%]
    Blit destination colorkeyed                    3.012 secs ( 1403.054 MPixel/sec) [  7.9%]
    Blit with format conversion                    3.059 secs (  322.000 MPixel/sec) [  1.9%]
    Blit with colorizing                           3.061 secs (  189.807 MPixel/sec) [  1.9%]
    Blit from 32bit (blend)                        3.059 secs (  323.635 MPixel/sec) [  1.9%]
    Blit from 32bit (blend) with colorizing        3.126 secs (   86.372 MPixel/sec) [  0.9%]
    Blit SrcOver (premultiplied source)            3.037 secs (  526.506 MPixel/sec) [  3.3%]
    Blit SrcOver (premultiply source)              3.035 secs (  548.599 MPixel/sec) [  3.3%]

    Compared to the old code:

    Benchmarking 100x100 on 852x464 RGB32 (32bit)...

    Anti-aliased Text                              3.009 secs (  926.021 KChars/sec) [ 18.0%]
    Anti-aliased Text (blend)                      3.015 secs (  427.462 KChars/sec) [  4.9%]
    Fill Rectangle                                 3.010 secs (  655.813 MPixel/sec) [ 40.5%]
    Fill Rectangle (blend)                         3.069 secs (  171.391 MPixel/sec) [  2.2%]
    Fill Rectangles [10]                           3.019 secs ( 3093.739 MPixel/sec) [  5.6%]
    Fill Rectangles [10] (blend)                   3.326 secs (  180.396 MPixel/sec) [  0.3%]
    Blit                                           3.037 secs (  466.249 MPixel/sec) [  6.6%]
    Blit 180                                       3.051 secs (  406.751 MPixel/sec) [  5.5%]
    Blit colorkeyed                                3.046 secs (  397.570 MPixel/sec) [  5.2%]
    Blit destination colorkeyed                    3.030 secs (  571.287 MPixel/sec) [  8.2%]
    Blit with format conversion                    3.079 secs (  220.850 MPixel/sec) [  2.2%]
    Blit with colorizing                           3.072 secs (  131.510 MPixel/sec) [  2.2%]
    Blit from 32bit (blend)                        3.097 secs (  188.246 MPixel/sec) [  2.2%]
    Blit from 32bit (blend) with colorizing        3.136 secs (   77.487 MPixel/sec) [  0.9%]
    Blit SrcOver (premultiplied source)            3.078 secs (  253.411 MPixel/sec) [  2.9%]
    Blit SrcOver (premultiply source)              3.068 secs (  265.319 MPixel/sec) [  2.9%]

    Compared to new code, but running as master (new mechanism leverages async FusionCalls):

    Benchmarking 100x100 on 852x464 RGB32 (32bit)...

    Anti-aliased Text                              3.000 secs ( 1582.800 KChars/sec) [ 99.3%]
    Anti-aliased Text (blend)                      3.003 secs (  402.797 KChars/sec) [ 99.6%]
    Fill Rectangle                                 3.000 secs ( 1978.000 MPixel/sec) [ 99.6%]
    Fill Rectangle (blend)                         3.001 secs (  172.609 MPixel/sec) [ 99.6%]
    Fill Rectangles [10]                           3.002 secs ( 3214.523 MPixel/sec) [ 99.6%]
    Fill Rectangles [10] (blend)                   3.049 secs (  180.387 MPixel/sec) [ 99.6%]
    Blit                                           3.001 secs (  522.159 MPixel/sec) [ 99.3%]
    Blit 180                                       3.000 secs (  424.333 MPixel/sec) [ 99.6%]
    Blit colorkeyed                                3.002 secs (  413.724 MPixel/sec) [ 99.3%]
    Blit destination colorkeyed                    3.001 secs (  615.794 MPixel/sec) [ 99.3%]
    Blit with format conversion                    3.000 secs (  225.333 MPixel/sec) [ 99.6%]
    Blit with colorizing                           3.003 secs (  143.856 MPixel/sec) [ 99.6%]
    Blit from 32bit (blend)                        3.002 secs (  207.861 MPixel/sec) [ 99.3%]
    Blit from 32bit (blend) with colorizing        3.006 secs (   74.184 MPixel/sec) [ 99.6%]
    Blit SrcOver (premultiplied source)            3.001 secs (  274.908 MPixel/sec) [ 99.0%]
    Blit SrcOver (premultiply source)              3.000 secs (  286.333 MPixel/sec) [ 99.6%]

    YES, running as master is slower than running as a slave, because the
    master does not go via FusionCall!

 lib/fusion/fusion.c                 |   68 +++++
 lib/fusion/fusion.h                 |   19 ++
 lib/fusion/fusion_internal.h        |    2 +
 src/core/CoreGraphicsState_real.cpp |  465 +++++++++++++++++++++++++++++++++-
 src/core/gfxcard.c                  |    2 +-
 src/core/graphics_state.h           |   18 +-
 src/core/state.h                    |    6 +-
 src/gfx/clip.h                      |    2 +-
 src/gfx/generic/generic.c           |  197 +++++++++++-----
 src/gfx/generic/generic.h           |    3 +-
 10 files changed, 704 insertions(+), 78 deletions(-)

http://git.directfb.org/?p=core/DirectFB.git;a=commit;h=364bbdb150032eed9d69e5a0acfdd6f976a38770
commit 364bbdb150032eed9d69e5a0acfdd6f976a38770
Author: Denis Oliver Kropp <d...@directfb.org>
Date:   Sun Nov 13 13:49:45 2011 +0100

    Core: Use new "queue" property for rendering and state setting methods.

    Surface setters are excluded, because out-of-order execution could drop
    references right after blitting from a surface but before flushing.

 src/core/CoreGraphicsState.flux |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

_______________________________________________
directfb-cvs mailing list
directfb-...@directfb.org
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-cvs

_______________________________________________
directfb-dev mailing list
directfb-dev@directfb.org
http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
