Re: Optimization idea: soft XvPutImage

2008-09-22 Thread Adam Richter
--- On Mon, 9/22/08, Michel Dänzer [EMAIL PROTECTED] wrote:


> On Sun, 2008-09-21 at 13:58 +0200, Soeren Sandmann wrote:
> > As other people pointed out, XRender does allow arbitrary 3x3
> > transformations of source images, but you are right that the XRender
> > protocol would require putting the data in a drawable first.
>
> There could be a generic XVideo adaptor which uses RENDER internally.
> The Xgl code might already have something like that.

Wow!  Right you are.  xorg-server-1.5/src/hw/xgl/xglxv.c.  At first glance, at
least, it looks like it should be readily portable from Xgl to fb, as the
Xgl-specific stuff appears to be contained mostly in well-labelled macros.
Thank you very much for pointing this out.

Adam Richter



  
___
xorg mailing list
xorg@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/xorg


Re: Optimization idea: soft XvPutImage

2008-09-21 Thread Adam Richter
On Fri, 9/19/08, Soeren Sandmann [EMAIL PROTECTED] wrote:
> Adam Richter [EMAIL PROTECTED] writes:
>
> > I want to suggest a way we could eliminate a substantial
> > amount of data copying [...]
> [...]
> Pixman, the software implementation of XRender already has support for
> YUV formats, so all that is really required is to just export YUV
> picture formats through the XRender protocol. [...]

Thank you for pointing out that pixman has some limited YUV reading support
already.  The biggest problem that I see with using the X Render extension is
that it lacks stretch and shrink, at least if I understand correctly from
looking at the protocol specification here:

http://gitweb.freedesktop.org/?p=xorg/proto/renderproto.git;a=blob_plain;f=renderproto.txt
See lines 758-766:

Composite

        op:             PICTOP
        src:            PICTURE
        mask:           PICTURE or None
        dst:            PICTURE
        src-x, src-y:   INT16
        mask-x, mask-y: INT16
        dst-x, dst-y:   INT16
        width, height:  CARD16

The last two parameters (width and height) presumably apply to both source and 
destination rather than having separate parameters for the source and 
destination rectangles.

This also appears to be the case when I look in the header for the pixman
library (pixman-0.12/pixman/pixman.h) at the declaration of pixman_blt.  It
also has only a single width and height, which presumably apply to both
source and destination.

Even if you do not want to do stretch, I believe that the X Render extension
would require first copying the YUV data to a drawable and then doing a
drawable-drawable block transfer operation to do the YUV transformation.  In
comparison, XvPutImage is a single call that takes an XImage, which can be in
shared memory and would normally be in YUV, and that specifies the YUV-to-RGB
conversion and stretch in a single operation.

Thanks for your input, especially the tip about some YUV support already 
existing in libpixman.

Adam Richter




  


Re: Optimization idea: soft XvPutImage

2008-09-21 Thread Maarten Maathuis
On Sun, Sep 21, 2008 at 11:10 AM, Adam Richter
[EMAIL PROTECTED] wrote:
> Thank you for pointing out that pixman has some limited YUV reading support
> already.  The biggest problem that I see with using the X Render is that it
> lacks stretch and shrink, at least if I understand correctly from looking at
> the protocol specification here:
> [...]
> The last two parameters (width and height) presumably apply to both source
> and destination rather than having separate parameters for the source and
> destination rectangles.
> [...]


Src and Mask pictures have a transform, which can translate, rotate and
scale coordinates as you please.


Re: Optimization idea: soft XvPutImage

2008-09-21 Thread Daniel Stone
On Sun, Sep 21, 2008 at 02:10:07AM -0700, Adam Richter wrote:
> Thank you for pointing out that pixman has some limited YUV reading support
> already.  The biggest problem that I see with using the X Render is that it
> lacks stretch and shrink, at least if I understand correctly from looking at
> the protocol specification here:

Render also allows you to apply a transformation matrix to pictures, so
you can scale with that.
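
As a rough illustration of the transform approach (a sketch only: the helper name scale_transform is hypothetical, and it assumes Render's convention that a picture transform maps destination coordinates back to source coordinates, with matrix entries in 16.16 fixed point), the matrix for stretching a 720x480 source to 1920x1200 could be built like this:

```python
def scale_transform(src_w, src_h, dst_w, dst_h):
    """Build a 3x3 matrix of 16.16 fixed-point values that maps
    destination pixel coordinates back to source coordinates."""
    one = 1 << 16  # 1.0 in 16.16 fixed point
    sx = (src_w * one) // dst_w  # source pixels advanced per destination pixel
    sy = (src_h * one) // dst_h
    return [
        [sx, 0, 0],
        [0, sy, 0],
        [0, 0, one],
    ]


# Stretch a DVD-sized picture to a 1920x1200 destination.
m = scale_transform(720, 480, 1920, 1200)
```

In a C client, a matrix like this would be set on the source picture with XRenderSetPictureTransform before issuing the Composite request.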

Cheers,
Daniel



Re: Optimization idea: soft XvPutImage

2008-09-21 Thread Soeren Sandmann
Adam Richter [EMAIL PROTECTED] writes:

> Even if you do not want to do stretch, I believe that the X Render
> extension would require first copying the YUV data to a drawable and
> then doing a drawable-drawable block transfer operation to do the
> YUV transformation.  In comparison, XvPutImage is a single call
> that takes an XImage, which can be in shared memory, and would normally
> be in YUV, and specifies the YUV-RGB conversion and stretch in a
> single operation.

As other people pointed out, XRender does allow arbitrary 3x3
transformations of source images, but you are right that the XRender
protocol would require putting the data in a drawable first.

A shared memory pixmap would be a possibility, perhaps, though the
shared memory extension should eventually be replaced with something
based on the DRM memory manager.



Soren


Optimization idea: soft XvPutImage

2008-09-17 Thread Adam Richter
I want to suggest a way we could eliminate a substantial
amount of data copying when playing video on X servers that do not
provide hardware video windows, including servers that offer the X
shared memory extension.  In common situations, I suspect that this
could reduce memory bus utilization for playing video by more than a
factor of two.

I do not know if I have time to implement this optimization
right now, but I think it is potentially a big enough benefit that I
really ought to describe it here in case someone else wants to
implement it or can relieve me of thinking about it by showing me why
it will not work.

The copy operation that I want to eliminate occurs when the X
server reads data from XPutImage (usually via a shared memory area)
and copies it to the frame buffer.  The amount of data copied is
particularly large because the image is often stretched from its
native dimensions (720x480 for DVD, for example) to the dimensions of
the display area (for example, 1920x1200 for full screen video on a
24-inch panel).
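
To put rough numbers on the size difference, here is a back-of-the-envelope sketch. It assumes 2 bytes per pixel for packed yuv422 and 4 bytes per pixel for the stretched RGB frame; the second figure is an assumption about the frame buffer depth, and the helper name is hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


src = frame_bytes(720, 480, 2)    # unstretched packed yuv422
dst = frame_bytes(1920, 1200, 4)  # stretched frame, assuming 32-bit RGB
print(src, dst, round(dst / src, 1))  # prints: 691200 9216000 13.3
```

Under these assumptions the stretched RGB frame is roughly thirteen times the size of the YUV frame it was made from.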

To eliminate this copy, I want the X server to receive the
unstretched YUV image by XvPutImage provided by the Xvideo-2.2 (Xv)
extension, as is done for video display hardware that provides video
windows, which typically do the YUV-to-RGB conversion and stretch in the
display hardware.  In this proposed Xv driver, which I will refer to as
"soft XvPutImage", the YUV-to-RGB and stretch operations would have to be
done in software by the X server, just as they are currently done in
software by video playing programs.  The difference is that by
combining this operation with the X server receiving the image, a big
copy operation is eliminated that might plausibly account for more
than half of the memory bus utilization in some common video playing
scenarios.
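
A minimal sketch of the combined operation follows, under several assumptions that are not part of the proposal itself: the input is packed YUY2 (bytes ordered Y0 U Y1 V) with an even width, the stretch is nearest-neighbour, and the colour conversion uses the common integer BT.601 video-range formula. The function names are hypothetical, and a real server would write into the frame buffer rather than build Python lists.

```python
def clamp8(v):
    """Clamp an integer to the 0..255 range."""
    return max(0, min(255, v))


def yuv_to_rgb(y, u, v):
    """Integer BT.601 video-range YUV to 8-bit RGB."""
    c, d, e = y - 16, u - 128, v - 128
    r = clamp8((298 * c + 409 * e + 128) >> 8)
    g = clamp8((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = clamp8((298 * c + 516 * d + 128) >> 8)
    return r, g, b


def soft_put_image(yuy2, src_w, src_h, dst_w, dst_h):
    """Convert and stretch packed YUY2 in a single pass.

    Each destination pixel is computed straight from the source bytes,
    so no intermediate full-size RGB image is created and copied.
    Returns dst_h rows of dst_w (r, g, b) tuples.
    """
    out = []
    for dy in range(dst_h):
        sy = dy * src_h // dst_h          # nearest source row
        row = []
        for dx in range(dst_w):
            sx = dx * src_w // dst_w      # nearest source column
            pair = (sy * src_w + (sx & ~1)) * 2  # byte offset of the 2-pixel group
            y = yuy2[pair + 2 * (sx & 1)]        # Y0 or Y1
            u = yuy2[pair + 1]                   # chroma shared by both pixels
            v = yuy2[pair + 3]
            row.append(yuv_to_rgb(y, u, v))
        out.append(row)
    return out


# One black and one white source pixel, stretched to 4x2.
frame = bytes([16, 128, 235, 128])
rows = soft_put_image(frame, 2, 1, 4, 2)
```

The point of the sketch is only the loop structure: the source is read once at its native size, and the stretched RGB pixels are written once at their final location.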

I realize that most modern video hardware has YUV/stretch
video window capabilities or other hardware acceleration for this
operation (for example, in hardware 3D operations), but there are
common cases in practice where this optimization should be useful:

1) Improving the capabilities of the weakest systems would
   allow video to be used more ubiquitously (for example,
   adding video-based tutorials to larger application suites
   might become more common).

2) Many open source drivers lack this YUV/stretch capability
   even if the hardware has it, due to lack of public
   documentation or slow development in comparison to the
   life cycle of the hardware, even though efforts to address
   these problems are definitely helping.

3) The following scenarios may fall under #1 or #2, but are
   worth separate mention:

   a) On systems with more processor cores (typically ones
      which have YUV/stretch hardware but lack drivers),
      memory bus utilization will be especially important.

   b) Fake X servers, such as for VNC or when running
      on a virtualized computer, are less likely to have
      access to acceleration hardware (although it is
      possible).

   c) There are those who believe that 3D acceleration
      hardware will be traded off for more CPU cores in
      typical systems of the future.  So, at least for the
      case of playing video through a 3D effect, this
      optimization may help.  See, for example, the
      "Twilight of the GPU" interview on slashdot yesterday
      at http://tech.slashdot.org/tech/08/09/15/2116240.shtml .

4) There are also a couple of cases of small benefit I will note
   for completeness:

   a) For video with a slow frame rate playing on a monitor
      with a high refresh rate where the frame buffer and
      video window are part of system memory (i.e., no video
      RAM), where pixels in the frame buffer under the video
      window are still fetched for chroma key comparison,
      soft XvPutImage might actually use less bandwidth than
      a YUV/stretch video window.

   b) Not part of this proposal, but a similar idea for
      systems that have Xv but lack XvMC would be SoftXvMC,
      which would eliminate the verbatim copying-in of YUV
      data in Xv, but the bandwidth savings would be more modest.

To understand the possible bandwidth savings, here is a
calculation based on the scenario mentioned earlier: playing standard
DVD (720x480 yuv422) stretched to 1920x1200 (a popular full screen
resolution).

To start, here is a list of data transfers that occur in the
early stages of video decoding, regardless of whether this soft
XvPutImage optimization is used.  (I believe yuv422 is 2 bytes per
pixel).  In the descriptions I 

Re: Optimization idea: soft XvPutImage

2008-09-17 Thread Roland Scheidegger
On 17.09.2008 14:22, Adam Richter wrote:
> 2) Many open source drivers lack this YUV/stretch capability
>    even if the hardware has it, due to lack of public
>    documentation or slow development in comparison to the
>    life cycle of the hardware, even though efforts to address
>    these problems are definitely helping.
Note that for this case, you could just implement this in the driver
itself (might be easiest for figuring out if it really helps in practice).

> The memory bus utilization would also be reduced (but never
> more than that factor 3) as the ratio of the size of the unstretched
> video to stretched video increases, such as when playing a 720x480 video
> on a newer 2560x1600 display.
It works the other way around too, however: think Full HD playback on a
(mobile) device with a (say) 800x480 screen (not that I'd say this is
very common, but it's a case which could definitely happen). Of course,
normal Xv needs to transfer the full resolution image too (albeit only
as packed or, more commonly, planar YUV, which is a bit smaller).
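
A quick size comparison for that downscaling case, as a sketch only. It assumes a 4:2:0 planar source (such as YV12 or I420) and a 32-bit RGB screen, neither of which is stated above; the helper name is hypothetical.

```python
def planar_yuv420_bytes(width, height):
    """Full-resolution Y plane plus two chroma planes subsampled 2x2."""
    return width * height + 2 * ((width // 2) * (height // 2))


src = planar_yuv420_bytes(1920, 1080)  # Full HD source frame
dst = 800 * 480 * 4                    # screen, assuming 32-bit RGB
print(src, dst)  # prints: 3110400 1536000
```

Under these assumptions the YUV frame handed to the server is about twice the size of the RGB frame that ends up on screen, so the saving runs in the opposite direction from the stretching case.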

I could see this being faster and providing some benefits, but I don't
know if it's worth the effort.

Roland