Hi,

I recently posted to the Nouveau mailing list about this, but for those who don't participate in that one I thought I would also post here, since it seems to concern DRI as much as Nouveau. I intend to submit an application for a project that will attempt to implement XvMC in terms of Gallium3D. I've come up with a preliminary proposal and was hoping people would be willing to give it a quick read and give me some feedback: opinions, corrections, concerns, etc. An HTML version is here: http://www.bitblit.org/gsoc/gallium3d_xvmc.shtml and a text version is below.
Also, something I forgot to mention in my Nouveau email is that I would need a mentor and a mentoring organization. I've been told that this can be pursued under X.org since DRI isn't a mentoring org this year; hopefully this is OK.

Thank you kindly.

Younes Manton
younes.m at gmail

Generic GPU-Accelerated Video Decoding

Synopsis:

The purpose of this project is to produce a video decoding solution for GPUs that are supported by the Gallium3D driver framework. The project will attempt to implement the XvMC API using the programmable pipeline of a typical GPU, thereby providing accelerated video decoding to a wide variety of hardware. Since the decoding will be implemented using the GPU's programmable pipeline, it is important to note that this solution should support all recent GPUs regardless of whether or not they include dedicated video decoding hardware. It is hoped that this GPU-based acceleration will allow for real-time playback of HD video streams on mid-range and possibly low-end hardware. The implementation will be developed and tested first using Gallium3D's SoftPipe driver, a stable software reference implementation, and later on Nvidia hardware with the nouveau driver.

Benefits:

Video media has become a pervasive part of the computing landscape and encompasses a variety of formats and resolutions, from low-res MPEG2 streams to HD MPEG4 content. From the point of view of the end-user, accelerated video decoding offers potentially better quality, smoother multi-tasking (by way of unburdening the main CPU), and a longer lifespan for current mid-range and low-end hardware. From the point of view of the OSS community, accelerated video decoding will offer an incentive to the end-user to adopt open-source drivers, which have traditionally not provided significant video acceleration on the most popular GPUs.
This particular project will also provide a multi-vendor solution, as most GPUs supported by the Gallium3D framework can be targeted with the same code base, including some current Nvidia hardware via the nouveau driver, AMD/ATI hardware, and Intel hardware, amongst others.

Deliverables:

The deliverables for this project have been organized into two categories: the minimum set of deliverables that would make this project worthwhile for all involved (must-haves), and a larger set of goals that would make good contributions to the community and offer greater benefit to the end-user (nice-to-haves).

Must-Haves:

* An XvMC implementation that handles the color space conversion (CSC) and motion compensation (MC) stages of the video decoding pipeline. These two stages represent the bulk of the processing and are good candidates for being handled by the GPU. This should allow for real-time playback of HD video streams according to [1]. As part of this goal it is expected that some work will have to be done with the nouveau driver to address possible bugs and add required functionality.

* Handling of the inverse discrete cosine transform (IDCT) stage of the video decoding pipeline. This stage does not map optimally to the GPU pipeline, but it represents a large percentage of the processing. Handling it on the GPU would also allow the preceding stage (inverse quantization, IQ) to be handled by the GPU without introducing an extra GPU-CPU-GPU round trip between the IQ, IDCT, and MC stages. XvMC was originally intended to handle the MC stage, but has been extended to support IDCT.
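To make the CSC and MC stages above a little more concrete, here is a minimal CPU-side sketch in C of the per-sample arithmetic involved. It is only an illustration, not the proposed implementation, which would run as vertex and pixel shaders on the GPU. Assumptions: BT.601 studio-range coefficients in fixed point (scaled by 256), full-pel motion vectors (MPEG2 uses half-pel vectors, which would additionally require interpolation), and a reference block that lies entirely inside the frame. The function names ycbcr_to_rgb and motion_compensate are hypothetical and are not part of XvMC or Gallium3D.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an intermediate result to the 8-bit sample range. */
static uint8_t clamp_u8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/*
 * CSC (hypothetical helper): convert one studio-range Y'CbCr sample to
 * 8-bit RGB with BT.601 coefficients scaled by 256. Division by 256 is
 * used instead of a right shift so that negative intermediates behave
 * portably; they clamp to 0 either way.
 */
void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr, uint8_t rgb[3])
{
    int c = 298 * ((int)y - 16);
    int d = (int)cb - 128;
    int e = (int)cr - 128;

    rgb[0] = clamp_u8((c + 409 * e + 128) / 256);
    rgb[1] = clamp_u8((c - 100 * d - 208 * e + 128) / 256);
    rgb[2] = clamp_u8((c + 516 * d + 128) / 256);
}

/*
 * MC (hypothetical helper): rebuild a size x size block at (x, y) by
 * copying the block the motion vector points at in the reference frame
 * and adding the decoded residual. Assumes full-pel vectors and that
 * both blocks lie inside frames that share the same stride.
 */
void motion_compensate(const uint8_t *ref, uint8_t *dst, int stride,
                       int x, int y, int mv_x, int mv_y,
                       const int16_t *residual, int size)
{
    assert(ref && dst && residual && size > 0 && stride >= size);

    for (int row = 0; row < size; row++) {
        for (int col = 0; col < size; col++) {
            int pred = ref[(y + mv_y + row) * stride + (x + mv_x + col)];
            int diff = residual[row * size + col];

            dst[(y + row) * stride + (x + col)] = clamp_u8(pred + diff);
        }
    }
}
```

Both loops are independent per output sample, which is the property that makes these two stages good candidates for a pixel shader: each fragment would fetch a reference texel offset by the motion vector, add the residual, and apply the conversion.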
A preliminary timeline with milestones is presented below:

Preliminary research & experimentation - April weeks 1 & 2
Implement CSC with SoftPipe - April weeks 3 & 4
Implement MC, IDCT with SoftPipe - May to mid-June (1)
Preliminary hardware research & experimentation - June
Test with real hardware, add required functionality, fix bugs - mid-June to Aug (2)
Bug fixes, performance testing & tuning, documentation - Aug

1. Working implementation with the SoftPipe driver by mid-June
2. Working implementation with nouveau driver by end-July

Nice-To-Haves:

* Support for other video formats. XvMC was originally intended for MPEG2 video, but has been extended to support other formats such as MPEG4.

* An implementation of the Video Acceleration API (VAAPI), which is similar to XvMC but has been designed to support off-loading more stages and more video formats. VAAPI is not currently widely supported by user applications, but an implementation could offer an incentive to application developers.

* Implementation of various filters to improve visual quality. De-interlacing, de-blocking, de-ringing, bi-cubic interpolation, and others would be investigated for possible implementation.

Project Details:

The XvMC implementation:

This implementation will be written in terms of Gallium3D, initially using the SoftPipe software driver, with a fork of libXvMC from the openChrome project as a starting point. This will allow the implementation to be tested against a working reference driver, after which point development will switch to actual hardware. The level of support that the current Gallium3D Nvidia implementation offers will be evaluated and any deficiencies will be addressed. It is expected that the XvMC implementation will require vertex and pixel shader support, texturing support, and render target support, amongst other things.
The implementation itself will be composed of C source code that will be compiled as part of the nouveau driver (TODO: Get clarification on this: part of nouveau, or Gallium3D?) and vertex and pixel shader code that will execute on the GPU.

User applications:

The user applications will serve as benchmarks for the implementation. Several user applications have support for XvMC and can be used to verify functional and performance characteristics. VLC and MPlayer, for example, are good candidates.

Alternative hardware drivers:

Alternative hardware drivers, in addition to the SoftPipe driver, can also be used as a reference, especially for performance comparisons. The Nvidia binary driver supports XvMC, implements IDCT and MC for MPEG2 video, and can serve as a reference.

Related Work:

[1] "Accelerate Video Decoding With Generic GPU" - Guobin Shen, Guang-Ping Gao, Shipeng Li, Heung-Yeung Shum, and Ya-Qin Zhang - http://research.microsoft.com/~jackysh/publications/Accelerate%20video%20decoding%20with%20generic%20GPU.pdf

This paper describes the implementation of the color space conversion (CSC) and motion compensation (MC) stages of the video decoding pipeline via the GPU programmable pipeline. The authors state that they were able to achieve real-time 720p HD playback on a Pentium III 667 MHz CPU and GeForce3 GPU.

[2] "Techniques for Efficient DCT/IDCT Implementation on Generic GPU" - Bo Fang, Guobin Shen, Shipeng Li, and Huifang Chen - http://research.microsoft.com/~jackysh/publications/iscas2005%20--%20Techniques%20for%20Efficient%20DCT_IDCT%20Implementation%20on%20Generic%20GPU.pdf

An extension of the previous paper, where the authors also implement the inverse discrete cosine transform (IDCT) stage of the video decoding pipeline. They note that while their implementation is competitive, an optimized CPU SIMD implementation is still somewhat faster for this stage.
[3] XvMC via dedicated hardware on Nvidia GPUs - http://nouveau.freedesktop.org/wiki/jb17bsome

User jb17bsome is working towards XvMC support in the nouveau driver via dedicated video decoding hardware on Nvidia GPUs, as opposed to using the programmable GPU pipeline.

Personal Details:

My name is Younes Manton; I am currently a computer science student at York University in Toronto, Canada. I am interested in low-level computer architecture, compiler theory and design, 2D and 3D graphics technology, and video and audio decoding. For the last year I have been employed as an intern in IBM's compiler group, working on performance analysis for the XL C, C++, and Fortran compilers for PowerPC and CELL. As my internship winds down I hope to participate in the Summer of Code program on a project that is in line with my interests and is useful to the OSS community.

Skills:

* Well-versed in C, C++, and a variety of assembly languages (x86, PPC, CELL-SPU, SuperH, MIPS)
* Well-versed in the Direct3D and OpenGL APIs, and various shading languages
* Experienced with low-level programming, having worked on various embedded systems
* TODO: Add more relevant skills, add evidence

Plans:

As my internship at IBM winds down I hope to have sufficient free time to undertake the above. I do not plan on taking any courses during the summer, but will be employed on a part-time basis as a necessity. I hope to devote an average of 20 hours per week to this work and will strive to meet expectations and deliver a successful project. I have identified a minimum set of deliverables that I feel will still make this project worthwhile for myself, the Google Summer of Code program, and the mentoring organization, but have also provided a larger set of goals that will be nice to have and that I am optimistic about achieving, at least partially.

--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel