Mesa internals

How does one write a new Mesa driver?

There are two basic aspects to writing a new driver.

First, define the public OpenGL / window system API. In the case of GLX, these are the glx*() functions. For OSMesa these are the OSMesa*() functions seen in include/GL/osmesa.h. You'll basically need functions for specifying frame buffer formats (bits per RGB, bits for Z, bits for stencil, etc.), functions for creating/destroying contexts, functions for binding contexts to windows, etc.

Second, implement the internal functions needed by the "DD" interface. Look at the osmesa.c file and grep for "ctx->Driver. = ". This is where the driver hooks itself into the core of Mesa. In many cases we hook in fall-back functions (like _swrast_DrawPixels).

This isn't simple (or even as straightforward as it used to be), but the system is designed for efficiency, flexibility and modularity. If the device driver interface were made for simplicity above all else, there would probably be only two driver functions: dd_function_table::ReadPixels() and dd_function_table::DrawPixels().

The OSMesa driver is pretty simple. The only complexity comes from supporting all the different frame buffer formats like RGB, RGBA, BGRA, ABGR, etc. I think the Windows driver is in pretty good shape too. The XMesa driver (upon which Mesa's GLX is layered) is rather large because of its many frame buffer formats and optimized point/line/triangle rendering functions.

Mesa 4.x Implementation Notes

The big changes in Mesa were made between Mesa 3.4.x and Mesa 3.5. That's when Keith Whitwell re-modularized the source code into separate modules for T&L, s/w rasterization, etc.

This document is an overview of the internal structure of Mesa and is meant for those who are interested in modifying or enhancing Mesa, or just curious.

Note: Based on the original Mesa Implementation Notes and corrections by Brian Paul.

Library State and Contexts

OpenGL
uses the notion of a state machine. Almost all OpenGL state is
contained in one large structure: GLcontextRec
(typedef'd to GLcontext), as seen in mtypes.h. This is the central
context data structure for Mesa.
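
As a rough illustration of the pattern described so far (one large context structure, with driver hooks stored in it and software fall-backs as defaults), here is a minimal sketch. Every name in it (toy_context, toy_driver_table, toy_swrast_draw_pixels and so on) is hypothetical and far simpler than the real GLcontext and dd_function_table; only the general shape follows the text.

```c
#include <stddef.h>

struct toy_context; /* hypothetical stand-in for GLcontext */

/* Hypothetical, much-reduced driver table: one hook per operation. */
struct toy_driver_table {
    int (*DrawPixels)(struct toy_context *ctx, int width, int height);
    int (*ReadPixels)(struct toy_context *ctx, int width, int height);
};

/* One structure holds the library state plus the driver hooks. */
struct toy_context {
    int depth_test_enabled;         /* sample piece of GL-style state */
    int smooth_shading;             /* sample piece of GL-style state */
    struct toy_driver_table Driver; /* filled in at context creation */
};

/* Return codes that tell the caller which path ran. */
enum { TOY_PATH_SOFTWARE = 1, TOY_PATH_HARDWARE = 2 };

/* Software fall-backs that any driver can hook in. */
int toy_swrast_draw_pixels(struct toy_context *ctx, int width, int height)
{
    (void)ctx; (void)width; (void)height;
    return TOY_PATH_SOFTWARE;
}

int toy_swrast_read_pixels(struct toy_context *ctx, int width, int height)
{
    (void)ctx; (void)width; (void)height;
    return TOY_PATH_SOFTWARE;
}

/* Pretend-accelerated path of a hypothetical hardware driver. */
int toy_hw_draw_pixels(struct toy_context *ctx, int width, int height)
{
    (void)ctx; (void)width; (void)height;
    return TOY_PATH_HARDWARE;
}

/* Core setup: every hook starts out pointing at a software fall-back. */
void toy_init_context(struct toy_context *ctx)
{
    ctx->depth_test_enabled = 0;
    ctx->smooth_shading = 1;
    ctx->Driver.DrawPixels = toy_swrast_draw_pixels;
    ctx->Driver.ReadPixels = toy_swrast_read_pixels;
}

/* Driver setup: override only the hooks the hardware can accelerate. */
void toy_hw_driver_init(struct toy_context *ctx)
{
    ctx->Driver.DrawPixels = toy_hw_draw_pixels;
}
```

The core always calls through ctx->Driver, so a driver that accelerates nothing still works, and a driver that accelerates one operation changes one pointer.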

The Vertex buffer

The immediate structure represents everything that can take place between glBegin and glEnd, and can represent multiple glBegin/glEnd pairs. It can be used to losslessly encode this information in display lists. See t_context.h and t_imm_api.c.

When either the vertex buffer becomes filled or a state change outside the glBegin/glEnd is made, we must flush the buffer. That is, we apply the vertex transformations, compute lighting, fog, texture coordinates, etc. The various vertex transformations are implemented as software pipeline stages by the t_pipeline.c and tnl/t_vb_*.c files.

When we're outside of a glBegin/glEnd pair, the information in this structure is retained pending either of the flushing events described above.

Note: Originally, Mesa didn't accumulate vertices in this way. Instead, glVertex transformed and lit then buffered each vertex as it was received. When enough vertices to draw the primitive (1 for points, 2 for lines, >2 for polygons) were accumulated, the primitive was drawn and the buffer cleared.

The new approach of buffering many vertices and then transforming, lighting and clip testing is faster because it's done in a "vectorized" manner. See gl_transform_points in m_xform.c for an example.

For best performance, Mesa clients should try to maximize the number of vertices between glBegin/glEnd pairs and use connected primitives when possible.

Rasterization

The point, line and polygon rasterizers are called via the Point, Line, and Triangle function pointers in the SWcontext structure in s_context.h.

Whenever the library state is changed in a significant way, the NewState context flag is raised. When glBegin is called, NewState is checked. If the flag is set, we re-evaluate the state to determine what rasterizers to use. Special purpose rasterizers are selected according to the status of certain state variables such as flat vs. smooth shading, depth-buffered vs. non-depth-buffered, etc. The _swrast_choose_* functions do this analysis. It's up to the device driver to choose optimized or accelerated rasterization functions to replace those in the general software rasterizer.

In general, typical states (depth-buffered and smooth-shaded) result in optimized rasterizers being selected. Non-typical states (stenciling, blending, stippling) result in slower, general purpose rasterizers being selected.

Pixel spans

Note: Pixel buffers are no longer present in the latest Mesa code (4.1). All fragment (pixels plus color, depth, texture coordinates) processing is done via the span functions in s_span.c.

Device Driver

There are three Mesa data types which are meant to be used by device drivers: GLcontext, GLvisual, and GLframebuffer.

These types should be encapsulated by corresponding device driver data types. See xmesa.h and xmesaP.h for an example. In OOP terms, GLcontext, GLvisual, and GLframebuffer are base classes which the device driver must derive from.

The structure dd_function_table, seen in dd.h, defines the device driver functions. By using a table of pointers, the device driver can be changed dynamically at runtime. For example, the X/Mesa and OS/Mesa (off-screen rendering) device drivers can co-exist in one library and be selected at runtime.

In addition to the device driver table functions, each Mesa driver has its own set of unique interface functions. For example, the X/Mesa driver has the XMesaCreateContext, XMesaBindWindow, and XMesaSwapBuffers functions, while the Windows/Mesa interface has WMesaCreateContext, WMesaPaletteChange and WMesaSwapBuffers. New Mesa drivers need to both implement the dd_function_table functions and define a set of unique window system or operating system-specific interface functions.

The device driver functions can roughly be divided into three groups:

Even if hardware accelerated renderers aren't available, the device driver may implement tuned, special purpose code for common kinds of points, lines or polygons. The X/Mesa device driver does this for a number of lines and polygons. See the xm_line.c and xm_tri.c files.

Overall Organization

The overall relation of the core Mesa library, X device driver/interface, toolkits and application programs is shown in this diagram:

+-----------------------------------------------------------+
|                                                           |
|                   Application Programs                    |
|                                                           |
|          +- glu.h -+------ glut.h -------+                |
|          |         |                     |                |
|          |   GLU   |        GLUT         |                |
|          |         |      toolkits       |                |
|          |         |                     |                |
+---------- gl.h ------------+-------- glx.h ----+          |
|                            |                   |          |
|         Mesa core          |   GLX functions   |          |
|                            |                   |          |
+---------- dd.h ------------+------------- xmesa.h --------+
|                                                           |
|            XMesa* and device driver functions             |
|                                                           |
+-----------------------------------------------------------+
|                 Hardware/OS/Window System                 |
+-----------------------------------------------------------+

Mesa's pipeline

The work starts in t_pipeline.c, where a driver-configurable pipeline is run in response to either the vertex buffer filling up or a state change. The pipeline stages operate on context variables (such as vertex coordinates, colors, normals, texture coordinates, etc.), applying the necessary operations of the OpenGL pipeline (such as coordinate transformation, lighting, etc.).

The last stage, rendering, calls *BuildVertices in *_vb.c, which applies the viewport transformation, perspective divide and data type conversion, and packs the vertex data in the context (in the arrays tnl->vb->*Ptr->data) into a driver-dependent buffer with just the information relevant for the current OpenGL state (e.g., with/without texture, fog, etc.). The template t_dd_vbtmp.h does this into a Direct3D-like vertex structure format. For instance, if we needed to premultiply the texture coordinates, as is done in the tdfx and mach64 drivers, we would need to make a customized version of t_dd_vbtmp.h for that purpose, or change it and supply a configuration parameter to control that behavior.

This buffer is then used to render the primitives in *_tris.c. In most chips with DMA, this vertex data is intended to be copied almost verbatim into DMA buffers, with a header command.
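
To make the packing step described above more concrete, here is a rough sketch of what a BuildVertices-style routine does for one vertex: perspective divide, viewport transformation, and float-to-byte color conversion into a Direct3D-like layout. This is not Mesa code. The names toy_hw_vertex, toy_viewport, toy_float_to_ubyte and toy_build_vertex are hypothetical, and the sketch assumes a 0..1 depth range and an already clipped vertex (w > 0).

```c
#include <stdint.h>

/* Hypothetical Direct3D-like hardware vertex. */
struct toy_hw_vertex {
    float x, y, z;    /* window coordinates */
    float rhw;        /* reciprocal of the clip-space w */
    uint8_t color[4]; /* RGBA, one byte per channel */
};

/* Hypothetical viewport: origin and size in pixels. */
struct toy_viewport {
    float x, y, width, height;
};

/* Data type conversion: float channel in [0,1] to a byte, with clamping. */
uint8_t toy_float_to_ubyte(float c)
{
    if (c <= 0.0f) return 0;
    if (c >= 1.0f) return 255;
    return (uint8_t)(c * 255.0f + 0.5f);
}

/* Pack one clip-space vertex into the hardware layout. */
void toy_build_vertex(struct toy_hw_vertex *out,
                      const float clip[4],
                      const float rgba[4],
                      const struct toy_viewport *vp)
{
    /* Perspective divide: clip coordinates to normalized device coords. */
    float rhw = 1.0f / clip[3];
    float ndc_x = clip[0] * rhw;
    float ndc_y = clip[1] * rhw;
    float ndc_z = clip[2] * rhw;
    int i;

    /* Viewport transformation: map [-1,1] onto the drawable. */
    out->x = vp->x + (ndc_x + 1.0f) * 0.5f * vp->width;
    out->y = vp->y + (ndc_y + 1.0f) * 0.5f * vp->height;
    out->z = (ndc_z + 1.0f) * 0.5f;
    out->rhw = rhw;

    /* Keep only what the current state needs; here, just a color. */
    for (i = 0; i < 4; i++)
        out->color[i] = toy_float_to_ubyte(rgba[i]);
}
```

A real driver would generate several variants of this routine (with or without texture coordinates, fog, and so on) and pick one whenever the relevant state changes, which is the role the text gives to t_dd_vbtmp.h.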
But in the case of Mach64, where the commands are interleaved with each of the vertex data elements, it will be necessary to use a different *Vertex structure to do the same, and probably to come up with a rather different implementation of t_dd_vbtmp.h as well. Indeed, if the chip expects something quite different from the D3D vertices, one will certainly want to look at this. In the meantime, it may be simplest to go with a "normal-looking" *_vb.c and do some extra work in the triangle/line/point functions. The ffb and glint drivers are a bit like this, I think.

All this mechanism is controlled with function pointers in the context, which are rechosen whenever the OpenGL state changes enough. These function pointers can also be overwritten with those in the sw_* modules to fall back to software rendering.

How about the main X drawing surface? Are 2 extra "window sized" buffers allocated for primary and secondary buffers in a page-flipping configuration?

Right now, we don't do page flipping at all. Everything is a blit from back to front. The biggest problem with page flipping is detecting when you're in full screen mode, since OpenGL doesn't really have a concept of full screen mode. We want a solution that works for existing games, so we've been designing one. It should get implemented fairly soon, since we need it for antialiasing on the V5.

In the current implementation the X front buffer is the 3D front buffer. When we do page flipping we'll continue to do the same thing. Since you have an X window that covers the screen, it is safe for us to use the X surface's memory. Then we'll do page flipping. The only issue will be falling back to blitting if the window ever stops covering the whole screen.

Clipping

This section introduces several concepts associated with clipping.

Scissors

The scissors are register settings that determine a hardware clipping rectangle in window coordinates. Any part of a primitive or other drawing operation that extends beyond the scissors is not drawn. The scissors can be set through GL commands. This has nothing to do with perspective clipping in the pipeline, just the final window coordinates.

Cliprects

Cliprects are used to determine what parts of the context/window should be redrawn to handle overlapping windows. The more overlapping windows, the more cliprects you have. These need to be passed to the DRM, which does a clear or swap for each cliprect. Again, these are for 2D clipping after rasterization and are not part of the pipeline. Things get a bit complicated by the fact that there can be separate cliprects for the front and back buffers. The cliprects are stored in device-independent structures, hence the code is abstracted out of the individual drivers.

Viewport

The viewport array holds values that determine how to translate transformed, clipped, and projected vertex coordinates into window coordinates. This is the last stage of the pipeline. The values are based on the size and position of the "drawable", also known as the drawing area of the window for the context.

Texture management

What follows is a description of the texture management system in the DRI. This is all based on the Radeon driver in the 11-Feb-2002 CVS of the mesa-4-0-branch. While it is based on the Radeon code, all drivers except gamma and tdfx seem to use the same scheme (and virtually identical code).

Note: Just FYI: the tdfx texture memory management code is different because:

How often are checks done to see if things need to be clipped/redrawn/redisplayed?

The locking system is designed to be highly efficient. It is based on a two-tiered lock. Basically it works like this:

The client wants the lock. It uses the CAS instruction (I was corrected that the instruction is compare and swap; I knew that was the functionality, but I got the name wrong). If the client was the last application to hold the lock, it is done and moves on.

If it wasn't the last one, then we use an IOCTL to the kernel to arbitrate the lock. In this case some or all of the state on the card may have changed. The shared memory carries a stamp number for the X server. When the X server does a window operation, it increments the stamp. If the client sees that the stamp has changed, it uses a DRI X protocol request to get the new window location and cliprects. This only happens on a window move. Assuming your cliprects/window position haven't changed, the redisplay happens entirely in the client.

The client may have other state to restore as well. In the case of the tdfx driver we have three more flags: command FIFO invalid, 3D state invalid, and textures invalid. If those are set, the corresponding state is restored.

So, if the X server wakes up to process input, it currently grabs the lock but doesn't invalidate any state. I'm actually fixing this now so that it doesn't grab the lock for input processing.

If the X server draws, it grabs the lock and invalidates the command FIFO.

If the X server moves a window, it grabs the lock, updates the stamp, and invalidates the command FIFO.

If another 3D app runs, it grabs the lock, invalidates the command FIFO, invalidates the 3D state and possibly invalidates the texture state.
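
The two-tiered lock and the stamp check described above can be sketched roughly as follows, using C11 atomics for the compare and swap. This is an illustration of the idea only, not the DRI code: toy_shared_area, toy_client, TOY_LOCK_HELD and toy_kernel_take_lock are hypothetical, and the kernel arbitration is replaced by a stand-in that simply hands over the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_LOCK_HELD 0x80000000u /* hypothetical "held" bit in the lock word */

/* Hypothetical shared-memory area seen by the X server and all clients. */
struct toy_shared_area {
    atomic_uint lock;  /* id of the last holder, plus TOY_LOCK_HELD while held */
    atomic_uint stamp; /* bumped by the X server on window operations */
};

/* Hypothetical per-client bookkeeping. */
struct toy_client {
    unsigned id;            /* context id; must not use the held bit */
    unsigned last_stamp;    /* stamp seen at the previous lock */
    int slow_path_count;    /* times kernel arbitration was needed */
    int cliprect_refreshes; /* times window info had to be re-fetched */
    bool hw_state_dirty;    /* card state may have been changed by others */
};

/* Stand-in for the kernel ioctl. A real kernel would block until the
 * lock is free; this toy version just hands it over. */
void toy_kernel_take_lock(struct toy_shared_area *sh, unsigned id)
{
    atomic_store(&sh->lock, id | TOY_LOCK_HELD);
}

void toy_lock(struct toy_shared_area *sh, struct toy_client *c)
{
    /* Tier one: succeeds only if the lock is free and we held it last. */
    unsigned expected = c->id;
    unsigned stamp;

    if (!atomic_compare_exchange_strong(&sh->lock, &expected,
                                        c->id | TOY_LOCK_HELD)) {
        /* Tier two: someone else had the lock in between. */
        toy_kernel_take_lock(sh, c->id);
        c->slow_path_count++;
        c->hw_state_dirty = true;
    }

    /* With the lock held, check whether the X server touched the window. */
    stamp = atomic_load(&sh->stamp);
    if (stamp != c->last_stamp) {
        c->last_stamp = stamp;
        c->cliprect_refreshes++; /* stands in for the DRI protocol request */
    }
}

void toy_unlock(struct toy_shared_area *sh, const struct toy_client *c)
{
    /* Clear the held bit but remember who held the lock last. */
    atomic_store(&sh->lock, c->id);
}
```

In the common case of a single 3D client redrawing repeatedly, every toy_lock call takes the first tier only, so no kernel call and no X protocol round trip is needed.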
