On Saturday, 21 June 2008 12:55:49, Keith Whitwell wrote:
> I wrote similar code which now lives in the i965 driver in brw_wm_pass*.c.
>
> In short there is:
>
> brw_wm_fp.c
>    -- various preprocessing simplifications to the mesa program
>       representation.
>    -- add instructions required by hw to set up interpolants, etc.
>
> brw_wm_pass0.c
>    -- convert to an SSA format, but still very close to the mesa
>       program instruction format.
>    -- identify shared/scalar values (eg all 4 results of a DP3 instruction)
>    -- discard non-saturating, non-negating swizzles and moves
>
> brw_wm_pass1.c
>    -- dead code elimination
>    -- basically clear writemask for unused values, on a scalar granularity
>
> brw_wm_pass2.c
>    -- generate liveness information for each scalar register component
>    -- register allocation, assuming a scalar or SOA architecture.
>
> Most of this code is independent of the i965 architecture, though
> assumptions do creep in... The big simplifying assumption is that
> we're talking about only ARB_fs style shaders - ie no loops or
> branches.

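To check my understanding of the pass1 description above (clearing writemasks for unused values on a scalar granularity), here is a minimal sketch for straight-line code. It is not taken from brw_wm_pass1.c: `struct insn`, `dce_scalar`, and `MAX_REGS` are hypothetical names, and the instruction model is heavily simplified (two sources, no swizzles, no branches).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGS 16 /* hypothetical register file size for this sketch */

/* Hypothetical, simplified instruction; NOT the Mesa/i965 structures. */
struct insn {
    int dst;            /* destination register index */
    uint8_t writemask;  /* bit i set: component i (x, y, z, w) is written */
    int src[2];         /* source register indices, -1 if unused */
    uint8_t srcmask[2]; /* components read from each source (non-componentwise ops) */
    int componentwise;  /* 1: op reads exactly the components it writes (e.g. MOV, ADD) */
};

/*
 * Backward walk over a branch-free program. live[r] holds the components of
 * register r that some later instruction (or the final output) still needs.
 * Each writemask is trimmed to the components that are actually live.
 * Returns the number of instructions whose writemask became empty.
 */
static int dce_scalar(struct insn *prog, size_t n, int output_reg, uint8_t output_mask)
{
    uint8_t live[MAX_REGS] = {0};
    int dead = 0;

    assert(output_reg >= 0 && output_reg < MAX_REGS);
    live[output_reg] = output_mask;

    for (size_t i = n; i-- > 0; ) {
        struct insn *in = &prog[i];
        assert(in->dst >= 0 && in->dst < MAX_REGS);

        /* Keep only the components somebody still needs. */
        in->writemask &= live[in->dst];
        /* Those components are defined here, so they are not live above. */
        live[in->dst] &= (uint8_t)~in->writemask;

        if (in->writemask == 0) {
            dead++;     /* nothing left to write: the instruction is dead */
            continue;   /* a dead instruction keeps none of its sources alive */
        }

        for (int s = 0; s < 2; s++) {
            if (in->src[s] < 0)
                continue;
            assert(in->src[s] < MAX_REGS);
            live[in->src[s]] |= in->componentwise ? in->writemask : in->srcmask[s];
        }
    }
    return dead;
}
```

Note that the sketch leans on the same simplifying assumption mentioned above: without loops or branches, a single backward pass is enough to get liveness right.
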
Thanks for the overview. I'm trying to reason about how one might be able to
share some or all of that code between drivers.

You're using a hardware-specific representation (brw_wm_compile) that splits
fragment program registers into their scalar components. Obviously, looking at
scalars individually allows some optimizations that are otherwise impossible
or very hard, so it would be tempting for us to break registers up into their
components as well, and then figure out a common representation for some
shared optimizations.

I'm worried, though, because we cannot actually treat components like this on
Radeon hardware, where each register consists of four floating point values [1]
and we can't just mix and match components from different registers. I believe
this is true even in the R600 family, from what I've seen in the
documentation. So if we were to break registers into their scalar components,
we would then have to *reassemble* them at register allocation time, and I'm
not sure whether we'll be able to do that efficiently.

From a quick glance it seems as if the Intel driver doesn't deal with this
problem. Am I right in assuming that Intel hardware allows you to work with
components completely individually, i.e. that each register is in fact a
single, scalar floating point value?

cu,
Nicolai

[1] Actually, this is an oversimplification at least for R300 fragment
programs, where there is one set of 3-component (RGB/XYZ) registers and one
set of scalar (Alpha/W) registers.

_______________________________________________
Mesa3d-dev mailing list
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
