On Saturday 21 June 2008 12:55:49, Keith Whitwell wrote:
> I wrote similar code which now lives in the i965 driver in brw_wm_pass*.c.
>
> In short there is:
>
> brw_wm_fp.c
>   -- various preprocessing simplifications to the mesa program
> representation.
>   -- add instructions required by hw to set up interpolants, etc.
>
> brw_wm_pass0.c
>   -- convert to an SSA format, but still very close to the mesa
> program instruction format.
>   -- identify shared/scalar values (eg all 4 results of a DP3 instruction)
>   -- discard non-saturating, non-negating swizzles and moves
>
> brw_wm_pass1.c
>   -- dead code elimination
>   -- basically clear writemask for unused values, on a scalar granularity
>
> brw_wm_pass2.c
>   -- generate liveness information for each scalar register component
>   -- register allocation, assuming a scalar or SOA architecture.
>
> Most of this code is independent of the i965 architecture, though
> assumptions do creep in...  The big simplifying assumption is that
> we're talking about only ARB_fs style shaders - ie no loops or
> branches.

Thanks for the overview.

I'm trying to reason about how one might be able to share some or all of that 
code between drivers. You're using a hardware-specific representation 
(brw_wm_compile) that splits fragment program registers into their scalar 
components.

Obviously, looking at scalars individually allows some optimizations that are 
otherwise impossible or very hard, so it would be tempting for us to break 
registers up into their components as well and then figure out a common 
representation for some shared optimizations.
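
To illustrate the kind of optimization I mean, here is a minimal sketch of 
dead code elimination at scalar granularity, in the spirit of what you 
describe for brw_wm_pass1.c: walk the program backwards and clear writemask 
bits for components nobody reads. Everything in it (toy_inst, toy_dce, the 
MASK_* names, the componentwise flag) is made up for illustration and is not 
the Mesa or i965 data structure. It assumes straight-line code without 
swizzles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGS 16

enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8,
       MASK_XYZ = 7, MASK_XYZW = 15 };

/* Hypothetical, heavily simplified instruction (not Mesa's representation). */
struct toy_inst {
    int dst;            /* destination register index */
    unsigned writemask; /* components written to dst */
    int src[2];         /* source register indices, -1 if unused */
    unsigned srcmask;   /* components read when the op is not componentwise */
    int componentwise;  /* 1: result component c depends only on source
                         * component c (MUL, ADD); 0: every result depends
                         * on all of srcmask (DP3) */
};

/*
 * Backward pass over straight-line code. On entry, live[r] holds the
 * components of register r that are needed after the program ends. On exit,
 * each writemask is trimmed to the components that are actually read, and
 * live[r] holds the components needed on entry to the program.
 * Returns the number of instructions whose writemask became empty.
 */
size_t toy_dce(struct toy_inst *insts, size_t n, unsigned live[MAX_REGS])
{
    size_t dead = 0;

    for (size_t i = n; i-- > 0; ) {
        struct toy_inst *inst = &insts[i];
        assert(inst->dst >= 0 && inst->dst < MAX_REGS);

        /* Keep only the components that somebody reads later. */
        inst->writemask &= live[inst->dst];
        if (inst->writemask == 0) {
            dead++;
            continue;
        }

        /* Those components are defined here, so they are dead above. */
        live[inst->dst] &= ~inst->writemask;

        /* Mark the source components this instruction still needs. */
        unsigned needed = inst->componentwise ? inst->writemask
                                              : inst->srcmask;
        for (int s = 0; s < 2; s++) {
            if (inst->src[s] >= 0) {
                assert(inst->src[s] < MAX_REGS);
                live[inst->src[s]] |= needed;
            }
        }
    }
    return dead;
}
```

With a MUL feeding a DP3 whose result is only read in .x, this trims the MUL 
to .xyz and the DP3 to .x, which is the sort of thing that is awkward to 
express when registers are treated as indivisible four-component units.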

I'm worried, though, because we cannot actually treat components like this in 
Radeon hardware, where each register consists of four floating point values 
[1] and we can't just mix and match components from different registers. I 
believe this is true even in the R600 family, from what I've seen in the 
documentation.

So if we were to break registers into their scalar components, we would then 
have to *reassemble* them at register allocation time, and I'm not sure 
whether we'll be able to do that efficiently.
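
To make the concern concrete, here is a rough sketch of what such a 
reassembly step might look like: a greedy first-fit packer that places groups 
of scalars (values that must end up in the same register) into four-component 
registers, based on their live ranges. The names (toy_group, toy_pack, 
NUM_REGS, NUM_POINTS) and the first-fit strategy are my own assumptions for 
illustration, not anything taken from an existing driver, and a real 
allocator would have to handle many more constraints.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS   8   /* assumed number of vec4 temporaries */
#define NUM_POINTS 64  /* assumed upper bound on program points */

/* Hypothetical group of scalars that must share one vec4 register. */
struct toy_group {
    int width;      /* number of components needed, 1..4 */
    int start, end; /* inclusive live range, in program points */
    int reg;        /* out: assigned register, -1 if allocation failed */
    unsigned mask;  /* out: assigned components (bit 0 = x ... bit 3 = w) */
};

static int count_bits(unsigned m)
{
    int n = 0;
    while (m) {
        n += (int)(m & 1u);
        m >>= 1;
    }
    return n;
}

/*
 * Greedy first-fit: for each group, in order, take the first register that
 * has enough components free over the whole live range.
 * Returns the number of groups that could not be placed.
 */
size_t toy_pack(struct toy_group *groups, size_t n)
{
    /* busy[r][t]: components of register r occupied at program point t */
    unsigned char busy[NUM_REGS][NUM_POINTS] = { { 0 } };
    size_t failed = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_group *g = &groups[i];
        assert(g->width >= 1 && g->width <= 4);
        assert(0 <= g->start && g->start <= g->end && g->end < NUM_POINTS);

        g->reg = -1;
        g->mask = 0;

        for (int r = 0; r < NUM_REGS && g->reg < 0; r++) {
            unsigned used = 0;
            for (int t = g->start; t <= g->end; t++)
                used |= busy[r][t];

            unsigned free_mask = ~used & 0xFu;
            if (count_bits(free_mask) < g->width)
                continue;

            /* Take the lowest free components of this register. */
            unsigned mask = 0;
            int need = g->width;
            for (int c = 0; c < 4 && need > 0; c++) {
                if (free_mask & (1u << c)) {
                    mask |= 1u << c;
                    need--;
                }
            }

            for (int t = g->start; t <= g->end; t++)
                busy[r][t] |= (unsigned char)mask;

            g->reg = r;
            g->mask = mask;
        }

        if (g->reg < 0)
            failed++;
    }
    return failed;
}
```

Even this toy version shows the cost: a three-component group and a 
two-component group that are live at the same time need two full registers, 
leaving three components idle, whereas a purely scalar allocator would not 
care.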

From a quick glance it seems as if the Intel driver doesn't deal with this 
problem. Am I right in assuming that Intel hardware allows you to work with 
components completely individually, i.e. that each register is in fact a 
single, scalar floating point value?

cu,
Nicolai

[1] Actually, this is an oversimplification at least for R300 fragment 
programs, where there is one set of 3-component (RGB/XYZ) registers and one 
set of scalar (Alpha/W) registers.

_______________________________________________
Mesa3d-dev mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
