2010/1/6 Michel Dänzer <mic...@daenzer.net>:
> On Wed, 2010-01-06 at 15:18 +0000, Keith Whitwell wrote:
>> On Wed, 2010-01-06 at 07:13 -0800, José Fonseca wrote:
>> > On Wed, 2010-01-06 at 06:51 -0800, Michel Dänzer wrote:
>> > > On Wed, 2010-01-06 at 14:32 +0000, José Fonseca wrote:
>> > > > On Wed, 2010-01-06 at 06:23 -0800, Michel Dänzer wrote:
>> > > > > On Wed, 2010-01-06 at 14:03 +0000, José Fonseca wrote:
>> > > > > > On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
>> > > > > > > michal wrote on 2010-01-06 07:58:
>> > > > > > > > michal wrote on 2009-12-22 10:00:
>> > > > > > > >
>> > > > > > > >> Marek Olšák wrote on 2009-12-22 08:40:
>> > > > > > > >>
>> > > > > > > >>
>> > > > > > > >>> Hi,
>> > > > > > > >>>
>> > > > > > > >>> I noticed that gallium/auxiliary/util/u_format.csv contains some weird
>> > > > > > > >>> swizzling, for example see this:
>> > > > > > > >>>
>> > > > > > > >>> $ grep zyxw u_format.csv
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_UNORM        , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A1R5G5B5_UNORM        , arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A4R4G4B4_UNORM        , arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A8B8G8R8_SNORM        , arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_SRGB         , arith , 1, 1, u8  , u8  , u8  , u8  , zyxw, srgb
>> > > > > > > >>>
>> > > > > > > >>> It's hard to believe that ARGB, ABGR, and BGRA have the same
>> > > > > > > >>> swizzling. Let's continue our journey:
>> > > > > > > >>>
>> > > > > > > >>> $ grep A8R8G8B8 u_format.csv
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_UNORM        , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_SRGB         , arith , 1, 1, u8  , u8  , u8  , u8  , wxyz, srgb
>> > > > > > > >>>
>> > > > > > > >>> Same formats, different swizzling? Also:
>> > > > > > > >>>
>> > > > > > > >>> $ grep B8G8R8A8 u_format.csv
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_UNORM        , arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_SRGB         , arith , 1, 1, u8  , u8  , u8  , u8  , zyxw, srgb
>> > > > > > > >>>
>> > > > > > > >>> Same formats, different swizzling? I don't really get it. And there
>> > > > > > > >>> are many more cases like these. Could someone tell me what the
>> > > > > > > >>> intended order of channels should be (or possibly propose a fix)?
>> > > > > > > >>> The meaning of the whole table is self-contradictory, and it's
>> > > > > > > >>> definitely the source of some r300g bugs.
>> > > > > > > >>>
>> > > > > > > >>>
>> > > > > > > >>>
>> > > > > > > >> Marek,
>> > > > > > > >>
>> > > > > > > >> Yes, that seems like a defect. The format swizzle field tells us how
>> > > > > > > >> to "swizzle" the incoming pixel so that its components are ordered in
>> > > > > > > >> some predefined order. For the RGB and SRGB colorspaces the order is
>> > > > > > > >> R, G, B and A. For depth-stencil, i.e. the ZS colorspace, the order
>> > > > > > > >> is Z and then S.
>> > > > > > > >>
>> > > > > > > >> I will have a look at this.
>> > > > > > > >>
>> > > > > > > >>
>> > > > > > > > Marek, Jose,
>> > > > > > > >
>> > > > > > > > Can you review the attached patch?
>> > > > > > > >
>> > > > > > >
>> > > > > > > Ouch, it looks like we will have to leave 24-bit (s)rgb formats with
>> > > > > > > array layout as the current code generator will bite us on big endian
>> > > > > > > platforms. Attached an updated patch.
>> > > > > >
>> > > > > > Why are you changing the layout from array to arith? Please leave that
>> > > > > > alone.
>> > > > > >
>> > > > > > Yes, the code generator needs a big-endian -> little-endian call to be
>> > > > > > correct on big-endian platforms, as gallium formats should always be
>> > > > > > thought of in little-endian terms, just like most hardware is.
>> > > > >
>> > > > > Actually, 'array' formats should be endianness neutral,
>> > > >
>> > > > Yep.
>> > > >
>> > > > > and IMO 'arith' formats should be defined in the CPU endianness.
>> > > >
>> > > > I originally thought that too, but Keith convinced me that "gallium is a
>> > > > hardware abstraction, and all 3d hardware is little endian, therefore
>> > > > gallium formats should be always in little endian."
>> > >
>> > > Then there probably should be no 'arith' formats, at least not when the
>> > > components consist of an integer number of bytes.
>> >
>> > Yes, that's probably the best.
>> >
>> > > > > Though as discussed
>> > > > > before, having 'reversed' formats defined in the other endianness as
>> > > > > well might be useful. Drivers which can work on setups where the CPU
>> > > > > endianness doesn't match the GPU endianness should possibly only use
>> > > > > 'array' formats, but then there might need to be some kind of mapping
>> > > > > between the two kinds of formats somewhere, maybe in the state trackers
>> > > > > or an auxiliary module...
>> > > >
>> > > > Basically, a developer implementing a pipe driver for a piece of
>> > > > hardware should not have to worry about CPU endianness. If a graphics
>> > > > API defines formats in terms of the native CPU endianness, then the
>> > > > state tracker will have to do the translation.
>> > >
>> > > That's more or less what I meant in my last sentence above. Hopefully
>> > > it'll be possible to share this between state trackers at least to some
>> > > degree via an auxiliary module or so. At least OpenGL and X11 define
>> > > (some) formats in CPU endianness.
>> >
>> > OK. We agree then.
>> >
>> > I don't know how you envision this auxiliary functionality. I don't
>> > think it is actually necessary to define a bunch of PIPE_FORMAT_xxxx_REV
>> > formats, given that no hardware will ever support them. Instead,
>> > code-generating a variation of u_format_access.py which reads formats in
>> > native endianness should suffice. That is:
>> >
>> >   void
>> >   util_format_read_4f_native(...);
>> >
>> >   void
>> >   util_format_write_4f_native(...);
>> >
>> >   void
>> >   util_format_read_4ub_native(...);
>> >
>> >   void
>> >   util_format_write_4ub_native(...);
>> >
>> > Plus, code-generate one extra function that just does endianness
>> > translation, without pixel unpacking/packing.
>> >
>> >   util_format_byte_swap(...)
>> >
>> > I believe this should give all functionality a state tracker might need.
>> > What do you think?
>> >
>> > Jose
>> >
>>
>> This is potentially a lot of data we're translating - all incoming
>> vertex data, for instance.  I'd be interested to know more about what
>> really happens on big-endian systems with various graphics cards outside
>> of gallium -- are they really doing this level of translation?
>
> For vertex data, the Radeon drivers are using hardware vertex fetcher
> byte-swapping support. (Unfortunately, that doesn't have any effect if
> the vertex data is in VRAM...) Texture data is byte-swapped by the CPU
> as necessary.
>
>> I don't claim that no GPU is big or little endian internally -- just
>> that they definitely all have the capability to read/write little-endian
>> formats at wire speed.
>>
>> But is it really true that they do not have that capability for
>> big-endian data?
>
> In theory yes, but in practice there are various quirks to the
> byte-swapping bits which make most of them mostly useless at least on
> R300 generation hardware.

On r6xx+ just about every block has an endian swap control, although I
haven't actually tested to see how well it works on each block (crtc
swapping appears to work, but I haven't tried the others).  There is
also a BE version of the DRAW_INDEX_IMMD packet.  See
SQ_VTX_CONSTANT_WORD2_0 for vertexes and SQ_TEX_RESOURCE_WORD4_0 for
textures.  The CP also has a swapper.
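
For the cases where the hardware swappers can't be used, the CPU-side
fallback José describes above as util_format_byte_swap(...) could look
roughly like the sketch below. The function name, its signature, and the
assumption that every pixel is one packed word of bytes_per_pixel bytes
are all mine for illustration; nothing like this is in the tree as far as
I know.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an existing Mesa function): reverse the byte
 * order of each packed pixel in place, without unpacking any channels. */
void
sketch_byte_swap_pixels(uint8_t *data, size_t num_pixels,
                        size_t bytes_per_pixel)
{
   size_t i, lo, hi;

   /* Single-byte (or empty) pixels have no byte order to swap. */
   if (bytes_per_pixel < 2)
      return;

   for (i = 0; i < num_pixels; ++i) {
      uint8_t *p = data + i * bytes_per_pixel;

      /* Swap the outermost bytes and walk inwards. */
      for (lo = 0, hi = bytes_per_pixel - 1; lo < hi; ++lo, --hi) {
         uint8_t tmp = p[lo];
         p[lo] = p[hi];
         p[hi] = tmp;
      }
   }
}
```

This would only make sense for packed ('arith') formats; 'array' formats
are endianness neutral, as noted earlier in the thread, and should be
left alone.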

Alex
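
P.S. On the original swizzle question: a small sketch of how I read
michal's description of the swizzle field, in case it helps. It assumes
the four letters say, for R, G, B and A in turn, which stored channel
(x, y, z or w) supplies that component. That reading, the function name
and the 8-bit channel type are assumptions for illustration, not the
generated u_format code.

```c
#include <stdint.h>

/* Hypothetical illustration: reorder four already-unpacked channels
 * (x, y, z, w in storage order) into R, G, B, A using a swizzle string
 * such as "zyxw". Returns 0 on success, -1 on an unknown selector. */
int
sketch_apply_swizzle(const char *swizzle, const uint8_t src[4],
                     uint8_t dst[4])
{
   int i;

   for (i = 0; i < 4; ++i) {
      switch (swizzle[i]) {
      case 'x': dst[i] = src[0]; break;
      case 'y': dst[i] = src[1]; break;
      case 'z': dst[i] = src[2]; break;
      case 'w': dst[i] = src[3]; break;
      default:  return -1;   /* selector not handled in this sketch */
      }
   }
   return 0;
}
```

Under that reading "zyxw" only fits storage whose channels are B, G, R, A
in x..w order, so it can't be right for ARGB, ABGR and BGRA all at once,
which matches Marek's observation.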
