On Wed, 2010-01-06 at 06:32 -0800, michal wrote:
> Michel Dänzer wrote on 2010-01-06 15:23:
> > On Wed, 2010-01-06 at 14:03 +0000, José Fonseca wrote:
> >> On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
> >>> michal wrote on 2010-01-06 07:58:
> >>>> michal wrote on 2009-12-22 10:00:
> >>>>> Marek Olšák wrote on 2009-12-22 08:40:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I noticed that gallium/auxiliary/util/u_format.csv contains some weird
> >>>>>> swizzling, for example see this:
> >>>>>>
> >>>>>> $ grep zyxw u_format.csv
> >>>>>> PIPE_FORMAT_A8R8G8B8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
> >>>>>> PIPE_FORMAT_A1R5G5B5_UNORM , arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
> >>>>>> PIPE_FORMAT_A4R4G4B4_UNORM , arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
> >>>>>> PIPE_FORMAT_A8B8G8R8_SNORM , arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
> >>>>>> PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
> >>>>>>
> >>>>>> It's hard to believe that ARGB, ABGR, and BGRA have the same
> >>>>>> swizzling. Let's continue our journey:
> >>>>>>
> >>>>>> $ grep A8R8G8B8 u_format.csv
> >>>>>> PIPE_FORMAT_A8R8G8B8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
> >>>>>> PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb
> >>>>>>
> >>>>>> Same formats, different swizzling? Also:
> >>>>>>
> >>>>>> $ grep B8G8R8A8 u_format.csv
> >>>>>> PIPE_FORMAT_B8G8R8A8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
> >>>>>> PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
> >>>>>>
> >>>>>> Same formats, different swizzling? I don't really get it. And there
> >>>>>> are many more cases like these. Could someone tell me what the
> >>>>>> intended order of channels should be (or possibly propose a fix)?
> >>>>>> The meaning of the whole table is self-contradictory and it's
> >>>>>> definitely the source of some r300g bugs.
> >>>>>>
> >>>>> Marek,
> >>>>>
> >>>>> Yes, that seems like a defect. The format swizzle field tells us how to
> >>>>> "swizzle" the incoming pixel so that its components are ordered in some
> >>>>> predefined order. For RGB and SRGB colorspaces the order is R, G, B and
> >>>>> A. For depth-stencil, i.e. the ZS color space, the order is Z and then S.
> >>>>>
> >>>>> I will have a look at this.
> >>>>>
> >>>> Marek, Jose,
> >>>>
> >>>> Can you review the attached patch?
> >>>>
> >>> Ouch, it looks like we will have to leave 24-bit (s)rgb formats with
> >>> array layout as the current code generator will bite us on big endian
> >>> platforms. Attached an updated patch.
> >>>
> >> Why are you changing the layout from array to arith? Please leave that
> >> alone.
> >>
> >> Yes, the code generator needs a big endian -> little endian call to be
> >> correct on big endian platforms, as gallium formats should always be
> >> thought of in little endian terms, just like most hardware is.
> >>
> >
> > Actually, 'array' formats should be endianness neutral, and IMO 'arith'
> > formats should be defined in the CPU endianness. Though as discussed
> > before, having 'reversed' formats defined in the other endianness as
> > well might be useful. Drivers which can work on setups where the CPU
> > endianness doesn't match the GPU endianness should possibly only use
> > 'array' formats, but then there might need to be some kind of mapping
> > between the two kinds of formats somewhere, maybe in the state trackers
> > or an auxiliary module...
> >
>
> Interesting. Is there any reference that would say which formats are
> 'array', and which are not? Or is it a simple rule that when every
> component's bitsize is greater-or-equal to, say, 16, then it's an array
> format?

There isn't really a rule, and I haven't profiled the code enough to tell
which approach is faster for swizzling: bit-shift arithmetic or
byte/word/dword indexing.

My expectation is that byte/word/dword indexing will be faster for
n x 8-bit formats. In particular, the SSSE3 instruction PSHUFB can swizzle
any n x 8-bit format, 16 channels at a time, whereas the SSE2 bit-shift
arithmetic instructions can only do 4 or 8 channels at a time, for 32-bit
and 16-bit formats respectively.

Again, this is only relevant when generating the CPU routines that do
pixel translation. GPU drivers should behave identically regardless of
this layout.

Jose

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
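
To make the two approaches concrete, here is a rough Python sketch (not the
code the Mesa generator emits). It assumes a hypothetical convention in which
a swizzle string such as "zyxw" lists, for each output channel in turn, which
source channel to read, with x = channel 0 = the least significant byte of a
packed 32-bit pixel. That convention, the function names and the sample pixel
value are illustrative assumptions, not the semantics of u_format.csv.

```python
import struct

# Assumed mapping from swizzle letter to source channel index (hypothetical).
SWIZZLE_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}


def swizzle_arith(pixel, swizzle):
    """'arith' style: shift and mask channels out of a packed 32-bit integer."""
    return tuple((pixel >> (8 * SWIZZLE_INDEX[c])) & 0xFF for c in swizzle)


def swizzle_array(data, swizzle):
    """'array' style: pick channels by byte index from the pixel in memory."""
    return tuple(data[SWIZZLE_INDEX[c]] for c in swizzle)


pixel = 0x11223344                    # made-up packed pixel value
le_bytes = struct.pack("<I", pixel)   # memory layout on a little-endian CPU
be_bytes = struct.pack(">I", pixel)   # memory layout on a big-endian CPU

print([hex(v) for v in swizzle_arith(pixel, "zyxw")])
print([hex(v) for v in swizzle_array(le_bytes, "zyxw")])
print([hex(v) for v in swizzle_array(be_bytes, "zyxw")])
```

Under these assumptions the shift-based and byte-indexed results agree for
the little-endian layout and differ for the big-endian one, which is the
reason a packed ('arith') format needs a byte swap, or a definition tied to
CPU endianness, before byte indexing can be used on it.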