2010/1/6 Michel Dänzer <mic...@daenzer.net>:
> On Wed, 2010-01-06 at 15:18 +0000, Keith Whitwell wrote:
>> On Wed, 2010-01-06 at 07:13 -0800, José Fonseca wrote:
>> > On Wed, 2010-01-06 at 06:51 -0800, Michel Dänzer wrote:
>> > > On Wed, 2010-01-06 at 14:32 +0000, José Fonseca wrote:
>> > > > On Wed, 2010-01-06 at 06:23 -0800, Michel Dänzer wrote:
>> > > > > On Wed, 2010-01-06 at 14:03 +0000, José Fonseca wrote:
>> > > > > > On Tue, 2010-01-05 at 23:36 -0800, michal wrote:
>> > > > > > > michal wrote on 2010-01-06 07:58:
>> > > > > > > > michal wrote on 2009-12-22 10:00:
>> > > > > > > >
>> > > > > > > >> Marek Olšák wrote on 2009-12-22 08:40:
>> > > > > > > >>
>> > > > > > > >>> Hi,
>> > > > > > > >>>
>> > > > > > > >>> I noticed that gallium/auxiliary/util/u_format.csv contains some weird
>> > > > > > > >>> swizzling, for example see this:
>> > > > > > > >>>
>> > > > > > > >>> $ grep zyxw u_format.csv
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A1R5G5B5_UNORM , arith , 1, 1, un5 , un5 , un5 , un1 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A4R4G4B4_UNORM , arith , 1, 1, un4 , un4 , un4 , un4 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A8B8G8R8_SNORM , arith , 1, 1, sn8 , sn8 , sn8 , sn8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
>> > > > > > > >>>
>> > > > > > > >>> It's hard to believe that ARGB, ABGR, and BGRA have the same swizzling.
>> > > > > > > >>> Let's continue our journey:
>> > > > > > > >>>
>> > > > > > > >>> $ grep A8R8G8B8 u_format.csv
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , zyxw, rgb
>> > > > > > > >>> PIPE_FORMAT_A8R8G8B8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , wxyz, srgb
>> > > > > > > >>>
>> > > > > > > >>> Same formats, different swizzling? Also:
>> > > > > > > >>>
>> > > > > > > >>> $ grep B8G8R8A8 u_format.csv
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_UNORM , arith , 1, 1, un8 , un8 , un8 , un8 , yzwx, rgb
>> > > > > > > >>> PIPE_FORMAT_B8G8R8A8_SRGB , arith , 1, 1, u8 , u8 , u8 , u8 , zyxw, srgb
>> > > > > > > >>>
>> > > > > > > >>> Same formats, different swizzling? I don't really get it. And there's
>> > > > > > > >>> much more cases like these. Could someone tell me what the intended
>> > > > > > > >>> order of channels should be? (or possibly propose a fix) The meaning
>> > > > > > > >>> of the whole table is self-contradictory and it's definitely the
>> > > > > > > >>> source of some r300g bugs.
>> > > > > > > >>>
>> > > > > > > >> Marek,
>> > > > > > > >>
>> > > > > > > >> Yes, that seems like a defect. The format swizzle field tells us how to
>> > > > > > > >> "swizzle" the incoming pixel so that its components are ordered in some
>> > > > > > > >> predefined order. For RGB and SRGB colorspaces the order is R, G, B and
>> > > > > > > >> A. For depth-stencil, ie. ZS color space the order is Z and then S.
>> > > > > > > >>
>> > > > > > > >> I will have a look at this.
>> > > > > > > >>
>> > > > > > > > Marek, Jose,
>> > > > > > > >
>> > > > > > > > Can you review the attached patch?
>> > > > > > > >
>> > > > > > >
>> > > > > > > Ouch, it looks like we will have to leave 24-bit (s)rgb formats with
>> > > > > > > array layout as the current code generator will bite us on big endian
>> > > > > > > platforms. Attached an updated patch.
>> > > > > >
>> > > > > > Why are you changing the layout from array to arith? Please leave that
>> > > > > > alone.
>> > > > > >
>> > > > > > Yes, the code generator needs a big endian -> little endian call to be
>> > > > > > correct on big endian platforms, as gallium formats should always be
>> > > > > > thought of in little endian terms, just like most hardware is.
>> > > > >
>> > > > > Actually, 'array' formats should be endianness neutral,
>> > > >
>> > > > Yep.
>> > > >
>> > > > > and IMO 'arith' formats should be defined in the CPU endianness.
>> > > >
>> > > > I originally thought that too, but Keith convinced me that "gallium is a
>> > > > hardware abstraction, and all 3d hardware is little endian, therefore
>> > > > gallium formats should be always in little endian."
>> > >
>> > > Then there probably should be no 'arith' formats, at least not when the
>> > > components consist of an integer number of bytes.
>> >
>> > Yes, that's probably the best.
>> >
>> > > > > Though as discussed
>> > > > > before, having 'reversed' formats defined in the other endianness as
>> > > > > well might be useful. Drivers which can work on setups where the CPU
>> > > > > endianness doesn't match the GPU endianness should possibly only use
>> > > > > 'array' formats, but then there might need to be some kind of mapping
>> > > > > between the two kinds of formats somewhere, maybe in the state trackers
>> > > > > or an auxiliary module...
>> > > >
>> > > > Basically a developer implementing a pipe driver for some hardware should
>> > > > not have to worry about CPU endianness.
>> > > > If a graphics API defines formats in terms of the native CPU endianness
>> > > > then the state tracker will have to do the translation.
>> > >
>> > > That's more or less what I meant in my last sentence above. Hopefully
>> > > it'll be possible to share this between state trackers at least to some
>> > > degree via an auxiliary module or so. At least OpenGL and X11 define
>> > > (some) formats in CPU endianness.
>> >
>> > OK. We agree then.
>> >
>> > I don't know how you envision this auxiliary functionality. I don't
>> > think it is actually necessary to define a bunch of PIPE_FORMAT_xxxx_REV
>> > formats, given that no hardware will ever support them. Instead,
>> > code-generating a variation of u_format_access.py which reads formats in
>> > native endianness should suffice. That is
>> >
>> > void
>> > util_format_read_4f_native(...);
>> >
>> > void
>> > util_format_write_4f_native(...);
>> >
>> > void
>> > util_format_read_4ub_native(...);
>> >
>> > void
>> > util_format_write_4ub_native(...);
>> >
>> > Plus code-generate one extra function that just does endianness
>> > translation, without pixel unpacking/packing.
>> >
>> > util_format_byte_swap(...)
>> >
>> > I believe this should give all functionality a state tracker might need.
>> > What do you think?
>> >
>> > Jose
>> >
>>
>> This is potentially a lot of data we're translating - all incoming
>> vertex data, for instance. I'd be interested to know more about what
>> really happens on big-endian systems with various graphics cards outside
>> of gallium -- are they really doing this level of translation?
>
> For vertex data, the Radeon drivers are using hardware vertex fetcher
> byte-swapping support. (Unfortunately, that doesn't have any effect if
> the vertex data is in VRAM...) Texture data is byte-swapped by the CPU
> as necessary.
>
>> I don't claim that no GPU is big or little endian internally -- just
>> that they definitely all have the capability to read/write little-endian
>> formats at wire speed.
>>
>> But is it really true that they do not have that capability for
>> big-endian data?
>
> In theory yes, but in practice there are various quirks to the
> byte-swapping bits which make most of them mostly useless at least on
> R300 generation hardware.

On r6xx+ just about every block has an endian swap control, although I
haven't actually tested to see how well it works on each block (crtc
swapping appears to work, but I haven't tried the others). There is
also a BE version of the DRAW_INDEX_IMMD packet. See
SQ_VTX_CONSTANT_WORD2_0 for vertexes and SQ_TEX_RESOURCE_WORD4_0 for
textures. The CP also has a swapper.

Alex
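
For reference, the CPU-side fallback mentioned in the thread (texture data byte-swapped by the CPU as necessary, and an endianness-only translation with no pixel unpacking/packing) could look roughly like the sketch below. This is only an illustration: the thread gives no signature for util_format_byte_swap(...), so the names sketch_bswap32 and sketch_byte_swap_32 are hypothetical and are not Mesa or Gallium functions, and the sketch assumes tightly packed 32-bit pixels.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): reverse the byte order
 * of a single 32-bit word. */
static uint32_t sketch_bswap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) |
           (v << 24);
}

/* Hypothetical helper (not a Mesa function): byte-swap `count` packed
 * 32-bit pixels in place. Only the byte order changes; the channels
 * are not unpacked or reinterpreted. */
static void sketch_byte_swap_32(uint32_t *pixels, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        pixels[i] = sketch_bswap32(pixels[i]);
}
```

Applying the swap twice restores the original data, so one routine of this kind would cover both directions. Whether a pass like this is affordable for all incoming vertex data is exactly the cost concern raised above; where the hardware swap controls work, they would avoid it.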