Paul Berry <stereotype...@gmail.com> writes: > On 19 December 2013 13:30, Francisco Jerez <curroje...@riseup.net> wrote: > >> Paul Berry <stereotype...@gmail.com> writes: >> >> > On 2 December 2013 11:31, Francisco Jerez <curroje...@riseup.net> wrote: >> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h >> >> b/src/mesa/drivers/dri/i965/brw_shader.h >> >> index ff5af93..f284389 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_shader.h >> >> +++ b/src/mesa/drivers/dri/i965/brw_shader.h >> >> @@ -23,6 +23,7 @@ >> >> >> >> #include <stdint.h> >> >> #include "brw_defines.h" >> >> +#include "brw_reg.h" >> >> #include "glsl/ir.h" >> >> >> >> #pragma once >> >> @@ -39,6 +40,45 @@ enum register_file { >> >> >> >> #ifdef __cplusplus >> >> >> >> +class backend_reg { >> >> +public: >> >> + backend_reg(); >> >> + backend_reg(struct brw_reg reg); >> >> + >> >> + bool is_zero() const; >> >> + bool is_one() const; >> >> + bool is_null() const; >> >> + >> >> + /** Register file: GRF, MRF, IMM. */ >> >> + enum register_file file; >> >> + >> >> + /** >> >> + * Register number. For MRF, it's the hardware register. For >> >> + * GRF, it's a virtual register number until register allocation >> >> + */ >> >> + int reg; >> >> + >> >> + /** >> >> + * Offset from the start of the contiguous register block. >> >> + * >> >> + * For pre-register-allocation GRFs, this is in units of a float per >> >> pixel >> >> + * (1 hardware register for SIMD8 mode, or 2 registers for SIMD16 >> >> mode). >> >> >> > >> > Since this code is now shared with the vec4 back-end, can we update this >> > comment to say "1 hardware register for SIMD8 or SIMD4x2 mode..."? >> > >> > >> >> + * For uniforms, this is in units of 1 float. >> >> + */ >> >> + int reg_offset; >> >> + >> >> + /** Register type. BRW_REGISTER_TYPE_* */ >> >> + int type; >> >> + struct brw_reg fixed_hw_reg; >> >> + >> >> + /** Value for file == BRW_IMMMEDIATE_FILE */ >> >> + union { >> >> + int32_t i; >> >> + uint32_t u; >> >> + float f; >> >> + } imm; >> >> +}; >> >> + >> >> class backend_instruction : public exec_node { >> >> public: >> >> bool is_tex(); >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h >> >> b/src/mesa/drivers/dri/i965/brw_vec4.h >> >> index 3b3f35b..a718333 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_vec4.h >> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h >> >> @@ -89,28 +89,7 @@ class dst_reg; >> >> unsigned >> >> swizzle_for_size(int size); >> >> >> >> -class reg >> >> -{ >> >> -public: >> >> - /** Register file: GRF, MRF, IMM. */ >> >> - enum register_file file; >> >> - /** virtual register number. 0 = fixed hw reg */ >> >> - int reg; >> >> - /** Offset within the virtual register. */ >> >> - int reg_offset; >> >> - /** Register type. BRW_REGISTER_TYPE_* */ >> >> - int type; >> >> - struct brw_reg fixed_hw_reg; >> >> - >> >> - /** Value for file == BRW_IMMMEDIATE_FILE */ >> >> - union { >> >> - int32_t i; >> >> - uint32_t u; >> >> - float f; >> >> - } imm; >> >> -}; >> >> - >> >> -class src_reg : public reg >> >> +class src_reg : public backend_reg >> >> { >> >> public: >> >> DECLARE_RALLOC_CXX_OPERATORS(src_reg) >> >> @@ -123,10 +102,9 @@ public: >> >> src_reg(uint32_t u); >> >> src_reg(int32_t i); >> >> src_reg(struct brw_reg reg); >> >> + src_reg(const backend_reg ®); >> >> >> > >> > I'm concerned about this constructor (and the corresponding constructors >> in >> > the dst_reg and fs_reg classes) contributing to object slicing problems. >> > Consider this code: >> > >> > src_reg r1; >> > r1.swizzle = BRW_SWIZZLE_XXXX; >> > backend_reg r2(r1); >> > src_reg r3(r2); >> > assert(r3.swizzle == BRW_SWIZZLE_XXXX); /* fails! */ >> > >> > The reason for the failure is that src_reg::swizzle doesn't appear in the >> > backend_reg base class. So when r1 is copied to r2, the value of swizzle >> > is lost. That by itself wouldn't be a problem, except that when we later >> > try to reconstruct a src_reg from r2, swizzle winds up being initialized >> to >> > the default value. >> > >> > Of course we would never write code like the above example directly, but >> > since your shared code for implementing loads/stores (in future patches) >> > uses backend_reg for all of its register representations, it seems likely >> > that a src_reg will get converted to a backend_reg and then back to a >> > src_reg at some point during compilation. So I suspect that when >> compiling >> > GLSL source code like the following: >> > >> > uniform image2D img; >> > ivec2 coord; >> > vec4 v; >> > imageStore(img, coord, v.yzwx); >> > >> > The swizzle would fail to take effect. >> > >> > Have you run tests to see if this is in fact the case? >> > >> >> I haven't, but I'm aware of this problem. It's one of the things I wish >> I had finished before I left. I had a few solutions in mind, all of >> them likely to be time-consuming and not especially compelling: >> >> - Rename what I called backend_reg to e.g. base_reg, make it an >> abstract base class or use some other technique to prevent its >> construction. Now backend_reg would be implemented as a sort of >> type-erased wrapper for the other kinds of regs. This wouldn't be >> too hard because the only thing we need backend_regs for is to pass >> values around, the actual uses of those regs know about the real >> register types. >> >> - Prevent construction of backend_reg as before, but pass them around >> as pointers instead of by value. This would be simple but kind of >> annoying and inconsistent with the current value semantics of the >> other register classes. >> >> - Make backend_reg a member variable rather than a base class to >> prevent unsafe up-casting. We could have helper functions to convert >> a fs/vec4 reg into a backend reg that check if the original register >> region parameters are the identity mapping and, if they are not, copy >> the value over to some temporary storage. Assignments would be >> handled by the visitor itself rather than the surface lowering code. >> >> - Use static rather than dynamic polymorphism in the shared code, which >> would be parameterized by the correct register type. There is really >> no reason for the shared code to be a (polymorphic) class, other than >> to avoid templates. >> >> I think right now I'm leaning towards possibility 1, but I'll give the >> matter another thought and try to get one of these solutions working >> when I find the time. I can't promise when that will happen though. >> >> Let me know if you come up with anything else. >> >> Thanks. >> > > I played around a little bit yesterday with the last option (using static > rather than dynamic polymorphism), and I like what I came up with, but it > involves templates so it would be a hard sell for a lot of folks. I'm > hoping that I can have an in-person conversation with some of the Intel > developers about this so that we can hash it out in person without things > getting too flamey. Unfortunately, several people have already started > their holiday vacations, so that conversation probably won't happen until > the new year.
Any news on this? Thanks.
pgpt__b_hxdEh.pgp
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev