On Wed, Oct 7, 2015 at 7:51 AM, Emil Velikov <emil.l.veli...@gmail.com> wrote:
> Signed-off-by: Emil Velikov <emil.veli...@collabora.com>
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  9 +++++++++
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 ++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 03fe680..bcb8f38 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1317,6 +1317,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld,
> nir_intrinsic_instr *instr
>        break;
>     }
>
> +   case nir_intrinsic_shader_clock: {
> +      /* We cannot do anything if there is an event, so ignore it for now */
> +      fs_reg shader_clock = get_timestamp(bld);
get_timestamp() isn't doing quite what you want. If you look at the
definition of fs_visitor::get_timestamp(), you see this comment:

   /* The caller wants the low 32 bits of the timestamp.  Since it's running
    * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
    * which is plenty of time for our purposes.  It is identical across the
    * EUs, but since it's tracking GPU core speed it will increment at a
    * varying rate as render P-states change.
    *
    * The caller could also check if render P-states have changed (or anything
    * else that might disrupt timing) by setting smear to 2 and checking if
    * that field is != 0.
    */

When you read the ARF register, you get three components set in your
SIMD8 register, plus some junk because we have to read in four
components at a time. That is, this code in get_timestamp:

   fs_reg ts = fs_reg(retype(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE,
                                          BRW_ARF_TIMESTAMP,
                                          0),
                             BRW_REGISTER_TYPE_UD));

   bld.group(4, 0).exec_all().MOV(dst, ts);

will create a register called dst that looks like this:

------------------------------
| tm0.0 | tm0.1 | tm0.2 | junk |
------------------------------

Then doing dst.set_smear(0) will create a source that looks like this:

---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------

Or, in SIMD16,

---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------
---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------

Whereas what you actually want is:

---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------
---------------------------------------------------------------
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
---------------------------------------------------------------

Or, in SIMD16,

---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------
---------------------------------------------------------------
| tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 | tm0.0 |
---------------------------------------------------------------
---------------------------------------------------------------
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
---------------------------------------------------------------
---------------------------------------------------------------
| tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 | tm0.1 |
---------------------------------------------------------------

You can use get_timestamp(), but you need to save the result and emit
two more MOVs into appropriately-sized registers allocated with
bld.vgrf(BRW_REGISTER_TYPE_UD). Pick out the right components using the
same smear() method that the get_timestamp() code uses (note that
although it did smear(0), you can still call .smear(1) on the result to
pick out tm0.1), and then emit a LOAD_PAYLOAD to combine them into the
destination.

> +
> +      bld.MOV(retype(dest, brw_type_for_base_type(glsl_type::uvec2_type)),
> +              shader_clock);
> +      break;
> +   }
> +
>     case nir_intrinsic_image_size: {
>        /* Get the referenced image variable and type.
>         */
>        const nir_variable *var = instr->variables[0]->var;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 41bd80d..f1de8d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -806,6 +806,16 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr
> *instr)
>        break;
>     }
>
> +   case nir_intrinsic_shader_clock: {
> +      /* We cannot do anything if there is an event, so ignore it for now */
> +      src_reg shader_clock = get_timestamp();
> +      enum brw_reg_type type = brw_type_for_base_type(glsl_type::uvec2_type);
> +
> +      dest = get_nir_dest(instr->dest, type);
> +      emit(MOV(dest, retype(shader_clock, type)));
> +      break;
> +   }
> +
>     default:
>        unreachable("Unknown intrinsic");
>     }
> --
> 2.5.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
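
P.S. To make the FS register layout from the diagrams concrete, here is
a small standalone C++ model of "smear two components, then combine
them with a LOAD_PAYLOAD". This is not Mesa code and does not use the
i965 backend API: TimestampRead, smear(), load_payload() and
shader_clock_uvec2() are hypothetical stand-ins that only mimic the
data movement, assuming one 32-bit value per SIMD channel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the four dwords read from the timestamp ARF:
// tm0.0, tm0.1, tm0.2, plus one junk dword.
using TimestampRead = std::array<uint32_t, 4>;

// Models a smeared source: every channel of a SIMD-wide register sees the
// same component of the timestamp read.
std::vector<uint32_t> smear(const TimestampRead &ts, std::size_t component,
                            std::size_t simd_width)
{
    assert(component < ts.size());
    return std::vector<uint32_t>(simd_width, ts[component]);
}

// Models the combining step: the source registers are laid out back to back
// in the destination, in the order given.
std::vector<uint32_t>
load_payload(const std::vector<std::vector<uint32_t>> &sources)
{
    std::vector<uint32_t> payload;
    for (const auto &src : sources)
        payload.insert(payload.end(), src.begin(), src.end());
    return payload;
}

// Builds the uvec2 layout from the diagrams: simd_width copies of tm0.0
// followed by simd_width copies of tm0.1.
std::vector<uint32_t> shader_clock_uvec2(const TimestampRead &ts,
                                         std::size_t simd_width)
{
    return load_payload({smear(ts, 0, simd_width),
                         smear(ts, 1, simd_width)});
}
```

With simd_width set to 8 the result is 16 values (eight of tm0.0, then
eight of tm0.1); with 16 it is 32 values, matching the SIMD8 and SIMD16
pictures of what the destination should contain.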