Re: [Mesa-dev] [PATCH 0/6] r600g: r600_shader.c small cleanups
On 28.02.2017 05:44, Dave Airlie wrote:
> On 28 February 2017 at 12:42, Constantine Charlamov <hi-an...@yandex.ru> wrote:
>> On 28.02.2017 05:19, Dave Airlie wrote:
>>> On 27 February 2017 at 06:31, Constantine Charlamov <hi-an...@yandex.ru> wrote:
>>>> Initially I was trying to implement for r600 an optimization like the one in
>>>> commit d633e23192ef17207f4a6acd3009da3126aab395 for radeonsi, but failed
>>>> because I need to learn more about GPU internals. That is for another time.
>>>> Anyway, it accidentally turned into a small cleanup of r600_shader.c; here it
>>>> is.
>>>>
>>>> Hi-Angel (6):
>>>>   Get rid of trailing whitespace (trivial)
>>>>   Rename i→chan_index
>>>>   Replace bit-shifts and cycles with helpers from tgsi_exec.h
>>>>   Rename tgsi_last_instruction → tgsi_last_channel
>>>>   Get rid of tgsi_last_channel() wherever possible, rename lasti → last_chan
>>>>   Remove redudant comparisons
>>>
>>> Have they passed a complete piglit run without regressions?
>>>
>>> Dave.
>>
>> Hmm, I don't know. Is there some specific test I should be running? I did try
>>
>> ./piglit run shader results/shader --all-concurrent
>
> piglit run -c tests/gpu.py results/gpu
>
> I'm not sure how long an r600 run takes, I haven't run it in a while,
> but any patches that clean stuff up should probably make sure they
> don't cause regressions on the way.
>
> Dave.

Thank you. Unfortunately, piglit testing with mesa-master makes my GPU lock up. It doesn't happen with the 17.0 branch, but the patchset doesn't apply cleanly there. Any suggestions for how to find which test causes the lockup?

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/6] r600g: r600_shader.c small cleanups
On 28.02.2017 05:19, Dave Airlie wrote:
> On 27 February 2017 at 06:31, Constantine Charlamov <hi-an...@yandex.ru> wrote:
>> Initially I was trying to implement for r600 an optimization like the one in
>> commit d633e23192ef17207f4a6acd3009da3126aab395 for radeonsi, but failed
>> because I need to learn more about GPU internals. That is for another time.
>> Anyway, it accidentally turned into a small cleanup of r600_shader.c; here it
>> is.
>>
>> Hi-Angel (6):
>>   Get rid of trailing whitespace (trivial)
>>   Rename i→chan_index
>>   Replace bit-shifts and cycles with helpers from tgsi_exec.h
>>   Rename tgsi_last_instruction → tgsi_last_channel
>>   Get rid of tgsi_last_channel() wherever possible, rename lasti → last_chan
>>   Remove redudant comparisons
>
> Have they passed a complete piglit run without regressions?
>
> Dave.

Hmm, I don't know. Is there some specific test I should be running? I did try

./piglit run shader results/shader --all-concurrent

but there are 33k tests, running at a speed that would take 1-1.5 hours for every run (i.e. with and without the changes). I can try it anyway, but I suspect I'm doing something wrong, because usually when I see people mention piglit tests, the numbers are far below 10k.
Re: [Mesa-dev] [PATCH 0/6] r600g: r600_shader.c small cleanups
On 28.02.2017 05:38, Matt Turner wrote:
> On Sun, Feb 26, 2017 at 12:31 PM, Constantine Charlamov <hi-an...@yandex.ru> wrote:
>> Initially I was trying to implement for r600 an optimization like the one in
>> commit d633e23192ef17207f4a6acd3009da3126aab395 for radeonsi, but failed
>> because I need to learn more about GPU internals. That is for another time.
>> Anyway, it accidentally turned into a small cleanup of r600_shader.c; here it
>> is.
>>
>> Hi-Angel (6):
>
> Do you mind using a real name in git?

I actually did change my name in git, but "Hi-Angel" probably stayed because I did it after the commits were made, i.e. the recorded author of the commits is still Hi-Angel. I'll fix that.
Re: [Mesa-dev] [PATCH 0/6] r600g: r600_shader.c small cleanups
Okay. Should I resend them?

On 27.02.2017 16:32, Marek Olšák wrote:
> All r600g commits should have the "r600g:" prefix.
>
> Marek
>
> On Sun, Feb 26, 2017 at 9:31 PM, Constantine Charlamov <hi-an...@yandex.ru> wrote:
>> Initially I was trying to implement for r600 an optimization like the one in
>> commit d633e23192ef17207f4a6acd3009da3126aab395 for radeonsi, but failed
>> because I need to learn more about GPU internals. That is for another time.
>> Anyway, it accidentally turned into a small cleanup of r600_shader.c; here it
>> is.
>>
>> Hi-Angel (6):
>>   Get rid of trailing whitespace (trivial)
>>   Rename i→chan_index
>>   Replace bit-shifts and cycles with helpers from tgsi_exec.h
>>   Rename tgsi_last_instruction → tgsi_last_channel
>>   Get rid of tgsi_last_channel() wherever possible, rename lasti → last_chan
>>   Remove redudant comparisons
>>
>>  src/gallium/drivers/r600/r600_shader.c | 815 ++---
>>  1 file changed, 333 insertions(+), 482 deletions(-)
>>
>> --
>> 2.11.1
[Mesa-dev] [PATCH 1/6] Get rid of trailing whitespace (trivial)
From: Hi-Angel <hi-an...@yandex.ru>

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 44 +-
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 8cb3f8b2f4..46aa2c1abd 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -39,23 +39,23 @@
 #include <stdio.h>
 #include <errno.h>

-/* CAYMAN notes
+/* CAYMAN notes
 Why CAYMAN got loops for lots of instructions is explained here.

 -These 8xx t-slot only ops are implemented in all vector slots.
 MUL_LIT, FLT_TO_UINT, INT_TO_FLT, UINT_TO_FLT
-These 8xx t-slot only opcodes become vector ops, with all four
-slots expecting the arguments on sources a and b. Result is
+These 8xx t-slot only opcodes become vector ops, with all four
+slots expecting the arguments on sources a and b. Result is
 broadcast to all channels.
 MULLO_INT, MULHI_INT, MULLO_UINT, MULHI_UINT, MUL_64
-These 8xx t-slot only opcodes become vector ops in the z, y, and
+These 8xx t-slot only opcodes become vector ops in the z, y, and
 x slots.
 EXP_IEEE, LOG_IEEE/CLAMPED, RECIP_IEEE/CLAMPED/FF/INT/UINT/_64/CLAMPED_64
 RECIPSQRT_IEEE/CLAMPED/FF/_64/CLAMPED_64
 SQRT_IEEE/_64
 SIN/COS
-The w slot may have an independent co-issued operation, or if the
-result is required to be in the w slot, the opcode above may be
+The w slot may have an independent co-issued operation, or if the
+result is required to be in the w slot, the opcode above may be
 issued in the w slot as well.
 The compiler must issue the source argument to slots z, y, and x
 */
@@ -3160,7 +3160,7 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 			goto out_err;
 		}
 	}
-
+
 	shader->ring_item_sizes[0] = ctx.next_ring_offset;
 	shader->ring_item_sizes[1] = 0;
 	shader->ring_item_sizes[2] = 0;
@@ -4272,7 +4272,7 @@ static int cayman_emit_float_instr(struct r600_shader_ctx *ctx)
 	int i, j, r;
 	struct r600_bytecode_alu alu;
 	int last_slot = (inst->Dst[0].Register.WriteMask & 0x8) ? 4 : 3;
-
+
 	for (i = 0 ; i < last_slot; i++) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 		alu.op = ctx->inst_info->op;
@@ -4799,7 +4799,7 @@ static int tgsi_lit(struct r600_shader_ctx *ctx)
 				alu.last = 1;
 			} else
 				alu.dst.write = 0;
-
+
 			r = r600_bytecode_add_alu(ctx->bc, &alu);
 			if (r)
 				return r;
@@ -5275,7 +5275,7 @@ static int tgsi_divmod(struct r600_shader_ctx *ctx, int mod, int signed_op)
 			memset(&alu, 0, sizeof(struct r600_bytecode_alu));

 			alu.op = ALU_OP1_FLT_TO_UINT;
-
+
 			alu.dst.sel = tmp0;
 			alu.dst.chan = 0;
 			alu.dst.write = 1;
@@ -5346,7 +5346,7 @@ static int tgsi_divmod(struct r600_shader_ctx *ctx, int mod, int signed_op)
 			} else {
 				r600_bytecode_src(&alu.src[1], &ctx->src[1], i);
 			}
-
+
 			alu.last = 1;
 			if ((r = r600_bytecode_add_alu(ctx->bc, &alu)))
 				return r;
@@ -5612,7 +5612,7 @@ static int tgsi_divmod(struct r600_shader_ctx *ctx, int mod, int signed_op)
 			} else {
 				r600_bytecode_src(&alu.src[0], &ctx->src[1], i);
 			}
-
+
 			alu.src[1].sel = tmp0;
 			alu.src[1].chan = 2;

@@ -7014,7 +7014,7 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 			r = r600_bytecode_add_alu(ctx->bc, &alu);
 			if (r)
 				return r;
-			/* write initial compare value into Z component
+			/* write initial compare value into Z component
 			  - W src 0 for shadow cube
 			  - X src 1 for shadow cube array */
 			if (inst->Texture.Texture == TGSI_TEXTURE_SHADOWCUBE ||
@@ -7092,7 +7092,7 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 		r = r600_bytecode_add_alu(ctx->bc, &alu);
 		if (r)
 			return r;
-
+
 		r = r600_bytecode_add_tex(ctx->bc, &tex);
 		if (r)
 			return r;
@@ -7419,7 +7419,7 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 /*
[Mesa-dev] [PATCH 4/6] Rename tgsi_last_instruction → tgsi_last_channel
From: Hi-Angel <hi-an...@yandex.ru>

It's actually iterating through channels, checking whether they're enabled

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 56 +-
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 905214f69b..972e013aef 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -377,7 +377,7 @@ static void r600_bytecode_src(struct r600_bytecode_alu_src *bc_src,
 static int do_lds_fetch_values(struct r600_shader_ctx *ctx, unsigned temp_reg,
 			       unsigned dst_reg);

-static int tgsi_last_instruction(unsigned writemask)
+static int tgsi_last_channel(unsigned writemask)
 {
 	int i, last_ch = 0;

@@ -2692,7 +2692,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 		return r;

 	/* LDS write */
-	lasti = tgsi_last_instruction(write_mask);
+	lasti = tgsi_last_channel(write_mask);
 	for (chan_index = 1; chan_index <= lasti; chan_index++) {
 		if(!TGSI_IS_DST0_CHANNEL_ENABLED(inst, chan_index))
 			continue;
@@ -3766,7 +3766,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 		}
 	}

-	lasti = tgsi_last_instruction(write_mask);
+	lasti = tgsi_last_channel(write_mask);
 	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));

@@ -3893,7 +3893,7 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	unsigned write_mask = inst->Dst[0].Register.WriteMask;
-	int chan_index, j, r, lasti = tgsi_last_instruction(write_mask);
+	int chan_index, j, r, lasti = tgsi_last_channel(write_mask);
 	/* use temp register if trans_only and more than one dst component */
 	int use_tmp = trans_only && (write_mask ^ (1 << lasti));
 	unsigned op = ctx->inst_info->op;
@@ -3966,7 +3966,7 @@ static int tgsi_ineg(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	int chan_index, r;
-	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);

 	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -3994,7 +3994,7 @@ static int tgsi_dneg(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	int chan_index, r;
-	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);

 	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -4084,7 +4084,7 @@ static int egcm_int_to_double(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	int chan_index, r;
-	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);

 	assert(inst->Instruction.Opcode == TGSI_OPCODE_I2D ||
 		inst->Instruction.Opcode == TGSI_OPCODE_U2D);
@@ -4131,7 +4131,7 @@ static int egcm_double_to_int(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	int chan_index, r;
-	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);

 	assert(inst->Instruction.Opcode == TGSI_OPCODE_D2I ||
 		inst->Instruction.Opcode == TGSI_OPCODE_D2U);
@@ -4208,7 +4208,7 @@ static int cayman_emit_double_instr(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	int chan_index, r;
 	struct r600_bytecode_alu alu;
-	int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
+	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);
 	int t1 = ctx->temp_reg;

 	/* should only be one src regs */
@@ -4277,7 +4277,7 @@ static int cayman_mul_int_instr(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	int chan_ind
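The rename makes more sense once the helper's behaviour is spelled out: it scans the four write-mask bits and reports the highest channel that is enabled. A minimal standalone sketch follows; `NUM_CHANNELS` and `last_enabled_channel` are illustrative names, not Mesa identifiers, and the range check is an addition of this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for TGSI_NUM_CHANNELS (x, y, z, w). */
#define NUM_CHANNELS 4

/* Return the index of the highest channel enabled in a 4-bit write mask.
 * Like the helper in the patch, this falls back to 0 for an empty mask. */
int last_enabled_channel(unsigned writemask)
{
	int chan, last = 0;

	/* Assumption of this sketch: only the low four bits may be set. */
	assert(writemask < (1u << NUM_CHANNELS));

	for (chan = 0; chan < NUM_CHANNELS; chan++) {
		if (writemask & (1u << chan))
			last = chan;
	}
	return last;
}
```

For example, a write mask of 0x5 (x and z enabled) yields channel 2, which is why "last channel" describes the result better than "last instruction".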
[Mesa-dev] [PATCH 3/6] Replace bit-shifts and cycles with helpers from tgsi_exec.h
From: Hi-Angel <hi-an...@yandex.ru>

The changes turned out to be bigger than I expected, so I skipped every place
where I was in doubt. Still, it looks better.

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 246 +
 1 file changed, 65 insertions(+), 181 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 8562678d0c..905214f69b 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -30,6 +30,7 @@

 #include "pipe/p_shader_tokens.h"
 #include "tgsi/tgsi_info.h"
+#include "tgsi/tgsi_exec.h"
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_scan.h"
 #include "tgsi/tgsi_dump.h"
@@ -378,14 +379,14 @@ static int do_lds_fetch_values(struct r600_shader_ctx *ctx, unsigned temp_reg,

 static int tgsi_last_instruction(unsigned writemask)
 {
-	int i, lasti = 0;
+	int i, last_ch = 0;

-	for (i = 0; i < 4; i++) {
+	TGSI_FOR_EACH_CHANNEL (i) {
 		if (writemask & (1 << i)) {
-			lasti = i;
+			last_ch = i;
 		}
 	}
-	return lasti;
+	return last_ch;
 }

 static int tgsi_is_supported(struct r600_shader_ctx *ctx)
@@ -2693,8 +2694,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 	/* LDS write */
 	lasti = tgsi_last_instruction(write_mask);
 	for (chan_index = 1; chan_index <= lasti; chan_index++) {
-
-		if (!(write_mask & (1 << chan_index)))
+		if(!TGSI_IS_DST0_CHANNEL_ENABLED(inst, chan_index))
 			continue;
 		r = single_alu_op2(ctx, ALU_OP2_ADD_INT,
 				   temp_reg, chan_index,
@@ -2704,10 +2704,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 			return r;
 	}

-	for (chan_index = 0; chan_index <= lasti; chan_index++) {
-		if (!(write_mask & (1 << chan_index)))
-			continue;
-
+	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		if ((chan_index == 0 && ((write_mask & 3) == 3)) ||
 		    (chan_index == 2 && ((write_mask & 0xc) == 0xc))) {
 			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -3747,7 +3744,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	unsigned write_mask = inst->Dst[0].Register.WriteMask;
 	struct r600_bytecode_alu alu;
-	int chan_index, j, r, lasti = tgsi_last_instruction(write_mask);
+	int chan_index, j, r, lasti;
 	int use_tmp = 0;

 	if (singledest) {
@@ -3770,11 +3767,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 	}

 	lasti = tgsi_last_instruction(write_mask);
-	for (chan_index = 0; chan_index <= lasti; chan_index++) {
-
-		if (!(write_mask & (1 << chan_index)))
-			continue;
-
+	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));

 		if (singledest) {
@@ -3823,10 +3816,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 		write_mask = inst->Dst[0].Register.WriteMask;

 		/* move result from temp to dst */
-		for (chan_index = 0; chan_index <= lasti; chan_index++) {
-			if (!(write_mask & (1 << chan_index)))
-				continue;
-
+		TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 			alu.op = ALU_OP1_MOV;
 			tgsi_dst(ctx, &inst->Dst[0], chan_index, &alu.dst);
@@ -3912,10 +3902,7 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
 	    ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
 		op = ALU_OP2_MUL;

-	for (chan_index = 0; chan_index <= lasti; chan_index++) {
-		if (!(write_mask & (1 << chan_index)))
-			continue;
-
+	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 		if (use_tmp) {
 			alu.dst.sel = ctx->temp_reg;
@@ -3943,10 +3930,7 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)

 	if (use_tmp) {
 		/* move result from temp to dst */
-		for (chan_index = 0; chan_index <= lasti; chan_index++) {
-
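To see why the open-coded loops and the macro form are interchangeable, here is a small standalone comparison. The macros are simplified stand-ins modeled on the tgsi_exec.h helpers named in the patch (the real ones take an instruction and read its Dst[0] write mask); `collect_open_coded` and `collect_with_macro` are hypothetical names used only for this sketch.

```c
#include <assert.h>

/* Simplified stand-ins for the tgsi_exec.h helpers (assumption: the real
 * macros operate on a tgsi_full_instruction rather than a bare mask). */
#define NUM_CHANNELS 4
#define FOR_EACH_CHANNEL(chan) \
	for ((chan) = 0; (chan) < NUM_CHANNELS; (chan)++)
#define IS_CHANNEL_ENABLED(mask, chan) ((mask) & (1u << (chan)))
#define FOR_EACH_ENABLED_CHANNEL(mask, chan) \
	FOR_EACH_CHANNEL(chan) \
		if (IS_CHANNEL_ENABLED(mask, chan))

/* Open-coded form: loop up to the last enabled channel, test each bit. */
int collect_open_coded(unsigned write_mask, int lasti, int out[NUM_CHANNELS])
{
	int i, n = 0;

	assert(lasti >= 0 && lasti < NUM_CHANNELS);
	for (i = 0; i <= lasti; i++) {
		if (!(write_mask & (1u << i)))
			continue;
		out[n++] = i;
	}
	return n;
}

/* Macro form: visits the same enabled channels in the same order. */
int collect_with_macro(unsigned write_mask, int out[NUM_CHANNELS])
{
	int chan, n = 0;

	FOR_EACH_ENABLED_CHANNEL(write_mask, chan) {
		out[n++] = chan;
	}
	return n;
}
```

The two agree as long as `lasti` really is the last enabled channel; channels above it are disabled, so the macro's extra iterations do nothing. One design caveat of a macro that ends in a bare `if`: an `else` written directly after the loop body would bind to the hidden `if`.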
[Mesa-dev] [PATCH 6/6] Remove redudant comparisons
From: Hi-Angel <hi-an...@yandex.ru>

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 64 --
 1 file changed, 14 insertions(+), 50 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 3616de572b..9afaaa57ba 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -745,10 +745,7 @@ static int single_alu_op2(struct r600_shader_ctx *ctx, int op,
 	alu.dst.chan = dst_chan;
 	alu.dst.write = 1;
 	alu.last = 1;
-	r = r600_bytecode_add_alu(ctx->bc, &alu);
-	if (r)
-		return r;
-	return 0;
+	return r600_bytecode_add_alu(ctx->bc, &alu);
 }

 /* execute a single slot ALU calculation */
@@ -759,7 +756,6 @@ static int single_alu_op3(struct r600_shader_ctx *ctx, int op,
 			  int src2_sel, unsigned src2_chan_val)
 {
 	struct r600_bytecode_alu alu;
-	int r;

 	/* validate this for other ops */
 	assert(op == ALU_OP3_MULADD_UINT24);
@@ -784,10 +780,7 @@ static int single_alu_op3(struct r600_shader_ctx *ctx, int op,
 	alu.dst.chan = dst_chan;
 	alu.is_op3 = 1;
 	alu.last = 1;
-	r = r600_bytecode_add_alu(ctx->bc, &alu);
-	if (r)
-		return r;
-	return 0;
+	return r600_bytecode_add_alu(ctx->bc, &alu);
 }

 /* put it in temp_reg.x */
@@ -795,21 +788,16 @@ static int get_lds_offset0(struct r600_shader_ctx *ctx,
 			   int rel_patch_chan,
 			   int temp_reg, bool is_patch_var)
 {
-	int r;
-
 	/* MUL temp.x, patch_stride (input_vals.x), rel_patch_id (r0.y (tcs)) */
 	/* ADD
 	   Dimension - patch0_offset (input_vals.z),
 	   Non-dim - patch0_data_offset (input_vals.w)
 	*/
-	r = single_alu_op3(ctx, ALU_OP3_MULADD_UINT24,
-			   temp_reg, 0,
-			   ctx->tess_output_info, 0,
-			   0, rel_patch_chan,
-			   ctx->tess_output_info, is_patch_var ? 3 : 2);
-	if (r)
-		return r;
-	return 0;
+	return single_alu_op3(ctx, ALU_OP3_MULADD_UINT24,
+			      temp_reg, 0,
+			      ctx->tess_output_info, 0,
+			      0, rel_patch_chan,
+			      ctx->tess_output_info, is_patch_var ? 3 : 2);
 }

 static inline int get_address_file_reg(struct r600_shader_ctx *ctx, int index)
@@ -839,16 +827,12 @@ static int vs_add_primid_output(struct r600_shader_ctx *ctx, int prim_id_sid)
 static int tgsi_barrier(struct r600_shader_ctx *ctx)
 {
 	struct r600_bytecode_alu alu;
-	int r;

 	memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 	alu.op = ctx->inst_info->op;
 	alu.last = 1;

-	r = r600_bytecode_add_alu(ctx->bc, &alu);
-	if (r)
-		return r;
-	return 0;
+	return r600_bytecode_add_alu(ctx->bc, &alu);
 }

 static int tgsi_declaration(struct r600_shader_ctx *ctx)
@@ -1793,10 +1777,7 @@ static int fetch_tes_input(struct r600_shader_ctx *ctx, struct tgsi_full_src_reg
 	if (r)
 		return r;

-	r = do_lds_fetch_values(ctx, temp_reg, dst_reg);
-	if (r)
-		return r;
-	return 0;
+	return do_lds_fetch_values(ctx, temp_reg, dst_reg);
 }

 static int fetch_tcs_input(struct r600_shader_ctx *ctx, struct tgsi_full_src_register *src, unsigned int dst_reg)
@@ -1819,10 +1800,7 @@ static int fetch_tcs_input(struct r600_shader_ctx *ctx, struct tgsi_full_src_reg
 	if (r)
 		return r;

-	r = do_lds_fetch_values(ctx, temp_reg, dst_reg);
-	if (r)
-		return r;
-	return 0;
+	return do_lds_fetch_values(ctx, temp_reg, dst_reg);
 }

 static int fetch_tcs_output(struct r600_shader_ctx *ctx, struct tgsi_full_src_register *src, unsigned int dst_reg)
@@ -1841,10 +1819,7 @@ static int fetch_tcs_output(struct r600_shader_ctx *ctx, struct tgsi_full_src_re
 	if (r)
 		return r;

-	r = do_lds_fetch_values(ctx, temp_reg, dst_reg);
-	if (r)
-		return r;
-	return 0;
+	return do_lds_fetch_values(ctx, temp_reg, dst_reg);
 }

 static int tgsi_split_lds_inputs(struct r600_shader_ctx *ctx)
@@ -4493,10 +4468,7 @@ static int tgsi_setup_trig(struct r600_shader_ctx *ctx)
 	}

 	alu.last = 1;
-	r = r600_bytecode_add_alu(ctx->bc, &alu);
-	if (r)
-		return r;
-	return 0;
+	return r600_bytecode_add_alu(ctx->bc, &alu);
 }

 static int cayman_trig(struct r600_shader_ctx *ctx)
@@ -6679,7 +6651,6 @@ static int r600_do_buffer_txq(struct r600_shader_ctx *ctx)
 {
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
-
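The pattern this patch removes can be checked in isolation: when a function's final action is to call something and propagate its result, "store, test, return r, else return 0" returns exactly what the callee returned. A minimal sketch, where `emit`, `wrapper_verbose`, `wrapper_direct` and `check_wrappers` are hypothetical names standing in for the r600 functions:

```c
#include <assert.h>

/* Hypothetical callee: returns 0 on success, a non-zero error code otherwise. */
static int emit(int fail_code)
{
	return fail_code;
}

/* Shape of the code before the patch. */
int wrapper_verbose(int fail_code)
{
	int r;

	r = emit(fail_code);
	if (r)
		return r;
	return 0;
}

/* Shape of the code after the patch. */
int wrapper_direct(int fail_code)
{
	return emit(fail_code);
}

/* Both wrappers agree for success and for error codes of either sign. */
void check_wrappers(void)
{
	assert(wrapper_verbose(0) == wrapper_direct(0));
	assert(wrapper_verbose(-22) == wrapper_direct(-22));
	assert(wrapper_verbose(7) == wrapper_direct(7));
}
```

The simplification only applies to the last call in a function; earlier calls still need the explicit `if (r) return r;` so that later steps are skipped on failure, which is why those checks stay in the diff above.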
[Mesa-dev] [PATCH 5/6] Get rid of tgsi_last_channel() wherever possible, rename lasti → last_chan
From: Hi-Angel <hi-an...@yandex.ru>

The diff might be confusing: the assignment of last_chan and the comparison
with last_chan are actually in different loops.

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 972e013aef..3616de572b 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -2673,7 +2673,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 {
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	const struct tgsi_full_dst_register *dst = &inst->Dst[0];
-	int chan_index, r, lasti;
+	int chan_index, r;
 	int temp_reg = r600_get_temp(ctx);
 	struct r600_bytecode_alu alu;
 	unsigned write_mask = dst->Register.WriteMask;
@@ -2692,8 +2692,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 		return r;

 	/* LDS write */
-	lasti = tgsi_last_channel(write_mask);
-	for (chan_index = 1; chan_index <= lasti; chan_index++) {
+	for (chan_index = 1; chan_index < TGSI_NUM_CHANNELS; chan_index++) {
 		if(!TGSI_IS_DST0_CHANNEL_ENABLED(inst, chan_index))
 			continue;
 		r = single_alu_op2(ctx, ALU_OP2_ADD_INT,
@@ -4277,10 +4276,11 @@ static int cayman_mul_int_instr(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	int chan_index, j, k, r;
 	struct r600_bytecode_alu alu;
-	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);
+	int last_chan;
 	int t1 = ctx->temp_reg;

 	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, k) {
+		last_chan = k;
 		TGSI_FOR_EACH_CHANNEL(chan_index) {
 			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 			alu.op = ctx->inst_info->op;
@@ -4305,7 +4305,7 @@ static int cayman_mul_int_instr(struct r600_shader_ctx *ctx)
 		alu.src[0].chan = chan_index;
 		tgsi_dst(ctx, &inst->Dst[0], chan_index, &alu.dst);
 		alu.dst.write = 1;
-		if (chan_index == lasti)
+		if (chan_index == last_chan)
 			alu.last = 1;
 		r = r600_bytecode_add_alu(ctx->bc, &alu);
 		if (r)
@@ -4321,7 +4321,7 @@ static int cayman_mul_double_instr(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	int chan_index, j, k, r;
 	struct r600_bytecode_alu alu;
-	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);
+	int last_chan;
 	int t1 = ctx->temp_reg;

 	/* t1 would get overwritten below if we actually tried to
@@ -4332,6 +4332,8 @@ static int cayman_mul_double_instr(struct r600_shader_ctx *ctx)
 	k = inst->Dst[0].Register.WriteMask == TGSI_WRITEMASK_XY ? 0 : 1;

 	TGSI_FOR_EACH_CHANNEL (chan_index) {
+		if (TGSI_IS_DST0_CHANNEL_ENABLED(inst, chan_index))
+			last_chan = chan_index;
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 		alu.op = ctx->inst_info->op;
 		for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
@@ -4354,7 +4356,7 @@ static int cayman_mul_double_instr(struct r600_shader_ctx *ctx)
 		alu.src[0].chan = chan_index;
 		tgsi_dst(ctx, &inst->Dst[0], chan_index, &alu.dst);
 		alu.dst.write = 1;
-		if (chan_index == lasti)
+		if (chan_index == last_chan)
 			alu.last = 1;
 		r = r600_bytecode_add_alu(ctx->bc, &alu);
 		if (r)
@@ -8795,10 +8797,11 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	struct r600_bytecode_alu alu;
 	int chan_index, j, k, r;
-	int lasti = tgsi_last_channel(inst->Dst[0].Register.WriteMask);
+	int last_chan;

 	/* src0 * src1 */
 	TGSI_FOR_EACH_DST0_ENABLED_CHANNEL(inst, chan_index) {
+		last_chan = chan_index;
 		if (ctx->bc->chip_class == CAYMAN) {
 			for (j = 0 ; j < 4; j++) {
 				memset(&alu, 0, sizeof(struct r600_bytecode_alu));
@@ -8846,7 +8849,7 @@ static int tgsi_umad(struct r600_shader_ctx *ctx)
 		alu.src[0].chan = chan_index;

 		r600_bytecode_src(&alu.src[1], &ctx->src[2], chan_index);
-		if (chan_index == last_chan) {
+		if (chan_index == last_chan) {
 			alu.last = 1;
 		}
 		r = r600_bytecode_add_alu(ctx->
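The commit message's remark about separate loops is what makes the inline tracking valid, and a toy model shows the difference. This is only an illustration under stated assumptions, not the driver code: the function names are hypothetical, `last_chan` is initialised to -1 here (the patch leaves it uninitialised), and "flagging" stands in for setting `alu.last`.

```c
#include <assert.h>

#define NUM_CHANNELS 4

/* Comparison inside the loop that assigns last_chan: the value was stored
 * a moment ago, so every enabled channel compares equal and gets flagged. */
int count_flagged_same_loop(unsigned write_mask)
{
	int chan, last_chan = -1, flagged = 0;

	for (chan = 0; chan < NUM_CHANNELS; chan++) {
		if (!(write_mask & (1u << chan)))
			continue;
		last_chan = chan;
		if (chan == last_chan)
			flagged++;
	}
	return flagged;
}

/* Comparison in a later loop: last_chan already holds its final value,
 * so at most one channel (the last enabled one) is flagged. */
int count_flagged_later_loop(unsigned write_mask)
{
	int chan, last_chan = -1, flagged = 0;

	assert(write_mask < (1u << NUM_CHANNELS));

	for (chan = 0; chan < NUM_CHANNELS; chan++) {
		if (write_mask & (1u << chan))
			last_chan = chan;
	}
	for (chan = 0; chan < NUM_CHANNELS; chan++) {
		if (chan == last_chan)
			flagged++;
	}
	return flagged;
}
```

With a write mask of 0x7 the first variant flags three channels and the second flags one. Tracking the last channel inline therefore matches the old up-front `tgsi_last_channel()` call only when the comparison runs after the tracking loop has completed.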
[Mesa-dev] [PATCH 2/6] Rename i→chan_index
From: Hi-Angel <hi-an...@yandex.ru>

I might have missed some more opportunities to rename, but oh well.

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/drivers/r600/r600_shader.c | 590 -
 1 file changed, 295 insertions(+), 295 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 46aa2c1abd..8562678d0c 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -2672,7 +2672,7 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 {
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	const struct tgsi_full_dst_register *dst = &inst->Dst[0];
-	int i, r, lasti;
+	int chan_index, r, lasti;
 	int temp_reg = r600_get_temp(ctx);
 	struct r600_bytecode_alu alu;
 	unsigned write_mask = dst->Register.WriteMask;
@@ -2692,36 +2692,36 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)

 	/* LDS write */
 	lasti = tgsi_last_instruction(write_mask);
-	for (i = 1; i <= lasti; i++) {
+	for (chan_index = 1; chan_index <= lasti; chan_index++) {

-		if (!(write_mask & (1 << i)))
+		if (!(write_mask & (1 << chan_index)))
 			continue;
 		r = single_alu_op2(ctx, ALU_OP2_ADD_INT,
-				   temp_reg, i,
+				   temp_reg, chan_index,
 				   temp_reg, 0,
-				   V_SQ_ALU_SRC_LITERAL, 4 * i);
+				   V_SQ_ALU_SRC_LITERAL, 4 * chan_index);
 		if (r)
 			return r;
 	}

-	for (i = 0; i <= lasti; i++) {
-		if (!(write_mask & (1 << i)))
+	for (chan_index = 0; chan_index <= lasti; chan_index++) {
+		if (!(write_mask & (1 << chan_index)))
 			continue;

-		if ((i == 0 && ((write_mask & 3) == 3)) ||
-		    (i == 2 && ((write_mask & 0xc) == 0xc))) {
+		if ((chan_index == 0 && ((write_mask & 3) == 3)) ||
+		    (chan_index == 2 && ((write_mask & 0xc) == 0xc))) {
 			memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 			alu.op = LDS_OP3_LDS_WRITE_REL;
 			alu.src[0].sel = temp_reg;
-			alu.src[0].chan = i;
+			alu.src[0].chan = chan_index;

 			alu.src[1].sel = dst->Register.Index;
 			alu.src[1].sel += ctx->file_offset[dst->Register.File];
-			alu.src[1].chan = i;
+			alu.src[1].chan = chan_index;

 			alu.src[2].sel = dst->Register.Index;
 			alu.src[2].sel += ctx->file_offset[dst->Register.File];
-			alu.src[2].chan = i + 1;
+			alu.src[2].chan = chan_index + 1;
 			alu.lds_idx = 1;
 			alu.dst.chan = 0;
 			alu.last = 1;
@@ -2729,17 +2729,17 @@ static int r600_store_tcs_output(struct r600_shader_ctx *ctx)
 			r = r600_bytecode_add_alu(ctx->bc, &alu);
 			if (r)
 				return r;
-			i += 1;
+			chan_index += 1;
 			continue;
 		}
 		memset(&alu, 0, sizeof(struct r600_bytecode_alu));
 		alu.op = LDS_OP2_LDS_WRITE;
 		alu.src[0].sel = temp_reg;
-		alu.src[0].chan = i;
+		alu.src[0].chan = chan_index;

 		alu.src[1].sel = dst->Register.Index;
 		alu.src[1].sel += ctx->file_offset[dst->Register.File];
-		alu.src[1].chan = i;
+		alu.src[1].chan = chan_index;

 		alu.src[2].sel = V_SQ_ALU_SRC_0;
 		alu.dst.chan = 0;
@@ -3747,7 +3747,7 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 	struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
 	unsigned write_mask = inst->Dst[0].Register.WriteMask;
 	struct r600_bytecode_alu alu;
-	int i, j, r, lasti = tgsi_last_instruction(write_mask);
+	int chan_index, j, r, lasti = tgsi_last_instruction(write_mask);
 	int use_tmp = 0;

 	if (singledest) {
@@ -3770,39 +3770,39 @@ static int tgsi_op2_64_params(struct r600_shader_ctx *ctx, bool singledest, bool
 	}

 	lasti = tgsi_last_instruction(write_mask);
-	for (i = 0; i <= lasti; i++) {
+	for (chan_index = 0; chan_index <= lasti; chan_index++) {

-		if (!(write_mask
[Mesa-dev] [PATCH 0/6] r600g: r600_shader.c small cleanups
Initially I was trying to implement for r600 an optimization like the one in
commit d633e23192ef17207f4a6acd3009da3126aab395 for radeonsi, but failed
because I need to learn more about GPU internals. That is for another time.
Anyway, it accidentally turned into a small cleanup of r600_shader.c; here it
is.

Hi-Angel (6):
  Get rid of trailing whitespace (trivial)
  Rename i→chan_index
  Replace bit-shifts and cycles with helpers from tgsi_exec.h
  Rename tgsi_last_instruction → tgsi_last_channel
  Get rid of tgsi_last_channel() wherever possible, rename lasti → last_chan
  Remove redudant comparisons

 src/gallium/drivers/r600/r600_shader.c | 815 ++---
 1 file changed, 333 insertions(+), 482 deletions(-)

--
2.11.1
[Mesa-dev] [PATCH] r600g/sb: Fix memory leak by reworking uses list (rebased)
The author is Heiko Przybyl (CC'ed); the patch is rebased on top of Bartosz Tomczyk's patch, per Dieter Nützel's comment.

Tested-by: Constantine Charlamov <hi-an...@yandex.ru>

--

When fixing the stalls on evergreen I introduced leaking of the useinfo
structure(s). Sorry. Instead of allocating a new object to hold 3 values
where only one is actually used, rework the list to just store the node
pointer. Thus no allocation or deallocation is needed. Since use_info and
use_kind aren't used anywhere, drop them and reduce code complexity. This
might also save a small number of cycles.

Thanks to Bartosz Tomczyk for finding the bug.

Reported-by: Bartosz Tomczyk
Signed-off-by: Heiko Przybyl
Supersedes: https://patchwork.freedesktop.org/patch/135852
---
 src/gallium/drivers/r600/sb/sb_def_use.cpp  | 29 +++--
 src/gallium/drivers/r600/sb/sb_gcm.cpp      | 16
 src/gallium/drivers/r600/sb/sb_ir.h         | 23 ++-
 src/gallium/drivers/r600/sb/sb_valtable.cpp | 21 +++--
 4 files changed, 28 insertions(+), 61 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_def_use.cpp b/src/gallium/drivers/r600/sb/sb_def_use.cpp
index a512d92086..68ab4ca26c 100644
--- a/src/gallium/drivers/r600/sb/sb_def_use.cpp
+++ b/src/gallium/drivers/r600/sb/sb_def_use.cpp
@@ -106,58 +106,51 @@ void def_use::process_defs(node *n, vvec &vv, bool arr_def) {
 }

 void def_use::process_uses(node* n) {
-	unsigned k = 0;
-
-	for (vvec::iterator I = n->src.begin(), E = n->src.end(); I != E;
-			++I, ++k) {
+	for (vvec::iterator I = n->src.begin(), E = n->src.end(); I != E; ++I) {
 		value *v = *I;
 		if (!v || v->is_readonly())
 			continue;

 		if (v->is_rel()) {
 			if (!v->rel->is_readonly())
-				v->rel->add_use(n, UK_SRC_REL, k);
+				v->rel->add_use(n);

-			unsigned k2 = 0;
 			for (vvec::iterator I = v->muse.begin(), E = v->muse.end();
-					I != E; ++I, ++k2) {
+					I != E; ++I) {
 				value *v = *I;
 				if (!v)
 					continue;

-				v->add_use(n, UK_MAYUSE, k2);
+				v->add_use(n);
 			}
 		} else
-			v->add_use(n, UK_SRC, k);
+			v->add_use(n);
 	}

-	k = 0;
-	for (vvec::iterator I = n->dst.begin(), E = n->dst.end(); I != E;
-			++I, ++k) {
+	for (vvec::iterator I = n->dst.begin(), E = n->dst.end(); I != E; ++I) {
 		value *v = *I;
 		if (!v || !v->is_rel())
 			continue;

 		if (!v->rel->is_readonly())
-			v->rel->add_use(n, UK_DST_REL, k);
-		unsigned k2 = 0;
+			v->rel->add_use(n);
 		for (vvec::iterator I = v->muse.begin(), E = v->muse.end();
-				I != E; ++I, ++k2) {
+				I != E; ++I) {
 			value *v = *I;
 			if (!v)
 				continue;

-			v->add_use(n, UK_MAYDEF, k2);
+			v->add_use(n);
 		}
 	}

 	if (n->pred)
-		n->pred->add_use(n, UK_PRED, 0);
+		n->pred->add_use(n);

 	if (n->type == NT_IF) {
 		if_node *i = static_cast<if_node*>(n);
 		if (i->cond)
-			i->cond->add_use(i, UK_COND, 0);
+			i->cond->add_use(i);
 	}
 }

diff --git a/src/gallium/drivers/r600/sb/sb_gcm.cpp b/src/gallium/drivers/r600/sb/sb_gcm.cpp
index 9c75389ada..7b43a32818 100644
--- a/src/gallium/drivers/r600/sb/sb_gcm.cpp
+++ b/src/gallium/drivers/r600/sb/sb_gcm.cpp
@@ -200,27 +200,27 @@ void gcm::td_release_val(value *v) {
 	);

 	for (uselist::iterator I = v->uses.begin(), E = v->uses.end(); I != E; ++I) {
-		use_info *u = *I;
-		if (u->op->parent != &pending) {
+		node *op = *I;
+		if (op->parent != &pending) {
 			continue;
 		}

 		GCM_DUMP(
 			sblog << "tdused in ";
-			dump::dump_op(u->op);
+			dump::dump_op(op);
 			sblog << "\n";
 		);

-		assert(uses[u->op] > 0);
-		if (--uses[u->op] == 0) {
+		assert(uses[op] > 0);
+		if (--uses[op] == 0) {
 			GCM_DUMP(
 				sblog << "tdreleased : ";
-				dump::dump_op(u->op);
+				dump::dump_op(op);
 				sblog << "\n";
 			);

-			pending.remove_node(u->op);
-			ready.push_back(u->op);
+			pending.remove_node(op);
+			ready.push_back(op);
 		}
 	}

diff --git a/src/gallium/drivers/r60
Re: [Mesa-dev] [PATCH] st/nine: make use of common uploaders v4
On 21.02.2017 23:28, Axel Davy wrote:
> This looks fine to me.
>
> Reviewed-by: Axel Davy <axel.d...@ens.fr>
>
> I think the patch requires your Signed-off-by though.
>
> Axel
>

v2: fixed formatting that was broken by the Thunderbird configuration
v3: per Axel's comment: added a comment to NineDevice9_DrawPrimitiveUP
v4: per Axel's comment: changed the style of the comment

Signed-off-by: Constantine Charlamov <hi-an...@yandex.ru>
---
 src/gallium/state_trackers/nine/device9.c    | 50 +---
 src/gallium/state_trackers/nine/device9.h    |  5 ---
 src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 +-
 4 files changed, 37 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..86c8e38535 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
     This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
     This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
     This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-    /* Implicit use of context pipe for vertex and index uploaded when
-     * csmt is not active. Does not need to sync since csmt is unactive,
-     * thus no need to call NineDevice9_GetPipe at each upload. */
-    if (!This->driver_caps.user_vbufs)
-        This->vertex_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_ibufs)
-        This->index_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            128 * 1024,
-            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_cbufs) {
+    if (!This->driver_caps.user_cbufs)
         This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-    }
-
-    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-
     This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
     This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
     This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
     nine_state_clear(>state, TRUE);
     nine_context_clear(This);
-    if (This->vertex_uploader)
-        u_upload_destroy(This->vertex_uploader);
-    if (This->index_uploader)
-        u_upload_destroy(This->index_uploader);
-    if (This->constbuf_uploader)
-        u_upload_destroy(This->constbuf_uploader);
-    if (This->vertex_sw_uploader)
-        u_upload_destroy(This->vertex_sw_uploader);
-    if (This->constbuf_sw_uploader)
-        u_upload_destroy(This->constbuf_sw_uploader);
-
     nine_bind(>record, NULL);
     pipe_sampler_view_reference(>dummy_sampler_view, NULL);
@@ -2852,15 +2818,17 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
     vtxbuf.buffer = NULL;
     vtxbuf.user_buffer = pVertexStreamZeroData;
+    /* csmt is unactive when user vertex or index buffers are used, thus no
+     * need to call NineDevice9_GetPipe. */
     if (!This->driver_caps.user_vbufs) {
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,
             (prim_count_to_vertex_count(PrimitiveType, PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
             4,
             vtxbuf.user_buffer,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
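The ownership change in this patch can be illustrated with a small model. This is only a sketch under stated assumptions, not gallium code: `uploader`, `context` and `device` are hypothetical stand-ins for the upload manager, `pipe_context` and `NineDevice9`, and `upload` and `upload_vertices` are made-up helpers. The only names taken from the patch are `stream_uploader` and `pipe`. It shows the device no longer creating or destroying uploaders of its own and instead borrowing the one owned by the context, with the 4-byte alignment used in the draw calls above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an upload manager: it appends data to one
// growing buffer and returns the offset at which the data was placed.
struct uploader {
    std::vector<unsigned char> storage;

    std::size_t upload(const void *data, std::size_t size, std::size_t alignment) {
        // Round the current end of the buffer up to the requested alignment.
        std::size_t offset = (storage.size() + alignment - 1) / alignment * alignment;
        storage.resize(offset + size);
        std::memcpy(storage.data() + offset, data, size);
        return offset;
    }
};

// Hypothetical stand-in for pipe_context: the context owns the uploader.
struct context {
    uploader stream_uploader;
};

// Hypothetical stand-in for NineDevice9 after the patch: it holds no
// uploader of its own, so it has nothing to create or destroy.
struct device {
    context *pipe;

    std::size_t upload_vertices(const void *data, std::size_t size) {
        return pipe->stream_uploader.upload(data, size, 4);
    }
};
```

In this model the constructor and destructor bookkeeping that the patch deletes simply has no counterpart, which is the simplification the patch is after.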
[Mesa-dev] [PATCH] st/nine: make use of common uploaders v4
Make use of the common uploaders that recently landed in Mesa

v2: fixed formatting that was broken by the Thunderbird configuration
v3: per Axel's comment: added a comment to NineDevice9_DrawPrimitiveUP
v4: per Axel's comment: changed the style of the comment
---
 src/gallium/state_trackers/nine/device9.c    | 50 +---
 src/gallium/state_trackers/nine/device9.h    |  5 ---
 src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 +-
 4 files changed, 37 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..86c8e38535 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
     This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
     This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
     This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-    /* Implicit use of context pipe for vertex and index uploaded when
-     * csmt is not active. Does not need to sync since csmt is unactive,
-     * thus no need to call NineDevice9_GetPipe at each upload. */
-    if (!This->driver_caps.user_vbufs)
-        This->vertex_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_ibufs)
-        This->index_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            128 * 1024,
-            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_cbufs) {
+    if (!This->driver_caps.user_cbufs)
         This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-    }
-
-    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-
     This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
     This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
     This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
     nine_state_clear(>state, TRUE);
     nine_context_clear(This);
-    if (This->vertex_uploader)
-        u_upload_destroy(This->vertex_uploader);
-    if (This->index_uploader)
-        u_upload_destroy(This->index_uploader);
-    if (This->constbuf_uploader)
-        u_upload_destroy(This->constbuf_uploader);
-    if (This->vertex_sw_uploader)
-        u_upload_destroy(This->vertex_sw_uploader);
-    if (This->constbuf_sw_uploader)
-        u_upload_destroy(This->constbuf_sw_uploader);
-
     nine_bind(>record, NULL);
     pipe_sampler_view_reference(>dummy_sampler_view, NULL);
@@ -2852,15 +2818,17 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
     vtxbuf.buffer = NULL;
     vtxbuf.user_buffer = pVertexStreamZeroData;
+    /* csmt is unactive when user vertex or index buffers are used, thus no
+     * need to call NineDevice9_GetPipe. */
     if (!This->driver_caps.user_vbufs) {
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,
             (prim_count_to_vertex_count(PrimitiveType, PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
             4,
             vtxbuf.user_buffer,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
         vtxbuf.user_buffer = NULL;
     }
@@ -2916,27 +2884,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 *This,
     if (!This->driver_caps.user_vbufs) {
         const unsigned base = MinVertexIndex * VertexStreamZeroStride;
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             base,
Re: [Mesa-dev] [PATCH] st/nine: make use of common uploaders v2
Version 3 sent. Sorry, I hadn't realized that I ought to add you to CC.

On 20.02.2017 22:49, Axel Davy wrote:
> On 20/02/2017 20:22, Constantine Charlamov wrote:
>> Make use of common uploaders that landed recently to Mesa
>>
>> v2: fixed formatting, broken due to thunderbird configuration
>>
>> ---
>>  src/gallium/state_trackers/nine/device9.c    | 48
>>  src/gallium/state_trackers/nine/device9.h    |  5 ---
>>  src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
>>  src/gallium/state_trackers/nine/nine_state.c | 48 ++--
>>  4 files changed, 35 insertions(+), 74 deletions(-)
>>
>> diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
>> index b9b7a637d7..2ae8678c31 100644
>> --- a/src/gallium/state_trackers/nine/device9.c
>> +++ b/src/gallium/state_trackers/nine/device9.c
>> @@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
>>      This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
>>      This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
>>      This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
>> -
>> -    /* Implicit use of context pipe for vertex and index uploaded when
>> -     * csmt is not active. Does not need to sync since csmt is unactive,
>> -     * thus no need to call NineDevice9_GetPipe at each upload. */
> I'd like to have this comment kept somehow (though the use of context pipe is
> not implicit anymore).
>
> I guess it should be in NineDevice9_DrawPrimitiveUP just before if
> (!This->driver_caps.user_vbufs).
>
> It could be: csmt is unactive when user vertex or index buffers are used,
> thus no need to call NineDevice9_GetPipe.
>
> Axel
>> -    if (!This->driver_caps.user_vbufs)
>> -        This->vertex_uploader = u_upload_create(This->csmt_active ?
>> -            This->pipe_secondary : This->context.pipe,
>> -            65536,
>> -            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
>> -    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
>> -            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
>> -    if (!This->driver_caps.user_ibufs)
>> -        This->index_uploader = u_upload_create(This->csmt_active ?
>> -            This->pipe_secondary : This->context.pipe,
>> -            128 * 1024,
>> -            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
>> -    if (!This->driver_caps.user_cbufs) {
>> +    if (!This->driver_caps.user_cbufs)
>>          This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
>> -        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
>> -            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
>> -    }
>> -
>> -    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
>> -            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
>> -
>>      This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
>>      This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
>>      This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
>> @@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
>>      nine_state_clear(>state, TRUE);
>>      nine_context_clear(This);
>>
>> -    if (This->vertex_uploader)
>> -        u_upload_destroy(This->vertex_uploader);
>> -    if (This->index_uploader)
>> -        u_upload_destroy(This->index_uploader);
>> -    if (This->constbuf_uploader)
>> -        u_upload_destroy(This->constbuf_uploader);
>> -    if (This->vertex_sw_uploader)
>> -        u_upload_destroy(This->vertex_sw_uploader);
>> -    if (This->constbuf_sw_uploader)
>> -        u_upload_destroy(This->constbuf_sw_up
[Mesa-dev] [PATCH] st/nine: make use of common uploaders v3
Make use of the common uploaders that recently landed in Mesa

v2: fixed formatting that was broken by the Thunderbird configuration
v3: per Axel's comment: added a comment to NineDevice9_DrawPrimitiveUP
---
 src/gallium/state_trackers/nine/device9.c    | 50 +---
 src/gallium/state_trackers/nine/device9.h    |  5 ---
 src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 +-
 4 files changed, 37 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..86c8e38535 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
     This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
     This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
     This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-    /* Implicit use of context pipe for vertex and index uploaded when
-     * csmt is not active. Does not need to sync since csmt is unactive,
-     * thus no need to call NineDevice9_GetPipe at each upload. */
-    if (!This->driver_caps.user_vbufs)
-        This->vertex_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_ibufs)
-        This->index_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            128 * 1024,
-            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_cbufs) {
+    if (!This->driver_caps.user_cbufs)
         This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-    }
-
-    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-
     This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
     This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
     This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
     nine_state_clear(>state, TRUE);
     nine_context_clear(This);
-    if (This->vertex_uploader)
-        u_upload_destroy(This->vertex_uploader);
-    if (This->index_uploader)
-        u_upload_destroy(This->index_uploader);
-    if (This->constbuf_uploader)
-        u_upload_destroy(This->constbuf_uploader);
-    if (This->vertex_sw_uploader)
-        u_upload_destroy(This->vertex_sw_uploader);
-    if (This->constbuf_sw_uploader)
-        u_upload_destroy(This->constbuf_sw_uploader);
-
     nine_bind(>record, NULL);
     pipe_sampler_view_reference(>dummy_sampler_view, NULL);
@@ -2852,15 +2818,17 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
     vtxbuf.buffer = NULL;
     vtxbuf.user_buffer = pVertexStreamZeroData;
+    // csmt is unactive when user vertex or index buffers are used, thus no
+    // need to call NineDevice9_GetPipe.
     if (!This->driver_caps.user_vbufs) {
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,
             (prim_count_to_vertex_count(PrimitiveType, PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
             4,
             vtxbuf.user_buffer,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
         vtxbuf.user_buffer = NULL;
     }
@@ -2916,27 +2884,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 *This,
     if (!This->driver_caps.user_vbufs) {
         const unsigned base = MinVertexIndex * VertexStreamZeroStride;
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             base,
             NumVertices * VertexStreamZeroStride,
[Mesa-dev] [PATCH] st/nine: make use of common uploaders v2
Make use of the common uploaders that recently landed in Mesa

v2: fixed formatting that was broken by the Thunderbird configuration
---
 src/gallium/state_trackers/nine/device9.c    | 48
 src/gallium/state_trackers/nine/device9.h    |  5 ---
 src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 ++--
 4 files changed, 35 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..2ae8678c31 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
     This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
     This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
     This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-    /* Implicit use of context pipe for vertex and index uploaded when
-     * csmt is not active. Does not need to sync since csmt is unactive,
-     * thus no need to call NineDevice9_GetPipe at each upload. */
-    if (!This->driver_caps.user_vbufs)
-        This->vertex_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_ibufs)
-        This->index_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            128 * 1024,
-            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_cbufs) {
+    if (!This->driver_caps.user_cbufs)
         This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-    }
-
-    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-
     This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
     This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
     This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
     nine_state_clear(>state, TRUE);
     nine_context_clear(This);
-    if (This->vertex_uploader)
-        u_upload_destroy(This->vertex_uploader);
-    if (This->index_uploader)
-        u_upload_destroy(This->index_uploader);
-    if (This->constbuf_uploader)
-        u_upload_destroy(This->constbuf_uploader);
-    if (This->vertex_sw_uploader)
-        u_upload_destroy(This->vertex_sw_uploader);
-    if (This->constbuf_sw_uploader)
-        u_upload_destroy(This->constbuf_sw_uploader);
-
     nine_bind(>record, NULL);
     pipe_sampler_view_reference(>dummy_sampler_view, NULL);
@@ -2853,14 +2819,14 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
     vtxbuf.user_buffer = pVertexStreamZeroData;
     if (!This->driver_caps.user_vbufs) {
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,
             (prim_count_to_vertex_count(PrimitiveType, PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
             4,
             vtxbuf.user_buffer,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
         vtxbuf.user_buffer = NULL;
     }
@@ -2916,27 +2882,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 *This,
     if (!This->driver_caps.user_vbufs) {
         const unsigned base = MinVertexIndex * VertexStreamZeroStride;
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             base,
             NumVertices * VertexStreamZeroStride, /* XXX */
             4,
             (const uint8_t *)vbuf.user_buffer + base,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+
[Mesa-dev] [PATCH] st/nine: make use of common uploaders
Make use of the common uploaders that recently landed in Mesa
---
 src/gallium/state_trackers/nine/device9.c    | 48
 src/gallium/state_trackers/nine/device9.h    |  5 ---
 src/gallium/state_trackers/nine/nine_ff.c    |  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 ++--
 4 files changed, 35 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..2ae8678c31 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
     This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
     This->driver_caps.user_sw_vbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
     This->driver_caps.user_sw_cbufs = This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-    /* Implicit use of context pipe for vertex and index uploaded when
-     * csmt is not active. Does not need to sync since csmt is unactive,
-     * thus no need to call NineDevice9_GetPipe at each upload. */
-    if (!This->driver_caps.user_vbufs)
-        This->vertex_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-            PIPE_BIND_VERTEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_ibufs)
-        This->index_uploader = u_upload_create(This->csmt_active ?
-            This->pipe_secondary : This->context.pipe,
-            128 * 1024,
-            PIPE_BIND_INDEX_BUFFER, PIPE_USAGE_STREAM);
-    if (!This->driver_caps.user_cbufs) {
+    if (!This->driver_caps.user_cbufs)
         This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-        This->constbuf_uploader = u_upload_create(This->context.pipe, This->vs_const_size,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-    }
-
-    This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
-            PIPE_BIND_CONSTANT_BUFFER, PIPE_USAGE_STREAM);
-
     This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
     This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
     This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
     nine_state_clear(>state, TRUE);
     nine_context_clear(This);
-    if (This->vertex_uploader)
-        u_upload_destroy(This->vertex_uploader);
-    if (This->index_uploader)
-        u_upload_destroy(This->index_uploader);
-    if (This->constbuf_uploader)
-        u_upload_destroy(This->constbuf_uploader);
-    if (This->vertex_sw_uploader)
-        u_upload_destroy(This->vertex_sw_uploader);
-    if (This->constbuf_sw_uploader)
-        u_upload_destroy(This->constbuf_sw_uploader);
-
     nine_bind(>record, NULL);
     pipe_sampler_view_reference(>dummy_sampler_view, NULL);
@@ -2853,14 +2819,14 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
     vtxbuf.user_buffer = pVertexStreamZeroData;
     if (!This->driver_caps.user_vbufs) {
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,
             (prim_count_to_vertex_count(PrimitiveType, PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
             4,
             vtxbuf.user_buffer,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
         vtxbuf.user_buffer = NULL;
     }
@@ -2916,27 +2882,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 *This,
     if (!This->driver_caps.user_vbufs) {
         const unsigned base = MinVertexIndex * VertexStreamZeroStride;
-        u_upload_data(This->vertex_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             base,
             NumVertices * VertexStreamZeroStride, /* XXX */
             4,
             (const uint8_t *)vbuf.user_buffer + base,
             _offset, );
-        u_upload_unmap(This->vertex_uploader);
+        u_upload_unmap(This->context.pipe->stream_uploader);
         /* Won't be used: */
         vbuf.buffer_offset -= base;
         vbuf.user_buffer = NULL;
     }
     if (!This->driver_caps.user_ibufs) {
-        u_upload_data(This->index_uploader,
+        u_upload_data(This->context.pipe->stream_uploader,
             0,