Hello,

This patch series continues adding ARB_gpu_shader_fp64 support to the
Intel driver. Specifically, it targets the i965 scalar backend on BDW+
hardware (vec4 is still under research, and gen7 has its own issues,
which we intend to tackle after gen8).

This adds most of the fp64 scalar implementation. It starts by enabling
the various lowering passes in NIR for doubles and then adds all the
infrastructure required in the backend to operate on 64-bit
floating-point data.

For reference, this series fixes 1009 fp64 piglit tests on BDW. The fp64
totals look like this:

  pass:  2523
  fail:    46
  crash:  447
  skip:    16
  total: 3032

A few things are still missing from this series to achieve a perfect
fp64 pass rate:

1. Fixes to copy propagation. The fp64 code creates new code patterns
   that copy propagation isn't really ready to handle yet, leading to
   incorrect results in some cases. We have 9 patches to fix copy
   propagation for fp64 that we intend to send separately after the
   main fp64 infrastructure has been reviewed.

2. ubo/ssbo/shared-variables. We will also send the patches for this in
   a separate series after this one.

3. A fix for the SIMD lowering pass to properly handle execmasking when
   transposing the results of split instructions back together. We have
   a local fix for this, but Curro hit the same problem while working
   on SIMD32 and has a better solution for it, so we intend to use his
   solution when it is ready.

4. Spilling. We don't support spilling of DF registers yet and some
   piglit tests need this to compile. Jason had plans to work on the
   spilling code and address the needs of fp64 along the way.

The series does not introduce any regressions in piglit on ILK, SNB,
HSW, BDW and SKL.
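
For readers less familiar with the pack/unpack double lowering mentioned
above, here is a rough sketch of the operation it deals with: viewing one
64-bit double as two 32-bit halves and back. This is not code from the
series. The helper names are hypothetical, and it assumes IEEE 754
binary64 doubles with the low 32 bits stored in the first component, as
GLSL's unpackDouble2x32() defines it.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE 754 binary64, i.e. exactly 64 bits wide. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical helper: split a double into two 32-bit halves.
 * *lo receives the 32 least significant bits, *hi the 32 most
 * significant bits (sign, exponent and the top of the mantissa).
 */
static void
unpack_double_2x32(double d, uint32_t *lo, uint32_t *hi)
{
   uint64_t bits;

   /* memcpy is the portable way to reinterpret the bit pattern. */
   memcpy(&bits, &d, sizeof(bits));
   *lo = (uint32_t)(bits & 0xffffffffu);
   *hi = (uint32_t)(bits >> 32);
}

/* Hypothetical helper: rebuild a double from its two 32-bit halves. */
static double
pack_double_2x32(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double d;

   memcpy(&d, &bits, sizeof(d));
   return d;
}
```

For example, unpack_double_2x32(1.0, &lo, &hi) gives lo = 0x00000000 and
hi = 0x3ff00000, and packing those two values returns 1.0 again.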

A branch with this series is available for testing here:

$ git clone -b i965-fp64-scalar-backend-part-1 https://github.com/Igalia/mesa.git

You will have to enable the extension with:

$ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64

The full scalar fp64 implementation, containing also the fixes to
copy-propagation as well as ubo/ssbo and our local fix for the SIMD
lowering pass, is available here:

$ git clone -b i965-fp64 https://github.com/Igalia/mesa.git

And for the adventurous, there is also a work-in-progress branch that
adds scalar support for HSW here:

$ git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git

Thanks,

Sam

Connor Abbott (33):
  i965: use double lowering pass
  i965: use pack/unpackDouble lowering
  i965/disasm: fix disasm of 3-src doubles
  i965/eu: allow doubles in math instructions
  i965: add brw_imm_df
  i965: add support for getting/setting DF immediates
  i965: add support for disassembling DF immediates
  i965/eu: add support for DF immediates
  i965: fix brw_negate_immediate() for doubles
  i965: fix is_zero(), is_one() and is_negative_one() for doubles
  i965: fixup uniform setup for doubles
  i965/fs: print writemask_all when it's enabled
  i965/fs: use the NIR bit size when creating registers
  i965/fs: don't propagate 64-bit immediates
  i965/fs: add support for printing double immediates
  i965/fs: always pass the bitsize to brw_type_for_nir_type()
  i965/fs: add a stride helper
  i965/fs: add PACK opcode
  i965/fs: add a pass for lowering PACK opcodes
  i965/fs/nir: translate double pack/unpack
  i965/fs: fix type_size() for doubles
  i965/fs: handle uniforms in byte_offset()
  i965/fs: use byte_offset() in offset() for uniforms
  i965/fs: fix assign_constant_locations() for doubles
  i965/fs: generalize SIMD16 interference workaround
  i965/fs: extend exec_size halving in the generator
  i965/fs: fix compares for doubles
  i965/fs: fix regs_read() for uniforms
  i965/fs: fix is_copy_payload() for doubles
  i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
  i965/fs: fix dst width calculation in CSE
  i965/fs: add a pass for legalizing d2f
  i965/fs: add support for f2d and d2f

Iago Toral Quiroga (15):
  i965: fix brw_saturate_immediate() for doubles
  i965: fix brw_abs_immediate() for doubles
  i965: two-argument instructions can only use 32-bit immediates
  i965/fs: optimize pack double
  i965/fs: optimize unpack double
  i965/fs: handle fp64 opcodes in brw_do_channel_expressions
  i965/fs: We only support 32-bit integer ALU operations for now
  i965/fs: add null_reg_df
  i965/fs: implement fsign() for doubles
  i965/fs: implement d2b
  i965/fs: implement d2i and d2u
  i965/fs: implement i2d and u2d
  i965/fs: rename our lower_d2f pass to lower_d2x
  i965/fs/lower_simd_width: Fix registers written for split instructions
  i965/fs: recognize writes with a subreg_offset > 0 as partial

Samuel Iglesias Gonsálvez (7):
  i965: enable lrp lowering for doubles
  vc4: lower lrp when operating with double operands
  freedreno/ir3: lower lrp when operating with double operands
  i965/fs: align access to double-based uniforms in push constant buffer
  i965/fs: demote_pull_constants() did not take into account double types
  i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
  i965/fs: fix MOV_INDIRECT exec_size for doubles

Topi Pohjolainen (4):
  i965: Lower DFRACEXP/DLDEXP
  i965: Determine size of double precision float register
  i965: Tell backend register about double precision type
  i965/eu: Allow 3-src float ops with doubles

 src/gallium/drivers/freedreno/ir3/ir3_nir.c       |   1 +
 src/gallium/drivers/vc4/vc4_program.c             |   1 +
 src/mesa/drivers/dri/i965/Makefile.sources        |   2 +
 src/mesa/drivers/dri/i965/brw_compiler.c          |   2 +
 src/mesa/drivers/dri/i965/brw_compiler.h          |   8 +
 src/mesa/drivers/dri/i965/brw_defines.h           |   9 +
 src/mesa/drivers/dri/i965/brw_disasm.c            |   3 +-
 src/mesa/drivers/dri/i965/brw_eu_emit.c           |  60 +++--
 src/mesa/drivers/dri/i965/brw_fs.cpp              | 106 ++++++--
 src/mesa/drivers/dri/i965/brw_fs.h                |   6 +-
 src/mesa/drivers/dri/i965/brw_fs_builder.h        |  15 +-
 .../dri/i965/brw_fs_channel_expressions.cpp       |  23 +-
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp  |   3 +
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp          |   3 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp    |  16 +-
 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp    |  75 ++++++
 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp   |  59 +++++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 287 ++++++++++++++++++---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp |  67 +++--
 src/mesa/drivers/dri/i965/brw_inst.h              |  25 ++
 src/mesa/drivers/dri/i965/brw_ir_fs.h             |  14 +-
 src/mesa/drivers/dri/i965/brw_link.cpp            |   1 +
 src/mesa/drivers/dri/i965/brw_nir.c               |  10 +
 src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp    |   7 +-
 src/mesa/drivers/dri/i965/brw_program.c           |   1 +
 src/mesa/drivers/dri/i965/brw_reg.h               |  10 +
 src/mesa/drivers/dri/i965/brw_shader.cpp          |  73 ++++--
 src/mesa/drivers/dri/i965/brw_shader.h            |   1 +
 src/mesa/drivers/dri/i965/brw_wm.c                |   2 +
 src/mesa/drivers/dri/i965/gen6_constant_state.c   |  12 +-
 30 files changed, 773 insertions(+), 129 deletions(-)
 create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp
 create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp

-- 
2.5.0