URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2e75d71c1faa737ef3290ff1e9cb4851762fa381
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Wed Nov 15 10:48:02 2023 -0800

    intel/cmat: Generate better code for nir_intrinsic_cmat_insert
    
    When the source destination index is a constant, we can avoid generating
    a lot of the intermediate code. At the very least, this makes initial
    NIR dumps much easier to read.
    
    v2: Simplify tracking of dst_index. Suggested by Caio.
    
    Suggested-by: Caio
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
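
The pass itself is not shown in this log, so the following is only an illustration of why a constant index can shorten the generated code. It models a matrix slice as a plain list. The function names `insert_dynamic` and `insert_constant` are hypothetical and do not exist in Mesa.

```python
def insert_dynamic(slice_elems, index, value):
    """Index unknown at compile time: one compare-and-select per component."""
    return [value if i == index else old for i, old in enumerate(slice_elems)]


def insert_constant(slice_elems, const_index, value):
    """Index known at compile time: only the addressed component changes."""
    out = list(slice_elems)
    out[const_index] = value
    return out


slice_elems = [10, 20, 30, 40]
# Both forms produce the same slice; the constant form needs no comparisons.
assert insert_dynamic(slice_elems, 2, 99) == insert_constant(slice_elems, 2, 99)
```

Under this model the dynamic form emits work proportional to the slice length, while the constant form touches a single component, which is consistent with the shorter NIR dumps the commit describes.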

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=c6d44284aa633569a58200d00015b3e6d80a465a
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Wed Aug 2 13:36:33 2023 -0700

    intel/dev: Enable VK_KHR_cooperative_matrix on all Gfx9+ GPUs
    
    Gfx12.5 (DG2) will use DPAS instructions to accelerate the
    implementation. Earlier platforms will use equivalent discrete
    instructions (basically subgroup operations). Gfx12 (Tigerlake) will use
    DP4A for 8-bit integer matrix multiplication. Older platforms, which
    lack DP4A, will use a suboptimal instruction sequence. There is plenty
    of room for improvement here.
    
    DG2 (Gfx12.5) gets the following results from the CTS:
    
    Test run totals:
      Passed:        1642/13982 (11.7%)
      Failed:        0/13982 (0.0%)
      Not supported: 12340/13982 (88.3%)
      Warnings:      0/13982 (0.0%)
      Waived:        0/13982 (0.0%)
    
    On DG2 (Gfx12.5) with forced lowering, Raptor Lake (Gfx12) and Ice Lake
    (Gfx11):
    
    Test run totals:
      Passed:        1662/13982 (11.9%)
      Failed:        0/13982 (0.0%)
      Not supported: 12320/13982 (88.1%)
      Warnings:      0/13982 (0.0%)
      Waived:        0/13982 (0.0%)
    
    The difference in the number of tests run is due to
    saturatingAccumulation not being set on DG2 when DPAS is used. There is
    a comment in "intel/dev: Advertise integer configs with
    saturatingAccumulation too" that explains how this could be added should
    the need arise.
    
    v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
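
For readers unfamiliar with DP4A, which the commit above names for 8-bit integer matrix multiplication on Gfx12: it computes a four-way dot product of packed 8-bit values and adds the result to an accumulator. The sketch below is a minimal model of that arithmetic only. It assumes signed lanes and does not model 32-bit wraparound. All function names are hypothetical, and `dot_without_dp4a` stands in loosely for the per-lane sequence older platforms would need, not for the actual instruction sequence Mesa emits.

```python
def to_int8(byte):
    """Interpret a value in 0..255 as a signed two's-complement 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word, lowest lane first."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack_int8x4(word):
    """Split a 32-bit word into four signed 8-bit lanes, lowest lane first."""
    return [to_int8((word >> (8 * i)) & 0xFF) for i in range(4)]


def dp4a(acc, a_word, b_word):
    """Model of DP4A: accumulate the dot product of two packed 4x8-bit vectors."""
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))


def dot_without_dp4a(acc, a_word, b_word):
    """Same result built from separate per-lane extract, multiply and add steps."""
    for i in range(4):
        a_lane = to_int8((a_word >> (8 * i)) & 0xFF)
        b_lane = to_int8((b_word >> (8 * i)) & 0xFF)
        acc += a_lane * b_lane
    return acc


a = pack_int8x4([1, -2, 3, -4])
b = pack_int8x4([5, 6, -7, 8])
assert dp4a(10, a, b) == dot_without_dp4a(10, a, b)
```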

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8ea032b78ee3257fd9398db8b79cdf9ca5ff4a36
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Fri Oct 20 18:24:25 2023 -0700

    intel/dev: Advertise integer configs with saturatingAccumulation too
    
    VUID-RuntimeSpirv-saturatingAccumulation-08983 says:
    
       For OpCooperativeMatrixMulAddKHR, the SaturatingAccumulation
       cooperative matrix operand must be present if and only if
       VkCooperativeMatrixPropertiesKHR::saturatingAccumulation is VK_TRUE.
    
    As a result, we have to advertise integer configs both with and without
    this flag set.
    
    v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
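
The "both with and without" rule above can be pictured as doubling every integer entry in the advertised list. This is a rough sketch under stated assumptions: the `CmatConfig` class, its field names, the type strings and the matrix shapes are all made up for illustration and are not the structures or values used in `intel/dev`.

```python
from dataclasses import dataclass, replace

# Hypothetical type names used to decide which configs count as integer.
INTEGER_TYPES = {"sint8", "uint8"}


@dataclass(frozen=True)
class CmatConfig:
    m: int
    n: int
    k: int
    a_type: str
    result_type: str
    saturating_accumulation: bool = False


def advertised_configs(base_configs):
    """List each config once, plus a saturating copy of every integer config."""
    out = []
    for cfg in base_configs:
        out.append(cfg)
        if cfg.a_type in INTEGER_TYPES:
            out.append(replace(cfg, saturating_accumulation=True))
    return out


base = [
    CmatConfig(8, 8, 16, "float16", "float32"),  # illustrative shape only
    CmatConfig(8, 8, 32, "sint8", "sint32"),     # illustrative shape only
]
configs = advertised_configs(base)
```

A shader that passes the SaturatingAccumulation operand then matches the copy with the flag set, and one that omits the operand matches the copy without it.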

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f952dd510e4e83639f77259baaa61ff25c918305
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Tue Aug 1 10:38:14 2023 -0700

    anv: Select the SIMD mode very early when cooperative matrices are used
    
    The commit is a little ugly. The definition of anv_fixup_subgroup_size
    is moved before the added call site. In addition, the bit starting at
    the "Cooperative matrix extension requires..." comment is added.
    
    v2: Dramatic simplification of SIMD selection. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=511f91e307c98326185ec69570b0c6eee2c36cab
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Tue Aug 8 09:32:40 2023 -0700

    anv: Lower indirect derefs again after lowering cooperative matrices
    
    The cooperative matrix lowering can generate a lot of indirect array
    accesses, and these need to be eliminated.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b741a9a851ca3747aa92ce0d6611b488c6e0e07b
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Sep 25 09:16:55 2023 -0700

    anv: Set PIPELINE_SELECT systolic mode enable flag
    
    Set the flag on compute shaders when the application has enabled the
    cooperative matrix feature. We might still want to enable this only when
    DPAS is actually used. The current method is based on many suggestions
    from Lionel.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7bfbeb79a75a04c3a7baa0e230a5bd4efa0976c4
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Fri Sep 22 16:17:18 2023 -0700

    anv: Set COMPUTE_WALKER systolic mode enable flag
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwer...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=67739b02de08e97128673f05bf1a525047873d3e
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Oct 30 11:06:24 2023 -0700

    anv: Add anv_physical_device::has_cooperative_matrix
    
    This flag tracks whether or not cooperative matrices are fully enabled
    on the physical device (i.e., both the configs exist and the environment
    variable is set).  This is mainly to support a later commit "anv: Set
    PIPELINE_SELECT systolic mode enable flag."
    
    This could be squashed into "anv: Implement VK_KHR_cooperative_matrix."
    I left it separate because we might go back to the previous method.
    
    v3: Don't hide the extension behind an environment variable
    (ANV_COOPERATIVE_MATRIX) now that we have a better solution for setting
    PIPELINE_SELECT.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0a6f8b40bfdf39faaf1ff7def741faf612cf5706
Author: Caio Oliveira <caio.olive...@intel.com>
Date:   Tue Jun 13 19:48:16 2023 -0700

    anv: Implement VK_KHR_cooperative_matrix
    
    v2: Rebase on moving lowering pass to src/intel/compiler.
    
    v3: Don't hide the extension behind an environment variable
    (ANV_COOPERATIVE_MATRIX) now that we have a better solution for setting
    PIPELINE_SELECT.
    
    v4: Prefix type names with INTEL_CMAT_. Suggested by Lionel. Also rebase
    on f99e43d606e ("anv: switch to use runtime physical device properties
    infrastructure").
    
    Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ff16458478eec50b04190f58802dde5d4d3e99d7
Author: Caio Oliveira <caio.olive...@intel.com>
Date:   Fri Jun 16 16:47:45 2023 -0700

    intel/dev: Add cooperative matrix configuration information
    
    v2: Prefix type names with INTEL_CMAT_. Suggested by Lionel.
    
    Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6b14da33ad3aa8a30ed5e479eace8bc6470095a7
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Oct 9 13:54:38 2023 -0700

    intel/fs: nir: Add nir_intrinsic_dpas_intel
    
    v2: Fix parameter order in nir_intrinsic_dpas_intel to DPAS conversion.
    
    v3: Fix float16 destination DPAS on DG2.
    
    v4: Use nir_component_mask(...) instead of 0xffff. Suggested by Caio.
    
    v5: Rebase on !26323.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3756f605586fb2dcf53d892606152ecc5ce1ad1d
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Tue Oct 10 15:35:46 2023 -0700

    intel/fs: DPAS lowering
    
    Implements integer dot product lowering both with and without
    DP4A. Implements half-float dot product lowering.
    
    There are a couple FINISHME comments describing future optimizations.
    
    v2: Add a brw_compiler::lower_dpas flag to track when the lowering
    should be applied.
    
    v3: Use is_null() instead of checking file != ARF. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3cb96255397747ecef3f824064ca0afba349c50d
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Oct 16 14:22:51 2023 -0700

    intel/fs: Fix scoreboarding for DPAS
    
    v2: Remove all mention of DPASW. Suggested by Curro and Caio.
    
    Reviewed-by: Francisco Jerez <curroje...@riseup.net>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=eb1f19d7bf194574b984033754a301d1407f24d5
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Sep 25 17:40:01 2023 -0700

    intel/compiler: Validation for DPAS instructions
    
    v2: s/regiser/register/g in messages. Noticed by Caio. Add more context
    to the sub-byte precision error message. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1c92dad5cb7f5d46dfaf56d2f9ce0203c2fbefbe
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Oct 9 16:31:41 2023 -0700

    intel/disasm: Disassembly support for DPAS
    
    v2: Fix regioning in src[012]_dpas_3src. Noticed by Caio. Treat DPAS as
    unordered. Suggested by Curro.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e666872c751bedd1e4c2e1231644c14ed18639e7
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Wed Sep 20 12:42:24 2023 -0700

    intel/compiler: Initial bits for DPAS instruction
    
    v2: Add brw_ir_performance.cpp and brw_fs_generator.cpp changes. Fix
    overlapping register allocation (via has_source_and_destination_hazard). Fix
    incorrect destination register file encoding.
    
    v3: Prevent lower_regioning from trying to "fix" DPAS sources.
    
    v4: Add instruction latency information for scheduling and perf
    estimates.
    
    v5: Remove all mention of DPASW. Suggested by Curro and Caio. Update
    the comment in fs_inst::has_source_and_destination_hazard. Suggested
    by Caio.
    
    v6: Add some comments near the src2 calculation in
    fs_inst::size_read. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3a35f8b29bb9b6a92f98e8bb897bd444a54ca255
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Tue Oct 3 11:25:36 2023 -0700

    intel/cmat: Lower cmat_load and cmat_store
    
    v2: Add support for non-constant stride.
    
    v3: Explain B matrices (a little bit) in
    get_slice_type_from_desc. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=502be565da052e91adfa596945d5d55f7565a203
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Fri Jul 21 16:06:48 2023 -0700

    intel/cmat: Add lowering for cmat_bitcast
    
    v2: Use nir_component_mask(...) instead of 0xffff. Assert that source
    and destination are same size. Both suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7303315a8b5d16dc269359e19a8edcee4af99823
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Fri Jul 14 11:34:44 2023 -0700

    intel/cmat: Enable packed formats for scalar ops
    
    v2: Use nir_pack_bits and nir_unpack_bits to simplify coop_scalar
    handling. This saved 13 lines of code.
    
    v3: Allow packing factor 2 and packing factor 1 elements to be stored in
    16-bit integers.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=26c4acd8ee58239dadb0dcaf59703c7510ebbb9a
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Thu Jul 13 11:08:54 2023 -0700

    intel/cmat: Enable packed formats for binary ops
    
    v2: Use nir_pack_bits and nir_unpack_bits to simplify coop_binary
    handling. This saved 13 lines of code.
    
    v3: Allow packing factor 2 and packing factor 1 elements to be stored in
    16-bit integers.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0d314eb3ccdbbc9c050c9432ee3713da5a9853c7
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Thu Jul 13 11:05:16 2023 -0700

    intel/cmat: Enable packed formats for unary, length, and construct
    
    With this, a minimum test case passes:
    
        void main()
        {
            coopmat<float16_t, gl_ScopeSubgroup, M, N, gl_MatrixUseA> matA;
            coopmat<float, gl_ScopeSubgroup, M, N, gl_MatrixUseA> matR;
    
            matA = coopmat<float16_t, gl_ScopeSubgroup, M, N, gl_MatrixUseA>(2.0);
            matR = coopmat<float, gl_ScopeSubgroup, M, N, gl_MatrixUseA>(matA);
    
            coopMatStore(matR, result, 0, N, gl_CooperativeMatrixLayoutRowMajor);
        }
    
    v2: Use nir_vec instead of explicit nir_vec{2,4}. Also fixes a typo in
    one of the 4x8 cases.
    
    v3: Use nir_pack_bits and nir_unpack_bits to dramatically simplify
    coop_unary handling. This saved 67 lines of code.
    
    v4: Allow packing factor 2 and packing factor 1 elements to be stored in
    16-bit integers.
    
    v5: Massive update to the comment in lower_cooperative_matrix_unary_op
    with some suggestions from Caio. Add a comment and assertion around
    `nir_def *v[4]`. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
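
The packed-format commits above rely on unpacking a container word into its elements, operating on each element, and packing the results back. The sketch below models that pattern on plain integers. The helpers `unpack_bits`, `pack_bits` and `packed_unary` are hypothetical Python stand-ins, not the NIR builder functions named in the changelogs, and the mapping of "packing factor" to "elements per container" is an assumption made for this illustration.

```python
def unpack_bits(word, container_bits, elem_bits):
    """Split a container word into its elements, lowest element first."""
    count = container_bits // elem_bits  # assumed meaning of "packing factor"
    mask = (1 << elem_bits) - 1
    return [(word >> (elem_bits * i)) & mask for i in range(count)]


def pack_bits(elems, elem_bits):
    """Pack elements into one container word, lowest element first."""
    mask = (1 << elem_bits) - 1
    word = 0
    for i, e in enumerate(elems):
        word |= (e & mask) << (elem_bits * i)
    return word


def packed_unary(word, op, container_bits=32, elem_bits=8):
    """Apply op to every packed element: unpack, operate, repack."""
    elems = unpack_bits(word, container_bits, elem_bits)
    return pack_bits([op(e) for e in elems], elem_bits)


# Four 8-bit elements in a 32-bit container.
word32 = pack_bits([1, 2, 3, 4], 8)
incremented = packed_unary(word32, lambda e: e + 1)

# Two 8-bit elements in a 16-bit container.
word16 = pack_bits([7, 9], 8)
doubled = packed_unary(word16, lambda e: e * 2, container_bits=16, elem_bits=8)
```

Results that overflow an element are masked back to the element width in `pack_bits`, so one element never spills into its neighbour.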

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=75388a71c932db7114a6980ef818b6f50236d6f9
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Thu Jun 29 18:21:44 2023 -0700

    intel/cmat: Add lowering for cmat_insert and cmat_extract
    
    v2: Use nir_component_mask(...) instead of 0xffff. Suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a2ded5b26cbaa7ee5f433f046b5f2c559329740e
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Wed Jul 12 17:50:17 2023 -0700

    intel/cmat: Update get_slice_type for packed slices
    
    Also splits off another function get_slice_type_from_desc that will be
    used in future commits.
    
    v2: Allow packing factor 2 and packing factor 1 elements to be stored in
    16-bit integers.
    
    v3: Use glsl_base_type_get_bit_size.
    
    v4: Adjust packing so that a single row fills an entire GRF.
    
    v5: Add comment for get_packing_factor and some other cleanups
    there. s/cooperative_matrix/cmat/. Tighten the validation of len in
    gt_slice_from_desc. All suggested by Caio.
    
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>

URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=dba6451ce8113b7f81df95897d666d37ae5b8cee
Author: Caio Oliveira <caio.olive...@intel.com>
Date:   Tue Jun 13 19:45:49 2023 -0700

    intel/cmat: Add pass to lower cooperative matrix to subgroup operations
    
    This is just the skeleton of the implementation. Future commits will
    fill it all in.
    
    v2: Move to src/intel/compiler
    
    v3 (idr): Use vecN instead of array[N] for slice type.
    
    v4 (idr): Refactor lower_cooperative_matrix_load and
    lower_cooperative_matrix_store into a single function.
    
    v5 (idr): Remove old, verbose debug logging. Assert that entry is not
    NULL in get_coop_type_for_slice. Use nir_component_mask(...) instead of
    0xffff. s/cooperative_matrix/cmat/. All suggested by Caio.
    
    Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
    Reviewed-by: Caio Oliveira <caio.olive...@intel.com>
    
    I put both R-b on this because, at this point, we've each done equal
    parts authoring and reviewing.
    
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25994>
