dmgreen added a comment.

I like how this uses a splat for all the register arguments. That sounds like a good idea.

The ones that worry me are the floating-point instructions. Last time we tried those, it actually caused performance regressions because of extra sp->gpr movs left in the loop. If this is just the backend patterns, though, and not the sinking of splats into loops too, then I think it should be OK. On its own, I don't think it will usually cause problems, and some quick tests seem to verify that.


================
Comment at: clang/test/CodeGen/arm-mve-intrinsics/vaddq.c:2
 // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
-// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
-// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -mem2reg | FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -S -emit-llvm -o - %s | opt -S -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple thumbv8.1m.main-arm-none-eabi -target-feature +mve.fp -mfloat-abi hard -fallow-half-arguments-and-returns -O0 -disable-O0-optnone -DPOLYMORPHIC -S -emit-llvm -o - %s | opt -S -O1 | FileCheck %s
----------------
Why is this running the entire -O1 pass pipeline? These tests deliberately use a limited subset of passes so that they don't need adjusting with every mid-end LLVM change (while still not being littered with clang's verbose IR output). I'm guessing the half args are being a pain again. Is it something to do with halves?

================
Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:4496
+ UnpredSign)),
+ (VTI.Vec (inst (VTI.Vec MQPR:$Qm), (i32 GPR:$val)))>;
+ // Predicated version
----------------
These GPRs can use the same regclass as the instruction. rGPR in this case, I think?

================
Comment at: llvm/lib/Target/ARM/ARMInstrMVE.td:4566
+ 0b0, VTI.Unsigned>;
+ defvar unpred_op = !if(VTI.Unsigned, unpred_op_u, unpred_op_s);
+ defm : MVE_vec_scalar_int_pat_m<!cast<Instruction>(NAME), VTI,
----------------
I find all these ifs at different levels a little hard to follow. It looks OK, but is it possible to rearrange things so that it isn't needed here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74620/new/

https://reviews.llvm.org/D74620

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits