| Field | Value |
| --- | --- |
| Issue | 115133 |
| Summary | [AArch64][GlobalISel] Overall GISel operation status |
| Labels | backend:AArch64, llvm:globalisel |
| Assignees | |
| Reporter | davemgreen |

This is a copy of an internal page that @chuongg3 and I kept while going through each of the operations for AArch64 GISel, making sure they don't fall back. Not all of it is complete yet (and the internal version had a few more details), but it is better to have this upstream. Some of it might now be out of date.
A few high-level comments:
- This does not include SVE; we should probably do the same elsewhere.
- BF16 still needs to be added, but requires a new way to specify the types / operations.
- BigEndian isn't handled yet.
- Currently some operations widen, some promote. We should stick to one (probably widen).
- Blank spaces usually mean not checked / not supported. We will get to the point where random testing starts to be more useful.
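
As a rough illustration of how a single cell of the table could be checked, the sketch below emits small IR functions for one binary integer operation over a handful of types. The helper names (`ir_type`, `type_tag`, `binop_function`, `build_module`, `llc_command`) are hypothetical and not part of LLVM, and the type list is only an example. The generated file would then be passed to `llc` with `-global-isel -global-isel-abort=2`, which, as far as we know, falls back to SelectionDAG with a warning instead of aborting; treat the exact invocation as an assumption to adapt locally.

```python
# Sketch only: generate LLVM IR test functions for one binary integer op.
# All helper names here are hypothetical; nothing below is an LLVM API.

def ir_type(elt_bits, lanes=1):
    """Return the LLVM IR spelling of a type, e.g. 'i32' or '<4 x i32>'."""
    scalar = f"i{elt_bits}"
    return scalar if lanes == 1 else f"<{lanes} x {scalar}>"


def type_tag(elt_bits, lanes=1):
    """Return a short tag usable in a function name, e.g. 'i32' or 'v4i32'."""
    return f"i{elt_bits}" if lanes == 1 else f"v{lanes}i{elt_bits}"


def binop_function(op, elt_bits, lanes=1):
    """Emit one IR function that applies `op` to two arguments of the type."""
    ty = ir_type(elt_bits, lanes)
    name = f"{op}_{type_tag(elt_bits, lanes)}"
    return (
        f"define {ty} @{name}({ty} %a, {ty} %b) {{\n"
        f"  %r = {op} {ty} %a, %b\n"
        f"  ret {ty} %r\n"
        f"}}\n"
    )


def build_module(op, types):
    """Concatenate one function per (elt_bits, lanes) pair into a module."""
    return "\n".join(binop_function(op, bits, lanes) for bits, lanes in types)


def llc_command(ll_path):
    """Build (but do not run) an llc command line for a fallback check.

    Assumption: -global-isel-abort=2 reports fallbacks as warnings.
    """
    return ["llc", "-mtriple=aarch64", "-global-isel",
            "-global-isel-abort=2", "-o", "/dev/null", ll_path]


if __name__ == "__main__":
    # Example types only: i32, v4i32, i128, v3i32 (odd width), i24 (scalar ext).
    example_types = [(32, 1), (32, 4), (128, 1), (32, 3), (24, 1)]
    print(build_module("add", example_types))
```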
Legend:
- Scalar normal = i8/i16/i32/i64
- Vector legal = v8i8/v4i16/v2i32 + v16i8/v8i16/v4i32/v2i64
- Vector larger/smaller = i8/i16/i32/i64 types with non-legal sizes
- i128 = scalar/vector
- i1 = scalar/vector
- Scalar ext = non-power-2 sizes, including larger sizes
- Vector odd widths = i8/i16/i32/i64 with non-power-2 widths.
- Vector odd eltsize = non-power-2 elt sizes (or i128, etc).
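
The legend can be read as a small classification over (element size, lane count). The sketch below is one possible reading, not a definition from the issue: the function name `classify` is hypothetical, i128 and i1 are given priority over the other buckets, any scalar outside i8/i16/i32/i64 is treated as "Scalar ext", and "Vector larger/smaller" is taken to mean power-of-2 lane counts that are not in the legal list.

```python
# Sketch only: map a type onto the legend's column names.
# Bucket priorities are assumptions where the legend overlaps (e.g. i128).

# Legal NEON vector types from the legend, as (lanes, elt_bits).
LEGAL_VECTORS = {(8, 8), (4, 16), (2, 32), (16, 8), (8, 16), (4, 32), (2, 64)}
NORMAL_ELT_BITS = (8, 16, 32, 64)


def is_pow2(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def classify(elt_bits, lanes=1):
    """Return the legend bucket for a scalar (lanes == 1) or vector type."""
    if elt_bits == 1:
        return "i1"                      # scalar or vector
    if elt_bits == 128:
        return "i128"                    # scalar or vector
    if lanes == 1:
        if elt_bits in NORMAL_ELT_BITS:
            return "Scalar normal"
        return "Scalar ext"              # assumption: every other scalar size
    if elt_bits not in NORMAL_ELT_BITS:
        return "Vector odd eltsize"
    if (lanes, elt_bits) in LEGAL_VECTORS:
        return "Vector legal"
    if not is_pow2(lanes):
        return "Vector odd widths"
    return "Vector larger/smaller"


if __name__ == "__main__":
    for bits, lanes in [(32, 1), (32, 4), (8, 2), (32, 3), (24, 1), (24, 4)]:
        print(bits, lanes, classify(bits, lanes))
```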
|Operation| Scalar normal| Vector legal| i128| i1 | Vector larger / smaller| Scalar ext| Vector odd widths| Vector odd eltsizes| Additional Notes
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|load| y | y | | | | | | |
|store| y | y | | | | | | |
|bitcast? ptrtoint? inttoptr?| y | y | | | | | | |
|memcpy? memmove? memset? bzero?| | | | | | | | |
|Int Operation| Scalar normal| Vector normal| i128 s/v| i1 s/v| Vector larger / smaller| Scalar non-power-2| Vector odd widths| Vector odd eltsizes| Additional Notes
|add| y| y| y/y | | y| y| x | x | https://godbolt.org/z/6c1rfWTK8
|sub| y| y| y/y | | y| y| x | x |
|mul| y| y| y/y inefficient | | y| | | | Scalar i128 could be better. https://godbolt.org/z/8Wd8zhezc
|sdiv, udiv| y| y| y/y | | y| | | | Scalar i1 could be simpler. https://godbolt.org/z/45qMq6cvh.
|srem, urem| y| y| y/y| | y| | | | Scalar i1:
|zext, sext, anyext| y| y| | | | | | | ZEXT: Global ISel could be improved to match SDAG by using BIC for
|trunc| y| y| y| | | | x Non-pow2 larger than 8| |
|and| y| y| y/y | | y| | | | https://godbolt.org/z/6Y98TnYv8
|or| y| y| y/y | | y| | | |
|xor| y| y| y/y | | y| | | |
| not?| y| y| y| | y| | | | https://godbolt.org/z/rh4ob1be7
|shl| y| y| y| | y (v2i8)| | | x| Scalar i8/i16 unnecessarily clear shift amount. i1 could simplify.
|ashr| y| y| y| | y(v2i8)| | | x| Scalar i8/i16 unnecessarily clear shift amount. i1 could simplify.
|lshr| y| y| y| | y(v2i8)| | | x| Scalar i8/i16 unnecessarily clear shift amount. i1 could simplify.
|icmp| y| y| y (i128 could be better)| x | y(v2i8)| | | | i128 could do a lot better.
|select| y| y| y| | y (v2i8)| | | | Scalar: Unnecessary AND to clear upper lanes of the condition register
|abs| y| y| y| | x| y| | | https://godbolt.org/z/Tobs7YeoT
|smin/smax/umin/umax| y| y| y| | y| x > i128| | | i1/i128 could do better. https://godbolt.org/z/j7nx789oz.
|uaddsat/usubsat/saddsat/ssubsat| y| y| y| | | y| | | https://godbolt.org/z/4MT14bfsv
|bitreverse| y| x| y| | | y| | | https://godbolt.org/z/3sd988Mhd
|bswap| y| x| y| | x| y| | |
|ctlz| y| y| y| | y| x > i128| | |
|cttz| y| y| y| | x| x > i128| | |
|ctpop| y| y| y| | x| x| | |
|fshr/fshl| y| y| y| | x| x NonPow2 > 128| | | Scalar Normal:
| rotr/rotl?| y| y| y| | y| y| | |
|uaddo, usubo, uadde, usube?| | | | | | | | |
|umulo, smulo?| | | | | | | | |
|umulh, smulh| | | | | | | | |
|ushlsat, sshlsat| | | | | | | | |
|smulfix, umulfix| | | | | | | | |
|smulfixsat, umulfixsat| | | | | | | | |
|sdivfix, udivfix| | | | | | | | |
|sdivfixsat, udivfixsat| | | | | | | | |
|FP Operation| Scalar normal| Vector legal| f128 s/v| | Vector smaller / larger| bf16 s/v| Vector widths| | Additional Notes
|fadd| y| y| y/y| | y| | | | https://godbolt.org/z/bYWfo9v16
|fsub| y| y| y/y| | y| | | |
|fmul| y| y| y/y| | y| | | |
|fma| y| y| y/y| | y| | | | https://godbolt.org/z/1osE3Whaq
|fmuladd| y| y| y/y| | y| | | |
|fdiv| y| y| y/y| | y| | | |
|frem| y| y| y/y| | y| | | |
|fneg| y| y| y/y| | y| | | | https://godbolt.org/z/rz96eh3PW
|fpext| y | y | y/y| | y | | | | https://godbolt.org/z/358EG4j7r
|fptrunc| y | y| y/y| | y | | | | https://godbolt.org/z/7a7hq6j68
|fptosi, fptoui| y| y| y/y| | y | | | |
|fptosisat, fptouisat| | | | | | | | |
|sitofp, uitofp| y| y| y/y| | y| | | | https://godbolt.org/z/j7Prz7qj6
|fabs| y| y| y/y| | y| | | | https://godbolt.org/z/o95h4a9es
|fsqrt| y| y| y/y| | y| | | |
|ceil, floor, trunc, rint, nearbyint| y| y| y/y| | y| | | | https://godbolt.org/z/zjMqq5oeo
|lrint, llrint, lround, llround| | | | | | | | |
|fminnum, fmaxnum| y| y| y/y| | y| | | |
|fminimum, fmaximum| y| y| | | y| | | |
|fminimumnum, fmaximumnum| | | | | | | | |
|fcopysign| y| y| y/y| | y| | | | https://godbolt.org/z/aq5bbc4jG
|fpow| y| y| y/y| | y| | | | https://godbolt.org/z/WEeWYj1e4
|fpowi| y| y| y/y| | y| | | |
|sin, cos, etc| y| y| y/y| | y| | | |
|fexp, fexp2, flog, flog2, flog10| y| y| y/y| | y| | | |
|fldexp, frexp| | | | | | | | |
|fcanonicalize| | | | | | | | |
|is_fpclass| | | | | | | | |
|Vector Operation| Scalar normal| | Vector legal| Vector smaller / larger| | Scalar ext| Vector odd widths| Vector odd eltsizes| Additional Notes
|insert| -| -| y| y| | -| | |
|extract| -| -| y| y| | -| | |
|shuffle*| -| -| | | | -| | |
| dup| -| -| y| | | -| | |
| ext| -| -| y| y| | -| | |
| zip1/zip2/uzp1/uzp2/trn1/trn2| -| -| y| | | -| | |
| tbl| -| -| y| y| | -| | | Could do with tbl2/tbl4 combines
| reverse| -| -| | | | -| | | Needs full reverses from https://godbolt.org/z/1chrbKjhs
| perfect shuffles| -| -| | | | -| | |
|reduce.add| -| -| | | | -| | | Integer reductions in ISel use i32 return types. They can be i8/i16 in GISel.
|reduce.mul| -| -| | | | -| | |
|reduce.smin/smax/umin/umax| -| -| | | | -| | |
|reduce.and/or/xor| -| -| | | | -| | |
|reduce.fadd| -| -| | | | -| | | Needs sequential
|reduce.fmul| -| -| | | | -| | | Needs sequential, plus #73309
|reduce.fmin/fmax/fminimum/fmaximum| -| -| y| | | -| x | |