Issue 115133
Summary [AArch64][GlobalISel] Overall GISel operation status
Labels backend:AArch64, llvm:globalisel
Assignees
Reporter davemgreen
This is a copy of an internal page that @chuongg3 and I kept while going through each of the operations for AArch64 GISel, making sure they don't fall back. It is not complete yet (and the internal version had a few more details), but it is better to have this upstream. Some of it might now be out of date.

A few high level comments
 - This does not include SVE; we should probably do the same elsewhere.
 - BF16 still needs to be added, but requires a new way to specify the types / operations.
 - Big-endian isn't handled yet.
 - Currently some operations widen, some promote. We should stick to one (probably widen).
 - Blank spaces usually mean not checked / not supported. We will get to the point where random testing will start to be more useful.
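
As a rough illustration of how one cell of the table could be re-checked, here is a minimal sketch that generates small LLVM IR test functions for a binary integer operation over a handful of types. The helper names `ir_type` and `binop_test` and the chosen type list are hypothetical and only for illustration; they are not taken from the internal page.

```python
def ir_type(bits, lanes=1):
    """Return the LLVM IR spelling of an integer type, e.g. i32 or <4 x i32>."""
    scalar = f"i{bits}"
    return scalar if lanes == 1 else f"<{lanes} x {scalar}>"


def binop_test(op, bits, lanes=1):
    """Emit one IR function that applies the binary integer op to the type."""
    ty = ir_type(bits, lanes)
    suffix = f"i{bits}" if lanes == 1 else f"v{lanes}i{bits}"
    return (
        f"define {ty} @{op}_{suffix}({ty} %a, {ty} %b) {{\n"
        f"entry:\n"
        f"  %r = {op} {ty} %a, %b\n"
        f"  ret {ty} %r\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # (element bits, lanes): an assumed sample covering several table columns.
    types = [(32, 1), (128, 1), (8, 8), (32, 4), (8, 2), (8, 3), (7, 4)]
    print("\n".join(binop_test("add", bits, lanes) for bits, lanes in types))
```

The printed module can then be fed to something like `llc -mtriple=aarch64 -global-isel -global-isel-abort=2 -o /dev/null`; with that setting a fallback to SelectionDAG should be reported as a warning rather than an error, so any function that triggers one corresponds to an `x` in the table.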

Legend:
 - Scalar normal = i8/i16/i32/i64
 - Vector legal = v8i8/v4i16/v2i32 + v16i8/v8i16/v4i32/v2i64
 - Vector larger/smaller = i8/i16/i32/i64 types with non-legal sizes
 - i128 = scalar/vector
 - i1 = scalar/vector
 - Scalar ext = non-power-of-2 sizes, including larger sizes
 - Vector odd widths = i8/i16/i32/i64 with non-power-of-2 widths.
 - Vector odd eltsize = non-power-of-2 elt sizes (or i128, etc).
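
One way to read the legend is as a classification of (element size, lane count) pairs into table columns. The sketch below is an interpretation, not the authors' definition: the function name `classify` is hypothetical, i1 and i128 are checked first so that vectors of those elements land in the i1 / i128 columns, and any other scalar outside i8/i16/i32/i64 is assumed to count as "Scalar ext".

```python
# Legal NEON integer vectors from the legend, as (lanes, element bits).
LEGAL_VECTORS = {(8, 8), (4, 16), (2, 32), (16, 8), (8, 16), (4, 32), (2, 64)}
NORMAL_BITS = {8, 16, 32, 64}


def is_pow2(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def classify(bits, lanes=1):
    """Map an integer type to a legend column (assumed reading of the legend)."""
    if bits == 1:
        return "i1"
    if bits == 128:
        return "i128"
    if lanes == 1:
        return "Scalar normal" if bits in NORMAL_BITS else "Scalar ext"
    if bits not in NORMAL_BITS:
        return "Vector odd eltsize"
    if not is_pow2(lanes):
        return "Vector odd widths"
    if (lanes, bits) in LEGAL_VECTORS:
        return "Vector legal"
    return "Vector larger/smaller"
```

Under these assumptions `classify(8, 2)` (v2i8) falls under "Vector larger/smaller", `classify(8, 3)` (v3i8) under "Vector odd widths", and `classify(7, 4)` (v4i7) under "Vector odd eltsize".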


|Operation| Scalar normal| Vector legal| i128| i1 | Vector larger / smaller| Scalar ext| Vector odd widths| Vector odd eltsizes| Additional Notes
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|load| y | y | | | | | | | 
|store| y | y | | | | | | | 
|bitcast? ptrtoint? inttoptr?| y | y | | | | | | | 
|memcpy? memmove? memset? bzero?| | | | | | | | | 
|Int Operation| Scalar normal| Vector legal| i128 s/v| i1 s/v| Vector larger / smaller| Scalar ext| Vector odd widths| Vector odd eltsizes| Additional Notes
|add| y| y| y/y | | y| y| x | x | https://godbolt.org/z/6c1rfWTK8
|sub| y| y| y/y | | y| y| x | x | 
|mul| y| y| y/y inefficient | | y| | | | Scalar i128 could be better. https://godbolt.org/z/8Wd8zhezc
|sdiv, udiv| y| y| y/y | | y| | | | Scalar i1 could be simpler. https://godbolt.org/z/45qMq6cvh.
|srem, urem| y| y| y/y| | y| | | | Scalar i1:
|zext, sext, anyext| y| y| | | | | | | ZEXT: Global ISel could be improved to match SDAG by using BIC for 
|trunc| y| y| y| | | | x Non-pow2 larger than 8| | 
|and| y| y| y/y | | y| | | | https://godbolt.org/z/6Y98TnYv8
|or| y| y| y/y | | y| | | | 
|xor| y| y| y/y | | y| | | | 
|  not?| y| y| y| | y| | | | https://godbolt.org/z/rh4ob1be7
|shl| y| y| y| | y (v2i8)| | | x| Scalar i8/i16 unnecessarily clears the shift amount. i1 could be simplified.
|ashr| y| y| y| | y (v2i8)| | | x| Scalar i8/i16 unnecessarily clears the shift amount. i1 could be simplified.
|lshr| y| y| y| | y (v2i8)| | | x| Scalar i8/i16 unnecessarily clears the shift amount. i1 could be simplified.
|icmp| y| y| y (i128 could be better)| x | y(v2i8)| | | | i128 could do a lot better.
|select| y| y| y| | y (v2i8)| | | | Scalar: unnecessary AND to clear upper lanes of the condition register.
|abs| y| y| y| | x| y| | | https://godbolt.org/z/Tobs7YeoT
|smin/smax/umin/umax| y| y| y| | y| x > i128| | | i1/i128 could do better. https://godbolt.org/z/j7nx789oz.
|uaddsat/usubsat/saddsat/ssubsat| y| y| y| | | y| | | https://godbolt.org/z/4MT14bfsv
|bitreverse| y| x| y| | | y| | | https://godbolt.org/z/3sd988Mhd
|bswap| y| x| y| | x| y| | | 
|ctlz| y| y| y| | y| x > i128| | | 
|cttz| y| y| y| | x| x > i128| | | 
|ctpop| y| y| y| | x| x| | | 
|fshr/fshl| y| y| y| | x| x NonPow2 > 128| | | Scalar Normal:
|  rotr/rotl?| y| y| y| | y| y| | | 
|uaddo, usubo, uadde, usube?| | | | | | | | | 
|umulo, smulo?| | | | | | | | | 
|umulh, smulh| | | | | | | | | 
|ushlsat, sshlsat| | | | | | | | | 
|smulfix, umulfix| | | | | | | | | 
|smulfixsat, umulfixsat| | | | | | | | | 
|sdivfix, udivfix| | | | | | | | | 
|sdivfixsat, udivfixsat| | | | | | | | | 
|FP Operation| Scalar normal| Vector legal| f128 s/v| | Vector smaller / larger| bf16 s/v| Vector widths| | Additional Notes
|fadd| y| y| y/y| | y| | | | https://godbolt.org/z/bYWfo9v16
|fsub| y| y| y/y| | y| | | | 
|fmul| y| y| y/y| | y| | | | 
|fma| y| y| y/y| | y| | | | https://godbolt.org/z/1osE3Whaq
|fmuladd| y| y| y/y| | y| | | | 
|fdiv| y| y| y/y| | y| | | | 
|frem| y| y| y/y| | y| | | | 
|fneg| y| y| y/y| | y| | | | https://godbolt.org/z/rz96eh3PW
|fpext| y | y | y/y| | y | | | | https://godbolt.org/z/358EG4j7r
|fptrunc| y | y| y/y| | y | | | | https://godbolt.org/z/7a7hq6j68
|fptosi, fptoui| y| y| y/y| | y | | | | 
|fptosisat, fptouisat| | | | | | | | | 
|sitofp, uitofp| y| y| y/y| | y| | | | https://godbolt.org/z/j7Prz7qj6
|fabs| y| y| y/y| | y| | | | https://godbolt.org/z/o95h4a9es
|fsqrt| y| y| y/y| | y| | | | 
|ceil, floor, trunc, rint, nearbyint| y| y| y/y| | y| | | | https://godbolt.org/z/zjMqq5oeo
|lrint, llrint, lround, llround| | | | | | | | | 
|fminnum, fmaxnum| y| y| y/y| | y| | | | 
|fminimum, fmaximum| y| y| | | y| | | | 
|fminimumnum, fmaximumnum| | | | | | | | | 
|fcopysign| y| y| y/y| | y| | | | https://godbolt.org/z/aq5bbc4jG
|fpow| y| y| y/y| | y| | | | https://godbolt.org/z/WEeWYj1e4
|fpowi| y| y| y/y| | y| | | | 
|sin, cos, etc| y| y| y/y| | y| | | | 
|fexp, fexp2, flog, flog2, flog10| y| y| y/y| | y| | | | 
|fldexp, frexp| | | | | | | | | 
|fcanonicalize| | | | | | | | | 
|is_fpclass| | | | | | | | | 
|Vector Operation| Scalar normal| | Vector legal| Vector smaller / larger| | Scalar ext| Vector odd widths| Vector odd eltsizes| Additional Notes
|insert| -| -| y| y| | -| | | 
|extract| -| -| y| y| | -| | | 
|shuffle*| -| -| | | | -| | | 
|  dup| -| -| y| | | -| | | 
|  ext| -| -| y| y| | -| | | 
|  zip1/zip2/uzp1/uzp2/trn1/trn2| -| -| y| | | -| | | 
|  tbl| -| -| y| y| | -| | | Could do with tbl2/tbl4 combines
|  reverse| -| -| | | | -| | | Needs full reverses from https://godbolt.org/z/1chrbKjhs
|  perfect shuffles| -| -| | | | -| | | 
|reduce.add| -| -| | | | -| | | Integer reductions in ISel use i32 return types. They can be i8/i16 in GISel.
|reduce.mul| -| -| | | | -| | | 
|reduce.smin/smax/umin/umax| -| -| | | | -| | | 
|reduce.and/or/xor| -| -| | | | -| | | 
|reduce.fadd| -| -| | | | -| | | Needs sequential
|reduce.fmul| -| -| | | | -| | | Needs sequential, plus #73309
|reduce.fmin/fmax/fminimum/fmaximum| -| -| y| | | -| x |  | 