On Mon, 21 Nov 2022, Joshi, Tejas Sanjay wrote:

> I have addressed all your comments in the patch attached here. I have also
> used znver4-direct for avx512 insns.

Thanks.

> * This patch increased the insn-automata.cc size from 201502 to 214902.

Assuming it's the number of lines of code, I have 102847, perhaps you're
measuring without my patches? You can use 'size -A gcc/insn-automata.o'
to measure binary size growth.

> * Compile time and binary size on my machine remains same.
> * Make check and bootstrap build have no issues.
> * Spec cpu2017 also don't have any issues with this patch.
> 
> Is this ok for trunk?

I cannot approve or reject your patch, this is up to Honza who I believe
was investigating if combining this with older Zen models makes sense.
In the meantime, I see a few more issues that can be easily corrected,
please see below.

> --- /dev/null
> +++ b/gcc/config/i386/znver4.md
> +;; FSQRT
> +(define_insn_reservation "znver4_fsqrt" 22
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "fpspc")
> +                                (and (eq_attr "mode" "XF")
> +                                     (eq_attr "memory" "none"))))
> +                      "znver4-direct,znver4-fdiv*20")

This should be znver4-fdiv*10 (not *20) according to Agner Fog's measurements.

> +;; FDIV
> +(define_insn_reservation "znver4_fp_div" 15
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "fdiv")
> +                                (eq_attr "memory" "none")))
> +                      "znver4-direct,znver4-fdiv*15")

znver4-fdiv*6 instead of *15 here and in two patterns following this one.

> +(define_insn_reservation "znver4_sse_div_pd" 13
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V4DF,V2DF,V1DF")
> +                                 (eq_attr "memory" "none"))))
> +                      "znver4-direct,znver4-fdiv*7")

Agner Fog's measurements indicate fdiv*5 here.

> +
> +(define_insn_reservation "znver4_sse_div_ps" 10
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V8SF,V4SF,V2SF,SF")
> +                                 (eq_attr "memory" "none"))))
> +                      "znver4-direct,znver4-fdiv*5")

Agner Fog's measurements indicate fdiv*3 here.

> +
> +(define_insn_reservation "znver4_sse_div_pd_load" 20
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V4DF,V2DF,V1DF")
> +                                 (eq_attr "memory" "load"))))
> +                      "znver4-direct,znver4-load,znver4-fdiv*7")

fdiv*5?

> +
> +(define_insn_reservation "znver4_sse_div_ps_load" 17
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V8SF,V4SF,V2SF,SF")
> +                                 (eq_attr "memory" "load"))))
> +                      "znver4-direct,znver4-load,znver4-fdiv*5")

fdiv*3?

> +(define_insn_reservation "znver4_sse_div_pd_evex" 13
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V8DF")
> +                                 (eq_attr "memory" "none"))))
> +                      "znver4-direct,znver4-fdiv*7")

This should be twice as much as the corresponding SSE/AVX instruction
(fdiv*14 or fdiv*10; Agner Fog measured 9 cycles as reciprocal throughput).

> +
> +(define_insn_reservation "znver4_sse_div_ps_evex" 10
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V16SF")
> +                                 (eq_attr "memory" "none"))))
> +                      "znver4-direct,znver4-fdiv*5")

Likewise (fdiv*6).

> +(define_insn_reservation "znver4_sse_div_pd_evex_load" 20
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V8DF")
> +                                 (eq_attr "memory" "load"))))
> +                      "znver4-direct,znver4-load,znver4-fdiv*7")

Likewise.

> +(define_insn_reservation "znver4_sse_div_ps_evex_load" 17
> +                      (and (eq_attr "cpu" "znver4")
> +                           (and (eq_attr "type" "ssediv")
> +                                (and (eq_attr "mode" "V16SF")
> +                                 (eq_attr "memory" "load"))))
> +                      "znver4-direct,znver4-load,znver4-fdiv*5")

Likewise.

Alexander

Reply via email to