[Public] Hi,
PFA the patch that enables support for the next-generation AMD Zen5 CPU via -march=znver5, with a basic znver5 scheduler model.
The znver5 scheduler model is combined with the existing znver4 scheduler model into a single file, "zn4zn5.md".

Automata size, measured with the command "size -A gcc/insn-automata.o":
before patch: 1575958
after patch:  1670964

Thanks and Regards
Karthiban

-----Original Message-----
From: Anbazhagan, Karthiban
Sent: Wednesday, February 14, 2024 6:54 PM
To: Jan Hubicka <hubi...@ucw.cz>
Cc: gcc-patches@gcc.gnu.org; Kumar, Venkataramanan <venkataramanan.ku...@amd.com>; Joshi, Tejas Sanjay <tejassanjay.jo...@amd.com>; Nagarajan, Muthu kumar raj <muthukumarraj.nagara...@amd.com>; Gopalasubramanian, Ganesh <ganesh.gopalasubraman...@amd.com>
Subject: RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

Hi,

>> I assume the znver5 costs are the same as znver4 so far?

The costing table has been updated for the entries below.

+  {COSTS_N_INSNS (10),    /* cost of a divide/mod for QI.  */
+   COSTS_N_INSNS (11),    /* HI.  */
+   COSTS_N_INSNS (16),    /* DI.  */
+   COSTS_N_INSNS (16)},   /* other.  */
+  COSTS_N_INSNS (10),     /* cost of DIVSS instruction.  */
+  COSTS_N_INSNS (14),     /* cost of SQRTSS instruction.  */
+  COSTS_N_INSNS (20),     /* cost of SQRTSD instruction.  */

>> we can just change znver4.md to also work for znver5?

We will combine the znver4 and znver5 scheduler descriptions into one.

Thanks and Regards
Karthiban

-----Original Message-----
From: Jan Hubicka <hubi...@ucw.cz>
Sent: Monday, February 12, 2024 9:30 PM
To: Anbazhagan, Karthiban <karthiban.anbazha...@amd.com>
Cc: gcc-patches@gcc.gnu.org; Kumar, Venkataramanan <venkataramanan.ku...@amd.com>; Joshi, Tejas Sanjay <tejassanjay.jo...@amd.com>; Nagarajan, Muthu kumar raj <muthukumarraj.nagara...@amd.com>; Gopalasubramanian, Ganesh <ganesh.gopalasubraman...@amd.com>
Subject: Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.

Hi,
> gcc/ChangeLog:
>       * common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
>       * common/config/i386/i386-common.cc (processor_names): Add znver5.
>       (processor_alias_table): Likewise.
>       * common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
>       family.
>       (processor_subtypes): Add znver5.
>       * config.gcc (x86_64-*-* |...): Likewise.
>       * config/i386/driver-i386.cc (host_detect_local_cpu): Let
>       march=native detect znver5 cpu's.
>       * config/i386/i386-c.cc (ix86_target_macros_internal): Add znver5.
>       * config/i386/i386-options.cc (m_ZNVER5): New definition
>       (processor_cost_table): Add znver5.
>       * config/i386/i386.cc (ix86_reassociation_width): Likewise.
>       * config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5
>       (PTA_ZNVER5): New definition.
>       * config/i386/i386.md (define_attr "cpu"): Add znver5.
>       (Scheduling descriptions) Add znver5.md.
>       * config/i386/x86-tune-costs.h (znver5_cost): New definition.
>       * config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
>       (ix86_adjust_cost): Likewise.
>       * config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
>       (avx512_store_by_pieces): Add m_ZNVER5.
>       * doc/extend.texi: Add znver5.
>       * doc/invoke.texi: Likewise.
>       * config/i386/znver5.md: New.
>
> gcc/testsuite/ChangeLog:
>       * g++.target/i386/mv29.C: Handle znver5 arch.
>       * gcc.target/i386/funcspec-56.inc: Likewise.

> +/* This table currently replicates znver4_cost table. */
> +struct processor_costs znver5_cost = {

I assume the znver5 costs are the same as znver4 so far?

> +;; AMD znver5 Scheduling
> +;; Modeling automatons for zen decoders, integer execution pipes,
> +;; AGU pipes, branch, floating point execution and fp store units.
> +(define_automaton "znver5, znver5_ieu, znver5_idiv, znver5_fdiv, znver5_agu, znver5_fpu, znver5_fp_store")
> +
> +;; Decoders unit has 4 decoders and all of them can decode fast path
> +;; and vector type instructions.
> +(define_cpu_unit "znver5-decode0" "znver5")
> +(define_cpu_unit "znver5-decode1" "znver5")
> +(define_cpu_unit "znver5-decode2" "znver5")
> +(define_cpu_unit "znver5-decode3" "znver5")

Duplicating the znver4 description for znver5 before the scheduler description is tuned basically just increases the compiler binary size (scheduler models are quite large). Depending on the changes between generations, I think we should try to share CPU unit DFAs where it makes sense (i.e. where a shared DFA is smaller than two DFAs). So perhaps until the scheduler is tuned, we can just change znver4.md to also work for znver5?

Honza
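
As a rough check of the automata sizes quoted at the top of this thread, the growth from combining the two scheduler models can be worked out as below. This is a minimal sketch and not part of the patch: the function name `growth` is hypothetical, and the two figures are taken exactly as reported from "size -A gcc/insn-automata.o" (the thread does not say which row they come from; they are assumed here to be totals in bytes).

```python
# Sizes reported in the thread for gcc/insn-automata.o (from `size -A`).
# Assumption: both figures are totals in bytes.
SIZE_BEFORE = 1575958  # before the patch
SIZE_AFTER = 1670964   # after the patch


def growth(before, after):
    """Return (absolute growth, growth as a percentage of `before`)."""
    delta = after - before
    return delta, 100.0 * delta / before


delta, percent = growth(SIZE_BEFORE, SIZE_AFTER)
print(f"insn-automata.o grew by {delta} ({percent:.2f}%)")
```

Under those assumptions the combined zn4zn5.md model adds roughly 6% to the automata object.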
0001-Add-AMD-znver5-processor-enablement-with-scheduler-model.patch