https://gcc.gnu.org/g:dd682ea04149261d27c871879c0c81a94ece8cd8

commit r17-895-gdd682ea04149261d27c871879c0c81a94ece8cd8
Author: Kewen Lin <[email protected]>
Date:   Thu May 28 11:22:57 2026 +0000

    i386: Refine c86-4g fdiv scheduling model
    
    Commit r17-258 introduced separate c86-4g fdiv units to avoid the
    automaton explosion caused by modeling the whole divider latency on
    the normal FPU pipes.  However, the real hardware may keep the
    associated FPU pipe occupied for some cycles at both the beginning
    and the end of an fdiv or sqrt operation.  Following Alexander's
    suggestion in [1], this patch still keeps the long-latency part on
    the dedicated fdiv unit but models only a bounded part of the FPU
    pipe occupancy: the first four cycles reserve both the selected FPU
    pipe and the fdiv unit, and the remaining cycles reserve only the
    fdiv unit.
    
    Taking r17-258 as baseline, I tried K = 1,2,3,4 for
    
      fpu,divider*N -> (fpu+divider)*K, divider*(N-K)
    
    and measured the time for build/genautomata and the top 100 symbol
    sizes of insn-automata.o (baseline normalized as 100) as below:
    
    1) without any other changes:
                  time     size
      baseline    100      100
      r17-203     340.0    629.3
      K1          100.3    100
      K2          105.5    112.5
      K3          112.8    129
      K4          119.4    141
    
    2) splitting fpu0/fpu2 and fpu1/fpu3 into paired automata:
                  time     size
      baseline    100      100
      r17-203     340.0    629.3
      KS1         79.6     43.3
      KS2         79.8     43.3
      KS3         79.6     43.3
      KS4         79.4     43.3
    
    It turns out that if we want to model the FPU occupancy for the
    first few cycles, separating the involved fpu1/fpu3 from the
    original fpu automaton works better.  So this patch splits
    fpu0/fpu2 and fpu1/fpu3 into two paired automata, and this extra
    coupling does not grow the main FPU automata significantly.
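
    As an illustration only (not part of the patch), the rewrite
    fpu,divider*N -> (fpu+divider)*K, divider*(N-K) can be sketched as
    a small string builder.  The helper name split_fdiv_reservation is
    hypothetical; the unit names are taken from c86-4g-m7.md.

```python
def split_fdiv_reservation(fpu: str, divider: str, n: int, k: int) -> str:
    """Build '(fpu+divider)*K,divider*(N-K)' for 0 < k <= n.

    The first k cycles reserve both the FPU pipe and the divider; the
    remaining n - k cycles reserve only the divider.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    head = f"({fpu}+{divider})*{k}"
    tail_cycles = n - k
    # When k == n there is no divider-only tail to append.
    return head if tail_cycles == 0 else f"{head},{divider}*{tail_cycles}"


# Hypothetical usage: the four K values tried above, with N = 15.
for k in (1, 2, 3, 4):
    print(k, split_fdiv_reservation("c86-4g-m7-fpu1", "c86-4g-m7-fdiv1", 15, k))
```

    With N = 15 and K = 4 this gives a 4-cycle shared part followed by
    an 11-cycle divider-only part, the same 4/11 split that the
    c86-4g-m7-fp1div1_fp3div3_x4x11 alias spells out for each pipe.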
    
    This patch also corrects some other modeling omissions like:
    
      - Fix the off-by-one-cycle latency typo in c86_4g_fp_op_idiv_load.
      - Merge the old c86_4g_m7 idiv DI/SI/HI reservations after
        aligning their latency and divider unit occupancy (with
        updated values), while keeping QI separate.
      - Adjust reservation units in templates like
        c86_4g_m7_avx_vpinsr_reg_load and c86_4g_m7_avx512_sseadd_xy
        etc.
      - Add missing reservation units and unit occupancy in templates
        like c86_4g_m7_avx512_permi2_ymm and
        c86_4g_m7_sse_sseiadd_hplus_load etc.
      - Adjust reservation units and unit occupancy in templates like
        c86_4g_m7_avx512_perm_zmm_imm, c86_4g_m7_avx512_expand and
        c86_4g_m7_avx512_ssemul etc.
    
    It also introduces some reusable reservation aliases to simplify
    the modeling.
    
    I tested build time for i686 bootstrapping in a docker container:
      - r17-202: 2437s (before c86-4g support)
      - r17-203: 7291s (c86-4g support)
      - r17-258: 2646s (tweaking for build time)
      - this: 2358s
    It looks like this patch improves build time (it is even faster
    than r17-202, though that small gap may just be jitter).
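
    For readers who prefer ratios, the sketch below normalizes the four
    build times quoted above against a chosen baseline.  It is only
    arithmetic on those numbers; the helper name relative_to is
    hypothetical.

```python
# Wall-clock seconds quoted above for the i686 bootstrap.
BUILD_SECONDS = {
    "r17-202": 2437,
    "r17-203": 7291,
    "r17-258": 2646,
    "this": 2358,
}


def relative_to(baseline: str, times: dict[str, int]) -> dict[str, float]:
    """Express every build time as a percentage of the baseline's time."""
    base = times[baseline]
    return {name: round(100.0 * secs / base, 1) for name, secs in times.items()}


# Print each revision as a percentage of the r17-258 build time.
for name, pct in relative_to("r17-258", BUILD_SECONDS).items():
    print(f"{name:8s} {pct:6.1f}")
```

    Against r17-258 this patch comes out at roughly 89% of the build
    time; these are single measurements, so small differences should be
    read with the jitter caveat above in mind.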
    
    The symbol sizes are improved as below:
    
    nm -CS -t d --defined-only gcc/insn-automata.o \
        | sed 's/^[0-9]* 0*//' \
        | sort -n | tail -20
    
    with r17-258:
    
      20068 r bdver1_fp_transitions
      22354 r c86_4g_m7_ieu_min_issue_delay
      26208 r slm_min_issue_delay
      26580 t internal_min_issue_delay(int, DFA_chip*)
      26869 t internal_state_transition(int, DFA_chip*)
      27244 r bdver1_fp_min_issue_delay
      28518 r glm_check
      28518 r glm_transitions
      33690 r geode_min_issue_delay
      33728 r c86_4g_fp_transitions
      45436 r znver4_fpu_min_issue_delay
      46980 r bdver3_fp_min_issue_delay
      49428 r glm_min_issue_delay
      53730 r btver2_fp_min_issue_delay
      53760 r znver1_fp_transitions
      89414 r c86_4g_m7_ieu_transitions
      93960 r bdver3_fp_transitions
      181744 r znver4_fpu_transitions
      326322 r c86_4g_m7_fpu_min_issue_delay
      1305288 r c86_4g_m7_fpu_transitions
    
    with this:
    
      17872 r print_reservation(_IO_FILE*, rtx_insn*)::...
      20068 r bdver1_fp_check
      20068 r bdver1_fp_transitions
      22016 r c86_4g_m7_fpu02_transitions
      22354 r c86_4g_m7_ieu_min_issue_delay
      26208 r slm_min_issue_delay
      27244 r bdver1_fp_min_issue_delay
      28199 t internal_min_issue_delay(int, DFA_chip*)
      28362 t internal_state_transition(int, DFA_chip*)
      28518 r glm_check
      28518 r glm_transitions
      33690 r geode_min_issue_delay
      45436 r znver4_fpu_min_issue_delay
      46980 r bdver3_fp_min_issue_delay
      49428 r glm_min_issue_delay
      53730 r btver2_fp_min_issue_delay
      53760 r znver1_fp_transitions
      89414 r c86_4g_m7_ieu_transitions
      93960 r bdver3_fp_transitions
      181744 r znver4_fpu_transitions
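
    The two listings can also be compared mechanically.  The sketch
    below assumes lines shaped like "<size> <type> <name>", as produced
    by the pipeline above; the helper names parse_sizes and dropped_out
    are hypothetical, and the sample input is a subset of the listings.
    Note that a symbol missing from a "tail -20" listing has only left
    the 20 largest; it has not necessarily been removed.

```python
def parse_sizes(listing: str) -> dict[str, int]:
    """Map symbol name -> size from lines shaped like '<size> <type> <name>'."""
    sizes = {}
    for line in listing.splitlines():
        # maxsplit=2 keeps demangled names containing spaces in one piece.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit():
            sizes[parts[2]] = int(parts[0])
    return sizes


def dropped_out(before: dict[str, int], after: dict[str, int]) -> list[str]:
    """Names present in 'before' but absent from 'after', largest first."""
    gone = [name for name in before if name not in after]
    return sorted(gone, key=lambda name: before[name], reverse=True)


before = parse_sizes("""
  33728 r c86_4g_fp_transitions
  181744 r znver4_fpu_transitions
  326322 r c86_4g_m7_fpu_min_issue_delay
  1305288 r c86_4g_m7_fpu_transitions
""")
after = parse_sizes("""
  22016 r c86_4g_m7_fpu02_transitions
  181744 r znver4_fpu_transitions
""")
print(dropped_out(before, after))
```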
    
    Based on random sampling of SPEC2017 benchmarks 525.x264_r and
    521.wrf_r, I verified that the new modeling introduces no
    significant compilation overhead.  Testing with a single job on a
    c86-4g-m7 machine revealed no impact on x264 and a tiny increase
    for wrf (~0.3%).
    
    [1] https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716681.html
    
    gcc/ChangeLog:
    
            * config/i386/c86-4g-m7.md (c86_4g_m7_fpu): Remove automaton.
            (c86_4g_m7_fpu02): New automaton.
            (c86_4g_m7_fpu13): Ditto.
            (c86-4g-m7-fpu0): Move to c86_4g_m7_fpu02 automaton.
            (c86-4g-m7-fpu1): Move to c86_4g_m7_fpu13 automaton.
            (c86-4g-m7-fpu2): Move to c86_4g_m7_fpu02 automaton.
            (c86-4g-m7-fpu3): Move to c86_4g_m7_fpu13 automaton.
            (c86-4g-m7-fdiv): Remove cpu unit.
            (c86-4g-m7-fdiv1): New cpu unit.
            (c86-4g-m7-fdiv3): Ditto.
            (c86-4g-m7-fpu_0_3): New reservation.
            (c86-4g-m7-fpu_1_3x2): Ditto.
            (c86-4g-m7-fpu_1_3x3): Ditto.
            (c86-4g-m7-fpu_1_3x6): Ditto.
            (c86-4g-m7-fpux2): Ditto.
            (c86-4g-m7-fpux4): Ditto.
            (c86-4g-m7-fpux6): Ditto.
            (c86-4g-m7-fpux8): Ditto.
            (c86-4g-m7-fpux16): Ditto.
            (c86-4g-m7-fp1fdiv1x4): Ditto.
            (c86-4g-m7-fp3fdiv3x4): Ditto.
            (c86-4g-m7-fdiv13): Ditto.
            (c86-4g-m7-fp13div13): Ditto.
            (c86-4g-m7-fp13div13x4): Ditto.
            (c86-4g-m7-fp1div1_fp3div3_x4x8): Ditto.
            (c86-4g-m7-fp1div1_fp3div3_x4x9): Ditto.
            (c86-4g-m7-fp1div1_fp3div3_x4x11): Ditto.
            (c86-4g-m7-fp1div1_fp3div3_x4x15): Ditto.
            (c86-4g-m7-fp1div1_fp3div3_x4x18): Ditto.
            (c86_4g_m7_idiv): New reservation.
            (c86_4g_m7_idiv_QI): Adjust reservation latency and unit occupancy.
            (c86_4g_m7_idiv_load): New reservation.
            (c86_4g_m7_idiv_QI_load): Adjust reservation latency and unit
            occupancy.
            (c86_4g_m7_idiv_DI): Remove reservation.
            (c86_4g_m7_idiv_SI): Ditto.
            (c86_4g_m7_idiv_HI): Ditto.
            (c86_4g_m7_idiv_DI_load): Ditto.
            (c86_4g_m7_idiv_SI_load): Ditto.
            (c86_4g_m7_idiv_HI_load): Ditto.
            (c86_4g_m7_sse_insertimm): Adjust reservation units and unit
            occupancy.
            (c86_4g_m7_sse_insert): Ditto.
            (c86_4g_m7_fp_sqrt): Adjust reservation.
            (c86_4g_m7_fp_div): Ditto.
            (c86_4g_m7_fp_div_load): Ditto.
            (c86_4g_m7_fp_idiv_load): Ditto.
            (c86_4g_m7_sse_pinsr_reg): Adjust reservation units and unit
            occupancy.
            (c86_4g_m7_sse_pinsr_reg_load): Ditto.
            (c86_4g_m7_avx_vpinsr_reg): Ditto.
            (c86_4g_m7_avx_vpinsr_reg_load): Ditto.
            (c86_4g_m7_avx512_perm_xmm): Delete the prefix condition.
            (c86_4g_m7_avx512_perm_xmm_opload): Ditto.
            (c86_4g_m7_avx512_permi2_ymm): Adjust reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_permi2_zmm): Ditto.
            (c86_4g_m7_avx512_permi2_ymm_load): Ditto.
            (c86_4g_m7_avx512_permi2_zmm_load): Ditto.
            (c86_4g_m7_avx512_perm_zmm_imm): Ditto.
            (c86_4g_m7_avx512_perm_zmm_imm_load): Ditto.
            (c86_4g_m7_avx512_perm_zmm_noimm): Ditto.
            (c86_4g_m7_sse_perm_zmm_noimm_load): Ditto.
            (c86_4g_m7_avx_perm_ymm): Remove.
            (c86_4g_m7_avx_perm_ymem): Ditto.
            (c86_4g_m7_avx512_shuf_zmm): Adjust reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_shuf_zmem): Ditto.
            (c86_4g_m7_avx512_cmpestr): Ditto.
            (c86_4g_m7_avx512_cmpestr_load): Ditto.
            (c86_4g_m7_avx512_vdbpsadbw_zmm): Ditto.
            (c86_4g_m7_avx512_vdbpsadbw_zmem): Ditto.
            (c86_4g_m7_avx_ssecomi_comi): Ditto.
            (c86_4g_m7_avx_ssecomi_comi_load): Ditto.
            (c86_4g_m7_avx512_expand): Ditto.
            (c86_4g_m7_avx512_expand_load): Ditto.
            (c86_4g_m7_avx512_expand_z): Ditto.
            (c86_4g_m7_avx512_expand_z_load): Ditto.
            (c86_4g_m7_sse_movnt_xy): Rename to c86_4g_m7_sse_movnt.
            (c86_4g_m7_avx512_sseadd_xy): Adjust reservation units.
            (c86_4g_m7_avx512_sseadd_xy_load): Ditto.
            (c86_4g_m7_sse_sseiadd_hplus): Adjust reservation units and unit
            occupancy.
            (c86_4g_m7_sse_sseiadd_hplus_load): Ditto.
            (c86_4g_m7_avx512_ssemul): Adjust reservation units.
            (c86_4g_m7_avx512_ssemul_load): Ditto.
            (c86_4g_m7_avx512_ssediv): Remove.
            (c86_4g_m7_avx512_ssediv_mem): Remove.
            (c86_4g_m7_avx512_ssediv_x): New.
            (c86_4g_m7_avx512_ssediv_xmem): New.
            (c86_4g_m7_avx512_ssediv_y): New.
            (c86_4g_m7_avx512_ssediv_ymem): New.
            (c86_4g_m7_avx512_ssediv_z): Adjust reservation units.
            (c86_4g_m7_avx512_ssediv_zmem): Ditto.
            (c86_4g_m7_avx512_ssecmp_z): Add reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_ssecmp_z_load): Ditto.
            (c86_4g_m7_avx512_ssecmp_vp_z): New reservation.
            (c86_4g_m7_avx512_ssecmp_vp_z_load): Ditto.
            (c86_4g_m7_avx512_ssecmp_test_z): Remove reservation.
            (c86_4g_m7_avx512_ssecmp_test_z_load): Ditto.
            (c86_4g_m7_avx512_muladd): Broaden matching condition.
            (c86_4g_m7_avx512_muladd_load): Ditto.
            (c86_4g_m7_fma_muladd): Remove reservation.
            (c86_4g_m7_fma_muladd_load): Ditto.
            (c86_4g_m7_avx512_sse_conflict_x): Add reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_sse_conflict_x_load): Ditto.
            (c86_4g_m7_avx512_sse_conflict_y): Ditto.
            (c86_4g_m7_avx512_sse_conflict_y_load): Ditto.
            (c86_4g_m7_avx512_sse_conflict_z): Ditto.
            (c86_4g_m7_avx512_sse_conflict_z_load): Ditto.
            (c86_4g_m7_avx512_sse_class_z): Add reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_sse_class_z_load): Ditto.
            (c86_4g_m7_avx512_sse_sqrt): Remove.
            (c86_4g_m7_avx512_sse_sqrt_load): Remove.
            (c86_4g_m7_avx512_sse_sqrt_sf_x): New.
            (c86_4g_m7_avx512_sse_sqrt_sf_xload): New.
            (c86_4g_m7_avx512_sse_sqrt_sf_y): New.
            (c86_4g_m7_avx512_sse_sqrt_sf_yload): New.
            (c86_4g_m7_avx512_sse_sqrt_sf_z): New.
            (c86_4g_m7_avx512_sse_sqrt_sf_zload): New.
            (c86_4g_m7_avx512_sse_sqrt_df_x): New.
            (c86_4g_m7_avx512_sse_sqrt_df_xload): New.
            (c86_4g_m7_avx512_sse_sqrt_df_y): New.
            (c86_4g_m7_avx512_sse_sqrt_df_yload): New.
            (c86_4g_m7_avx512_sse_sqrt_df_z): New.
            (c86_4g_m7_avx512_sse_sqrt_df_zload): New.
            (c86_4g_m7_avx512_msklog_vector): Add reservation units and unit
            occupancy.
            (c86_4g_m7_avx512_mskmov_z_k): Ditto.
            (c86_4g_m7_avx512_mskmov_k_reg): Ditto.
            * config/i386/c86-4g.md (c86_4g_fp): Remove automaton.
            (c86_4g_fp024): New automaton.
            (c86_4g_fp1): Ditto.
            (c86-4g-fp0): Move to c86_4g_fp024 automaton.
            (c86-4g-fp1): Move to c86_4g_fp1 automaton.
            (c86-4g-fp2): Move to c86_4g_fp024 automaton.
            (c86-4g-fp3): Ditto.
            (c86-4g-fp1fdivx4): New reservation.
            (c86_4g_fp_sqrt): Adjust reservation.
            (c86_4g_sse_sqrt_sf): Ditto.
            (c86_4g_sse_sqrt_sf_mem): Ditto.
            (c86_4g_sse_sqrt_df): Ditto.
            (c86_4g_sse_sqrt_df_mem): Ditto.
            (c86_4g_fp_op_div): Ditto.
            (c86_4g_fp_op_div_load): Ditto.
            (c86_4g_fp_op_idiv_load): Adjust reservation latency.
            (c86_4g_ssediv_ss_ps): Adjust reservation.
            (c86_4g_ssediv_ss_ps_load): Ditto.
            (c86_4g_ssediv_sd_pd): Ditto.
            (c86_4g_ssediv_sd_pd_load): Ditto.
            (c86_4g_ssediv_avx256_ps): Ditto.
            (c86_4g_ssediv_avx256_ps_load): Ditto.
            (c86_4g_ssediv_avx256_pd): Ditto.
            (c86_4g_ssediv_avx256_pd_load): Ditto.
    
    Co-authored-by: Xin Liu <[email protected]>
    Signed-off-by: Xin Liu <[email protected]>
    Signed-off-by: Kewen Lin <[email protected]>

Diff:
---
 gcc/config/i386/c86-4g-m7.md | 412 +++++++++++++++++++++++++------------------
 gcc/config/i386/c86-4g.md    |  61 ++++---
 2 files changed, 270 insertions(+), 203 deletions(-)

diff --git a/gcc/config/i386/c86-4g-m7.md b/gcc/config/i386/c86-4g-m7.md
index 54a850db3be8..96bd322a2883 100644
--- a/gcc/config/i386/c86-4g-m7.md
+++ b/gcc/config/i386/c86-4g-m7.md
@@ -20,8 +20,10 @@
 ;; HYGON c86-4g-m7 Scheduling
 ;; Modeling automatons for decoders, integer execution pipes,
 ;; AGU pipes, branch, floating point execution, fp store units,
-;; integer and floating point dividers.
-(define_automaton "c86_4g_m7, c86_4g_m7_ieu, c86_4g_m7_agu, c86_4g_m7_fpu, c86_4g_m7_idiv, c86_4g_m7_fdiv")
+;; integer and floating point dividers.  Split fpu1 and fpu3
+;; into their own automata to keep these units independent
+;; without increasing the main c86_4g_m7_fpu state space.
+(define_automaton "c86_4g_m7, c86_4g_m7_ieu, c86_4g_m7_agu, c86_4g_m7_fpu02, c86_4g_m7_fpu13, c86_4g_m7_idiv, c86_4g_m7_fdiv")
 
 ;; Decoders unit has 4 decoders and all of them can decode fast path
 ;; and vector type instructions.
@@ -30,10 +32,6 @@
 (define_cpu_unit "c86-4g-m7-decode2" "c86_4g_m7")
 (define_cpu_unit "c86-4g-m7-decode3" "c86_4g_m7")
 
-;; Two separated dividers for int and fp.
-(define_cpu_unit "c86-4g-m7-idiv" "c86_4g_m7_idiv")
-(define_cpu_unit "c86-4g-m7-fdiv" "c86_4g_m7_fdiv")
-
 ;; Currently blocking all decoders for vector path instructions as
 ;; they are dispatched separetely as microcode sequence.
 (define_reservation "c86-4g-m7-vector" "c86-4g-m7-decode0+c86-4g-m7-decode1+c86-4g-m7-decode2+c86-4g-m7-decode3")
@@ -50,6 +48,9 @@
 (define_cpu_unit "c86-4g-m7-ieu2" "c86_4g_m7_ieu")
 (define_cpu_unit "c86-4g-m7-ieu3" "c86_4g_m7_ieu")
 
+;; One separated integer divider.
+(define_cpu_unit "c86-4g-m7-idiv" "c86_4g_m7_idiv")
+
 ;; c86-4g-m7 has an additional branch unit.
 (define_cpu_unit "c86-4g-m7-bru0" "c86_4g_m7_ieu")
 (define_reservation "c86-4g-m7-ieu" "c86-4g-m7-ieu0|c86-4g-m7-ieu1|c86-4g-m7-ieu2|c86-4g-m7-ieu3")
@@ -67,23 +68,48 @@
 ;; vectorpath (microcoded) instructions are single issue instructions.
 ;; So, they occupy all the integer units.
 (define_reservation "c86-4g-m7-ivector" "c86-4g-m7-ieu0+c86-4g-m7-ieu1
-                                     +c86-4g-m7-ieu2+c86-4g-m7-ieu3+c86-4g-m7-bru0
-                                     +c86-4g-m7-agu0+c86-4g-m7-agu1+c86-4g-m7-agu2")
+                                        +c86-4g-m7-ieu2+c86-4g-m7-ieu3+c86-4g-m7-bru0
+                                        +c86-4g-m7-agu0+c86-4g-m7-agu1+c86-4g-m7-agu2")
 
 ;; Floating point unit 4 FP pipes.
-(define_cpu_unit "c86-4g-m7-fpu0" "c86_4g_m7_fpu")
-(define_cpu_unit "c86-4g-m7-fpu1" "c86_4g_m7_fpu")
-(define_cpu_unit "c86-4g-m7-fpu2" "c86_4g_m7_fpu")
-(define_cpu_unit "c86-4g-m7-fpu3" "c86_4g_m7_fpu")
+(define_cpu_unit "c86-4g-m7-fpu0" "c86_4g_m7_fpu02")
+(define_cpu_unit "c86-4g-m7-fpu1" "c86_4g_m7_fpu13")
+(define_cpu_unit "c86-4g-m7-fpu2" "c86_4g_m7_fpu02")
+(define_cpu_unit "c86-4g-m7-fpu3" "c86_4g_m7_fpu13")
+
 (define_reservation "c86-4g-m7-fpu" "c86-4g-m7-fpu0|c86-4g-m7-fpu1|c86-4g-m7-fpu2|c86-4g-m7-fpu3")
-(define_reservation "c86-4g-m7-fpu_0_2" "c86-4g-m7-fpu0|c86-4g-m7-fpu2")
-(define_reservation "c86-4g-m7-fpu_1_3" "c86-4g-m7-fpu1|c86-4g-m7-fpu3")
 (define_reservation "c86-4g-m7-fpu_0_1" "c86-4g-m7-fpu0|c86-4g-m7-fpu1")
+(define_reservation "c86-4g-m7-fpu_0_2" "c86-4g-m7-fpu0|c86-4g-m7-fpu2")
 (define_reservation "c86-4g-m7-fpu_0_2x2" "c86-4g-m7-fpu0*2|c86-4g-m7-fpu2*2")
 (define_reservation "c86-4g-m7-fpu_0_2x4" "c86-4g-m7-fpu0*4|c86-4g-m7-fpu2*4")
+(define_reservation "c86-4g-m7-fpu_0_3" "c86-4g-m7-fpu0|c86-4g-m7-fpu3")
+(define_reservation "c86-4g-m7-fpu_1_3" "c86-4g-m7-fpu1|c86-4g-m7-fpu3")
+(define_reservation "c86-4g-m7-fpu_1_3x2" "c86-4g-m7-fpu1*2|c86-4g-m7-fpu3*2")
+(define_reservation "c86-4g-m7-fpu_1_3x3" "c86-4g-m7-fpu1*3|c86-4g-m7-fpu3*3")
+(define_reservation "c86-4g-m7-fpu_1_3x6" "c86-4g-m7-fpu1*6|c86-4g-m7-fpu3*6")
+(define_reservation "c86-4g-m7-fpux2" "c86-4g-m7-fpu0*2|c86-4g-m7-fpu1*2|c86-4g-m7-fpu2*2|c86-4g-m7-fpu3*2")
+(define_reservation "c86-4g-m7-fpux4" "c86-4g-m7-fpu0*4|c86-4g-m7-fpu1*4|c86-4g-m7-fpu2*4|c86-4g-m7-fpu3*4")
+(define_reservation "c86-4g-m7-fpux8" "c86-4g-m7-fpu0*8|c86-4g-m7-fpu1*8|c86-4g-m7-fpu2*8|c86-4g-m7-fpu3*8")
+(define_reservation "c86-4g-m7-fpux6" "c86-4g-m7-fpu0*6|c86-4g-m7-fpu1*6|c86-4g-m7-fpu2*6|c86-4g-m7-fpu3*6")
+(define_reservation "c86-4g-m7-fpux16" "c86-4g-m7-fpu0*16|c86-4g-m7-fpu1*16|c86-4g-m7-fpu2*16|c86-4g-m7-fpu3*16")
 (define_reservation "c86-4g-m7-fvector" "c86-4g-m7-fpu0+c86-4g-m7-fpu1
-                                     +c86-4g-m7-fpu2+c86-4g-m7-fpu3
-                                     +c86-4g-m7-agu0+c86-4g-m7-agu1+c86-4g-m7-agu2")
+                                        +c86-4g-m7-fpu2+c86-4g-m7-fpu3
+                                        +c86-4g-m7-agu0+c86-4g-m7-agu1+c86-4g-m7-agu2")
+
+;; Two FP dividers.
+(define_cpu_unit "c86-4g-m7-fdiv1" "c86_4g_m7_fdiv")
+(define_cpu_unit "c86-4g-m7-fdiv3" "c86_4g_m7_fdiv")
+
+(define_reservation "c86-4g-m7-fp1fdiv1x4" "(c86-4g-m7-fpu1+c86-4g-m7-fdiv1)*4")
+(define_reservation "c86-4g-m7-fp3fdiv3x4" "(c86-4g-m7-fpu3+c86-4g-m7-fdiv3)*4")
+(define_reservation "c86-4g-m7-fdiv13" "(c86-4g-m7-fdiv1+c86-4g-m7-fdiv3)")
+(define_reservation "c86-4g-m7-fp13div13" "(c86-4g-m7-fpu1+c86-4g-m7-fpu3+c86-4g-m7-fdiv1+c86-4g-m7-fdiv3)")
+(define_reservation "c86-4g-m7-fp13div13x4" "c86-4g-m7-fp13div13*4")
+(define_reservation "c86-4g-m7-fp1div1_fp3div3_x4x8" "(c86-4g-m7-fp1fdiv1x4,c86-4g-m7-fdiv1*8)|(c86-4g-m7-fp3fdiv3x4,c86-4g-m7-fdiv3*8)")
+(define_reservation "c86-4g-m7-fp1div1_fp3div3_x4x9" "(c86-4g-m7-fp1fdiv1x4,c86-4g-m7-fdiv1*9)|(c86-4g-m7-fp3fdiv3x4,c86-4g-m7-fdiv3*9)")
+(define_reservation "c86-4g-m7-fp1div1_fp3div3_x4x11" "(c86-4g-m7-fp1fdiv1x4,c86-4g-m7-fdiv1*11)|(c86-4g-m7-fp3fdiv3x4,c86-4g-m7-fdiv3*11)")
+(define_reservation "c86-4g-m7-fp1div1_fp3div3_x4x15" "(c86-4g-m7-fp1fdiv1x4,c86-4g-m7-fdiv1*15)|(c86-4g-m7-fp3fdiv3x4,c86-4g-m7-fdiv3*15)")
+(define_reservation "c86-4g-m7-fp1div1_fp3div3_x4x18" "(c86-4g-m7-fp1fdiv1x4,c86-4g-m7-fdiv1*18)|(c86-4g-m7-fp3fdiv3x4,c86-4g-m7-fdiv3*18)")
 
 ;; IMOV/IMOVX
 (define_insn_reservation "c86_4g_m7_imov_xchg" 1
@@ -168,61 +194,33 @@
                         "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-ieu1")
 
 ;; IDIV
-(define_insn_reservation "c86_4g_m7_idiv_DI" 41
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "DI")
-                                       (eq_attr "memory" "none"))))
-                        "c86-4g-m7-double,c86-4g-m7-ieu3,c86-4g-m7-idiv*41")
-
-(define_insn_reservation "c86_4g_m7_idiv_SI" 25
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "SI")
-                                       (eq_attr "memory" "none"))))
-                        "c86-4g-m7-double,c86-4g-m7-ieu3,c86-4g-m7-idiv*25")
-
-(define_insn_reservation "c86_4g_m7_idiv_HI" 17
+(define_insn_reservation "c86_4g_m7_idiv" 7
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "HI")
+                                  (and (eq_attr "mode" "!QI")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-double,c86-4g-m7-ieu3,c86-4g-m7-idiv*17")
+                        "c86-4g-m7-double,c86-4g-m7-ieu3,c86-4g-m7-idiv*7")
 
-(define_insn_reservation "c86_4g_m7_idiv_QI" 15
+(define_insn_reservation "c86_4g_m7_idiv_QI" 6
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "idiv")
                                   (and (eq_attr "mode" "QI")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-ieu3,c86-4g-m7-idiv*15")
-
-(define_insn_reservation "c86_4g_m7_idiv_DI_load" 45
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "DI")
-                                       (eq_attr "memory" "load"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*41")
-
-(define_insn_reservation "c86_4g_m7_idiv_SI_load" 29
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "SI")
-                                       (eq_attr "memory" "load"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*25")
+                        "c86-4g-m7-double,c86-4g-m7-ieu3,c86-4g-m7-idiv*6")
 
-(define_insn_reservation "c86_4g_m7_idiv_HI_load" 21
+(define_insn_reservation "c86_4g_m7_idiv_load" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "idiv")
-                                  (and (eq_attr "mode" "HI")
+                                  (and (eq_attr "mode" "!QI")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*17")
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*7")
 
-(define_insn_reservation "c86_4g_m7_idiv_QI_load" 19
+(define_insn_reservation "c86_4g_m7_idiv_QI_load" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "idiv")
                                   (and (eq_attr "mode" "QI")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*15")
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-ieu3,c86-4g-m7-idiv*6")
 
 ;; Integer/genaral Instructions
 (define_insn_reservation "c86_4g_m7_insn" 1
@@ -385,14 +383,14 @@
                              (and (eq_attr "type" "sseins")
                                   (and (eq_attr "memory" "none")
                                        (eq_attr "length_immediate" "2"))))
-                        "c86-4g-m7-double,c86-4g-m7-fpu0|c86-4g-m7-fpu3,c86-4g-m7-fpu1")
+                        "c86-4g-m7-double,c86-4g-m7-fpu_0_3,c86-4g-m7-fpu1")
 
 (define_insn_reservation "c86_4g_m7_sse_insert" 3
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "sseins")
                                   (and (eq_attr "memory" "none")
                                        (eq_attr "length_immediate" "!2"))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu1")
+                        "c86-4g-m7-direct,c86-4g-m7-fpu1*2")
 
 ;; FCMOV
 (define_insn_reservation "c86_4g_m7_fp_cmov" 4
@@ -444,7 +442,7 @@
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "fpspc")
                                   (eq_attr "c86_attr" "sqrt")))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu1,c86-4g-m7-fdiv*22")
+                        "c86-4g-m7-direct,c86-4g-m7-fp1div1_fp3div3_x4x18")
 
 ;; FPSPC
 (define_insn_reservation "c86_4g_m7_fp_spc_direct" 5
@@ -487,21 +485,21 @@
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "fdiv")
                                   (eq_attr "memory" "none")))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu1,c86-4g-m7-fdiv*15")
+                        "c86-4g-m7-direct,c86-4g-m7-fp1div1_fp3div3_x4x11")
 
 (define_insn_reservation "c86_4g_m7_fp_div_load" 22
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "fdiv")
                                   (and (eq_attr "fp_int_src" "false")
                                        (eq_attr "memory" "!none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu1,c86-4g-m7-fdiv*15")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp1div1_fp3div3_x4x11")
 
 (define_insn_reservation "c86_4g_m7_fp_idiv_load" 26
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "fdiv")
                                   (and (eq_attr "fp_int_src" "true")
                                        (eq_attr "memory" "!none"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu1,c86-4g-m7-fdiv*15")
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu1*4,c86-4g-m7-fp1div1_fp3div3_x4x11")
 
 (define_insn_reservation "c86_4g_m7_fp_fsgn" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -634,7 +632,7 @@
                                   (and (eq_attr "c86_attr" "insr")
                                    (and (eq_attr "prefix" "orig")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-double,c86-4g-m7-ieu2,c86-4g-m7-fpu_0_1")
+                        "c86-4g-m7-double,c86-4g-m7-ieu2,c86-4g-m7-fpu")
 
 (define_insn_reservation "c86_4g_m7_sse_pinsr_reg_load" 3
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -642,7 +640,7 @@
                                   (and (eq_attr "c86_attr" "insr")
                                    (and (eq_attr "prefix" "orig")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_1")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu")
 
 (define_insn_reservation "c86_4g_m7_avx_vpinsr_reg" 2
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -650,7 +648,7 @@
                                   (and (eq_attr "c86_attr" "insr")
                                     (and (eq_attr "prefix" "!orig")
                                          (eq_attr "memory" "none")))))
-                        "c86-4g-m7-double,c86-4g-m7-fpu2*2")
+                        "c86-4g-m7-double,c86-4g-m7-fpu_1_3x2")
 
 (define_insn_reservation "c86_4g_m7_avx_vpinsr_reg_load" 8
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -658,7 +656,7 @@
                                   (and (eq_attr "c86_attr" "insr")
                                     (and (eq_attr "prefix" "!orig")
                                          (eq_attr "memory" "load")))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu1|c86-4g-m7-fpu2|c86-4g-m7-fpu3")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_1_3")
 
 ;; PERM
 (define_insn_reservation "c86_4g_m7_avx512_perm_xmm" 3
@@ -668,8 +666,7 @@
                                                  (eq_attr "mode" "V4SF,V2DF,TI"))
                                             (and (eq_attr "c86_attr" "perm")
                                                  (eq_attr "mode" "V8SF,V4DF,TI,OI")))
-                                   (and (eq_attr "prefix" "evex")
-                                        (eq_attr "memory" "none")))))
+                                       (eq_attr "memory" "none"))))
                         "c86-4g-m7-direct,c86-4g-m7-fpu_0_2x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_perm_xmm_opload" 10
@@ -679,8 +676,7 @@
                                                  (eq_attr "mode" "V4SF,V2DF,TI"))
                                             (and (eq_attr "c86_attr" "perm")
                                                  (eq_attr "mode" "V8SF,V4DF,TI,OI")))
-                                   (and (eq_attr "prefix" "evex")
-                                        (eq_attr "memory" "load")))))
+                                       (eq_attr "memory" "load"))))
                         "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_2x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_permi2_ymm" 4
@@ -689,7 +685,7 @@
                                   (and (eq_attr "c86_attr" "perm2")
                                    (and (eq_attr "mode" "V8SF,V4DF,OI")
                                          (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpux4")
 
 (define_insn_reservation "c86_4g_m7_avx512_permi2_zmm" 16
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -697,7 +693,7 @@
                                   (and (eq_attr "c86_attr" "perm2")
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpux16")
 
 (define_insn_reservation "c86_4g_m7_avx512_permi2_ymm_load" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -705,7 +701,7 @@
                                   (and (eq_attr "c86_attr" "perm2")
                                    (and (eq_attr "mode" "V8SF,V4DF,OI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpux4")
 
 (define_insn_reservation "c86_4g_m7_avx512_permi2_zmm_load" 23
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -713,7 +709,7 @@
                                   (and (eq_attr "c86_attr" "perm2")
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpux16")
 
 (define_insn_reservation "c86_4g_m7_avx512_perm_zmm_imm" 4
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -722,7 +718,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (match_operand 2 "immediate_operand")
                                          (eq_attr "memory" "none"))))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu_0_2x4")
+                        "c86-4g-m7-direct,c86-4g-m7-fpux4")
 
 (define_insn_reservation "c86_4g_m7_avx512_perm_zmm_imm_load" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -731,7 +727,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (match_operand 2 "immediate_operand")
                                          (eq_attr "memory" "load"))))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_2x4")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpux4")
 
 (define_insn_reservation "c86_4g_m7_avx512_perm_zmm_noimm" 8
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -740,7 +736,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (match_operand 2 "nonimmediate_operand")
                                          (eq_attr "memory" "none"))))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpux8")
 
 (define_insn_reservation "c86_4g_m7_sse_perm_zmm_noimm_load" 15
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -749,23 +745,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (match_operand 2 "nonimmediate_operand")
                                         (eq_attr "memory" "load"))))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
-
-(define_insn_reservation "c86_4g_m7_avx_perm_ymm" 3
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "sselog")
-                                  (and (eq_attr "c86_attr" "perm")
-                                    (and (eq_attr "prefix" "!evex")
-                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
-
-(define_insn_reservation "c86_4g_m7_avx_perm_ymem" 10
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "sselog")
-                                  (and (eq_attr "c86_attr" "perm")
-                                    (and (eq_attr "prefix" "!evex")
-                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpux8")
 
 ;; VINSERT
 (define_insn_reservation "c86_4g_m7_avx512_insertx_ymm" 3
@@ -853,7 +833,7 @@
                                   (and (eq_attr "c86_attr" "shufx")
                                     (and (eq_attr "mode" "V8DF,V16SF,XI")
                                          (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_0_2x4")
 
 (define_insn_reservation "c86_4g_m7_avx512_shuf_xymem" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -869,7 +849,7 @@
                                   (and (eq_attr "c86_attr" "shufx")
                                     (and (eq_attr "mode" "V8DF,V16SF,XI")
                                          (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_0_2x4")
 
 ;; SSELOGIC
 (define_insn_reservation "c86_4g_m7_sselogic_xymm" 1
@@ -892,14 +872,14 @@
                              (and (eq_attr "type" "sselog")
                                   (and (eq_attr "c86_attr" "cmpestr")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpux6")
 
 (define_insn_reservation "c86_4g_m7_avx512_cmpestr_load" 13
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "sselog")
                                   (and (eq_attr "c86_attr" "cmpestr")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpux6")
 
 ;; SSELOG
 (define_insn_reservation "c86_4g_m7_avx512_log" 1
@@ -940,7 +920,7 @@
                                   (and (eq_attr "c86_attr" "sadbw")
                                    (and (eq_attr "mode" "XI")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_0_2,c86-4g-m7-fpu_1_3x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_vdbpsadbw_zmem" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -948,7 +928,7 @@
                                   (and (eq_attr "c86_attr" "sadbw")
                                    (and (eq_attr "mode" "XI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_0_2,c86-4g-m7-fpu_1_3x2")
 
 ;; ABS
 (define_insn_reservation "c86_4g_m7_avx512_abs" 1
@@ -1052,14 +1032,14 @@
                              (and (eq_attr "type" "ssecomi")
                                   (and (eq_attr "prefix_extra" "0")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-double,c86-4g-m7-fpu2|c86-4g-m7-fpu3")
+                        "c86-4g-m7-double,c86-4g-m7-fpu")
 
 (define_insn_reservation "c86_4g_m7_avx_ssecomi_comi_load" 8
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssecomi")
                                   (and (eq_attr "prefix_extra" "0")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu2|c86-4g-m7-fpu3")
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu")
 
 (define_insn_reservation "c86_4g_m7_avx_ssecomi_test" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1201,7 +1181,7 @@
                                   (and (eq_attr "c86_attr" "expand,compress")
                                    (and (not (eq_attr "mode" "XI,V16SF,V8DF"))
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu3*2,c86-4g-m7-fpu1*2|c86-4g-m7-fpu3*2")
+                        "c86-4g-m7-direct,c86-4g-m7-fpu3,c86-4g-m7-fpu_0_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_expand_load" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1209,7 +1189,7 @@
                                   (and (eq_attr "c86_attr" "expand,compress")
                                    (and (not (eq_attr "mode" "XI,V16SF,V8DF"))
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu3*2,c86-4g-m7-fpu1*2|c86-4g-m7-fpu3*2")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu3,c86-4g-m7-fpu_0_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_expand_z" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1217,7 +1197,7 @@
                                   (and (eq_attr "c86_attr" "expand,compress")
                                    (and (eq_attr "mode" "XI,V16SF,V8DF")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu3,c86-4g-m7-fpu_0_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_expand_z_load" 17
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1225,7 +1205,7 @@
                                   (and (eq_attr "c86_attr" "expand,compress")
                                    (and (eq_attr "mode" "XI,V16SF,V8DF")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu3,c86-4g-m7-fpu_0_3")
 
 ;; MOVNT
 (define_insn_reservation "c86_4g_m7_avx512_movnt_load" 8
@@ -1252,7 +1232,7 @@
                                         (eq_attr "memory" "!none")))))
                         "c86-4g-m7-direct,c86-4g-m7-store,c86-4g-m7-fpu1")
 
-(define_insn_reservation "c86_4g_m7_sse_movnt_xy" 4
+(define_insn_reservation "c86_4g_m7_sse_movnt" 4
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssemov")
                                   (and (eq_attr "c86_attr" "movnt")
@@ -1377,14 +1357,14 @@
                              (and (eq_attr "type" "sseadd")
                                   (and (eq_attr "c86_attr" "other")
                                          (eq_attr "memory" "none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu3")
+                        "c86-4g-m7-direct,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_sseadd_xy_load" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "sseadd")
                                   (and (eq_attr "c86_attr" "other")
                                         (eq_attr "memory" "load"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu3")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_1_3")
 
 ;; HADD/HSUB
 (define_insn_reservation "c86_4g_m7_avx_sseadd_hplus" 7
@@ -1507,7 +1487,7 @@
                                   (and (eq_attr "c86_attr" "hplus")
                                    (and (eq_attr "prefix" "orig")
                                     (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector,c86-4g-m7-fpu0*2")
+                        "c86-4g-m7-vector,c86-4g-m7-fpux2")
 
 (define_insn_reservation "c86_4g_m7_sse_sseiadd_hplus_load" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1515,49 +1495,63 @@
                                   (and (eq_attr "c86_attr" "hplus")
                                    (and (eq_attr "prefix" "orig")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu0*2")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpux2")
 
 ;; SSEMUL
 (define_insn_reservation "c86_4g_m7_avx512_ssemul" 3
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssemul")
                                   (eq_attr "memory" "none")))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu0")
+                        "c86-4g-m7-direct,c86-4g-m7-fpu_0_2")
 
 (define_insn_reservation "c86_4g_m7_avx512_ssemul_load" 10
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssemul")
                                   (eq_attr "memory" "load")))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu0")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_2")
 
 ;; SSEDIV
-(define_insn_reservation "c86_4g_m7_avx512_ssediv" 13
+(define_insn_reservation "c86_4g_m7_avx512_ssediv_x" 13
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "ssediv")
+                                  (and (eq_attr "mode" "SF,DF,V4SF,V2DF")
+                                       (eq_attr "memory" "none"))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp1div1_fp3div3_x4x8")
+
+(define_insn_reservation "c86_4g_m7_avx512_ssediv_xmem" 20
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "ssediv")
+                                  (and (eq_attr "mode" "SF,DF,V4SF,V2DF")
+                                       (eq_attr "memory" "load"))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp1div1_fp3div3_x4x8")
+
+(define_insn_reservation "c86_4g_m7_avx512_ssediv_y" 13
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssediv")
-                                  (and (not (eq_attr "mode" "V16SF,V8DF"))
+                                  (and (eq_attr "mode" "V8SF,V4DF")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu3,c86-4g-m7-fdiv*13")
+                        "c86-4g-m7-direct,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*8")
 
-(define_insn_reservation "c86_4g_m7_avx512_ssediv_mem" 20
+(define_insn_reservation "c86_4g_m7_avx512_ssediv_ymem" 20
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssediv")
-                                  (and (not (eq_attr "mode" "V16SF,V8DF"))
+                                  (and (eq_attr "mode" "V8SF,V4DF")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu3,c86-4g-m7-fdiv*13")
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*8")
 
 (define_insn_reservation "c86_4g_m7_avx512_ssediv_z" 24
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssediv")
                                   (and (eq_attr "mode" "V16SF,V8DF")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-m7-double,c86-4g-m7-fpu3,c86-4g-m7-fdiv*24")
+                        "c86-4g-m7-double,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*20")
 
 (define_insn_reservation "c86_4g_m7_avx512_ssediv_zmem" 31
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssediv")
                                   (and (eq_attr "mode" "V16SF,V8DF")
                                         (eq_attr "memory" "load"))))
-                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu3,c86-4g-m7-fdiv*24")
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*20")
 
 ;; SSECMP
 (define_insn_reservation "c86_4g_m7_avx512_ssecmp" 5
@@ -1582,7 +1576,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (eq_attr "c86_attr" "other")
                                          (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_0_2,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_ssecmp_z_load" 12
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1590,7 +1584,7 @@
                                    (and (eq_attr "mode" "V16SF,V8DF,XI")
                                     (and (eq_attr "c86_attr" "other")
                                          (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_0_2,c86-4g-m7-fpu_1_3x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_ssecmp_vp" 5
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1610,6 +1604,24 @@
                                          (eq_attr "memory" "load"))))))
                         "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu,c86-4g-m7-fpu_1_3")
 
+(define_insn_reservation "c86_4g_m7_avx512_ssecmp_vp_z" 5
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "ssecmp")
+                                  (and (eq_attr "prefix" "evex")
+                                   (and (eq_attr "mode" "XI")
+                                    (and (eq_attr "c86_attr" "other,ptest")
+                                         (eq_attr "memory" "none"))))))
+                        "c86-4g-m7-double,c86-4g-m7-fpu,c86-4g-m7-fpu_1_3")
+
+(define_insn_reservation "c86_4g_m7_avx512_ssecmp_vp_z_load" 12
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "ssecmp")
+                                  (and (eq_attr "prefix" "evex")
+                                   (and (eq_attr "mode" "XI")
+                                    (and (eq_attr "c86_attr" "other,ptest")
+                                         (eq_attr "memory" "load"))))))
+                        "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu,c86-4g-m7-fpu_1_3x2")
+
 (define_insn_reservation "c86_4g_m7_avx_ssecmp_vp" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssecmp")
@@ -1641,22 +1653,6 @@
                                          (eq_attr "memory" "load")))))
                         "c86-4g-m7-double,c86-4g-m7-load,c86-4g-m7-fpu1,c86-4g-m7-fpu_1_3")
 
-(define_insn_reservation "c86_4g_m7_avx512_ssecmp_test_z" 4
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "ssecmp")
-                                   (and (eq_attr "mode" "XI")
-                                    (and (eq_attr "c86_attr" "ptest")
-                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
-
-(define_insn_reservation "c86_4g_m7_avx512_ssecmp_test_z_load" 11
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "ssecmp")
-                                   (and (eq_attr "mode" "XI")
-                                    (and (eq_attr "c86_attr" "ptest")
-                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
-
 ;; SSECVT
 (define_insn_reservation "c86_4g_m7_avx512_ssecvt_xy" 4
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1768,17 +1764,14 @@
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssemuladd")
                                   (and (eq_attr "c86_attr" "other")
-                                   (and (not (eq_attr "isa" "fma,fma4"))
-                                        (eq_attr "mode" "V32HF,V16SF,V8DF,XI")
-                                         (eq_attr "memory" "none")))))
+                                       (eq_attr "memory" "none"))))
                         "c86-4g-m7-direct,c86-4g-m7-fpu_0_2")
 
 (define_insn_reservation "c86_4g_m7_avx512_muladd_load" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "ssemuladd")
                                   (and (eq_attr "c86_attr" "other")
-                                   (and (not (eq_attr "isa" "fma,fma4"))
-                                        (eq_attr "memory" "load")))))
+                                       (eq_attr "memory" "load"))))
                         "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_2")
 
 (define_insn_reservation "c86_4g_m7_avx512_muladd_madd" 4
@@ -1797,20 +1790,6 @@
                                         (eq_attr "memory" "load")))))
                         "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_2")
 
-(define_insn_reservation "c86_4g_m7_fma_muladd" 4
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "ssemuladd")
-                                  (and (eq_attr "isa" "fma,fma4")
-                                       (eq_attr "memory" "none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu_0_1")
-
-(define_insn_reservation "c86_4g_m7_fma_muladd_load" 11
-                        (and (eq_attr "cpu" "c86_4g_m7")
-                             (and (eq_attr "type" "ssemuladd")
-                                  (and (eq_attr "isa" "fma,fma4")
-                                       (eq_attr "memory" "load"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_1")
-
 ;; SSE
 (define_insn_reservation "c86_4g_m7_avx512_sse_range" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1838,7 +1817,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "TI")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_1_3x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_conflict_x_load" 9
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1846,7 +1825,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "TI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_1_3x2")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_conflict_y" 5
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1854,7 +1833,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "OI")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_1_3x3")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_conflict_y_load" 12
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1862,7 +1841,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "OI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_1_3x3")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_conflict_z" 8
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1870,7 +1849,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "XI")
                                         (eq_attr "memory" "none")))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_1_3x6")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_conflict_z_load" 15
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1878,7 +1857,7 @@
                                   (and (eq_attr "c86_decode" "vector")
                                    (and (eq_attr "mode" "XI")
                                         (eq_attr "memory" "load")))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_1_3x6")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_class" 4
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1905,7 +1884,7 @@
                                     (and (eq_attr "length_immediate" "1")
                                      (and (eq_attr "mode" "V32HF,V16SF,V8DF")
                                           (eq_attr "memory" "none"))))))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_1_3,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_sse_class_z_load" 11
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1914,7 +1893,7 @@
                                     (and (eq_attr "length_immediate" "1")
                                      (and (eq_attr "mode" "V32HF,V16SF,V8DF")
                                           (eq_attr "memory" "load"))))))
-                        "c86-4g-m7-vector,c86-4g-m7-load")
+                        "c86-4g-m7-vector,c86-4g-m7-load,c86-4g-m7-fpu_1_3,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx_sse" 5
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1932,19 +1911,102 @@
                                         (eq_attr "memory" "load")))))
                         "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu_0_1")
 
-(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt" 16
+;; SSE SQRT
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_x" 14
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "sse")
-                                  (and (eq_attr "c86_attr" "sqrt")
-                                       (eq_attr "memory" "none"))))
-                        "c86-4g-m7-direct,c86-4g-m7-fpu1|c86-4g-m7-fpu3,c86-4g-m7-fdiv*16")
+                                  (and (eq_attr "mode" "SF,V4SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp1div1_fp3div3_x4x9")
 
-(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_load" 23
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_xload" 21
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "sse")
-                                  (and (eq_attr "c86_attr" "sqrt")
-                                       (eq_attr "memory" "load"))))
-                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fpu1|c86-4g-m7-fpu3,c86-4g-m7-fdiv*16")
+                                  (and (eq_attr "mode" "SF,V4SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp1div1_fp3div3_x4x9")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_y" 14
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V8SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*9")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_yload" 21
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V8SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*9")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_z" 26
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V16SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*22")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_sf_zload" 33
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V16SF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*22")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_x" 20
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "DF,V2DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp1div1_fp3div3_x4x15")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_xload" 27
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "DF,V2DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp1div1_fp3div3_x4x15")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_y" 20
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V4DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*15")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_yload" 27
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V4DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*15")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_z" 38
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V8DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "none")))))
+                        "c86-4g-m7-direct,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*34")
+
+(define_insn_reservation "c86_4g_m7_avx512_sse_sqrt_df_zload" 45
+                        (and (eq_attr "cpu" "c86_4g_m7")
+                             (and (eq_attr "type" "sse")
+                                  (and (eq_attr "mode" "V8DF")
+                                   (and (eq_attr "c86_attr" "sqrt")
+                                        (eq_attr "memory" "load")))))
+                        "c86-4g-m7-direct,c86-4g-m7-load,c86-4g-m7-fp13div13x4,c86-4g-m7-fdiv13*34")
 
 ;; MSKLOG/MSKMOV
 (define_insn_reservation "c86_4g_m7_avx512_msklog" 1
@@ -1957,7 +2019,7 @@
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "msklog")
                                   (eq_attr "c86_decode" "vector")))
-                        "c86-4g-m7-vector")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_mskmov_reg_k" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1977,7 +2039,7 @@
                         (and (eq_attr "cpu" "c86_4g_m7")
                              (and (eq_attr "type" "mskmov")
                                   (match_operand:V8DI 0 "register_operand" "v")))
-                        "c86-4g-m7-vector,c86-4g-m7-fpu3*2,c86-4g-m7-fpu1*2|c86-4g-m7-fpu3*2")
+                        "c86-4g-m7-vector,c86-4g-m7-fpu3,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_mskmov_k_k" 1
                         (and (eq_attr "cpu" "c86_4g_m7")
@@ -1991,7 +2053,7 @@
                              (and (eq_attr "type" "mskmov")
                                  (and (match_operand 0 "register_operand" "k")
                                       (match_operand 1 "register_operand" "r"))))
-                        "c86-4g-m7-double,c86-4g-m7-fpu1*2,c86-4g-m7-fpu1*2|c86-4g-m7-fpu3*2")
+                        "c86-4g-m7-double,c86-4g-m7-fpu1,c86-4g-m7-fpu_1_3")
 
 (define_insn_reservation "c86_4g_m7_avx512_mskmov_k_m" 8
                         (and (eq_attr "cpu" "c86_4g_m7")
diff --git a/gcc/config/i386/c86-4g.md b/gcc/config/i386/c86-4g.md
index 49a46a8aa19e..8b81fcaabb28 100644
--- a/gcc/config/i386/c86-4g.md
+++ b/gcc/config/i386/c86-4g.md
@@ -30,8 +30,10 @@
 ;; HYGON Scheduling
 ;; Modeling automatons for decoders, integer execution pipes,
 ;; AGU pipes, floating point execution units, integer and
-;; floating point dividers.
-(define_automaton "c86_4g, c86_4g_ieu, c86_4g_fp, c86_4g_agu, c86_4g_idiv, c86_4g_fdiv")
+;; floating point dividers.  Split fp1 into its own automaton
+;; to keep this unit independent without increasing the main
+;; c86_4g_fp state space.
+(define_automaton "c86_4g, c86_4g_ieu, c86_4g_fp024, c86_4g_fp1, c86_4g_agu, c86_4g_idiv, c86_4g_fdiv")
 
 ;; Decoders unit has 4 decoders and all of them can decode fast path
 ;; and vector type instructions.
@@ -40,10 +42,6 @@
 (define_cpu_unit "c86-4g-decode2" "c86_4g")
 (define_cpu_unit "c86-4g-decode3" "c86_4g")
 
-;; Two separated dividers for int and fp.
-(define_cpu_unit "c86-4g-idiv" "c86_4g_idiv")
-(define_cpu_unit "c86-4g-fdiv" "c86_4g_fdiv")
-
 ;; Currently blocking all decoders for vector path instructions as
 ;; they are dispatched separetely as microcode sequence.
 ;; Fix me: Need to revisit this.
@@ -55,7 +53,6 @@
 ;; Fix me: Need to revisit this later to simulate fast path double behavior.
 (define_reservation "c86-4g-double" "c86-4g-direct")
 
-
 ;; Integer unit 4 ALU pipes.
 (define_cpu_unit "c86-4g-ieu0" "c86_4g_ieu")
 (define_cpu_unit "c86-4g-ieu1" "c86_4g_ieu")
@@ -63,6 +60,9 @@
 (define_cpu_unit "c86-4g-ieu3" "c86_4g_ieu")
 (define_reservation "c86-4g-ieu" "c86-4g-ieu0|c86-4g-ieu1|c86-4g-ieu2|c86-4g-ieu3")
 
+;; One separated integer divider.
+(define_cpu_unit "c86-4g-idiv" "c86_4g_idiv")
+
 ;; 2 AGU pipes in c86_4g
 ;; According to CPU diagram last AGU unit is used only for stores.
 (define_cpu_unit "c86-4g-agu0" "c86_4g_agu")
@@ -81,10 +81,10 @@
                                      +c86-4g-agu0+c86-4g-agu1")
 
 ;; Floating point unit 4 FP pipes.
-(define_cpu_unit "c86-4g-fp0" "c86_4g_fp")
-(define_cpu_unit "c86-4g-fp1" "c86_4g_fp")
-(define_cpu_unit "c86-4g-fp2" "c86_4g_fp")
-(define_cpu_unit "c86-4g-fp3" "c86_4g_fp")
+(define_cpu_unit "c86-4g-fp0" "c86_4g_fp024")
+(define_cpu_unit "c86-4g-fp1" "c86_4g_fp1")
+(define_cpu_unit "c86-4g-fp2" "c86_4g_fp024")
+(define_cpu_unit "c86-4g-fp3" "c86_4g_fp024")
 
 (define_reservation "c86-4g-fpu" "c86-4g-fp0|c86-4g-fp1|c86-4g-fp2|c86-4g-fp3")
 
@@ -92,6 +92,11 @@
                                      +c86-4g-fp2+c86-4g-fp3
                                      +c86-4g-agu0+c86-4g-agu1")
 
+;; One separated FP divider.
+(define_cpu_unit "c86-4g-fdiv" "c86_4g_fdiv")
+
+(define_reservation "c86-4g-fp1fdivx4" "(c86-4g-fp1+c86-4g-fdiv)*4")
+
 ;; Call instruction
 (define_insn_reservation "c86_4g_call" 1
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
@@ -387,7 +392,7 @@
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "type" "fpspc")
                                   (eq_attr "c86_attr" "sqrt")))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*22")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*18")
 
 (define_insn_reservation "c86_4g_sse_sqrt_sf" 14
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
@@ -395,7 +400,7 @@
                                   (and (eq_attr "memory" "none,unknown")
                                        (and (eq_attr "c86_attr" "sqrt")
                                             (eq_attr "type" "sse")))))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*14")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*10")
 
 (define_insn_reservation "c86_4g_sse_sqrt_sf_mem" 21
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
@@ -403,7 +408,7 @@
                                   (and (eq_attr "memory" "load")
                                        (and (eq_attr "c86_attr" "sqrt")
                                             (eq_attr "type" "sse")))))
-                        "c86-4g-direct,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*14")
+                        "c86-4g-direct,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*10")
 
 (define_insn_reservation "c86_4g_sse_sqrt_df" 20
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
@@ -411,7 +416,7 @@
                                   (and (eq_attr "memory" "none,unknown")
                                        (and (eq_attr "c86_attr" "sqrt")
                                             (eq_attr "type" "sse")))))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*20")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*16")
 
 (define_insn_reservation "c86_4g_sse_sqrt_df_mem" 27
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
@@ -419,7 +424,7 @@
                                   (and (eq_attr "memory" "load")
                                        (and (eq_attr "c86_attr" "sqrt")
                                             (eq_attr "type" "sse")))))
-                        "c86-4g-direct,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*20")
+                        "c86-4g-direct,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*16")
 
 ;; RCP
 (define_insn_reservation "c86_4g_sse_rcp" 5
@@ -492,20 +497,20 @@
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "type" "fdiv")
                                   (eq_attr "memory" "none")))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*15")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*11")
 
 (define_insn_reservation "c86_4g_fp_op_div_load" 22
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "type" "fdiv")
                                   (eq_attr "memory" "load")))
-                        "c86-4g-direct,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*15")
+                        "c86-4g-direct,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*11")
 
-(define_insn_reservation "c86_4g_fp_op_idiv_load" 27
+(define_insn_reservation "c86_4g_fp_op_idiv_load" 26
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "type" "fdiv")
                                   (and (eq_attr "fp_int_src" "true")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-double,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*19")
+                        "c86-4g-double,c86-4g-load,c86-4g-fp1*4,c86-4g-fp1fdivx4,c86-4g-fdiv*11")
 
 ;; MMX, SSE, SSEn.n, AVX, AVX2 instructions
 (define_insn_reservation "c86_4g_fp_insn" 1
@@ -1024,28 +1029,28 @@
                                        (eq_attr "mode" "V4SF,SF"))
                              (and (eq_attr "type" "ssediv")
                                   (eq_attr "memory" "none")))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*10")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*6")
 
 (define_insn_reservation "c86_4g_ssediv_ss_ps_load" 17
                         (and (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                                        (eq_attr "mode" "V4SF,SF"))
                              (and (eq_attr "type" "ssediv")
                                   (eq_attr "memory" "load")))
-                        "c86-4g-direct,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*10")
+                        "c86-4g-direct,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*6")
 
 (define_insn_reservation "c86_4g_ssediv_sd_pd" 13
                         (and (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                                        (eq_attr "mode" "V2DF,DF"))
                              (and (eq_attr "type" "ssediv")
                                   (eq_attr "memory" "none")))
-                        "c86-4g-direct,c86-4g-fp1,c86-4g-fdiv*13")
+                        "c86-4g-direct,c86-4g-fp1fdivx4,c86-4g-fdiv*9")
 
 (define_insn_reservation "c86_4g_ssediv_sd_pd_load" 20
                         (and (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                                               (eq_attr "mode" "V2DF,DF"))
                              (and (eq_attr "type" "ssediv")
                                   (eq_attr "memory" "load")))
-                        "c86-4g-direct,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*13")
+                        "c86-4g-direct,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*9")
 
 
 (define_insn_reservation "c86_4g_ssediv_avx256_ps" 10
@@ -1053,28 +1058,28 @@
                              (and (eq_attr "mode" "V8SF")
                                   (and (eq_attr "memory" "none")
                                        (eq_attr "type" "ssediv"))))
-                        "c86-4g-double,c86-4g-fp1,c86-4g-fdiv*10")
+                        "c86-4g-double,c86-4g-fp1fdivx4,c86-4g-fdiv*6")
 
 (define_insn_reservation "c86_4g_ssediv_avx256_ps_load" 17
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "mode" "V8SF")
                                   (and (eq_attr "type" "ssediv")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-double,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*10")
+                        "c86-4g-double,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*6")
 
 (define_insn_reservation "c86_4g_ssediv_avx256_pd" 13
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "mode" "V4DF")
                                   (and (eq_attr "type" "ssediv")
                                        (eq_attr "memory" "none"))))
-                        "c86-4g-double,c86-4g-fp1,c86-4g-fdiv*13")
+                        "c86-4g-double,c86-4g-fp1fdivx4,c86-4g-fdiv*9")
 
 (define_insn_reservation "c86_4g_ssediv_avx256_pd_load" 20
                         (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
                              (and (eq_attr "mode" "V4DF")
                                   (and (eq_attr "type" "ssediv")
                                        (eq_attr "memory" "load"))))
-                        "c86-4g-double,c86-4g-load,c86-4g-fp1,c86-4g-fdiv*13")
+                        "c86-4g-double,c86-4g-load,c86-4g-fp1fdivx4,c86-4g-fdiv*9")
 ;; SSE MUL
 (define_insn_reservation "c86_4g_ssemul_ss_ps" 3
                         (and (and (eq_attr "cpu" "c86_4g_m4,c86_4g_m6")
