Re: [COMMITTED] MAINTAINERS: Add myself as MIPS port maintainer

2023-06-04 Thread Matthew Fortune via Gcc-patches
Hi YunQiang/Maciej,

Given that I have simply been unable to fulfill any of the duties of
maintaining a port in GCC due to various other commitments, I am
thrilled that YunQiang has stepped up to take it on. I have no
objection to either being removed or left, as a rather silent partner,
but would appreciate being moved to the write after approval section
if dropped from the maintainer section.

All the best,
Matthew

>On Sat, Jun 3, 2023 at 3:51 PM Maciej W. Rozycki  wrote:
>
> On Fri, 2 Jun 2023, YunQiang Su wrote:
>
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 4a7c963914b..c8b787b6e1e 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -91,7 +91,7 @@ m68k port   Andreas Schwab  
> > 
> >  m68k-motorola-sysv port  Philippe De Muyter  
> >  mcore port   Nick Clifton
> >  microblaze       Michael Eager   
> > -mips portMatthew Fortune 
> > +mips portYunQiang Su 
>
>  Has Matthew agreed to be removed from the maintainer's post?  Even if so,
> then he needs to be moved back to the Write After Approval section, as no
> one has deprived him of this right.
>
>   Maciej


RE: [PATCH] MIPS: Add support for -mcrc and -mginv options

2018-06-15 Thread Matthew Fortune
Robert Suchanek  writes:
> This patch adds -mcrc and -mginv options to pass through them
> to the assembler.
> 
> Regards,
> Robert
> 
> gcc/ChangeLog:
> 
> 2018-06-01  Matthew Fortune  
> 
>   * config/mips/mips.h (ASM_SPEC): Pass through -mcrc, -mno-crc,
>   -mginv and -mno-ginv to the assembler.
>   * config/mips/mips.opt (-mcrc): New option.
>   (-mginv): Likewise.
>   * doc/invoke.text (-mcrc): Document.
>   (-mginv): Likewise.

Since CRC and GINV ASEs have now been committed to binutils, please
go ahead with this change.

Thanks,
Matthew



RE: [PATCH] MIPS: Add support for P6600

2018-06-13 Thread Matthew Fortune
Robert Suchanek  writes:
> As already discussed, the link to the P6600 doesn't
> appear to be referenced on mips.com but reachable
> when searching for 'p6600':
> 
> https://www.mips.com/downloads/p6600-multiprocessing-programmers-guide/

Thanks, good spot.

> gcc/ChangeLog:
> 
> 2018-06-12  Matthew Fortune  
> Prachi Godbole  
> 
>   * config/mips/mips-cpus.def: Define P6600.
>   * config/mips/mips-tables.opt: Regenerate.
>   * config/mips/mips.c (mips_ucbranch_type): New enum.
>   (mips_rtx_cost_data): Add support for P6600.
>   (mips_issue_rate): Likewise.
>   (mips_multipass_dfa_lookahead): Likewise.
>   (mips_avoid_hazard): Likewise.
>   (mips_reorg_process_insns): Likewise.
>   (mips_classify_branch_p6600): New function.
>   * config/mips/mips.h (TUNE_P6600): New define.
>   (MIPS_ISA_LEVEL_SPEC): Infer mips64r6 from p6600.
>   (ENABLE_LD_ST_PAIRS): Enable load/store bonding for p6600.
>   * config/mips/mips.md: Include p6600.md.
>   (processor): Add p6600.
>   * config/mips/p6600.md: New file.
>   * doc/invoke.texi: Add p6600 to supported architectures.

With one more change to add another comment as below, this is OK to
commit.

> @@ -18957,7 +19039,10 @@ mips_reorg_process_insns (void)
>sequence and replace it with the delay slot instruction
>then the jump to clear the forbidden slot hazard.  */

This bit does need the comment extending.  Add this:

For the P6600, this optimisation solves the performance penalty
associated with BALC followed by a delay slot branch.  We do not set
fs_delay as we do not want the full logic of a forbidden slot; the
penalty exists only against branches not the full class of forbidden
slot instructions.

> 
> -   if (fs_delay)
> +   if (fs_delay || (TUNE_P6600
> +&& TARGET_CB_MAYBE
> +&& mips_classify_branch_p6600 (insn)
> +   == UC_BALC))
>   {
> /* Search onwards from the current position looking
> for
>a SEQUENCE.  We are looking for pipeline hazards
here

Thanks,
Matthew



[RFC] New features for multilib control

2018-06-13 Thread Matthew Fortune
Hi,

This patch was developed as part of preparing ever more complex multilib
combinations for the MIPS architecture and aims to solve several problems
in this area. I've attempted to be quite verbose in the description, so
that I can explain how I am using various terms as pretty much everyone
has a different understanding (and I don't claim mine to be 'right'
either).

The changes aim to:

1) Eliminate the fallback multilib
The fallback multilib (top level of 'lib') is often annoying because it is
used for any combination of options that do not match a specific multilib.
Quite often this default multilib is incompatible with the build options
that end up linking against it, leading to bizarre link time messages that
confuse ordinary users.

2) Move the default multilib to a subfolder
Having successfully eliminated the fallback multilib it is also true that
it would eliminate the 'default' multilib as well. I.e. the library used
when no relevant user options are supplied. Moving this library to a
subfolder has two benefits. a) it still exists! and b) the location of
this library becomes invariant irrespective of which options are
build-time configured as default.

3) Preserve/use invariant paths for multilib variants
A simplistic multilib specification leads to a nested set of folders,
where the top level is the left-most multilib directory and the bottom is
the right most. Introducing a new axis of multilib configuration changes
this path structure leading to the different library configurations to 
move around. This is not in itself a problem, as the compiler driver can
always locate the right path for any given build, but it does cause issues
when doing configuration management of binary toolchains. When there are
many different multilibs it is ideal to ship/install only the ones that
are important and if the paths keep changing over time this process is
complex and confusing. This issue is effectively solved by the
MULTI_OSDIRNAMES feature but using it for both sysroot and compiler
internal libraries is even better.

4) Support un-released multilib configurations
This one sounds weird but it is quite valuable. When an architecture has
70+ possible multilib variants but only 40 of them are currently known to
be needed then only build and release 40 variants. However, if it turns
out that another configuration becomes useful then it is often handy to
just build the missing configuration and install it into the pre-existing
release. So, the driver needs to know about all multilibs and their paths
but the build process should only build a subset.

So, the solution...

Firstly, be verbose about the MULTILIB_OPTIONS needed for a target. Add
in the default options as well as all the others that may, or could, ever
be supported by the current compiler features. For MIPS supporting
MIPS32r1 onwards it looks like this:

MULTILIB_OPTIONS = muclibc mips32/mips32r2/mips32r6/mips64/mips64r2/mips64r6
mips16/mmicromips mabi=32/mabi=n32/mabi=64 EB/EL msoft-float mnan=2008

This does create an enormous matrix of possible configurations so this
has to be trimmed to the valid set using MULTILIB_REQUIRED. Note that
the valid set should include any that you may wish to support even if
you don't want to build/release them. By having the default options in
then this leads to having two copies of the 'default' multilib; one with
the options implicitly set and one with them explicitly set.

Second, remove the multilib with the implicit default options. This
equates to the '.' multilib entry. To do this use MULTILIB_EXCLUSIONS to
remove the negated form of every MULTILIB_OPTION:

MULTILIB_EXCLUSIONS =
!muclibc/!mips32/!mips32r2/!mips32r6/!mips64/!mips64r2/!mips64r6/!mips16/!mm
icromips/!mabi=32/!mabi=n32/!mabi=64/!EB/!EL/!msoft-float/!mnan=2008

Third, set the MULTILIB_OSDIRNAMES to have an entry for every valid
combination of options and use the '!' modifier. Honestly, I'm not sure
how/why this works but this leads to both the internal library paths and
sysroot paths using the OSDIRNAME instead of the nested tree of
MULTILIB_DIRNAMES. Choose a path for each variant that you will never
change again; I used an encoded form of the configuration for MIPS:

MULTILIB_OSDIRNAMES =
mips32r6/mabi.32/EB/mnan.2008=!mips-r6-hard$(is_newlib)/lib

Fourth, deal with the fallout from the config-ml.in logic which handles
the initial 'configure' of a library differently to all of the multilib
configurations. The basic idea is that since the default multilib will now
report that it belongs in a subdirectory then, when configuring the top
level multilib, query the driver for the folder. Later when iterating
through the multilib specs skip the one that matches the default
configuration. Each configuration will be built exactly once and all of
them will be placed in a subfolder leaving the top level install folder
bare.

Fifth, restrict the set of multilibs that actually get built for any
given compiler. This is sort-of a new concept so I added a

RE: [PATCH,MIPS] Fix pr86067 ICE: scal-to-vec1.c:86:1: error: unrecognizable insn with -march=loongson3a

2018-06-12 Thread Matthew Fortune
Paul Hua  writes:
> The gcc.c-torture/execute/scal-to-vec1.c  trigger a gcc ICE bug.
> 
> It's a mistake in define_expand vec_setv4hi in loongson.md file.
> 
> 375 (define_expand "vec_setv4hi"
> 376   [(set (match_operand:V4HI 0 "register_operand" "=f")
> 377 (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "f")
> 378   (match_operand:HI 2 "register_operand" "f")
> 379   (match_operand:SI 3 "const_0_to_3_operand"
> "")]
> 380  UNSPEC_LOONGSON_PINSRH))]
> 381   "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
> 382 {
> 383   rtx ext = gen_reg_rtx (SImode);
> 384   emit_move_insn (ext, gen_lowpart (SImode, operands[1]));
> 385   operands[1] = ext;
> 386 })
> 
> The line 384 gen_lowpart the operands[1], should be gen_lowpart
> operands[2], cause the operands[2] are HImode.
> 
> 
> The attached patch fixed this bug.
> 
> Bootstrapped and reg-tested on mips64el-unknown-linux-gnu.
> Ok for commit ?
> 
> 
> ---

Hi Paul,

This looks good, just one issue with the changelog entry. The entry
would go in the gcc/ChangeLog file and the path is then relative to
the gcc/ directory. The PR should be referenced as target/:

2018-03-24  Chenghua Xu 
 
PR target/86076
* config/mips/loongson.md (vec_setv4hi): Gen_lowpart for
operands[2] instead of operands[1].

OK to commit, thanks for finding and fixing. This has been broken
since 2011!

Matthew



RE: [PATCH] MIPS: Add support for P6600

2018-06-11 Thread Matthew Fortune
Robert Suchanek  writes:
> The below adds support for -march=p6600.  It includes
> a new scheduler plus performance tweaks.
> 
> gcc/ChangeLog:
> 
> 2018-06-01  Matthew Fortune  
> Prachi Godbole  
>   * config/mips/mips-cpus.def: Define P6600.
>   * config/mips/mips-tables.opt: Regenerate.
>   * config/mips/mips.c (mips_multipass_dfa_lookahead): Add P6600.
>   * config/mips/mips.h (TUNE_P6600): New define.
>   (MIPS_ISA_LEVEL_SPEC): Infer mips64r6 from p6600.
>   * doc/invoke.texi: Add p6600 to supported architectures.

It seems that mips.com has no longer got any architecture specific
documents available for either i6400/i6500 (which did exist until
recently) nor p6600. There isn't anything particularly unusual
about i6400/i6500 support but this patch for p6600 attempts to deal
with some undocumented micro-architecture details that are not
sufficiently described in the patch to maintain over time nor is
there reference material available.

Despite the fact that I am one of the original authors of the
work, I can't accept this upstream without further information.

Some notes on the patch are below:

> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index bfe64bb..9c77c13 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -198,6 +198,15 @@ enum mips_address_type {
>ADDRESS_SYMBOLIC
>  };
> 
> +/* Classifies an unconditional branch of interest for the P6600.  */
> +
> +enum mips_ucbranch_type {

Newline for brace.

> +  /* May not even be a branch.  */
> +  UC_UNDEFINED,
> +  UC_BALC,
> +  UC_OTHER
> +};
> +
>  /* Macros to create an enumeration identifier for a function
> prototype.  */
>  #define MIPS_FTYPE_NAME1(A, B) MIPS_##A##_FTYPE_##B
>  #define MIPS_FTYPE_NAME2(A, B, C) MIPS_##A##_FTYPE_##B##_##C
> @@ -1127,6 +1136,19 @@ static const struct mips_rtx_cost_data
>  COSTS_N_INSNS (36),   /* int_div_di */
>   2,/* branch_cost */
>   4 /* memory_latency */
> +  },
> +  { /* P6600 */
> +COSTS_N_INSNS (4),/* fp_add */
> +COSTS_N_INSNS (5),/* fp_mult_sf */
> +COSTS_N_INSNS (5),/* fp_mult_df */
> +COSTS_N_INSNS (17),   /* fp_div_sf */
> +COSTS_N_INSNS (17),   /* fp_div_df */
> +COSTS_N_INSNS (5),/* int_mult_si */
> +COSTS_N_INSNS (5),/* int_mult_di */
> +COSTS_N_INSNS (8),/* int_div_si */
> +COSTS_N_INSNS (8),/* int_div_di */
> + 2,/* branch_cost */
> + 4 /* memory_latency */
>}
>  };
> 
> 
> @@ -14592,6 +14614,7 @@ mips_issue_rate (void)
>  case PROCESSOR_LOONGSON_2F:
>  case PROCESSOR_LOONGSON_3A:
>  case PROCESSOR_P5600:
> +case PROCESSOR_P6600:
>return 4;
> 
>  case PROCESSOR_XLP:
> @@ -14727,7 +14750,7 @@ mips_multipass_dfa_lookahead (void)
>if (TUNE_OCTEON)
>  return 2;
> 
> -  if (TUNE_P5600 || TUNE_I6400)
> +  if (TUNE_P5600 || TUNE_P6600 || TUNE_I6400)
>  return 4;
> 
>return 0;
> @@ -18647,6 +18670,28 @@ mips_orphaned_high_part_p (mips_offset_table
> *htab, rtx_insn *insn)
>return false;
>  }
> 
> +/* Subroutine of mips_avoid_hazard.  We classify unconditional
> branches
> +   of interest for the P6600 for performance reasons.  We're
> interested
> +   in differentiating BALC from JIC, JIALC and BC.  */
> +
> +static enum mips_ucbranch_type
> +mips_classify_branch_p6600 (rtx_insn *insn)
> +{
> +  if (!(insn
> +  && USEFUL_INSN_P (insn)
> +  && GET_CODE (PATTERN (insn)) != SEQUENCE))
> +return UC_UNDEFINED;

Let's switch this around to the following with a comment to say
ignore sequences because they represent a filled delay slot
branch (which presumably is not affected by the uArch issue).

  if (insn
  || !USEFUL_INSN_P (insn)
  || GET_CODE (PATTERN (insn)) == SEQUENCE))
return UC_UNDEFINED;

> +
> +  if (get_attr_jal (insn) == JAL_INDIRECT /* JIC and JIALC.  */
> +  || get_attr_type (insn) == TYPE_JUMP) /* BC as it is a loose
> jump.  */
> +return UC_OTHER;

I don't recall what a loose jump was supposed to refer to, presumably
'direct jump'. 

>  /* Subroutine of mips_reorg_process_insns.  If there is a hazard
> between
> INSN and a previous instruction, avoid it by inserting nops after
> instruction AFTER.
> @@ -18699,14 +18744,36 @@ mips_avoid_hazard (rtx_insn *after, rtx_insn
> *insn, int *hilo_delay,
>  && GET_CODE (pattern) != ASM_INPUT
>  && asm_noperands (pattern) < 0)
>  nops = 1;
> +  /* The P6600's

RE: [PATCH] MIPS: Add i6500 processor as an alias for i6400

2018-06-11 Thread Matthew Fortune
Robert Suchanek  writes:
> This patch adds i6500 CPU as an alias for i6400.
> 
> Regards,
> Robert
> 
> gcc/ChangeLog:
> 
> 2018-06-01  Matthew Fortune  
> 
>   * config/mips/mips-cpus.def: New MIPS_CPU for i6500.
>   * config/mips/mips-tables.opt: Regenerate.
>   * config/mips/mips.h (MIPS_ISA_LEVEL_SPEC): Mark i6500 as
>   mips64r6.
>   * doc/invoke.texi: Document -march=i6500.

Looks good, OK to commit.

Thanks,
Matthew



RE: [PATCH] MIPS: Update I6400 scheduler

2018-06-11 Thread Matthew Fortune
Robert Suchanek  writes:
> Update to i6400 scheduler.
> 
> Regards,
> Robert
> 
> gcc/ChangeLog:
> 
> 2018-06-01  Prachi Godbole  
> 
>   * config/mips/i6400.md (i6400_gpmuldiv): Remove cpu_unit.
>   (i6400_gpmul): Add cpu_unit.
>   (i6400_gpdiv): Likewise.
>   (i6400_msa_add_d): Update reservations.
>   (i6400_msa_int_add) Likewise.
>   (i6400_msa_short_logic3) Likewise.
>   (i6400_msa_short_logic2) Likewise.
>   (i6400_msa_short_logic) Likewise.
>   (i6400_msa_move) Likewise.
>   (i6400_msa_cmp) Likewise.
>   (i6400_msa_short_float2) Likewise.
>   (i6400_msa_div_d) Likewise.
>   (i6400_msa_long_logic1) Likewise.
>   (i6400_msa_long_logic2) Likewise.
>   (i6400_msa_mult) Likewise.
>   (i6400_msa_long_float2) Likewise.
>   (i6400_msa_long_float4) Likewise.
>   (i6400_msa_long_float5) Likewise.
>   (i6400_msa_long_float8) Likewise.
>   (i6400_fpu_minmax): New define_insn_reservation.
>   (i6400_fpu_fadd): Include frint type.
>   (i6400_fpu_store): New define_insn_reservation.
>   (i6400_fpu_load): Likewise.
>   (i6400_fpu_move): Likewise.
>   (i6400_fpu_fcmp): Likewise.
>   (i6400_fpu_fmadd): Likewise.
>   (i6400_int_mult): Include imul3nc type and update reservation.
>   (i6400_int_div): Include idiv3 type and update reservation.
>   (i6400_int_load): Update to check type not move_type.
>   (i6400_int_store): Likewise.
>   (i6400_int_prefetch): Set zero latency.

Hi Robert,

Just to fill in the blanks on the submission history for this. The
patch here was written by Prachi while MIPS was a part of IMG and
was then transferred to MIPS Tech LLC when the business split. Both
IMG and MIPS Tech LLC have copyright assignment in place so it
does not matter that it is authored by someone from IMG and submitted
by a MIPS Tech LLC employee.

There is no public platform available to demonstrate the improvement
of this patch but it was tested, when written, to meet or exceed
existing performance on i6400. It is also included in the reference
tools for the i6400 produced by MIPS Tech LLC.

There does seem to be a temporal issue in submission for this as
the i6400_fpu_minmax reservation refers to fminmax and fclass types
that do not exist in trunk. Can you drop that reservation please?

Otherwise, OK to commit.

Thanks,
Matthew

> ---
>  gcc/config/mips/i6400.md | 86 ++--
> 
>  1 file changed, 61 insertions(+), 25 deletions(-)
> 
> diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
> index 413e9e8..a985401 100644
> --- a/gcc/config/mips/i6400.md
> +++ b/gcc/config/mips/i6400.md
> @@ -21,7 +21,7 @@
>  (define_automaton "i6400_int_pipe, i6400_mdu_pipe,
> i6400_fpu_short_pipe,
>  i6400_fpu_long_pipe")
> 
> -(define_cpu_unit "i6400_gpmuldiv" "i6400_mdu_pipe")
> +(define_cpu_unit "i6400_gpmul, i6400_gpdiv" "i6400_mdu_pipe")
>  (define_cpu_unit "i6400_agen, i6400_alu1, i6400_lsu" "i6400_int_pipe")
>  (define_cpu_unit "i6400_control, i6400_ctu, i6400_alu0"
> "i6400_int_pipe")
> 
> @@ -50,49 +50,49 @@ (define_insn_reservation "i6400_msa_add_d" 1
>(and (eq_attr "cpu" "i6400")
> (and (eq_attr "mode" "!V2DI")
>   (eq_attr "alu_type" "simd_add")))
> -  "i6400_fpu_short, i6400_fpu_intadd")
> +  "i6400_fpu_short+i6400_fpu_intadd*2")
> 
>  ;; add, hadd, sub, hsub, average, min, max, compare
>  (define_insn_reservation "i6400_msa_int_add" 2
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type" "simd_int_arith"))
> -  "i6400_fpu_short, i6400_fpu_intadd")
> +  "i6400_fpu_short+i6400_fpu_intadd*2")
> 
>  ;; sat, pcnt
>  (define_insn_reservation "i6400_msa_short_logic3" 3
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type" "simd_sat,simd_pcnt"))
> -  "i6400_fpu_short, i6400_fpu_logic")
> +  "i6400_fpu_short+i6400_fpu_logic*2")
> 
>  ;; shifts, nloc, nlzc, bneg, bclr, shf
>  (define_insn_reservation "i6400_msa_short_logic2" 2
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type" "simd_shift,simd_shf,simd_bit"))
> -  "i6400_fpu_short, i6400_fpu_logic")
> +  "i6400_fpu_short+i6400_fpu_logic*2")
> 
>  ;; and, or, xor, ilv, pck, fill, splat
>  (define_insn_reservation "i6400_msa_short_logic" 1
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type"
> "simd_permute,simd_logic,simd_splat,simd_fill"))
> -  "i6400_fpu_short, i6400_fpu_logic")
> +  "i6400_fpu_short+i6400_fpu_logic*2")
> 
>  ;; move.v, ldi
>  (define_insn_reservation "i6400_msa_move" 1
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type" "simd_move"))
> -  "i6400_fpu_short, i6400_fpu_logic")
> +  "i6400_fpu_short+i6400_fpu_logic*2")
> 
>  ;; Float compare New: CMP.cond.fmt
>  (define_insn_reservation "i6400_msa_cmp" 2
>(and (eq_attr "cpu" "i6400")
> (eq_attr "type" "simd_fcmp"))
> -  "i6400_fpu_short, i6400_fpu_cmp")
> +  "i6400_fpu_short+i6400_fpu_cmp*2")
> 
>  ;; Float min, max, class
>  (define_insn_reservation 

RE: [PATCH] MIPS: Add support for -mcrc and -mginv options

2018-06-11 Thread Matthew Fortune
Robert Suchanek  writes:
> This patch adds -mcrc and -mginv options to pass through them
> to the assembler.
> 
> Regards,
> Robert
> 
> gcc/ChangeLog:
> 
> 2018-06-01  Matthew Fortune  
> 
>   * config/mips/mips.h (ASM_SPEC): Pass through -mcrc, -mno-crc,
>   -mginv and -mno-ginv to the assembler.
>   * config/mips/mips.opt (-mcrc): New option.
>   (-mginv): Likewise.
>   * doc/invoke.text (-mcrc): Document.
>   (-mginv): Likewise.

The patch is OK but should probably wait until the binutils support is
committed. I see CRC ASE support in discussion up until:

https://sourceware.org/ml/binutils/2017-12/msg00069.html

And GINV support up to:

https://sourceware.org/ml/binutils/2018-01/msg00125.html

Thanks,
Matthew



RE: Prefer open-coding vector integer division

2018-06-08 Thread Matthew Fortune
Richard Sandiford  writes:
> vect_recog_divmod_pattern currently bails out if the target has
> native support for integer division, but I think in practice
> it's always going to be better to open-code it anyway, just as
> we usually open-code scalar divisions by constants.
> 
> I think the only currently affected target is MIPS MSA, where for:
> 
>   void
>   foo (int *x)
>   {
> for (int i = 0; i < 100; ++i)
>   x[i] /= 2;
>   }
> 
> we previously preferred to use division for powers of 2:
> 
> .setnoreorder
> bnz.w   $w1,1f
> div_s.w $w0,$w0,$w1
> break   7
> .setreorder
> 1:
> 
> (or just the div_s.w for -mno-check-zero-division), but after the patch
> we open-code them using shifts:
> 
> clt_s.w $w1,$w0,$w2
> subv.w  $w0,$w0,$w1
> srai.w  $w0,$w0,1
> 
> I assume that's better.  Matthew, is that right?

Sorry for extreme tardiness. Yes, the alternate sequence has a max latency
of 6. Although I don't have the range of latencies to hand for the FDIV, as
far as I remember 6 cycles is better than the fastest FDIV case at least for
i6400/i6500.

Matthew



[PATCH,committed] [MAINTAINERS] Update email address

2018-06-04 Thread Matthew Fortune
Updating my email address, apologies for being out of date for a while.

Matthew

* MAINTAINERS: Update my email address.

---
 ChangeLog   | 4 
 MAINTAINERS | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f9f376a..54b7958 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -77,7 +77,7 @@ m68k port Andreas Schwab

 m68k-motorola-sysv portPhilippe De Muyter  
 mcore port Nick Clifton
 microblaze Michael Eager   
-mips port  Matthew Fortune 
+mips port  Matthew Fortune 
 mmix port  Hans-Peter Nilsson  
 mn10300 port   Jeff Law
 mn10300 port   Alexandre Oliva 
-- 
2.2.1




RE: [PATCH 2/2] MIPS/GCC/testsuite: Fix data-sym-pool.c for n64 code

2018-04-18 Thread Matthew Fortune
Maciej Rozycki  writes:
> (we have no support for hard-float n64 MIPS16 code generation), which
> means that the test case will fail, as the regular expression pattern
> expects `lw' and `.word' rather than `ld' and `.dword' respectively to
> appear in assembly code generation.  Correct the pattern in an obvious
> way then making it accept both intructions and pseudo-ops.
> 
>   gcc/testsuite/
>   * gcc.target/mips/data-sym-pool.c (dg-options): Match `ld' and
>   `.dword' in addition to `lw' and `.word'.
> ---
> Hi,
> 
>  I have regression-tested this with the `mips-mti-linux-gnu' target and
> the o32 ABI.  I don't have an n64 soft-float multilib configured, so the
> manually generated assembly file included with the description serves as
> the proof for what needs to be matched.
> 
>  OK to apply?

Hi Maciej,

Apologies as before on slowness. This looks trivially OK to me but that's
thanks to the description, much appreciated.

OK for commit but adding RMs given the imminent release.

Thanks,
Matthew


RE: [PATCH 1/2] MIPS/GCC/testsuite: Fix data-sym-pool.c for SVR4 model at -O0

2018-04-18 Thread Matthew Fortune
Maciej Rozycki  writes:
> Given that the SVR4 vs PLT code model consideration is irrelevant for
> this test case rather than rewriting the regular expression to match
> this variant of code just enforce the PLT model by using the `-mplt'
> option.  It is safe to use this option unconditionally as it is silently
> ignored with configurations that do not support this model, e.g. bare
> metal ELF.
> 
>   gcc/testsuite/
>   * gcc.target/mips/data-sym-pool.c (dg-options): Add `-mplt'.
> ---
> Hi,
> 
>  I have regression-tested this with the `mips-mti-linux-gnu' target and
> the o32, n32 and n64 ABIs.  The two latters are demoted to o64 by the
> test framework due to the lack of MIPS16 support for the hard-float
> variants of these ABIs and I don't have soft-float multilibs configured,
> so instead I have verified n32/soft-float and n64/soft-float variants by
> hand.  The latter revealed the need for 2/2.
> 
>  Finally I do not have a bare metal ELF configuration available for
> regression-testing right now, so I only verified that `-mplt' is
> silently ignored.  Code generated is expected to be the same as in the
> PLT mode.
> 
>  OK to apply?

Hi Maciej,

Sorry for skimming past this and being slow to respond. I agree with your
choice of forcing an option to stabilise the generated code it does tend
to future proof better than leaving options floating.

OK to apply but given the release date I've added RMs to approve for
trunk right now.

Thanks,
Matthew


RE: [PATCH,Testsuite,MIPS] Fixing fix-r4000-n.c failure started with r255348

2018-03-27 Thread Matthew Fortune
Hi Paul,

> ChangeLog entries:
> 
> gcc/testsuite/ChangeLog
> 
> 2018-03-24  Chenghua Xu 
> 
> * gcc.target/mips/fix-r4000-1.c: Delete "[^\n]" in dg-final.
> * gcc.target/mips/fix-r4000-2.c: Likewise.
> * gcc.target/mips/fix-r4000-3.c: Likewise.
> * gcc.target/mips/fix-r4000-4.c: Likewise.
> * gcc.target/mips/fix-r4000-5.c: Likewise.
> * gcc.target/mips/fix-r4000-6.c: Likewise.
> * gcc.target/mips/fix-r4000-7.c: Likewise.
> * gcc.target/mips/fix-r4000-8.c: Likewise.
> * gcc.target/mips/fix-r4000-9.c: Likewise.
> * gcc.target/mips/fix-r4000-10.c: Likewise.
> * gcc.target/mips/fix-r4000-7.c: Change dg-final
>   "mulditi3_r4000" instead of "mulditi3".
> * gcc.target/mips/fix-r4000-8.c: Change dg-final
>   "umulditi3_r4000" instead of "umulditi3".

This looks good too. Another good cleanup, OK to commit.

Thanks,
Matthew


RE: [PATCH,Testsuite,MIPS] Fixing umips-stroe16-2.c failure started with r255348

2018-03-27 Thread Matthew Fortune
Hi Paul

> ChangeLog entries:
> 
> gcc/testsuite/ChangeLog
> 
> 2018-03-24  Chenghua Xu 
> 
> * gcc.target/mips/umips-stroe16-2.c: Change "length = 2"
>   to "l=2" in dg-final.

Looks good. Thanks for the cleanup. OK to commit.

Matthew


RE: [PATCH 0/4] ASAN for MIPS (o32)

2018-03-23 Thread Matthew Fortune
Hans-Peter Nilsson  writes:
> All patches are together run through the gcc and g++ test-suites
> to check ASAN results for mipsisa32r2el-linux-gnu.  As of
> r258635 those results are on par with those for
> arm-linux-gnueabihf, so without delay I present the current
> state.  Maybe it's non-intrusive enough to be ok for gcc-8?
> MIPS maintainers (and interested party) CC:ed.

From a MIPS backend perspective all 4 patches are OK. I've done very
little to support upstream MIPS over this release so these
improvements are fantastic. I don't know the detail of asan support
so am going with the idea that your investigation has got to the
bottom of the problems; thanks for the detailed explanations.

I'm not sure I should really approve this for GCC-8 but rather ask
a global maintainer or Jakub/Richard as release managers given I
can't commit to do much to support the release and I won't want to
risk burdening others with a late change.

> For use with -fsanitize=address, you'll need a non-ancient glibc
> or equivalent (2002-ish), one that iterates on ELF headers for
> the EH info at exception time (rather, doesn't call
> __register_frame_info or __register_frame_info_bases at startup,
> ending up calling malloc/free) or else Asan will try to
> instrument the call to free and hang on a lock for eternity (or
> dies on a signal).  But that's no different than for other
> ports, AFAIU.
> 
> So, ok to commit?

As above, if a global maintainer is happy then yes.

Matthew

> 
> brgds, H-P


RE: [PATCH] Fix ICE caused by a missing check for DECL_LANG_SPECIFIC

2018-03-02 Thread Matthew Fortune
Jason Merrill <ja...@redhat.com> writes:
> On Thu, Mar 1, 2018 at 7:02 AM, Matthew Fortune <mfort...@gmail.com>
> wrote:
> > Hi,
> >
> > It seems we have had a bug for some time that causes an ICE and
> prevents the
> > MIPS16 library builds from completing but is likely unrelated to
> MIPS16.
> > The problem is when we call target_reinit and library functions get
> created
> > as shown in the call stack at the end of this message. The first
> builtin
> > that triggers the problem happens to be one of the MIPS16 helpers but
> I
> > don't think there is anything unique about it. The issue appeared
> after some
> > refactoring work in r253600 where code testing DECL_CLASS_SCOPE_P and
> > DECL_FRIEND_P was previously guarded by a check for
> DECL_LANG_SPECIFIC but
> > not after.
> >
> > https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00604.html
> >
> > I don’t know if this is the correct solution or whether we need to
> change the
> > way builtins are initialised in the MIPS backend but I suspect this
> fix
> > is the right way to go.
> >
> > Cc: Jason as author of the original change.
> >
> > Thanks,
> > Matthew
> >
> > gcc/cp/
> > * pt.c (type_dependent_expression_p): Add missing check for
> > DECL_LANG_SPECIFIC.
> > ---
> >  gcc/cp/pt.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 7345119..c88304f 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -24635,6 +24635,7 @@ type_dependent_expression_p (tree expression)
> >   type-dependent.  Checking this is important for functions with
> auto return
> >   type, which looks like a dependent type.  */
> >if (TREE_CODE (expression) == FUNCTION_DECL
> > +  && DECL_LANG_SPECIFIC (expression)
> >&& !(DECL_CLASS_SCOPE_P (expression)
> >&& dependent_type_p (DECL_CONTEXT (expression)))
> >&& !(DECL_FRIEND_P (expression)
> 
> I think we want to go into this block when DECL_LANG_SPECIFIC is NULL.
> Does this also fix the issue for you?

Thanks. Yes, this fixes it too. I wasn't sure which of the accessors were
dependent on DECL_LANG_SPECIFIC so ended up with a sledgehammer. 

Matthew



[PATCH] Fix ICE caused by a missing check for DECL_LANG_SPECIFIC

2018-03-01 Thread Matthew Fortune
Hi,

It seems we have had a bug for some time that causes an ICE and prevents the
MIPS16 library builds from completing but is likely unrelated to MIPS16.
The problem is when we call target_reinit and library functions get created
as shown in the call stack at the end of this message. The first builtin
that triggers the problem happens to be one of the MIPS16 helpers but I
don't think there is anything unique about it. The issue appeared after some
refactoring work in r253600 where code testing DECL_CLASS_SCOPE_P and
DECL_FRIEND_P was previously guarded by a check for DECL_LANG_SPECIFIC but
not after.

https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00604.html

I don’t know if this is the correct solution or whether we need to change the
way builtins are initialised in the MIPS backend but I suspect this fix
is the right way to go.

Cc: Jason as author of the original change.

Thanks,
Matthew

gcc/cp/
* pt.c (type_dependent_expression_p): Add missing check for
DECL_LANG_SPECIFIC.
---
 gcc/cp/pt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7345119..c88304f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -24635,6 +24635,7 @@ type_dependent_expression_p (tree expression)
  type-dependent.  Checking this is important for functions with auto return
  type, which looks like a dependent type.  */
   if (TREE_CODE (expression) == FUNCTION_DECL
+  && DECL_LANG_SPECIFIC (expression)
   && !(DECL_CLASS_SCOPE_P (expression)
   && dependent_type_p (DECL_CONTEXT (expression)))
   && !(DECL_FRIEND_P (expression)
-- 
2.2.1


...v3/include/bits/cpp_type_traits.h:408:32: internal compiler error: 
Segmentation fault
 __miter_base(_Iterator __it)
^
0xd77fff crash_signal
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/toplev.c:326
0xd77fff crash_signal
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/toplev.c:326
0x74aeb9 type_dependent_expression_p(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/pt.c:24293
0x6cda47 mangle_decl_string
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3740
0x6cdca8 get_mangled_id
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3782
0x6cdf50 mangle_decl(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3820
0x74aeb9 type_dependent_expression_p(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/pt.c:24293
0x6cda47 mangle_decl_string
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3740
0x6cdca8 get_mangled_id
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3782
0x6cdf50 mangle_decl(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/cp/mangle.c:3820
0x105af90 decl_assembler_name(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/tree.c:673
0xc7d1e8 build_libfunc_function(char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:736
0xc7d717 init_one_libfunc(char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:766
0xc7d79a set_optab_libfunc(optab_tag, machine_mode, char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:804
0x105af90 decl_assembler_name(tree_node*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/tree.c:673
0xc7d1e8 build_libfunc_function(char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:736
0xc7d717 init_one_libfunc(char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:766
0xc7d79a set_optab_libfunc(optab_tag, machine_mode, char const*)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/optabs-libfuncs.c:804
0x10d49bb mips_init_libfuncs

/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/config/mips/mips.c:13425
0xd7836c lang_dependent_init_target
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/toplev.c:1758
0xd7941c target_reinit()
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/toplev.c:1880
0x13e6f55 save_target_globals()
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/target-globals.c:86
0x10e0124 mips_set_compression_mode

/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/config/mips/mips.c:19518
0xa7ee1b invoke_set_current_function_hook
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/function.c:4783
0xa88aa7 invoke_set_current_function_hook
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/function.c:4918
0xa88aa7 allocate_struct_function(tree_node*, bool)
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/function.c:4893
0x10d49bb mips_init_libfuncs

/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/config/mips/mips.c:13425
0xd7836c lang_dependent_init_target
/scratch/mpf/overtest/64126/240294/shared/gcc/gcc/toplev.c:1758
0xd7941c target_reinit()

[PATCH,MIPS,committed] Fix wrong use of XINT instead of INTVAL

2018-03-01 Thread Matthew Fortune
Hi,

This issue was caught with assert checking enabled but is not a
functional bug as XINT(x, 0) happens to overlay INTVAL(x) anyway.

Committed to trunk.

Thanks,
Matthew

gcc/
* config/mips/mips.c (mips_final_prescan_insn): Fix incorrect
XINT with INTVAL.
(mips_final_postscan_insn): Likewise.
---
 gcc/config/mips/mips.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 00cece2..aabd4b1 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -20426,7 +20426,7 @@ mips_final_prescan_insn (rtx_insn *insn, rtx *opvec,
int noperands)
   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
   && XINT (PATTERN (insn), 1) == UNSPEC_CONSTTABLE)
 mips_set_text_contents_type (asm_out_file, "__pool_",
-XINT (XVECEXP (PATTERN (insn), 0, 0), 0),
+INTVAL (XVECEXP (PATTERN (insn), 0, 0)),
 FALSE);
 
   if (mips_need_noat_wrapper_p (insn, opvec, noperands))
@@ -20450,7 +20450,7 @@ mips_final_postscan_insn (FILE *file
ATTRIBUTE_UNUSED, rtx_insn *insn,
   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
   && XINT (PATTERN (insn), 1) == UNSPEC_CONSTTABLE_END)
 mips_set_text_contents_type (asm_out_file, "__pend_",
-XINT (XVECEXP (PATTERN (insn), 0, 0), 0),
+INTVAL (XVECEXP (PATTERN (insn), 0, 0)),
 TRUE);
 }
 
-- 
2.2.1




RE: [PATCH, PR83327] Fix liveness analysis in lra for spilled-into hard regs

2018-02-28 Thread Matthew Fortune
Tom de Vries <tom_devr...@mentor.com> writes:
> On 02/26/2018 12:00 PM, Matthew Fortune wrote:
> > Tom de Vries <tom_devr...@mentor.com> writes:
> >> On 01/08/2018 05:32 PM, Tom de Vries wrote:
> >>> On 12/18/2017 05:57 PM, Vladimir Makarov wrote:
> >>>>
> >>>>
> >>>> On 12/15/2017 06:25 AM, Tom de Vries wrote:
> >>>>>
> >>>>> Proposed Solution:
> >>>>>
> >>>>> The patch addresses the problem, by:
> >>>>> - marking the hard regs that have been used in lra_spill in
> >>>>>    hard_regs_spilled_into
> >>>>> - using hard_regs_spilled_into in lra_create_live_ranges to
> >>>>>    make sure those registers are marked in the conflict_hard_regs
> >>>>>    of pseudos that overlap with the spill register usage
> >>>>>
> >>>>> [ I've also tried an approach where I didn't use
> >>>>> hard_regs_spilled_into, but tried to propagate all hard regs. I
> >>>>> figured out that I needed to mask out eliminable_regset.  Also I
> >>>>> needed to masked out lra_no_alloc_regs, but that could be due to
> >>>>> gcn-specific problems (pointers take 2 hard regs), I'm not yet
> sure.
> >>>>> Anyway, in the submitted patch I tried to avoid these problems
> and
> >>>>> went for the more minimal approach. ]
> >>>>>
> >>>> Tom, thank you for the detail explanation of the problem and
> >>>> solutions you considered.  It helped me a lot.  Your simple
> solution
> >>>> is adequate as the most transformations and allocation are done on
> >>>> the 1st LRA subpasses iteration.
> >>>>> In order to get the patch accepted for trunk, I think we need:
> >>>>> - bootstrap and reg-test on x86_64
> >>>>> - build and reg-test on mips (the only primary platform that has
> the
> >>>>>    spill_class hook enabled)
> >>>>>
> >>>>> Any comments?
> >>>>
> >>>> The patch looks ok to me.  You can commit it after successful
> testing
> >>>> on x86-64 and mips but I am sure there will be no problems with
> >>>> x86-64 as it does not use spill_class currently (actually your
> patch
> >>>> might help to switch it on again for x86-64.  spill_class was
> quite
> >>>> useful for x86-64 performance on Intel processors).
> >>>>
> >>>
> >>> Hi Matthew,
> >>>
> >>> there's an lra optimization that is currently enabled for MIPS, and
> >>> not for any other primary or secondary target.
> >>>
> >>> This (already approved) patch fixes a bug in that optimization, and
> >>> needs to be tested on MIPS.
> >>>
> >>> Unfortunately, the optimization is only enabled for MIPS16, and we
> >>> don't have a current setup to test this.
> >>>
> >>> Could you help us out here and test this patch for MIPS16 on trunk?
> >>
> >> Hi Matthew,
> >>
> >> is this something you can help us out with?
> >
> > Hi Tom,
> >
> > I've just commented on the bug report that I've set of some builds to
> try
> > and give some assurance. It is far from comprehensive but it is as
> good as
> > the normal testing I do for MIPS16.
> >
> 
> Hi Matthew,
> 
> Awesome, thanks for the help.

I have tested trunk with and without the patch and can confirm there
is no change in test status for MIPS16 big endian.

I ended up fixing an assert-checking issue in the MIPS backend and
a bug (I think) in the C++ frontend to get to the point of testing so
quite a worthwhile effort all in all.

Thanks,
Matthew


RE: [PATCH, PR83327] Fix liveness analysis in lra for spilled-into hard regs

2018-02-26 Thread Matthew Fortune
Tom de Vries <tom_devr...@mentor.com> writes:
> On 02/26/2018 12:00 PM, Matthew Fortune wrote:
> > Tom de Vries <tom_devr...@mentor.com> writes:
> >> On 01/08/2018 05:32 PM, Tom de Vries wrote:
> >>> On 12/18/2017 05:57 PM, Vladimir Makarov wrote:
> >>>>
> >>>>
> >>>> On 12/15/2017 06:25 AM, Tom de Vries wrote:
> >>>>>
> >>>>> Proposed Solution:
> >>>>>
> >>>>> The patch addresses the problem, by:
> >>>>> - marking the hard regs that have been used in lra_spill in
> >>>>>    hard_regs_spilled_into
> >>>>> - using hard_regs_spilled_into in lra_create_live_ranges to
> >>>>>    make sure those registers are marked in the conflict_hard_regs
> >>>>>    of pseudos that overlap with the spill register usage
> >>>>>
> >>>>> [ I've also tried an approach where I didn't use
> >>>>> hard_regs_spilled_into, but tried to propagate all hard regs. I
> >>>>> figured out that I needed to mask out eliminable_regset.  Also I
> >>>>> needed to masked out lra_no_alloc_regs, but that could be due to
> >>>>> gcn-specific problems (pointers take 2 hard regs), I'm not yet
> sure.
> >>>>> Anyway, in the submitted patch I tried to avoid these problems and
> >>>>> went for the more minimal approach. ]
> >>>>>
> >>>> Tom, thank you for the detail explanation of the problem and
> >>>> solutions you considered.  It helped me a lot.  Your simple
> >>>> solution is adequate as the most transformations and allocation are
> >>>> done on the 1st LRA subpasses iteration.
> >>>>> In order to get the patch accepted for trunk, I think we need:
> >>>>> - bootstrap and reg-test on x86_64
> >>>>> - build and reg-test on mips (the only primary platform that has
> >>>>> the
> >>>>>    spill_class hook enabled)
> >>>>>
> >>>>> Any comments?
> >>>>
> >>>> The patch looks ok to me.  You can commit it after successful
> >>>> testing on x86-64 and mips but I am sure there will be no problems
> >>>> with
> >>>> x86-64 as it does not use spill_class currently (actually your
> >>>> patch might help to switch it on again for x86-64.  spill_class was
> >>>> quite useful for x86-64 performance on Intel processors).
> >>>>
> >>>
> >>> Hi Matthew,
> >>>
> >>> there's an lra optimization that is currently enabled for MIPS, and
> >>> not for any other primary or secondary target.
> >>>
> >>> This (already approved) patch fixes a bug in that optimization, and
> >>> needs to be tested on MIPS.
> >>>
> >>> Unfortunately, the optimization is only enabled for MIPS16, and we
> >>> don't have a current setup to test this.
> >>>
> >>> Could you help us out here and test this patch for MIPS16 on trunk?
> >>
> >> Hi Matthew,
> >>
> >> is this something you can help us out with?
> >
> > Hi Tom,
> >
> > I've just commented on the bug report that I've set of some builds to
> > try and give some assurance. It is far from comprehensive but it is as
> > good as the normal testing I do for MIPS16.
> >
> 
> Hi Matthew,
> 
> Awesome, thanks for the help.

I'm afraid my lack of attention has shown that there appears to be a
bug preventing libstdc++ building with MIPS16 that was introduced
somewhere between Oct 9th and Oct 10th. 47 commits left to bisect.

Matthew


RE: [PATCH, PR83327] Fix liveness analysis in lra for spilled-into hard regs

2018-02-26 Thread Matthew Fortune
Tom de Vries  writes:
> On 01/08/2018 05:32 PM, Tom de Vries wrote:
> > On 12/18/2017 05:57 PM, Vladimir Makarov wrote:
> >>
> >>
> >> On 12/15/2017 06:25 AM, Tom de Vries wrote:
> >>>
> >>> Proposed Solution:
> >>>
> >>> The patch addresses the problem, by:
> >>> - marking the hard regs that have been used in lra_spill in
> >>>   hard_regs_spilled_into
> >>> - using hard_regs_spilled_into in lra_create_live_ranges to
> >>>   make sure those registers are marked in the conflict_hard_regs
> >>>   of pseudos that overlap with the spill register usage
> >>>
> >>> [ I've also tried an approach where I didn't use
> >>> hard_regs_spilled_into, but tried to propagate all hard regs. I
> >>> figured out that I needed to mask out eliminable_regset.  Also I
> >>> needed to masked out lra_no_alloc_regs, but that could be due to
> >>> gcn-specific problems (pointers take 2 hard regs), I'm not yet sure.
> >>> Anyway, in the submitted patch I tried to avoid these problems and
> >>> went for the more minimal approach. ]
> >>>
> >> Tom, thank you for the detail explanation of the problem and
> >> solutions you considered.  It helped me a lot.  Your simple solution
> >> is adequate as the most transformations and allocation are done on
> >> the 1st LRA subpasses iteration.
> >>> In order to get the patch accepted for trunk, I think we need:
> >>> - bootstrap and reg-test on x86_64
> >>> - build and reg-test on mips (the only primary platform that has the
> >>>   spill_class hook enabled)
> >>>
> >>> Any comments?
> >>
> >> The patch looks ok to me.  You can commit it after successful testing
> >> on x86-64 and mips but I am sure there will be no problems with
> >> x86-64 as it does not use spill_class currently (actually your patch
> >> might help to switch it on again for x86-64.  spill_class was quite
> >> useful for x86-64 performance on Intel processors).
> >>
> >
> > Hi Matthew,
> >
> > there's an lra optimization that is currently enabled for MIPS, and
> > not for any other primary or secondary target.
> >
> > This (already approved) patch fixes a bug in that optimization, and
> > needs to be tested on MIPS.
> >
> > Unfortunately, the optimization is only enabled for MIPS16, and we
> > don't have a current setup to test this.
> >
> > Could you help us out here and test this patch for MIPS16 on trunk?
> 
> Hi Matthew,
> 
> is this something you can help us out with?

Hi Tom,

I've just commented on the bug report that I've set of some builds to try
and give some assurance. It is far from comprehensive but it is as good as
the normal testing I do for MIPS16.

Thanks,
Matthew


RE: [patch] tweak gcc.target/mips/msa.c options

2017-11-03 Thread Matthew Fortune
Hi Sandra,

> The testcase gcc.target/mips/msa.c gives dozens of FAILs if it's tested
> with a GCC configured to default to -fno-common, because of patterns
> like
> 
> /* { dg-final { scan-assembler-times "\t.comm\tv16i8_\\d+,16,16" 3 } }
> */
> 
> Seems like the simplest solution is to force -fcommon for this test.
> OK?

Looks like a good choice to me. The test has too much expected output to
cope with -fno-common as well and it does not degrade the value of the test.

Please go ahead.

Thanks,
Matthew


RE: [PATCH 13/13] fix MIPS files

2017-10-29 Thread Matthew Fortune
Jim Wilson  writes:
> Hand tested to verify that I didn't accidentally break passing -g to
> the assembler.

In case you are waiting on an OK for the MIPS part... this is fine.

Thanks,
Matthew


RE: [PATCH] Add 'short_call' attribute for MIPS targets

2017-09-12 Thread Matthew Fortune
Simon Atanasyan <si...@atanasyan.com> writes:
> On Mon, Sep 11, 2017 at 03:26:52PM +, Matthew Fortune wrote:
> > Simon Atanasyan <si...@atanasyan.com> writes:
> > > Here is the updated patch with chnaged e-mail address and fixed
> > > indentation issues:
> > > -8<
> > > Currently GCC supports 'long_call', 'far', and 'near' attributes.
> > > The 'long_call' and 'far' attributes are synonyms. This patch adds
> > > support for the 'short_call' attribute as a synonym for `near` to
> > > make this list complete, consistent with other targets, and
> > > compatible with attributes supported by the Clang.
> > >
> > > Tested on mipsel-linux-gnu.
> > >
> > > 2017-08-18  Simon Atanasyan  <simon.atanas...@imgtec.com>
> > >
> > > gcc/
> > >   * config/mips/mips.c (mips_attribute_table): Add 'short_call'
> > >   attribute.
> > >   (mips_near_type_p): Add 'short_call' attribute as a synonym
> > >   for 'near'.
> > >   * doc/extend.texi (short_call): Document new function attribute.
> > >
> > > gcc/testsuite
> > >
> > >   * gcc.target/mips/near-far-1.c: Add check for 'short_call'
> > >   attribute.
> > >   * gcc.target/mips/near-far-2.c: Likewise.
> > >   * gcc.target/mips/near-far-3.c: Likewise.
> > >   * gcc.target/mips/near-far-4.c: Likewise.
> >
> > OK to commit, thanks.
> 
> Thanks for review. I do not have commit access. Could you commit the
> patch?

Sure, committed as r252006.

Matthew


RE: [PATCH] Add 'short_call' attribute for MIPS targets

2017-09-11 Thread Matthew Fortune
Simon Atanasyan  writes:
> Here is the updated patch with chnaged e-mail address and fixed
> indentation issues:
> -8<
> Currently GCC supports 'long_call', 'far', and 'near' attributes. The
> 'long_call' and 'far' attributes are synonyms. This patch adds support
> for the 'short_call' attribute as a synonym for `near` to make this list
> complete, consistent with other targets, and compatible with attributes
> supported by the Clang.
> 
> Tested on mipsel-linux-gnu.
> 
> 2017-08-18  Simon Atanasyan  
> 
> gcc/
>   * config/mips/mips.c (mips_attribute_table): Add 'short_call'
>   attribute.
>   (mips_near_type_p): Add 'short_call' attribute as a synonym
>   for 'near'.
>   * doc/extend.texi (short_call): Document new function attribute.
> 
> gcc/testsuite
> 
>   * gcc.target/mips/near-far-1.c: Add check for 'short_call'
>   attribute.
>   * gcc.target/mips/near-far-2.c: Likewise.
>   * gcc.target/mips/near-far-3.c: Likewise.
>   * gcc.target/mips/near-far-4.c: Likewise.

OK to commit, thanks.

Matthew


RE: [PATCH] Add 'short_call' attribute for MIPS targets

2017-09-11 Thread Matthew Fortune
Simon Atanasyan  writes:
> Currently GCC supports 'long_call', 'far', and 'near' attributes. The
> 'long_call' and 'far' attributes are synonyms. This patch adds support
> for the 'short_call' attribute as a synonym for `near` to make this list
> complete, consistent with other targets, and compatible with attributes
> supported by the Clang.

Sorry for the slow reply. Thanks for keeping GCC and Clang in sync.

> Tested on mipsel-linux-gnu.
> 
> 2017-08-22  Simon Atanasyan  

As you do not have a personal copyright assignment on file please can you
resubmit using your @imgtec.com address and use that address for the
changelog. There are just a couple of indentation issues below as well.

> gcc/
> 
> * config/mips/mips.c (mips_attribute_table): Add 'short_call' attribute.
> (mips_near_type_p): Add 'short_call' attribute as a synonym for 'near'.
> * doc/extend.texi (short_call): Document new function attribute.
> 
> gcc/testsuite/
> 
> * gcc.target/mips/near-far-1.c: Add check for 'short_call' attribute.
> * gcc.target/mips/near-far-2.c: Likewise.
> * gcc.target/mips/near-far-3.c: Likewise.
> * gcc.target/mips/near-far-4.c: Likewise.

There should be one tab before the lines in each entry * ,
lines here should be wrapped at 74 cols.

> 
> Index: gcc/config/mips/mips.c
> ===
> --- gcc/config/mips/mips.c (revision 251219)
> +++ gcc/config/mips/mips.c (working copy)
> @@ -598,6 +598,7 @@ static const struct attribute_spec mips_attribute_
>/* { name, min_len, max_len, decl_req, type_req, fn_type_req,
> handler,
> om_diagnostic } */
>{ "long_call",   0, 0, false, true,  true,  NULL, false },
> +  { "short_call",  0, 0, false, true,  true,  NULL, false },
>{ "far",   0, 0, false, true,  true,  NULL, false },
>{ "near",0, 0, false, true,  true,  NULL, false },
>/* We would really like to treat "mips16" and "nomips16" as type
> @@ -1171,13 +1172,14 @@ mflip_mips16_use_mips16_p (tree decl)
>return *slot;
>  }
> 
> -/* Predicates to test for presence of "near" and "far"/"long_call"
> +/* Predicates to test for presence of "near"/"short_call" and
> "far"/"long_call"
> attributes on the given TYPE.  */
> 
>  static bool
>  mips_near_type_p (const_tree type)
>  {
> -  return lookup_attribute ("near", TYPE_ATTRIBUTES (type)) != NULL;
> +  return (lookup_attribute ("short_call", TYPE_ATTRIBUTES (type)) !=
> NULL
> +  || lookup_attribute ("near", TYPE_ATTRIBUTES (type)) != NULL);

The || here should be lined up under the 'l' of lookup_attribute above
which will be 1-tab 2-space to save you figuring out the indent rules
for GCC again. The space indents in the testcases below are fine as
they match the existing lines.

If you can repost with your @imgtec address then I'll approve that
to commit.

Thanks,
Matthew


RE: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)

2017-07-25 Thread Matthew Fortune
Jakub Jelinek  writes:
> Bootstrapped/regtested on x86_64-linux and i686-linux, where it improves
> e.g. the code generation for slp-43.c and slp-45.c testcases.
> make cc1 tested in cross-compilers to the remaining targets.

No objections for the MIPS part. I've pointed out this change to Sameera to
see how/if it will affect her autovectorization branch and whether MIPS MSA
should define more forms of vec_init/vec_expand in general.

Thanks,
Matthew



RE: Add support for use_hazard_barrier_return function attribute

2017-07-06 Thread Matthew Fortune
Maciej Rozycki  writes:
> On Fri, 23 Jun 2017, Prachi Godbole wrote:
> 
> > Index: gcc/config/mips/mips.md
> > ===
> > --- gcc/config/mips/mips.md (revision 246899)
> > +++ gcc/config/mips/mips.md (working copy)
> > @@ -6578,6 +6581,20 @@
> >[(set_attr "type""jump")
> > (set_attr "mode""none")])
> >
> > +;; Insn to clear execution and instruction hazards while returning.
> > +;; However, it doesn't clear hazards created by the insn in its delay
> slot.
> > +;; Thus, explicitly place a nop in its delay slot.
> > +
> > +(define_insn "mips_hb_return_internal"
> > +  [(return)
> > +   (unspec_volatile [(match_operand 0 "pmode_register_operand" "")]
> > +   UNSPEC_JRHB)]
> > +  ""
> > +  {
> > +return "%(jr.hb\t$31%/%)";
> > +  }
> > +  [(set_attr "insn_count" "2")])
> > +
> >  ;; Normal return.
> >
> >  (define_insn "_internal"
> 
>  Nothing wrong with your proposed change, but overall I wonder if (as a
> follow-up change) we could find a nonintrusive way to have this pattern
> (and `clear_hazard_' as well) produce JRS.HB rather than JR.HB in
> microMIPS compilations, as using the 32-bit delay-slot NOP encoding
> where the 16-bit one would do is obviously a tiny, but completely
> unnecessary code space loss (and we do care about code space losses in
> microMIPS compilations; conserving space is the very purpose of the
> microMIPS ISA after all).
> 
>  Of course it wouldn't do if we rewrote the instruction pattern as
> "%(jr%!.hb\t$31%/%)" here, because the NOP that follows would have to
> come from an RTL instruction for `%!' to have any effect.  But perhaps
> we could emit RTL instead somehow rather than hardcoding the NOP with
> `%/'?

I think this case is so specialist we can safely just switch to writing
out the NOP directly in the output pattern just keeping the %(%) for
noreorder. This code will have to be reworked with microMIPSR6 when
submitted so it can be handled then; good spot to use jrs.hb.

Thanks,
Matthew


RE: Add support for use_hazard_barrier_return function attribute

2017-07-06 Thread Matthew Fortune
Prachi Godbole  writes:
> Please find the updated patch below. I hope I've covered everything.
> I've added the test for inline restriction, could you check if I got all
> the options correct?

I think the test is probably good enough. It is a little too forgiving due
to handling the indirect call case to foo which could just detect an
indirect call from foo to bar (the placement of a scan-assembler in the
.c file has no impact on where in the generated output it will match in
the corresponding .s). Given that the test would fail appropriately on
a bare metal configuration (which is where this is likely to be most
useful) then I think that is sufficient.

Watch out for the long lines in comments. There is one that is hitting
80cols noted below to tweak before committing.

> Changelog:
> 
> 2017-06-23  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips.h (machine_function): New variable
>   use_hazard_barrier_return_p.
>   * config/mips/mips.md (UNSPEC_JRHB): New unspec.
>   (mips_hb_return_internal): New insn pattern.
>   * config/mips/mips.c (mips_attribute_table): Add attribute
>   use_hazard_barrier_return.
>   (mips_use_hazard_barrier_return_p): New static function.
>   (mips_function_attr_inlinable_p): Likewise.
>   (mips_compute_frame_info): Set use_hazard_barrier_return_p.  Emit error
>   for unsupported architecture choice.
>   (mips_function_ok_for_sibcall, mips_can_use_return_insn): Return false
>   for use_hazard_barrier_return.
>   (mips_expand_epilogue): Emit hazard barrier return.
>   * doc/extend.texi: Document use_hazard_barrier_return.
> 
> gcc/testsuite/
>   * gcc.target/mips/hazard-barrier-return-attribute.c: New tests.

OK to commit.

> ===
> --- gcc/config/mips/mips.c(revision 246899)
> +++ gcc/config/mips/mips.c(working copy)
> @@ -7863,6 +7889,17 @@ mips_function_ok_for_sibcall (tree decl, tree exp
>&& !targetm.binds_local_p (decl))
>  return false;
> +  /* Functions that need to return with a hazard barrier cannot sibcall 
> because:

Long line for a comment above.

> +
> + 1) Hazard barriers are not possible for direct jumps
> +
> + 2) Despite an indirect jump with hazard barrier being possible we do
> + not use it so that the logic for generating a hazard barrier jump
> + can be contained within the epilogue handling.  */
> +
> +  if (mips_use_hazard_barrier_return_p (current_function_decl))
> +return false;
> +
>/* Otherwise OK.  */
>return true;
>  }

Thanks for the new feature!

Matthew


RE: Add support for use_hazard_barrier_return function attribute

2017-06-14 Thread Matthew Fortune
Prachi Godbole  writes:
> Changelog:
> 
> 2017-04-25  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips.h (machine_function): New variable
>   use_hazard_barrier_return_p.
>   * config/mips/mips.md (UNSPEC_JRHB): New unspec.
>   (mips_hb_return_internal): New insn pattern.
>   * config/mips/mips.c (mips_attribute_table): Add attribute
>   use_hazard_barrier_return.
>   (mips_use_hazard_barrier_return_p): New static function.
>   (mips_function_attr_inlinable_p): Likewise.
>   (mips_compute_frame_info): Set use_hazard_barrier_return_p.  Emit
> error
>   for unsupported architecture choice.
>   (mips_function_ok_for_sibcall, mips_can_use_return_insn): Return
> false
>   for use_hazard_barrier_return.
>   (mips_expand_epilogue): Emit hazard barrier return.
>   * doc/extend.texi: Document use_hazard_barrier_return.
> 
> gcc/testsuite/
>   * gcc.target/mips/hazard-barrier-return-attribute.c: New test.

Sorry for not getting back to this after stage1 opened.

This looks like a great idea.  I'm slightly concerned about what will
happen if code uses this attribute and is built by older tools.  The
problem being that it will just get a warning and that may or may not
be strong enough for a user to notice they did not get a jr.hb and
then go on to get weird runtime failures.

That said we don't have this kind of feature detection for all the new
attributes relating to interrupts so perhaps we just leave this to
-Werror and/or configure time detection of the feature.

No major issues with this just a little cleanup...

> Index: gcc/doc/extend.texi
> ===
> --- gcc/doc/extend.texi   (revision 246899)
> +++ gcc/doc/extend.texi   (working copy)
> @@ -4496,6 +4496,12 @@ On MIPS targets, you can use the
> @code{nocompressi
>  to locally turn off MIPS16 and microMIPS code generation.  This attribute
>  overrides the @option{-mips16} and @option{-mmicromips} options on the
>  command line (@pxref{MIPS Options}).
> +
> +@item use_hazard_barrier_return
> +@cindex @code{use_hazard_barrier_return} function attribute, MIPS
> +This function attribute instructs the compiler to generate hazard barrier 
> return

Missing an 'a':

... generate a hazard barrier return...

Documentation normally wraps at 74 columns which makes these lines a
bit long.

> +that clears all execution and instruction hazards while returning, instead of
> +generating a normal return instruction.
>  @end table
> 
>  @node MSP430 Function Attributes
> Index: gcc/config/mips/mips.md
> ===
> --- gcc/config/mips/mips.md   (revision 246899)
> +++ gcc/config/mips/mips.md   (working copy)
> @@ -156,6 +156,7 @@
> 
>;; The `.insn' pseudo-op.
>UNSPEC_INSN_PSEUDO
> +  UNSPEC_JRHB

Add a comment on what the unspec is.

>  ])
> 
>  (define_constants
> @@ -6578,6 +6579,20 @@
>[(set_attr "type"  "jump")
> (set_attr "mode"  "none")])
> 
> +;; Insn to clear execution and instruction hazards while returning.
> +;; However, it doesn't clear hazards created by the insn in its delay slot.
> +;; Thus, explicitly place a nop in its delay slot.
> +
> +(define_insn "mips_hb_return_internal"
> +  [(return)
> +   (unspec_volatile [(match_operand 0 "pmode_register_operand" "")]
> + UNSPEC_JRHB)]
> +  ""
> +  {
> +return "%(jr.hb\t$31%/%)";
> +  }
> +  [(set_attr "insn_count" "2")])
> +
>  ;; Normal return.
> 
>  (define_insn "_internal"
> Index: gcc/config/mips/mips.c
> ===
> --- gcc/config/mips/mips.c(revision 246899)
> +++ gcc/config/mips/mips.c(working copy)
> @@ -615,6 +615,7 @@ static const struct attribute_spec mips_attribute_
>  mips_handle_use_shadow_register_set_attr, false },
>{ "keep_interrupts_masked",0, 0, false, true,  true, NULL, false },
>{ "use_debug_exception_return", 0, 0, false, true,  true, NULL, false },
> +  { "use_hazard_barrier_return", 0, 0, true, false, false, NULL, false },
>{ NULL,   0, 0, false, false, false, NULL, false }
>  };
> 
> 
> @@ -1275,6 +1276,16 @@ mips_use_debug_exception_return_p (tree type)
>  TYPE_ATTRIBUTES (type)) != NULL;
>  }
> 
> +/* Check if the attribute to use hazard barrier return is set for
> +   the function declaration DECL.  */
> +
> +static bool
> +mips_use_hazard_barrier_return_p (tree decl)

Make this a const_tree so you don't have to const_cast.

> +{
> +  return lookup_attribute ("use_hazard_barrier_return",
> + DECL_ATTRIBUTES (decl)) != NULL;
> +}
> +

DECL_ATTRIBUTES is indented 1 space too many.

>  /* Return the set of compression modes that are explicitly required
> by the attributes in ATTRIBUTES.  */
> 
> @@ -1460,6 +1471,21 @@ mips_can_inline_p (tree caller, tree callee)
>return 

RE: [PATCH] MIPS16/GCC: Emit bounds checking as RTL in `casesi'

2017-06-12 Thread Matthew Fortune
Maciej Rozycki  writes:
>  Further to my changes made last November here is an update to the MIPS16
> `casesi' pattern making it emit bounds checking as RTL rather than having
> it as hardcoded assembly within the `casesi_internal_mips16_'
> dispatcher.  See below for how PR tree-optimization/51513 has prevented me
> from proceeding back then.
> 
>  This new arrangement has several advantages:
> 
> 1. There is no hardcoded BTEQZ branch instruction that has a limited span
>and can overflow causing an assembly failure if the target label is too
>distant.  GCC is able to relax out of range MIPS16 branches these days,
>but obviously they have to be expressed in RTL rather than buried in
>assembly code.  This overflow can be easily reproduced; please enquire
>for a boring example if interested.
> 
> 2. The `casesi_internal_mips16_' pattern now has an accurate length
>(aka instruction count) recorded as all its remaining emitted assembly
>instructions are known in advance to fit in their regular (16-bit)
>machine encodings.  Previously there was uncertainty about the SLTU and
>BTEQZ instructions used for the bounds check, which depending on their
>exact arguments could have each resulted in their either regular
>(16-bit) or extended (32-bit) encoding.  Consequently the worst length
>estimate was recorded instead, possibly causing worse code generation
>(e.g. premature out-of-range branch relaxation or jump table expansion
>from half-words to full words).
> 
> 3. GCC now has freedom to schedule code around bounds checking as it sees
>fit rather than having to adapt to the fixed assembly block.  In fact
>it tends to make use of it right away swapping BTEQZ for the BTNEZ
>instruction and rescheduling code such that the out-of-bounds (default)
>case executes linearly.
> 
> There are probably more benefits, but these are what has immediately come
> to my mind.

Sounds good, I certainly agree that we will see the benefits you list above.

>  As noted above I meant to propose this change along with the rest so as
> to have it in GCC 7, however emitting the bounds check as a separate RTL
> pattern triggered an unrelated bug, then unknown to me, causing a fatal
> code generation regression, and the lack of time did not allow me to
> investigate it further.  This was easily reproduced with a piece of code
> (reduced from actual Linux kernel code) like this:
> 
> $ cat frob.c
> int
> frob (int i)
> {
>   switch (i)
> {
> case -5:
>   return -2;
> case -3:
>   return -1;
> case 0:
>   return 0;
> case 3:
>   return 1;
> case 5:
>   break;
> default:
>   __builtin_unreachable ();
> }
>   return i;
> }
> 
> producing truncated assembly like this:
> 
> $ gcc -O2 -mips16 -mcode-readable=yes -dp -S frob.c
> $ cat frob.s
>   .file   1 "frob.c"
>   .section .mdebug.abi32
>   .previous
>   .nanlegacy
>   .module fp=32
>   .module nooddspreg
>   .abicalls
>   .option pic0
>   .text
>   .align  2
>   .weak   frob
>   .setmips16
>   .setnomicromips
>   .entfrob
>   .type   frob, @function
> frob:
>   .frame  $sp,0,$31   # vars= 0, regs= 0/0, args= 0, gp= 0
>   .mask   0x,0
>   .fmask  0x,0
>   addiu   $2,$4,5  # 11   *addsi3_mips16/7[length = 2]
>   .endfrob
>   .size   frob, .-frob
>   .ident  "GCC: (GNU) 7.0.0 20161117 (experimental)"
> $
> 
> -- where as you can see the whole switch statement has vanished along with
> any return path from the function, and only the preparatory addition
> emitted from `casesi' with:
> 
>   emit_insn (gen_addsi3 (reg, operands[0], offset));
> 
> remained.
> 
>  The causing bug has turned out to be what was filed as PR
> tree-optimization/51513 and has been kindly fixed by Peter recently
> (thanks, Peter!) with r247844 ("Fix PR51513, switch statement with default
> case containing __builtin_unreachable leads to wild branch"),
> , enabling me to
> proceed with this change without having to investigate the cause of code
> breakage -- which for the MIPS16 target has clearly turned out to be
> graver than a mere silly branch never to be executed.
> 
>  Given the previous troubles with this change I have decided to add
> MIPS16 test cases to verify that code truncation has not happened,
> complementing gcc.target/powerpc/pr51513.c, in case further tweaks in this
> area might do something bad.  This would be caught by
> gcc.target/mips/insn-casesi.c added with r242424, but that test case does
> not refer to PR tree-optimization/51513, so let's make it explicit.  With
> the PR tree-optimization/51513 fix removed the two new cases indeed cause:
> 
> FAIL: gcc.target/mips/pr51513-1.c   -O2   scan-assembler \tjrc?\t\\$31\n
> FAIL: 

RE: [testsuite]MIPS remove duplicate div-x test

2017-06-04 Thread Matthew Fortune
Hi Paul,

Paul Hua  writes:
> cc: Matthew.
> 
> ping.

Sorry a little slow on the reply.
 
> On Thu, Jun 1, 2017 at 3:35 PM, Paul Hua  wrote:
> > Hi,
> >
> > There are duplicate testcase in gcc.target/mips dir.
> >
> > div-5.c same as div-9.c.
> > div-6.c same as div-10.c.
> > div-7.c same as div-11.c.
> > div-8.c same as div-12.c.
> >
> > Is this deliberate?

I see no evidence of this being deliberate and has been like this since
the original commit.

> > Otherwise, the attached patch fixing this.
> >
> >
> > Paul.
> >
> > ***ChangeLog***
> >
> > 2017-06-01Chenghua Xu 
> >
> > Remove duplicate div-x testcase.

These kind of comments don't tend to go in a changelog.

> > * gcc.target/mips/div-9.c: Delete.

You could say "Delete duplicate test" here if you want though.

> > * gcc.target/mips/div-10.c: Ditto.
> > * gcc.target/mips/div-11.c: Ditto.
> > * gcc.target/mips/div-12.c: Ditto.

Otherwise OK. I can't remember if you have write access let me know
if you need it committing. Thanks for finding this.

Matthew


RE: [PATCH] Fix fixincludes for canadian cross builds - next try

2017-04-20 Thread Matthew Fortune
Bernd Edlinger  writes:
> This is my new attempt to clean up the different cross compiler
> configurations.  It turned out to be a very complicated matter,
> so I thought it would be better to postpone it to the stage1.
> 
> In a canadian cross compiler setup we have a different header dir path
> for use in the build and later on the target, which is written to
> install-tools/mkheaders.conf, so I propose to export SYSTEM_HEADER_DIR
> and BUILD_SYSTEM_HEADER_DIR from configure.ac to be used in Makefile.in.
> 
> I also removed unnecessary handling of --with-headers, because
> the headers are copied to sys-include and thus it is not necessary to
> use the original path here.
> 
> If --with-sysroot or --with-build-sysroot is used the SYSTEM_HEADER_DIR
> or BUILD_SYSTEM_HEADER_DIR contain $${sysroot_headers_suffix},
> which is normally an empty string, but on mips it may be something
> like "mips-r2" which gets appended to the sysroot for use of fixincludes
> but "target_header_dir" which is used in configure to find things like
> the GLIBC version it is not used.  I assume that that either does
> not create problems and is silently ignored, or that people have a
> work around, my patch should not change that, however I have not been
> able to setup a sysroot for mips*-img-linux* or mips*-mti-linux* which
> seem to be the only targets where this might make a difference.

I'll try to test this out for you with MIPS. I have changes I want to make
to further improve the cross builds for the mti and img vendor builds so
if there is a bit of fallout I could deal with it then.

Matthew


RE: [PATCH] MIPS: Prevent buffer overrun in uninitialised variable fix

2017-04-20 Thread Matthew Fortune
Jakub Jelinek <ja...@redhat.com> writes:
> On Thu, Apr 20, 2017 at 12:49:49PM +, Matthew Fortune wrote:
> > Hi Jeff,
> >
> > I missed a load of test failures while on vacation and just noticed
> > that the fix you did for a potentially uninitialized variable warning
> > is overwriting the stack and breaking MSA in MIPS.
> >
> > I guess you may have intended to set the length appropriately and only
> > zero the elements that would not be set in the loop:
> >
> > memset (_perm[nelt], 0, MAX_VECT_LEN - nelt);
> >
> > but I switched it to just zero initialise the whole array for
> > simplicity in the patch below.
> >
> > I thought I'd check with you before committing. Obviously I'd like to
> > apply this to GCC 7 branch as well.
> 
> As it is just 16, it is not a big deal either way, memsetting everything
> might be even faster.  If it would be bigger than that, the MAX_VECT_LEN
> - nelt case would be better.
> 
> > gcc/
> > * config/mips/mips.c (mips_expand_vec_perm_const): Re-fix
> > uninitialized variable warning to avoid buffer overrun.
> 
> Ok, thanks, even for GCC 7 branch.

Thanks Jakub. Committed on trunk and gcc-7-branch.

Matthew


[PATCH] MIPS: Prevent buffer overrun in uninitialised variable fix

2017-04-20 Thread Matthew Fortune
Hi Jeff,

I missed a load of test failures while on vacation and just noticed
that the fix you did for a potentially uninitialized variable warning
is overwriting the stack and breaking MSA in MIPS.

I guess you may have intended to set the length appropriately and
only zero the elements that would not be set in the loop:

memset (_perm[nelt], 0, MAX_VECT_LEN - nelt);

but I switched it to just zero initialise the whole array for
simplicity in the patch below.

I thought I'd check with you before committing. Obviously I'd like to
apply this to GCC 7 branch as well.

Thanks,
Matthew

gcc/
* config/mips/mips.c (mips_expand_vec_perm_const): Re-fix
uninitialized variable warning to avoid buffer overrun.
---
 gcc/ChangeLog  | 5 +
 gcc/config/mips/mips.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c908048..80d3436 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2017-04-20  Matthew Fortune  <matthew.fort...@imgtec.com>
+
+   * config/mips/mips.c (mips_expand_vec_perm_const): Re-fix
+   uninitialized variable warning to avoid buffer overrun.
+
 2017-04-20  Alexander Monakov  <amona...@ispras.ru>
 
PR other/71250
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index b35fba7..6bfd86a 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -21358,7 +21358,7 @@ mips_expand_vec_perm_const (rtx operands[4])
 
   /* This is overly conservative, but ensures we don't get an
  uninitialized warning on ORIG_PERM.  */
-  memset (_perm[nelt], 0, MAX_VECT_LEN);
+  memset (orig_perm, 0, MAX_VECT_LEN);
   for (i = which = 0; i < nelt; ++i)
 {
   rtx e = XVECEXP (sel, 0, i);
-- 
2.2.1



RE: Fix MIPS builds with current trunk

2017-04-04 Thread Matthew Fortune
Jeff Law  writes:
> All the MIPS compilers will fail to build using the trunk due to a
> couple minor uninitialized memory issues.
> 
> First in mips_multi_add we add an uninitialized mips_multi_member object
> to the mips_multi_members vec.  It's easy enough to just memset the new
> member.
> 
> Second in mips_expand_vec_perm_const the tail of the orig_perm may be
> uninitialized if the number of elements in the target object is smaller
> than MAX_VECT_LEN.  This is also trivial to fix by explicitly clearing
> the tail of the array.
> 
> Tested by verifying the MIPS targets in config-list.mk will build using
> a trunk compiler again.
> 
> Installed on the trunk.

Thanks Jeff, and apologies for missing the report.

Matthew


[patch, MIPS] Update -mvirt option description

2017-03-31 Thread Matthew Fortune
Hi Catherine,

I'm following up on PR target/80057 to update the description of -mvirt.

I agree with Roland that the description is inconsistent and should not
state 'application specific' as none of the other ASE options include
it. Instead I suggest we put the shortened form of (VZ) in like (XPA)
is shown for -mxpa. The short form of virtualization ASE is usually VZ
not VIRT which has always irritated me about this option name but that
is history.

What do you think?

gcc/
PR target/80057
* config/mips/mips.opt (-mvirt): Update description.

Thanks,
Matthew

diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 7c4b607..ced2432 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -406,7 +406,7 @@ Put uninitialized constants in ROM (needs -membedded-data).
 
 mvirt
 Target Report Var(TARGET_VIRT)
-Use Virtualization Application Specific instructions.
+Use Virtualization (VZ) instructions.
 
 mxpa
 Target Report Var(TARGET_XPA)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8d3821e..19a85b6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -19345,7 +19345,7 @@ Use (do not use) the MIPS Enhanced Virtual Addressing 
instructions.
 @itemx -mno-virt
 @opindex mvirt
 @opindex mno-virt
-Use (do not use) the MIPS Virtualization Application Specific instructions.
+Use (do not use) the MIPS Virtualization (VZ) instructions.
 
 @item -mxpa
 @itemx -mno-xpa
-- 
2.2.1



[patch,MIPS,committed] Fix extraction from odd-numbered MSA registers

2017-03-31 Thread Matthew Fortune
Another MSA related fix; this time relating to using -mno-odd-spreg.
This fixes a build-failure with gcc.c-torture/execute/20050604-1.c
when using -mabi=32 -mmsa -mno-odd-spreg.

The fix is to copy the whole vector from the odd-numbered source
register to the even numbered single-precision destination register
and then re-interpret the vector as single precision in the
destination. The mov.s is always eliminated as it is a trivial
no-op; this is permitted by the architecture as the single-precision
and double-precision registers overlay with the 0th element of the
MSA registers. Also, trivial no-ops were already being generated
and eliminated prior to this change when source and destination
register numbers happened to be the same.

No additional test failures introduced when running the testsuite
with -mmsa. The fix is covered by a pre-existing test so no new
testcase added.

gcc/
* config/mips/mips-msa.md (msa_vec_extract_): Update
extraction from odd-numbered MSA register

Committed.

Thanks,
Matthew


diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index accb8de..c80be47 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -366,7 +366,20 @@ (define_insn_and_split "msa_vec_extract_"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 1))]
-  "operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));"
+{
+  /* An MSA register cannot be reinterpreted as a single precision
+ register when using -mno-odd-spreg and the MSA register is
+ an odd number.  */
+  if (mode == SFmode && !TARGET_ODD_SPREG
+  && (REGNO (operands[1]) & 1))
+{
+  emit_move_insn (gen_rtx_REG (mode, REGNO (operands[0])),
+ operands[1]);
+  operands[1] = operands[0];
+}
+  else
+operands[1] = gen_rtx_REG (mode, REGNO (operands[1]));
+}
   [(set_attr "move_type" "fmove")
(set_attr "mode" "")])


[patch,MIPS,committed] Fix ICE when expanding MSA constant vectors

2017-03-30 Thread Matthew Fortune
This is a fix for a compile failure in gcc.c-torture/compile/pr70240.c
with options -O1 -mmsa.

The expand code for replicated constant vectors with small immediate
values was simply wrong and would never work. This code is not used in the
simple case of initialising a variable with a constant vector from C code; so
focussed test is non-trivial. Since an existing test triggers the issue
then we have basic coverage in place. The whole mips_expand_vector_init
function needs a review and possibly a rewrite as on reading through the
code I see lots of, at least, confusing logic in there which seems
unnecessary.

gcc/
* config/mips/mips.c (mips_expand_vector_init): Create
a const_vector to initialise a vector register instead of
using a const_int.

Committed.

Thanks,
Matthew

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index d1deb52..dadfcc4 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -21757,11 +21757,12 @@ mips_expand_vector_init (rtx target, rtx vals)
case V8HImode:
case V4SImode:
case V2DImode:
- emit_move_insn (target, same);
+ temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0));
+ emit_move_insn (target, temp);
  return;
 
default:
- break;
+ gcc_unreachable ();
}
}
  temp = gen_reg_rtx (imode);


[PATCH,MIPS,committed] Fix pr52125.c test when built as -mno-abicalls -mabi=64

2017-03-30 Thread Matthew Fortune
pr52125.c verifies that orphaned %hi relocs are deleted if they feed
an inline asm statement that never emits the %lo part. Orphaned %hi
relocs are only strictly a problem for o32 but are eliminated for
any ABI as long as 32-bit addressing is in use so force -msym32 as well
as require absolute addressing. This is necessary because -msym32 is
only applied for n64 as part of downgrading -mabicalls to absolute
addressing. I.e. this test was broken for bare metal n64 configs.

gcc/testsuite/
* gcc.target/mips/pr52125.c: Add -msym32.

diff --git a/gcc/testsuite/gcc.target/mips/pr52125.c 
b/gcc/testsuite/gcc.target/mips/pr52125.c
index 2ac8067..46df940 100644
--- a/gcc/testsuite/gcc.target/mips/pr52125.c
+++ b/gcc/testsuite/gcc.target/mips/pr52125.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mno-gpopt addressing=absolute" } */
+/* { dg-options "-mno-gpopt -msym32 addressing=absolute" } */

 int a, b, c, d;



RE: [PATCH,testsuite] Skip pic-3,4.c and pie-3,4.c for mips*-*-linux-*.

2017-03-27 Thread Matthew Fortune
Toma Tabacu  writes:
> The pic-3,4.c and pie-3,4.c tests are failing for some configurations of
> mips*-*-linux-*.
> 
> This is because PIC is always on for MIPS Linux by default, except when the
> compiler is built with --with-mips-plt, in which case PIC is on by default 
> only
> for the n64 ABI, because in this case -mplt "has no effect without '-msym32'",
> to quote the documentation, so -mplt is not passed by default.
> 
> Richard Sandiford also talked about this in the summary of a patch which was
> skipping a test for mips*-*-linux-* because of PIC:
> https://gcc.gnu.org/ml/gcc-patches/2009-01/msg00801.html
> 
> Therefore, I think the most reasonable solution would be to just skip these
> tests for mips*-*-linux-*. The patch below does so.
> 
> Tested with mips-mti-linux-gnu.
> 
> Catherine, Matthew what do you think ?

Given the acceptance that MIPS PIC behaviour is semi-unique then checking MIPS
complies with the rules around pre-processor behaviour doesn't add much value.
I'm happy to skip these tests on the basis that software can't make the same
assumptions about MIPS and __PIC__ as other architectures do.

OK to commit.

Thanks,
Matthew



RE: [PATCH,testsuite] Skip gcc.dg/pic-2.c and gcc.dg/pie-2.c for MIPS.

2017-03-15 Thread Matthew Fortune
Toma Tabacu  writes:
> The gcc.dg/pic-2.c and gcc.dg/pie-2.c tests are failing for MIPS targets
> because __PIC__ is always set to 1 for MIPS.
> 
> This patch makes the testsuite skip those two tests for all MIPS
> targets.
> 
> Tested with mips-mti-elf and mips-mti-linux-gnu.
> 
> Should I have fixed this in target-supports.exp instead ?
> I was worried that doing so would complicate the fpic and pie effective
> target checks too much.

I think the skip is OK here. I'd like to get Catherine's opinion on
this though too. I don't think we should change the definition of __PIC__
for -fPIC on MIPS as multi-got solves 'most' issues. If we start trying to
figure out what __PIC__ should mean on MIPS then we will get into a big
mess with -mxgot as that is arguably __PIC__==2 but I expect there will be
several differing opinions.

Thanks,
Matthew

> 
> Regards,
> Toma
> 
> gcc/testsuite/
> 
>   * gcc.dg/pic-2.c: Skip for MIPS.
>   * gcc.dg/pie-2.c: Skip for MIPS.
> 
> diff --git a/gcc/testsuite/gcc.dg/pic-2.c b/gcc/testsuite/gcc.dg/pic-2.c
> index 59ce8e2..bccec13 100644
> --- a/gcc/testsuite/gcc.dg/pic-2.c
> +++ b/gcc/testsuite/gcc.dg/pic-2.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-require-effective-target fpic } */
>  /* { dg-options "-fPIC" } */
> +/* { dg-skip-if "__PIC__ is always 1 for MIPS" { mips*-*-* } } */
> 
>  #if __PIC__ != 2
>  # error __PIC__ is not 2!
> diff --git a/gcc/testsuite/gcc.dg/pie-2.c b/gcc/testsuite/gcc.dg/pie-2.c
> index 7bdc4ac..1838745 100644
> --- a/gcc/testsuite/gcc.dg/pie-2.c
> +++ b/gcc/testsuite/gcc.dg/pie-2.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-fPIE" } */
>  /* { dg-require-effective-target pie } */
> +/* { dg-skip-if "__PIC__ is always 1 for MIPS" { mips*-*-* } } */
> 
>  #if __PIC__ != 2
>  # error __PIC__ is not 2!



RE: [PATCH] Fix MIPS-specific ICE in gcc.dg/pr77834.c (PR rtl-optimization/79150).

2017-03-15 Thread Matthew Fortune
Hi Catherine,

What is your opinion on this patch? I am OK with the temporary
workaround on the basis that the additional nop is successfully
eliminated so there is no code size increase. Also, I am happy
enough that the CFG is fixed very soon after the block move is
expanded so the 'bad' basic block is fixed almost immediately
anyway making the offending check potentially unnecessary in
the first place.

Thanks,
Matthew

> -Original Message-
> From: Toma Tabacu
> Sent: 07 March 2017 11:44
> To: gcc-patches@gcc.gnu.org
> Cc: Matthew Fortune; Segher Boessenkool (seg...@kernel.crashing.org)
> Subject: [PATCH] Fix MIPS-specific ICE in gcc.dg/pr77834.c (PR rtl-
> optimization/79150).
> 
> Hi,
> 
> This ICE is caused by "gcc_assert (!JUMP_P (last))" in
> commit_one_edge_insertion (gcc/cfgrtl.c):
> 
>   if (returnjump_p (last))
> {
>   /* ??? Remove all outgoing edges from BB and add one for EXIT.
>This is not currently a problem because this only happens
>for the (single) epilogue, which already has a fallthru edge
>to EXIT.  */
> 
>   e = single_succ_edge (bb);
>   gcc_assert (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
> && single_succ_p (bb) && (e->flags & EDGE_FALLTHRU));
> 
>   e->flags &= ~EDGE_FALLTHRU;
>   emit_barrier_after (last);
> 
>   if (before)
>   delete_insn (before);
> }
>   else
> gcc_assert (!JUMP_P (last));
> 
> The assert is triggered because we're removing an edge which ends on a
> conditional jump, which is part of a loop.
> 
> The reason for the existence of this loop is the size of the vector used
> in gcc.dg/pr77834.c.
> When compiling code which uses vectors for MIPS, a looping memcpy is
> generated if the vectors are big enough (>32 bytes for MIPS32 or >64
> bytes for MIPS64).
> This takes place in mips_expand_block_move (gcc/config/mips.c).
> 
> In our case, a looping memcpy gets generated for a partition copy which
> is inserted on an edge (in insert_partition_copy_on_edge, gcc/tree-
> outof-ssa.c).
> This happens during PHI node elimination, which is done by eliminate_phi
> (gcc/tree-outof-ssa.c), as a part of expand_phi_nodes, which is called
> by the expand pass (gcc/cfgexpand.c).
> 
> My original fix was to update the CFG by calling
> find_many_sub_basic_blocks with an all-one block bitmap (which also
> happens in cfgexpand.c, after edge
> removal) whenever an edge contains an insn which satisfies
> control_flow_insn_p.
> However, Segher Boessenkool said that this would be too risky for stage
> 4 and suggested inserting a NOP after the conditional jump in order to
> fool the assert. This assumes that it is safe to delay the CFG update
> for this particular case.
> 
> This patch changes mips_block_move_loop to emit a NOP after the
> conditional jump, if the jump is the last insn of the loop. This
> prevents "gcc_assert (!JUMP_P (last))" from triggering.
> 
> The NOP will never make it into the final assembly code because it is
> removed during the cse1 pass through DCE for -O1, -O2, and -O3, and it's
> not even emitted for -O0 and -Os because looping memcpy's are not
> generated for those optimization levels, as can be seen in
> mips_expand_block_move from mips.c:
> 
>   if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
>   {
> mips_block_move_straight (dest, src, INTVAL (length));
> return true;
>   }
>   else if (optimize)
>   {
> mips_block_move_loop (dest, src, INTVAL (length),
>   MIPS_MAX_MOVE_BYTES_PER_LOOP_ITER);
> return true;
>   }
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma Tabacu
> 
> gcc/
> 
>   * config/mips/mips.c (mips_block_move_loop): Emit a NOP after the
>   conditional jump, if the jump is the last insn of the loop.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> 4e13fbe..43e719f 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -8098,6 +8098,9 @@ mips_block_move_loop (rtx dest, rtx src,
> HOST_WIDE_INT length,
>/* Mop up any left-over bytes.  */
>if (leftover)
>  mips_block_move_straight (dest, src, leftover);
> +  else
> +/* Temporary fix for PR79150.  */
> +emit_insn (gen_nop ());
>  }
> 
>  /* Expand a movmemsi instruction, which copies LENGTH bytes from



[PATCH,committed] FIx MIPS lxc1-sxc1 tests for soft-float configurations

2017-03-09 Thread Matthew Fortune
The lxc1-sxc1 tests fundamentally require hard-float to test the
relevant instruction usage.  This patch adds -mhard-float to force
it on all configurations.

gcc/testsuite/

* gcc.target/mips/lxc1-sxc1-1.c: Use -mhard-float.
* gcc.target/mips/lxc1-sxc1-2.c: Likewise.

Committed to trunk.

Matthew

diff --git a/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c 
b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
index f455eb8..7272258 100644
--- a/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
+++ b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
@@ -1,4 +1,4 @@
-/* { dg-options "(HAS_LXC1) -mno-lxc1-sxc1" } */
+/* { dg-options "(HAS_LXC1) -mhard-float -mno-lxc1-sxc1" } */
 /* { dg-final { scan-assembler-not "\tldxc1\t" } } */
 /* { dg-final { scan-assembler-not "\tsdxc1\t" } } */
 
diff --git a/gcc/testsuite/gcc.target/mips/lxc1-sxc1-2.c 
b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-2.c
index dfbf6b5..317252b 100644
--- a/gcc/testsuite/gcc.target/mips/lxc1-sxc1-2.c
+++ b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-2.c
@@ -1,4 +1,4 @@
-/* { dg-options "(HAS_LXC1) -mlxc1-sxc1" } */
+/* { dg-options "(HAS_LXC1) -mhard-float -mlxc1-sxc1" } */
 /* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
 /* { dg-final { scan-assembler "\tldxc1\t" } } */
 
-- 
2.2.1



RE: [PATCH,testsuite] Add check_effective_target_rdynamic and use it in g++.dg/lto/pr69589_0.C.

2017-03-09 Thread Matthew Fortune
Rainer Orth  writes:
> > g++.dg/lto/pr69589_0.C is currently failing for mips-mti-elf with the
> following error:
> >
> > xg++: error: unrecognized command line option '-rdynamic'
> >
> > However, it passes just fine for mips-mti-linux-gnu.
> > I think that we should skip this test for mips-mti-elf.
> 
> could it be that mips/sde.h is just missing -rdynamic handling in
> LINK_SPEC?  At least some bare-metal targets using gld *do* support -
> rdynamic (cf. xtensa/elf.h and aarch64/aarch64-elf-raw.h)...

There's probably no harm in adding it in to mips/sde.h it just doesn't
really add much value either. We have some (very subtle) ABI differences
between ELF and Linux targets so I'm not sure it is worth trying to
make sure all options exist/behave the same given such differences.

Matthew


RE: [PATCH][MIPS]MSA AND.d optimization to generate BCLRI.d

2017-03-09 Thread Matthew Fortune
Prachi Godbole  writes:
> 2017-03-09  Prachi Godbole  
> 
> gcc/testsuite/
>   * gcc.target/mips/msa-bclri.c: Skip the test for -O0.
> 
> 
> Index: testsuite/gcc.target/mips/msa-bclri.c
> ===
> --- testsuite/gcc.target/mips/msa-bclri.c   (revision 245912)
> +++ testsuite/gcc.target/mips/msa-bclri.c   (working copy)
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mno-mips16 -mfp64 -mhard-float -mmsa" } */
> +/* { dg-skip-if "" { *-*-* }  { "-O0" } { "" } } */
> 
>  typedef long long v2i64 __attribute__ ((vector_size(16)));

I guess this is effectively a code quality test as well as a
regression test for the bug so can you state "code quality test"
in the dg-skip-if string.

Otherwise OK for trunk,

Matthew



RE: [RFC] VEC_SELECT sanity checking in genrecog (arm, aarch64, mips)

2017-03-06 Thread Matthew Fortune
Jakub Jelinek  writes:
> >
> > ../../gcc/config/mips/mips-msa.md:1219:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1219:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1219:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1219:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1307:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1307:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1307:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1307:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1401:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1401:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1401:1: vec_select parallel with 2
> > elements, expected 4
> > ../../gcc/config/mips/mips-msa.md:1401:1: vec_select parallel with 2
> > elements, expected 4

Thanks for reporting. These were actually in progress already but without
a PR (I believe these are the same ones as shown above). Patch was
approved earlier today:

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00229.html

Thanks,
Matthew


RE: [PATCH,testsuite] MIPS: Force O32 ABI for inline-memcpy-3.c.

2017-03-06 Thread Matthew Fortune
Toma Tabacu  writes:
> gcc/testsuite/
> 
>   * gcc.target/mips/inline-memcpy-3.c (dg-options): Add -mabi=32.

OK, thanks. Sorry for the slow reply.

Matthew


RE: [PATCH][MIPS]MSA min,max insn family RTL fixes

2017-03-06 Thread Matthew Fortune
Prachi Godbole  writes:
> 2017-03-06  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips-msa.md (msa_fmax_a_,
> msa_fmin_a_,
>   msa_max_a_, msa_min_a_): Introduce mode interator
> for
>   if_then_else.
>   (smin3, smax3): Change operand print code from 'B' to
> 'E'.
> 
> gcc/testsuite/
>   * gcc.target/mips/msa-minmax.c: New tests.

OK, thanks.
Matthew


RE: [PATCH][MIPS]MSA dotp.d, dpadd.d, dpsub.d insn RTL - fix MODE

2017-03-06 Thread Matthew Fortune
Prachi Godbole  writes:
> 2017-03-06  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips-msa.md (msa_dotp__d, msa_dpadd__d,
>   msa_dpsub__d): Fix MODE for vec_select.
> 
> gcc/testsuite/
>   * gcc.target/mips/msa-dotp.c: New tests.

OK, thanks.

Matthew


RE: [PATCH][MIPS]MSA AND.d optimization to generate BCLRI.d

2017-03-06 Thread Matthew Fortune
Prachi Godbole  writes:
> 2017-03-06  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips.c (mips_gen_const_int_vector): Change type of
> last
>   argument.
>   * config/mips/mips-protos.h (mips_gen_const_int_vector): Likewise.
> 
> gcc/testsuite/
>   * gcc.target/mips/msa-bclri.c: New test.

OK, Thanks.

Matthew


RE: [PATCH,testsuite] MIPS: Fix register mode checking for n64 in pr68273.c.

2017-03-02 Thread Matthew Fortune
Toma Tabacu  writes:
> pr68273.c currently fails when targeting MIPS64 with the n64 ABI.
> This is because it is checking for some registers in SImode, but, because of
> n64, they will actually be in DImode.
> 
> This patch restricts matching for SImode registers to ilp32 targets and adds
> matching for DImode registers for lp64 targets.
> This makes sure that the test is run on as many targets as possible, compared
> to the alternative of adding -mabi=32 to the test options.
> 
> I haven't checked to see what happens with eabi or o64, though.
> 
> Does this approach work ? Or should I just make separate tests for o32, n32 
> and
> n64, each one using -mabi=* as a test option to force an ABI ?

I'm happy with this approach.  As long as it works for o32/n32/n64 that is
sufficient right now.  If anyone starts to build and run the testsuite regularly
for default eabi/o64 then tests can be fixed which will involve much more than
just this one.

> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/testsuite/
> 
>   * gcc.target/mips/pr68273.c (dg-final): Match SImode registers only for
>   ilp32 targets and match DImode registers for lp64 targets.

OK, thanks.

Matthew


RE: [PATCH,MIPS] Document -mload-store-pairs

2017-02-24 Thread Matthew Fortune
Moore, Catherine <catherine_mo...@mentor.com> writes:
> > -Original Message-
> > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> > Sent: Friday, February 24, 2017 8:58 AM
> > To: Moore, Catherine <catherine_mo...@mentor.com>
> > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org)  > patc...@gcc.gnu.org>
> > Subject: [PATCH,MIPS] Document -mload-store-pairs
> >
> > Hi Catherine,
> >
> > Can you review the description for -mload-store-pairs please?
> >
> > Thanks,
> > Matthew
> >
> > gcc/
> > PR target/79473
> > * doc/invoke.texi: Document -mload-store-pairs.
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 6e5fa56..f1fc449 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -879,6 +879,7 @@ Objective-C and Objective-C++ Dialects}.
> >  -mexplicit-relocs  -mno-explicit-relocs @gol
> >  -mcheck-zero-division  -mno-check-zero-division @gol
> >  -mdivide-traps  -mdivide-breaks @gol
> > +-mload-store-pairs  -mno-load-store-pairs @gol
> >  -mmemcpy  -mno-memcpy  -mlong-calls  -mno-long-calls @gol
> >  -mmad  -mno-mad  -mimadd  -mno-imadd  -mfused-madd  -mno-fused-
> > madd  -nocpp @gol
> >  -mfix-24k  -mno-fix-24k @gol
> > @@ -19495,6 +19496,16 @@ overridden at configure time using
> > @option{--with-divide=breaks}.
> >  Divide-by-zero checks can be completely disabled using
> >  @option{-mno-check-zero-division}.
> >
> > +@item -mload-store-pairs
> > +@itemx -mno-load-store-pairs
> > +@opindex mload-store-pairs
> > +@opindex mno-load-store-pairs
> > +Enable (disable) an optimization that keeps consecutive load or store
> > +instructions sequential to allow MIPS processors that perform load
> > +and store bonding to optimize the access.  This option is enabled by
> > +default but only takes effect when the selected architecture is known
> > +to support bonding.
> > +
> >  @item -mmemcpy
> >  @itemx -mno-memcpy
> >  @opindex mmemcpy
> > --
> > 2.2.1
> 
> Hi Matthew -- How about this instead?
> 
> +@item -mload-store-pairs
> +@itemx -mno-load-store-pairs
> +@opindex mload-store-pairs
> +@opindex mno-load-store-pairs
> +Enable (disable) an optimization that pairs consecutive load or store
> +instructions to enable load/store bonding.  This option is enabled by
> +default but only takes effect when the selected architecture is known
> +to support bonding.
> +

Sure, I'll commit this along with the paired single fix when you've had
chance to review.

Thanks,
Matthew


[PATCH,MIPS] Document -mload-store-pairs

2017-02-24 Thread Matthew Fortune
Hi Catherine,

Can you review the description for -mload-store-pairs please?

Thanks,
Matthew

gcc/
PR target/79473
* doc/invoke.texi: Document -mload-store-pairs.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6e5fa56..f1fc449 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -879,6 +879,7 @@ Objective-C and Objective-C++ Dialects}.
 -mexplicit-relocs  -mno-explicit-relocs @gol
 -mcheck-zero-division  -mno-check-zero-division @gol
 -mdivide-traps  -mdivide-breaks @gol
+-mload-store-pairs  -mno-load-store-pairs @gol
 -mmemcpy  -mno-memcpy  -mlong-calls  -mno-long-calls @gol
 -mmad  -mno-mad  -mimadd  -mno-imadd  -mfused-madd  -mno-fused-madd  -nocpp 
@gol
 -mfix-24k  -mno-fix-24k @gol
@@ -19495,6 +19496,16 @@ overridden at configure time using 
@option{--with-divide=breaks}.
 Divide-by-zero checks can be completely disabled using
 @option{-mno-check-zero-division}.
 
+@item -mload-store-pairs
+@itemx -mno-load-store-pairs
+@opindex mload-store-pairs
+@opindex mno-load-store-pairs
+Enable (disable) an optimization that keeps consecutive load or store
+instructions sequential to allow MIPS processors that perform load
+and store bonding to optimize the access.  This option is enabled by
+default but only takes effect when the selected architecture is known
+to support bonding.
+
 @item -mmemcpy
 @itemx -mno-memcpy
 @opindex mmemcpy
-- 
2.2.1



[PATCH,MIPS] Handle paired single test changes

2017-02-23 Thread Matthew Fortune
Hi Catherine,

I missed a couple of testsuite changes that are needed to deal with the
fallout of fixing the ABI issues for floating point vectors.  I had them
in my tree but forgot to post.  The ABI change for V2SF i.e. paired
single is a bug fix as the behaviour was unintended and violates the goal
of having FP64 a compatible ABI extension for o32.  The probability of
having code dependent on this corner case of the calling convention in
the wild is exceptionally low so I see no significant risk still.

The tests for paired single just need a little encouragement to still
produce the necessary instructions now that paired single is not returned
in registers.

Does it look OK to you?

Thanks,
Matthew

gcc/testsuite/

* gcc.target/mips/mips-ps-type-2.c (move): Force generation
of mov.ps.
* gcc.target/mips/mips-ps-type.c (move): Likewise.
(cond_move1): Simplify condition to force generation of
mov[nz].ps.
(cond_move2): Likewise.
---

diff --git a/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c 
b/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
index fecc35b..ed5d6ee 100644
--- a/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
+++ b/gcc/testsuite/gcc.target/mips/mips-ps-type-2.c
@@ -32,6 +32,11 @@ NOMIPS16 v2sf init (float a, float b)
 /* Move between registers */
 NOMIPS16 v2sf move (v2sf a)
 {
+  register v2sf b __asm__("$f0") = a;
+  register v2sf c __asm__("$f2");
+  __asm__ __volatile__ ("" : "+f" (b));
+  c = b;
+  __asm__ __volatile__ ("" : : "f" (c));
   return a;
 }
 
diff --git a/gcc/testsuite/gcc.target/mips/mips-ps-type.c 
b/gcc/testsuite/gcc.target/mips/mips-ps-type.c
index d74d4b5..731649c 100644
--- a/gcc/testsuite/gcc.target/mips/mips-ps-type.c
+++ b/gcc/testsuite/gcc.target/mips/mips-ps-type.c
@@ -30,6 +30,11 @@ NOMIPS16 v2sf init (float a, float b)
 /* Move between registers */
 NOMIPS16 v2sf move (v2sf a)
 {
+  register v2sf b __asm__("$f0") = a;
+  register v2sf c __asm__("$f2");
+  __asm__ __volatile__ ("" : "+f" (b));
+  c = b;
+  __asm__ __volatile__ ("" : : "f" (c));
   return a;
 }
 
@@ -96,7 +101,7 @@ NOMIPS16 v2sf nmsub (v2sf a, v2sf b, v2sf c)
 /* Conditional Move */ 
 NOMIPS16 v2sf cond_move1 (v2sf a, v2sf b, long i)
 {
-  if (i > 0)
+  if (i != 0)
 return a;
   else
 return b;
@@ -105,7 +110,7 @@ NOMIPS16 v2sf cond_move1 (v2sf a, v2sf b, long i)
 /* Conditional Move */ 
 NOMIPS16 v2sf cond_move2 (v2sf a, v2sf b, int i)
 {
-  if (i > 0)
+  if (i != 0)
 return a;
   else
 return b;
-- 
2.2.1



RE: [PATCH, MIPS] Calling convention differs depending on the presence of MSA

2017-02-22 Thread Matthew Fortune
Moore, Catherine  writes:
> > As this is an ABI fix, just wait a day or so in case Catherine has
> > any comment otherwise OK to commit.
> >
> I have not  further comments on this patch.  Please commit.

Hi Sameera,

I've been considering this patch further and have realised that my review
was preoccupied with o32 behaviour.  The patch fixes o32 but breaks
all other ABIs because it forces floating point vectors with size up to
2*word_size to be in memory after the change when they would have been
in registers before.

Since you mentioned off list you're having issues committing I've gone
ahead and made the necessary changes and committed for you as below.

Thanks for your work on this.

Matthew

gcc/
* config/mips/mips.c (mips_return_in_memory): Force FP
vector types to be returned in memory for o32 ABI.

gcc/testsuite/

* gcc.target/mips/msa-fp-cc.c: New test.

---

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 7974a16..4e13fbe 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -6472,9 +6472,13 @@ mips_function_value_regno_p (const unsigned int regno)
 static bool
 mips_return_in_memory (const_tree type, const_tree fndecl ATTRIBUTE_UNUSED)
 {
-  return (TARGET_OLDABI
- ? TYPE_MODE (type) == BLKmode
- : !IN_RANGE (int_size_in_bytes (type), 0, 2 * UNITS_PER_WORD));
+  if (TARGET_OLDABI)
+/* Ensure that any floating point vector types are returned via memory
+   even if they are supported through a vector mode with some ASEs.  */
+return (VECTOR_FLOAT_TYPE_P (type)
+   || TYPE_MODE (type) == BLKmode);
+
+  return (!IN_RANGE (int_size_in_bytes (type), 0, 2 * UNITS_PER_WORD));
 }
 

 /* Implement TARGET_SETUP_INCOMING_VARARGS.  */
diff --git a/gcc/testsuite/gcc.target/mips/msa-fp-cc.c 
b/gcc/testsuite/gcc.target/mips/msa-fp-cc.c
new file mode 100644
index 000..3d293f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/msa-fp-cc.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=32 -mfp64 -mhard-float -mmsa" } */
+typedef float v4f32 __attribute__((vector_size(16)));
+typedef double v2f64 __attribute__((vector_size(16)));
+
+v4f32
+fcmpOeqVector4 (v4f32 a, v4f32 b)
+{
+  return a + b;
+}
+
+v2f64
+fcmpOeqVector2 (v2f64 a, v2f64 b)
+{
+  return a + b;
+}
+
+/* { dg-final { scan-assembler-not "copy_s" } } */
+/* { dg-final { scan-assembler-not "insert" } } */
-- 
2.2.1



RE: [PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg

2017-02-22 Thread Matthew Fortune
Eric Botcazou  writes:
> > The condition would look like this, What do you think?
> >
> >   if (!(GET_MODE_PRECISION (mode) != GET_MODE_PRECISION (innermode)
> > && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
> > && GET_MODE_SIZE (innermode) <= UNITS_PER_WORD
> > && WORD_REGISTER_OPERATIONS)
> >   && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> >
> >   || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> >
> >   && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN
> > (reg) return true;
> 
> No regressions on SPARC.

I have committed this change now. The final patch is below. I will respond
promptly if there is some further unexpected fallout from this. I built the
arm-none-eabi target successfully as an additional check and bootstrapped
mips64el-linux-gnu as well.  I'll have to rely on the various auto-builders
people have to cover all the testsuite variations as I'm not set up to cover
them all myself.

Thanks,
Matthew

gcc/
PR target/78660
* lra-constraints.c (simplify_operand_subreg): Handle
WORD_REGISTER_OPERATIONS targets.


diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 35539a9..224a956 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -1541,11 +1541,22 @@ simplify_operand_subreg (int nop, machine_mode reg_mode)
 subregs as we don't substitute such equiv memory (see processing
 equivalences in function lra_constraints) and because for spilled
 pseudos we allocate stack memory enough for the biggest
-corresponding paradoxical subreg.  */
- if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
-   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
- || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
- && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg
+corresponding paradoxical subreg.
+
+However, do not blindly simplify a (subreg (mem ...)) for
+WORD_REGISTER_OPERATIONS targets as this may lead to loading junk
+data into a register when the inner is narrower than outer or
+missing important data from memory when the inner is wider than
+outer.  This rule only applies to modes that are no wider than
+a word.  */
+ if (!(GET_MODE_PRECISION (mode) != GET_MODE_PRECISION (innermode)
+   && GET_MODE_SIZE (mode) <= UNITS_PER_WORD
+   && GET_MODE_SIZE (innermode) <= UNITS_PER_WORD
+   && WORD_REGISTER_OPERATIONS)
+ && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
+   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
+ || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
+ && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)
return true;
 
  *curr_id->operand_loc[nop] = operand;
-- 
2.2.1



RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-22 Thread Matthew Fortune
> > I will keep working on this code post GCC 7 as the topic is bugging me
> now
> > :-)
> 
> I'm OK to lend a hand, but what's left for GCC 7 to make MIPS happy?

I have just successfully bootstrapped MIPS with just the pending (amended)
patch (3). i.e. with this patch (1) reverted so I did not need this one
after all.

I should have gone back and checked that originally. I'll commit the
updated patch (3) later today. After checking an ARM build.

Thanks for everyone's help and patience,
Matthew



RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-21 Thread Matthew Fortune
Eric Botcazou  writes:
> > Agreed.  I don't think things like WORD_MODE_OPERATIONS should change
> > rtl semantics, just optimisation decisions.

Sorry, I did cover two topics in one email.  My point was about whether
simplifying:

(subreg:OUTER (mem:INNER ...))
To:
(mem:OUTER ...)

Should ever be a requirement for successful reloading rather than just an
optimisation that 'could' be applied. There are a few cases it seems that
require this simplification to happen which I find odd.

> > And using the smallest
> > possible spill size is often good even for RISCy targets.

Yes, if (subreg(mem)) simplification only ever happened when it was reducing
the size of the load/store I would understand the code more too but actually
it will currently happily simplify to a wider mode. Using a wider mode may
well be beneficial in other situations where a further reload would be needed
due to insn constraints I guess. It all feels a bit like magic still.

> I don't think that's correct, WORD_MODE_OPERATIONS does change RTL semantics
> for SUBREGs smaller than a word since it can make all bits defined, even if
> only the lower part is assigned.  For example (SUBREG:SI (MEM:QI)) has the
> higher bits defined according to LOAD_EXTEND_OP if WORD_MODE_OPERATIONS.

It would almost be simpler if we had another variant of subreg (like we have
strict_low_part) to explicitly represent the WORD_REGISTER_OPERATIONS behaviour
with the mode change. I'll stop myself and hold this for later though!

> LRA simply needs to preserve the semantics, just as reload does.

I will keep working on this code post GCC 7 as the topic is bugging me now :-)

Thanks,
Matthew


RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-21 Thread Matthew Fortune
Richard Sandiford <richard.sandif...@arm.com> writes:
> Matthew Fortune <matthew.fort...@imgtec.com> writes:
> > Eric Botcazou <ebotca...@adacore.com> writes:
> >> > Thanks for reporting. I'll take a brief look first but revert if
> >> > the issue isn't something that vaguely makes sense to me.
> >>
> >> You need to restrict any WORD_REGISTER_OPERATIONS test to subword
> >> registers.
> >
> > I've reverted this. I haven't been able to quickly find a change that
> > I both feel is right and works. I am wondering if I actually don't
> > need this change and that 'patch 3' (the change to
> > simplify_operand_subreg) is the actual fix for both issues I have
> seen.
> 
> I think that patch might have a similar problem though, in that it
> applies WORD_REGISTER_OPERATIONS semantics to things that might be
> vectors.

There is an amendment done as part of SPARC testing that limits the
change to modes that are no wider than a word. But, given that assumptions
coming from WORD_REGISTER_OPERATIONS can only be applied to integer
modes then it should also check both modes are MODE_INT as well.

All that said... I don't entirely follow why any target should be
reliant on subreg(mem) being simplified to use the outer mode. It is an
optimization certainly for some cases but I don't understand what rule
or guarantee we have that means reloading the inner MEM is illegal.

The latter point is not appropriate to debate for GCC 7 though.

Matthew


RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-21 Thread Matthew Fortune
Eric Botcazou  writes:
> > Thanks for reporting. I'll take a brief look first but revert if the
> > issue isn't something that vaguely makes sense to me.
> 
> You need to restrict any WORD_REGISTER_OPERATIONS test to subword
> registers.

I've reverted this. I haven't been able to quickly find a change that
I both feel is right and works. I am wondering if I actually don't
need this change and that 'patch 3' (the change to
simplify_operand_subreg) is the actual fix for both issues I have seen.

I'll test my other change against an ARM build and wait a day to see
that ARM is at least working on trunk before committing the other
fix to this area of code.

Thanks,
Matthew 


RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-21 Thread Matthew Fortune
Christophe Lyon <christophe.l...@linaro.org> writes:
> On 21 February 2017 at 11:59, Kyrill Tkachov
> <kyrylo.tkac...@foss.arm.com> wrote:
> >
> > On 21/02/17 10:54, Christophe Lyon wrote:
> >>
> >> Hi,
> >>
> >> On 20 February 2017 at 13:08, Matthew Fortune
> >> <matthew.fort...@imgtec.com> wrote:
> >>>
> >>> Vladimir Makarov <vmaka...@redhat.com> writes:
> >>>>
> >>>> On 02/07/2017 09:08 AM, Matthew Fortune wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> This change deals with reloading a subreg(reg) using the inner
> >>>>> mode to prevent partial spilling of data like in the case
> described here:
> >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660#c8
> >>>>>
> >>>>> No test case for now but I am investigating a targeted test using
> >>>>> the RTL frontend for later submission.
> >>>>>
> >>>>>
> >>>>> gcc/
> >>>>>  PR target/78660
> >>>>>  * lra-constraints.c (curr_insn_transform): Handle
> >>>>>  WORD_REGISTER_OPERATIONS requirements when reloading SUBREGs.
> >>>>>
> >>>> The patch is OK.  So all 5 patches can be committed to the trunk.
> >>>> I hope Eric's test of the patches on SPARC will be successful.
> >>>> Thank you again for your efforts.
> >>>
> >>> Committed as r245598.
> >>>
> >> This patch is causing ICEs on arm-none-linux-gnueabi
> >> FAIL: gcc.c-torture/execute/simd-2.c   -O1  (internal compiler error)
> >>
> >> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.c-
> torture/execute/simd-2.c:
> >> In function 'main':
> >>
> >> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.c-
> torture/execute/simd-2.c:72:1:
> >> error: unable to find a register to spill
> >>
> >> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.c-
> torture/execute/simd-2.c:72:1:
> >> error: this is the insn:
> >> (insn 276 2123 2129 2 (set (subreg:SI (reg:TI 886 [362]) 0)
> >>  (and:SI (subreg:SI (reg:TI 567 [orig:111 j.1_2 ] [111]) 0)
> >>  (subreg:SI (reg:TI 568 [orig:110 i.0_1 ] [110]) 0)))
> >>
> >> "/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.c-torture/exec
> >> ute/simd-2.c":47
> >> 82 {*arm_andsi3_insn}
> >>   (expr_list:REG_DEAD (reg:TI 568 [orig:110 i.0_1 ] [110])
> >>  (expr_list:REG_DEAD (reg:TI 567 [orig:111 j.1_2 ] [111])
> >>  (nil
> >>
> >> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.c-
> torture/execute/simd-2.c:72:1:
> >> internal compiler error: in assign_by_spills, at lra-assigns.c:1457
> >> 0xacc9d3 _fatal_insn(char const*, rtx_def const*, char const*, int,
> >> char
> >> const*)
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/rtl-error.c:108
> >> 0x9a6123 assign_by_spills
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/lra-assigns.c:1457
> >> 0x9a7506 lra_assign()
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/lra-assigns.c:1643
> >> 0x9a16d4 lra(_IO_FILE*)
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/lra.c:2451
> >> 0x957669 do_reload
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ira.c:5452
> >> 0x957669 execute
> >>  /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/ira.c:5636
> >>
> >>
> >>
> >> I've also noticed that --target arm-none-eabi with default
> >> --with-mode and --with-cpu fails to build libgcc (it may be easier to
> >> reproduce):
> >>
> >> /tmp/870921_29.tmpdir/aci-gcc-fsf/sources/gcc-
> fsf/gccsrc/libgcc/fixed-bit.c:
> >> In function '__gnu_mulhelperudq':
> >>
> >> /tmp/870921_29.tmpdir/aci-gcc-fsf/sources/gcc-
> fsf/gccsrc/libgcc/fixed-bit.c:371:1:
> >> error: unable to find a register to spill
> >>   }
> >>   ^
> >>
> >> /tmp/870921_29.tmpdir/aci-gcc-fsf/sources/gcc-
> fsf/gccsrc/libgcc/fixed-bit.c:371:1:
> >> error: this is the insn:
> >> (insn 286 296 287 2 (set (reg:SI 232)
> >>  (neg:SI (ltu:SI (subreg:SI (reg:DI 238 [orig:119 _10 ]
> [119]) 4)
> >>  (subreg:SI (reg/v:DI 149 [ temp1 ]) 4
> >>
> >> "/tmp/870921_29.tmpdir/aci-gcc-fsf/sources/gcc

RE: [PATCH, MIPS] Calling convention differs depending on the presence of MSA

2017-02-21 Thread Matthew Fortune
Sameera Deshpande <sameera.deshpa...@imgtec.com> writes:
> Hi Matthew,
> 
> Please find attached updated patch as per our offline discussion.
> 
> I have disabled return in registers for all vector float types, and
> updated the test case accordingly.
> 
> Ok for trunk?

Hi Sameera,

Sorry for the slow response.  I'd like to explain my thinking here
as I believe your change is right but you are actually ending up
fixing more issues than you expected!

There are 3 potential vector modes that are affected by your change
rather than just 2.  V2SF (paired single), V4SF and V2DF (MSA).

The V4SF and V2DF changes are the ones needed for MSA as the test
case shows.  These become valid modes with -mmsa so the float
vector types no longer get assigned BLKmode. (As an aside I think
it is a legacy bug that this function was ever defined in terms of
modes for o32).

The V2SF is an interesting one as it would not historically have
been a valid mode for o32 until o32 FP64 was re-invented.  Now that
o32 FP64 is a compatible ABI extension we(I) have in fact allowed an
incompatible ABI change through as a vector of 2 floats will be
returned in registers for o32 FP64 and not the other o32 variants.

As an aside... The integer vector types up to 16-bytes get returned
in registers also potentially by chance because...

1) We have TImode support in the backend
2) TImode is 16-bytes and hence the type->mode logic will fall back
   to looking for a wider integer mode for integer vectors and find
   TImode
3) The mips_hard_regno_mode_ok_p logic does not limit the size of
   a mode allowed in GPRs as long as the first register is even.
4) TImode is not BLKmode so is not forced to memory here.

I'm doubtful this is by design but it is what it is.  Let it
hereby be part of the MIPS o32 ABI definition!

Are you sure you need the dg-skip-if on the test case? Perhaps just
-O0. The test does not need vectorization as it is explicitly
Vectorized but it may break the scan-assembler at -O0.  Also please
can you apply the normal code style to the tests?

v4f32
fcmpOeqVector4 (v4f32 a, v4f32 b)
{
  return a + b;
}

As this is an ABI fix, just wait a day or so in case Catherine has
any comment otherwise OK to commit.

Thanks,
Matthew

> 
> - Thanks and regards,
>   Sameera D.
> 
> From: Sameera Deshpande
> Sent: 08 February 2017 14:10:52
> To: Matthew Fortune
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH, MIPS] Calling convention differs depending on the
> presence of MSA
> 
> Hi Matthew,
> 
> Please find attached the patch to fix the calling convention issue,
> where argument and result passing convention differed for MSA and non-
> MSA variants.
> 
> The implementation of TARGET_RETURN_IN_MEMORY is altered to block V4SF
> to be returned in registers.
> 
> Ok for trunk?
> 
> - Thanks and regards,
>   Sameera D.
> 
> 
> Changelog:
> gcc/
> * config/mips/mips.c (mips_return_in_memory) : Restrict V4SFmode to
> be returned in registers.
> 
> gcc/testsuite/
> * gcc.target/mips/msa-fp-cc.c : New testcase.


RE: [PATCH 5/5] Ensure the mode used to create split registers is suppported

2017-02-20 Thread Matthew Fortune
Vladimir Makarov <vmaka...@redhat.com> writes:
> On 02/07/2017 09:08 AM, Matthew Fortune wrote:
> > Hi,
> >
> > This patch addresses a problem with LRA splitting hard registers where
> > the mode requires multiple registers.  When splitting then each
> > constituent register is split individually using the widest mode for
> > each register but no check is made that such a mode is actually
> > supported in those registers.
> >
> > MIPS has an ABI variant o32 FPXX that allows DFmode values to exist in
> > pairs of 32-bit floating point registers but only the first 32-bit
> > register is directly addressable.  The second register can only be
> > accessed as part of a 64-bit load/store or via a special move
> > instruction used as part of a 64-bit move.
> >
> > The split is simply rejected to ensure compliance with the ABI
> > although I expect the split logic could account for this case and
> > split using the wider mode.  Such a change appears more invasive than
> > appropriate in stage 4.
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78012
> >
> > It is unknown if any other LRA enabled target could hit this issue but
> > it is certainly possible depending on mode/register restrictions.
> The patch is ok for the trunk and it is pretty safe.  Thank you Robert
> and Matt.

Committed as r245601.

Thanks,
Matthew


RE: [PATCH 4/5] Partial revert of r243782 to restore previous behavior

2017-02-20 Thread Matthew Fortune
Vladimir Makarov <vmaka...@redhat.com> writes:
> On 02/07/2017 09:08 AM, Matthew Fortune wrote:
> > Hi,
> >
> > This patch partially reverts r243782 where a return false was added
> > expecting it to be a no-op.  Detailed inspection shows this was not
> > true.  Despite no bug being identified following the change, removing
> > the early return is likely to be safer than leaving it in place.
> >
> > This is supported by the discussion here:
> > https://gcc.gnu.org/ml/gcc/2017-02/msg7.html
> >
> OK for the trunk.  Matt, thank you very much for efforts to clean
> simplify_operand_subreg up.

Committed as r245600.

Thanks,
Matthew


RE: [PATCH 2/5] Tighten condition for converting SUBREG reloads from OP_OUT to OP_INOUT

2017-02-20 Thread Matthew Fortune
Matthew Fortune <matthew.fort...@imgtec.com> writes:
> This change addresses a comment from Richard Sandiford in:
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02165.html
> 
> where a previous fix was over eager in converting OP_OUT reloads to
> OP_INOUT.
> 
> No testcase here either but I will try and exercise this code later with
> a targeted test using the RTL frontend if possible.
> 
> Thanks,
> Matthew
> 
> gcc/
>   PR target/78660
>   * lra-constraints.c (curr_insn_transform): Tighten condition
>   for converting SUBREG reloads from OP_OUT to OP_INOUT.

Committed as r245599.

Thanks,
Matthew



RE: [PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-20 Thread Matthew Fortune
Vladimir Makarov <vmaka...@redhat.com> writes:
> On 02/07/2017 09:08 AM, Matthew Fortune wrote:
> > Hi,
> >
> > This change deals with reloading a subreg(reg) using the inner mode to
> > prevent partial spilling of data like in the case described here:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660#c8
> >
> > No test case for now but I am investigating a targeted test using the
> > RTL frontend for later submission.
> >
> >
> > gcc/
> > PR target/78660
> > * lra-constraints.c (curr_insn_transform): Handle
> > WORD_REGISTER_OPERATIONS requirements when reloading SUBREGs.
> >
> The patch is OK.  So all 5 patches can be committed to the trunk.  I
> hope Eric's test of the patches on SPARC will be successful.  Thank you
> again for your efforts.

Committed as r245598.

Thanks,
Matthew


RE: [PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg

2017-02-20 Thread Matthew Fortune
Hi Eric,

Any thoughts on this?

Thanks,
Matthew

> Sorry for the slow reply, been away for a few days
> 
> Eric Botcazou  writes:
> > > This patch is a minimal change to prevent (subreg(mem)) from being
> > > simplified to use the outer mode for WORD_REGISTER_OPERATIONS.
> > > There is high probability of refining and/or re-implementing this
> > > for GCC 8 but such a change would be too invasive.  This change at
> > > least ensures correctness but may prevent simplification of some
> acceptable cases.
> >
> > This one causes:
> >
> > +FAIL: gcc.dg/torture/builtin-complex-1.c   -O3 -fomit-frame-pointer -
> > funroll-
> > loops -fpeel-loops -ftracer -finline-functions  (test for excess
> > errors)
> > +WARNING: gcc.dg/torture/builtin-complex-1.c   -O3 -fomit-frame-
> pointer
> > -
> > funroll-loops -fpeel-loops -ftracer -finline-functions  compilation
> > failed to produce executable
> > +FAIL: gcc.dg/torture/builtin-complex-1.c   -O3 -g  (test for excess
> > errors)
> > +WARNING: gcc.dg/torture/builtin-complex-1.c   -O3 -g  compilation
> > failed to
> > produce executable
> > +WARNING: program timed out.
> > +WARNING: program timed out.
> >
> > on SPARC 32-bit, i.e. LRA hangs.  Reduced testcase attached, compile
> > at
> > -O3 with a cc1 configured for sparc-sun-solaris2.10.
> >
> > > gcc/
> > >   PR target/78660
> > >   * lra-constraints.c (simplify_operand_subreg): Handle
> > >   WORD_REGISTER_OPERATIONS targets.
> > > ---
> > >  gcc/lra-constraints.c | 17 -
> > >  1 file changed, 12 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index
> > > 66ff2bb..484a70d 100644
> > > --- a/gcc/lra-constraints.c
> > > +++ b/gcc/lra-constraints.c
> > > @@ -1541,11 +1541,18 @@ simplify_operand_subreg (int nop,
> > > machine_mode
> > > reg_mode) subregs as we don't substitute such equiv memory (see
> > > processing equivalences in function lra_constraints) and because for
> > > spilled pseudos we allocate stack memory enough for the biggest
> > > -  corresponding paradoxical subreg.  */
> > > -   if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > > - && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > > -   || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > > -   && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg
> > > +  corresponding paradoxical subreg.
> > > +
> > > +  However, never simplify a (subreg (mem ...)) for
> > > +  WORD_REGISTER_OPERATIONS targets as this may lead to loading
> > junk
> > > +  data into a register when the inner is narrower than outer or
> > > +  missing important data from memory when the inner is wider
> > than
> > > +  outer.  */
> > > +   if (!WORD_REGISTER_OPERATIONS
> > > +   && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > > + && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > > +   || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > > +   && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN
> > (reg)
> > >   return true;
> > >
> > > *curr_id->operand_loc[nop] = operand;
> >
> > I think that we might need:
> >
> >   if (!(GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (innermode)
> > && WORD_REGISTER_OPERATIONS)
> >   && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> >   || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> >   && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)
> > return true;
> >
> > i.e. we force the reloading only for paradoxical subregs.
> 
> Maybe, though I'm not entirely convinced.  For WORD_REGISTER_OPERATIONS
> both paradoxical and normal SUBREGs can be a problem as the inner mode
> in both cases can be used elsewhere for a reload making the content of
> the spill slot wrong either in the subreg reload or the ordinary reload
> elsewhere. However, the condition can be tightened; I should not have
> made it so simplistic I guess. I.e. both modes must be no wider than a
> word and must be different precision to force an inner reload.
> 
> Adding that check would fix this case for SPARC and should be fine for
> MIPS but I need to wait for a bootstrap build to be sure.
> 
> I don't really understand why LRA can't reload this for SPARC though as
> I'm not sure there is any guarantee provided to backends that some
> SUBREGs will be reloaded using their outer mode.  If there is such a
> guarantee then it would be much easier to reason about this logic but as
> it stands I suspect we are having to tweak LRA to cope with assumptions
> made in various targets that happen to have held true (and I have no
> doubt that MIPS has some of these as well especially in terms of the
> FP/GP register usage with float and int modes.)  All being well we can
> capture such assumptions and formalise them so we ensure they hold true
> (or modify backends 

RE: [PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg

2017-02-14 Thread Matthew Fortune
Sorry for the slow reply, been away for a few days

Eric Botcazou  writes:
> > This patch is a minimal change to prevent (subreg(mem)) from being
> > simplified to use the outer mode for WORD_REGISTER_OPERATIONS.  There
> > is high probability of refining and/or re-implementing this for GCC 8
> > but such a change would be too invasive.  This change at least ensures
> > correctness but may prevent simplification of some acceptable cases.
> 
> This one causes:
> 
> +FAIL: gcc.dg/torture/builtin-complex-1.c   -O3 -fomit-frame-pointer -
> funroll-
> loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> +WARNING: gcc.dg/torture/builtin-complex-1.c   -O3 -fomit-frame-pointer
> -
> funroll-loops -fpeel-loops -ftracer -finline-functions  compilation
> failed to produce executable
> +FAIL: gcc.dg/torture/builtin-complex-1.c   -O3 -g  (test for excess
> errors)
> +WARNING: gcc.dg/torture/builtin-complex-1.c   -O3 -g  compilation
> failed to
> produce executable
> +WARNING: program timed out.
> +WARNING: program timed out.
> 
> on SPARC 32-bit, i.e. LRA hangs.  Reduced testcase attached, compile at
> -O3 with a cc1 configured for sparc-sun-solaris2.10.
> 
> > gcc/
> > PR target/78660
> > * lra-constraints.c (simplify_operand_subreg): Handle
> > WORD_REGISTER_OPERATIONS targets.
> > ---
> >  gcc/lra-constraints.c | 17 -
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index
> > 66ff2bb..484a70d 100644
> > --- a/gcc/lra-constraints.c
> > +++ b/gcc/lra-constraints.c
> > @@ -1541,11 +1541,18 @@ simplify_operand_subreg (int nop, machine_mode
> > reg_mode) subregs as we don't substitute such equiv memory (see
> > processing equivalences in function lra_constraints) and because for
> > spilled pseudos we allocate stack memory enough for the biggest
> > -corresponding paradoxical subreg.  */
> > - if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > -   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > - || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > - && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg
> > +corresponding paradoxical subreg.
> > +
> > +However, never simplify a (subreg (mem ...)) for
> > +WORD_REGISTER_OPERATIONS targets as this may lead to loading
> junk
> > +data into a register when the inner is narrower than outer or
> > +missing important data from memory when the inner is wider
> than
> > +outer.  */
> > + if (!WORD_REGISTER_OPERATIONS
> > + && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
> > +   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> > + || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> > + && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN
> (reg)
> > return true;
> >
> >   *curr_id->operand_loc[nop] = operand;
> 
> I think that we might need:
> 
>   if (!(GET_MODE_PRECISION (mode) > GET_MODE_PRECISION (innermode)
>   && WORD_REGISTER_OPERATIONS)
>   && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
>   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
> || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
> && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)
> return true;
> 
> i.e. we force the reloading only for paradoxical subregs.

Maybe, though I'm not entirely convinced.  For WORD_REGISTER_OPERATIONS
both paradoxical and normal SUBREGs can be a problem as the inner mode
in both cases can be used elsewhere for a reload making the content
of the spill slot wrong either in the subreg reload or the ordinary
reload elsewhere. However, the condition can be tightened; I should
not have made it so simplistic I guess. I.e. both modes must be no
wider than a word and must be different precision to force an inner
reload.

Adding that check would fix this case for SPARC and should be fine
for MIPS but I need to wait for a bootstrap build to be sure.

I don't really understand why LRA can't reload this for SPARC though
as I'm not sure there is any guarantee provided to backends that some
SUBREGs will be reloaded using their outer mode.  If there is such a
guarantee then it would be much easier to reason about this logic but
as it stands I suspect we are having to tweak LRA to cope with
assumptions made in various targets that happen to have held true (and
I have no doubt that MIPS has some of these as well especially in terms
of the FP/GP register usage with float and int modes.)  All being well
we can capture such assumptions and formalise them so we ensure they
hold true (or modify backends appropriately I guess).

The condition would look like this, What do you think?

  if (!(GET_MODE_PRECISION (mode) != GET_MODE_PRECISION (innermode)
&& GET_MODE_SIZE (mode) <= UNITS_PER_WORD
 

RE: [PATCH 0/5] LRA fixes for MIPS-like targets

2017-02-08 Thread Matthew Fortune
Eric Botcazou  writes:
> > @Eric: thanks for the offer of sparc testing; please can you let me
> > know if there are any issues?
> 
> Sure, but can you post a consolidated patch as an attachment? (btw it's
> SPARC instead of sparc, like MIPS instead of mips I presume).

I've attached the consolidated patch. Apologies about capitalisation, I
normally try to get them right everywhere just slipped here.

Thanks,
Matthew




combined_lra_fix.patch
Description: combined_lra_fix.patch


[PATCH 5/5] Ensure the mode used to create split registers is suppported

2017-02-07 Thread Matthew Fortune
Hi,

This patch addresses a problem with LRA splitting hard registers
where the mode requires multiple registers.  When splitting then
each constituent register is split individually using the widest
mode for each register but no check is made that such a mode is
actually supported in those registers.

MIPS has an ABI variant o32 FPXX that allows DFmode values to
exist in pairs of 32-bit floating point registers but only the
first 32-bit register is directly addressable.  The second
register can only be accessed as part of a 64-bit load/store or
via a special move instruction used as part of a 64-bit move.

The split is simply rejected to ensure compliance with the ABI
although I expect the split logic could account for this case
and split using the wider mode.  Such a change appears more
invasive than appropriate in stage 4.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78012

It is unknown if any other LRA enabled target could hit this
issue but it is certainly possible depending on mode/register
restrictions.

Thanks,
Matthew

Author: Robert Suchanek  

gcc/
PR target/78012
* lra-constraints.c (split_reg): Check requested split mode
is supported by the register.
---
 gcc/lra-constraints.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 0b7ae34..db6e878 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5402,6 +5402,26 @@ split_reg (bool before_p, int original_regno, rtx_insn 
*insn,
}
  return false;
}
+  /* Split_if_necessary can split hard registers used as part of a
+multi-register mode but splits each register individually.  The
+mode used for each independent register may not be supported
+so reject the split.  Splitting the wider mode should theoretically
+be possible but is not implemented.  */
+  if (! HARD_REGNO_MODE_OK (hard_regno, mode))
+   {
+ if (lra_dump_file != NULL)
+   {
+ fprintf (lra_dump_file,
+  "Rejecting split of %d(%s): unsuitable mode %s\n",
+  original_regno,
+  reg_class_names[lra_get_allocno_class (original_regno)],
+  GET_MODE_NAME (mode));
+ fprintf
+   (lra_dump_file,
+"\n");
+   }
+ return false;
+   }
   new_reg = lra_create_new_reg (mode, original_reg, rclass, "split");
   reg_renumber[REGNO (new_reg)] = hard_regno;
 }
-- 
2.2.1



[PATCH 4/5] Partial revert of r243782 to restore previous behavior

2017-02-07 Thread Matthew Fortune
Hi,

This patch partially reverts r243782 where a return false was added
expecting it to be a no-op.  Detailed inspection shows this was not
true.  Despite no bug being identified following the change, removing
the early return is likely to be safer than leaving it in place.

This is supported by the discussion here:
https://gcc.gnu.org/ml/gcc/2017-02/msg7.html

Thanks,
Matthew

gcc/
* lra-constraints.c (simplify_operand_subreg): Remove early
return false.
---
 gcc/lra-constraints.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 484a70d..0b7ae34 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -1607,7 +1607,8 @@ simplify_operand_subreg (int nop, machine_mode reg_mode)
 the memory.  Typical case is when the index scale should
 correspond the memory.  */
   *curr_id->operand_loc[nop] = operand;
-  return false;
+  /* Do not return false here as the MEM_P (reg) will be processed
+later in this function.  */
 }
   else if (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER)
 {
-- 
2.2.1



[PATCH 3/5] Support WORD_REGISTER_OPERATIONS requirements in simplify_operand_subreg

2017-02-07 Thread Matthew Fortune
Hi,

This patch is a minimal change to prevent (subreg(mem)) from being
simplified to use the outer mode for WORD_REGISTER_OPERATIONS.  There
is high probability of refining and/or re-implementing this for GCC 8
but such a change would be too invasive.  This change at least ensures
correctness but may prevent simplification of some acceptable cases.

No testcase here but I will try to make one using the RTL frontend
later.  This fix is required for mips64el-linux-gnu bootstrap
to succeed (in conjunction with patch 1 of this series). The specific
file affected by this bug is building gcc/predict.c where a bad reload
is created in predict_paths_for_bb.  Register 300 in the following
example is spilled to memory in SImode but reloaded as DImode.

(insn 247 212 389 3 (set (reg:SI 300)
(ne:SI (subreg/s/u:SI (reg/v:DI 231 [ taken ]) 0)
(const_int 0 [0]))) "/home/mfortune/gcc/gcc/predict.c":2904 504 
{*sne_zero_sisi}
 (nil))
...

(insn 250 256 251 40 (set (reg:DI 6 $6)
(subreg:DI (reg:SI 300) 0)) "/home/mfortune/gcc/gcc/predict.c":2904 310 
{*movdi_64bit}
 (nil))

Thanks,
Matthew

gcc/
PR target/78660
* lra-constraints.c (simplify_operand_subreg): Handle
WORD_REGISTER_OPERATIONS targets.
---
 gcc/lra-constraints.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 66ff2bb..484a70d 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -1541,11 +1541,18 @@ simplify_operand_subreg (int nop, machine_mode reg_mode)
 subregs as we don't substitute such equiv memory (see processing
 equivalences in function lra_constraints) and because for spilled
 pseudos we allocate stack memory enough for the biggest
-corresponding paradoxical subreg.  */
- if (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
-   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
- || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
- && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg
+corresponding paradoxical subreg.
+
+However, never simplify a (subreg (mem ...)) for
+WORD_REGISTER_OPERATIONS targets as this may lead to loading junk
+data into a register when the inner is narrower than outer or
+missing important data from memory when the inner is wider than
+outer.  */
+ if (!WORD_REGISTER_OPERATIONS
+ && (!(MEM_ALIGN (subst) < GET_MODE_ALIGNMENT (mode)
+   && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (subst)))
+ || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
+ && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg)
return true;
 
  *curr_id->operand_loc[nop] = operand;
-- 
2.2.1



[PATCH 2/5] Tighten condition for converting SUBREG reloads from OP_OUT to OP_INOUT

2017-02-07 Thread Matthew Fortune
Hi,

This change addresses a comment from Richard Sandiford in:
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02165.html

where a previous fix was over eager in converting OP_OUT reloads to
OP_INOUT.

No testcase here either but I will try and exercise this code later
with a targeted test using the RTL frontend if possible.

Thanks,
Matthew

gcc/
PR target/78660
* lra-constraints.c (curr_insn_transform): Tighten condition
for converting SUBREG reloads from OP_OUT to OP_INOUT.
---
 gcc/lra-constraints.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index f29308f..66ff2bb 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4139,7 +4139,17 @@ curr_insn_transform (bool check_only_p)
  || (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (reg))
  && WORD_REGISTER_OPERATIONS)))
{
- if (type == OP_OUT)
+ /* An OP_INOUT is required when reloading a subreg of a
+mode wider than a word to ensure that data beyond the
+word being reloaded is preserved.  Also automatically
+ensure that strict_low_part reloads are made into
+OP_INOUT which should already be true from the backend
+constraints.  */
+ if (type == OP_OUT
+ && (curr_static_id->operand[i].strict_low
+ || (GET_MODE_SIZE (GET_MODE (reg)) > UNITS_PER_WORD
+ && GET_MODE_SIZE (mode)
+< GET_MODE_SIZE (GET_MODE (reg)
type = OP_INOUT;
  loc = _REG (*loc);
  mode = GET_MODE (*loc);
-- 
2.2.1



[PATCH 0/5] LRA fixes for MIPS-like targets

2017-02-07 Thread Matthew Fortune
Hi, 

This patch series addresses a variety of issues in LRA that the MIPS
target has exposed but are not specific to MIPS.

The fixes are primarily related to how WORD_REGISTER_OPERATIONS needs
to be accounted for in LRA SUBREG reloading.  In almost all cases LRA
needs to reload the inner REG/MEM/PLUS rather than simplify to use
the outer mode.  Following detailed analysis of the code in
simplify_operand_subreg there is scope for various improvements around
this area.  These changes will however be far too invasive for GCC 7 in
stage 4.

Notes on the current implementation are here:
https://gcc.gnu.org/ml/gcc/2017-02/msg0.html

Background discussion on the fixes is here:
https://gcc.gnu.org/ml/gcc/2017-01/msg00130.html

Original bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660

Some fixes made in this series are written to ensure safety and minimize
impact.  As such further changes including reworking the fixes are
likely but as part of GCC 8 development.

This whole patch series has been run through bootstrap and regression
test on mips64el-linux-gnu.

@Vladimir: Although you have given approval in principle to some of these
already please could you double check as I have added comments in some
cases. Patch 5 has not been posted before in any form.

@Eric: thanks for the offer of sparc testing; please can you let me know
if there are any issues?

I'm running x86_64 testing too but most of the code won't trigger there
as it is not a WORD_REGISTER_OPERATIONS target. If anyone can give
a recipe for bootstrapping ARM on a compile farm machine I will do that
but I have no idea how to get the ARM multiarch stuff to work in Ubuntu.

Thanks,
Matthew

Matthew Fortune (4):
  Handle WORD_REGISTER_OPERATIONS when reloading (subreg (reg))
  Tighten condition for converting SUBREG reloads from OP_OUT to
OP_INOUT
  Support WORD_REGISTER_OPERATIONS requirements in
simplify_operand_subreg
  Partial revert of r243782 to restore previous behavior

Robert Suchanek (1):
  Ensure the mode used to create split registers is suppported

 gcc/lra-constraints.c | 61 ---
 1 file changed, 53 insertions(+), 8 deletions(-)

-- 
2.2.1



[PATCH 1/5] Handle WORD_REGISTER_OPERATIONS when reloading (subreg(reg))

2017-02-07 Thread Matthew Fortune
Hi,

This change deals with reloading a subreg(reg) using the inner mode
to prevent partial spilling of data like in the case described here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660#c8

No test case for now but I am investigating a targeted test using
the RTL frontend for later submission.

Thanks,
Matthew

gcc/
PR target/78660
* lra-constraints.c (curr_insn_transform): Handle
WORD_REGISTER_OPERATIONS requirements when reloading SUBREGs.
---
 gcc/lra-constraints.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 22323b2..f29308f 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -4130,7 +4130,14 @@ curr_insn_transform (bool check_only_p)
  && (goal_alt[i] == NO_REGS
  || (simplify_subreg_regno
  (ira_class_hard_regs[goal_alt[i]][0],
-  GET_MODE (reg), byte, mode) >= 0)
+  GET_MODE (reg), byte, mode) >= 0)))
+ /* WORD_REGISTER_OPERATIONS targets require the register
+to be reloaded when the outer mode is strictly
+narrower than the inner mode.  Note: It may be
+necessary to always reload the inner mode here but it
+requires further investigation.  */
+ || (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (reg))
+ && WORD_REGISTER_OPERATIONS)))
{
  if (type == OP_OUT)
type = OP_INOUT;
-- 
2.2.1



RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

2017-02-06 Thread Matthew Fortune
Toma Tabacu <toma.tab...@imgtec.com> writes:
> Matthew Fortune writes:
> >
> > That's not what I hoped but is what I was concerned about as I believe
> > it means we have a change of behaviour.  It boils down to simply
> > ignoring the argument type of unsigned char.  My guess is that a zero
> > extension is created but then immediately eliminated because of the
> paradoxical subreg.
> >
> > I think you need to create a temporary and perform the zero extension
> > to ensure we honour the unsigned char operand:
> >
> >   rtx new_dst = gen_reg_rtx (SImode);
> >   emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value));
> >   ops[2].value = foo;
> >
> > This should mean that the testcase I sent always has a zero extension
> > but if you change the type of 'amount' to be unsigned char then there
> > should not be a zero extension as the argument will be assumed to be
> > correctly zero extended already and the explicitly introduced
> zero_extend will be eliminated.
> >
> 
> I have made it generate a zero_extend instead of a SUBREG.
> However, the pattern associated with gen_zero_extendqisi2 does not work
> with immediate operands, so I had to add an extra step in which the
> argument is put into a QImode register before being passed to
> gen_zero_extendqisi2.
> 
> Is this OK ?
> 
> Regards,
> Toma
> 
> gcc/
> 
>   * config/mips/mips.c (mips_expand_builtin_insn): Convert the QImode
>   argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
>   builtins to SImode and emit a zero-extend, if necessary.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> da7fa8f..bab5b93 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16571,9 +16571,35 @@ mips_expand_builtin_insn (enum insn_code icode,
> unsigned int nops,  {
>machine_mode imode;
>int rangelo = 0, rangehi = 0, error_opno = 0;
> +  rtx qireg, sireg;
> 
>switch (icode)
>  {
> +/* The third operand of these instructions is in SImode, so we need
> to
> +   bring the corresponding builtin argument from QImode into
> SImode.  */
> +case CODE_FOR_loongson_pshufh:
> +case CODE_FOR_loongson_psllh:
> +case CODE_FOR_loongson_psllw:
> +case CODE_FOR_loongson_psrah:
> +case CODE_FOR_loongson_psraw:
> +case CODE_FOR_loongson_psrlh:
> +case CODE_FOR_loongson_psrlw:
> +  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +  sireg = gen_reg_rtx (SImode);
> +  /* We need to put the immediate in a register because
> +  gen_zero_extendqisi2 does not accept immediate operands.  */
> +  if (CONST_INT_P (ops[2].value))
> + {
> +   qireg = gen_reg_rtx (QImode);
> +   emit_insn (gen_rtx_SET (qireg, ops[2].value));
> +   emit_insn (gen_zero_extendqisi2 (sireg, qireg));
> + } else {
> +   emit_insn (gen_zero_extendqisi2 (sireg, ops[2].value));
> + }

Almost but not quite. There is a force_reg helper that takes care of
this i.e. can get rid of the qireg local and the whole if statement.

  emit_insn (gen_zero_extendqisi2 (sireg, force_reg (ops[2].value)));

> +  ops[2].value = sireg;
> +  ops[2].mode = SImode;
> +  break;
> +
>  case CODE_FOR_msa_addvi_b:
>  case CODE_FOR_msa_addvi_h:
>  case CODE_FOR_msa_addvi_w:

OK with that change.

Thanks,
Matthew


RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

2017-02-02 Thread Matthew Fortune
Toma Tabacu <toma.tab...@imgtec.com> writes:
> > From: Matthew Fortune
> > > +/* The third argument needs to be in SImode in order to succesfully
> > > match
> > > +   the operand from the insn definition.  */
> >
> > Please refer to operand here not argument as it is the second argument
> > to the builtin but third operand of the instruction.  Also double ss in
> > successfully.
> >
> 
> I have rewritten the comment to address these mistakes.
> 
> > > +case CODE_FOR_loongson_pshufh:
> > > +case CODE_FOR_loongson_psllh:
> > > +case CODE_FOR_loongson_psllw:
> > > +case CODE_FOR_loongson_psrah:
> > > +case CODE_FOR_loongson_psraw:
> > > +case CODE_FOR_loongson_psrlh:
> > > +case CODE_FOR_loongson_psrlw:
> > > +  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> > > +  ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> > > +  ops[2].mode = SImode;
> > > +  break;
> > > +
> > >  case CODE_FOR_msa_addvi_b:
> > >  case CODE_FOR_msa_addvi_h:
> > >  case CODE_FOR_msa_addvi_w:
> >
> > For the record, given paradoxical subregs are a headache...
> > I am OK with this on the basis that the argument to psllh etc is actually
> > a uint8_t which means that bits 8 upwards are guaranteed to be zero so
> > the subreg can be eliminated without any explicit sign or zero extension
> > inserted.  This is the same kind of optimisation that combine would
> > perform when eliminating zero extension.
> >
> > Please can you check that a zero extension is inserted for the following
> > case with -O2 or above:
> >
> > int16x4_t testme(int16x4_t s, int amount)
> > {
> >   return psllh_s (s, amount);
> > }
> >
> > If my understanding is correct there should be an ANDI 0xff inserted
> > or similar.
> >
> 
> The ANDI 0xff is present for -O0, after the first time the value is loaded
> from memory, but it is not generated for -O1 and -O2.
> I'm not seeing any zero extension happening for -O1 and -O2.

That's not what I hoped but is what I was concerned about as I believe it
means we have a change of behaviour.  It boils down to simply ignoring the
argument type of unsigned char.  My guess is that a zero extension is
created but then immediately eliminated because of the paradoxical subreg.

I think you need to create a temporary and perform the zero extension to
ensure we honour the unsigned char operand:

  rtx new_dst = gen_reg_rtx (SImode);
  emit_insn (gen_zero_extendqisi2 (new_dst, ops[2].value));
  ops[2].value = foo;

This should mean that the testcase I sent always has a zero extension but if
you change the type of 'amount' to be unsigned char then there should not be
a zero extension as the argument will be assumed to be correctly zero extended
already and the explicitly introduced zero_extend will be eliminated.

Apologies for not proposing this before.

Thanks,
Matthew

> 
> The only change in the patch below is the fixed comment.
> 
> Regards,
> Toma
> 
> gcc/
> 
>   * config/mips/mips.c (mips_expand_builtin_insn): Put the QImode
>   argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
>   builtins into an SImode paradoxical SUBREG.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index da7fa8f..e5b2d9a 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode, 
> unsigned int
> nops,
> 
>switch (icode)
>  {
> +/* The third operand of these instructions is in SImode, so we need to
> +   bring the corresponding builtin argument from QImode into SImode.  */
> +case CODE_FOR_loongson_pshufh:
> +case CODE_FOR_loongson_psllh:
> +case CODE_FOR_loongson_psllw:
> +case CODE_FOR_loongson_psrah:
> +case CODE_FOR_loongson_psraw:
> +case CODE_FOR_loongson_psrlh:
> +case CODE_FOR_loongson_psrlw:
> +  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +  ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> +  ops[2].mode = SImode;
> +  break;
> +
>  case CODE_FOR_msa_addvi_b:
>  case CODE_FOR_msa_addvi_h:
>  case CODE_FOR_msa_addvi_w:



RE: [PATCH] MIPS: Fix mode mismatch error between Loongson builtin arguments and insn operands.

2017-01-31 Thread Matthew Fortune
Toma Tabacu  writes:
> The builtins for the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
> Loongson instructions have the third argument's type set to UQI while
> its corresponding insn operand is in SImode.
> 
> This results in the following error when matching insn operands:
> 
> ../gcc/gcc/include/loongson.h: In function 'test_psllw_s':
> ../gcc/gcc/include/loongson.h:483:10: error: invalid argument to built-
> in function
>return __builtin_loongson_psllw_s (s, amount);
>   ^~
> 
> This causes the loongson-simd.c and loongson-shift-count-truncated-1.c
> tests to fail.
> 
> This patch fixes this by wrapping the QImode builtin argument inside a
> paradoxical SUBREG with SImode, which will successfully match against
> the insn operand.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/
> 
>   * config/mips/mips.c (mips_expand_builtin_insn): Put the QImode
>   argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
>   builtins into an SImode paradoxical SUBREG.
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index
> da7fa8f..f1ca6e2 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode,
> unsigned int nops,
> 
>switch (icode)
>  {
> +/* The third argument needs to be in SImode in order to succesfully
> match
> +   the operand from the insn definition.  */

Please refer to operand here not argument as it is the second argument
to the builtin but third operand of the instruction.  Also double ss in 
successfully.

> +case CODE_FOR_loongson_pshufh:
> +case CODE_FOR_loongson_psllh:
> +case CODE_FOR_loongson_psllw:
> +case CODE_FOR_loongson_psrah:
> +case CODE_FOR_loongson_psraw:
> +case CODE_FOR_loongson_psrlh:
> +case CODE_FOR_loongson_psrlw:
> +  gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> +  ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> +  ops[2].mode = SImode;
> +  break;
> +
>  case CODE_FOR_msa_addvi_b:
>  case CODE_FOR_msa_addvi_h:
>  case CODE_FOR_msa_addvi_w:

For the record, given paradoxical subregs are a headache...
I am OK with this on the basis that the argument to psllh etc is actually
a uint8_t which means that bits 8 upwards are guaranteed to be zero so
the subreg can be eliminated without any explicit sign or zero extension
inserted.  This is the same kind of optimisation that combine would
perform when eliminating zero extension.

Please can you check that a zero extension is inserted for the following
case with -O2 or above:

int16x4_t testme(int16x4_t s, int amount)
{
  return psllh_s (s, amount);
}

If my understanding is correct there should be an ANDI 0xff inserted
or similar.

OK with those changes and confirmation of the test above.

Thanks,
Matthew


RE: [patch mips/gcc] add build-time and runtime options to disable or set madd.fmt type

2017-01-20 Thread Matthew Fortune
Moore, Catherine <catherine_mo...@mentor.com> writes:
> > -Original Message-
> > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> > Sent: Thursday, January 19, 2017 5:30 PM
> > To: Moore, Catherine <catherine_mo...@mentor.com>
> > Cc: 'Aurelien Jarno' <aurel...@aurel32.net>; 'Richard Sandiford'
> > <rdsandif...@googlemail.com>; Loosemore, Sandra
> > <sandra_loosem...@mentor.com>; Yunqiang Su
> > <yunqiang...@imgtec.com>; 'gcc-patches' <gcc-patches@gcc.gnu.org>
> > Subject: RE: [patch mips/gcc] add build-time and runtime options to
> > disable or set madd.fmt type
> >
> > Matthew Fortune <matthew.fort...@imgtec.com> writes:
> > > I've rewritten/simplified this patch as it provides far too much
> > control
> > > to end users who will undoubtedly shoot themselves in the foot so to
> > > speak. The option I intend to support is simply --with-madd4 --
> > without-madd4
> > > and -mmadd4 -mno-madd4. This is a simple enable/disable on top of
> > > architecture checks to use/not use the madd4 family of instructions.
> > >
> > > We have to keep each of these unusual features simple so that we
> > can somehow
> > > reason about them in the future.
> > >
> >
> > Here is the tested patch.  Configure time default set/not set tested and
> > testsuite
> > fixes in place to deal with the fallout from running with the madd4
> > instructions
> > disabled.  Tests done with an o32 config on mips64el-linux-gnu.  If
> > there is any
> > other fallout from other test configurations I'll catch those as I try to
> > get the
> > rest of the testsuite issues resolved before release.
> >
> > Catherine, any issues to raise on this new option?
> 
> I committed this patch after fixing a couple of typos in the documentation 
> and ChangeLog
> entry.
> No other objections.
> Catherine

Thanks Catherine. I'll check to see what if/any documentation changes have gone 
in
without your review and send you a list in case you have time to review it 
before
release.

Thanks,
Matthew 


RE: [patch mips/gcc] add build-time and runtime options to disable or set madd.fmt type

2017-01-19 Thread Matthew Fortune
Matthew Fortune <matthew.fort...@imgtec.com> writes:
> I've rewritten/simplified this patch as it provides far too much control
> to end users who will undoubtedly shoot themselves in the foot so to
> speak. The option I intend to support is simply --with-madd4 --without-madd4
> and -mmadd4 -mno-madd4. This is a simple enable/disable on top of
> architecture checks to use/not use the madd4 family of instructions.
> 
> We have to keep each of these unusual features simple so that we can somehow
> reason about them in the future.
> 

Here is the tested patch.  Configure time default set/not set tested and 
testsuite
fixes in place to deal with the fallout from running with the madd4 instructions
disabled.  Tests done with an o32 config on mips64el-linux-gnu.  If there is any
other fallout from other test configurations I'll catch those as I try to get 
the
rest of the testsuite issues resolved before release.

Catherine, any issues to raise on this new option?

Thanks,
Matthew

gcc/

* config.gcc (supported_defaults): Add madd4.
(with_madd4): Add validation.
(all_defaults): Add madd4.
* config/mips/mips.opt (mmadd4): New option.
* gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
mmadd4.
(TARGET_CPU_CPP_BUILTINS): Add builtin_define for
__mips_no_madd4.
(ISA_HAS_UNFUSED_MADD4): Gate with mips_madd4.
(ISA_HAS_FUSED_MADD4): Likewise.
* gcc/doc/invoke.texi (-mmadd4): Document the new option.
* doc/install.texi (--with-madd4): Document the new option.

gcc/testsuite/

* gcc.target/mips/madd4-1.c: New file.
* gcc.target/mips/madd4-2.c: Likewise.
* gcc.target/mips/mips.exp (mips_option_groups): Add ghost option
HAS_MADD4.
(mips_option_groups): Add -m[no-]madd4.
(mips-dg-init): Detect default -mno-madd4.
(mips-dg-options): Handle HAS_MADD4 arch upgrade/downgrade.
* gcc.target/mips/mips-ps-type.c: Add -mmadd4 test option.
* gcc.target/mips/mips-ps-type-2.c: Likewise.
* gcc.target/mips/nmadd-1.c: Likewise.
* gcc.target/mips/nmadd-2.c: Likewise.
* gcc.target/mips/nmadd-3.c: Likewise.
---
 gcc/ChangeLog  | 16 
 gcc/config.gcc | 19 +--
 gcc/config/mips/mips.h | 12 +---
 gcc/config/mips/mips.opt   |  4 
 gcc/doc/install.texi   | 14 ++
 gcc/doc/invoke.texi|  8 +++-
 gcc/testsuite/ChangeLog| 15 +++
 gcc/testsuite/gcc.target/mips/madd4-1.c| 14 ++
 gcc/testsuite/gcc.target/mips/madd4-2.c| 14 ++
 gcc/testsuite/gcc.target/mips/mips-ps-type-2.c |  2 +-
 gcc/testsuite/gcc.target/mips/mips-ps-type.c   |  2 +-
 gcc/testsuite/gcc.target/mips/mips.exp | 12 +++-
 gcc/testsuite/gcc.target/mips/nmadd-1.c|  2 +-
 gcc/testsuite/gcc.target/mips/nmadd-2.c|  2 +-
 gcc/testsuite/gcc.target/mips/nmadd-3.c|  2 +-
 15 files changed, 126 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/madd4-1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/madd4-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e53f9e1..7496071 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,19 @@
+2017-01-19  Matthew Fortune  <matthew.fort...@imgtec.com>
+   Yunqiang Su  <yunqiang...@imgtec.com>
+
+   * config.gcc (supported_defaults): Add madd4.
+   (with_madd4): Add validation.
+   (all_defaults): Add madd4.
+   * config/mips/mips.opt (mmadd4): New option.
+   * gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
+   mmadd4.
+   (TARGET_CPU_CPP_BUILTINS): Add builtin_define for
+   __mips_no_madd4.
+   (ISA_HAS_UNFUSED_MADD4): Gate with mips_madd4.
+   (ISA_HAS_FUSED_MADD4): Likewise.
+   * gcc/doc/invoke.texi (-mmadd4): Document the new option.
+   * doc/install.texi (--with-madd4): Document the new option.
+
 2017-01-19  Chenghua Xu  <paul.hua...@gmail.com>
 
* config/mips/mips.h (ISA_HAS_FUSED_MADD4): Enable for
diff --git a/gcc/config.gcc b/gcc/config.gcc
index dd8c08c..9e67d36 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3940,7 +3940,7 @@ case "${target}" in
;;
 
mips*-*-*)
-   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1"
+   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 
madd4"
 
case ${with_float} in
"" | soft | hard)
@@ -4078,6 +4078,21 @@ case &q

RE: [patch mips/gcc] add build-time and runtime options to disable or set madd.fmt type

2017-01-19 Thread Matthew Fortune
Hi,

I've rewritten/simplified this patch as it provides far too much control
to end users who will undoubtedly shoot themselves in the foot so to
speak. The option I intend to support is simply --with-madd4 --without-madd4
and -mmadd4 -mno-madd4. This is a simple enable/disable on top of
architecture checks to use/not use the madd4 family of instructions.

We have to keep each of these unusual features simple so that we can somehow
reason about them in the future.

The patch below is still in test so may need further tweaks but I intend to
finish that and commit later on today.

Thanks,
Matthew

gcc/

* config.gcc (supported_defaults): Add madd4.
(with_madd4): Add validation.
(all_defaults): Add madd4.
* config/mips/mips.opt (mmadd4): New option.
* gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
mmadd4.
(TARGET_CPU_CPP_BUILTINS): Add builtin_define for
__mips_no_madd4.
(ISA_HAS_UNFUSED_MADD4): Gate with mips_madd4.
(ISA_HAS_FUSED_MADD4): Likewise.
* gcc/doc/invoke.texi (-mmadd4): Document the new option.
* doc/install.texi (--with-madd4): Document the new option.

gcc/testsuite/

* gcc.target/mips/madd4-1.c: New file.
* gcc.target/mips/madd4-2.c: Likewise.
* gcc.target/mips/mips.exp (mips_option_groups): Add ghost option
HAS_MADD4.
(mips_option_groups): Add -m[no-]madd4.
(mips-dg-init): Detect default -mno-madd4.
(mips-dg-options): Handle HAS_MADD4 arch upgrade/downgrade.
---
 gcc/ChangeLog   | 16 
 gcc/config.gcc  | 19 +--
 gcc/config/mips/mips.h  | 12 +---
 gcc/config/mips/mips.opt|  4 
 gcc/doc/install.texi| 14 ++
 gcc/doc/invoke.texi |  8 +++-
 gcc/testsuite/ChangeLog | 10 ++
 gcc/testsuite/gcc.target/mips/madd4-1.c | 14 ++
 gcc/testsuite/gcc.target/mips/madd4-2.c | 14 ++
 gcc/testsuite/gcc.target/mips/mips.exp  | 12 +++-
 gcc/testsuite/gcc.target/mips/nmadd-1.c |  2 +-
 gcc/testsuite/gcc.target/mips/nmadd-2.c |  2 +-
 gcc/testsuite/gcc.target/mips/nmadd-3.c |  2 +-
 13 files changed, 119 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/madd4-1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/madd4-2.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e53f9e1..7496071 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,19 @@
+2017-01-19  Matthew Fortune  <matthew.fort...@imgtec.com>
+   Yunqiang Su  <yunqiang...@imgtec.com>
+
+   * config.gcc (supported_defaults): Add madd4.
+   (with_madd4): Add validation.
+   (all_defaults): Add madd4.
+   * config/mips/mips.opt (mmadd4): New option.
+   * gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
+   mmadd4.
+   (TARGET_CPU_CPP_BUILTINS): Add builtin_define for
+   __mips_no_madd4.
+   (ISA_HAS_UNFUSED_MADD4): Gate with mips_madd4.
+   (ISA_HAS_FUSED_MADD4): Likewise.
+   * gcc/doc/invoke.texi (-mmadd4): Document the new option.
+   * doc/install.texi (--with-madd4): Document the new option.
+
 2017-01-19  Chenghua Xu  <paul.hua...@gmail.com>
 
* config/mips/mips.h (ISA_HAS_FUSED_MADD4): Enable for
diff --git a/gcc/config.gcc b/gcc/config.gcc
index dd8c08c..9e67d36 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3940,7 +3940,7 @@ case "${target}" in
;;
 
mips*-*-*)
-   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1"
+   supported_defaults="abi arch arch_32 arch_64 float fpu nan 
fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 
madd4"
 
case ${with_float} in
"" | soft | hard)
@@ -4078,6 +4078,21 @@ case "${target}" in
exit 1
;;
esac
+
+   case ${with_madd4} in
+   yes)
+   with_madd4=madd4
+   ;;
+   no)
+   with_madd4=no-madd4
+   ;;
+   "")
+   ;;
+   *)
+   echo "Unknown madd4 type used in --with-madd4" 1>&2
+   exit 1
+   ;;
+   esac
;;
 
nds32*-*-*)
@@ -4511,7 +4526,7 @@ case ${target} in
 esac
 
 t=
-all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 
schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls 
lxc1-sxc1"
+all_defaults="abi cpu cpu_32 cpu_64 arch arch_

RE: [PATCH,gcc/MIPS] Make loongson3a use fused madd.d

2017-01-19 Thread Matthew Fortune
Hi Paul,

Your latest version of the patch is now committed.  I have been doing some
work on the recursive build failure but the issue is complex and involves
LRA so I went ahead with committing your change independently.  It also
turns out that (at least when targeting loongson3a) there are stage2/3
object comparison issues even after resolving the LRA bug with an initial
fix.

r244641

Thanks,
Matthew


RE: [PATCH, MIPS] Target flag and build option to disable indexed memory OPs.

2017-01-19 Thread Matthew Fortune
ps_cpu_info {
\
   if (TARGET_CACHE_BUILTIN)
\
builtin_define ("__GCC_HAVE_BUILTIN_MIPS_CACHE");   \
+  if (!ISA_HAS_LXC1_SXC1)  \
+   builtin_define ("__mips_no_lxc1_sxc1"); \
 }  \
   while (0)
 
@@ -866,7 +868,8 @@ struct mips_cpu_info {
   {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \
   {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \
   {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \
-  {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" }
+  {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" },   \
+  {"lxc1-sxc1", "%{!mlxc1-sxc1:%{!mno-lxc1-sxc1:-m%(VALUE)}}" } \
 
 /* A spec that infers the:
-mnan=2008 setting from a -mips argument,
@@ -1036,7 +1039,8 @@ struct mips_cpu_info {
 
 /* ISA has floating-point indexed load and store instructions
(LWXC1, LDXC1, SWXC1 and SDXC1).  */
-#define ISA_HAS_LXC1_SXC1  ISA_HAS_FP4
+#define ISA_HAS_LXC1_SXC1  (ISA_HAS_FP4\
+&& mips_lxc1_sxc1)
 
 /* ISA has paired-single instructions.  */
 #define ISA_HAS_PAIRED_SINGLE  ((ISA_MIPS64\
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 2559649..75ebafd 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -388,6 +388,10 @@ mlra
 Target Report Var(mips_lra_flag) Init(1) Save
 Use LRA instead of reload.
 
+mlxc1-sxc1
+Target Report Var(mips_lxc1_sxc1) Init(1)
+Use lwxc1/swxc1/ldxc1/sdxc1 instructions where applicable.
+
 mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower 
Enum(mips_arch_opt_value)
 -mtune=PROCESSOR   Optimize the output for PROCESSOR.
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 4793ef8..712b82a 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1375,6 +1375,25 @@ On MIPS targets, make @option{-msynci} the default when 
no
 On MIPS targets, make @option{-mno-synci} the default when no
 @option{-msynci} option is passed.  This is the default.
 
+@item --with-lxc1-sxc1
+On MIPS targets, make @option{-mlxc1-sxc1} the default when no
+@option{-mno-lxc1-sxc1} option is passed.  This is the default.
+
+@item --without-lxc1-sxc1
+On MIPS targets, make @option{-mno-lxc1-sxc1} the default when no
+@option{-mlxc1-sxc1} option is passed.  The indexed load/store
+instructions are not directly a problem but can lead to unexpected
+behaviour when deployed in an application intended for a 32-bit address
+space but run on a 64-bit processor.  The issue is seen because all
+known MIPS 64-bit Linux kernels execute o32 and n32 applications
+with 64-bit addressing enabled which affects the overflow behaviour
+of the indexed addressing mode.  GCC will assume that ordinary
+32-bit arithmetic overflow behaviour is the same whether performed
+as an @code{addu} instruction or as part of the address calculation
+in @code{lwxc1} type instructions.  This assumption holds true in a
+pure 32-bit environment and can hold true in a 64-bit environment if
+the address space is accurately set to be 32-bit for o32 and n32.
+
 @item --with-mips-plt
 On MIPS targets, make use of copy relocations and PLTs.
 These features are extensions to the traditional
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 82cb1b5..a13a450 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -19932,6 +19932,12 @@ it is unused.
 
 This optimization is off by default at all optimization levels.
 
+@item -mlxc1-sxc1
+@itemx -mno-lxc1-sxc1
+@opindex mlxc1-sxc1
+When applicable, enable (disable) the generation of @code{lwxc1},
+@code{swxc1}, @code{ldxc1}, @code{sdxc1} instructions.  Enabled by default.
+
 @end table
 
 @node MMIX Options
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index e0e0bd5..0ba8f93 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,14 @@
+2017-01-19  Matthew Fortune  <matthew.fort...@imgtec.com>
+
+   PR target/78176
+   * gcc.target/mips/lxc1-sxc1-1.c: New file.
+   * gcc.target/mips/lxc1-sxc1-2.c: Likewise.
+   * gcc.target/mips/mips.exp (mips_option_groups): Add ghost option
+   HAS_LXC1.
+   (mips_option_groups): Add -m[no-]lxc1-sxc1.
+   (mips-dg-init): Detect default -mno-lxc1-sxc1.
+   (mips-dg-options): Handle HAS_LXC1 arch upgrade/downgrade.
+
 2017-01-19  Andre Vehreschild  <ve...@gcc.gnu.org>
 
PR fortran/70696
diff --git a/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c 
b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
new file mode 100644
index 000..f455eb8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
@@

RE: [PATCH] MIPS: Fix generation of DIV.G and MOD.G for Loongson targets.

2017-01-18 Thread Matthew Fortune
Toma Tabacu <toma.tab...@imgtec.com> writes:
> Matthew Fortune writes:
> >
> > Sounds good. I'd prefer to get the testsuite clean first then improve the
> > code quality as a later step since it is not a regression and we are
> > a few days off stage 4.
> >
> > In terms of the patch then the ISA_HAS_DIV3 macro is not currently used so
> > I suggest that instead it is renamed to ISA_AVOID_DIV_HILO and then use
> > that macro in the definition of ISA_HAS_DIV and ISA_HAS_DDIV to turn
> > off the DIV/DDIV instructions.
> >
> > The ISA_HAS_DIV3 should have been cleaned up when R6 was added as it is
> > ambiguous and could refer to multiple variants of 3-reg operand DIV now
> > rather than just Loongson's.
> >
> > Thanks,
> > Matthew
> 
> I believe the patch below fits the description.
> I've also added a (too?) succinct explanation for the ISA_AVOID_DIV_HILO 
> macro.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/ChangeLog:
> 
>   * config/mips/mips.h: Add macro to prevent generation of regular
>   (D)DIV(U) instructions for Loongson.

The changelog needs to be more detailed about what changed and can be
less detailed about why:

* config/mips/mips.h (ISA_HAS_DIV3): Remove unused macro.
(ISA_AVOID_DIV_HILO): New macro.
(ISA_HAS_DIV): Use new ISA_AVOID_DIV_HILO macro.
(ISA_HAS_DDIV): Likewise.

> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> index f91b43d..e21e7d8 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -967,19 +967,24 @@ struct mips_cpu_info {
>  /* ISA supports instructions DMUL, DMULU, DMUH, DMUHU.  */
>  #define ISA_HAS_R6DMUL   (TARGET_64BIT && mips_isa_rev >= 6)
> 
> +/* For Loongson, it is preferable to use the Loongson-specific division and
> +   modulo instructions instead of the regular (D)DIV(U) instruction, because
> +   the former are faster and also have the effect of reducing code size.  */

Might want to put 'also can have the effect' given they don't yet.

> +#define ISA_AVOID_DIV_HILO   ((TARGET_LOONGSON_2EF   \
> +   || TARGET_LOONGSON_3A)\
> +  && !TARGET_MIPS16)
> +
>  /* ISA supports instructions DDIV and DDIVU. */
>  #define ISA_HAS_DDIV (TARGET_64BIT   \
>&& !TARGET_MIPS5900\
> +  && !ISA_AVOID_DIV_HILO \
>&& mips_isa_rev <= 5)
> 
>  /* ISA supports instructions DIV and DIVU.
> This is always true, but the macro is needed for ISA_HAS_DIV
> in mips.md.  */
> -#define ISA_HAS_DIV  (mips_isa_rev <= 5)
> -
> -#define ISA_HAS_DIV3 ((TARGET_LOONGSON_2EF   \
> -   || TARGET_LOONGSON_3A)\
> -  && !TARGET_MIPS16)
> +#define ISA_HAS_DIV  (!ISA_AVOID_DIV_HILO\
> +  && mips_isa_rev <= 5)
> 
>  /* ISA supports instructions DIV, DIVU, MOD and MODU.  */
>  #define ISA_HAS_R6DIV(mips_isa_rev >= 6)

Apart from those changes this looks OK to me.

Matthew


RE: [PATCH] MIPS: Fix generation of DIV.G and MOD.G for Loongson targets.

2017-01-17 Thread Matthew Fortune
Maciej Rozycki  writes:
> On Mon, 16 Jan 2017, Toma Tabacu wrote:
> 
> > After searching through the archives, I have found an interesting bit
> > of information about DIV.G/MOD.G in the original submission thread:
> >
> > > > Ruan Beihong 23 July 2008:
> > > >
> > > > I've seen the Loongson 2F manual carefully. The (d)div(u) is
> > > > internally splited into one (d)div(u).g and one (d)mod(u).g. So I
> > > > said before was wrong. The truth is that, (d)div(u).g and
> > > > (d)mod(u).g are always faster than (d)div(u), at least the time
> > > > spend on mflo/mfhi is saved.
> > > >
> > > > James Ruan
> > >
> > > Richard Sandiford 24 July 2008:
> > >
> > > OK, great.  In that case, it should simply be a case of disabling
> > > the divmod-related insns for Loongson, in addition to your patch.
> > > (Probably stating the obvious there, sorry.)
> > >
> > > Richard
> >
> > Here's the link for part 1 of the submission thread (has the quotes
> from above):
> > https://gcc.gnu.org/ml/gcc-patches/2008-07/msg01529.html
> > and here's part 2:
> > https://gcc.gnu.org/ml/gcc-patches/2008-11/msg00273.html
> 
>  Thanks for digging this out!
> 
> > If DIV.G/MOD.G are faster, according to Ruan Beihong, and also smaller
> > than DIV (or the same size [1]), as pointed out by Maciej, then I am
> > led to the same conclusion as Richard Sandiford: that only DIV.G/MOD.G
> > should be generated for Loongson.
> >
> > I think it would still be a good idea to add a test for separated
> > DIV.G/MOD.G, though.
> 
>  Possibly, though the combined tests need to stay then, to make sure
> generic DIV/DIVU is not ever produced.

I'm happy to just stick with the original tests as they effectively test
both scenarios just at different optimisation levels. i.e. the new divmod
expansion only kicks in at -O2 I believe.

> > What are your thoughts on this ?
> > Have I misunderstood something in the context of the submission thread
> ?
> >
> > Regards,
> > Toma
> >
> > [1] I've noticed that GCC generates the same TEQ instruction twice if
> > both DIV.G and MOD.G are needed, which makes the sequence just as big
> > as DIV + TEQ + MFHI + MFLO; this seems unnecessary to me.
> 
>  This ought to be handled then, likely by adding Loongson-specific RTL
> insns matching the `divmod4' and `udivmod4' expanders.  It
> may be as simple as say (conceptually, untested):
> 
> (define_insn "divmod4_loongson"
>   [(set (match_operand:GPR 0 "register_operand" "=d")
>   (any_div:GPR (match_operand:GPR 1 "register_operand" "d")
>(match_operand:GPR 2 "register_operand" "d")))
>(set (match_operand:GPR 3 "register_operand" "=d")
>   (any_mod:GPR (match_dup 1)
>(match_dup 2)))]
>   "TARGET_LOONGSON_2EF"
> {
>   return mips_output_division
> ("div.g\t%0,%1,%2\;mod.g\t%3,%1,%2", operands);
> }
>   [(set_attr "type" "idiv")
>(set_attr "mode" "")])
> 
> although any final fix will have to take an instruction count adjustment
> into account too, as `mips_idiv_insns' won't as it stands handle the new
> case.

Sounds good. I'd prefer to get the testsuite clean first then improve the
code quality as a later step since it is not a regression and we are
a few days off stage 4.

In terms of the patch then the ISA_HAS_DIV3 macro is not currently used so
I suggest that instead it is renamed to ISA_AVOID_DIV_HILO and then use
that macro in the definition of ISA_HAS_DIV and ISA_HAS_DDIV to turn
off the DIV/DDIV instructions.

The ISA_HAS_DIV3 should have been cleaned up when R6 was added as it is
ambiguous and could refer to multiple variants of 3-reg operand DIV now
rather than just Loongson's.

Thanks,
Matthew



RE: [PATCH, MIPS] Target flag and build option to disable indexed memory OPs.

2017-01-17 Thread Matthew Fortune
Moore, Catherine <catherine_mo...@mentor.com> writes:
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Matthew Fortune
> > Sent: Monday, January 16, 2017 11:25 AM
> > To: Doug Gilmore <doug.gilm...@imgtec.com>; gcc-
> > patc...@gcc.gnu.org
> > Cc: Moore, Catherine <catherine_mo...@mentor.com>
> > Subject: RE: [PATCH, MIPS] Target flag and build option to disable
> > indexed memory OPs.
> >
> > Doug Gilmore <doug.gilm...@imgtec.com>
> > > I recently bisected PR78176 to problems introduced with r21650.
> > >
> > > Given the short time until the release, we would like to provide a
> > > target flag and build option to avoid the bug until we are able to
> > > resolve the problem with the commit.  Note that as Matthew Fortune
> > has
> > > mentioned in the PR:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176#c5
> > >
> > > the problem could also be addressed by updates to the Linux kernel
> > since
> > > the problem is only exposed by running MIPS 32-bit binaries on 64-
> > bit
> > > kernels.
> > >
> > > Bootstrapped on X86_64, regression tested on X86_64 and MIPS.
> > >
> > > OK to commit?
> >
> > Given this is a generic reference to indexed load/store and the issue
> > could
> > affect any indexed operation then I think it needs to include all of the
> > following as well:
> >
> > /* ISA has lwxs instruction (load w/scaled index address.  */
> > #define ISA_HAS_LWXS((TARGET_SMARTMIPS ||
> > TARGET_MICROMIPS) \
> >  && !TARGET_MIPS16)
> >
> > /* ISA has lbx, lbux, lhx, lhx, lhux, lwx, lwux, or ldx instruction. */
> > #define ISA_HAS_LBX (TARGET_OCTEON2)
> > #define ISA_HAS_LBUX(ISA_HAS_DSP || TARGET_OCTEON2)
> > #define ISA_HAS_LHX (ISA_HAS_DSP || TARGET_OCTEON2)
> > #define ISA_HAS_LHUX(TARGET_OCTEON2)
> > #define ISA_HAS_LWX (ISA_HAS_DSP || TARGET_OCTEON2)
> > #define ISA_HAS_LWUX(TARGET_OCTEON2 && TARGET_64BIT)
> > #define ISA_HAS_LDX ((ISA_HAS_DSP || TARGET_OCTEON2) \
> >  && TARGET_64BIT)
> >
> > The DSP LBUX/LHX/LWX/LDX intrinsics will also need a new AVAIL
> > predicate
> > to disable them. The snag is that some DSP code will fail to compile if it
> > uses the DSP load intrinsics directly.
> >
> > I see no way of avoiding that. Therefore, distributions that use
> > --without-indexed-load-store will have to cope with some potential
> > DSP
> > fallout if they enable DSP at all.
> >
> > @Catherine: I'd like your input here if possible as I advocated this
> > approach, comments on option names welcome too.  I quite like the
> > verbose
> > name.
> 
> Okay, based on my reading of the comments in the bug report, you are 
> proposing this option
> as a workaround to a kernel deficiency.  I don't see any agreement that this 
> is actually a
> compiler bug.
> Do we really need to include the DSP instrinsics as well?   Do you think that 
> many
> distributions actually enable DSP?
> 
> The option name itself is acceptable to me.  I'd like to see documentation 
> that explains
> when this problem is exposed.  I'd like to limit the fix to LWXS and I'd like 
> to see the
> testcase from the bug report added to the testsuite.
> I also agree that the preprocessor macro is a good idea (even if we decide to 
> forgo the
> DSP portion of the patch).

Thanks for the comments.

Having thought further I agree we can safely ignore DSP indexed load and 
micromips LWXS on
the basis that DSP code will not run on a MIPS64 processor anyway (at least 
none that I
know of) so the issue cannot occur and similarly for microMIPS, there are no 
64-bit cores.

Restricting to just LWXC1/SWXC1/LDXC1/SDXC1 is therefore fine but we should 
reflect
that in option names then.

--with-lxc1-sxc1 --without-lxc1-sxc1
-mlxc1-sxc1

These names reflect the internal macro that controls availability of these 
instructions.

Macro name: __mips_no_lxc1_sxc1
Defined when !ISA_HAS_LXC1_SXC1 so would be present even when targeting a core 
that
doesn't have the instructions anyway.

Any refinements to this Catherine?

Thanks,
Matthew




RE: [PATCH, MIPS] Target flag and build option to disable indexed memory OPs.

2017-01-16 Thread Matthew Fortune
Doug Gilmore <doug.gilm...@imgtec.com>
> I recently bisected PR78176 to problems introduced with r21650.
> 
> Given the short time until the release, we would like to provide a
> target flag and build option to avoid the bug until we are able to
> resolve the problem with the commit.  Note that as Matthew Fortune has
> mentioned in the PR:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176#c5
> 
> the problem could also be addressed by updates to the Linux kernel since
> the problem is only exposed by running MIPS 32-bit binaries on 64-bit
> kernels.
> 
> Bootstrapped on X86_64, regression tested on X86_64 and MIPS.
> 
> OK to commit?

Given this is a generic reference to indexed load/store and the issue could
affect any indexed operation then I think it needs to include all of the
following as well:

/* ISA has lwxs instruction (load w/scaled index address.  */
#define ISA_HAS_LWXS((TARGET_SMARTMIPS || TARGET_MICROMIPS) \
 && !TARGET_MIPS16)

/* ISA has lbx, lbux, lhx, lhx, lhux, lwx, lwux, or ldx instruction. */
#define ISA_HAS_LBX (TARGET_OCTEON2)
#define ISA_HAS_LBUX(ISA_HAS_DSP || TARGET_OCTEON2)
#define ISA_HAS_LHX (ISA_HAS_DSP || TARGET_OCTEON2)
#define ISA_HAS_LHUX(TARGET_OCTEON2)
#define ISA_HAS_LWX (ISA_HAS_DSP || TARGET_OCTEON2)
#define ISA_HAS_LWUX(TARGET_OCTEON2 && TARGET_64BIT)
#define ISA_HAS_LDX ((ISA_HAS_DSP || TARGET_OCTEON2) \
 && TARGET_64BIT)

The DSP LBUX/LHX/LWX/LDX intrinsics will also need a new AVAIL predicate
to disable them. The snag is that some DSP code will fail to compile if it
uses the DSP load intrinsics directly.

I see no way of avoiding that. Therefore, distributions that use
--without-indexed-load-store will have to cope with some potential DSP
fallout if they enable DSP at all.

@Catherine: I'd like your input here if possible as I advocated this
approach, comments on option names welcome too.  I quite like the verbose
name.

@Doug: Have you tried running the testsuite with the configure option
--without-indexed-load-store? There may be tests that need adjusting where they
test indexed load/store. We probably need a pre-processor macro
to detect if the option is enabled/disabled so that DSP code can be predicated
on indexed load being available.

Thanks,
Matthew


RE: [PATCH] PR79079 Fix __builtin_mul_overflow code gen for !TRULY_NOOP_TRUNCATION target

2017-01-16 Thread Matthew Fortune
Kito Cheng  writes:
> On Mon, Jan 16, 2017 at 02:42:08PM +0800, Kito Cheng wrote:
> > 2017-01-16  Kito Cheng 
> > Kuan-Lin Chen 
> >
> > PR target/PR79079
> > * gcc/internal-fn.c (expand_mul_overflow): Use convert_modes
> instead of
> > gen_lowpart.

Thanks for the fix Kito, much appreciated.

I believe this is candidate for backport to GCC 6 given a week or so in trunk
to make sure it's OK.

Matthew


RE: [PATCH] MIPS: Fix generation of DIV.G and MOD.G for Loongson targets.

2017-01-12 Thread Matthew Fortune
Maciej Rozycki  writes:
> On Thu, 12 Jan 2017, Toma Tabacu wrote:
> 
> > > > Unfortunately, this interferes with the generation of DIV.G and
> > > > MOD.G (the div3 and mod3 patterns) for Loongson
> > > > targets,
> > > which
> > > > causes test failures.
> > >
> > >  What test failures?  Details please.
> > >
> >
> > It's
> > gcc.target/mips/loongson-muldiv-1.c
> > gcc.target/mips/loongson-muldiv-2.c
> > gcc.target/mips/loongson3a-muldiv-1.c
> > gcc.target/mips/loongson3a-muldiv-2.c
> > on O2, O3, and Os.
> >
> > They are also checking for the Loongson-specific multiply instruction,
> > but there are no failures for that.
> 
>  So these tests have e.g.:
> 
> NOMIPS16 st sdiv (st x, st y) { return x / y + x % y; }
> 
> and as such it looks to me like this code does legitimately use a single
> DIV instruction rather than a DIV.G and MOD.G pair, at least at -O2/-O3,
> as I'd expect two divisions to take twice as much computing time as one
> does, and the avoidance of the extra accumulator access overhead needed
> with DIV does not compensate for it.  For -Os actual code generated will
> have to be checked; I suspect DIV.G/MOD.G ought to be used rather than
> DIV/MFLO/MFHI as it's two instructions vs three.
> 
>  So as a first step I'd split these tests into individual cases covering
> signed/unsigned function pairs each of which doing a single operation
> only, i.e. smul/umul, sdiv/udiv, smod/umod, which will then be expected
> to always use the extra Loongson instructions.  This ought to provide
> the coverage originally intended (study the discussion around the
> submission of the patch that introduced these tests to double-check).

This sounds like a reasonable fix. The theme of Toma's changes has been to
improve testsuite stability rather than code gen so breaking these tests
up to cover the original goal looks OK.

>  As a further step a test case for sdivmod/udivmod will then be needed,
> to cover the use of DIV vs DIV.G/MOD.G as required for speed vs space
> optimisation.

I see no need to improve code quality currently as it would be new work.
Based on the description I expect -Os will be using DIV and MFLO/MFHI.
Adding a test to record that certain optimisation levels happen to now
get the base arch DIV instruction used could be risky for stability but
I have no objection unless the new test somehow fails when pitted against
the plethora of configurations we have to test.

>  Likewise for GSMULT/GSDIV/GSMOD, etc. (hmm, why are the signed
> MULT.G/GSMULT instruction variants never used?).

Somewhat weird but so is the presence of signed and unsigned mul in R6.
I believe that microarchitecture optimisation is possible where results
could be memoised waiting for the high part multiply to be issued and
then reused.

Since there shouldn't be any compiler change needed to resolve this then
I see no need to test specifically against loongson. Just verify the
code generated looks reasonable as per the original tests.

Thanks,
Matthew


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-11 Thread Matthew Fortune
Maciej Rozycki <maciej.rozy...@imgtec.com> writes:
> On Thu, 5 Jan 2017, Matthew Fortune wrote:
> > It is true to say that users are discouraged from using 2008-NaN with
> > soft-float for pre-R6 architectures simply to avoid further fragmentation
> > of software for no real gain. However, for R6 then soft-float is 2008-NaN
> > otherwise we are stuck with legacy-NaN forever.
> 
>  What's the actual issue you have with legacy NaN, and how does soft-float
> relate to R6?  It's not like hardware, R6 or othwerwise, limits soft-float
> in any way.

What about floating point data stored in binary form written by a hard-float
app and read by a soft-float etc. From R6 onwards it would be poor not to
address this and make it the same. I assume people do this sometimes!

For R5 I agree there is no useful reason to do soft-float in nan2008 but
someone may wish to do that in a closed environment if perhaps they have
a requirement like the one I describe above for R6.
 
> > If someone did want to build a system from source with soft-float as
> > 2008-NaN then I see no reason to stop them but I doubt they would and I
> > don't expect the --with-nan GCC configure option to be used in conjunction
> > with --with-float=soft for the same reason. The most likely use of
> > --with-nan is to build a distribution specifically to target an MSA capable
> > system like P5600 or perhaps an M5150 with an FPU. The NaN interlinking
> > work will make these use-cases less important still though I think.
> 
>  You can have GCC configured with `--with-nan=2008' and equipped with a

You 'can' but I don't think you would... unless that is actually what you
wanted which is really the premise of permitting nan2008 soft-float.

> soft-float multilib.  IMHO you ought to be able to just use `-msoft-float'
> then to select the soft-float multilib and have it implicitly use the
> legacy NaN encoding rather than having to pass `-msoft-float -mnan=legacy'
> to get the intended semantics.

This is something a vendor configuration could handle or the addition of
a spec to do it but I believe we have currently got a reasonably clean
separation of options in the generic/unknown configuration such that use
of one option does interfere with others (at least in terms of the options
which affect which ABI variant is in use). The less x implies y type
options we have the less mental trauma we have in understanding what the
effective behaviour is of a given set of options. That's not to say it is
in any way easy to figure out currently from an arbitrarily selected MIPS
GCC toolchain.

>  There shouldn't be a need for NaN interlinking for soft-float objects,
> that's just unnecessary burden IMO.

Indeed. I can't see anyone needing that as soft-float nan-2008 version is
highly unlikely.
 
>  MSA is irrelevant for soft-float operations, we don't have a soft-float
> MSA ABI.  If we ever define one, then we could well choose the 2008-NaN
> encoding for compatibility with hard-float code; there's no issue with
> backwards compatibility here as no legacy-NaN MSA hardware has been ever
> allowed.

I wasn't meaning to imply MSA makes sense with soft-float but rather making
the point that there are few scenarios where --with-nan configure time
option is likely to be used but one is a hard-float MSA native toolchain.

>  Have I missed anything?

Summary:

* No technical need to prohibit nan2008 soft-float
* Benefit to R6 onwards such that we don't have to track two different
  floating point formats forever in tools that may not bother with pre-r6
  in the future.
* Marginal benefit in sharing floating point data in binary format between
  soft and hard float programs.

Hope that explains my thinking on this.

Matthew


RE: [patch mips/gcc] add build-time and runtime options to disable or set madd.fmt type

2017-01-11 Thread Matthew Fortune
Sandra Loosemore  writes:
> On 01/10/2017 07:24 AM, Yunqiang Su wrote:
> > Hi, folks, any idea about this patch?
> 
> I can only comment on the documentation parts.

I am reviewing the patch but need to determine if the changes are sufficient and
safe to meet the goal. This area is complex in the MIPS backend so it will be a
few days to respond.

Sandra: Thanks for your docs review, much appreciated.

Thanks,
Matthew

> 
> >> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> >> index b911d76dd66..e548733537d 100644
> >> --- a/gcc/doc/install.texi
> >> +++ b/gcc/doc/install.texi
> >> @@ -1338,6 +1338,23 @@ In the absence of this configuration option the 
> >> default
> convention is
> >> the legacy encoding, as when neither of the @option{-mnan=2008} and
> >> @option{-mnan=legacy} command-line options has been used.
> >>
> >> +@item --with-madd4=@var{type}
> >> +On MIPS targets, set the default behaivor of madd.fmt/msub.fmt. The
> 
> s/behaivor/behavior/
> 
> Are you really setting/describing the behavior of these instructions on
> the target, or are you setting how GCC generates code for them?
> 
> >> +possibilities for @var{type} are:
> >> +@table @code
> >> +@item unfused
> >> +The madd.fmt/msub.fmt instructions are unfused, as with the
> >> +@option{-mmadd4=unfused} command-line option.
> >> +@item fused
> >> +The madd.fmt/msub.fmt instructions are fused, as with the
> >> +@option{-mmadd4=fused} command-line option.
> >> +@item no
> >> +The madd.fmt/msub.fmt are disabled, as with the @option{-mmadd4=no}
> 
> s/are/instructions are/
> >> +command-line option.
> >> +@end table
> >> +In the absence of this configuration option the default convention is
> >> +the unfused, while for MIPS r6 and Loongson, the default is fused.
> 
> s/the unfused/@samp{unfused}
> s/fused/@samp{fused}
> 
> The MIPS maintainers may correct me on this, but it looks to me like
> "MIPS r6" is not an official name of anything.  I see "MIPS Release 6"
> on the Imagination web site and in the MIPS processor documentation.
> 
> Finally, I'm confused about whether the default is based on the
> processor selected at compile time, or depends on some other
> configure-time option.
> 
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index 1096254085f..e88040cdceb 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -19289,6 +19289,29 @@ their trailing significand field being 0.
> >> The default is @option{-mnan=legacy} unless GCC has been configured with
> >> @option{--with-nan=2008}.
> >>
> >> +@item -mmadd4=unfused
> >> +@item -mmadd4=fused
> >> +@item -mmadd4=no
> 
> Use @itemx instead of @item for entries other than the first.
> 
> >> +@opindex mmadd4=unfused
> >> +@opindex mmadd4=fused
> >> +@opindex mmadd4=no
> 
> I think it's sufficient to just have a single index entry
> 
> @opindex mmadd4=
> 
> >> +These options control the type of madd.fmt/msub.fmt, unfused, fused
> >> +or disabled at all.
> 
> "type of" is confusing.  Maybe just say
> 
> These options control generation of madd.fmt/msub.fmt instructions.
> 
> >> +
> >> +The @option{-mmadd4=unfused} option will set type of madd.fmt/msub.fmt
> 
> Same concerns about "type of" here, plus
> s/will set/sets/
> 
> >> +to be unfused if they are generated.
> >> +
> >> +The @option{-mmadd4=fused} option will set type of madd.fmt/msub.fmt
> >> +to be fused if they are generated.
> 
> Same here.
> 
> >> +
> >> +The @option{-mmadd4=no} option will disable madd.fmt/msub.fmt
> >> +to be generated at all.
> 
> s/will disable/prevents/
> s/to be generated/from being generated/
> 
> 
> >> +
> >> +The default is @option{-mmadd4=unfused} unless GCC has been configured
> >> +for MIPS r6+ or Loongson. If runtime flags for MIPS r6 or Loongson is
> >> +given, the type will always be `fused', no matter what value
> >> +@option{-mmadd4=} is.
> >> +
> 
> Same concerns about "MIPS r6".
> 
> I suggest rewriting the second sentence as something like
> 
> "If compiling for a MIPS Release 6 or Loongson processor, this option is
> ignored and the behavior is as if @samp{fused} were specified."
> 
> I think we are talking about controlling code generation determined by
> compilation options, and not detecting runtime processor flag bits, right?
> 
> -Sandra



RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-05 Thread Matthew Fortune
Joseph Myers  writes:
> On Wed, 4 Jan 2017, Maciej W. Rozycki wrote:
> 
> >  AFAIR we deliberately decided not to define a 2008-NaN soft-float
> > ABI, and chose to require all soft-float binaries to use the legacy
> encoding.
> 
> Soft-float and 2008-NaN are naturally completely orthogonal and the
> combination works fine (of course, it doesn't need any special kernel or
> hardware support).  There's no need to disallow the combination.
> 
> In any case, the soft-fp change is relevant in the hard-float case as
> well, to make software TFmode behave consistently with hardware SFmode
> and DFmode regarding NaN payload preservation.

I agree here.

It is true to say that users are discouraged from using 2008-NaN with
soft-float for pre-R6 architectures simply to avoid further fragmentation
of software for no real gain. However, for R6 then soft-float is 2008-NaN
otherwise we are stuck with legacy-NaN forever.

If someone did want to build a system from source with soft-float as
2008-NaN then I see no reason to stop them but I doubt they would and I
don't expect the --with-nan GCC configure option to be used in conjunction
with --with-float=soft for the same reason. The most likely use of
--with-nan is to build a distribution specifically to target an MSA capable
system like P5600 or perhaps an M5150 with an FPU. The NaN interlinking
work will make these use-cases less important still though I think.

Matthew


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-04 Thread Matthew Fortune
Joseph Myers  writes:
> The MIPS sfp-machine.h has an _FP_CHOOSENAN implementation which
> emulates hardware semantics of not preserving signaling NaN payloads for
> an operation with two NaN arguments (although that doesn't suffice to
> avoid sNaN payload preservation in any case with just one NaN argument).
> 
> However, those are only hardware semantics in the legacy NaN case; in
> the NAN2008 case, the architecture documentation says hardware preserves
> payloads in such cases.  Furthermore, this implementation assumes legacy
> NaN semantics, so in the NAN2008 case the implementation actually has
> the effect of preserving sNaN payloads but not preserving qNaN payloads,
> when both should be preserved.
> 
> This patch fixes the code just to copy from the first argument (at the
> level of libgcc, it's not meaningful which argument is the first and
> which is the second).
> 
> Tested for mips64-linux-gnu (soft float, NAN2008) with the glibc math/
> tests.  OK to commit?
> 
> 2017-01-02  Joseph Myers  
> 
>   * config/mips/sfp-machine.h (_FP_CHOOSENAN): Always preserve NaN
>   payload if [__mips_nan2008].

Thanks for finding and fixing this.

OK to commit.

Matthew


RE: [Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

2016-12-21 Thread Matthew Fortune
Sandra Loosemore  writes:
> On 12/21/2016 11:54 AM, Yunqiang Su wrote:
> > By this patch, I add a build-time option ` --with-unfused-madd4=yes/no',
> > and runtime option -m(no-)unfused-madd4,
> > to disable generate madd.fmt instructions.
> 
> Your patch also needs a documentation change so that the new
> command-line option is listed in the GCC manual with other MIPS target
> options.

Any opinions on option names to control this? Is it best to target the specific
feature that is non-compliant on loongson or apply a general -mfix-loongson
type option?

I'm not sure I have a strong opinion either way but there do seem to be
multiple possible variants.

Thanks,
Matthew




RE: [PATCH,gcc/MIPS] Make loongson3a use fused madd.d

2016-12-19 Thread Matthew Fortune
Hi Paul,

Apologies for the delay in responding.

> I get the copyright assignment, it's ok for commit.

Thanks for going through copyright assignment, I can see you listed and
also you have commit access now. Is the trunk build failure still
present for you, if it is now resolved then please go ahead and commit.

Thanks,
Matthew


RE: [PATCH, testsuite] MIPS: Upgrade to R2 for -mmicromips.

2016-12-14 Thread Matthew Fortune
Toma Tabacu  writes:
> microMIPS is not supported on pre-R2 architectures, but the testsuite allows
> it to be used on pre-R2 architectures, which results in test failures.
> 
> This patch makes the testsuite upgrade to R2 if -mmicromips is used in a test.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/mips.exp (mips-dg-options): Upgrade to R2 for
>   -mmicromips.

OK, thanks.

Matthew


RE: [PATCH, testsuite] MIPS: Remove redundant dg-skip-if from mips16-attributes.c.

2016-12-14 Thread Matthew Fortune
Toma Tabacu  writes:
> In the case of mips16-attributes.c, even though the (-mips16) option ensures
> that -mmicromips will be overriden, there still is a dg-skip-if for 
> -mmicromips.
> 
> I think that it is not necessary and it actually decreases test coverage,
> because it will cause the testsuite to skip this test instead of letting
> (-mips16) do its job of overriding the global options.
> 
> This patch removes the dg-skip-if for -mmicromips.
> 
> Tested with mips-mti-elf.
> 
> Regards,
> Toma
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/mips16-attributes.c: Remove dg-skip-if for
>   -mmicromips.

OK, thanks.

Matthew


RE: [PATCH, testsuite] MIPS: Skip msa-builtins-err.c if there is no assembly output.

2016-12-13 Thread Matthew Fortune
Toma Tabacu  writes:
> >
> > It's a shame this is the only way to deal with this but I see aarch64
> > have to resort to the same thing for error checking tests.
> >
> 
> I have looked into this some more and I 've found that the solution I
> proposed is incomplete.
> 
> The problem is that if no linker plugin is found, -fno-fat-lto-objects
> is not passed on, so the test isn't skipped and it will fail because -
> flto doesn't do assembly generation by default and the errors will not
> be triggered.
> 
> This can be fixed by adding -ffat-lto-objects as a test option for error
> tests, as shown in the patch below.
> 
> The thing is, this already happens for scan-assembler & friends behind-
> the-scenes, but not for dg-error, because the former are run through
> force_conventional_output, while the latter is not.
> 
> A nicer solution would be to have a new directive which would do nothing
> except for the force_conventional_output part, thus forcing assembly
> generation, but this may be overkill.
> 
> Regards,
> Toma
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/msa-builtins-err.c (dg-options): Use
>-ffat-lto-objects to guarantee assembly generation.

OK, thanks for investigating. Slight tweak to the changelog we just say
what changed in the changelog not why. You covered 'why' in the code
comment. This is just:

(dg-options): Add -ffat-lto-objects option.

Thanks,
Matthew


  1   2   3   4   >