Re: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-02-15 Thread Richard Sandiford
Graham Stott graham.st...@imgtec.com writes:
 +(define_constraint YC
 +  @internal
 +   A constant vector with each element is a unsigned bitimm-bit integer with 
 only one bit set

Maybe:

  A replicated vector constant in which the replicated value has a single
  bit set

Likewise YZ and clear bits.

 +(define_constraint Y5
 +  @internal
 +   A constant vector with each element is a signed 6-bit integer
 +  (and (match_code const_vector)
 +   (match_test mips_const_vector_any_int_p (op, mode, -32, 31

Maybe use Usv6.

  A replicated vector constant in which the replicated value is a signed
  6-bit number.

 +(define_constraint Y6
 +  @internal
 +   A constant vector with each element a unsigned 6-bit integer
 +  (and (match_code const_vector)
 +   (match_test mips_const_vector_any_int_p (op, mode, 0, 31

Similarly here for Uuv6.  Upper bound should be 63 for a 6-bit integer.
Would be good to have a test for that.

 +(define_constraint Y8
 +  @internal
 +   A constant vector with each element a unsigned 0-bit integer
 +  (and (match_code const_vector)
 +   (match_test mips_const_vector_any_int_p (op, mode, 0, 255

Similarly here for Uuv8.

 @@ -127,3 +351,4 @@
  DEF_MIPS_FTYPE (1, (VOID, USI))
  DEF_MIPS_FTYPE (2, (VOID, V2HI, V2HI))
  DEF_MIPS_FTYPE (2, (VOID, V4QI, V4QI))
 +

No newline here.

 +(define_c_enum unspec [
 +UNSPEC_MSA_ADDVI
 +UNSPEC_MSA_ANDI_B
 +UNSPEC_MSA_ASUB_S
 +  UNSPEC_MSA_ASUB_U
 +  UNSPEC_MSA_AVE_S
 +  UNSPEC_MSA_AVE_U

Formatting (second is right).

 +(define_mode_iterator MODE128_2 [V2DF V4SF V2DI V4SI V8HI V16QI])
 +(define_mode_iterator IMODE128 [V2DI V4SI V8HI V16QI])

These two aren't used and I can't see where MODE128_2 would come in useful.
Let's drop these for now.

 +(define_mode_attr VHALFMODE 
 +  [(V8HI V16QI)
 +   (V4SI V8HI)
 +   (V2SI V4SI)
 +   (V2DI V4SI)
 +   (V2DF V4SF)])
 +
 +;; This attribute gives the integer mode for selection mask in vec_perm.
 +;; vcond also uses MSA_I for operand 0, 1, and 2.
 +(define_mode_attr MSA_I
 +  [(V2DF V2DI)
 +   (V4SF V4SI)
 +   (V2DI V2DI)
 +   (V4SI V4SI)
 +   (V8HI V8HI)
 +   (V16QI V16QI)])
 +
 +;; The attribute give the integer vector mode with same size
 +(define_mode_attr MODE_I
 +  [(V2DF V2DI)
 +   (V4SF V4SI)
 +   (V2DI V2DI)
 +   (V4SI V4SI)
 +   (V8HI V8HI)
 +   (V16QI V16QI)])

Let's call this VIMODE for consistency with both IMODE in mips.md
and the HALFMODE/VHALFMODE pair.  VIMODE can be used in place of MSA_I;
no need for both.

 +;; This attribute qives suffix gives the mode of the result for copy_s_b, 
 copy_u_b etc.
 +(define_mode_attr RES
 +  [(V2DF DF)
 +   (V4SF SF)
 +   (V2DI DI)
 +   (V4SI SI)
 +   (V8HI SI)
 +   (V16QI SI)])

Why we do need to promote sub-SI values to SI for this?  I'd prefer
that we use the correct mode (i.e. UNITMODE) instead.

 +;; This is used in msa_cast* to output mov.s or mov.d.
 +(define_mode_attr msafmt2
 +  [(V2DF d)
 +   (V4SF s)])

Not really an MSA format.  Maybe unitfmt?

 +;; This attribute qives define_insn suffix for MSA instructions 
 +;; with need distinction between integer and floating point.
 +(define_mode_attr msafmt3
 +  [(V2DF d_f)
 +   (V4SF w_f)
 +   (V2DI d)
 +   (V4SI w)
 +   (V8HI h)
 +   (V16QI b)])

msafmt_f might be more mnemonic than msafmt3.

 +;; The maximum index inside a vector.
 +(define_mode_attr max_elem_index
 +  [(V2DF 1)
 +   (V4SF 3)
 +   (V2DI 1)
 +   (V4SI 3)
 +   (V8HI 7)
 +   (V16QI 15)])

In the asserts where this is used it could just be
GET_MODE_NUNITS (MODEmode)

 +;; This is used to form an immediate operand constraint 
 +;; using const_imm_operand.
 +(define_mode_attr imm
 +  [(V2DF 0_or_1)
 +   (V4SF 0_to_3)
 +   (V2DI 0_or_1)
 +   (V4SI 0_to_3)
 +   (V8HI uimm3)
 +   (V16QI uimm4)])

Maybe indeximm rather than imm, for consistency with bitimm?

 +;; This attribute is used to form the MODE for reg_or_0_operand
 +;; constraint.
 +(define_mode_attr REGOR0
 +  [(V2DF DF)
 +   (V4SF SF)
 +   (V2DI DI)
 +   (V4SI SI)
 +   (V8HI SI)
 +   (V16QI SI)])

Same as RES, and same comment.

 +(define_expand vec_extractmode
 +  [(match_operand:UNITMODE 0 register_operand)
 +   (match_operand:IMSA 1 register_operand)
 +   (match_operand 2 const_int_operand)]
 +  ISA_HAS_MSA
 +{
 +  gcc_assert (UINTVAL (operands[2]) = max_elem_index);
 +  enum machine_mode mode0 = GET_MODE (operands[0]);
 +  if (mode0 == QImode || mode0 == HImode)
 +emit_move_insn (operands[0],
 + gen_lowpart (mode0, gen_reg_rtx (SImode)));
 +  else
 +emit_insn (gen_msa_copy_s_msafmt (operands[0], operands[1], 
 operands[2]));
 +  DONE;
 +})

The QImode/HImode case isn't right -- the source of the move is an
uninitialised register.  Please make sure there's a testcase for this.

You should be able to use UNITMODEmode instead of mode0.

 +(define_expand vec_extractmode
 +  [(match_operand:UNITMODE 0 register_operand)
 +   (match_operand:FMSA 1 register_operand)
 +   (match_operand 2 const_int_operand)]
 +  ISA_HAS_MSA
 +{
 +  rtx temp;

RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-02-05 Thread Graham Stott
Hi Richard,

Attached is an updated patch for  feedback so  MSA support to MIPS backend can 
be added when open again for next stage1.

It is unfinished in that some comments from your review of the initial patch 
have yet to be addressed.

The diff is against svn 207500 

Graham




msa.svn.207500.diffs.gz
Description: msa.svn.207500.diffs.gz


RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Matthew Fortune
Hi Richard,

I'd like to get some more of your thoughts on the ABI implications of MSA. 
Generally, any thoughts you have (or anyone else) on the current state of MIPS 
ABIs would be welcome. As an example I'm curious whether you see variations of 
O32 as being new ABIs or extensions of O32, it can be seen as a fine line.

The MSA patch as submitted is another variation of O32 ABI which could be 
described as O32+FP64+MSA(+nan2008) and would be link incompatible with both 
O32 and O32+FP64(+/-nan2008). The same of course applies to N32/N64 being link 
incompatible with N32/N64+FP64. This may make it difficult to have a small part 
of an application optimised with MSA but the rest of the code ignorant of MSA, 
in your experience do you think that will be an issue?

We (MIPS) have several discussions ongoing regarding ABIs on a variety of 
mailing lists so I'm trying to collect as much input as possible to inform 
future plans. 

One part of the patch that I don't believe you commented on is the change of 
stack alignment for both existing O32+FP64 and new O32+FP64+MSA. Did this seem 
OK? You did however comment on a related change to MIPS_STACK_ALIGN which has 
similar implications. I've also pasted the other parts of the patch that are 
ABI related along with your original comments below.

 -#define STACK_BOUNDARY (TARGET_NEWABI ? 128 : 64)
 +/* Because we want to allow MSA functions and non-MSA functions to
 +   call + each other and MSA always requires -mfp64, we set stack
 +   boundary to + 128 bits when newabi or -mfp64.  */
 +#define STACK_BOUNDARY \
 +  ((TARGET_NEWABI || TARGET_FLOAT64) ? 128 : 64)

Thoughts and opinions welcome from all...

Regards,
Matthew

  @@ -419,11 +423,17 @@
 /* The number of words passed in registers, rounded up.  */
 unsigned int reg_words;
 
  -  /* For EABI, the offset of the first register from GP_ARG_FIRST or
  - FP_ARG_FIRST.  For other ABIs, the offset of the first register
  from
  - the start of the ABI's argument structure (see the
  CUMULATIVE_ARGS
  - comment for details).
  +  /* If msa_reg_p is true, the offset of the first register from
  + MSA_ARG_FIRST.
 
  + The value is MAX_ARGS_IN_MSA_REGISTERS if the argument is passed
  +entirely
  + on the stack.
  +
  + If msa_reg_p is false, for EABI, the offset of the first
  +register from
  + GP_ARG_FIRST or FP_ARG_FIRST.  For other ABIs, the offset of the
  +first
  + register from the start of the ABI's argument structure
  + (see the CUMULATIVE_ARGS comment for details).
  +
The value is MAX_ARGS_IN_REGISTERS if the argument is passed
 entirely
on the stack.  */
 unsigned int reg_offset;
 
 So the MSA arguments operate outside the normal argument structure
 for o32, n32 and n64?  That's OK with me but please write up what the new
 ABI is.
 
  @@ -6445,6 +6979,233 @@
   }
   }
 
  +/* Write a stub for the current function.  This stub is used
  +   for functions from FR0 to FR1 or from FR1 to FR0.
  +   If FR0_TO_FR1_P is true, we need to set the FR mode to 1,
  +   set up floating-point arguments, align the stack pointer to 16
  +bytes,
  +   copy the arguments on the stack, call the function, set up the
  +returned
  +   floating-point value, and restore the FR mode to 0.
  +   If FR0_TO_FR1_P is false, the stub will be similar but we won't
  +need to
  +   align the stack pointer to 16 bytes and copy the arguments.  */
 
 Hmm, what's all this about?
 
 
  @@ -1410,8 +1436,11 @@
   /* 8 is observed right on a DECstation and on riscos 4.02.  */
   #define STRUCTURE_SIZE_BOUNDARY 8
 
  -/* There is no point aligning anything to a rounder boundary than
  this.  */ -#define BIGGEST_ALIGNMENT LONG_DOUBLE_TYPE_SIZE
  +/* There is no point aligning anything to a rounder boundary than
  +   LONG_DOUBLE_TYPE_SIZE, unless under MSA the bigggest alignment is
  +   BITS_PER_MSA_REG.  */
  +#define BIGGEST_ALIGNMENT \
  +  (TARGET_MSA ? BITS_PER_MSA_REG : LONG_DOUBLE_TYPE_SIZE)
 
 I'm not sure about the ABI implications of this -- will have to think about
 it.  Please integrate MSA into the comment a bit more rather than tacking it
 onto the end.
 
  @@ -2293,8 +2372,8 @@
   /* Treat LOC as a byte offset from the stack pointer and round it up
  to the next fully-aligned offset.  */
   #define MIPS_STACK_ALIGN(LOC) \
  -  (TARGET_NEWABI ? ((LOC) + 15)  -16 : ((LOC) + 7)  -8)
  -
  +  ((TARGET_NEWABI || TARGET_FLOAT64) ? \
  +   ((LOC) + 15)  -16 : ((LOC) + 7)  -8)
 
 What's this about -- again seems dangerous ABIwise.



RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Joseph S. Myers
On Tue, 21 Jan 2014, Matthew Fortune wrote:

 The MSA patch as submitted is another variation of O32 ABI which could 
 be described as O32+FP64+MSA(+nan2008) and would be link incompatible 
 with both O32 and O32+FP64(+/-nan2008). The same of course applies to 
 N32/N64 being link incompatible with N32/N64+FP64. This may make it 
 difficult to have a small part of an application optimised with MSA but 
 the rest of the code ignorant of MSA, in your experience do you think 
 that will be an issue?

In general, unnecessary ABI incompatibility is to be discouraged.  This 
means (I'm talking about all architectures, not just MIPS):

* Use dynamic stack realignment rather than increasing the alignment 
assumed by callees.

* New registers should be call-clobbered, so setjmp/longjmp (and *context 
functions) don't need to save and preserve them.

* Do not change layout or alignment of existing types.

* If your architecture defines that some ISA feature can only coexist with 
other ISA features (for example, that it can only coexist with hardware 
floating point, or only with 64-bit support) then that may reduce the 
number of ABI variants that need to be defined.

* If you want to pass vector arguments (or return values) in new 
registers, without affecting the ABI for any non-vector argument or return 
type, we're a bit more relaxed on that - but if this affects GNU C generic 
vectors rather than just architecture-specific vector types there should, 
at least, be a -Wpsabi warning about the ABI change implied by enabling 
the vector instructions.  Alternatively, enable such argument passing only 
with an ABI option e.g. -mmsa-abi, rather than with the same option that 
enables the new instructions.  In either case, you need to make sure the 
dynamic linker is built with the new instructions disabled, to avoid it 
clobbering the new registers (just as it generally needs special handling 
for call-clobbered registers involved in argument passing).

* In any case, write an ABI document at an early stage that documents how 
the new registers interact with the ABI.

If you do conclude that you need a new, incompatible ABI, then the 
following also apply - see the NaN2008 changes for recent MIPS examples:

* Ensure the static linker gives errors for linking incompatible objects 
(whether with GNU object attributes, ELF header bits, or some other 
mechanism).

* For glibc, define new dynamic linker names (see recent discussion on the 
glibc lists) and ensure the corresponding specs in GCC pass the right 
arguments to -dynamic-linker.

* Ensure that glibc can tell executables / shared libraries for the 
different ABIs apart by examining the ELF header, ensure 
elf_machine_matches_host checks compatibility, and ensure ldconfig handles 
the different kinds of executables / shared libraries 
(_DL_CACHE_DEFAULT_ID, readelflib.c:process_elf_file, etc.).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Richard Sandiford
Hi Matthew,

Just wanted to add a couple of MIPS-specific things on top of what
Joseph said:

Matthew Fortune matthew.fort...@imgtec.com writes:
 The MSA patch as submitted is another variation of O32 ABI which could
 be described as O32+FP64+MSA(+nan2008) and would be link incompatible
 with both O32 and O32+FP64(+/-nan2008). The same of course applies to
 N32/N64 being link incompatible with N32/N64+FP64. This may make it
 difficult to have a small part of an application optimised with MSA but
 the rest of the code ignorant of MSA, in your experience do you think
 that will be an issue?

FWIW, n32/n64 is always fp64 -- we don't support an fp32 version.
And as you imply, o32+fp64 is already an established ABI so I think we
have to support the current form alongside any new one.  I agree with
Joseph that it'd be better to realign the stack dynamically instead.
This is what x86 does, so it's well tested within gcc.

Also, I think there was a possibility of adding a 256-bit form of MSA
in future, is that right?  So if we added extra static alignments,
would we need a separate ABI for 256-bit MSA too?

 We (MIPS) have several discussions ongoing regarding ABIs on a variety
 of mailing lists so I'm trying to collect as much input as possible to
 inform future plans.

 One part of the patch that I don't believe you commented on is the
 change of stack alignment for both existing O32+FP64 and new
 O32+FP64+MSA. Did this seem OK?

No, sorry, not sure why I didn't include that.

Thanks,
Richard


Re: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Joseph S. Myers
On Tue, 21 Jan 2014, Richard Sandiford wrote:

 And as you imply, o32+fp64 is already an established ABI so I think we
 have to support the current form alongside any new one.  I agree with
 Joseph that it'd be better to realign the stack dynamically instead.
 This is what x86 does, so it's well tested within gcc.

With glibc, o32+fp64 is not established - the glibc patch submitted in 
November needs more work as I noted in my review, likely to include 
dynamic linker names distinct from those used for o32+fp32.  So it would 
be possible to declare that only a new form is supported with glibc (more 
generally, with the Linux kernel, given the lack of current kernel support 
for o32+fp64 stated in the glibc discussion), with appropriate configure 
checks preventing building glibc with an old-ABI o32+fp64 compiler (and 
ideally a #error in some glibc header disallowing building programs with 
an old-ABI o32+fp64 compiler).

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Matthew Fortune
 Just wanted to add a couple of MIPS-specific things on top of what Joseph
 said:
 
 Matthew Fortune matthew.fort...@imgtec.com writes:
  The MSA patch as submitted is another variation of O32 ABI which could
  be described as O32+FP64+MSA(+nan2008) and would be link incompatible
  with both O32 and O32+FP64(+/-nan2008). The same of course applies to
  N32/N64 being link incompatible with N32/N64+FP64. This may make it
  difficult to have a small part of an application optimised with MSA
  but the rest of the code ignorant of MSA, in your experience do you
  think that will be an issue?
 
 FWIW, n32/n64 is always fp64 -- we don't support an fp32 version.

Sorry I meant N32/N64 being link incompatible with N32/N64+MSA.

 And as you imply, o32+fp64 is already an established ABI so I think we have
 to support the current form alongside any new one.  I agree with Joseph that
 it'd be better to realign the stack dynamically instead.
 This is what x86 does, so it's well tested within gcc.

I see that x86 includes support for a DRAP register which as far as I can tell 
improves debug'ability and the generated code when dynamic stack realignment is 
triggered. Any thoughts on whether defining a DRAP register for MIPS would be 
worthwhile if we were to rely on dynamic realignment?

 
 Also, I think there was a possibility of adding a 256-bit form of MSA in 
 future,
 is that right?  So if we added extra static alignments, would we need a
 separate ABI for 256-bit MSA too?

Personally, I'd want to avoid further ABIs if MSA ever got extended to 256bit 
and I would expect there would also be some desire to interlink the current MSA 
with future extensions which would make further ABI variants undesirable. 
Adding extra static alignments now would therefore seem like a bad precedent to 
set.

Going back to a point from Joseph:

 * If you want to pass vector arguments (or return values) in new registers, 
 without affecting the ABI for any non-vector argument or return type, we're 
 a bit more relaxed on that - but if this affects GNU C generic vectors rather 
 than just architecture-specific vector types there should, at least, be a 
 -Wpsabi warning about the ABI change implied by enabling the vector 
 instructions.

I assume we could define architecture specific types with the same layout as 
GNU C generic vectors and differentiate between the two giving the user the 
choice of which to use and have GNU C generic vectors stay with existing 
calling convention. I guess arch specific types are defined by tagging them 
with an attribute?

 Alternatively, enable such argument passing only with an ABI 
 option e.g. -mmsa-abi, rather than with the same option that enables the 
 new instructions.  In either case, you need to make sure the dynamic linker
 is built with the new instructions disabled, to avoid it clobbering the new
 registers (just as it generally needs special handling for call-clobbered 
 registers involved in argument passing).

The MSA registers are not completely new but rather wider registers that 
overlay with the standard FP registers... I think this therefore causes further 
issues as use of the corresponding FP registers would clobber the MSA values 
even if the underlying FP registers were call-saved (the extended part of the 
register is zero'd when used as scalar FP).

Matthew



RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2014-01-21 Thread Matthew Fortune
  And as you imply, o32+fp64 is already an established ABI so I think we
  have to support the current form alongside any new one.  I agree with
  Joseph that it'd be better to realign the stack dynamically instead.
  This is what x86 does, so it's well tested within gcc.
 
 With glibc, o32+fp64 is not established - the glibc patch submitted in
 November needs more work as I noted in my review, likely to include
 dynamic linker names distinct from those used for o32+fp32.  So it would be
 possible to declare that only a new form is supported with glibc (more
 generally, with the Linux kernel, given the lack of current kernel support for
 o32+fp64 stated in the glibc discussion), with appropriate configure checks
 preventing building glibc with an old-ABI o32+fp64 compiler (and ideally a
 #error in some glibc header disallowing building programs with an old-ABI
 o32+fp64 compiler).

True, but it doesn't seem wise to have the 'same' ABI work differently between 
linux and bare metal. Code portability from linux to bare metal (and 
vice-versa) can be important and whilst code could be written taking the 
biggest stack alignment into consideration to solve such a problem it still 
feels wrong to have them work differently.

 
 --
 Joseph S. Myers
 jos...@codesourcery.com



RE: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2013-11-21 Thread Graham Stott
Hi Joseph,

Thanks for the comments I will address these issues and send an updated  patch.

Graham





Re: [PATCH RFC] MIPS add support for MIPS SIMD ARCHITECTURE V1.07

2013-11-20 Thread Joseph S. Myers
This patch inserts a #if 0 around existing code, which looks suspicious, 
and appears to be missing documentation in invoke.texi for the new 
command-line options, and extend.texi for the new built-in functions.

-- 
Joseph S. Myers
jos...@codesourcery.com