[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2021-05-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|8.5 |---
 CC||jakub at gcc dot gnu.org

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2020-03-04 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|8.4 |8.5

--- Comment #36 from Jakub Jelinek  ---
GCC 8.4.0 has been released, adjusting target milestone.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread ma...@linux-mips.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #35 from Maciej W. Rozycki  ---
So presumably the actual solution for n32 would be the same as with x32
and SIB, which IIUC cannot rely on hardware wrapping around the address
space either.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread ma...@linux-mips.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #34 from Maciej W. Rozycki  ---
(In reply to mpf from comment #29)
> I don't remember the detail of this issue but I believe I was convinced that
> it is down to the lack of setting PX appropriately in HW. UX==0, PX==1. The
> PX control bit forces address calculations i.e. base + imm or base + reg to
> be performed with 32-bit rules but allows 64 instruction usage. Since there
> is a processor mode that is perfectly capable of meeting the requirements of
> a program with 64bit data and 32bit pointers then the solution is to set PX
> for N32 rather than UX.

This is impractical because as I say Linux has to support processors that
have no CP0.Status.PX bit and do have to rely on CP0.Status.UX instead.

NB Richard, n32 is 64-bit mode, pretty much like x86's x32, except that
invented some 20 years earlier.  So regs are already DImode as are stack
slots, etc.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #33 from Thomas De Schampheleire  ---
(In reply to Andrew Pinski from comment #32)
> >I'm currently using -march=octeon3   or -march=octeon2  as appropriate.
> 
> Can you report this to Marvell (Cavium)?  O32 was not used much on Octeon.


Yes, I will.
However, please note that I am using N32, not O32.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #32 from Andrew Pinski  ---
>I'm currently using -march=octeon3   or -march=octeon2  as appropriate.

Can you report this to Marvell (Cavium)?  O32 was not used much on Octeon.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #31 from Thomas De Schampheleire  ---
(In reply to Maciej W. Rozycki from comment #27)
> Yes, it is the same problem, the same address calculation occurs here,
> and the lack of 32-bit address space wraparound is a part of the n32
> Linux ABI, which implies support for processors that do not support such
> a wraparound in hardware (no CP0.Status.PX bit).
> 
> You may try experimenting with ISA/ASE selection options, so that LWX is
> not considered a valid instruction by GCC.  Otherwise I can't help with
> finding a workaround as I don't know one offhand and I'm not involved
> with MIPS development anymore, sorry.  And neither is Doug BTW.
> 
> This really ought to be fixed properly in GCC.

I'm currently using -march=octeon3   or -march=octeon2  as appropriate.
I'm not really confident in changing this, as there will be other impact too.

As a quick workaround/test, I will try letting the '-mno-lxc1-sxc1' flag also
control the use of the lwx and similar instructions, as follows:

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 23e1672b586..5dee3fbe29f 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1194,17 +1194,17 @@ struct mips_cpu_info {

 /* ISA has lwxs instruction (load w/scaled index address.  */
 #define ISA_HAS_LWXS   ((TARGET_SMARTMIPS || TARGET_MICROMIPS) \
-&& !TARGET_MIPS16)
+&& !TARGET_MIPS16 && mips_lxc1_sxc1)

 /* ISA has lbx, lbux, lhx, lhx, lhux, lwx, lwux, or ldx instruction. */
-#define ISA_HAS_LBX(TARGET_OCTEON2)
-#define ISA_HAS_LBUX   (ISA_HAS_DSP || TARGET_OCTEON2)
-#define ISA_HAS_LHX(ISA_HAS_DSP || TARGET_OCTEON2)
-#define ISA_HAS_LHUX   (TARGET_OCTEON2)
-#define ISA_HAS_LWX(ISA_HAS_DSP || TARGET_OCTEON2)
-#define ISA_HAS_LWUX   (TARGET_OCTEON2 && TARGET_64BIT)
+#define ISA_HAS_LBX(TARGET_OCTEON2 && mips_lxc1_sxc1)
+#define ISA_HAS_LBUX   ((ISA_HAS_DSP || TARGET_OCTEON2) &&
mips_lxc1_sxc1)
+#define ISA_HAS_LHX((ISA_HAS_DSP || TARGET_OCTEON2) &&
mips_lxc1_sxc1)
+#define ISA_HAS_LHUX   (TARGET_OCTEON2 && mips_lxc1_sxc1)
+#define ISA_HAS_LWX((ISA_HAS_DSP || TARGET_OCTEON2) &&
mips_lxc1_sxc1)
+#define ISA_HAS_LWUX   (TARGET_OCTEON2 && TARGET_64BIT &&
mips_lxc1_sxc1)
 #define ISA_HAS_LDX((ISA_HAS_DSP || TARGET_OCTEON2) \
-&& TARGET_64BIT)
+&& TARGET_64BIT && mips_lxc1_sxc1)

 /* The DSP ASE is available.  */
 #define ISA_HAS_DSP(TARGET_DSP && !TARGET_MIPS16)

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #30 from Thomas De Schampheleire  ---
(In reply to mpf from comment #29)
> I don't remember the detail of this issue but I believe I was convinced that
> it is down to the lack of setting PX appropriately in HW. UX==0, PX==1. The
> PX control bit forces address calculations i.e. base + imm or base + reg to
> be performed with 32-bit rules but allows 64 instruction usage. Since there
> is a processor mode that is perfectly capable of meeting the requirements of
> a program with 64bit data and 32bit pointers then the solution is to set PX
> for N32 rather than UX.

This would have to be done by the kernel when switching to an application,
correct? And then only for n32 applications, not for n64 or others.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread mpf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #29 from mpf at gcc dot gnu.org ---
I don't remember the detail of this issue but I believe I was convinced that it
is down to the lack of setting PX appropriately in HW. UX==0, PX==1. The PX
control bit forces address calculations i.e. base + imm or base + reg to be
performed with 32-bit rules but allows 64 instruction usage. Since there is a
processor mode that is perfectly capable of meeting the requirements of a
program with 64bit data and 32bit pointers then the solution is to set PX for
N32 rather than UX.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-22 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #28 from Richard Biener  ---
(In reply to James Cowgill from comment #0)
> Before the ldxc1 instruction is executed, gdb reports that the values in v0
> and s0 are both large integers (above 0x8000):
> (gdb) print/x $v0
> $1 = 0xfffee7f8
> (gdb) print/x $s0
> $2 = 0x80008b50
> 
> When added together, the lower 32-bits contains the correct pointer (in this
> case on the stack). On a 32-bit processor this is fine.
> 
> On a 64-bit processor, we know that v0 and s0 are sign extended as the last
> instructions to touch them were the addu at 924 and the subu at 8e0. So the
> values in the registers are actually:
> v0 = 0xfffee7f8
> s0 = 0x80008b50
> 
> Adding these together (modulo 64-bit) gives the final pointer of
> 0x7fff7348 which is outside the user address space and thus results
> in a SIGBUS.
> 
> I think GCC is assuming that the address calculated by the ldxc1 instruction
> is modulo 32-bit when compiled for a 32-bit processor. However, this is not
> true if the code is later run on a 64-bit processor.

Given the above GCC has to consider pointers (even integer regs!?) being
64bit even in 32bit mode since obviously semantics between 32bit and 64bit
hardware differs.  Thus as Eric says Pmode needs to be DImode and
ptr_mode SImode for 32bit and DImode for 64bit.  Everything else is going
to lead to issues like the ones observed.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-21 Thread ma...@linux-mips.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #27 from Maciej W. Rozycki  ---
Yes, it is the same problem, the same address calculation occurs here,
and the lack of 32-bit address space wraparound is a part of the n32
Linux ABI, which implies support for processors that do not support such
a wraparound in hardware (no CP0.Status.PX bit).

You may try experimenting with ISA/ASE selection options, so that LWX is
not considered a valid instruction by GCC.  Otherwise I can't help with
finding a workaround as I don't know one offhand and I'm not involved
with MIPS development anymore, sorry.  And neither is Doug BTW.

This really ought to be fixed properly in GCC.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-21 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #26 from Thomas De Schampheleire  ---
(In reply to Thomas De Schampheleire from comment #25)
> Is it possible that this same problem is applicable on the 'lwx' instruction?
> I am using MIPS64 n32.
> 
> I first saw the original problem as described in this bug with instruction
> 'lwxc1'. I then used the suggested compilation flag -mno-lxc1-sxc1 which
> removed that problem.
> 
> However, in another place in the code I get a SIGBUS on following
> instruction:
> 
> lwx   v1,a5(v1)
> 
> where:
> v1: eb623870
> a5: 8a4ee1c4
> 
> The exception shows:
> badvaddr: 75b11a34
> 
> If you sum the lower 32 bits of v1+a5 you get '175b11a34' (i.e. 33 bits).
> Truncated to 32 bits, this is the value you see in badvaddr.
> Nevertheless, this address is inside a mapped and read/writable memory
> range, from /proc/PID/maps:
> 
> 75b0-75b21000 rw-p  00:00 0 
> 
> This behavior looks very similar to this bug's description.
> 
> If this same problem indeed also applies to 'lwx', is there a workaround for
> it?

This is observed with gcc 7.3.0.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-08-21 Thread patrickdepinguin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #25 from Thomas De Schampheleire  ---
Is it possible that this same problem is applicable on the 'lwx' instruction?
I am using MIPS64 n32.

I first saw the original problem as described in this bug with instruction
'lwxc1'. I then used the suggested compilation flag -mno-lxc1-sxc1 which
removed that problem.

However, in another place in the code I get a SIGBUS on following instruction:

lwx v1,a5(v1)

where:
v1: eb623870
a5: 8a4ee1c4

The exception shows:
badvaddr: 75b11a34

If you sum the lower 32 bits of v1+a5 you get '175b11a34' (i.e. 33 bits).
Truncated to 32 bits, this is the value you see in badvaddr.
Nevertheless, this address is inside a mapped and read/writable memory range,
from /proc/PID/maps:

75b0-75b21000 rw-p  00:00 0 

This behavior looks very similar to this bug's description.

If this same problem indeed also applies to 'lwx', is there a workaround for
it?

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2019-02-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|8.3 |8.4

--- Comment #24 from Jakub Jelinek  ---
GCC 8.3 has been released.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2018-07-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|8.2 |8.3

--- Comment #23 from Jakub Jelinek  ---
GCC 8.2 has been released.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2018-05-02 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|8.0 |8.2

--- Comment #22 from Jakub Jelinek  ---
GCC 8.1 has been released.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-03-07 Thread mpf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

mpf at gcc dot gnu.org changed:

   What|Removed |Added

  Known to work||7.0
   Target Milestone|--- |8.0
  Known to fail||6.2.0

--- Comment #21 from mpf at gcc dot gnu.org ---
Moving target to GCC 8.0 albeit that it may not be solvable. Mitigated in GCC
7.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-02-01 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #20 from Doug Gilmore  ---
I'll collect more tracing data on the costing problem.

Hopefully I post an update in the next few days.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-02-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #19 from Richard Biener  ---
I agree with the comments that this (if at all) needs to be fixed at RTL
expansion time where we already do quite some "hacks" for sizetype
in POINTER_PLUS_EXPR context:

case POINTER_PLUS_EXPR:
  /* Even though the sizetype mode and the pointer's mode can be different
 expand is able to handle this correctly and get the correct result out
 of the PLUS_EXPR code.  */
  /* Make sure to sign-extend the sizetype offset in a POINTER_PLUS_EXPR
 if sizetype precision is smaller than pointer precision.  */
  if (TYPE_PRECISION (sizetype) < TYPE_PRECISION (type))
treeop1 = fold_convert_loc (loc, type,
fold_convert_loc (loc, ssizetype,
  treeop1));
  /* If sizetype precision is larger than pointer precision, truncate the
 offset to have matching modes.  */

but I don't see from the comments in this bug what the actual stmt is the
critical one.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-30 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Doug Gilmore  changed:

   What|Removed |Added

 CC||law at redhat dot com,
   ||rguenth at gcc dot gnu.org,
   ||zqchen at gcc dot gnu.org

--- Comment #18 from Doug Gilmore  ---
CC author and reviewers of r216501.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-30 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #17 from Doug Gilmore  ---
> This really throws off the costing of substituting different IVs on
> MIPS.
I forgot to mention that for MIPS the net of effect r216501 is to not
produce indexed memory OPs in simple examples where we should.  But
we also will produce problematic indexed memory OPs in situations
where address generation costing is a bit complicated (the original
issue associated with this bug report).

Applying the the two patches I just attached fixes the problem of
generating indexed memory OPs in simple examples, and also will cause
IVOPTS to select IVs that are similar to those that were made in the
past that avoids the problem executing indexed memory OPs in O32
binaries on 64-bit MIPS processors running current Linux kernels.

There is still the issue of recognizing that rewriting a "use" to use
a different IV can expose a problem with indexed memory OPs on 64-bit
MIPS processors, where an infinite cost should be associated in that
situation, that still needs to be addressed (thus the need for the
flag to turn off the generation of indexed memory OPs until this issue
is addressed).

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-30 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #16 from Doug Gilmore  ---
Created attachment 40632
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40632=edit
Tweak to adjust_setup_cost (r220473).

Second patch associated with previous comment.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-30 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #15 from Doug Gilmore  ---
Created attachment 40631
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40631=edit
Prototype change to backout r216501.

> Bisected the problem to commit r216501:

The review discussion of r216501 starts with message:

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00758.html

Which contains:

The are two implementations of seq_cost. The function bodies are
exactly the same. The patch removes one of them and make the other
global.

This seems the patch was cleanup that shouldn't introduce a
functional change.

However implementations of seq_cost are different, per
final version of the patch:

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00896.html

cfgloopanal.c:
-   cost += set_rtx_cost (set, speed);


rtlanal.c:
+cost += set_rtx_cost (set, speed);

tree-ssa-loop-ivopts.c:
-   cost += set_src_cost (SET_SRC (set), speed);

In general, when computing the cost of a sequence of N INSNs this
increases the cost of the sequence by N*4.  This really throws
off the costing of substituting different IVs on MIPS.

The first patch attached (just a prototype) basically reverts
this change.  The second fixes a problem with r220473, a fix
for PR62631 from Eric Botcazou:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62631#c17

This looks a generic problem in get_shiftadd_cost to me, it ought
to mimic the algorithms in expmed.c, something like ...

This change can lower the cost of a sequence of instruction.  However
there are situations this (lower) cost is being scaled by an estimated
iteration count will cause the adjusted cost to now become zero.  For
the example attached to the second patch the IV replacement algorithm
will determine that the cost using separate IVs for each load will be
less than then cost of one IV for all loads.

Thus, in the second patch we detect that a non-zero cost being scaled
to zero should represented by one instead, which gets us back to
IVSOPTS generating just one IV that will be used for all loads.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-19 Thread mpf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #14 from mpf at gcc dot gnu.org ---
Author: mpf
Date: Thu Jan 19 16:05:59 2017
New Revision: 244640

URL: https://gcc.gnu.org/viewcvs?rev=244640=gcc=rev
Log:
MIPS: PR target/78176 add -mlxc1-sxc1.

gcc/

PR target/78176
* config.gcc (supported_defaults): Add lxc1-sxc1.
(with_lxc1_sxc1): Add validation.
(all_defaults): Add lxc1-sxc1.
* config/mips/mips.opt (mlxc1-sxc1): New option.
* gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
mlxc1-sxc1.
(TARGET_CPU_CPP_BUILTINS): Add builtin_define for
__mips_no_lxc1_sxc1.
(ISA_HAS_LXC1_SXC1): Gate with mips_lxc1_sxc1.
* gcc/doc/invoke.texi (-mlxc1-sxc1): Document the new option.
* doc/install.texi (--with-lxc1-sxc1): Document the new option.

gcc/testsuite/

* gcc.target/mips/lxc1-sxc1-1.c: New file.
* gcc.target/mips/lxc1-sxc1-2.c: Likewise.
* gcc.target/mips/mips.exp (mips_option_groups): Add ghost option
HAS_LXC1.
(mips_option_groups): Add -m[no-]lxc1-sxc1.
(mips-dg-init): Detect default -mno-lxc1-sxc1.
(mips-dg-options): Handle HAS_LXC1 arch upgrade/downgrade.

Added:
trunk/gcc/testsuite/gcc.target/mips/lxc1-sxc1-1.c
trunk/gcc/testsuite/gcc.target/mips/lxc1-sxc1-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config.gcc
trunk/gcc/config/mips/mips.h
trunk/gcc/config/mips/mips.opt
trunk/gcc/doc/install.texi
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/mips/mips.exp

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-16 Thread mpf at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

mpf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2017-01-16
 CC||mpf at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |dgilmore at mips dot com
 Ever confirmed|0   |1

--- Comment #13 from mpf at gcc dot gnu.org ---
(In reply to Maciej W. Rozycki from comment #11)
> TBH this does look like trying to rely on UB to me, as per section
> 6.5.6 "Additive operators" clause 8 of the C language standard, which
> states (among others):
> 
> "If both the pointer operand and the result point to elements of the
> same array object, or one past the last element of the array object,
> the evaluation shall not produce an overflow; otherwise, the behavior
> is undefined."
> 
> Here under the triggering conditions the pointer the integer is added
> to with LDXC1 does not point to an element of the array operated on (or
> to one past the last), so the hardware operation matches the standard's
> semantics WRT overflow and I don't think we ought to be pushing it.

Aren't language rules mostly irrelevant at this stage though?
We don't really have the concept of a pointer after generating RTL we just
happen to have some values in a mode that is the same as Pmode.

mips.h comment for Pmode:

/* Specify the machine mode that pointers have.
   After generation of rtl, the compiler makes no further distinction
   between pointers and any other objects of this machine mode.  */

> So it looks like a middle end bug to me and the backend is fine in
> faithfully assuming its RTL pattern won't be passed operands leading to
> UB.

I can't see any feature/option/setting that gives this guarantee. Why do you
think the backend can make this assumption?

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2017-01-13 Thread doug.gilmore at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Doug Gilmore  changed:

   What|Removed |Added

 CC||doug.gilmore at imgtec dot com

--- Comment #12 from Doug Gilmore  ---
Bisected the problem to commit r216501:

commit 9a416363e99c9f2d48fa810e220bc2f7904f1788
Author: zqchen 
Date:   Tue Oct 21 03:38:37 2014 +

2014-10-21  Zhenqiang Chen  

* cfgloopanal.c (seq_cost): Delete.
* rtl.h (seq_cost): New prototype.
* rtlanal.c (seq_cost): New function.
* tree-ssa-loop-ivopts.c (seq_cost): Delete.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@216501
138bc75d-0d04-0410-961f-82ee72b054a4

More analysis to follow.

Given the short time until the release, we plan submit a patch to
provide a target flag and build option to avoid the bug.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-04 Thread ma...@linux-mips.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Maciej W. Rozycki  changed:

   What|Removed |Added

 CC||ma...@linux-mips.org

--- Comment #11 from Maciej W. Rozycki  ---
TBH this does look like trying to rely on UB to me, as per section
6.5.6 "Additive operators" clause 8 of the C language standard, which
states (among others):

"If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined."

Here under the triggering conditions the pointer the integer is added
to with LDXC1 does not point to an element of the array operated on (or
to one past the last), so the hardware operation matches the standard's
semantics WRT overflow and I don't think we ought to be pushing it.

So it looks like a middle end bug to me and the backend is fine in
faithfully assuming its RTL pattern won't be passed operands leading to
UB.

Have I missed anything?

  Maciej

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-04 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #10 from Eric Botcazou  ---
> Sure, and in this case there are no implicit extensions to larger types. The
> bug does require an implicit extension to occur but this only happens at
> runtime when on 64-bit hardware and the core is not configured to be limited
> to a 32-bit address space which is always true for a 64-bit kernel as it
> turns out.

OK, if the compiler doesn't know that larger types are involved, then indeed it
cannot do anything to prevent implicit extensions.  Such a configuration needs
to define Pmode as DImode and ptr_mode as SImode for the compiler to work
properly.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-04 Thread matthew.fortune at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #9 from Matthew Fortune  ---
(In reply to Eric Botcazou from comment #8)
> > The expansion looks like an acceptable transformation to me i.e. it is not
> > introducing the overflow for the offending pointer just maintaining what is
> > already in the tree.
> 
> Wrap around for unsigned types is OK but, if expansion does implicit
> extensions to larger types, then things can easily go wrong.

Sure, and in this case there are no implicit extensions to larger types. The
bug does require an implicit extension to occur but this only happens at
runtime when on 64-bit hardware and the core is not configured to be limited to
a 32-bit address space which is always true for a 64-bit kernel as it turns
out.

> > I'm still not sure if there is really a bug. Should reassoc not be doing
> > this for 'sizetype'? Should ivopts not have created the mess in the first
> > place? Would changing either of these actually introduce an assurance that
> > this situation won't occur from other optimisations?
> 
> sizetype is unsigned so all the transformations looks valid.

I'm leaning heavily towards not-a-compiler-bug but may need to offer a
workaround until we can solve this in the Linux kernel.

The only workaround possible is to provide an option to disable register+index
addressing i.e. remove [ls][wd]xc1 instructions.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-04 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #8 from Eric Botcazou  ---
> The expansion looks like an acceptable transformation to me i.e. it is not
> introducing the overflow for the offending pointer just maintaining what is
> already in the tree.

Wrap around for unsigned types is OK but, if expansion does implicit extensions
to larger types, then things can easily go wrong.

> I'm still not sure if there is really a bug. Should reassoc not be doing
> this for 'sizetype'? Should ivopts not have created the mess in the first
> place? Would changing either of these actually introduce an assurance that
> this situation won't occur from other optimisations?

sizetype is unsigned so all the transformations looks valid.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-04 Thread matthew.fortune at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #7 from Matthew Fortune  ---
(In reply to Eric Botcazou from comment #6)
> > The issue may stem from the C front end where the dumps start off as below.
> > Note that the '-1' in kappa-1 has ended up being represented as 1073741823
> > (0x3fff) presumably owing to the fact it will be multiplied by 4
> > eventually and hence be 'safe'.
> 
> All pointer calculations are done in sizetype and it is unsigned.  This kind
> of issues generally come from the RTL level, maybe expansion here.

The expansion looks like an acceptable transformation to me i.e. it is not
introducing the overflow for the offending pointer just maintaining what is
already in the tree.

I dug back to reassoc2 based on Andrew's comment which gets the following as
input:

  _48 = (sizetype) kappa_6(D);
  _49 = _48 * 8;
  _50 = s_37(D) + _49;
  ivtmp.19_47 = (unsigned int) _50

and then in the loop:

  _54 = (sizetype) s_37(D);
  _55 = ivtmp.19_3 - _54;
  _56 = _55 + 4294967280;
  # RANGE [0, 4294967295] NONZERO 4294967288
  _19 = _56;
  # PT = nonlocal escaped 
  _20 = _17 + _19;
  # VUSE <.MEM_4>
  _21 = *_20

This includes a pointer subtraction of 's' (somewhat pointless but valid I
believe).

reassoc2 changes it to this:

  _48 = (sizetype) kappa_6(D);
  _49 = _48 * 8;
  _50 = s_37(D) + _49;
  ivtmp.19_47 = (unsigned int) _50;

and in the loop:

  _54 = (sizetype) s_37(D);
  _38 = -_54;
  _25 = 4294967280 - _54;
  _56 = _25 + ivtmp.19_3;
  # RANGE [0, 4294967295] NONZERO 4294967288
  _19 = _56;
  # PT = nonlocal escaped 
  _20 = _17 + _19;
  # VUSE <.MEM_4>
  _21 = *_20;

And _56 is where we get the overflow.

I'm still not sure if there is really a bug. Should reassoc not be doing this
for 'sizetype'? Should ivopts not have created the mess in the first place?
Would changing either of these actually introduce an assurance that this
situation won't occur from other optimisations?

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-03 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Eric Botcazou  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org

--- Comment #6 from Eric Botcazou  ---
> The issue may stem from the C front end where the dumps start off as below.
> Note that the '-1' in kappa-1 has ended up being represented as 1073741823
> (0x3fff) presumably owing to the fact it will be multiplied by 4
> eventually and hence be 'safe'.

All pointer calculations are done in sizetype and it is unsigned.  This kind of
issues generally come from the RTL level, maybe expansion here.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-03 Thread matthew.fortune at imgtec dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Matthew Fortune  changed:

   What|Removed |Added

 CC||matthew.fortune at imgtec dot 
com

--- Comment #5 from Matthew Fortune  ---
I'm struggling to see what can be fixed in GCC for this (and I'm not sure I
follow why there is a problem with GCC's behaviour) but...

The issue may stem from the C front end where the dumps start off as below.
Note that the '-1' in kappa-1 has ended up being represented as 1073741823
(0x3fff) presumably owing to the fact it will be multiplied by 4 eventually
and hence be 'safe'. I can't figure out so far what creates this value but the
behavior started some time between gcc 4.7 and 4.8 I believe. The questionably
bad code then shows up dependent on optimisations and instruction availability.
If the calculation had been done with ADDU instead of in the LDXC1 then all
would be well.

Regardless of whether this is a real bug for the compiler then the Linux kernel
should be making use of the status.UX bit in order to properly emulate a 32-bit
environment for o32 on 64-bit kernel/hardware and had this been run using a
32-bit kernel (on 64-bit hardware) then it would run correctly. There is an
issue for N32 code with the UX bit in that you also need the PX bit available
to independently control 64-bit instruction availability vs address space but
this is only missing in MIPSIV cores.

;; Function ldxc1_test (null)
;; enabled by -tree-original


{
  int kappa2 = kappa;
  double tmp = 0.0;

int kappa2 = kappa;
double tmp = 0.0;
  :;
  kappa-- ;
  if (zeros + 1 < kappa)
{
  tmp = *(*(r->rows + ((sizetype) kappa + 1073741823) * 4) + ((sizetype)
kappa + 536870911) * 8) * ctt;
  tmp = ldexp (tmp, *(expo + ((sizetype) kappa + 1073741823) * 4) - *(expo
+ (sizetype) ((unsigned int) kappa2 * 4)));
}
  if (zeros + 1 < kappa && *(s + ((sizetype) kappa + 536870911) * 8) <= tmp)
goto ; else goto ;
  :;
  return tmp;

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||mips

--- Comment #4 from Andrew Pinski  ---
Then the problem is a reassociation issue. That is we should never get an
overflow happening for pointers for MIPS.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-01 Thread james410 at cowgill dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #3 from James Cowgill  ---
As far as I can tell, all the pointers in the original C code are valid and do
not wrap. Some of the registers wrap, but they're not pointers (until added
with other registers).

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

--- Comment #2 from Andrew Pinski  ---
I think this code is undefined if you have wrapping pointers. No pointer should
ever be above INT_MAX in user space on mips32 due to the memory layout on
MIPS32.

[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits

2016-11-01 Thread james410 at cowgill dot org.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176

James Cowgill  changed:

   What|Removed |Added

  Attachment #39937|0   |1
is obsolete||

--- Comment #1 from James Cowgill  ---
Created attachment 39938
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39938=edit
testcase v2

Testcase actually initializes the arrays this time :)

The bug still occurs.