[Bug target/56028] Splitting a 64-bit volatile store

2013-01-27 Thread uros at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #14 from uros at gcc dot gnu.org 2013-01-27 14:28:23 UTC ---

Author: uros

Date: Sun Jan 27 14:28:19 2013

New Revision: 195495



URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=195495

Log:

Backport from mainline

2013-01-22  Uros Bizjak  ubiz...@gmail.com



PR target/56028

* config/i386/i386.md (*movti_internal_rex64): Change (o,riF)

alternative to (o,r).

(*movdi_internal_rex64): Remove (!o,n) alternative.

(DImode immediate-memory splitter): Remove.

(DImode immediate-memory peephole2): Remove.

(movtf): Enable for TARGET_64BIT || TARGET_SSE.

(*movtf_internal_rex64): Rename from *movtf_internal. Change (!o,F*r)

alternative to (!o,*r).

(*movtf_internal_sse): New pattern.

(*movxf_internal_rex64): New pattern.

(*movxf_internal): Disable for TARGET_64BIT.

(*movdf_internal_rex64): Remove (!o,F) alternative.



2013-01-23  Uros Bizjak  ubiz...@gmail.com



* config/i386/i386.md (*movdf_internal_rex64): Disparage alternatives

involving stack registers slightly.



2013-01-24  Uros Bizjak  ubiz...@gmail.com



* config/i386/constraints.md (Yf): New constraint.

* config/i386/i386.md (*movdf_internal_rex64): Use Yf*f instead

of f constraint to conditionaly disable x87 register preferences.

(*movdf_internal): Ditto.

(*movsf_internal): Ditto.



2012-01-24  Uros Bizjak  ubiz...@gmail.com



* config/i386/i386.md (*movti_internal_rex64): Add (o,e) alternative.

(*movtf_internal_rex64): Add (!o,C) alternative

(*movxf_internal_rex64): Ditto.

(*movdf_internal_rex64): Add (?r,C) and (?m,C) alternatives.



testsuite/ChangeLog:



Backport from mainline

2013-01-22  Uros Bizjak  ubiz...@gmail.com



PR target/56028

* gcc.target/i386/pr56028.c: New test.



2013-01-24  Uros Bizjak  ubiz...@gmail.com



* gcc.target/i386/movsd.c: New test.





Added:

branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/movsd.c

branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/pr56028.c

Modified:

branches/gcc-4_7-branch/gcc/ChangeLog

branches/gcc-4_7-branch/gcc/config/i386/constraints.md

branches/gcc-4_7-branch/gcc/config/i386/i386.md

branches/gcc-4_7-branch/gcc/testsuite/ChangeLog


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-27 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



Uros Bizjak ubizjak at gmail dot com changed:



   What|Removed |Added



 Status|ASSIGNED|RESOLVED

 Resolution||FIXED



--- Comment #15 from Uros Bizjak ubizjak at gmail dot com 2013-01-27 14:30:04 
UTC ---

Fixed.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-22 Thread jakub at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



Jakub Jelinek jakub at gcc dot gnu.org changed:



   What|Removed |Added



 CC||jakub at gcc dot gnu.org



--- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org 2013-01-22 
08:18:23 UTC ---

Can't we in the movdi expander detect writing to volatile MEM and expand it as

atomic_storedi with MEMMODEL_RELAXED instead?


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-22 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #11 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 08:46:48 
UTC ---

I was thinking of removing (!o,n) alternative from movdi (together with

corresponding splitters). Splitter/peephole2 actually always generates movabs

$N,%reg; mov $reg,(mem) unless it can't get a register. In the later case, the

move is split into two separate moves.



The problem is, that other move patterns assume (i.e.

*mov{df,xf,ti}_internal_rex64) that they can move all immediates to memory.

When moving FP immediate to volatile DFmode location, gcc tries to use DImode

move with immediate operand, so this would fail with removed (!o,n)

alternative.



Also, split_double_mode, split_to_parts and ix86_split_long_move will have to

be reviewed. Please see the comment inside split_double_mode, how

simplify_subreg refuses to split volatile memory address, but gcc manages to

get around this limitation.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-22 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



Uros Bizjak ubizjak at gmail dot com changed:



   What|Removed |Added



URL||http://gcc.gnu.org/ml/gcc-p

   ||atches/2013-01/msg01119.htm

   ||l

   Target Milestone|--- |4.7.3



--- Comment #12 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 20:58:03 
UTC ---

Patch at [1].



[1] http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01119.html


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-22 Thread uros at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #13 from uros at gcc dot gnu.org 2013-01-22 20:58:45 UTC ---

Author: uros

Date: Tue Jan 22 20:58:37 2013

New Revision: 195386



URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=195386

Log:

PR target/56028

* config/i386/i386.md (*movti_internal_rex64): Change (o,riF)

alternative to (o,r).

(*movdi_internal_rex64): Remove (!o,n) alternative.

(DImode immediate-memory splitter): Remove.

(DImode immediate-memory peephole2): Remove.

(movtf): Enable for TARGET_64BIT || TARGET_SSE.

(*movtf_internal_rex64): Rename from *movtf_internal. Change (!o,F*r)

alternative to (!o,*r).

(*movtf_internal_sse): New pattern.

(*movxf_internal_rex64): New pattern.

(*movxf_internal): Disable for TARGET_64BIT.

(*movdf_internal_rex64): Remove (!o,F) alternative.



testsuite/ChangeLog:



2012-01-22  Uros Bizjak  ubiz...@gmail.com



PR target/56028

* gcc.target/i386/pr56028.c: New test.





Added:

trunk/gcc/testsuite/gcc.target/i386/pr56028.c

Modified:

trunk/gcc/ChangeLog

trunk/gcc/config/i386/i386.md

trunk/gcc/testsuite/ChangeLog


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-21 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



Uros Bizjak ubizjak at gmail dot com changed:



   What|Removed |Added



 Status|UNCONFIRMED |ASSIGNED

URL|http://gcc.gnu.org/ml/gcc-p |

   |atches/2013-01/msg00870.htm |

   |l   |

   Last reconfirmed||2013-01-22

 AssignedTo|unassigned at gcc dot   |ubizjak at gmail dot com

   |gnu.org |

 Ever Confirmed|0   |1



--- Comment #9 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 07:32:59 
UTC ---

I'll look into this.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-19 Thread paulmck at linux dot vnet.ibm.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #8 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 
2013-01-19 12:35:12 UTC ---

Indeed, different hardware implementations can cause all sorts of mischief. 

Nevertheless, the compiler should not also provide mischief in these cases.



In addition, as noted in comment 5, there is legacy software.  I understand and

completely support use of atomics when they are generally available (having

assisted in their definition), but many projects will need to work with older

compilers for quite some time to come.  Therefore, volatile loads, which are

only idiom available in these older compilers, needs to continue to provide the

needed functionality.  Again as noted in comment 5, volatile stores of

reasonably-sized basic types must be atomic.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread ubizjak at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #1 from Uros Bizjak ubizjak at gmail dot com 2013-01-18 11:09:42 
UTC ---

- Does language standard guarantee atomic store in this case [wikipedia says

No. [1]]?



- Can a store to a volatile DImode location be implemented as two consecutive

SImode stores to adjacent location (this breaks stores of 64bit immediates to

MMIO 64bit registers)?



[1] http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread paulmck at linux dot vnet.ibm.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #2 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 
2013-01-18 11:25:52 UTC ---

(In reply to comment #1)

 - Does language standard guarantee atomic store in this case [wikipedia says

 No. [1]]?



The above example of device drivers storing constants to a device register

declared as volatile unsigned long y does not require all of the attributes

of an atomic store, for example, it does not require memory-fence instructions.



 - Can a store to a volatile DImode location be implemented as two consecutive

 SImode stores to adjacent location (this breaks stores of 64bit immediates to

 MMIO 64bit registers)?



See 1.9p8 of the C++11 standard, first bullet:



Access to volatile objects are evaluated strictly according to the rules of

the abstract machine.



From what I can see, implementing a store to a volatile DImode location as two

consecutive SImode stores to adjacent locations violates this aspect of the

standard.  Furthermore, to expand on your parenthesized statement above, gcc

might not operate reliably if the device drivers in the kernel it is running on

have their 64-bit immediate stores broken into pairs of 32-bit immediate

stores.  ;-)



 [1] http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread eugeni.stepanov at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #3 from Evgeniy Stepanov eugeni.stepanov at gmail dot com 
2013-01-18 11:57:07 UTC ---

(In reply to comment #2)

 See 1.9p8 of the C++11 standard, first bullet:

 

 Access to volatile objects are evaluated strictly according to the rules of

 the abstract machine.

 

 From what I can see, implementing a store to a volatile DImode location as two

 consecutive SImode stores to adjacent locations violates this aspect of the

 standard.  Furthermore, to expand on your parenthesized statement above, gcc

 might not operate reliably if the device drivers in the kernel it is running 
 on

 have their 64-bit immediate stores broken into pairs of 32-bit immediate

 stores.  ;-)



So, what are these rules of the abstract machine, and why do they allow

non-atomic store of a large volatile aggregate (it is definitely not atomic,

right?), and require atomicity for volatile long?


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread paulmck at linux dot vnet.ibm.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #4 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 
2013-01-18 16:22:49 UTC ---

(In reply to comment #3)

 So, what are these rules of the abstract machine, and why do they allow

 non-atomic store of a large volatile aggregate (it is definitely not atomic,

 right?), and require atomicity for volatile long?



5.17p3 and 5.17p4 do distinguish between non-aggregate and aggregate

assignment:



5.17p3:



If the left operand is not of class type, the expression is 

implicitly converted (Clause 4) to the cv-unqualified type of

the left operand.



5.17p4:



If the left operand is of class type, the class shall be 

complete.  Assignment to objects of a class is defined by the

copy/move assignment operator.



Of course, the old-days possibility of systems with 8-bit busses limits how

much the standard can say, but given that the system in question really can do

a 64-bit store, volatile really should force a single store.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread eugeni.stepanov at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #5 from Evgeniy Stepanov eugeni.stepanov at gmail dot com 
2013-01-18 16:38:11 UTC ---

Well, it's true that classes have assignment operators, and basic types don't.

But this does not have anything to do with how the assignment could (or could

not) be implemented at the machine level.



AFAIU, the main point here is that a valid, data-race-free program can't

observe the non-atomicity of a 64-bit store. And there is no special clause for

volatile stores in this regard, either.



I agree that in practice volatile stores of reasonably-sized basic types must

be atomic, or a lot of legacy code will break.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread paulmck at linux dot vnet.ibm.com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



--- Comment #6 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 
2013-01-18 17:40:13 UTC ---

The fact that a data-race-free program cannot observe the non-atomicity of a

64-bit store, though true, is beside the point.  The plain fact is that

hardware registers (for which volatile was intended) really do care about the

size of the store.  A pair of 32-bit stores does not necessarily mean the same

thing to hardware as does a single 64-bit store.  Given that C is intended to

be used for device drivers, volatile stores of reasonably-sized basic types

must be atomic, and on 64-bit systems, reasonably-sized very clearly includes

64-bit stores.



So this bug really does need to be fixed.


[Bug target/56028] Splitting a 64-bit volatile store

2013-01-18 Thread pluto at agmk dot net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028



Pawel Sikora pluto at agmk dot net changed:



   What|Removed |Added



 CC||pluto at agmk dot net



--- Comment #7 from Pawel Sikora pluto at agmk dot net 2013-01-18 18:23:27 
UTC ---

(In reply to comment #6)

 The fact that a data-race-free program cannot observe the non-atomicity of a

 64-bit store, though true, is beside the point.  The plain fact is that

 hardware registers (for which volatile was intended) really do care about the

 size of the store.  A pair of 32-bit stores does not necessarily mean the same

 thing to hardware as does a single 64-bit store.  Given that C is intended to

 be used for device drivers, volatile stores of reasonably-sized basic types

 must be atomic, and on 64-bit systems, reasonably-sized very clearly 
 includes

 64-bit stores.



even 'movq' on x86-64 could be split into 32-bit transfers by mainboard h/w.

devices connected to x86-64 should be aware of this!



btw, iirc the x87-fpu load/store can give you atomic 64-bit transfer.