[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #14 from uros at gcc dot gnu.org 2013-01-27 14:28:23 UTC --- Author: uros Date: Sun Jan 27 14:28:19 2013 New Revision: 195495 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=195495 Log: Backport from mainline 2013-01-22 Uros Bizjak ubiz...@gmail.com PR target/56028 * config/i386/i386.md (*movti_internal_rex64): Change (o,riF) alternative to (o,r). (*movdi_internal_rex64): Remove (!o,n) alternative. (DImode immediate-memory splitter): Remove. (DImode immediate-memory peephole2): Remove. (movtf): Enable for TARGET_64BIT || TARGET_SSE. (*movtf_internal_rex64): Rename from *movtf_internal. Change (!o,F*r) alternative to (!o,*r). (*movtf_internal_sse): New pattern. (*movxf_internal_rex64): New pattern. (*movxf_internal): Disable for TARGET_64BIT. (*movdf_internal_rex64): Remove (!o,F) alternative. 2013-01-23 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*movdf_internal_rex64): Disparage alternatives involving stack registers slightly. 2013-01-24 Uros Bizjak ubiz...@gmail.com * config/i386/constraints.md (Yf): New constraint. * config/i386/i386.md (*movdf_internal_rex64): Use Yf*f instead of f constraint to conditionaly disable x87 register preferences. (*movdf_internal): Ditto. (*movsf_internal): Ditto. 2012-01-24 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (*movti_internal_rex64): Add (o,e) alternative. (*movtf_internal_rex64): Add (!o,C) alternative (*movxf_internal_rex64): Ditto. (*movdf_internal_rex64): Add (?r,C) and (?m,C) alternatives. testsuite/ChangeLog: Backport from mainline 2013-01-22 Uros Bizjak ubiz...@gmail.com PR target/56028 * gcc.target/i386/pr56028.c: New test. 2013-01-24 Uros Bizjak ubiz...@gmail.com * gcc.target/i386/movsd.c: New test. Added: branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/movsd.c branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/pr56028.c Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/config/i386/constraints.md branches/gcc-4_7-branch/gcc/config/i386/i386.md branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 Uros Bizjak ubizjak at gmail dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Comment #15 from Uros Bizjak ubizjak at gmail dot com 2013-01-27 14:30:04 UTC --- Fixed.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #10 from Jakub Jelinek jakub at gcc dot gnu.org 2013-01-22 08:18:23 UTC --- Can't we in the movdi expander detect writing to volatile MEM and expand it as atomic_storedi with MEMMODEL_RELAXED instead?
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #11 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 08:46:48 UTC --- I was thinking of removing (!o,n) alternative from movdi (together with corresponding splitters). Splitter/peephole2 actually always generates movabs $N,%reg; mov $reg,(mem) unless it can't get a register. In the later case, the move is split into two separate moves. The problem is, that other move patterns assume (i.e. *mov{df,xf,ti}_internal_rex64) that they can move all immediates to memory. When moving FP immediate to volatile DFmode location, gcc tries to use DImode move with immediate operand, so this would fail with removed (!o,n) alternative. Also, split_double_mode, split_to_parts and ix86_split_long_move will have to be reviewed. Please see the comment inside split_double_mode, how simplify_subreg refuses to split volatile memory address, but gcc manages to get around this limitation.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 Uros Bizjak ubizjak at gmail dot com changed: What|Removed |Added URL||http://gcc.gnu.org/ml/gcc-p ||atches/2013-01/msg01119.htm ||l Target Milestone|--- |4.7.3 --- Comment #12 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 20:58:03 UTC --- Patch at [1]. [1] http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01119.html
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #13 from uros at gcc dot gnu.org 2013-01-22 20:58:45 UTC --- Author: uros Date: Tue Jan 22 20:58:37 2013 New Revision: 195386 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=195386 Log: PR target/56028 * config/i386/i386.md (*movti_internal_rex64): Change (o,riF) alternative to (o,r). (*movdi_internal_rex64): Remove (!o,n) alternative. (DImode immediate-memory splitter): Remove. (DImode immediate-memory peephole2): Remove. (movtf): Enable for TARGET_64BIT || TARGET_SSE. (*movtf_internal_rex64): Rename from *movtf_internal. Change (!o,F*r) alternative to (!o,*r). (*movtf_internal_sse): New pattern. (*movxf_internal_rex64): New pattern. (*movxf_internal): Disable for TARGET_64BIT. (*movdf_internal_rex64): Remove (!o,F) alternative. testsuite/ChangeLog: 2012-01-22 Uros Bizjak ubiz...@gmail.com PR target/56028 * gcc.target/i386/pr56028.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr56028.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 Uros Bizjak ubizjak at gmail dot com changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED URL|http://gcc.gnu.org/ml/gcc-p | |atches/2013-01/msg00870.htm | |l | Last reconfirmed||2013-01-22 AssignedTo|unassigned at gcc dot |ubizjak at gmail dot com |gnu.org | Ever Confirmed|0 |1 --- Comment #9 from Uros Bizjak ubizjak at gmail dot com 2013-01-22 07:32:59 UTC --- I'll look into this.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #8 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 2013-01-19 12:35:12 UTC --- Indeed, different hardware implementations can cause all sorts of mischief. Nevertheless, the compiler should not also provide mischief in these cases. In addition, as noted in comment 5, there is legacy software. I understand and completely support use of atomics when they are generally available (having assisted in their definition), but many projects will need to work with older compilers for quite some time to come. Therefore, volatile loads, which are only idiom available in these older compilers, needs to continue to provide the needed functionality. Again as noted in comment 5, volatile stores of reasonably-sized basic types must be atomic.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #1 from Uros Bizjak ubizjak at gmail dot com 2013-01-18 11:09:42 UTC --- - Does language standard guarantee atomic store in this case [wikipedia says No. [1]]? - Can a store to a volatile DImode location be implemented as two consecutive SImode stores to adjacent location (this breaks stores of 64bit immediates to MMIO 64bit registers)? [1] http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #2 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 2013-01-18 11:25:52 UTC --- (In reply to comment #1) - Does language standard guarantee atomic store in this case [wikipedia says No. [1]]? The above example of device drivers storing constants to a device register declared as volatile unsigned long y does not require all of the attributes of an atomic store, for example, it does not require memory-fence instructions. - Can a store to a volatile DImode location be implemented as two consecutive SImode stores to adjacent location (this breaks stores of 64bit immediates to MMIO 64bit registers)? See 1.9p8 of the C++11 standard, first bullet: Access to volatile objects are evaluated strictly according to the rules of the abstract machine. From what I can see, implementing a store to a volatile DImode location as two consecutive SImode stores to adjacent locations violates this aspect of the standard. Furthermore, to expand on your parenthesized statement above, gcc might not operate reliably if the device drivers in the kernel it is running on have their 64-bit immediate stores broken into pairs of 32-bit immediate stores. ;-) [1] http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #3 from Evgeniy Stepanov eugeni.stepanov at gmail dot com 2013-01-18 11:57:07 UTC --- (In reply to comment #2) See 1.9p8 of the C++11 standard, first bullet: Access to volatile objects are evaluated strictly according to the rules of the abstract machine. From what I can see, implementing a store to a volatile DImode location as two consecutive SImode stores to adjacent locations violates this aspect of the standard. Furthermore, to expand on your parenthesized statement above, gcc might not operate reliably if the device drivers in the kernel it is running on have their 64-bit immediate stores broken into pairs of 32-bit immediate stores. ;-) So, what are these rules of the abstract machine, and why do they allow non-atomic store of a large volatile aggregate (it is definitely not atomic, right?), and require atomicity for volatile long?
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #4 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 2013-01-18 16:22:49 UTC --- (In reply to comment #3) So, what are these rules of the abstract machine, and why do they allow non-atomic store of a large volatile aggregate (it is definitely not atomic, right?), and require atomicity for volatile long? 5.17p3 and 5.17p4 do distinguish between non-aggregate and aggregate assignment: 5.17p3: If the left operand is not of class type, the expression is implicitly converted (Clause 4) to the cv-unqualified type of the left operand. 5.17p4: If the left operand is of class type, the class shall be complete. Assignment to objects of a class is defined by the copy/move assignment operator. Of course, the old-days possibility of systems with 8-bit busses limits how much the standard can say, but given that the system in question really can do a 64-bit store, volatile really should force a single store.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #5 from Evgeniy Stepanov eugeni.stepanov at gmail dot com 2013-01-18 16:38:11 UTC --- Well, it's true that classes have assignment operators, and basic types don't. But this does not have anything to do with how the assignment could (or could not) be implemented at the machine level. AFAIU, the main point here is that a valid, data-race-free program can't observe the non-atomicity of a 64-bit store. And there is no special clause for volatile stores in this regard, either. I agree that in practice volatile stores of reasonably-sized basic types must be atomic, or a lot of legacy code will break.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 --- Comment #6 from Paul E. McKenney paulmck at linux dot vnet.ibm.com 2013-01-18 17:40:13 UTC --- The fact that a data-race-free program cannot observe the non-atomicity of a 64-bit store, though true, is beside the point. The plain fact is that hardware registers (for which volatile was intended) really do care about the size of the store. A pair of 32-bit stores does not necessarily mean the same thing to hardware as does a single 64-bit store. Given that C is intended to be used for device drivers, volatile stores of reasonably-sized basic types must be atomic, and on 64-bit systems, reasonably-sized very clearly includes 64-bit stores. So this bug really does need to be fixed.
[Bug target/56028] Splitting a 64-bit volatile store
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56028 Pawel Sikora pluto at agmk dot net changed: What|Removed |Added CC||pluto at agmk dot net --- Comment #7 from Pawel Sikora pluto at agmk dot net 2013-01-18 18:23:27 UTC --- (In reply to comment #6) The fact that a data-race-free program cannot observe the non-atomicity of a 64-bit store, though true, is beside the point. The plain fact is that hardware registers (for which volatile was intended) really do care about the size of the store. A pair of 32-bit stores does not necessarily mean the same thing to hardware as does a single 64-bit store. Given that C is intended to be used for device drivers, volatile stores of reasonably-sized basic types must be atomic, and on 64-bit systems, reasonably-sized very clearly includes 64-bit stores. even 'movq' on x86-64 could be split into 32-bit transfers by mainboard h/w. devices connected to x86-64 should be aware of this! btw, iirc the x87-fpu load/store can give you atomic 64-bit transfer.