Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Linus Torvalds
On Sat, Feb 22, 2014 at 4:39 PM, Paul E. McKenney wrote:
>
> Agreed, by far the most frequent use is "->" to dereference and assignment
> to store into a local variable.  The other operations where the kernel
> expects ordering to be maintained are:
>
> o   Bitwise "&" to strip off low-order bits.  The FIB tree does
> this, for example in fib_table_lookup() in net/ipv4/fib_trie.c.
> The low-order bit is used to distinguish internal nodes from
> leaves -- nodes and leaves are different types of structures.
> (There are a few others.)

Note that this is very much outside the scope of the C standard,
regardless of 'consume' or not.

We'll always do things outside the standard, so I wouldn't worry.

> o   Uses "?:" to substitute defaults in case of NULL pointers,
> but ordering must be maintained in the non-default case.
> Most, perhaps all, of these could be converted to "if" should
> "?:" prove problematic.

Note that this doesn't actually affect the restrict/ordering rule in
theory: "?:" isn't special according to those rules. The rules are
fairly simple: we guarantee ordering only to the object that the
pointer points to, and even that guarantee goes out the window if
there is some *other* way to reach the object.

?: is not really relevant, except in the sense that *any* expression
that ends up pointing to outside the object will lose the ordering
guarantee. ?: can be one such expression, but so can "p-p" or anything
like that.

And in *practice*, the only thing that needs to be sure to generate
special code is alpha, and there you'd just add the "rmb" after the
load. That is sufficient to fulfill the guarantees.

On ARM and powerpc, the compiler obviously has to guarantee that it
doesn't do value-speculation on the result, but again, that never
really had anything to do with the whole "carries a dependency", it is
really all about the fact that in order to guarantee the ordering, the
compiler mustn't generate that magical aliased pointer value. But if
the aliased pointer value comes from the *source* code, all bets are
off.

Now, even on alpha, the compiler can obviously move that "rmb" around.
For example, if there is a conditional after the
"atomic_read(mo_consume)", and the compiler can tell that the pointer
that got read by mo_consume is dead along one branch, then the
compiler can move the "rmb" to only exist in the other branch. Why?
Because we inherently guarantee only the order to any accesses to the
object the pointer pointed to, and that the pointer that got loaded is
the *only* way to get to that object (in this context), so if the
value is dead, then so is the ordering.

In fact, even if the value is *not* dead, but it is NULL, the compiler
can validly say "the NULL pointer cannot point to any object, so I
don't have to guarantee any serialization". So code like this (writing
alpha assembly, since in practice only alpha will ever care):

ptr = atomic_read(pp, mo_consume);
if (ptr) {
   ... do something with ptr ..
}
return ptr;

can validly be translated to:

ldq $1,0($2)
beq $1,branch-over
rmb
.. the do-something code using register $1 ..

because the compiler knows that a NULL pointer cannot be dereferenced,
so it can decide to put the rmb in the non-NULL path - even though the
pointer value is still *live* in the other branch (well, the liveness
of a constant value is somewhat debatable, but you get the idea), and
may be used by the caller (but since it is NULL, the "use" cannot
include accessing any object, only really testing).

So note how this is actually very different from the "carries
dependency" rule. It's simpler, and it allows much more natural
optimizations.
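[Editor's sketch, not part of the original message.] The rule described above can be written out with C11 atomics roughly as follows. The names item, slot, publish and read_value are hypothetical, and mainstream compilers currently treat memory_order_consume as memory_order_acquire, so this only illustrates which accesses the proposed rule would order, not the code a compiler would emit:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct item { int value; };            /* hypothetical payload type */

static _Atomic(struct item *) slot;    /* hypothetical published pointer, starts NULL */

/* Writer: initialise the object first, then publish the pointer. */
void publish(struct item *it, int v)
{
    assert(it != NULL);
    it->value = v;
    atomic_store_explicit(&slot, it, memory_order_release);
}

/* Reader: returns the published value, or -1 if nothing is published. */
int read_value(void)
{
    struct item *p = atomic_load_explicit(&slot, memory_order_consume);

    if (p == NULL)
        return -1;      /* NULL test only: no object is reached, no ordering needed */
    return p->value;    /* access through the loaded pointer: ordered after the load */
}
```

Under the proposed semantics, only the p->value access needs to be ordered against the load; the NULL path needs no barrier at all.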

> o   Addition and subtraction to adjust both pointers to and indexes
> into RCU-protected arrays.  There are not that many indexes,
> and they could be converted to pointers, but the addition and
> subtraction looks necessary in some cases.

Addition and subtraction are fine, as long as they stay within the same
object/array.

And realistically, people violate the whole C pointer "same object"
rule all the time. Any time you implement a raw memory allocator,
you'll violate the C standard and you *will* basically be depending on
architecture-specific behavior. So everybody knows that the C "pointer
arithmetic has to stay within the object" is really a fairly made-up
but convenient shorthand for "sure, we know you'll do tricks on
pointer values, but they won't be portable and you may have to take
particular machine representations into account".

> o   Array indexing.  The value from rcu_dereference() is used both
> before and inside the "[]", interestingly enough.

Well, in the C sense, or in the actual "integer index" sense? Because
technically, a[b] is nothing but *(a+b), so "inside" the "[]" is
strictly speaking meaningless. Inside and outside are just syntactic
sugar.
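[Editor's sketch, not part of the original message.] The equivalence mentioned above is easy to check in C. The function names here are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* a[b] is defined as *(a + b), so all of these address the same element. */
int same_element(const int *a, size_t b)
{
    return &a[b] == a + b && &a[b] == &b[a];
}

/* Small self-check on a local array. */
void index_demo(void)
{
    int arr[4] = { 10, 20, 30, 40 };

    assert(same_element(arr, 2));
    assert(arr[2] == 2[arr]);   /* operands of [] commute, like those of + */
}
```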

That said, 

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Paul E. McKenney
On Sat, Feb 22, 2014 at 01:53:30PM -0800, Linus Torvalds wrote:
> On Sat, Feb 22, 2014 at 10:53 AM, Torvald Riegel  wrote:
> >
> > Stating that (1) "the standard is wrong" and (2) that you think that
> > mo_consume semantics are not good are two different things.
> 
> I do agree. They are two independent things.
> 
> I think the standard is wrong, because it's overly complex, hard to
> understand, and nigh unimplementable. As shown by the bugzilla
> example, "carries a dependency" encompasses things that are *not* just
> synchronizing things just through a pointer, and as a result it's
> actually very complicated, since they could have been optimized away,
> or done in non-local code that wasn't even aware of the dependency
> carrying.
> 
> That said, I'm reconsidering my suggested stricter semantics, because
> for RCU we actually do want to test the resulting pointer against NULL
> _without_ any implied serialization.
> 
> So I still feel that the standard as written is fragile and confusing
> (and the bugzilla entry pretty much proves that it is also practically
> unimplementable as written), but strengthening the serialization may
> be the wrong thing.
> 
> Within the kernel, the RCU use for this is literally purely about
> loading a pointer, and doing either:
> 
>  - testing its value against NULL (without any implied synchronization at all)
> 
>  - using it as a pointer to an object, and expecting that any accesses
> to that object are ordered wrt the consuming load.

Agreed, by far the most frequent use is "->" to dereference and assignment
to store into a local variable.  The other operations where the kernel
expects ordering to be maintained are:

o   Bitwise "&" to strip off low-order bits.  The FIB tree does
this, for example in fib_table_lookup() in net/ipv4/fib_trie.c.
The low-order bit is used to distinguish internal nodes from
leaves -- nodes and leaves are different types of structures.
(There are a few others.)

o   Uses "?:" to substitute defaults in case of NULL pointers,
but ordering must be maintained in the non-default case.
Most, perhaps all, of these could be converted to "if" should
"?:" prove problematic.

o   Addition and subtraction to adjust both pointers to and indexes
into RCU-protected arrays.  There are not that many indexes,
and they could be converted to pointers, but the addition and
subtraction looks necessary in some cases.

o   Array indexing.  The value from rcu_dereference() is used both
before and inside the "[]", interestingly enough.

o   Casts along with unary "&" and "*".

That said, I did not see any code that depended on ordering through
the function-call "()", boolean complement "!", comparison (only "=="
and "!="), logical operators ("&&" and "||"), and the "*", "/", and "%"
arithmetic operators.
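[Editor's sketch, not part of the original message.] As an illustration of the bitwise "&" case in the list above, here is a minimal tagged-pointer example in the general style described for the FIB tree. It is not the code from net/ipv4/fib_trie.c: the names leaf, LEAF_TAG, child, publish_leaf and lookup_key are hypothetical, and the round trip through uintptr_t relies on implementation-defined behaviour:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct leaf { int key; };                 /* hypothetical leaf type */
#define LEAF_TAG ((uintptr_t)1)           /* low-order bit marks a leaf */

static _Atomic(uintptr_t) child;          /* hypothetical tagged child slot, starts 0 */

/* Writer: fill in the leaf, then publish its address with the tag bit set. */
void publish_leaf(struct leaf *l, int key)
{
    assert(((uintptr_t)l & LEAF_TAG) == 0);   /* alignment leaves the low bit free */
    l->key = key;
    atomic_store_explicit(&child, (uintptr_t)l | LEAF_TAG, memory_order_release);
}

/* Reader: returns the leaf's key, or -1 if the slot does not hold a leaf. */
int lookup_key(void)
{
    uintptr_t v = atomic_load_explicit(&child, memory_order_consume);

    if ((v & LEAF_TAG) == 0)
        return -1;
    /* The "&" strips the tag; the ordering has to survive this operation. */
    struct leaf *l = (struct leaf *)(v & ~LEAF_TAG);
    return l->key;
}
```

The masked value still points into the same object as the value that was loaded, which is why this use expects the ordering to be preserved.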

> So I actually have a suggested *very* different model that people
> might find more acceptable.
> 
> How about saying that the result of an "atomic_read(&a, mo_consume)" is
> required to be a _restricted_ pointer type, and that the consume
> ordering guarantees the ordering between that atomic read and the
> accesses to the object that the pointer points to.
> 
> No "carries a dependency", no nothing.

In the case of arrays, the object that the pointer points to is
considered to be the full array, right?

> Now, there's two things to note in there:
> 
>  - the "restricted pointer" part means that the compiler does not need
> to worry about serialization to that object through other possible
> pointers - we have basically promised that the *only* pointer to that
> object comes from the mo_consume. So that part makes it clear that the
> "consume" ordering really only is valid wrt that particular pointer
> load.

That could work, though there are some cases where a multi-linked
structure is made visible using a single rcu_assign_pointer(), and
rcu_dereference() is used only for the pointer leading to that
multi-linked structure, not for the pointers among the elements
making up that structure.  One way to handle this would be to
require rcu_dereference() to be used within the structure as well
as upon first traversal to the structure.

>  - the "to the object that the pointer points to" makes it clear that
> you can't use the pointer to generate arbitrary other values and claim
> to serialize that way.
> 
> IOW, with those alternate semantics, that gcc bugzilla example is
> utterly bogus, and a compiler can ignore it, because while it tries to
> synchronize through the "dependency chain" created with that "p-i+i"
> expression, that is completely irrelevant when you use the above rules
> instead.
> 
> In the bugzilla example, the object that "*(p-i+i)" accesses isn't
> actually the object pointed to by the pointer, so no serialization is
> implied. And if it actually *were* to be the same object, because "p"
> happens to have the same value as "i", then the 

gcc-4.7-20140222 is now available

2014-02-22 Thread gccadmin
Snapshot gcc-4.7-20140222 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.7-20140222/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.7 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_7-branch 
revision 208045

You'll find:

 gcc-4.7-20140222.tar.bz2 Complete GCC

  MD5=24cb2e1ac7659760c309c8735471ce4c
  SHA1=2cf407a1304bf693685af67301705e4fc4f893d5

Diffs from 4.7-20140215 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.7
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Linus Torvalds
On Sat, Feb 22, 2014 at 10:53 AM, Torvald Riegel  wrote:
>
> Stating that (1) "the standard is wrong" and (2) that you think that
> mo_consume semantics are not good are two different things.

I do agree. They are two independent things.

I think the standard is wrong, because it's overly complex, hard to
understand, and nigh unimplementable. As shown by the bugzilla
example, "carries a dependency" encompasses things that are *not* just
synchronizing things just through a pointer, and as a result it's
actually very complicated, since they could have been optimized away,
or done in non-local code that wasn't even aware of the dependency
carrying.

That said, I'm reconsidering my suggested stricter semantics, because
for RCU we actually do want to test the resulting pointer against NULL
_without_ any implied serialization.

So I still feel that the standard as written is fragile and confusing
(and the bugzilla entry pretty much proves that it is also practically
unimplementable as written), but strengthening the serialization may
be the wrong thing.

Within the kernel, the RCU use for this is literally purely about
loading a pointer, and doing either:

 - testing its value against NULL (without any implied synchronization at all)

 - using it as a pointer to an object, and expecting that any accesses
to that object are ordered wrt the consuming load.

So I actually have a suggested *very* different model that people
might find more acceptable.

How about saying that the result of an "atomic_read(&a, mo_consume)" is
required to be a _restricted_ pointer type, and that the consume
ordering guarantees the ordering between that atomic read and the
accesses to the object that the pointer points to.

No "carries a dependency", no nothing.

Now, there's two things to note in there:

 - the "restricted pointer" part means that the compiler does not need
to worry about serialization to that object through other possible
pointers - we have basically promised that the *only* pointer to that
object comes from the mo_consume. So that part makes it clear that the
"consume" ordering really only is valid wrt that particular pointer
load.

 - the "to the object that the pointer points to" makes it clear that
you can't use the pointer to generate arbitrary other values and claim
to serialize that way.

IOW, with those alternate semantics, that gcc bugzilla example is
utterly bogus, and a compiler can ignore it, because while it tries to
synchronize through the "dependency chain" created with that "p-i+i"
expression, that is completely irrelevant when you use the above rules
instead.

In the bugzilla example, the object that "*(p-i+i)" accesses isn't
actually the object pointed to by the pointer, so no serialization is
implied. And if it actually *were* to be the same object, because "p"
happens to have the same value as "i", then the "restrict" part of the
rule pops up and the compiler can again say that there is no ordering
guarantee, since the programmer lied to it and used a restricted
pointer that aliased with another one.

So the above suggestion basically tightens the semantics of "consume"
in a totally different way - it doesn't make it serialize more, in
fact it weakens the serialization guarantees a lot, but it weakens
them in a way that makes the semantics a lot simpler and clearer.

 Linus


Re: gcc generated long read out of bounds segfault

2014-02-22 Thread Andreas Schwab
David Fries  writes:

> The structure is only made up of an 8 bit type "char", and it is
> aligned to a multiple of the struct rgb data size which is 3.  How is
> that unaligned?

Sorry, I've miscomputed the alignment.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Paul E. McKenney
On Sat, Feb 22, 2014 at 07:30:37PM +0100, Torvald Riegel wrote:
> On Thu, 2014-02-20 at 10:18 -0800, Paul E. McKenney wrote:
> > On Thu, Feb 20, 2014 at 06:26:08PM +0100, Torvald Riegel wrote:
> > > On Wed, 2014-02-19 at 20:01 -0800, Paul E. McKenney wrote:
> > > > On Wed, Feb 19, 2014 at 04:53:49PM -0800, Linus Torvalds wrote:
> > > > > > On Tue, Feb 18, 2014 at 11:47 AM, Torvald Riegel wrote:
> > > > > > > On Tue, 2014-02-18 at 09:44 -0800, Linus Torvalds wrote:
> > > > > > >>
> > > > > > >> Can you point to it? Because I can find a draft standard, and it sure
> > > > > > >> as hell does *not* contain any clarity of the model. It has a *lot* of
> > > > > > >> verbiage, but it's pretty much impossible to actually understand, even
> > > > > > >> for somebody who really understands memory ordering.
> > > > > >
> > > > > > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> > > > > > This has an explanation of the model up front, and then the detailed
> > > > > > formulae in Section 6.  This is from 2010, and there might have been
> > > > > > smaller changes since then, but I'm not aware of any bigger ones.
> > > > > 
> > > > > Ahh, this is different from what others pointed at. Same people,
> > > > > similar name, but not the same paper.
> > > > > 
> > > > > I will read this version too, but from reading the other one and the
> > > > > standard in parallel and trying to make sense of it, it seems that I
> > > > > may have originally misunderstood part of the whole control dependency
> > > > > chain.
> > > > > 
> > > > > The fact that the left side of "? :", "&&" and "||" breaks data
> > > > > dependencies made me originally think that the standard tried very
> > > > > hard to break any control dependencies. Which I felt was insane, when
> > > > > then some of the examples literally were about the testing of the
> > > > > value of an atomic read. The data dependency matters quite a bit. The
> > > > > fact that the other "Mathematical" paper then very much talked about
> > > > > consume only in the sense of following a pointer made me think so even
> > > > > more.
> > > > > 
> > > > > But reading it some more, I now think that the whole "data dependency"
> > > > > logic (which is where the special left-hand side rule of the ternary
> > > > > and logical operators come in) are basically an exception to the rule
> > > > > that sequence points end up being also meaningful for ordering (ok, so
> > > > > C11 seems to have renamed "sequence points" to "sequenced before").
> > > > > 
> > > > > So while an expression like
> > > > > 
> > > > > atomic_read(p, consume) ? a : b;
> > > > > 
> > > > > doesn't have a data dependency from the atomic read that forces
> > > > > serialization, writing
> > > > > 
> > > > > >    if (atomic_read(p, consume))
> > > > > >       a;
> > > > > >    else
> > > > > >       b;
> > > > > 
> > > > > the standard *does* imply that the atomic read is "happens-before" wrt
> > > > > "a", and I'm hoping that there is no question that the control
> > > > > dependency still acts as an ordering point.
> > > > 
> > > > The control dependency should order subsequent stores, at least assuming
> > > > that "a" and "b" don't start off with identical stores that the compiler
> > > > could pull out of the "if" and merge.  The same might also be true for ?:
> > > > for all I know.  (But see below)
> > > 
> > > I don't think this is quite true.  I agree that a conditional store will
> > > not be executed speculatively (note that if it would happen in both the
> > > then and the else branch, it's not conditional); so, the store in
> > > "a;" (assuming it would be a store) won't happen unless the thread can
> > > really observe a true value for p.  However, this is *this thread's*
> > > view of the world, but not guaranteed to constrain how any other thread
> > > sees the state.  mo_consume does not contribute to
> > > inter-thread-happens-before in the same way that mo_acquire does (which
> > > *does* put a constraint on i-t-h-b, and thus enforces a global
> > > constraint that all threads have to respect).
> > > 
> > > Is it clear which distinction I'm trying to show here?
> > 
> > If you are saying that the control dependencies are a result of a
> > combination of the standard and the properties of the hardware that
> > Linux runs on, I am with you.  (As opposed to control dependencies being
> > a result solely of the standard.)
> 
> I'm not quite sure I understand what you mean :)  Do you mean the
> control dependencies in the binary code, or the logical "control
> dependencies" in source programs?

At present, the intersection of those two sets, but only including those
control dependencies beginning with a memory_order

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Torvald Riegel
On Thu, 2014-02-20 at 11:09 -0800, Linus Torvalds wrote:
> On Thu, Feb 20, 2014 at 10:53 AM, Torvald Riegel  wrote:
> > On Thu, 2014-02-20 at 10:32 -0800, Linus Torvalds wrote:
> >> On Thu, Feb 20, 2014 at 10:11 AM, Paul E. McKenney wrote:
> >> >
> >> > You really need that "consume" to be "acquire".
> >>
> >> So I think we now all agree that that is what the standard is saying.
> >
> > Huh?
> >
> > The standard says that there are two separate things (among many more):
> > mo_acquire and mo_consume.  They both influence happens-before in
> > different (and independent!) ways.
> >
> > What Paul is saying is that *you* should have used *acquire* in that
> > example.
> 
> I understand.
> 
> And I disagree. I think the standard is wrong, and what I *should* be
> doing is point out the fact very loudly, and just tell people to NEVER
> EVER use "consume" as long as it's not reliable and has insane
> semantics.

Stating that (1) "the standard is wrong" and (2) that you think that
mo_consume semantics are not good are two different things.  Making bold
statements without proper context isn't helpful in making this
discussion constructive.  It's simply not efficient if I (or anybody
else reading this) have to wonder whether you actually mean what you said
(even if, read literally, it is arguably not consistent with the
arguments brought up in the discussion) or whether those statements just
have to be interpreted in some other way.

> So what I "should do" is to not accept any C11 atomics use in the
> kernel.

You're obviously free to do that.

> Because with the "acquire", it generates worse code than what
> we already have,

I would argue that this is still under debate.  At least I haven't seen
a definition of what you want that is complete and based on the standard
(e.g., an example of what a compiler might do in a specific case isn't a
definition).  From what I've seen, it's not inconceivable that what you
want is just an optimized acquire.

I'll bring this question up again elsewhere in the thread (where it
hopefully fits better).



Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Torvald Riegel
On Thu, 2014-02-20 at 10:18 -0800, Paul E. McKenney wrote:
> On Thu, Feb 20, 2014 at 06:26:08PM +0100, Torvald Riegel wrote:
> > On Wed, 2014-02-19 at 20:01 -0800, Paul E. McKenney wrote:
> > > On Wed, Feb 19, 2014 at 04:53:49PM -0800, Linus Torvalds wrote:
> > > > On Tue, Feb 18, 2014 at 11:47 AM, Torvald Riegel wrote:
> > > > > On Tue, 2014-02-18 at 09:44 -0800, Linus Torvalds wrote:
> > > > >>
> > > > >> Can you point to it? Because I can find a draft standard, and it sure
> > > > >> as hell does *not* contain any clarity of the model. It has a *lot* of
> > > > >> verbiage, but it's pretty much impossible to actually understand, even
> > > > >> for somebody who really understands memory ordering.
> > > > >
> > > > > http://www.cl.cam.ac.uk/~mjb220/n3132.pdf
> > > > > This has an explanation of the model up front, and then the detailed
> > > > > formulae in Section 6.  This is from 2010, and there might have been
> > > > > smaller changes since then, but I'm not aware of any bigger ones.
> > > > 
> > > > Ahh, this is different from what others pointed at. Same people,
> > > > similar name, but not the same paper.
> > > > 
> > > > I will read this version too, but from reading the other one and the
> > > > standard in parallel and trying to make sense of it, it seems that I
> > > > may have originally misunderstood part of the whole control dependency
> > > > chain.
> > > > 
> > > > The fact that the left side of "? :", "&&" and "||" breaks data
> > > > dependencies made me originally think that the standard tried very
> > > > hard to break any control dependencies. Which I felt was insane, when
> > > > then some of the examples literally were about the testing of the
> > > > value of an atomic read. The data dependency matters quite a bit. The
> > > > fact that the other "Mathematical" paper then very much talked about
> > > > consume only in the sense of following a pointer made me think so even
> > > > more.
> > > > 
> > > > But reading it some more, I now think that the whole "data dependency"
> > > > logic (which is where the special left-hand side rule of the ternary
> > > > and logical operators come in) are basically an exception to the rule
> > > > that sequence points end up being also meaningful for ordering (ok, so
> > > > C11 seems to have renamed "sequence points" to "sequenced before").
> > > > 
> > > > So while an expression like
> > > > 
> > > > atomic_read(p, consume) ? a : b;
> > > > 
> > > > doesn't have a data dependency from the atomic read that forces
> > > > serialization, writing
> > > > 
> > > >    if (atomic_read(p, consume))
> > > >       a;
> > > >    else
> > > >       b;
> > > > 
> > > > the standard *does* imply that the atomic read is "happens-before" wrt
> > > > "a", and I'm hoping that there is no question that the control
> > > > dependency still acts as an ordering point.
> > > 
> > > The control dependency should order subsequent stores, at least assuming
> > > that "a" and "b" don't start off with identical stores that the compiler
> > > could pull out of the "if" and merge.  The same might also be true for ?:
> > > for all I know.  (But see below)
> > 
> > I don't think this is quite true.  I agree that a conditional store will
> > not be executed speculatively (note that if it would happen in both the
> > then and the else branch, it's not conditional); so, the store in
> > "a;" (assuming it would be a store) won't happen unless the thread can
> > really observe a true value for p.  However, this is *this thread's*
> > view of the world, but not guaranteed to constrain how any other thread
> > sees the state.  mo_consume does not contribute to
> > inter-thread-happens-before in the same way that mo_acquire does (which
> > *does* put a constraint on i-t-h-b, and thus enforces a global
> > constraint that all threads have to respect).
> > 
> > Is it clear which distinction I'm trying to show here?
> 
> If you are saying that the control dependencies are a result of a
> combination of the standard and the properties of the hardware that
> Linux runs on, I am with you.  (As opposed to control dependencies being
> a result solely of the standard.)

I'm not quite sure I understand what you mean :)  Do you mean the
control dependencies in the binary code, or the logical "control
dependencies" in source programs?

> This was a deliberate decision in 2007 or so.  At that time, the
> documentation on CPU memory orderings was pretty crude, and it was
> not clear that all relevant hardware respected control dependencies.
> Back then, if you wanted an authoritative answer even to a fairly simple
> memory-ordering question, you had to find a hardware architect, and you
> probably waited weeks or even months for the answer.  Thanks to lots
> of work from the Cambridge guys at about the time that the standard was
> final

Re: gcc generated long read out of bounds segfault

2014-02-22 Thread David Fries
On Sat, Feb 22, 2014 at 08:49:38AM +0100, Andreas Schwab wrote:
> David Fries  writes:
> 
> > The attached program sets up and reads through the array with extra
> > padding at the end of the array from 8 bytes to 0 bytes.  Padding from 4
> > to 0 crashes.
> 
> This program has undefined behaviour because you are using unaligned
> pointers.

The structure is only made up of an 8 bit type "char", and it is
aligned to a multiple of the struct rgb data size which is 3.  How is
that unaligned?

I thought the compiler would pad the structure out to make it aligned,
does that mean the following has undefined behavior?

#include <stdio.h>

struct rgb3 { char r, g, b;} v[2];
void fun3(struct rgb3 r) { v[0] = r; }
void array3()
{
    fun3(v[1]);
}


void align()
{
    struct rgb3 t0, t1, t2, t3, t4, t5, t6, *pt;
    t6.r = 0;
    t6.g = 1;
    t6.b = 2;
    /* address of t6, then its distance in bytes from t5 down to t0 */
    printf("t6 %zu, %zu, %zu, %zu, %zu, %zu, %zu\n", (size_t)&t6,
        - (size_t)&t5 + (size_t)&t6,
        - (size_t)&t4 + (size_t)&t6,
        - (size_t)&t3 + (size_t)&t6,
        - (size_t)&t2 + (size_t)&t6,
        - (size_t)&t1 + (size_t)&t6,
        - (size_t)&t0 + (size_t)&t6);
    t0 = t1 = t2 = t3 = t4 = t5 = t6;
    pt = &t0;
    fun3(*pt);
}
With -Os
t6 140737107100125, 3, 6, 9, 12, 15, 18

Would have the same problem, does that mean you can't trust taking the
address of anything on the stack?
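
[Editor's sketch, not part of the original message.] The layout facts under discussion can be checked directly. The helper names below are made up for illustration. On common ABIs such as x86-64 the size is 3 and the alignment is 1, but the C standard only guarantees the relations asserted in the code, and none of this settles whether the wide read is a compiler bug:

```c
#include <assert.h>
#include <stddef.h>

struct rgb3 { char r, g, b; };

size_t rgb3_size(void)  { return sizeof(struct rgb3); }
size_t rgb3_align(void) { return _Alignof(struct rgb3); }

/* Distance in bytes between consecutive array elements. */
size_t rgb3_stride(void)
{
    static struct rgb3 v[2];
    size_t stride = (size_t)((char *)&v[1] - (char *)&v[0]);

    /* Array elements are laid out back to back, sizeof bytes apart. */
    assert(stride == sizeof(struct rgb3));
    return stride;
}
```

Since sizeof is always a multiple of the alignment, every element of an array of struct rgb3 is suitably aligned for its own type, whatever the element index.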


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36043

-- 
David Fries PGP pub CB1EE8F0
http://fries.net/~david/


Re: gcc generated long read out of bounds segfault

2014-02-22 Thread Eric Botcazou
> Before I file a bug report I wanted to check to see if my expectations
> are wrong or if this is a compiler bug.  Is there anything that allows
> the compiler to generate instructions that would read beyond the end
> of an array potentially causing a crash if the page isn't accessible?

It's PR middle-end/36043 in GCC's bugzilla.

-- 
Eric Botcazou