* Linus Torvalds <[email protected]> wrote:

> On Mon, Jul 6, 2015 at 6:44 AM, Ingo Molnar <[email protected]> wrote:
> >
> > So looking at this I question the choice of -mpreferred-stack-boundary=3.
> > Why not do -mpreferred-stack-boundary=2?
> 
> It wouldn't make sense anyway - it would only make code worse (if it worked)
> and not any better.
> 
> The reason the "=3" value is good is because 8-byte alignment is the
> "natural" alignment - it's what you get with a normal call sequence, simply
> because the return address is 8 bytes in size.
> 
> That means that with "=3" you don't get extra code to align the stack for the 
> simple functions that don't need a frame.
> 
> Anything smaller than 3 wouldn't help even if it worked, because none of the
> normal stack operations (pushing/popping registers to save/restore them)
> would be any smaller anyway.
> 
> But bigger values than 3 result in the compiler having to generate extra
> stack adjustments just to align the stack after a call that very naturally
> mis-aligned it. And it doesn't help anyway, since in the kernel we don't put
> stuff on the stack that needs bigger alignment (ok, the fxsave buffer is a
> counter-example, but it's a very odd one that we _shouldn't_ have put on the
> stack).
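
To make the quoted argument concrete, here is a rough sketch in C. It is a
simplified model, not what GCC literally emits: it assumes the caller's stack
pointer sits on the preferred boundary just before the call, and the helper
names (boundary_bytes, entry_padding) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define RETURN_ADDRESS_SIZE 8	/* bytes pushed by an x86-64 call instruction */

/* Alignment in bytes implied by -mpreferred-stack-boundary=n, i.e. 2^n. */
static size_t boundary_bytes(unsigned int n)
{
	return (size_t)1 << n;
}

/*
 * Hypothetical helper: padding a callee needs to get back onto a 2^n
 * boundary, assuming the caller was on that boundary just before the call
 * and the call pushed only the return address.
 */
static size_t entry_padding(unsigned int n)
{
	size_t align = boundary_bytes(n);

	assert(align >= RETURN_ADDRESS_SIZE);	/* model only covers n >= 3 */
	return (align - RETURN_ADDRESS_SIZE) % align;
}
```

In this model entry_padding(3) is 0, which matches the point that "=3" needs
no extra adjustment, while entry_padding(4) is 8 bytes of padding per call.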

Ok, so it's all moot, but my (quite possibly flawed) thinking was that for
deeper call chains, using 4-byte RSP alignment (as opposed to 8 bytes) would
allow, in about 50% of the cases, the stack frame to be narrower by 4 bytes
(depending on whether the 'natural' stack boundary is properly aligned to 8
bytes or not).

For a 10-deep call chain that's a stack that is 20 bytes more compact on
average (10*4*0.5), resulting in a tiny bit denser D$.
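
As a sanity check on that arithmetic, a minimal sketch in C, assuming every
frame is simply rounded up to the alignment. The function names and any frame
sizes fed to it are invented for illustration, not measured from real code.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: total stack used by a call chain when every frame
 * is padded to a multiple of align bytes.
 */
static size_t chain_stack_bytes(const size_t *frame_sizes, size_t depth,
				size_t align)
{
	size_t total = 0;

	for (size_t i = 0; i < depth; i++)
		total += round_up(frame_sizes[i], align);
	return total;
}
```

With ten made-up frame sizes of 12, 16, 20, ..., 48 bytes, half of which are
not multiples of 8, this gives 300 bytes at 4-byte alignment and 320 bytes at
8-byte alignment, i.e. the 20 byte difference estimated above.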

My assumptions were:

 - no extra code is generated by GCC. (If it causes any extra code to be
   generated then it's an obvious loss.)

 - mis-aligning an 8-byte variable by 4 bytes is handled quite well by most
   x86 uarchs, without penalty in most cases.

But ... it's all moot and even in the best case if both my assumptions are
fully met (which is not a given), the advantages are pretty marginal, so
consider the idea dead by multiple mortal wounds.

Thanks,

        Ingo