On 25.10.2012 05:49, Bruce Evans wrote:
On Wed, 24 Oct 2012, Attilio Rao wrote:

On Wed, Oct 24, 2012 at 8:16 PM, Andre Oppermann <an...@freebsd.org> wrote:
...
Let's go back and see how we can do this the sanest way.  These are
the options I see at the moment:

 1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place

This is wrong because it doesn't give padding.

Unless it is sprinkled in struct declarations.
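
To make the distinction concrete, here is a minimal sketch (not code from
the tree). It assumes a 64-byte CACHE_LINE_SIZE and a hypothetical
"struct lock" standing in for a small lock type, and spells out the GCC
attribute that __aligned() expands to. The static assertions compile only
if the layout claims hold.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE	64	/* assumed value, for illustration only */

/* Hypothetical stand-in for a small lock; this is not the real struct mtx. */
struct lock {
	uintptr_t word;
};

/*
 * Attribute on a lone variable: the start address is aligned, but the
 * object keeps its natural size, so other data may share the rest of
 * the cache line.
 */
struct lock solo __attribute__((aligned(CACHE_LINE_SIZE)));
_Static_assert(sizeof(solo) == sizeof(struct lock),
    "a lone aligned variable is not padded");

/*
 * Attribute on members inside a struct declaration: each member starts
 * on its own line, and the struct size is rounded up to a whole number
 * of lines, which is where the padding comes from.
 */
struct two_locks {
	struct lock a __attribute__((aligned(CACHE_LINE_SIZE)));
	struct lock b __attribute__((aligned(CACHE_LINE_SIZE)));
};
_Static_assert(offsetof(struct two_locks, b) == CACHE_LINE_SIZE,
    "second member starts on the next line");
_Static_assert(sizeof(struct two_locks) == 2 * CACHE_LINE_SIZE,
    "struct is padded to two full lines");
```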

 2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
    the future possibly change to a different compiler dependent
    align attribute

What is this macro supposed to do? I don't understand that from your
description.

 3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
    automatically gets aligned in all cases, even when dynamically
    allocated.

This works but I think it is overkill for structures including sleep
mutexes, which are the vast majority. So I certainly wouldn't be in
favor of such a patch.

This doesn't work either with fully dynamic (auto) allocations.  Stack
alignment is generally broken (limited, and pessimized for both space
and time) in gcc (it works better in clang).  On amd64, it is limited
by the default of -mpreferred-stack-boundary=4.  Since 2**4 is smaller
than the cache line size and stack alignments larger than it are broken
in gcc, __aligned(CACHE_LINE_SIZE) never works (except accidentally,
16/CACHE_LINE_SIZE of the time).  On i386, we reduce the space/time
pessimizations a little by overriding the default to
-mpreferred-stack-boundary=2.  2**2 is even smaller than the cache
line size.  (The pessimizations are for both space and time, since
time and code space are wasted for the code to keep the stack aligned,
and cache space and thus also time are wasted for padding.  Most
functions don't benefit from more than sizeof(register_t) alignment.)

I'm not aware of stack-allocated mutexes anywhere in the kernel.
Even if there is such a case, it's very special and unique.

I've verified that __aligned(CACHE_LINE_SIZE) on the definition of
struct mtx itself (in sys/_mutex.h) correctly aligns and pads the
global .bss resident mutexes for 64B and 128B cache line sizes.
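
A sketch of what that kind of check looks like at compile time, under
stated assumptions: CACHE_LINE_SIZE is taken as 64 here, and "struct amtx"
and "struct object" are hypothetical types, not the real struct mtx from
sys/_mutex.h. It also shows the cost for any structure that embeds such a
lock.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE	64	/* assumed; 128 behaves the same way */

/* Hypothetical lock type with the attribute on the type definition itself. */
struct amtx {
	uintptr_t lock_word;
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* The type is both aligned and padded to one full line. */
_Static_assert(_Alignof(struct amtx) == CACHE_LINE_SIZE,
    "type alignment is one cache line");
_Static_assert(sizeof(struct amtx) == CACHE_LINE_SIZE,
    "type size is padded to one cache line");

/* Global (.bss) objects: every array element gets a line of its own. */
struct amtx global_locks[4];
_Static_assert(sizeof(global_locks) == 4 * CACHE_LINE_SIZE,
    "four locks occupy four lines");

/*
 * Embedding: the containing structure inherits the alignment and grows.
 * Two ints plus one lock word become three full cache lines.
 */
struct object {
	int		before;
	struct amtx	lock;
	int		after;
};
_Static_assert(offsetof(struct object, lock) == CACHE_LINE_SIZE,
    "embedded lock is pushed to the next line");
_Static_assert(sizeof(struct object) == 3 * CACHE_LINE_SIZE,
    "containing struct is padded to three lines");
```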

Dynamic allocations via malloc() get whatever alignment malloc() gives.
This is only required to be 4 or 8 or 16 or so (the maximum for a C
object declared in conforming C (no __align())), but malloc() usually
gives more.  If it gives CACHE_LINE_SIZE, that is wasteful for most
small allocations.
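
For comparison, a userland sketch of asking for cache-line alignment
explicitly instead of relying on what malloc() happens to return. It uses
C11 aligned_alloc(); CACHE_LINE_SIZE is assumed to be 64, and
LINE_ROUNDUP(), line_alloc() and is_line_aligned() are hypothetical
helpers. The kernel's malloc(9) has a different interface, so this only
illustrates the alignment arithmetic.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE	64	/* assumed value, for illustration only */

/* Round a size up to a whole number of cache lines (power-of-2 line size). */
#define LINE_ROUNDUP(n) \
	(((n) + (CACHE_LINE_SIZE - 1)) & ~(size_t)(CACHE_LINE_SIZE - 1))

/*
 * What conforming C promises for malloc(): alignment suitable for
 * max_align_t, which is smaller than a cache line on common platforms.
 */
_Static_assert(_Alignof(max_align_t) < CACHE_LINE_SIZE,
    "malloc()'s guaranteed alignment is below one cache line here");

/* Allocate n bytes starting on a line boundary and padded to full lines. */
void *
line_alloc(size_t n)
{
	/* C11 aligned_alloc() wants the size to be a multiple of the alignment. */
	return (aligned_alloc(CACHE_LINE_SIZE, LINE_ROUNDUP(n)));
}

/* Nonzero if p sits on a cache line boundary. */
int
is_line_aligned(const void *p)
{
	return (((uintptr_t)p & (CACHE_LINE_SIZE - 1)) == 0);
}
```

A 10-byte request through line_alloc() consumes a full 64-byte line, which
is the waste referred to above; the result is released with free().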

Stand-alone mutexes are normally not malloc'ed.  They're always
embedded into some larger structure they protect.

__builtin_alloca() is broken in gcc-3.3.3, but works in gcc-4.2.1, at
least on i386.  In gcc-3.3.3, it assumes that the stack is the default
16-byte aligned even if -mpreferred-stack-boundary=2 is in CFLAGS to
say otherwise, and just subtracts from the stack pointer.  In gcc-4.2.1,
it does the necessary andl of the stack pointer, but only 16-byte
alignment.

It is another bug that there are no extensions of malloc() or alloca().
Since malloc() is in the library and may give CACHE_LINE_SIZE but
__builtin_alloca() is in the compiler and only gives 16, these functions
are not even as compatible as they should be.
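
One common way around the 16-byte limit, shown here only as a sketch and
not necessarily the method used anywhere in the tree: over-allocate by
(alignment - 1) bytes and round the returned pointer up by hand.
ALIGN_UP(), ALLOCA_ALIGNED() and demo() are hypothetical names,
CACHE_LINE_SIZE is assumed to be 64, and __builtin_alloca() is the
gcc/clang builtin. The alloca call has to stay in a macro so that the
space comes from the caller's own frame.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_LINE_SIZE	64	/* assumed value, for illustration only */

/* Round x up to the next multiple of a; a must be a power of 2. */
#define ALIGN_UP(x, a) \
	(((uintptr_t)(x) + ((a) - 1)) & ~((uintptr_t)(a) - 1))

/*
 * Reserve up to (a - 1) extra bytes on the stack, then round the start
 * address up so the usable region of 'size' bytes begins on an a-byte
 * boundary regardless of how the compiler aligned the stack.
 */
#define ALLOCA_ALIGNED(size, a) \
	((void *)ALIGN_UP(__builtin_alloca((size) + (a) - 1), (a)))

/* Returns 1 if the hand-aligned stack buffer starts on a line boundary. */
int
demo(void)
{
	char *buf = ALLOCA_ALIGNED(100, CACHE_LINE_SIZE);

	memset(buf, 0, 100);	/* the full 100 bytes are usable */
	return (((uintptr_t)buf % CACHE_LINE_SIZE) == 0);
}
```

The price is the extra (alignment - 1) bytes of stack per allocation,
which matters when the stack is small.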

I don't know of any mutexes allocated on the stack, but there are stack
frames with mcontexts in them that need special alignment so they cause
problems on i386.  They can't just be put on the stack due to the above
bugs. They are laboriously allocated using malloc().  Since they are
quite large, 1 mcontext barely fits on the kernel stack, so kib didn't
like my alloca() method for allocating them.

You lost me here.

--
Andre
