Re: (L) Prelim RFR: 8132510: Replace ThreadLocalStorage with compiler/language-based thread-local variables

David Holmes Tue, 03 Nov 2015 20:59:28 -0800

Hi Thomas,

On 4/11/2015 12:14 AM, Thomas Stüfe wrote:

On Mon, Nov 2, 2015 at 9:01 PM, David Holmes <[email protected]
<mailto:[email protected]>> wrote:
    On 2/11/2015 9:20 PM, Thomas Stüfe wrote:
        some small changes are needed to make this build and run on AIX. I
        attached a patch file with the needed additions.


I also checked Linux ppc64, seems to work fine.


Excellent! Thank you!

        I did not run any extensive tests on AIX, so I cannot say for
        sure if
        this is stable. We (SAP) also may face some problems later when
        we port
        this to HP-UX, because there, shared libraries using __thread
        cannot be loaded dynamically.

    Ouch!

        So, I admit to some small worries, beside the issue with memory
        leaks on
        older glibc versions. For me, this feels like something which needs
        tight compiler/thread library support from the OS, so it makes us
        vulnerable to running on older systems (older glibc) or building
        with
        outdated compilers. Therefore it would be nice to have a simple
        way to re-add the pthread-based TLS implementation if needed.

    I can't see how to do that without keeping all the existing layers
    of code - even though they would be no-ops on all the platforms that
    support the compiler-based TLS. Basically just extend what I did for
    Solaris to the other platforms.

I took a closer look and I now I worry less. I am confident that in case
our old platforms experience problemswith __thread, we can reintroduce
TLS without too many changes.

Just as a test, I changed the AIX implementation from using __thread
back to pthread tls just by changing implementations for
Thread::current(), Thread::initialize_thread_current() and
Thead::clear_thread_current() in thread.cpp. Works fine as expected. Of
course this was just a hack, but if we need to go back to pthread tls
for AIX or any platform, I think it can be done in a simpler way than
before and still be clean.

Thanks for looking into this in detail! Yes I've been thinking aboutthis too, and I think three of four simple hooks will allow the basicpthread-TLS mechanism to be reinstated, in shared code, but without anyof the per-platform fancy caching schemes. There will be a singlethreadLocalStorage.cpp file in a platform specific directory; and ofcourse MacroAssembler::get_thread may need to be os/cpu specific. I willlook further into this, but may defer its implementation to a follow upissue.

Not terribly important, but I would prefer if
Thread::initialize_thread_current() and Thead::clear_thread_current()
were not exposed from Thread at all or at least as private as possible.
Thread::initialize_thread_current() is called from the OS layer, but
Thead::clear_thread_current() is only called from within thread.cpp
itself and could be kept at file scope.

As you note the initialize function has to be exposed as it is needed inthe OS thread startup code. But I can make the clear function private.

        Apart from that, I like the patch and think the simplification
        is good and worth the effort.

    Even if you can't easily add back the pthread-based TLS if needed?

I think we can, if needed.

Ok.

    It is unfortunate that hotspot may still be shackled to the past
    that way - we killed off hotspot-express (in part) to remove those
    shackles and allow us to modernize the codebase.

    Thanks,
    David


One question about your changes:

Before, Thread::current() would assert instead of returning NULL if
called before Thread::initialize_thread_current() or after
Thead::clear_thread_current() . Now, we just return NULL. Is this intended?


Ah great question ... so before we have a mix of calls to:

- Thread::current()  (asserts on NULL as does JavaThread::current)
- ThreadLocalStorage::thread() (can return NULL)
- ThreadLocalStorage::get_thread_slow() (can return NULL)

and now we only have Thread::current() which means we have to allowreturning NULL because it can be intentionally called when a thread isnot attached. That means we won't directly catch calls toThread::current() from code that doesn't realize it is calling it "toosoon" - though there do exist numerous assertions in the callers ofThread::current() that check the result is not NULL.

I could add the assert to Thread::current() and also addThread::current_or_null() to be used by code that needs to use it tocheck for attachment (ie JNI code). I'd also have to examine all thechanged ThreadLocalStorage::thread/get_thread_slow call-sites to see ifany of those legitimately expect the thread may not be attached.


What do you think?

I also need to look at the location of Thread::current in the .hpp filerather than .inline.hpp and reconcile that with comments regarding thenoinline version (which is only used in g1HotCardCache.hpp).


Thanks,
David

Regards, Thomas

        Kind Regards, Thomas






        On Mon, Nov 2, 2015 at 7:40 AM, David Holmes
        <[email protected] <mailto:[email protected]>
        <mailto:[email protected]
        <mailto:[email protected]>>> wrote:

             bug: https://bugs.openjdk.java.net/browse/JDK-8132510

             Open webrev:
        http://cr.openjdk.java.net/~dholmes/8132510/webrev.v2/

             A simple (in principle) but wide-ranging change which
        should appeal
             to our Code Deletion Engineer's. We implement Thread::current()
             using a compiler/language-based thread-local variable eg:


               static __thread Thread *_thr_current;

               inline Thread* Thread::current() {
                 return _thr_current;
               }

             with an appropriate setter of course. By doing this we can
             completely remove the platform-specific ThreadLocalStorage
             implementations, and the associated
        os::thread_local_storage* calls,
             plus all the uses of ThreadLocalStorage::thread() and
             ThreadLocalStorage::get_thread_slow(). This extends the
        previous
             work done on Solaris to implement
        ThreadLocalStorage::thread() using
             compiler-based thread-locals.

             We can also consolidate nearly all the os_cpu versions of
             MacroAssembler::get_thread on x86 into one cpu specific one ( a
             special variant is still needed for 32-bit Windows).

             As a result of this change we have further potential cleanups:
             - all the src/os/<os>/vm/thread_<os>.inline.hpp files are now
             completely empty and could also be removed
             - the MINIMIZE_RAM_USAGE define (which avoids use of the linux
             sp-map "cache" on 32-bit) now has no affect and so could be
             completely removed from the build system

             I plan to do the MINIMIZE_RAM_USAGE removal as a follow up
        CR, but
             could add the removal of the "inline" files to this CR if
        people
             think it worth removing them.

             I have one missing piece on Aarch64 - I need to change
             MacroAssembler::get_thread to simply call Thread::current()
        as on
             other platforms, but I don't know how to write that. I would
             appreciate it if someone could give me the right code for that.

             I would also appreciate comments/testing by the AIX and
        PPC64 folk
             as well.

             A concern about memory-leaks had previously been raised, but
             experiments using simple C code on linux 86 and Solaris
        showed no
             issues. Also note that Aarch64 already uses this kind of
        thread-local.

             Thanks,
             David

Re: (L) Prelim RFR: 8132510: Replace ThreadLocalStorage with compiler/language-based thread-local variables

Reply via email to