On Tue, Sep 4, 2012 at 10:49 PM, Martin Sebor <mse...@gmail.com> wrote:

>   template <class _CharT>
>   inline string numpunct<_CharT>::grouping () const
>   {
>       if (!(_C_flags & _RW::__rw_gr)) {
>
>           numpunct* const __self = _RWSTD_CONST_CAST (numpunct*, this);
>
>          _RWSTD_MT_GUARD (_C_mutex);
>
>          // [try to] get the grouping first (may throw)
>           // then set a flag to avoid future initializations
>           __self->_C_grouping  = do_grouping ();
>           __self->_C_flags    |= _RW::__rw_gr;
>       }
>
>       return _C_grouping;
>   }

That's what I wanted to do originally - use a per-object mutext.
Unfortunately the _C_mutex member in rw::__rw_synchronized is static:

struct __rw_synchronized
{
    // static so that it takes up no space
    static _RWSTD_EXPORT __rw_mutex _C_mutex;

    void _C_lock () { }

    void _C_unlock () { }

    __rw_guard _C_guard () {
        return __rw_guard (_C_mutex);
    }

and __rw::rw_guard doesn't have an appropriate constructor.

Intel C++ complains about it too:

/src/steleman/programming/stdcxx-intel/stdcxx-4.2.1-thread-safe/include/loc/_numpunct.h(181):
error: no instance of constructor "__rw::__rw_guard::__rw_guard"
matches the argument list
            argument types are: (const __rw::__rw_mutex)
          _RWSTD_MT_GUARD (_C_mutex);
          ^

This works:

template <class _CharT>
inline string numpunct<_CharT>::grouping () const
{
    if (!(_C_flags & _RW::__rw_gr)) {

        numpunct* const __self = _RWSTD_CONST_CAST (numpunct*, this);

        _RWSTD_MT_STATIC_GUARD (_Type);

        // [try to] get the grouping first (may throw)
        // then set a flag to avoid future initializations
        __self->_C_grouping  = do_grouping ();
        __self->_C_flags    |= _RW::__rw_gr;
    }

    return _C_grouping;
}

Although I'm not sure of the performance implications difference
between _RWSTD_MT_STATIC_GUARD and _RWSTD_MT_CLASS_GUARD for this
particular problem. I'm going with "nothing in real life". :-)

And even so, this is still not thread-safe:

Two different threads [ T1 and T2 ], seeking two different locales
[en_US.UTF-8 and ja_JP.UTF-8], enter std::numpunct<_CharT>::grouping()
at the same time - because they are running on two different cores.
They both test for

    if (!(_C_flags & _RW::__rw_gr))

and then -- assuming the expression above evaluates to true -- one of
them wins the mutex [T1], and the other one [T2] blocks on the mutex.

When T1 is done and exits the function, the grouping is set to
en_US.UTF-8 and the mutex is released. Now T2 acquires the mutex, and
proceeds to setting grouping to ja-JP.UTF-8. Woe on the caller running
from T1 who now thinks he got en_US.UTF-8, but instead he gets
ja_JP.UTF-8, which was duly set so by T2, but T1 had no clue about it
(remember, the std::string grouping _charT buffer is shared by the
caller from T1 and the caller from T2).

So at a minimum, the locking must happen before evaluating the

    if (!(_C_flags & _RW::__rw_gr))

expression.

This still doesn't solve what ends up being returned in grouping. If
we lock at the top of the function, then, when T2 acquires the mutex,
the test expression will evaluate to false. Therefore T2 will return
whatever is in grouping right now, which happens to be en_US.UTF-8 as
set by T1, when T2 really wanted ja_JP.UTF-8.

I really think the appropriate fix here -- which would address the
performance implications -- is more complex than this. I am thinking
about creating and using a (non-publicly accessible) internal locale
cache:

typedef std::map<std::string, std::locale> locale_cache;

where all the locales are stored fully initialized, on demand. There
is only one locale instantiation and initialization overhead cost per
locale. After a locale has been instantiated and placed into the
cache, the caller of any specfic locale gets a copy from the cache,
fully instantiated and initialized. But this breaks ABI, so I'm
thinking it's for stdcxx 5.

Thoughts?

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.tele...@gmail.com

Reply via email to