date:20120925

Re: STDCXX-1056 : numpunct fix

2012-09-25 Thread Liviu Nicoara


On 09/24/12 22:28, Martin Sebor wrote:

On 09/20/2012 06:46 PM, Stefan Teleman wrote:

On Thu, Sep 20, 2012 at 8:39 PM, Liviu Nicoaranikko...@hates.ms  wrote:


I have not created this requirement out of thin air. STDCXX development has 
functioned in this manner for as long as I remember. If it does not suit you, 
that's fine.


That would explain why these bugs are present in the first place.

If the official method of determining thread-safety here is
fprintf(3C), then we have a much bigger problem than
22.locale.numpunct.mt.


Tests that end in .mt.cpp use the RW_ASSERT() macro to verify
correctness in multithreaded programs. There is no way at the
moment to run a thread analyzer as part of the test suite,
although it would be a nice addition. As I said in my other
reply, though, not all data races necessarily indicate bugs,
so we'd need a way to distinguish between know/benign races
and the rest. (Unless we decide to eliminate all races and
accept the performance penalty.)


Preserving the lazy initialization is possible: we perfectly forward to the 
implementation with either:

1. no additional synchronization (preserves fast reads of the data), while 
fully knowing that there are data races in __rw_get_numpunct/_C_get_data there, 
or
2.we put a big lock on top of all and every facet operations (the version that 
would be race free).

Caching seems out of the window for now, more so in the presence of unguarded 
reads of facet's data.

The only argument for (1) is that at the moment we don't have a failing test 
yet, and not for lack of arduous trying.

However, if atomic/membar APIs would be recognized by the compiler, lazy 
initialization of the facet and caching of data would be attainable in a quite 
simple fashion.

Liviu

Re: STDCXX-1056 : numpunct fix

2012-09-25 Thread Liviu Nicoara


On 09/24/12 23:50, Stefan Teleman wrote:

On Mon, Sep 24, 2012 at 10:03 PM, Martin Sebor mse...@gmail.com wrote:

FWIW, there are race conditions in stdcxx. Some of them are by
design and benign on the systems the library runs on (although
I acknowledge that some others may be bugs). One such benign
date race is:

   timeT1  T2
 0x = N
 1x = N   read x

where x is a scalar that can be accessed atomically by the CPU
and the compiler.

I think some of the lazy facet initialization falls under this
class. It would be nice to fix them but only if it doesn't slow
things down. The others need to be fixed in any case.


The race conditions I am talking about are not benign.


Caching failures aside, we have no failing tests. Before saying they are malign 
you need to objectively show a failing program.



It shows the types of race conditions it's reporting: these are
read/write race conditions, not read/read. The thread analyzer's html
filter doesn't show the types of races reported as clearly as the
command-line analyzer which has a windowing GUI.


Martin's example above is a read-write race. In the absence of a failing test 
case it is quite unreasonable to modify the current implementation.

Liviu

STDCXX-1070

2012-09-25 Thread Liviu Nicoara


I filed 1070, failure to build 22.locale.collate.cpp on Linux with gcc 4.7.1. 
Gcc, Comeau and Clang fail to compile it, Intel and Sun are fine. It looks to 
me like Intel and Sun compilers are not doing the right thing. A small test 
case and a patch have been attached. The failing code has been reduced to:

$ cat test.cpp; g++ -c test.cpp
template  class charT 
void f ()
{
g (charT ('a'));
}

template  class charT 
void g (charT)
{
}

int h ()
{
return f char  (), 0;
}
test.cpp: In instantiation of 'void f() \[with charT = char\]':
test.cpp:14:23:   required from here
test.cpp:4:5: error: 'g' was not declared in this scope, and no declarations 
were found by argument-dependent lookup at the point of instantiation 
\[-fpermissive\]
test.cpp:8:6: note: 'templateclass charT void g(charT)' declared here, later 
in the translation unit

Liviu

Re: STDCXX-1070

2012-09-25 Thread Martin Sebor


On 09/25/2012 02:46 PM, Liviu Nicoara wrote:

I filed 1070, failure to build 22.locale.collate.cpp on Linux with gcc
4.7.1. Gcc, Comeau and Clang fail to compile it, Intel and Sun are fine.
It looks to me like Intel and Sun compilers are not doing the right
thing. A small test case and a patch have been attached. The failing
code has been reduced to:


I agree that the test case is ill-formed and requires a diagnostic.

I haven't looked at the test so just to make sure the test case does
reflect the problem there: the well-formedness depends on whether
charT is a class type. If so, then the code would be okay because g()
would be found via associated namespace lookup. I.e., this would be
fine:

  template class charT
  void f () { g (charT ()); }

  template class charT
  void g (charT) { }

  struct S { };

  void h () { fS(); }

Martin



$ cat test.cpp; g++ -c test.cpp
template  class charT 
void f ()
{
g (charT ('a'));
}

template  class charT 
void g (charT)
{
}

int h ()
{
return f char  (), 0;
}
test.cpp: In instantiation of 'void f() \[with charT = char\]':
test.cpp:14:23: required from here
test.cpp:4:5: error: 'g' was not declared in this scope, and no
declarations were found by argument-dependent lookup at the point of
instantiation \[-fpermissive\]
test.cpp:8:6: note: 'templateclass charT void g(charT)' declared here,
later in the translation unit

Liviu

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

2012-09-25 Thread Liviu Nicoara

On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:

[
https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stefan,

I don't think it's ok to close this bug. The race conditions are there and we
have not come to a completely satisfactory conclusion on how to deal with them,
or even if we should deal with them. Whichever it is we gotta keep this
discussion open. I sure hope you want to be a part of it.

FWIW, I have spent quite some time looking at your proposed patch and I am going
to reopen the incident. If I can.

Liviu

Stefan Teleman closed STDCXX-1056.
--

Resolution: Won't Fix

Bug will not be fixed. Upstream refuses to acknowledge the existence of the bug
in spite of objective evidence to the contrary.

std::moneypunct and std::numpunct implementations are not thread-safe
-

Key: STDCXX-1056
URL: https://issues.apache.org/jira/browse/STDCXX-1056
Project: C++ Standard Library
Issue Type: Bug
Components: 22. Localization
Affects Versions: 4.2.1, 4.2.x, 4.3.x, 5.0.0
Environment: Solaris 10 and 11, RedHat and OpenSuSE Linux, Sun C++
Compilers 12.1, 12.2, 12.3
Issue is independent of platform and/or compiler.
Reporter: Stefan Teleman
Labels: thread-safety
Fix For: 4.2.x, 4.3.x, 5.0.0

Attachments: 22.locale.numpunct.mt.out, facet.cpp.diff,
locale_body.cpp.diff, punct.cpp.diff, runtests-linux32-all.out,
runtests-linux64-all.out, runtests.out, STDCXX-1056-additional-timings.tgz,
stdcxx-1056.patch, stdcxx-1056-timings.tgz,
stdcxx-4.2.x-numpunct-perfect-forwarding.patch,
stdcxx-4.3.x-numpunct-perfect-forwarding.patch

several member functions in std::moneypunct and std::numpunct return
a std::string by value (as required by the Standard). The implication of
return-by-value
being that the caller owns the returned object.
In the stdcxx implementation, the std::basic_string copy constructor uses a
shared
underlying buffer implementation. This shared buffer creates the first problem
for
these classes: although the std::string object returned by value *appears* to
be owned
by the caller, it is, in fact, not.
In a mult-threaded environment, this underlying shared buffer can be
subsequently modified by a different thread than the one who made the initial
call. Furthermore, two or more different threads can access the same shared
buffer at the same time, and modify it, resulting in undefined run-time
behavior.
The cure for this defect has two parts:
1. the member functions in question must truly return a copy by avoiding a call
to the copy constructor, and using a constructor which creates a deep copy of
the std::string.
2. access to these member functions must be serialized, in order to guarantee
atomicity
of the creation of the std::string being returned by value.
Patch for 4.2.1 to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

2012-09-25 Thread Wojciech Meyer

Hi guys,

I think the consensus about this bug is that everybody wants to fix it.
Stefan has a patch but I got completely lost what is the exact reason of
not applying it. It would be a shame to stop at this point.

Liviu Nicoara nikko...@hates.ms writes:

On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:

[
https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stefan,

I don't think it's ok to close this bug. The race conditions are there
and we have not come to a completely satisfactory conclusion on how to
deal with them, or even if we should deal with them. Whichever it is
we gotta keep this discussion open. I sure hope you want to be a part
of it.

FWIW, I have spent quite some time looking at your proposed patch and
I am going to reopen the incident. If I can.

Liviu

Stefan Teleman closed STDCXX-1056.
--

Resolution: Won't Fix

Bug will not be fixed. Upstream refuses to acknowledge the existence of the
bug in spite of objective evidence to the contrary.

std::moneypunct and std::numpunct implementations are not thread-safe
-

several member functions in std::moneypunct and std::numpunct return
a std::string by value (as required by the Standard). The implication of
return-by-value
being that the caller owns the returned object.
In the stdcxx implementation, the std::basic_string copy constructor uses a
shared
underlying buffer implementation. This shared buffer creates the first
problem for
these classes: although the std::string object returned by value *appears*
to be owned
by the caller, it is, in fact, not.
In a mult-threaded environment, this underlying shared buffer can be
subsequently modified by a different thread than the one who made the
initial call. Furthermore, two or more different threads can access the
same shared buffer at the same time, and modify it, resulting in undefined
run-time behavior.
The cure for this defect has two parts:
1. the member functions in question must truly return a copy by avoiding a
call to the copy constructor, and using a constructor which creates a deep
copy of the std::string.
2. access to these member functions must be serialized, in order to
guarantee atomicity
of the creation of the std::string being returned by value.
Patch for 4.2.1 to follow.

--
Wojciech Meyer
http://danmey.org

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

2012-09-25 Thread Stefan Teleman

On Tue, Sep 25, 2012 at 8:05 PM, Liviu Nicoara nikko...@hates.ms wrote:
 On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:


   [
 https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


 Stefan,

 I don't think it's ok to close this bug. The race conditions are there and
 we have not come to a completely satisfactory conclusion on how to deal with
 them, or even if we should deal with them. Whichever it is we gotta keep
 this discussion open. I sure hope you want to be a part of it.

 FWIW, I have spent quite some time looking at your proposed patch and I am
 going to reopen the incident. If I can.

I am done wasting my time on trying to convince this project that it
should fix its severe bugs.

If and when this project decides to abandon the adolescent attitude
which appears to be one of its primary characteristics, let me know.

-- 
Stefan Teleman
KDE e.V.
stefan.tele...@gmail.com

Re: [jira] [Closed] (STDCXX-1066) SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 8-byte boundary

2012-09-25 Thread Liviu Nicoara


On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:


  [ 
https://issues.apache.org/jira/browse/STDCXX-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Anybody around here, except Stefan, who has access to a SPARC V8 machine updated 
to the specified kernel update or later, and who is willing to run a simple test 
program? It's a 5 minute job at most.


Thanks!



Stefan Teleman closed STDCXX-1066.
--

 Resolution: Won't Fix

Bug will not be fixed upstream. It is fixed in the Solaris releases.


SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 8-byte 
boundary
---

 Key: STDCXX-1066
 URL: https://issues.apache.org/jira/browse/STDCXX-1066
 Project: C++ Standard Library
  Issue Type: Bug
  Components: Thread Safety
Affects Versions: 4.2.1, 4.2.2, 4.2.x, 4.3.x
 Environment: Solaris 10 Update 6 or later on SPARCV8 [32-bit]
Defect is compiler-independent - in reality it only affects Sun Studio and GCC.
Reporter: Stefan Teleman
  Labels: features, runtime, threads
 Fix For: 4.2.2, 4.2.x, 4.3.x

 Attachments: _config-gcc.h.stdcxx-1066.patch, 
_config-sunpro.h.stdcxx-1066.patch, ctype.cpp.stdcxx-1066.patch, 
exception.cpp.stdcxx-1066.patch, ios.cpp.stdcxx-1066.patch, 
iostream.cpp.stdcxx-1066.patch, iostream.stdcxx-1066.patch, 
locale_body.cpp.stdcxx-1066.patch, locale_classic.cpp.stdcxx-1066.patch, 
messages.cpp.stdcxx-1066.patch, _mutex.h.stdcxx-1066.patch, 
time_put.cpp.stdcxx-1066.patch, use_facet.h.stdcxx-1066.patch


Starting with Solaris 10 Update 6, on SPARCV8, pthread_mutex_t and 
pthread_cond_t MUST be aligned on an 8-byte boundary. Misaligned access will 
result in either SEGV or SIGBUS.
There are numerous places in the multi-threaded version of stdcxx where 
pthread_mutex_t and/or pthread_cond_t types are contained within an union, but 
with an enforced alignment different than 8. All these instances must be 
corrected, and #ifdef-guarded for SPARCV8.
Patches to follow shortly, this is just opening the issue.
Warning: the patchset resolving this issue is very large, and it affects a 
large number of files.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: STDCXX-1056 : numpunct fix

2012-09-25 Thread Liviu Nicoara


On 9/24/12 11:06 PM, Stefan Teleman wrote:

On Mon, Sep 24, 2012 at 7:48 PM, Liviu Nicoara nikko...@hates.ms wrote:


Stefan, was it your intention to completely eliminate all the race
conditions with this last patch? Is this what the tools showed in your
environment?



https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461452#comment-13461452


Yes, all the race conditions in std::numpunct and std::moneypunct.
Not all the race conditions in stdcxx.


FWIW, I have analyzed the performance of the latest patch from Stefan. The test 
program is the same test program used in the timings attachments to the 
incident. The version of the library was 4.2.1, compiler gcc 4.7.1, with:


1. The simple, perfect forwarding patch, with no caching, which does eliminate 
race conditions.

2. Stefan's latest patch
3. A patch I created which truly, if crudely, eliminates all facet race 
conditions by using a top-level lock for all facet operations(*)


The numbers from the first test run are already available in the incident.

The numbers for the second test run were a bit bigger, which is expected because 
of the additional synchronization. E.g., I have found that the test, run against 
the STDCXX locale database, gives:


$ time ./t en_US.UTF-8 4 500
4, 500

real0m19.210s
user0m32.659s
sys 0m36.507s

where the first patch gives:

$ time ./t en_US.UTF-8 4 5000; done
4, 5000

real0m7.961s
user0m31.211s
sys 0m0.003s

Notice the number of loops, 10x larger in the second set of number.

Now, for the third patch, the numbers are embarrassingly large. It practically 
slows the program to a crawl because it does not do any reads of facet data 
without holding the lock. This, I wager, eliminates all races in the 
numpunct/moneypunct facet. See:


$ time ./t en_US.UTF-8 4 10
4, 10

real0m24.316s
user0m28.213s
sys 0m6.633s

Now that is a whooping performance hit.

I went back to see an explanation between the numbers I see with Stefan's more 
performing patch, and my lock-all patch. I believe that there are still places 
in Stefan's patch where the facet makes unguarded reads of facet data, e.g.:


loc/_numpunct.h:
190 // returns a pointer to the facet's implementation data
191 // if it exists, 0 otherwise
192 const void* _C_data () const {
193 return _C_impsize ? _C_impdata
194 : _RWSTD_CONST_CAST (__rw_facet*, this)-_C_get_data ();
195 }

where _C_impsize/_C_impdata are read outside of a lock scope, etc. I.e., a 
thread may read the _C_impsize while, say, another is still writing the 
_C_impdata member variable.


I conclude, even with such simple testing, that any solution that attempts to 
completely eliminate the race conditions in these facets will incur a penalty, 
with the safest (top-level locking) the worst by orders of magnitude and any 
other solution somewhere in between.


Please let me know if you have any questions.

Liviu

(*) Will attach it as soon as we figure out how to undo the closing of the 
issue.

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

2012-09-25 Thread Martin Sebor

On 09/25/2012 06:08 PM, Wojciech Meyer wrote:

Hi guys,

From my POV, there's no disagreement that there is a bug
in the thread safety of the facets or its cause, or a lack
of desire to fix it. I think the main reason for the lack
of consensus on a patch is that we don't yet have a good
sense of the performance impact of the proposed approaches
or an understanding of the differences in the timings.

As for Stefan's patches, the biggest problem I see there
(besides the above) is that they introduce changes that
appear to be unrelated to the problem without providing
a satisfactory rationale. The established stdcxx bug
fixing process is to create as small a patch as possible
that does nothing but address the problem. This is
particularly important when we don't have ready access
to test the patch with all the targeted compilers and
on all the targeted operating systems (to minimize the
risk of incidental breakage and to make it as easy as
possible to root cause possible regressions.

Martin

Liviu Nicoara nikko...@hates.ms writes:

On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:

[
https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stefan,

I don't think it's ok to close this bug. The race conditions are there
and we have not come to a completely satisfactory conclusion on how to
deal with them, or even if we should deal with them. Whichever it is
we gotta keep this discussion open. I sure hope you want to be a part
of it.

FWIW, I have spent quite some time looking at your proposed patch and
I am going to reopen the incident. If I can.

Liviu

Stefan Teleman closed STDCXX-1056.
--

Resolution: Won't Fix

Bug will not be fixed. Upstream refuses to acknowledge the existence of the bug
in spite of objective evidence to the contrary.

std::moneypunct and std::numpunct implementations are not thread-safe
-

--
Wojciech Meyer
http://danmey.org

Re: [jira] [Closed] (STDCXX-1066) SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 8-byte boundary

2012-09-25 Thread Travis Vitek


I asked around and we don't have one available at this time.

Travis

Liviu Nicoara wrote:
On 9/25/12 7:56 PM, Stefan Teleman (JIRA) wrote:

   [ 
 https://issues.apache.org/jira/browse/STDCXX-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
  ]

Anybody around here, except Stefan, who has access to a SPARC V8 machine updated
to the specified kernel update or later, and who is willing to run a simple test
program? It's a 5 minute job at most.

Thanks!


 Stefan Teleman closed STDCXX-1066.
 --

  Resolution: Won't Fix

 Bug will not be fixed upstream. It is fixed in the Solaris releases.

 SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 
 8-byte boundary
 ---

  Key: STDCXX-1066
  URL: https://issues.apache.org/jira/browse/STDCXX-1066
  Project: C++ Standard Library
   Issue Type: Bug
   Components: Thread Safety
 Affects Versions: 4.2.1, 4.2.2, 4.2.x, 4.3.x
  Environment: Solaris 10 Update 6 or later on SPARCV8 [32-bit]
 Defect is compiler-independent - in reality it only affects Sun Studio and 
 GCC.
 Reporter: Stefan Teleman
   Labels: features, runtime, threads
  Fix For: 4.2.2, 4.2.x, 4.3.x

  Attachments: _config-gcc.h.stdcxx-1066.patch, 
 _config-sunpro.h.stdcxx-1066.patch, ctype.cpp.stdcxx-1066.patch, 
 exception.cpp.stdcxx-1066.patch, ios.cpp.stdcxx-1066.patch, 
 iostream.cpp.stdcxx-1066.patch, iostream.stdcxx-1066.patch, 
 locale_body.cpp.stdcxx-1066.patch, locale_classic.cpp.stdcxx-1066.patch, 
 messages.cpp.stdcxx-1066.patch, _mutex.h.stdcxx-1066.patch, 
 time_put.cpp.stdcxx-1066.patch, use_facet.h.stdcxx-1066.patch


 Starting with Solaris 10 Update 6, on SPARCV8, pthread_mutex_t and 
 pthread_cond_t MUST be aligned on an 8-byte boundary. Misaligned access will 
 result in either SEGV or SIGBUS.
 There are numerous places in the multi-threaded version of stdcxx where 
 pthread_mutex_t and/or pthread_cond_t types are contained within an union, 
 but with an enforced alignment different than 8. All these instances must be 
 corrected, and #ifdef-guarded for SPARCV8.
 Patches to follow shortly, this is just opening the issue.
 Warning: the patchset resolving this issue is very large, and it affects a 
 large number of files.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA administrators
 For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: STDCXX-1056 : numpunct fix

Re: STDCXX-1056 : numpunct fix

STDCXX-1070

Re: STDCXX-1070

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

Re: [jira] [Closed] (STDCXX-1066) SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 8-byte boundary

Re: STDCXX-1056 : numpunct fix

Re: [jira] [Closed] (STDCXX-1056) std::moneypunct and std::numpunct implementations are not thread-safe

Re: [jira] [Closed] (STDCXX-1066) SPARCV8 requires pthread_mutex_t and pthread_cond_t to be aligned on an 8-byte boundary

11 matches

Site Navigation

Mail list logo

Footer information