Re: [PATCH] STDCXX-853
On 09/24/12 01:18, Travis Vitek wrote: Liviu, Should the volatile be to the left of the intT typename here? I know it is equivalent, but it is weird to look at the line of code below and see that we're following two different conventions. Thanks, will do. Travis ___ From: Liviu Nicoara Sent: Sunday, September 23, 2012 8:34 AM To: dev@stdcxx.apache.org Subject: [PATCH] STDCXX-853 Umm, I didn't think to search for a corresponding incident and I considered the defect to be so minor as to not warrant an issue. The following patch has been applied already on 4.2.x: Index: tests/support/atomic_xchg.cpp === --- tests/support/atomic_xchg.cpp (revision 1388732) +++ tests/support/atomic_xchg.cpp (revision 1388733) @@ -297,7 +297,7 @@ void run_test (intT, thr_args_base::tag_ // compute the expected result, skipping zeros by incrementing // expect twice when it overflows and wraps around to 0 (zero is // used as the lock variable in thread_routine() above) -intT expect = intT (1); +intT volatile expect = intT (1); const unsigned long nincr = (Args::nthreads_ * Args::nincr_) / 2U; -- And now I see with eye serene The very pulse of the machine.
Re: STDCXX-1066 [was: Re: STDCXX forks]
On Mon, Sep 24, 2012 at 8:21 AM, Liviu Nicoara nikko...@hates.ms wrote: In the light of your inability to answer the simplest questions about the correctness and usefulness of this patch, I propose we strike the patch in its entirety. Let me make something very clear to you: what I am doing here is a courtesy to the stdcxx project. There is no requirement in my job description to waste hours arguing with you in pointless squabbles over email. Nor is there a requirement in the APL V2.0 which would somehow compel us to redistribute our source code changes. We could and should re-work the instances in the library where we might use mutex and condition objects that are misaligned wrt the mentioned kernel update. Yeah, why don't you go ahead and do that. Just like you fixed stdcxx-1056. --Stefan -- Stefan Teleman KDE e.V. stefan.tele...@gmail.com
RE: STDCXX-1066 [was: Re: STDCXX forks]
Comments below... -Original Message- From: Stefan Teleman Sent: Monday, September 24, 2012 5:46 AM Subject: Re: STDCXX-1066 [was: Re: STDCXX forks] On Mon, Sep 24, 2012 at 8:21 AM, Liviu Nicoara nikko...@hates.ms wrote: In the light of your inability to answer the simplest questions about the correctness and usefulness of this patch, I propose we strike the patch in its entirety. Let me make something very clear to you: what I am doing here is a courtesy to the stdcxx project. Thank you for your contribution. There is a good chance that without your activity on this project over the last few months that it would be in the attic. There is no requirement in my job description to waste hours arguing with you in pointless squabbles over email. Nor is there a requirement in the APL V2.0 which would somehow compel us to redistribute our source code changes. The problem here is that code changes are expected to pass review. They must follow established conventions, solve the problem in a reasonable and portable way, provide a test case (where one doesn't already exist), as well as make sense to everyone who is working with the product. Several of the changes included don't make sense to the rest of us. Unless we manage to figure them out on our own or you manage to explain them to us, then the changes should probably be excluded from the code. For example, the changes to src/exception.cpp don't make sense to me. Here is the relevant diff... +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT) +# pragma pack(8) +# pragma align 8(__rw_aligned_buffer) +#endif + // facet must never be destroyed static __rw_aligned_buffer_Msgs msgs; It seems that we trying to give 8-byte alignment to a class of type __rw_aligned_buffer. The point of __rw_aligned_buffer is to give us a block of properly aligned data, and if it isn't aligned, then it is broken. So, why are we using a pragma for alignment here (and potentially in other places) when it seems that there is a bug in __rw_aligned_buffer that we should be addressing? Of course, if this is a problem with aligned buffer, we need a testcase. Another example are the changes to src/ctype.cpp +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT) +# pragma pack(8) +# pragma align 8(f) +# pragma align 8(pf) +#endif static union { +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT) +unsigned long long align_; +unsigned char data_ [sizeof (_STD::ctypechar)]; +#else void *align_; char data_ [sizeof (_STD::ctypechar)]; +#endif } f; static __rw_facet* const pf = new (f) _STD::ctypechar(__rw_classic_tab, false, ref); pfacet = pf; +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT) +# pragma pack(0) +#endif This change seems to give alignment to `f' in two different ways, and it seems to be applying alignment to the pointer (the issue Liviu has brought up). It seems like the simpler fix would be to unconditionally add union members to `f' so that we get the proper alignment (much like we try to do in __rw_aligned_buffer), or an even better solution would be to use __rw_aligned_buffer. We could and should re-work the instances in the library where we might use mutex and condition objects that are misaligned wrt the mentioned kernel update. Yeah, why don't you go ahead and do that. Just like you fixed stdcxx- 1056. There is no need for comments like this. Stefan, if you are feeling singled out and attacked because your patches aren't accepted without question, it might be a good idea to go back and look through the mailing list archives. We have _all_ been in this position. Nearly every time I posted a patch for the first year working on stdcxx was like this, and I've been doing C++ software for 15 years. The company where I work currently has a review process that is just as rigorous as what you're seeing with stdcxx now. The process isn't there to attack anyone personally, but to ensure the quality and maintainability of the library for years to come. Travis
Re: STDCXX-1056 : numpunct fix
On 9/24/12 12:06 AM, Stefan Teleman wrote: On Fri, Sep 21, 2012 at 9:10 AM, Liviu Nicoara nikko...@hates.ms wrote: On 09/21/12 05:13, Stefan Teleman wrote: On Fri, Sep 21, 2012 at 2:28 AM, Travis Vitek travis.vi...@roguewave.com wrote: I have provided this list with test results showing that my patch *does* fix the race condition problems identified by all the tools at my disposal. I'm willing to bet you never looked at it. You dismissed it outright, just because you didn't like the *idea* that increasing the size of the cache, and eliminating that useless cache resizing would play an important role in eliminating the race conditions. I looked at it in great detail and I am sure Travis read it too. The facet/locale buffer management is a red herring. Apparently, we cannot convince you, regardless of the argument. That's fine by me. This bug [STDCXX-1056] was updated over the weekend with new comments. Here's the link to the comments, for the record: Stefan, was it your intention to completely eliminate all the race conditions with this last patch? Is this what the tools showed in your environment? Thanks, Liviu https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461452#comment-13461452
Re: STDCXX-1056 : numpunct fix
On Mon, Sep 24, 2012 at 7:48 PM, Liviu Nicoara nikko...@hates.ms wrote: Stefan, was it your intention to completely eliminate all the race conditions with this last patch? Is this what the tools showed in your environment? https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461452#comment-13461452 Yes, all the race conditions in std::numpunct and std::moneypunct. Not all the race conditions in stdcxx. Yes, that's what the test results show with my latest patchset - 0 race conditions in numpunct and moneypunct on Linux/Intel 32/64, Solaris SPARC 32/64 and Solaris Intel 32/64. The test results are in the onlinehome.us directory URL i put in the comment. -- Stefan Teleman KDE e.V. stefan.tele...@gmail.com
Re: STDCXX-1056 : numpunct fix
On Mon, Sep 24, 2012 at 10:03 PM, Martin Sebor mse...@gmail.com wrote: FWIW, there are race conditions in stdcxx. Some of them are by design and benign on the systems the library runs on (although I acknowledge that some others may be bugs). One such benign date race is: timeT1 T2 0x = N 1x = N read x where x is a scalar that can be accessed atomically by the CPU and the compiler. I think some of the lazy facet initialization falls under this class. It would be nice to fix them but only if it doesn't slow things down. The others need to be fixed in any case. The race conditions I am talking about are not benign. I've uploaded a full thread analyzer output for 22.locale.numpunct.mt showing dataraces here: http://s247136804.onlinehome.us/stdcxx-1056-malign/22.locale.numpunct.mt.2.er.tar.bz2 The name of the analyzer results directory is 22.locale.numpunct.mt.2.er You will need the SunPro Linux 12.3 Thread Analyzer installed, which comes with SunPro anyway. The analyzer itself is ${PATH_TO_SUN_PRO_INSTALL}/bin/analyzer There's a screenshot from the same Analyzer output here: http://s247136804.onlinehome.us/stdcxx-1056-malign/sunpro_thread_analyzer_screenshot.jpg taken just now on my laptop. The report itself is from a couple of days ago, and it's from a run with only the _numpunct.h patch applied. No patches to either facet.cpp, punct.cpp or locale_body.cpp. It shows the types of race conditions it's reporting: these are read/write race conditions, not read/read. The thread analyzer's html filter doesn't show the types of races reported as clearly as the command-line analyzer which has a windowing GUI. At any rate you can see the same exact type of race conditions being reported by the Intel Inspector 2013 Thread Analyzer. -- Stefan Teleman KDE e.V. stefan.tele...@gmail.com