On 9/24/12 11:06 PM, Stefan Teleman wrote:
> On Mon, Sep 24, 2012 at 7:48 PM, Liviu Nicoara <nikko...@hates.ms> wrote:
>> Stefan, was it your intention to completely eliminate all the race
>> conditions with this last patch? Is this what the tools showed in your
>> environment?
>>
>> https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461452#comment-13461452
>
> Yes, all the race conditions in std::numpunct<> and std::moneypunct<>.
> Not all the race conditions in stdcxx.
FWIW, I have analyzed the performance of Stefan's latest patch. The test
program is the same one used in the timings attached to the incident. The
library version was 4.2.1, the compiler gcc 4.7.1, tested with:
1. The simple, perfect-forwarding patch, with no caching, which does eliminate
the race conditions.
2. Stefan's latest patch.
3. A patch I created which truly, if crudely, eliminates all facet race
conditions by using a top-level lock for all facet operations. (*)
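For reference, the top-level locking in (3) amounts to something like the
sketch below. The class and member names are mine for illustration, and I am
using C++11 primitives for brevity; the actual patch works against the stdcxx
internals and its own synchronization wrappers.

```cpp
#include <mutex>
#include <string>

// Hypothetical facet-like class; names are illustrative, not stdcxx's.
class punct_facet {
public:
    char decimal_point () const {
        // Top-level lock: every accessor, read or write, takes the same
        // mutex, so no thread can observe partially initialized data.
        std::lock_guard<std::mutex> guard (mutex_);
        if (!initialized_) {
            // Lazily build the cached data while holding the lock.
            decimal_point_ = '.';
            grouping_      = "\3";
            initialized_   = true;
        }
        return decimal_point_;
    }

private:
    mutable std::mutex  mutex_;
    mutable bool        initialized_ = false;
    mutable char        decimal_point_ = ',';
    mutable std::string grouping_;
};
```

The price, of course, is that every read serializes behind the mutex, which is
exactly what the third set of numbers below shows.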
The numbers from the first test run are already available in the incident.
The numbers for the second test run were noticeably worse, which is expected
because of the additional synchronization. E.g., running the test against the
STDCXX locale database gives:
$ time ./t en_US.UTF-8 4 5000000
4, 5000000
real 0m19.210s
user 0m32.659s
sys 0m36.507s
where the first patch gives:
$ time ./t en_US.UTF-8 4 50000000
4, 50000000
real 0m7.961s
user 0m31.211s
sys 0m0.003s
Notice the number of loops: 10x larger in the second set of numbers, yet the
wall-clock time is well under half.
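Dividing wall-clock time by loop count gives a rough per-call cost; this is a
back-of-the-envelope calculation that ignores program startup, but the order
of magnitude is what matters:

```cpp
// Rough per-iteration cost from the two runs quoted above
// (wall-clock "real" time divided by the loop count).
inline double per_iter (double seconds, double loops) {
    return seconds / loops;
}

// Stefan's patch:   5,000,000 loops in 19.210 s -> ~3.8 us per call.
// No-cache patch:  50,000,000 loops in  7.961 s -> ~0.16 us per call.
const double patch2_cost = per_iter (19.210, 5000000.0);
const double patch1_cost = per_iter (7.961, 50000000.0);
```

That works out to roughly a 24x difference per call between the two.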
Now, for the third patch, the numbers are embarrassingly large. It practically
slows the program to a crawl because it never reads facet data without holding
the lock. This, I wager, eliminates all races in the numpunct/moneypunct
facets. See:
$ time ./t en_US.UTF-8 4 100000
4, 100000
real 0m24.316s
user 0m28.213s
sys 0m6.633s
Now that is a whopping performance hit, for a run of only 100,000 iterations.
I went back to look for an explanation of the gap between the numbers from
Stefan's better-performing patch and those from my lock-all patch. I believe
there are still places in Stefan's patch where the facet makes unguarded reads
of facet data, e.g.:
loc/_numpunct.h:
190 // returns a pointer to the facet's implementation data
191 // if it exists, 0 otherwise
192 const void* _C_data () const {
193 return _C_impsize ? _C_impdata
194 : _RWSTD_CONST_CAST (__rw_facet*, this)->_C_get_data ();
195 }
where _C_impsize and _C_impdata are read outside any lock scope. I.e., one
thread may read _C_impsize while another is still writing the _C_impdata
member variable.
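A minimal model of the hazard: the member names below mimic the quoted code,
but this is an illustration of the access pattern, not the actual __rw_facet
class, and slow_init() stands in for _C_get_data().

```cpp
#include <cstddef>
#include <mutex>

// Simplified model of the pattern quoted above: the size member is
// checked and the data pointer returned without any lock.
struct facet_model {
    const void *impdata = nullptr;  // written by the initializing thread
    std::size_t impsize = 0;        // written last, used as a "ready" flag

    // Racy reader, as in the quoted code: one thread can see a nonzero
    // impsize while another has not yet finished (or published) its
    // write of impdata -- the two reads are not ordered with the writes.
    const void* data_unguarded () const {
        return impsize ? impdata : slow_init ();
    }

    // One safe variant: do both the check and the reads of the two
    // members inside the same critical section the writer uses.
    const void* data_guarded () const {
        std::lock_guard<std::mutex> guard (mutex);
        return impsize ? impdata : slow_init ();
    }

    mutable std::mutex mutex;

    // Placeholder for the slow path (_C_get_data in the real code).
    const void* slow_init () const { return nullptr; }
};
```

The guarded variant is race-free but pays for the lock on every call, which is
the trade-off the timings above illustrate.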
I conclude, even from such simple testing, that any solution which attempts to
completely eliminate the race conditions in these facets will incur a runtime
penalty, with the safest (top-level locking) the worst by orders of magnitude
and any other solution somewhere in between.
Please let me know if you have any questions.
Liviu
(*) Will attach it as soon as we figure out how to undo the closing of the
issue.