Thanks for the feedback. Please see below.

On Sep 19, 2012, at 10:02 PM, Stefan Teleman wrote:

> On Wed, Sep 19, 2012 at 8:51 PM, Liviu Nicoara <nikko...@hates.ms> wrote:
> 
>> I think you are referring to `live' cache objects and the code which
>> specifically adjusts the size of the buffer according to the number of
>> `live' locales and/or facets in it. In that respect I would not call that
>> eviction because locales and facets with non-zero reference counters are
>> never evicted.
>> 
>> But anyhoo, this is semantics. Bottom line is the locale/facet buffer
>> management code follows a principle of economy.
> 
> Yes it does. But we have to choose between economy and efficiency. To
> clarify: The overhead of having unused pointers in the cache is
> sizeof(void*) times the number of unused "slots".  This is 2012. Even
> an entry-level Android cell phone comes with 1GB system memory. If we
> want to talk about embedded systems, where memory constraints are more
> stringent than on cell phones, then we're not talking about Apache stdcxx
> anymore, or any other open-source implementation of the C++ Standard
> Library. Those systems use a C++ subset for embedded work, which is a
> different animal altogether: no exception support, no RTTI. For example,
> see Green Hills: http://www.ghs.com/ec++.html. And even they have become
> more relaxed about memory constraints. They use BOOST.
> 
> Bottom line: so what if 16 pointers in this 32-slot cache never get
> used? The maximum amount of "wasted memory" for these 16 pointers is
> 128 bytes, on a 64-bit machine with 8-byte pointers.
> Can we live with that in 2012, a year when a $500 laptop comes with
> 4GB RAM out of the box? I would pick 128 bytes of allocated but unused
> memory over random and entirely avoidable memory churn any day.


The argument is plausible and fine as far as brainstorming goes. 

But have you measured the amount of memory consumed by all the STDCXX locale data 
loaded in one process? How much absolute time is spent resizing the locale and 
facet buffers? What is the gain in space and time performance with such a change 
versus without it? How fragmented does the heap become, and is there a performance 
impact because of it? In other words, before changing the status quo one must show 
an objective defect and produce a body of evidence, including a failing test case, 
to support the argument.
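
For what it's worth, the measurement does not have to be elaborate. A throwaway 
harness along these lines -- hypothetical, not part of the stdcxx test suite, with 
made-up names and sizes, and only mimicking the insert-into-a-sorted-buffer pattern 
under discussion -- would already give ballpark numbers for the resizing cost, with 
and without reserving the capacity up front:

// Hypothetical timing sketch; the names and sizes are made up and this
// is not the stdcxx cache code.  It inserts N pointer-sized values into
// a vector kept sorted at all times, the pattern whose cost is in question.
#include <algorithm>    // std::lower_bound
#include <cstddef>      // std::size_t
#include <cstdio>       // std::printf
#include <ctime>        // std::clock
#include <functional>   // std::less
#include <vector>

int main ()
{
    const std::size_t nfacets = 100000;   // made-up workload size

    std::vector<const void*> buf;
    // buf.reserve (nfacets);   // uncomment for the "pre-sized" variant

    const std::clock_t t0 = std::clock ();

    for (std::size_t i = 0; i != nfacets; ++i) {
        // stand-in for the address of a newly constructed facet
        const void* const p =
            reinterpret_cast<const void*>(i * 2654435761UL + 1);
        // keep the buffer sorted on every insertion
        buf.insert (std::lower_bound (buf.begin (), buf.end (), p,
                                      std::less<const void*>()), p);
    }

    const std::clock_t t1 = std::clock ();

    std::printf ("%lu sorted insertions in %.3f s\n",
                 (unsigned long)nfacets,
                 double (t1 - t0) / CLOCKS_PER_SEC);
    return 0;
}

Numbers from something like this, plus the heap statistics of a real application, 
would turn plausible arguments into measurable ones.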


> 
> My goal: I would be very happy if any application using Apache stdcxx
> would reach its peak instantiation level of localization (read: max
> number of locales and facets instantiated and cached, for the
> application's particular use case), and would then stabilize at that
> level *without* having to resize and re-sort the cache, *ever*. That
> is a locale cache I can love. I love binary searches on sorted
> containers. Wrecking the container with insertions or deletions, and
> then having to re-sort it again, not so much. Especially when I can't
> figure out why we're doing it in the first place.
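
For the record, the access pattern you describe boils down to something like the 
sketch below (made-up types and names, not the actual stdcxx cache): a sorted 
array of entries that is only searched, never reshuffled.

// Illustration only: hypothetical types and names, not the stdcxx cache.
// Once the entries are in place and sorted by key, lookups are a plain
// binary search and nothing ever has to be re-sorted.
#include <algorithm>    // std::lower_bound
#include <cstddef>      // std::size_t
#include <vector>

struct cache_entry {
    std::size_t id;       // hypothetical locale/facet key
    const void* facet;    // the cached object
};

// comparator for std::lower_bound: entry < key
inline bool entry_before (const cache_entry &e, std::size_t id)
{
    return e.id < id;
}

const void* find_facet (const std::vector<cache_entry> &cache, std::size_t id)
{
    // binary search over the sorted entries; O(log n), read-only
    std::vector<cache_entry>::const_iterator it =
        std::lower_bound (cache.begin (), cache.end (), id, entry_before);
    return (it != cache.end () && it->id == id) ? it->facet : 0;
}

The disagreement is only about how the entries get there, and whether the buffer 
has to be reallocated and re-sorted along the way.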


And I love minimalistic code and hate waste at the same time, especially in a 
general-purpose library. To each his own.


> 
>> Hey Stefan, are the above also timing the changes?
> 
> Nah, I didn't bother with the timings - yet - for a very simple
> reason: in order to use instrumentation, both with SunPro and with
> Intel compilers, optimization of any kind must be disabled. On SunPro
> you have to pass -xkeepframe=%all (which disables tail-call
> optimization as well), in addition to passing -xO0 and -g. So the
> timings for these unoptimized experiments would have been completely
> irrelevant.

Well, I think you are the only one around here with access to SPARC hardware, so 
your input is invaluable in this respect. That is also why I kept asking that 
question earlier: do we currently have any failing locale MT test when numpunct 
does just perfect forwarding, with no caching? I.e., if we change just _numpunct.h 
and no other source file (so as to silence the thread analyzer warnings), do any 
locale (or other) MT tests fail? I would greatly appreciate it if you could give 
it a run on your hardware, if you don't already know the answer.
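
To make sure we are talking about the same thing, the two alternatives, reduced to 
a schematic (hypothetical names, not the actual _numpunct.h code), are forwarding 
on every call versus lazily caching the result:

// Schematic only: hypothetical names, not the actual stdcxx sources.
#include <string>

struct numpunct_sketch {

    numpunct_sketch (): cached_ (false) { }

    // forwarding variant: read the locale data on every call; no mutable
    // state, so there is nothing for a thread analyzer to flag
    std::string grouping_forwarding () const {
        return read_grouping_from_locale_data ();
    }

    // caching variant: the first caller fills in the member and later
    // callers reuse it; the unsynchronized lazy write is what the MT
    // tests and the analyzers exercise
    std::string grouping_cached () const {
        if (!cached_) {
            grouping_ = read_grouping_from_locale_data ();
            cached_   = true;   // unsynchronized flag update
        }
        return grouping_;
    }

private:
    // stand-in for the call into the locale database
    std::string read_grouping_from_locale_data () const {
        return "\3";
    }

    mutable std::string grouping_;
    mutable bool        cached_;
};

The question above is simply whether the MT tests still pass with only the first 
variant in place, i.e. with the caching taken out of _numpunct.h.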

The discussion has been productive. But I object to the patch as is, because it 
goes beyond the scope of the original incident. I think this patch should touch 
only the MT defect detected by the failing test cases. If you think the other 
parts you changed are defects, you should open corresponding issues in JIRA and 
have them discussed in their own threads.

Thanks,
Liviu
