(Putting PSARC-ext back on distribution at request of Stefan.)

Stefan Teleman wrote:
>
> Garrett D'Amore wrote:
>> It looks (from reading the README link you supplied) like there are
>> some library utilities, localedef, etc. that are not part of your
>> case but need to be. Your case materials are incomplete without a
>> full list of what you intend to provide.
>
> These executables are not being delivered. The "locale" executable is
> just a C++ version of the existing Solaris "locale" executable. There
> is no need for this.
OK.

> The "localedef" executable is a utility which is only used at Library
> build time, and which generates the character set conversion tables.
> It plays no role after that, and it is not useful for inclusion.

Ah, OK, sorry if I was confused by the Apache README.

>> The locations of the locale-specific data need to be fully exposed
>> in the case materials as well.
>
> This location is provided in the ARC Case Materials, Appendix 2.
> Please read all the ARC Case materials, completely.

Sorry, I didn't see that. However, what I feel is still missing is how
the data is manipulated -- i.e., how do I generate l10n data? There is
more than just a directory location here. (For example, with C's
gettext(), I can use xgettext to extract data from the source, which
can then be used to generate translations, and the translated files
(whose format is specified in xgettext/gettext) can be stashed in a
documented location... you get the idea.) Basically, anyone who is
going to build a localized program, or provide localization data for
such a program, needs to know certain information -- and that
information is required in order for this case to be complete, IMO.
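To make that concrete, the C-side workflow I have in mind looks roughly
like the sketch below. The "hello" text domain, the strings, the paths,
and the exact tool invocations are illustrative only (and from memory),
not anything this case delivers -- but this is the kind of end-to-end
story a localizer needs:

    // hello.cc -- illustrative only; the text domain, strings, and
    // paths here are placeholders
    #include <libintl.h>
    #include <locale.h>
    #include <cstdio>

    int main()
    {
        setlocale(LC_ALL, "");   // honor the user's locale settings
        textdomain("hello");     // select the "hello" message catalog
        // gettext() looks the string up in the catalog at run time
        std::printf("%s\n", gettext("Hello, world"));
        return 0;
    }

    $ xgettext hello.cc           # extract the translatable strings
    $ msgfmt -o hello.mo fr.po    # compile the translator's fr.po
      (install as /usr/lib/locale/fr/LC_MESSAGES/hello.mo, or wherever
      bindtextdomain(3C) points)

That is the level of detail I'd like to see for this library's locale
support: what the programmer writes, what tool (if any) produces the
locale data, what format that data is in, and where it gets installed.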
>> Saying your library provides i18n features without explaining *how*
>> (or providing a link to a document that explains how) is inadequate.
>> (And the Standard isn't sufficient either -- there are implementation
>> specific behaviors that have to be specified as well.)
>
> What are these implementation specific behaviors ?

How the implementation tracks down localized data, and what format it
uses to interpret that data. These are details of how users interact
with the implementation that go beyond just the source code of the
program.

> If they are implementation specific, therefore Project Private, why
> do they have to be exposed as interfaces ?

No. Implementation specific is *not* Project Private in this case.
Unless you don't want to allow users to access any of those
implementation specific details (which include, in this case, the
techniques used to publish locale specific data for new C++ programs).

> Do any other ARC Cases which integrate external software document
> Project Private implementation details ? Precedent, please.

Again, these are not Project Private. But there are probably many
precedents for publishing Project Private details. All one has to do
is look for "Project Private" interface bindings in the ARC case log.
Many Project Private details fall under the domain of "Architecture",
even though ARC normally tends to focus on Interfaces. (See, for
example, the cases associated with the FireEngine TCP stack in S10,
which I think are almost entirely Project Private.)

> Do i also have to document all the private implementation details of
> the std::iostream or std::string implementations ?

Not if they are truly Project Private and not otherwise architectural.
(I expect you don't need to, in other words.)

>> You still seem to be confusing the C++ ABI with the *binary*
>> interfaces exposed by the library. (Binary interfaces are symbol
>> names, semantics, etc.) Your library (all libraries!) offers an ABI,
>> which is itself built upon the ABI supplied by the compiler.
>
> No, i am not confusing anything.
>
> Your understanding of ABI is valid in C. It is entirely inadequate,
> and invalid, in C++.
>
> The Standard C++ Library is not just a shared library. It is also a
> very large collection of templates and classes.
>
> The classes provided in the Standard C++ Library contain virtual
> methods. The presence of these methods causes the compiler to create
> virtual tables, which become part of the binary representation of the
> classes being compiled.

Yes, I understand that.

> The templates provided by the Standard C++ Library are compiled
> *into* the shared library, or executables, being built by the
> consumer, including any and all virtual tables. These templates are
> also compiled into the Standard C++ Library shared library being
> delivered: libstdcxx.so.4.
>
> The consumer must link with the Standard C++ Library, libstdcxx.so.4.
> In addition, the templates have been compiled into the application
> built by the consumer.

Yes, I understand that as well.

> As such, the ABI of the Library is comprised of everything i have
> enumerated above: class definitions, class objects, automatic class
> methods implicitly generated by the compiler, virtual tables,
> constructors, destructors, assignment operators, exception handling,
> etc.

Yes. But the *public* portions of the headers that were used (i.e.,
the parts that are specified in the Standard) should be common across
multiple standards-conformant libraries. The fact that the headers
also embed library-specific implementation details in the binary is
*not* specific to C++. All one has to do is look at how certain macros
are used in C. (The macro is part of the public source interface, but
the binary bits it generates, which might include in-code references
to private structure fields, etc., are not.) The binary details
constitute an ABI for applications working with that library (which
consists of the shared object *and* the headers used with said shared
object).

> The notion of ABI symbol names is nonsensical in C++. In any C++
> compiled binary, symbol names are mangled. The mangling algorithm is
> implementation defined, and, actually, the term "mangling" does not
> appear in any of the 800+ pages which comprise the C++ Standard. It
> is entirely implementation specific how any particular implementation
> decides to handle C++ symbols.

Of course, I get that. But that's a compiler artifact. The combination
of compiler-specific mangling (plus any other calling conventions used
by the compiler) *and* the details in the headers for a library
implementation together constitute an ABI. My understanding is that
the C++ Standard does not specify an ABI. It *does* specify an API.
Correct?

> The implementation always generates additional, mangled symbol names,
> which are neither exposed, nor are they predictable to the consumer.
>
> The names of the interfaces exposed by the Standard C++ Library are
> codified in the Standard. This is the Library API. This is what the
> programmer writes code to.

This is the Library API. We agree. I think the problem we are having
is the commitment level of the *API* versus the *ABI*. You can declare
your API to have a Committed binding -- however, your *ABI* is where I
have problems with your commitment level, and it's also where the
incompatibility problems lie.

> The semantics of the interfaces exposed by the Standard C++ Library
> are
>
> 1. Codified in the Standard
> 2. Highly dependent on the context in which they are being used
> 3. Highly dependent on the compilation context within which they were
>    used
>
> The actual names of the mangled symbols (ABI) are irrelevant,
> inherently non-standard, non-portable, and implementation specific
> private details.

But those names seriously impact application compatibility. So while
you might not document them in detail, the fact remains that they
become part of an exported interface -- because without those symbols
no dynamically linked application could ever function properly.
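To illustrate where that line falls, here is a trivial made-up sketch
("Widget" is mine, not anything from your library):

    // abi-sketch.cc -- illustrative consumer code only
    #include <string>
    #include <vector>

    struct Widget {
        virtual ~Widget() {}   // forces a vtable; its layout, plus the
                               // implicitly generated copy/assignment
                               // members, follow from the headers
        std::string name;      // std::string's object layout comes
                               // straight out of the library's headers
    };

    int main()
    {
        std::vector<Widget> v; // vector<Widget> is instantiated here,
        v.push_back(Widget()); // in the consumer's binary, not shipped
        return 0;              // pre-built inside the library's .so
    }

Compile that and inspect the result (with elfdump -s, or nm plus
demangling): some of the instantiated template code is emitted
directly into the application's own object, and the rest shows up as
undefined mangled symbols that the runtime linker has to resolve from
the library's shared object. I am not asking you to document every
mangled name; my point is that those undefined names, however they are
spelled, are the binary dependency that decides whether existing
applications keep working across library updates.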
>> Put another way:
>>
>> API = Source level compatibility
>> ABI = Binary level compatibility
>>
>> If you know that the compiler group is planning something for the
>> future, which should also support standards, then perhaps shipping
>> this with Committed is inappropriate. Volatile might be better.
>
> Please raise these issues openly with PSARC-ext.

I'm CC'ing them on this reply.

I'm starting to think that if my concerns about insufficient specs are
resolved, this case could be allowed to proceed, *IF* the case owner
is willing to consider an amendment stating that developers should be
aware that the library is likely to evolve in the future, and that
binary compatibility should therefore not be relied upon. (In other
words, we should retract any binary compatibility promises for this
library, at least until a full case addressing the larger picture of
compatibility concerns is made.)

I do confess that this project being a fast track does feel to me a
little like trying to shim KDE in as a fast track. Sure, lots of
people download it, want it, use it, and we could do a lot of people a
lot of favors by including it. But it also has some non-obvious issues
(application compatibility between GNOME and KDE being one good
example), and I think it would be poor form to try to submit such a
case as a fast track. If the only bar to a project integrating is that
other people on the 'net use it, then there seems to be very little
left for ARC to do. (Hmmm.... is a fast track for the Linux kernel
coming soon?)

-- Garrett