tl;dr - I ask some questions near the end; you will probably need to read the entire post to answer them insightfully.

The Mozilla codebase, which I define loosely as mozilla-central, comm-central, JS, NSS, NSPR, LDAP C SDKs, and any other small project included in mozilla-central that is built and maintained semi-independently, has organically and continuously grown from its original kernels which predate many of the modern standards used for C and C++. Thus, its APIs have impedance mismatches with what is more common in modern C/C++, which may deter would-be contributors.

For those who do not follow the ongoing tasks of JTC1/SC22/WG14 and 21 with zeal, here are the various relevant versions of standards:

C89 aka ISO C aka ANSI C: This is what most people think of as "standard" C code, although in practice modern compilers accept features not present in this version (such as mixed code and declarations, C++-style comments, long long). What you get nowadays in the default mode tends to be closer to "the subset of C99 also in C++".

C99: This adds several features present in C++ (a few mentioned above), but also contains some features that have been controversial (variable-length arrays are a big one). MSVC doesn't yet fully implement this, but some things (like designated initializers) will probably be in the next version of MSVC.

C11: This is basically taking some new features of C++11 and shoving them in C, such as atomics, threading, and noreturn. There's also some minor goodies like standardizing the "x" flag in fopen to correlate with O_CREAT | O_EXCL.

C++98/C++03: These are basically the same thing as far as programmers are concerned, since the changes in the 2003 version mostly matter only to language lawyers. This is "traditional" C++.

C++11: This standard has a very large suite of new features added, which have been incrementally supported in compilers over the last 5 years. Clang and g++ are feature-complete as of 3.3 and 4.8.1, respectively; MSVC is not yet feature complete, although things that cannot be worked around will probably be added within the next two versions [1]. Note that the standard library support has lagged compiler support, particularly for <regex> (ES-compatible regular expressions remain unsupported even on tip-of-trunk libstdc++).

C++14: This is a proposed standard in the final drafting stage that is better thought of as "C++11.1", adding a few features that ought to have been in C++11 had people played with them more. The expected final list of features boils down to generic lambdas, auto improvements, "correct" VLAs, and variable templates.

C++14 Technical Specifications: C++ is moving to a modular design for development in the future! There are currently three main TSs planned for release in 2014: a filesystem specification, a networking specification, and a "concepts lite" specification. There may also be one for "things that just missed C++14" such as std::string_view (or whatever bikeshedding happens) [2]. A draft of the Filesystem TS is available here: <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3693.html>.


While most prior discussions about using newer versions of C/C++ have focused on the compiler support, the topic of using newer versions of the library has mostly been ignored. Library support is more annoying, since instead of three major compilers, we have four standard libraries to concern ourselves with and potential issues with compiler-library compatibility (particularly Clang and libstdc++)--these are MSVC's implementation, libc++, libstdc++, and stlport. Since most nice charts of C++11 compatibility focus on what the compiler needs to do, I've put together a high-level overview of the major additions to the standard library [3]:
* std::tuple -- Generalization of std::pair
* std::unique_ptr -- Non-broken replacement for std::auto_ptr
* std::shared_ptr -- Non-intrusive version of nsCOMPtr and friends
* std::function/std::bind -- Generalization of function pointers
* std::type_traits -- Template helpers
* std::ratio -- Compile-time rational arithmetic, mostly used for std::chrono
* std::chrono -- Time, more specifically a generalization of time stamps instead of calendaring functions
* std::array -- Generalization of compile-time arrays like int x[10]
* std::forward_list -- Singly-linked list
* std::unordered_map/std::unordered_set -- Hashtables/hashsets
* std::random -- More powerful random number generators than just rand()
* std::codecvt -- UTF-8/UTF-16/UTF-32 conversion classes (ties in with a bit of the locale support)
* std::regex -- Regular expressions, but probably the least well-implemented of anything here
* std::atomic, std::thread, std::mutex, std::condition_variable, std::future -- Major threading interfaces
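
To make a few of the entries above concrete, here is a minimal sketch of how some of them look in use. The Widget type and the helper function names are hypothetical and exist only for illustration; only the std:: names come from the standard library.

```cpp
#include <array>
#include <memory>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>

// Hypothetical record type, used only for illustration.
struct Widget {
  int id;
  explicit Widget(int aId) : id(aId) {}
};

// std::unique_ptr: sole ownership, transferred by move rather than by copy.
inline std::unique_ptr<Widget> MakeWidget(int aId) {
  return std::unique_ptr<Widget>(new Widget(aId));
}

// std::tuple: like std::pair, but with any number of members.
inline std::tuple<int, std::string, bool> Describe(const Widget& aWidget) {
  return std::make_tuple(aWidget.id, std::string("widget"),
                         aWidget.id % 2 == 0);
}

// std::unordered_map: a hashtable; operator[] default-constructs missing keys.
inline int CountWord(std::unordered_map<std::string, int>& aCounts,
                     const std::string& aWord) {
  return ++aCounts[aWord];
}

// std::array: a fixed-size array that can be used with range-based for.
inline int SumFirstThree() {
  std::array<int, 3> values = {{1, 2, 3}};
  int total = 0;
  for (int value : values) {
    total += value;
  }
  return total;
}
```

None of this needs exceptions or RTTI to be written, although the library implementations themselves may still throw (for example on allocation failure).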

C++14 will also be adding [3]:
* std::optional -- A template for which "not present" is different than "empty"/"null"/0/etc.
* std::dynarray -- Generalization of int x[N]
* std::shared_mutex -- R/W locks
* std::exchange -- std::atomic<T>::exchange that's not atomic.
* user-defined literals for std::chrono and std::string types.
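
As an illustration of how small some of these utilities are, the following is a rough C++11 sketch of what the proposed std::exchange does. The name Exchange is hypothetical, chosen here to avoid claiming this is the standard library's implementation.

```cpp
#include <utility>

// Hypothetical stand-in for the proposed std::exchange: store a new value
// and hand back the old one. Nothing here is atomic.
template <typename T, typename U>
T Exchange(T& aTarget, U&& aNewValue) {
  T old = std::move(aTarget);
  aTarget = std::forward<U>(aNewValue);
  return old;
}
```

A typical use would be something like resetting a member while returning its previous value in a single expression.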


Now that you have the background for what is or will be in standard C++, let me turn to the real question I want to discuss: how much of this should we be using in Mozilla? I assume that enabling exceptions or RTTI is untenable and the mere suggestion of it would lead to everything else I say being ignored :-) . The practical effect of this is that we are unable to use any function where we would not want to crash if an exception is thrown. Fortunately, the C++ committee appears to treat this as a situation worth designing for, as the Filesystem TS draft in particular defines most functions in pairs: one that throws an exception if something goes wrong, the other that reports an error code. It feels worth saying that the error code style in use is a std::error_code out-parameter at the end of the function's parameter list (std::system_error is the exception thrown by the other overload), not an nsresult return value [4].
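
A minimal sketch of that paired-overload pattern, assuming a made-up function: FakeFileSize is hypothetical and simply pretends that the size of a file is the length of its path, so that the example stays self-contained. Only std::error_code, std::errc, and std::system_error are real library names.

```cpp
#include <string>
#include <system_error>

// Hypothetical function, for illustration only: a real implementation would
// ask the operating system instead of measuring the path string.

// Non-throwing form: failure is reported through the trailing std::error_code.
inline long FakeFileSize(const std::string& aPath,
                         std::error_code& aError) noexcept {
  if (aPath.empty()) {
    aError = std::make_error_code(std::errc::no_such_file_or_directory);
    return -1;
  }
  aError.clear();
  return static_cast<long>(aPath.size());
}

// Throwing form: a thin wrapper that turns the error code into an exception.
inline long FakeFileSize(const std::string& aPath) {
  std::error_code error;
  long size = FakeFileSize(aPath, error);
  if (error) {
    throw std::system_error(error, "FakeFileSize");
  }
  return size;
}
```

Code built without exceptions would only ever call the two-argument form and check the error code afterwards, much as we check nsresult today.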

For purposes of discussion, I think it's worth breaking down the C++ (and C) standard library into the following components:
* Containers--vector, map, etc.
* Strings
* I/O
* Platform support (threading, networking, filesystems, locales)
* Other helpful utilities (std::random, std::tuple, etc.)

We have explicit non-STL implementations of many of the containers (nsTArray, several hashtable implementations, mozilla::LinkedList), and we have much more specialized string libraries than the C++ standard library provides. The iostream library has some issues of its own (particularly static constructors IIRC), and is not very usable for most of the things that Gecko needs to do. Platform support ultimately depends on NSPR (or intl/ for locale stuff) in large part, although we have C++/IDL wrappers for most of the major things, and the current C++ support for the things we need is lacking. For some of the helpful utilities, we basically have the C++ implementation but using Mozilla coding style instead (particularly type_traits stuff).

Using the STL wherever possible comes with drawbacks. We lose control over the ABI (I believe libstdc++ tries to keep compatibility, but not MSVC), which can impact people who write binary extensions. We also lose control over the ability to make performance tweaks to containers. There is also a critical API mismatch between the STL and the containers we use: the STL tends to use iterators and templates heavily, which tends to mean a lot of inlining (std::sort is fully inlined, for example); in contrast, the containers we use tend to favor using function pointers for sorting or even enumeration. Strings in particular are extremely weak in the STL: there is one string class, where we have several for specialized purposes (O(1) substring, null-terminated, O(1) concatenation, allocate-on-stack, etc.).
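
To illustrate the API mismatch, here is a small sketch that sorts the same data both ways. The function names are hypothetical; std::qsort stands in for the function-pointer style, not for any particular Mozilla container API.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Function-pointer style: the comparator is reached through a pointer at run
// time, so the sorting routine is compiled once and shared by all callers.
static int CompareInts(const void* aLeft, const void* aRight) {
  int left = *static_cast<const int*>(aLeft);
  int right = *static_cast<const int*>(aRight);
  return (left > right) - (left < right);
}

inline void SortWithFunctionPointer(std::vector<int>& aValues) {
  if (aValues.empty()) {
    return;
  }
  std::qsort(aValues.data(), aValues.size(), sizeof(int), CompareInts);
}

// Template style: std::sort is instantiated for this iterator type and this
// lambda, which lets the compiler inline the comparison at the cost of
// generating a separate copy of the algorithm per instantiation.
inline void SortWithTemplate(std::vector<int>& aValues) {
  std::sort(aValues.begin(), aValues.end(),
            [](int aLeft, int aRight) { return aLeft < aRight; });
}
```

Both produce the same ordering; the difference is in where the code ends up and how much of it there is.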

Looking at a large C++ project that is not constrained by legacy as much as we are, LLVM, it's clear that the STL by itself is not sufficient. LLVM defines a large number of helper datastructures: <http://llvm.org/doxygen/dir_a7dd73f244ee1af3dca2a8723843bc79.html>. Of particular note is the use of llvm::StringRef (which is roughly equivalent to everywhere we have const nsA[C]String & in our code) in many places, as well as the existence of a llvm::SmallString (roughly equivalent to nsAuto[C]String). There is similarly a large collection of variants on std::vector, std::set, and std::map for specific purposes. The downside to this kind of approach is that use of alternative APIs tends to leak through APIs, particularly for out parameters; this can be ameliorated somewhat with templates, but that causes its own set of problems.

Even if fully using the standard library is untenable from a performance perspective, usability may be enhanced if we align some of our APIs which mimic STL functionality with the actual STL APIs. For example, we could add begin()/end()/push_back()/etc. methods to nsTArray to make it a fairly drop-in replacement for std::vector, or at least close enough to one that it could be used in other STL APIs (like std::sort, std::find, etc.). However, this does create massive incongruities in our API, since the standard library prefers naming stuff with this_kind_of_convention whereas most Mozilla style guides prefer ThisKindOfConvention.
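
A rough sketch of what such alignment could look like follows. FixedArray is a hypothetical, deliberately tiny stand-in and not the real nsTArray; it only shows Mozilla-style method names living next to the lowercase names that STL algorithms expect.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical fixed-capacity array, for illustration only (not nsTArray).
// T must be default-constructible and copy-assignable.
template <typename T, std::size_t N>
class FixedArray {
public:
  FixedArray() : mLength(0) {}

  // Mozilla-style API.
  bool AppendElement(const T& aItem) {
    if (mLength == N) {
      return false;  // Full: report failure instead of throwing.
    }
    mElements[mLength++] = aItem;
    return true;
  }
  std::size_t Length() const { return mLength; }
  const T& ElementAt(std::size_t aIndex) const { return mElements[aIndex]; }

  // STL-style aliases, enough for std::sort, std::find and range-based for.
  typedef T value_type;
  typedef T* iterator;
  typedef const T* const_iterator;
  iterator begin() { return mElements; }
  iterator end() { return mElements + mLength; }
  const_iterator begin() const { return mElements; }
  const_iterator end() const { return mElements + mLength; }
  std::size_t size() const { return mLength; }
  // Unlike std::vector, this silently drops the item when the array is full.
  void push_back(const T& aItem) { AppendElement(aItem); }

private:
  T mElements[N];
  std::size_t mLength;
};
```

With that in place, a call such as std::sort(values.begin(), values.end()) works unchanged, and the two naming conventions sit side by side in one class, which is exactly the incongruity in question.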

There is also a separate question of how much of the standard library we should use or mimic. The current strings library is basically a performance footgun to use as is, although a proposed string_view class (an encapsulation of const char* + length) would be suitable for most inparameters. Similarly, std::shared_ptr is moderately useful, but the libstdc++ implementation appears to assume threadsafe reference counting always (which would be a perf hit for us), and our intrusive refcounting doesn't mesh well with it. It is possible (though dirty) to do something like specializing std::shared_ptr for nsISupports-derived types and making that specialization inherit from nsCOMPtr. On the other hand, std::unique_ptr or other small utilities are basically what we would code up ourselves modulo the style guideline issues.
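
For reference, the "const char* + length" idea can be sketched in a few lines. StringSpan and CountChar are hypothetical names, and this is not the API of the proposed string_view; it only shows why such a type suits in-parameters.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view over characters, for illustration only.
// The viewed buffer must outlive the view.
class StringSpan {
public:
  StringSpan(const char* aData, std::size_t aLength)
    : mData(aData), mLength(aLength) {}
  StringSpan(const char* aCString)
    : mData(aCString), mLength(std::strlen(aCString)) {}
  StringSpan(const std::string& aString)
    : mData(aString.data()), mLength(aString.size()) {}

  const char* Data() const { return mData; }
  std::size_t Length() const { return mLength; }

  // O(1) substring: no allocation and no copy, just pointer arithmetic.
  StringSpan Substring(std::size_t aStart, std::size_t aCount) const {
    if (aStart > mLength) {
      aStart = mLength;
    }
    if (aCount > mLength - aStart) {
      aCount = mLength - aStart;
    }
    return StringSpan(mData + aStart, aCount);
  }

  bool Equals(const StringSpan& aOther) const {
    return mLength == aOther.mLength &&
           std::memcmp(mData, aOther.mData, mLength) == 0;
  }

private:
  const char* mData;
  std::size_t mLength;
};

// An in-parameter of the view type accepts literals, std::string, and
// substrings alike, without copying any character data.
inline std::size_t CountChar(StringSpan aText, char aNeedle) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < aText.Length(); ++i) {
    if (aText.Data()[i] == aNeedle) {
      ++count;
    }
  }
  return count;
}
```

The catch is lifetime: a view is only safe for as long as the string it points into stays alive and unmodified.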


With all of that stated, the questions I want to pose to the community at large are as follows:
1. How much, and where, should we be using standard C++ library functionality in Mozilla code?
2. To what degree should our custom ADTs (like nsTArray) be interoperable with the C++ standard library?
3. How should we handle bridge support for standardized features not yet universally-implemented?
4. When should we prefer our own implementations to standard library implementations?
5. To what degree should our platform-bridging libraries (xpcom/mfbt/necko/nspr) use or align with the C++ standard library?
6. Where support for an API we wish to use is not universal, what is the preferred way to mock that support?

[Note: similar questions also apply to NSPR and NSS with respect to newer C99 and C11 functionality.]

Thoughts/comments/corrections/questions/concerns/flames/insightful discussion?

Footnotes:
[1] Well, strictly speaking, expression SFINAE, universal character names, and inheriting constructors can't be worked around directly, but avoiding the first two should be pretty easy, and the last one is considered by some to be a misfeature anyways.

[2] There is also discussion of adding feature-test macros, such as what Clang provides, in light of the incremental nature in the way this stuff gets extended. This is not planned to be an official part of C++, but rather a common convention that all major compilers will support. The latest draft of this effort is here: <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3694.htm>, but the full C++ committee has not discussed it yet.

[3] There are lots of minor changes to libraries and smaller things that I don't think are worth calling out.

[4] This opens up an interesting idea to attempt boiling the ocean to align API error strategies, and I can think of several benefits this kind of approach might accrue. That said, given that it's a boil-the-ocean kind of patch with a potential but unknown impact on performance, I'm not going to suggest doing it here.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
