-------------------------------
UPCOMING MOZILLA STRING CHANGES
-------------------------------


The Mozilla string code will be undergoing extensive revision following the release of Mozilla 1.7 alpha. The changes will be mostly transparent, having very little effect on the string API. This change will be made first thing during the 1.7 beta cycle.

The work is being tracked here:
http://bugzilla.mozilla.org/show_bug.cgi?id=231995


The major API-level changes include:


  (1) nsAC?String will no longer be able to represent multi-fragment
      strings.  This allows all implementations of nsAC?String to be
      unified, resulting in a significant reduction of code.

  (2) nsReadingIterator and nsWritingIterator will be limited to
      iterating over a contiguous buffer.  Previously, operator++ was
      forced to "normalize" the iterator forward to the next fragment.
      This added code to every consumer of iterators, code that was
      almost never needed since multi-fragment strings are very
      uncommon.

  (3) nsAC?String methods are now all non-virtual.  This is possible
      since there is now only one implementation of nsAC?String.  This
      helps reduce code at the call sites and improves performance.
      ABI compatibility with the existing vtable is maintained (more
      on this later).  It is important to note that any external
      components that use multi-fragment strings will be broken, but
      passing multi-fragment strings in external components was
      forbidden anyway (although the prohibition was poorly
      documented) and none of our implementations of multi-fragment
      strings were ever frozen for component developers to use outside
      of the Mozilla codebase.

  (4) A simplified string API is introduced for embedders and external
      component developers.  nsAC?String's methods are now meant to be
      used only within the Mozilla code base.  The nsEmbedC?String
      class is now implemented in terms of the simplified string API.

  (5) The following string classes have been eliminated:

      nsSharableC?String

        nsC?String will now allocate a sharable buffer by default.  It
        implements thread-safe reference counting, enabling copy-on-
        write semantics for most strings.  Since very little code
        referenced nsSharableC?String, this class name has been
        eliminated.

      nsDependentSingleFragmentC?Substring

        This is now equivalent to nsDependentC?Substring.  Since very
        little code referenced nsDependentSingleFragmentC?Substring,
        this class name has been eliminated.

      nsDependentC?Concatenation

        Since nsAC?String can no longer represent a multi-fragment
        string, nsDependentC?Concatenation could no longer inherit
        from nsAC?String.  Therefore, this class no longer exists.
        However, efficient string concatenation is still implemented
        using a very similar mechanism.  More on this later.

  (6) nsStringFwd.h now forward declares all string classes.

  (7) nsC?Substring has been added to the string hierarchy.  It will
      be the core string class from which all other strings inherit.
      It behaves much like the old nsASingleFragmentC?String, except
      that it does not reference the nsAC?String vtable to satisfy any
      of its methods.  Many of the "getter"-functions are inlined for
      performance.
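
  As a rough illustration of change (2): once a string is a single
  contiguous fragment, a reading iterator can behave like a plain
  pointer.  The sketch below is not Mozilla code.  |Fragment|,
  |MakeFragment| and |CountChar| are hypothetical names, and the only
  thing assumed is the begin/end pointer model described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a single-fragment string: one contiguous
// run of characters described by a pointer pair.
struct Fragment {
  const char* begin;
  const char* end;
};

// Wraps a null-terminated C string without copying it.
inline Fragment MakeFragment(const char* s) {
  assert(s != nullptr);
  return Fragment{s, s + std::strlen(s)};
}

// With one fragment, advancing the iterator is a plain pointer
// increment; there is no "normalize to the next fragment" step.
inline std::size_t CountChar(const Fragment& f, char c) {
  std::size_t n = 0;
  for (const char* it = f.begin; it != f.end; ++it) {
    if (*it == c) {
      ++n;
    }
  }
  return n;
}
```

  A caller would write something like |CountChar(MakeFragment("a,b"),
  ',')|.  The loop body carries no per-step fragment bookkeeping, which
  is the code-size saving item (2) refers to.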


The revised string hierarchy is depicted below:


              nsAC?String
                   |
                   |
                   |
             nsC?Substring
                   |
                   |-------- nsDependentC?Substring
                   |
               nsC?String
                   |
                   |------------.----------.----------.
                   |            |          |          |
          nsDependentC?String   |          |          |
                                |          |          |
                          nsC?AutoString   |          |
                                           |          |
                                    nsXPIDLC?String   |
                                                      |
                                            nsPromiseFlatC?String


Class overview:


nsAC?String

    This class is designed to be subclassed.  It is never directly
    instantiated.  This class exists only to provide backwards
    compatibility with the former string class API.  It is essentially
    equivalent to nsC?Substring.  However, unlike nsC?Substring,
    nsAC?String might be implemented by an external XPCOM component or
    embedding application that has not yet migrated to the new
    (simpler) embedding string API provided by XPCOM.

nsC?Substring

    This class is designed to be subclassed.  It is never directly
    instantiated.  It represents a string fragment that may or may not
    be null-terminated.  It has methods to access and manipulate the
    string buffer.  It has all of the code to manage the various
    different buffer allocation schemes used by the string classes.
    In many ways, the subclasses of nsC?Substring simply provide
    specialized constructors that select the corresponding memory
    allocation scheme.  If nsC?Substring needs to re-allocate the
    buffer, it will allocate a null-terminated, sharable buffer.

nsC?String

    This class is designed to be instantiated directly.  It is the
    main string class.  It provides a heap allocated string buffer.
    It also provides compatibility methods with the "obsolete" string
    API that used to live in xpcom/string/obsolete (i.e., the "Rick
    G." string API).  It always allocates a sharable buffer.

nsDependentC?String

    This class is designed to be instantiated directly.  It provides a
    mechanism to construct a nsC?String that simply stores a raw
    pointer to an externally allocated buffer.  This class depends on
    the user of the class to ensure that the buffer remains valid for
    the lifetime of the nsDependentC?String.  This class can only wrap
    a null-terminated buffer.

nsC?AutoString

    This class is designed to be instantiated directly.  It provides a
    mechanism to construct a nsC?String that optionally uses a fixed-
    size, stack-based buffer.  This class is designed to be allocated
    on the stack.  Allocating this class on the heap is usually a bad
    idea ;-)

nsXPIDLC?String

    This class is designed to be instantiated directly.  It provides
    support for the getter_Copies mechanism.  It also provides support
    for a null buffer.  Unlike nsC?String::get(),
    nsXPIDLC?String::get() may return null if the nsXPIDLC?String is
    uninitialized or was told to adopt a null-valued string buffer.
    This class can also be cast automatically to |const char_type*|
    for backwards compatibility.  Use this class when working with
    XPCOM getter methods that return |string| or |wstring|.

nsPromiseFlatC?String

    This class is designed to be instantiated via the
    PromiseFlatC?String family of functions.  PromiseFlatC?String
    takes a nsAC?String and returns a nsPromiseFlatC?String, which
    "promises" to be null-terminated.  PromiseFlatC?String will
    allocate a copy of the given string if necessary in order to fulfill
    its promise of a null-terminated string.  The "flat" adjective
    comes from the old string API that supported multi-fragment strings.
    With these current string changes, PromiseFlatC?String is still very
    useful for ensuring null-terminated storage.  This is usually only
    important when you need to pass a nsC?Substring to an API that takes
    a raw character pointer.

nsDependentC?Substring

    This class is designed to be instantiated via the Substring family
    of functions.  It represents an array of characters that are not
    null-terminated.  Much like nsDependentC?String, this class
    depends on an externally allocated string buffer.  Use this class
    to create a nsC?Substring that wraps a pair of raw character
    pointers, a pair of nsReadingIterator<char_type>'s, or a section
    of an existing nsC?Substring.
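
  To make the relationship between dependent substrings and
  PromiseFlatC?String more concrete, here is a minimal sketch.  It is
  not the real implementation: |View|, |Dependent|, |Substring| (as
  defined here) and |Flat| are hypothetical stand-ins built on
  std::string, and the |terminated| flag is an assumption made only
  for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view; NOT the real nsDependentCSubstring.
struct View {
  const char* data;    // points into a buffer owned by someone else
  std::size_t length;  // explicit length; no terminator is assumed
  bool terminated;     // true only if data[length] == '\0' is known
};

// Hypothetical analogue of wrapping a whole null-terminated buffer.
inline View Dependent(const char* s) {
  assert(s != nullptr);
  return View{s, std::strlen(s), true};
}

// Hypothetical analogue of the Substring() helpers: a pointer pair.
inline View Substring(const char* start, const char* end) {
  assert(start != nullptr && start <= end);
  return View{start, static_cast<std::size_t>(end - start), false};
}

// Hypothetical analogue of nsPromiseFlatC?String: copies only if the
// view is not already known to be null-terminated.
class Flat {
 public:
  explicit Flat(const View& v) : mPtr(v.data), mCopied(!v.terminated) {
    if (mCopied) {
      mCopy.assign(v.data, v.length);  // std::string adds the '\0'
      mPtr = mCopy.c_str();
    }
  }
  Flat(const Flat&) = delete;  // mPtr may point into mCopy
  Flat& operator=(const Flat&) = delete;

  const char* get() const { return mPtr; }
  bool DidCopy() const { return mCopied; }

 private:
  std::string mCopy;
  const char* mPtr;
  bool mCopied;
};
```

  In this model, |Flat(Substring(url, url + 4)).get()| can be handed
  to an API that expects a raw, null-terminated pointer, while
  |Flat(Dependent(url))| reuses the caller's buffer and makes no copy.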


Concatenations in the new world:


  For the most part, string concatenations will continue to work just
  as they always have.  They continue to be the preferred way to
  compose a new string from several other strings.  The only
  difference in the new world is that the string concatenation class
  no longer inherits from nsAC?String, so it cannot be passed to
  functions expecting a nsAC?String.  However, for compatibility with
  existing code, a concatenation of strings will automatically flatten
  itself into a nsC?String when necessary.

  For example:

    void foo( const nsAString& s )
    {
      // wide string, to match NS_LITERAL_STRING and nsAString
      nsAutoString buf;
      buf = NS_LITERAL_STRING("prefix") + s;
      ...
    }

  In this case, the two strings "prefix" and |s| are written directly
  to the buffer owned by |buf|.

  Here's another example:

    void bar( const nsAString& s );

    {
      nsString a, b;
      ...
      bar( a + b );
    }

  In this case, a temporary nsString is created to hold the result of
  the concatenation of |a| and |b| prior to calling |bar|.  This
  temporary nsString would not have been generated with the previous
  string implementation that supported multi-fragment nsAStrings.
  However, there was a serious bug in the older implementation that
  made doing this kind of thing crash-prone (especially if the
  definition of |bar| looked something like the definition of |foo| in
  the previous example).  See bug 231995 for more details.
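
  The two cases above can be modeled in a few lines.  This is only a
  sketch of the general technique (a lightweight concatenation object
  that is flattened on demand), not the actual Mozilla code.  |Str|,
  |Concat| and |Bar| are hypothetical names, and std::string stands in
  for the real buffer management.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class Str;

// Hypothetical concatenation object: two references and no buffer.
// It deliberately does not derive from Str.
struct Concat {
  const Str& left;
  const Str& right;
};

// Hypothetical owning string; NOT the real nsString.
class Str {
 public:
  Str() = default;
  Str(const char* s) : mData(s ? s : "") {}
  // Flattening constructor: creates the temporary that lets a Concat
  // be passed where a Str reference is expected.
  Str(const Concat& c) { Assign(c); }

  // Assignment writes both operands directly into this buffer.
  Str& operator=(const Concat& c) {
    Assign(c);
    return *this;
  }

  std::size_t Length() const { return mData.size(); }
  const char* get() const { return mData.c_str(); }

 private:
  void Assign(const Concat& c) {
    // This sketch does not handle |a = a + b|.
    assert(this != &c.left && this != &c.right);
    mData.clear();
    mData.reserve(c.left.Length() + c.right.Length());
    mData.append(c.left.mData);
    mData.append(c.right.mData);
  }

  std::string mData;
};

inline Concat operator+(const Str& a, const Str& b) {
  return Concat{a, b};
}

// Takes a string reference, like |bar| above.
inline std::size_t Bar(const Str& s) { return s.Length(); }
```

  With these definitions, |buf = Str("pre") + s| copies both operands
  once, straight into |buf|, while |Bar(a + b)| goes through the
  flattening constructor and therefore builds a temporary |Str| first.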

  The main point here is that string concatenations will continue to
  work as they have in the past, with a few minor exceptions.

  For example, code such as the following will no longer compile:

    {
      nsString a, b;
      ...
      const nsAString& s = a + b;
      ...
    }

  Such code is uncommon.  It should be rewritten like this:

    {
      nsString a, b;
      ...
      nsString r( a + b );
      ...
    }

  |r| could also be declared a nsAutoString to avoid heap-allocating
  the result of the concatenation.  However, since nsString allocates
  a sharable buffer, the programmer should consider nsString if it is
  expected that |r| might need to be copied elsewhere.
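
  For readers unfamiliar with sharable buffers, the following sketch
  shows the copy-on-write idea in isolation.  It is an illustration
  only: |CowString| is a hypothetical class, and std::shared_ptr is
  used here as a convenient stand-in for a reference-counted buffer.
  The |use_count()| test is adequate for a single-threaded
  demonstration but is not a complete thread-safe design.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical copy-on-write string; NOT the real nsCString.
class CowString {
 public:
  explicit CowString(const char* s)
      : mBuf(std::make_shared<std::string>(s ? s : "")) {}

  // Copying shares the buffer; only the reference count changes.
  CowString(const CowString&) = default;
  CowString& operator=(const CowString&) = default;

  const char* get() const { return mBuf->c_str(); }

  bool SharesBufferWith(const CowString& other) const {
    return mBuf == other.mBuf;
  }

  // Writers take a private copy first if the buffer is shared.
  void Append(const char* s) {
    assert(s != nullptr);
    if (mBuf.use_count() > 1) {
      mBuf = std::make_shared<std::string>(*mBuf);
    }
    mBuf->append(s);
  }

 private:
  std::shared_ptr<std::string> mBuf;  // reference-counted buffer
};
```

  Copying a |CowString| costs a reference-count increment rather than
  a character copy, which is why a sharable string can be the better
  choice when the result is likely to be handed around.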


Maintaining string ABI compatibility:


  nsAC?String exists for backwards compatibility with the frozen
  nsAC?String vtable.  ABI compatibility is maintained even though
  nsAC?String's methods are all non-virtual.  While this sounds like a
  contradiction, compatibility exists by having nsAC?String (in the
  new world) store a pointer to an implementation of the old vtable.
  The vtable methods all cast |this| to nsC?Substring and invoke the
  corresponding methods on nsC?Substring.  (Yes, we are utilizing
  knowledge of how the compiler implements virtual functions, but that's
  not unfamiliar territory -- xpconnect!)  This allows a new nsAC?String
  to have the same binary signature as an old nsAC?String.  Likewise,
  every method on the new nsAC?String must first check the value of its
  vtable pointer to determine if |this| is really a nsC?Substring
  derived class or actually some other nsAC?String implementation (such
  as the old nsEmbedC?String).

  An advantage of this approach is that it eliminates virtual function
  calls in most cases (especially for internal Gecko code).  Common
  nsAC?String methods like BeginReading and Length are made much
  faster by avoiding virtual function calls.  Code at the callsite is
  also reduced since there is no need to dereference the |this|
  pointer and the vtable pointer in order to gain access to the
  address of the virtual function.  Now, the callsites make DSO/DLL
  calls which are significantly less costly in terms of codesize and
  runtime.
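
  The vtable-pointer check can be sketched without relying on any
  compiler-specific layout by using an explicit table of function
  pointers.  Everything below is hypothetical (|StrVTable|,
  |AbstractStr|, |Substr|, |LegacyStr|, |gTableCalls|); the real code
  works against the compiler's own vtable layout, which this sketch
  does not attempt to reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct AbstractStr;

// Hypothetical stand-in for the frozen vtable.
struct StrVTable {
  std::size_t (*Length)(const AbstractStr* self);
};

// Hypothetical analogue of nsAC?String: the first word is a table
// pointer, so old and new objects share a binary layout.
struct AbstractStr {
  const StrVTable* vtable;
  std::size_t Length() const;  // non-virtual
};

// Hypothetical analogue of nsC?Substring.
struct Substr : AbstractStr {
  const char* data;
  std::size_t length;
  explicit Substr(const char* s);
};

// Table entry for old callers: cast |self| and forward.
static std::size_t CanonicalLength(const AbstractStr* self) {
  return static_cast<const Substr*>(self)->length;
}
static const StrVTable kCanonicalVTable = {&CanonicalLength};

Substr::Substr(const char* s)
    : AbstractStr{&kCanonicalVTable},
      data(s),
      length(s ? std::strlen(s) : 0) {}

std::size_t AbstractStr::Length() const {
  if (vtable == &kCanonicalVTable) {
    // Fast path: |this| is really a Substr; no indirect call.
    return static_cast<const Substr*>(this)->length;
  }
  // Slow path: some other implementation; go through its table.
  return vtable->Length(this);
}

static int gTableCalls = 0;  // counts slow-path dispatches

// Hypothetical "old" implementation supplied by an external component.
struct LegacyStr : AbstractStr {
  char text[16];
  explicit LegacyStr(const char* s);
};
static std::size_t LegacyLength(const AbstractStr* self) {
  ++gTableCalls;
  return std::strlen(static_cast<const LegacyStr*>(self)->text);
}
static const StrVTable kLegacyVTable = {&LegacyLength};

LegacyStr::LegacyStr(const char* s) : AbstractStr{&kLegacyVTable} {
  assert(s != nullptr);
  std::strncpy(text, s, sizeof(text) - 1);
  text[sizeof(text) - 1] = '\0';
}
```

  Calling |Length()| on a |Substr| never touches the table, while the
  same call on a |LegacyStr| is routed through it.  An old caller that
  dispatches through |vtable->Length| still gets the right answer for
  either kind of object.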


New string API for XPCOM component developers:


  Going forward, external components and embedding applications should
  not call methods directly on nsAC?String.  These classes should be
  viewed as opaque references to string objects.  This is important
  because it will allow Gecko more flexibility to improve its string
  implementation in the future.

  The new external string API consists of a small set of functions
  exported from the XPCOM library as well as a number of inline helper
  functions.  Include nsStringAPI.h to use these functions.

  nsEmbedString has been re-implemented in terms of this new external
  string API.  For Gecko embedders and XPCOM component authors, the
  XPCOM glue provides stub implementations of the new external string
  API.  All one needs to do to use these functions in external code is
  link to the XPCOM glue standalone library (xpcomglue_s).
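
  The general shape of such an API (an opaque handle, a few exported
  functions, and inline helpers layered on top) might look like the
  sketch below.  The |Demo_| names are invented for this illustration
  and are NOT the functions declared in nsStringAPI.h; consult that
  header for the real names and signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Opaque to callers: only the "library" side knows the layout.
struct DemoString;

// "Exported" functions (hypothetical names).
DemoString* Demo_StringCreate();
void Demo_StringDestroy(DemoString* s);
void Demo_StringSetData(DemoString* s, const char* data,
                        std::size_t length);
std::size_t Demo_StringGetData(const DemoString* s, const char** data);

// Inline helper built only on the exported functions, so it keeps
// working if the library changes its internal representation.
inline void Demo_StringAppend(DemoString* s, const char* data,
                              std::size_t length) {
  const char* old = nullptr;
  const std::size_t oldLength = Demo_StringGetData(s, &old);
  std::string joined(old, oldLength);
  joined.append(data, length);
  Demo_StringSetData(s, joined.data(), joined.size());
}

// "Library" side: free to change without breaking callers.
struct DemoString {
  std::string value;
};

DemoString* Demo_StringCreate() { return new DemoString(); }

void Demo_StringDestroy(DemoString* s) { delete s; }

void Demo_StringSetData(DemoString* s, const char* data,
                        std::size_t length) {
  assert(s != nullptr && data != nullptr);
  s->value.assign(data, length);
}

std::size_t Demo_StringGetData(const DemoString* s, const char** data) {
  assert(s != nullptr && data != nullptr);
  *data = s->value.c_str();
  return s->value.size();
}
```

  Because callers only ever hold a |DemoString*| and never look inside
  it, the layout of |DemoString| can change between releases without
  recompiling the callers.  That is the flexibility the opaque-handle
  approach is meant to buy.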

  If a component is developed against this new API, then it will only
  work in versions of Mozilla that support this new API (obviously).
  This means that component authors interested in compatibility with
  Mozilla 1.4 (for example) will need to develop their components
  against Mozilla 1.4 instead of the later versions of Mozilla.  New
  versions of Mozilla will continue to be binary compatible with
  FROZEN interfaces defined by older versions of Mozilla (see
  "Maintaining string ABI compatibility" above).


So, are we supposed to stop using nsAString?


  The answer is that it depends.  AString in XPIDL will continue to
  map to nsAString.  Gecko interfaces make extensive use of AString,
  ACString, and AUTF8String, and this isn't going to change.  So,
  nsAC?String will continue to be very important to code that
  interacts with XPCOM interfaces.  However, when it makes sense,
  nsC?Substring should be used to pass around string references inside
  the Mozilla codebase.  Unlike nsAC?String, nsC?Substring can be more
  efficient since it does not need to inspect and possibly jump
  through the vtable on each method call.

  Moving code from nsAC?String to nsC?Substring is consistent with the
  overall strategy of deCOMification that is on-going within Gecko.
  If it ever happens that we are able to break binary compatibility
  with Mozilla 1.0, then we would want to equate nsAC?String to
  nsC?Substring.  Of course, I'm not counting on this happening
  anytime soon.

  Embedders and external component developers should treat nsAC?String
  as an opaque handle to a string object.  They should use the new
  external string API and nsEmbedC?String to work with Mozilla
  strings.  <= I'm repeating myself here ;-)


I've tried to minimize the impact of these changes. I don't expect Mozilla hackers to have to learn a new string API. If you are writing an external XPCOM component, I hope you will find the new API easier to work with. I should add that my goal is to freeze the new external string API for Mozilla 1.7 final.


Please let me know if you have any questions or concerns about these changes.


Darin Fisher ([EMAIL PROTECTED]) 2004-02-17
