https://bugs.documentfoundation.org/show_bug.cgi?id=158068

            Bug ID: 158068
           Summary: Replace string literals with custom O(U)String
                    literals in code
           Product: LibreOffice
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Keywords: difficultyBeginner, easyHack, skillCpp, topicCleanup
          Severity: normal
          Priority: medium
         Component: LibreOffice
          Assignee: libreoffice-bugs@lists.freedesktop.org
          Reporter: mikekagan...@hotmail.com
                CC: sberg...@redhat.com

Throughout the codebase, there are hundreds of initializations of O(U)Strings
with string literals, like

    OUString foo = "abc";
    OString bar("def");
    std::vector<OUString> baz = {"xyz1", "xyz2", "xyz3"};

Every time such an initialization appears, a string constructor is called at
runtime, which allocates memory, and copies strings. This is because O(U)String
is not a plain character array, but a class with a pointer to a structure
rtl_(u)String that holds information about reference count, size, and the
actual character array.

To avoid overhead of such construction, several techniques were introduced over
time; with C++17 adoption, we used an updated O(U)StringLiteral, which is a
templated structure, with a layout compatible with rtl_(u)String; such
O(U)StringLiterals are created at compile time (typically as static inline
constexpr objects), and creation of OUStrings from those became a trivial
operation. But that required a bloat of such helper objects, which is
inconvenient. Instead of a clear

    for (;;)
    {
        // ...
        OUString foo = "abc"; // Allocating memory in a loop
        // doing something with the string...
    }

we had 

    static constexpr OUStringLiteral a_abc("abc"); // compile-time constant
    // ...
    for (;;)
    {
        // ...
        OUString foo(a_abc); // Trivial construction without allocation
        // doing something with the string...
    }

This inconvenience prevented wide adoption of the optimization; only in few
cases, plain "abc" used to initialize O(U)Strings at run time were replaced
with uses of O(U)StringLiterals; most of run-time initializations stay in code.

Introduction of C++20 support in the codebase (commit
1eef07805021b7ca26a1a8894809b6d995747ba1 Bump baseline to C++20, 2023-09-22)
pawed a way to use custom literals; and Stephan came with a solution that
allows to (mostly) get rid of use of the intermediate O(U)StringLiteral
objects, and have the benefir of compile-time creation of the string objects:
commit 27d1f3ac016d77d3c907cebedca558308f366855 (O[U]String literals (unusable
for now, C++20 only), 2023-07-14) introduced the operator ""_ostr() to both
OString and OUString; and commit e83e62fe376a91f7270435e06ee7f6864c48fb4b (Work
around MSVC bug with "..."_ostr vs. u"..."_ostr, 2023-07-19) renamed it in
OUString into operator ""_ustr(). These operators still use O(U)StringLiterals
internally, but hide it from the programmer.

Now it is possible to write

    for (;;)
    {
        // ...
        OUString foo = u"abc"_ustr; // trivial initialization using a
compile-time object
        // doing something with the string...
    }

While the u"abc"_ustr or "abc"_ostr syntax is a bit bulkier than simple "abc",
it is inline, is unambiguous, and provides the wanted optimization using much
better developer experience than the previous O(U)StringLiteral solution.

The easy hack is to replace uses of run-time initializations of O(U)Strings
using string literals like "abc" in codebase with the user-defined literals:

    OUString foo("abc");

should become

    OUString foo(u"abc"_ustr);

Note that not every string literal in code needs to be replaced: e.g.,
comparison like

    OUString foo;
    //...
    if (foo == "abc")

is no less efficient than comparing OUString to another OUString; also,
concatenations happening at run time, that use such literals, are also very
efficient, and replacing such literals there would not improve things. So e.g.

    OUString foo("abc" + OUString::number(n))

should *not* be changed to

    OUString foo(u"abc"_ustr + OUString::number(n))

-- 
You are receiving this mail because:
You are the assignee for the bug.

Reply via email to