Re: [Development] QAnyStringView

Marc Mutz via Development Wed, 24 Jun 2020 00:37:03 -0700

Hi Thiago,

On 2020-06-24 02:36, Thiago Macieira wrote:

On Tuesday, 23 June 2020 02:35:05 PDT Marc Mutz via Development wrote:
I have come to believe that QUtf8StringView without QAnyStringView won't fly: Introducing QUtf8StringView without QAnyStringView will explode the
number of mixed-type operations we need to support.
Question, what are the "mixed-typed operations we need to support"? Where do
you see the need for this?

QString::replace(), relational operators, QXmlStream has several combinatorially-explosive overload sets, too.

The best we can do to condense this down is
to revoke string-ness of QByteArray and we'd be left with

- QStringView
- QLatin1String
- QUtf8StringView
- QChar

Aside from places where an exception is worth it, our string API should:

- take QString by const-ref
- return QString by value

That condenses our four types to one for almost the entirety of Qt.

In times of QtMCU, we need to re-think whether the use of owning containers in APIs is really the way to go. In C, strings are const char*, which is a view. In Qt 3, the objectName ctor argument was a const char*. Things have gone downhill since then for no good reason (i.e. no functionality was added), and a test like tst_qstatemachine adding 10KiB in text size for O(150) setObjectName() calls' forced creation of QString temps is just premature pessimisation at its best.

My Qt 5-era changes to QLatin1String (adding QL1S::arg(), enabling QL1S as a type for date/time formats, overloading QtJSON functions for QL1S) have shown how dramatic the effect of 8-bit string values without constructing a QString first really is, without actually limiting functionality.

QStringView, QStringTokenizer, QUtf8StringView and QAnyStringView are the result of a multi-year analysis of string processing with Qt. I'm fully convinced they're the way to go for ease of use and performance at the same time. There's one and only one technical reason to continue to use QString in APIs: lack of implementation bandwidth.

For new API that benefits from the exception, I'd reduce to two:
- QStringView
- QUtf8StringView
But the fact that you listed QChar in the first place indicates that you're talking about the string classes themselves. Nothing else uses QChar in our API. In that case, yes, QLatin1String and QChar are part of the overload set.

[...]

I guess you consider QStringList::join() a string class, then. But otherwise, yes, I agree.

Assuming for the sake of argument that we need those four types,
consider QString::replace(). Experience shows that stuff like
QStringBuilder expressions being passed will require an actual QString
overload to be present, too. Ignoring existing overloads and regexp,
we'd need 5x5=25 overloads. I won't enumerate them here. What I will
enumerate is the complete set of overloads when using QAnyStringView:

    QString& QString::replace(QAnyStringView, QAnyStringView,
                              Qt::CaseSensitivity);

That's it.
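
As a rough illustration of the idea (not the Qt implementation), here is a minimal sketch in standard C++ of how one variant-style view parameter can stand in for a whole overload matrix. AnyView, toUtf16() and replaceAll() are hypothetical stand-ins, 8-bit input is assumed to be Latin-1, and UTF-8 and case sensitivity are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for an "any string" view: it refers to either
// 8-bit (assumed Latin-1) or UTF-16 data without owning it.
struct AnyView {
    std::variant<std::string_view, std::u16string_view> data;
    AnyView(const char *s) : data(std::string_view(s)) {}
    AnyView(const char16_t *s) : data(std::u16string_view(s)) {}
    AnyView(const std::string &s) : data(std::string_view(s)) {}
    AnyView(const std::u16string &s) : data(std::u16string_view(s)) {}
};

// The conversion to the storage encoding happens once, inside the "library".
std::u16string toUtf16(AnyView v)
{
    return std::visit([](auto sv) -> std::u16string {
        if constexpr (std::is_same_v<decltype(sv), std::u16string_view>) {
            return std::u16string(sv);
        } else {
            std::u16string out;
            out.reserve(sv.size());
            for (unsigned char ch : sv)   // Latin-1 maps 1:1 to UTF-16 units
                out.push_back(static_cast<char16_t>(ch));
            return out;
        }
    }, v.data);
}

// One signature covers every combination of argument encodings.
std::u16string &replaceAll(std::u16string &str, AnyView before, AnyView after)
{
    const std::u16string b = toUtf16(before);
    const std::u16string a = toUtf16(after);
    if (b.empty())
        return str;
    for (std::size_t pos = 0;
         (pos = str.find(b, pos)) != std::u16string::npos;
         pos += a.size())
        str.replace(pos, b.size(), a);
    return str;
}

static void demo()
{
    std::u16string s = u"name=name";
    replaceAll(s, "name", u"name_1");   // 8-bit needle, UTF-16 replacement
    assert(s == u"name_1=name_1");
}
```

This sketch always converts to UTF-16 first; a real implementation would presumably dispatch to encoding-specific code paths instead.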

Unlike QStringView, QAnyStringView is a pure interface type. I won't add much in the way of parsing API to it, even though I acknowledge that's a slippery slope. While it would be easy to add trimmed(), and tokenize()
would be really interesting, QAnyStringView should not be used for
parsing. That's what we have the three non-variant string view types
for. Being a pure interface type means we can add more "dangerous"
conversions. QStringView can't be constructed from a QStringBuilder,
e.g., because it's almost impossible to make that work without
referencing destroyed data:

    QStringView s = u'c' + QString::number(x); // oops
    QString c = u'c' + QString::number(x);
    QStringView s = c; // ok

But QAnyStringView supports this:

    str.replace(name, name % "_1");

That's not the same code. In one you're creating a view object and accessing
it later outside of the same statement; in the other, it is created and
accessed in the same statement. That is to say, the following works:

    void foo(QStringView str);
    foo(u'c' + QString::number(x));

and the following doesn't:

    QAnyStringView s = u'c' + QString::number(x);

That's what I've been trying to say: since Q(Utf8)StringView is very good at parsing, QAnyStringView is very good at being an interface type and an interface type only. As such, we can allow ourselves some leeway in what (implicit) conversions we add to QAnyStringView that we actively rejected for QStringView.
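
The lifetime distinction discussed above can be sketched with standard-library stand-ins: std::string_view plays the role of the view type and std::string the role of the owning string. The names length() and demo() are made up for this example, and the dangling case is deliberately left as a comment so that the sketch stays well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// A sink that only reads its argument during the call.
std::size_t length(std::string_view str)
{
    return str.size();
}

static void demo()
{
    const int x = 42;

    // Fine: the temporary std::string lives until the end of the full
    // expression, i.e. until length() has returned.
    const std::size_t n = length(std::string("c") + std::to_string(x));
    assert(n == 3);

    // Not fine (kept as a comment on purpose): the temporary is destroyed
    // at the semicolon, so the view would dangle when used afterwards.
    //     std::string_view s = std::string("c") + std::to_string(x);

    // Fine: a named owner keeps the data alive for the view.
    const std::string c = std::string("c") + std::to_string(x);
    const std::string_view s = c;
    assert(s == "c42");
}
```

A type that is only ever used as a function parameter stays in the first category, which is the argument for allowing it more implicit conversions.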

QAnyStringView solves this in the sense that one overload can replace
many overloads. The complexity is still there: a binary visitation of a QAnyStringView produces nine instantiations of the visitor (though that can be reduced to six in many cases), but many implementations fall into one of just two classes: 1) a function would just call toString() on the any-string-view, anyway, in which case the QString construction is taken
out of user code and centralized in the library. If you think that
doesn't matter, look at the tst_qstatemachine numbers in

   https://codereview.qt-project.org/c/qt/qtbase/+/301595
   (-10KiB just from temporary QString creation and destruction)
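
For readers unfamiliar with binary visitation, the following is a small sketch using std::variant and std::visit rather than Qt's own machinery. Latin1View, Utf8View and Utf16View are hypothetical tag types. The reading that the reduction to six comes from symmetric operations (where the operand order does not matter) is an assumption of this sketch, not something stated in the thread.

```cpp
#include <cassert>
#include <set>
#include <string_view>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical tag types for the three encodings a variant view can carry.
struct Latin1View { std::string_view s; };
struct Utf8View   { std::string_view s; };
struct Utf16View  { std::u16string_view s; };
using AnyView = std::variant<Latin1View, Utf8View, Utf16View>;

// Binary visitation: the generic lambda is instantiated once per
// (left type, right type) pair, i.e. 3 x 3 = 9 times.
std::pair<int, int> dispatch(const AnyView &lhs, const AnyView &rhs)
{
    return std::visit([](const auto &l, const auto &r) {
        auto id = [](const auto &v) {
            using T = std::decay_t<decltype(v)>;
            if constexpr (std::is_same_v<T, Latin1View>) return 0;
            else if constexpr (std::is_same_v<T, Utf8View>) return 1;
            else return 2;
        };
        return std::make_pair(id(l), id(r));
    }, lhs, rhs);
}

// Assumption: for symmetric operations the operands can be ordered first,
// so only 3 + 2 + 1 = 6 distinct pairs are ever reached at run time.
std::pair<int, int> dispatchSymmetric(const AnyView &lhs, const AnyView &rhs)
{
    return lhs.index() <= rhs.index() ? dispatch(lhs, rhs) : dispatch(rhs, lhs);
}

static void demo()
{
    const AnyView views[] = { Latin1View{"a"}, Utf8View{"a"}, Utf16View{u"a"} };
    std::set<std::pair<int, int>> all, symmetric;
    for (const AnyView &l : views) {
        for (const AnyView &r : views) {
            all.insert(dispatch(l, r));
            symmetric.insert(dispatchSymmetric(l, r));
        }
    }
    assert(all.size() == 9);
    assert(symmetric.size() == 6);
}
```

Note that std::visit still compiles all nine instantiations here; actually dropping three of them from the binary would need a hand-written dispatch.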
I'm leaning towards agreeing to use QAnyStringView in the string classes.

The part you're replying to is, however, a case of a traditional QString-only function: setObjectName().

I'll remove my -2.


Thanks for that!

2) the complexity is already there and QAnyStringView helps in reducing
it:

   https://codereview.qt-project.org/c/qt/qtbase/+/303483 (QCalendar)
   https://codereview.qt-project.org/c/qt/qtbase/+/303512 (QColor)
   https://codereview.qt-project.org/c/qt/qtbase/+/303707 (arg())
   https://codereview.qt-project.org/c/qt/qtbase/+/303708 (QUuid)
Agreed on arg(), it's a great clean-up and performance improvement.
But it's part of QString itself. The other ones, however, are the slippery slope. I agree they improve performance for sink-only functions, but we don't *need* QAnyStringView for them. For example, for QCalendar, they could be the
QStringView/QUtf8StringView pair.

Remember that a great deal of performance improvement already came from adding QLatin1String::arg() and QStringView::arg(). You can say this is string-classes, but it really isn't. It's formatting: you take a format string, parse it, and produce some result based on it. We have tons of these in our API: date/time come to mind. If QAnyStringView for arg() is a good idea, so it is for any format string, and by an ever-so-slight extension, for any parser input. Which brings us to:

My problem is not with the clean up that it provides, it's adding yet another
class to our API.

We seem to have agreement on using QAnyStringView for "string classes". If we do, this argument is moot, as the class will _be_ there. It's then only a little extra step to my proposal, since a) using QAnyStringView more widely makes for a more consistent string story in Qt 6 and b) meeting my proposed minimal step would eradicate QLatin1String from our APIs, reducing the newcomer-need-to-know API by one class. +QAnyStringView -QLatin1String = same number of classes.

That said, I'll happily repeat my mantra that fewer classes don't make an API easier to use, it's fewer responsibilities per class that does. And my design is very clean here:


- QAnyStringView is the interface type (and only that)
- Q(Utf8)StringView are the parse types (via QAnyStringView::visit())
- QString is a (possible) storage type (std::u16string and QVLA<char16_t> are others, and I expect a lot more QVLA<char16_t> to be used in implementation as we move forward).


One class - exactly one responsibility. _That_ makes an API easy to use.

Now that I hopefully have convinced you that we need QAnyStringView,
where to go from here?

Given the lack of time until Qt 6.0, I'd like to propose to just replace
all overload sets that contain QL1S with one overload taking
QAnyStringView.


Agreed for the string classes themselves.


I hope I've made my point about not stopping there.

The implementation usually contains the optimized handling of L1 data
already, and can often be easily extended to UTF-8, too, cf. QColor,
QUuid, arg().
Those are likely candidates, yes.
I just don't want to give blanket approval for everything. There may be places where the correct solution is to delete the QLatin1String overload and keep
only QString.

I disagree here. If there's already a QL1S overload, we must never go back to just QString.

What instead we should look into is whether we can make a QString&&/QAnyStringView overload set work meaningfully (i.e. no ambiguities for whatever the user passes). That would allow classes that actually store QStrings to allow transfer-of-ownership, at the cost of exposing an implementation detail. The main problem I see here is QStringBuilder.
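
To give a feel for how such an overload set behaves, here is a sketch using std::string&& and std::string_view as stand-ins for QString&& and QAnyStringView. Object and its members are hypothetical, Qt's types have different conversion rules, and the QStringBuilder problem is not modelled at all.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical class that stores its name in an owning string.
class Object {
public:
    // Takes ownership of an expiring string without copying.
    void setName(std::string &&name) { m_name = std::move(name); m_moved = true; }
    // Accepts anything viewable and copies the characters.
    void setName(std::string_view name) { m_name.assign(name); m_moved = false; }

    const std::string &name() const { return m_name; }
    bool lastCallMoved() const { return m_moved; }

private:
    std::string m_name;
    bool m_moved = false;
};

static void demo()
{
    Object o;

    std::string s = "lvalue";
    o.setName(s);                            // lvalue: only the view overload is viable
    assert(!o.lastCallMoved() && o.name() == "lvalue");

    o.setName(std::string("temporary"));     // rvalue: binds to std::string&&
    assert(o.lastCallMoved() && o.name() == "temporary");

    // o.setName("literal");                 // ambiguous with these stand-ins:
    //                                       // both overloads need one
    //                                       // user-defined conversion
    o.setName(std::string_view("literal"));  // an explicit view resolves it
    assert(!o.lastCallMoved() && o.name() == "literal");
}
```

With these stand-ins, the string literal case is exactly the kind of ambiguity the overload set would have to avoid.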


Thanks,
Marc
