(From "RFC: Defaulting to or enforcing UTF-8 locales on Unix systems"...)
On 17.11.19 at 01:55, Thiago Macieira wrote:
> It all started with a change (see OP) about removing QTextCodec from the API
> and from QtCore. It seemed reasonable enough but it turned up quite a few
> kinks that hadn't been predicted. One of them, which may still be a
> showstopper, is QXmlStreamReader's inability to handle XML data encoded in
> anything except UTF-8, though a thorough search of all XML files in my system
> turned up exactly zero such files.
By default, QXmlStreamWriter outputs UTF-8. With QTextCodec removed,
will QXmlStreamWriter always output UTF-8? If so, will it be changed to
handle UTF-8 input as efficiently as possible?
At the moment, the public API takes only QString. So unless you already
have a QString, you convert from UTF-8, Latin-1 or raw numerical types
to UTF-16 (QString), and then QXmlStreamWriter converts that to UTF-8 for
output. The double conversion costs a lot of CPU time and memory
allocations for what I consider a typical use case. As an
example, think of an SVG document where graphical "paths" are very long
sequences of letters and numbers which are known to be Latin-1 and to
not need any escaping. The effect can be measured by sending the
characters directly to the device instead of going through
QXmlStreamWriter::writeCharacters().
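To make the two paths concrete, here is a rough sketch in plain C++17 without
Qt. It is not how QXmlStreamWriter is implemented; the function names
(latin1ToUtf8Direct, latin1ToUtf16, utf16ToUtf8) are hypothetical and only
illustrate the idea. The two-step path allocates an intermediate UTF-16
buffer and walks the data twice, while the direct path does a single pass.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Direct path (hypothetical helper): every Latin-1 byte becomes one or
// two UTF-8 bytes, so a single pass and a single allocation suffice.
std::string latin1ToUtf8Direct(std::string_view latin1)
{
    std::string out;
    out.reserve(latin1.size() * 2); // worst case: every byte is >= 0x80
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Step 1 of the two-step path: widen Latin-1 to UTF-16. This stands in
// for building a QString from Latin-1 data.
std::u16string latin1ToUtf16(std::string_view latin1)
{
    std::u16string out;
    out.reserve(latin1.size());
    for (unsigned char c : latin1)
        out.push_back(static_cast<char16_t>(c));
    return out;
}

// Step 2 of the two-step path: encode UTF-16 as UTF-8. This stands in
// for what a writer has to do when it is handed UTF-16 text.
std::string utf16ToUtf8(std::u16string_view utf16)
{
    std::string out;
    out.reserve(utf16.size() * 3); // at most 3 bytes per 16-bit code unit
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        char32_t cp = utf16[i];
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            const bool validPair = cp <= 0xDBFF && i + 1 < utf16.size()
                && utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF;
            if (validPair) {
                // Combine a surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (utf16[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD; // lone surrogate: emit the replacement character
            }
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For any Latin-1 input, utf16ToUtf8(latin1ToUtf16(s)) and latin1ToUtf8Direct(s)
produce the same bytes; the only difference is the extra buffer and the extra
pass. For pure ASCII data such as SVG path strings, the direct path
degenerates to a plain byte copy.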
Latin-1 element names and attribute names are quite common, too, so they
might also be candidates for skipping the UTF-16 (QString) conversion step.
Kai