(From "RFC: Defaulting to or enforcing UTF-8 locales on Unix systems"...)

On 17.11.19 at 01:55, Thiago Macieira wrote:

> It all started with a change (see OP) about removing QTextCodec from the API
> and from QtCore. It seemed reasonable enough but it turned up quite a few
> kinks that hadn't been predicted. One of them, which may still be a
> showstopper, is QXmlStreamReader's inability to handle XML data encoded in
> anything except UTF-8, though a thorough search of all XML files in my system
> turned up exactly zero such files.

By default, QXmlStreamWriter outputs UTF-8. With QTextCodec removed, will QXmlStreamWriter always output UTF-8? If so, will it be changed to handle UTF-8 input as efficiently as possible?

At the moment, the public API accepts only QString. So unless you already have a QString, you convert from UTF-8, Latin-1 or raw numerical types to UTF-16 (QString), and then QXmlStreamWriter converts to UTF-8 for output. The double conversion costs a lot of CPU time and memory allocations for what I consider a typical use case. As an example, think of an SVG document where graphical "paths" are very long sequences of letters and numbers which are known to be Latin-1 and not to need any escaping. The effect can be measured by sending the characters directly to the device instead of going through QXmlStreamWriter::writeCharacters().
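
To make the double conversion concrete, here is a minimal sketch in plain standard C++. It is not Qt code and does not reflect how QXmlStreamWriter is implemented. The function names latin1ToUtf16, utf16ToUtf8 and latin1ToUtf8 are hypothetical stand-ins for the Latin-1 to QString step, the QString to UTF-8 output step, and a possible direct path.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Step 1 of the current path: widen Latin-1 to UTF-16 (what building a
// QString from Latin-1 data amounts to). Allocates a buffer twice the size.
std::u16string latin1ToUtf16(std::string_view latin1)
{
    std::u16string out;
    out.reserve(latin1.size());
    for (unsigned char c : latin1)
        out.push_back(static_cast<char16_t>(c)); // Latin-1 byte == code point
    return out;
}

// Step 2 of the current path: encode UTF-16 as UTF-8 for output.
// Unpaired surrogates are replaced with U+FFFD (an assumption of this sketch).
std::string utf16ToUtf8(std::u16string_view utf16)
{
    std::string out;
    out.reserve(utf16.size());
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        char32_t cp = utf16[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < utf16.size()
            && utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (utf16[i + 1] - 0xDC00);
            ++i; // consumed the low surrogate as well
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Direct path: Latin-1 to UTF-8 in one pass, with no intermediate buffer.
// Bytes below 0x80 are copied unchanged; the rest become two-byte sequences.
std::string latin1ToUtf8(std::string_view latin1)
{
    std::string out;
    out.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For pure ASCII input such as SVG path data, latin1ToUtf8 is effectively a copy, while utf16ToUtf8(latin1ToUtf16(data)) produces the same bytes after two passes and an extra allocation.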

Latin-1 element names and attribute names are quite common, too. So they might also be considered for avoiding the UTF-16 (QString) conversion step.

Kai
