On 17/11/19 01:55, Thiago Macieira wrote:
Hi,

Sorry, it looks like this thread is not progressing in a calm and reasoned manner, the way it was meant to be. And I'm very much to blame. So I apologise for the strong language and passionate opinions. I'm deleting most of what I had written as a reply so we can start over.

Let's start with your questions:

On Saturday, 16 November 2019 10:50:13 PST André Pönitz wrote:
You have not yet answered - why this decision was made

You know, I don't know. To be frank, I don't know that a decision *was* made. It all started with a change (see OP) about removing QTextCodec from the API and from QtCore. It seemed reasonable enough but it turned up quite a few kinks that hadn't been predicted. One of them, which may still be a showstopper, is QXmlStreamReader's inability to handle XML data encoded in anything except UTF-8, though a thorough search of all XML files in my system turned up exactly zero such files.

I don't know why QTextCodec is being removed. I don't remember any decisions in prior QtCS or this mailing list about removing it. We definitely discussed removing the CJK codecs and their big tables and that can still be done, with no effect in the API, since QTextCodec is backed by ICU's ucnv. We may have discussed removing it, but I don't remember a firm decision. And even if it is firm, after looking at the consequences of doing so, we may want to reverse our decision.
I don't know either. Is it to make QtCore smaller? Wasn't the feature system ("Qt Lite") supposed to address that? Or is it to make it less of a "kitchen sink", and split it in smaller libraries? Could that mean having QTextCodec in its own library, and QXmlStreamReader in another (that depends on the former)?
Related to that is the discussion of whether UTF-8 is the only acceptable locale on Unix systems. If we don't have QTextCodec, then we have to have something fixed for QString::fromLocal8Bit and it would necessarily be UTF-8. But even if we do have QTextCodec, that's still a reasonable question: should we assume it is UTF-8? And should we enforce it? Those were the questions in my OP.
Should fromLocal8Bit be following the locale environment instead (LC_CTYPE, LC_MESSAGES or similar)?
2) QtCore size

As I said above, removing the legacy codecs we have code for is not a problem. They are already disabled in Qt builds where ICU is present, so we'd additionally remove them from all other builds. Where ICU is present, there's no loss of functionality for user applications, since ICU provides far more codecs than we do. For those without ICU, it stands to reason that the user chose size so they are aware of the limitations. Plus, one can always instantiate their own QTextCodec and add to the list (at least, with today's implementation).

If QTextCodec is not in QtCore, then most likely you can't affect how QtCore and almost all other Qt classes decode 8-bit data into QString, including QTextStream.
See above -- it also means QTextStream goes in some I/O lib that contains or depends on the codecs lib.
and 3) misconfigured locale systems and filename handling

This is probably the biggest problem. As it is right now, when the locale isn't set on a Unix system or if it is explicitly set to C, we *cannot* decode any file names with the 8th bit set. Those file names are considered filesystem corruption. And yet they are quite commonly created by the user outside of English-speaking jurisdictions.
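As an illustration of the "8th bit" problem, the sketch below contrasts an ASCII-only check (what a strict C locale can represent) with a structural UTF-8 check on the raw bytes of a file name. `isAscii` and `isValidUtf8` are hypothetical helpers written for this example, not Qt functions, and this is not a claim about how Qt decodes file names internally.

```cpp
#include <cstddef>
#include <string>

// True if no byte has the 8th bit set, i.e. the name is plain ASCII.
bool isAscii(const std::string &bytes)
{
    for (unsigned char c : bytes)
        if (c & 0x80)
            return false;
    return true;
}

// Structural UTF-8 check: rejects stray or missing continuation bytes,
// overlong forms, surrogates and code points above U+10FFFF.
bool isValidUtf8(const std::string &bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (c < 0x80) { ++i; continue; }      // single-byte ASCII

        std::size_t len = 0;
        unsigned long cp = 0, min = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
        else return false;                    // invalid lead byte

        if (i + len > n)
            return false;                     // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(bytes[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;                 // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}
```

A name such as "café.mp3" stored as UTF-8 fails the ASCII check but passes the UTF-8 check, whereas the same name stored as Latin-1 fails both.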
Why do we bother about "saving the world"? A misconfigured system is the user's mistake. They should be in charge of fixing it in order to address the problem.
I get the impression that this thread was not started as an RFC for an open-ended discussion, but as a staged attempt to provide a figleaf for a pre-determined decision.

That was not the intention. That's why I am re-starting it so we can come back to a reasoned approach.

Anyway, the two independent (but related) decisions we need to make are:
1) do we keep QTextCodec in QtCore?
2) do we want to change how we handle legacy (non-UTF-8) locales?

For #2, the sub-questions of the OP apply:
a) What should Qt 6 assume the locale to be, if no locale is set?
b) In case a non-UTF-8 locale is set, what should we do?
c) Should we propagate our decision to child processes?

My preferences were:
a) C.UTF-8
b) override it to force UTF-8 on the same locale
c) yes
How about

a) either C / C.UTF-8, but warning the user; but I'd up the ante, and say: just assert/crash.
b) keep the choice. Silently changing it sounds like a bad idea; we should never override the user choices silently.
c) no. We shouldn't "fix" subprocesses. They have the right to make their own independent decisions.
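For reference, the quoted preferences (a) and (b) amount to a small rewrite of the locale name. The sketch below is one possible reading of "force UTF-8 on the same locale", not code from any Qt change; `forceUtf8` is a hypothetical name, and the handling of "POSIX" and of "@modifier" suffixes is an assumption on my side. The alternative proposed above would instead return the name unchanged and emit a warning.

```cpp
#include <string>

// Hypothetical helper: maps a locale name to its UTF-8 variant.
// (a) empty, "C" or "POSIX"          -> "C.UTF-8"
// (b) "lang_TERR[.codeset][@mod]"    -> "lang_TERR.UTF-8[@mod]"
std::string forceUtf8(const std::string &localeName)
{
    if (localeName.empty() || localeName == "C" || localeName == "POSIX")
        return "C.UTF-8";

    // Split off an optional "@modifier" suffix so it can be kept.
    const std::string::size_type at = localeName.find('@');
    const std::string modifier =
        at == std::string::npos ? std::string() : localeName.substr(at);
    std::string base = localeName.substr(0, at);

    // Drop any explicit codeset such as ".eucJP" or ".ISO-8859-15".
    const std::string::size_type dot = base.find('.');
    if (dot != std::string::npos)
        base.erase(dot);

    return base + ".UTF-8" + modifier;
}
```

So LANG=ja_JP and LANG=ja_JP.eucJP would both be treated as ja_JP.UTF-8, and an unset or C locale as C.UTF-8.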
But I think we should. My arguments are that UTF-8 locales are the default in all desktop Linux distributions, all BSDs and on macOS and have been for 15 years. Most embedded systems from the last 5 years at least also have it as the default, especially those with graphical HMIs and most especially those using Qt for that. Any applications that had problems with UTF-8 must have been fixed for a long time and those that didn't are almost certainly launched from wrappers that set a suitable environment for them, either via QProcessEnvironment, execle, a shell script, or some other mechanism.
Or, on the other hand: what is the chance that a system comes without a locale set? Which is more likely: that it's an accident, or a deliberate setting? If it's an accident, why not be *very* verbose about it?
Moreover, setting the locale to non-UTF-8 on a Qt 4 or 5 application on a system with UTF-8-encoded file names is just *wrong* and asking for trouble, for the filesystem reasons stated above. Just as an example, think of an embedded system with a multimedia player that reads a FAT32-formatted USB stick: it wouldn't go very far if it couldn't even see the music files with non-ASCII characters in them.

So I feel confident when I say applications targeting a port to Qt 6 are not subject to that problem. Therefore, our resetting of the environment inside the Qt 6 application is not going to affect the child processes.

But if we disagree and think we shouldn't qputenv, I still think we should assume by default the locale *is* UTF-8, even if the environment tells us it isn't (an explicit LANG=ja_JP for example, but much more commonly an LC_ALL=C override). The changing of the encoding is usually an undesired side-effect, not an intentional choice. That is to say, LANG=ja_JP was actually meant to be LANG=ja_JP.UTF-8 and LC_ALL=C could have been for the parsing reasons you brought up.

If we don't do the qputenv(), we'll still setlocale() in QCoreApplication so qt_error_string() produces output and we'll live with the danger that some code undoes our choice. My search through Linux library code found no instance of a permanent setlocale() call with a non-null second parameter (Qt is actually the only exception).
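The difference between the two options quoted above can be sketched with plain POSIX calls. This is only an illustration under the assumption of a POSIX system: `applyToProcess` and `applyToEnvironment` are hypothetical names, and `setenv` here stands in for what qputenv() would do in Qt code.

```cpp
#include <clocale>
#include <stdlib.h> // setenv() is POSIX, not ISO C++
#include <string>

// Changes the locale of the current process only. Child processes are
// unaffected, because they read their locale from the environment.
// Returns false if the C library does not know the requested locale.
bool applyToProcess(const std::string &localeName)
{
    return std::setlocale(LC_ALL, localeName.c_str()) != nullptr;
}

// Changes the environment, which child processes inherit. This does not
// by itself change the locale already active in the current process.
bool applyToEnvironment(const std::string &localeName)
{
    return ::setenv("LC_ALL", localeName.c_str(), 1) == 0;
}
```

Doing only the first keeps child processes on whatever the user configured, at the price that any later setlocale(LC_ALL, "") in the same process would revert to the environment's value. Doing both makes the choice stick, and propagates it.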
Qt is a "framework", not a "library". :-)

--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
_______________________________________________ Development mailing list Development@qt-project.org https://lists.qt-project.org/listinfo/development