On 08.04.2018 13:59, Inkane wrote:
I recently had a look at Bug 173097 (Cannot delete a file with "invalid"
characters in its name), and unfortunately, this seems to be a
surprisingly difficult issue to fix with how KIO is currently designed.
The root of the issue here is basically the way Qt handles file paths,

Since QFile::setEncodingFunction() no longer works, another way to "hack" the conversion is to use QTextCodec::setCodecForLocale() within our platform plugin. A specially crafted codec could replace bytes that are not valid UTF-8 with other UTF-16 code units.

From a brief investigation, we could use either U+DC80...U+DCFF (what Python3 uses) or U+EF80...U+EFFF (what MirOS uses). The latter code range is also mentioned as "reserved for encoding hacks" in the Under-ConScript Unicode Registry: http://www.kreativekorp.com/ucsur/
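
To make the difference between the two ranges concrete, here is a rough Python sketch of the mapping. The helper names escape_bytes() and unescape_to_bytes() and the "base" parameter are made up for illustration; this is not Qt or KIO code, and it leans on Python's own surrogateescape handler to find the invalid bytes first.

```python
def escape_bytes(raw: bytes, base: int = 0xEF00) -> str:
    """Decode UTF-8, mapping each undecodable byte 0xNN to code point base + 0xNN.

    Hypothetical helper: base=0xEF00 gives the U+EF80...U+EFFF range,
    base=0xDC00 gives the U+DC80...U+DCFF range.
    """
    # Step 1: let Python escape invalid bytes into U+DC80...U+DCFF.
    text = raw.decode("utf-8", errors="surrogateescape")
    # Step 2: move the escaped code points to the requested range.
    return "".join(
        chr(base + (ord(ch) - 0xDC00)) if 0xDC80 <= ord(ch) <= 0xDCFF else ch
        for ch in text
    )


def unescape_to_bytes(text: str, base: int = 0xEF00) -> bytes:
    """Inverse of escape_bytes(): turn escaped code points back into raw bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if base + 0x80 <= cp <= base + 0xFF:
            out.append(cp - base)        # escaped byte: restore it verbatim
        else:
            out += ch.encode("utf-8")    # ordinary character: normal UTF-8
    return bytes(out)


latin1_name = b"caf\xe9.txt"             # 0xE9 alone is not valid UTF-8
escaped = escape_bytes(latin1_name)      # uses U+EFE9 for the bad byte
restored = unescape_to_bytes(escaped)
```

One design point this shows: a name that already contains a genuine U+EF80...U+EFFF character is valid UTF-8 and would be indistinguishable from an escaped byte on the way back, so the private-use range is not strictly reversible for all inputs. Strict UTF-8 cannot encode U+DC80...U+DCFF at all, so that range does not have this ambiguity.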

https://docs.python.org/3.3/howto/unicode.html says:
"Files in an Unknown Encoding

What can you do if you need to make a change to a file, but don’t know the file’s encoding? If you know the encoding is ASCII-compatible and only want to examine or modify the ASCII parts, you can open the file with the surrogateescape error handler[...] The surrogateescape error handler will decode any non-ASCII bytes as code points in the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These private code points will then be turned back into the same bytes when the surrogateescape error handler is used when encoding the data and writing it back out."
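
For reference, the round trip described in that quote looks like this in Python 3. It is only a minimal sketch with a made-up file name, applied to a byte string rather than a real file:

```python
# A file name as raw bytes: 0xE9 is "é" in Latin-1, but invalid as UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with surrogateescape maps each undecodable byte 0xNN to U+DCNN.
decoded = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns U+DCNN back into the byte 0xNN.
roundtrip = decoded.encode("utf-8", errors="surrogateescape")

# Without the handler, the escaped string cannot be encoded at all.
try:
    decoded.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The last step matters for a codec-based hack: the escaped string is not valid Unicode text, so any code path that encodes it strictly will fail instead of silently producing a different file name.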

I can no longer find the MirOS/MirBSD reference, though.

Christoph Feck
