Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review35800

Important: the bug is not fixed forever. In Qt5, QFile::setEncodingFunction and QFile::setDecodingFunction don't exist anymore [they're deprecated and have empty bodies]. So a different solution will have to be found for Qt5, preferably within Qt itself. I'm removing the code from kdelibs-frameworks; there's no point in calling a method that does nothing.

- David Faure

On July 1, 2013, 10:55 a.m., Róbert Szókovács wrote:

(Updated July 1, 2013, 10:55 a.m.)

Review request for kdelibs and Thiago Macieira.

Description
-----------
This patch works around the problem of filenames that are not valid UTF-8 strings. In KLocalePrivate::initFileNameEncoding(), KDE sets QFile's encoding/decoding functions to QString's toUtf8()/fromUtf8(), which in turn call QUtf8's converter functions. (QUtf8 is not exported to developers, so I had to use an inefficient method; I think it would be better if we could use the state parameter for error detection.)

I replaced these with copy/pasted versions of said functions and changed them so that, when an invalid UTF-8 string is encountered, it is encoded byte by byte, mapping the lower 128 byte values to their normal Unicode positions and the upper 128 to U+18000-U+1807F. The decoder, of course, reverses this.

To make this actually work you have to set the KDE_UTF8_FILENAMES environment variable to a specific value (broken_names). To test it, create .kde/env/KDE_UTF8_FILENAMES.sh with this content:

    export KDE_UTF8_FILENAMES=broken_names

Then log out, log in, and try Dolphin on faulty files (instead of the usual boxed ? you'll see just boxes).

This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044

Diffs
-----
- kdecore/localization/klocale_kde.cpp b010e74

Diff: http://git.reviewboard.kde.org/r/110043/diff/

Testing
-------

Thanks, Róbert Szókovács
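The byte mapping in the review description can be illustrated with a small standalone sketch. This is not the patch's code: the real change works on Qt types inside kdelibs and applies the fallback only to names that fail UTF-8 decoding, whereas this sketch maps every byte unconditionally. The names `kEscapeBase`, `escapeBytes`, `unescapeBytes` and `demoRoundTrip` are hypothetical, and `std::u32string` stands in for QString.

```cpp
#include <cassert>
#include <string>

// Start of the 128-code-point range named in the review description.
constexpr char32_t kEscapeBase = 0x18000;

// Bytes -> code points: 0x00..0x7F keep their value,
// 0x80..0xFF become U+18000..U+1807F.
std::u32string escapeBytes(const std::string &raw)
{
    std::u32string out;
    out.reserve(raw.size());
    for (unsigned char byte : raw) {
        const char32_t cp = byte < 0x80
            ? static_cast<char32_t>(byte)
            : static_cast<char32_t>(kEscapeBase + (byte - 0x80));
        out.push_back(cp);
    }
    return out;
}

// Code points -> bytes: the exact inverse of escapeBytes(). Returns false
// if the text holds a code point outside ASCII and the escape range.
bool unescapeBytes(const std::u32string &text, std::string &raw)
{
    raw.clear();
    for (char32_t cp : text) {
        if (cp < 0x80) {
            raw.push_back(static_cast<char>(cp));
        } else if (cp >= kEscapeBase && cp < kEscapeBase + 0x80) {
            raw.push_back(static_cast<char>(0x80 + (cp - kEscapeBase)));
        } else {
            return false;
        }
    }
    return true;
}

// Usage sketch: a Latin-1 name with a lone 0xE9 byte (not valid UTF-8)
// survives the round trip unchanged.
void demoRoundTrip()
{
    const std::string raw = "caf\xE9.txt";
    const std::u32string text = escapeBytes(raw);
    std::string back;
    const bool ok = unescapeBytes(text, back);
    assert(ok && back == raw);
}
```

Because each byte maps to exactly one code point and back, the original byte sequence can always be recovered, which is what lets such a file be opened even though its name cannot be displayed properly.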
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
(Updated July 1, 2013, 10:55 a.m.)

Status
------
This change has been marked as submitted.

Review request for kdelibs and Thiago Macieira.

Diffs
-----
- kdecore/localization/klocale_kde.cpp b010e74

Diff: http://git.reviewboard.kde.org/r/110043/diff/
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
To reply, visit: http://git.reviewboard.kde.org/r/110043/#review35344

This review has been submitted with commit f4269ef3498581964e8a1a13cd0d6d7f19c88762 by Szókovács Róbert to branch master.

- Commit Hook
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
(Updated June 28, 2013, 9:28 a.m.)

Status
------
This change has been marked as submitted.

Review request for kdelibs and Thiago Macieira.

Diffs
-----
- kdecore/localization/klocale_kde.cpp b010e74

Diff: http://git.reviewboard.kde.org/r/110043/diff/
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
To reply, visit: http://git.reviewboard.kde.org/r/110043/#review35221

Have you committed this yet? I do not see it in git master. Anyhow, can you please add BUG: 159241 to your commit message when you do push it in? That bug report is the original one for the encoding issue you were trying to address with this workaround.

- Dawit Alemayehu
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On June 28, 2013, 12:44 p.m., Dawit Alemayehu wrote:

> Have you committed this yet? I do not see it in git master. Anyhow, can you please add BUG: 159241 to your commit message when you do push it in? That bug report is the original one for the encoding issue you were trying to address with this workaround.

No, I'm waiting for approval of my contributor account.

- Róbert

To reply, visit: http://git.reviewboard.kde.org/r/110043/#review35221
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On June 28, 2013, 2:53 p.m., David Faure wrote:

> Independently from this patch, I was thinking of a completely different approach to solve non-UTF-8 encodings in file:/// URLs: using KRemoteEncoding in kio_file, just like we do in other slaves like kio_ftp, i.e. letting the user select the encoding by hand for the current directory. This way, not only would files be usable, but they could even appear correctly. It does require manual user intervention, but that's the case anyway for FTP dirs that use another locale etc.
>
> On the other hand, it moves the fix up to KIO, so it wouldn't work for other QFile uses [hmm, or anywhere where KIO has a fast path for local files to avoid calling kio_file. Ouch, that might kill this idea completely, in fact]. Well, food for thought, then.

Using anything but the exact byte sequence in URLs breaks http://freedesktop.org/wiki/Specifications/file-uri-spec/. That spec and RFC 3987 put together limit KDE to exactly one local filesystem encoding: UTF-8.

- Thiago

To reply, visit: http://git.reviewboard.kde.org/r/110043/#review35232
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On June 5, 2013, 10:35 a.m., Róbert Szókovács wrote:

> Would anybody please take a look at the latest version and comment?

I'd like to push it into 4.11 if possible; would someone with authority please say the word?

- Róbert

To reply, visit: http://git.reviewboard.kde.org/r/110043/#review33783
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
To reply, visit: http://git.reviewboard.kde.org/r/110043/#review33783

Would anybody please take a look at the latest version and comment?

- Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
(Updated May 17, 2013, 10:07 a.m.)

Review request for kdelibs and Thiago Macieira.

Changes
-------
Updated version: the new behaviour is used when the KDE_UTF8_FILENAMES environment variable is set or the locale is UTF-8.

Diffs (updated)
---------------
- kdecore/localization/klocale_kde.cpp b010e74
- CMakeLists.txt 181f139

Diff: http://git.reviewboard.kde.org/r/110043/diff/
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
(Updated May 17, 2013, 10:08 a.m.)

Review request for kdelibs and Thiago Macieira.

Diffs (updated)
---------------
- kdecore/localization/klocale_kde.cpp b010e74

Diff: http://git.reviewboard.kde.org/r/110043/diff/
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
Thomas Lübking wrote:

> There should be no regression in regular use on non-broken FS names for anyone, not even those using non-UTF-8 locales, so yes, testing the locale to dis/enable this sounds reasonable. Is the solution as simple as deactivating it if the tested env is set to anything but non_broken_names?

No, I'm afraid we would need heuristics similar to the one in Qt; see setupLocaleMapper() in qtextcodec.cpp: get the first nonempty value from the $LC_ALL, $LC_CTYPE, and $LANG environment variables, then check the CODESET part. If it's UTF-8, enable this new functionality; otherwise do as before the patch.

- Róbert

To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32184
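The locale heuristic Róbert describes (first non-empty of $LC_ALL, $LC_CTYPE, $LANG, then look at the codeset part) could be sketched roughly as follows. This is only an illustration of the steps named in the message, not Qt's setupLocaleMapper() and not the code that was committed; all function names are hypothetical, and the case-insensitive, hyphen-ignoring comparison of the codeset is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <initializer_list>
#include <string>

// Codeset part of a locale name: "hu_HU.UTF-8@euro" -> "UTF-8".
// Returns an empty string when the name has no ".codeset" part.
std::string codesetOf(const std::string &locale)
{
    const std::string::size_type dot = locale.find('.');
    if (dot == std::string::npos)
        return std::string();
    const std::string::size_type at = locale.find('@', dot);
    const std::string::size_type len =
        (at == std::string::npos) ? std::string::npos : at - dot - 1;
    return locale.substr(dot + 1, len);
}

// Assumed matching rule: ignore case and '-' so that "UTF-8", "utf8"
// and "Utf-8" are all accepted.
bool isUtf8Codeset(const std::string &codeset)
{
    std::string norm;
    for (unsigned char c : codeset) {
        if (c != '-')
            norm.push_back(static_cast<char>(std::tolower(c)));
    }
    return norm == "utf8";
}

// First non-empty value among LC_ALL, LC_CTYPE and LANG, in that order.
std::string effectiveLocaleName()
{
    for (const char *name : {"LC_ALL", "LC_CTYPE", "LANG"}) {
        const char *value = std::getenv(name);
        if (value != nullptr && *value != '\0')
            return value;
    }
    return std::string();
}

// True when the environment selects a UTF-8 locale; in the scheme discussed
// above, this is when the byte-by-byte fallback would be enabled.
bool localeIsUtf8()
{
    return isUtf8Codeset(codesetOf(effectiveLocaleName()));
}

// Usage sketch for the pure helpers.
void demoCodesetChecks()
{
    assert(codesetOf("hu_HU.UTF-8") == "UTF-8");
    assert(codesetOf("C").empty());
    assert(isUtf8Codeset("utf8"));
    assert(!isUtf8Codeset("ISO-8859-2"));
}
```

Splitting the parsing from the environment lookup keeps the string handling testable without touching the process environment.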
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
Thomas Lübking wrote:

> Then it's worthless.
>
> When I encounter broken filenames on a rw device, I know it's time for a fix. When I encounter broken filenames in Joliet or Rock Ridge (the latter usually caused by myself long ago - thank you, wodim...) I know it's time to mount norock/nojoliet. Whether I do that or set a (KDE-only) env var hardly makes a difference.
>
> When my little sister™ encounters broken filenames anywhere, she knows that it's time to call her personal IT (me) with "these files won't open!" If she could not call me, she would have no access to those files. Period. She won't think to google for "kde broken filenames", because she would not think it's a problem with the name: the files have weird names, yes, but essentially they won't open when she clicks them. That this could be due to some restrictions in UTF-8 and QString and other terms she does not know cannot be an expected consideration.
>
> So either this is not a fixworthy issue at all, or it (as OPT-IN) only becomes a way for distro discrimination (works on distro X but not on distro Y), because the fact is that the filenames are broken, and if we want to assist in that situation, we assist the unskilled *only*, and the unskilled simply don't set env vars. If they did, they would also be skilled enough for convmv et al. to deal with that issue correctly.
>
> IOW *every* distro but Arch/Gentoo/LFS (i.e. where you read a wiki for setup) likely would *have* to set this anyway, and those have the users to turn it off at will.
>
> /2¢

OK, I'm all for making this on by default, but that would be a change from the current situation, where the default is QFile's filename encoding, based on the locale. If this becomes the default, it disrupts those who use a non-UTF-8 locale. The current code provides an environment variable to force KDE to treat filenames as UTF-8; this patch piggybacks on that mechanism. Should we check the locale the same way QFile does?

- Róbert

To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32184

On May 7, 2013, 4:14 p.m., Róbert Szókovács wrote:

(Updated May 7, 2013, 4:14 p.m.)

Review request for kdelibs and Thiago Macieira.

Diffs
-----
- kdecore/localization/klocale_kde.cpp b010e74
- kdecore/localization/klocale_p.h af4a768

Diff: http://git.reviewboard.kde.org/r/110043/diff/
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On May 7, 2013, 10:11 a.m., Róbert Szókovács wrote:
The solution is intentionally shy; I really don't want to fan the flames surrounding this issue. I just stumbled upon this location, where it can be handled painlessly. Whether or not it should be turned on by default can, in my opinion, be left to distributors.

Thomas Lübking wrote:
Then it's worthless.

When I encounter broken filenames on a rw device, I know it's time for a fix. When I encounter broken filenames in joliet or rockridge (the latter usually caused by myself long ago - thank you, wodim...) I know it's time to mount norock/nojoliet. Whether I do that or set a (KDE-only) env var hardly makes a difference.

When my little sister™ encounters broken filenames anywhere, she knows that it's time to call her personal IT (me) with "these files won't open!" - if she could not call me, she would have no access to those files. Period. She won't think to google for "kde broken filenames", because she would not think it's a problem with the name - the files have weird names, yes, but essentially they won't open when she clicks them. That this could be due to some restrictions in UTF-8 and QString and other terms she does not know cannot be expected of her.

So either this is not a fixworthy issue at all, or it (as OPT-IN) only becomes a way for distro discrimination (works on distro X but not on distro Y), because the fact is that the filenames are broken, and if we want to assist in that situation, we assist the unskilled *only*, and the unskilled simply don't set env vars. If they did, they would also be skilled enough for convmv et al. to deal with that issue correctly.

IOW *every* distro but Arch/Gentoo/LFS - i.e. where you read a wiki for setup - likely would *have* to set this anyway, and those have the users to turn it off at will.

/2¢

Róbert Szókovács wrote:
OK, I'm all for making this on by default, but that would be a change from the current situation, where the default is QFile's filename encoding, based on the locale. If this becomes the default, it disrupts those who use a non-UTF8 locale. The current code provides an environment variable to force KDE to treat filenames as UTF8; this patch piggybacks on that mechanism. Should we check the locale the same way QFile does?

There should be no regression in regular use on non-broken FS names for anyone - not even those using non-UTF-8 locales, so yes - testing the locale to dis/enable this sounds reasonable. Is the solution as simple as deactivating it if the tested env is set to anything but non_broken_names?

- Thomas

--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32184 --- On May 7, 2013, 4:14 p.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 4:14 p.m.) Review request for kdelibs and Thiago Macieira. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 kdecore/localization/klocale_p.h af4a768 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
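The locale question raised in this exchange could be sketched roughly as follows. This is only an illustration of one possible policy, not what the patch does: the function names are hypothetical, and the rule itself (explicit "broken_names" always enables the escaping, any other value of the variable leaves the existing behaviour alone, and an unset variable enables it only when the locale's character set is UTF-8) is an assumption the thread has not agreed on. The codeset string is passed in so the logic stays testable; on POSIX systems it could come from nl_langinfo(CODESET).

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <string>

// True if a codeset name such as "UTF-8", "utf8" or "UTF_8" denotes UTF-8.
bool isUtf8Codeset(const char *codeset)
{
    assert(codeset != nullptr);
    std::string norm;
    for (const char *p = codeset; *p != '\0'; ++p) {
        if (*p == '-' || *p == '_')
            continue; // ignore separators so "UTF-8" and "utf8" compare equal
        norm.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(*p))));
    }
    return norm == "UTF8";
}

// Hypothetical policy (an assumption, see the lead-in):
//   envValue: value of KDE_UTF8_FILENAMES, or nullptr when it is unset
//   codeset:  character set name of the current locale
bool shouldEscapeBrokenNames(const char *envValue, const char *codeset)
{
    if (envValue != nullptr)
        return std::strcmp(envValue, "broken_names") == 0; // explicit choice wins
    return isUtf8Codeset(codeset); // unset: never touch non-UTF-8 locales
}

// Convenience wrapper that reads the real environment variable.
bool shouldEscapeBrokenNamesFromEnv(const char *codeset)
{
    return shouldEscapeBrokenNames(std::getenv("KDE_UTF8_FILENAMES"), codeset);
}
```

With a rule like this, users of non-UTF-8 locales would see no change unless they opt in, which is the regression concern voiced above.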
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 10:04 a.m.) Review request for kdelibs and Thiago Macieira. Changes --- Incorporated Thiago's suggestions. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs (updated) - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32184 ---

The solution is intentionally shy; I really don't want to fan the flames surrounding this issue. I just stumbled upon this location, where it can be handled painlessly. Whether or not it should be turned on by default can, in my opinion, be left to distributors.

- Róbert Szókovács

On May 7, 2013, 10:04 a.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 10:04 a.m.) Review request for kdelibs and Thiago Macieira. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On May 7, 2013, 10:11 a.m., Róbert Szókovács wrote:
The solution is intentionally shy; I really don't want to fan the flames surrounding this issue. I just stumbled upon this location, where it can be handled painlessly. Whether or not it should be turned on by default can, in my opinion, be left to distributors.

Then it's worthless.

When I encounter broken filenames on a rw device, I know it's time for a fix. When I encounter broken filenames in joliet or rockridge (the latter usually caused by myself long ago - thank you, wodim...) I know it's time to mount norock/nojoliet. Whether I do that or set a (KDE-only) env var hardly makes a difference.

When my little sister™ encounters broken filenames anywhere, she knows that it's time to call her personal IT (me) with "these files won't open!" - if she could not call me, she would have no access to those files. Period. She won't think to google for "kde broken filenames", because she would not think it's a problem with the name - the files have weird names, yes, but essentially they won't open when she clicks them. That this could be due to some restrictions in UTF-8 and QString and other terms she does not know cannot be expected of her.

So either this is not a fixworthy issue at all, or it (as OPT-IN) only becomes a way for distro discrimination (works on distro X but not on distro Y), because the fact is that the filenames are broken, and if we want to assist in that situation, we assist the unskilled *only*, and the unskilled simply don't set env vars. If they did, they would also be skilled enough for convmv et al. to deal with that issue correctly.

IOW *every* distro but Arch/Gentoo/LFS - i.e. where you read a wiki for setup - likely would *have* to set this anyway, and those have the users to turn it off at will.

/2¢

- Thomas

--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32184 --- On May 7, 2013, 10:04 a.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 10:04 a.m.) Review request for kdelibs and Thiago Macieira. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 2:25 p.m.) Review request for kdelibs and Thiago Macieira. Changes --- Compilation fix This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs (updated) - CMakeLists.txt 181f139 kdecore/localization/klocale_kde.cpp b010e74 kdecore/localization/klocale_p.h af4a768 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 7, 2013, 4:14 p.m.) Review request for kdelibs and Thiago Macieira. Changes --- Removed artifact :( This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs (updated) - kdecore/localization/klocale_kde.cpp b010e74 kdecore/localization/klocale_p.h af4a768 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 5, 2013, 8:13 a.m.) Review request for kdelibs and Thiago Macieira.

Changes ---

I very much like the idea of finally making it possible to deal with non-utf8 files, since this can always happen when plugging in a USB key coming from another OS, mounting a network partition, etc.

Ideally this should be handled in Qt rather than in KDE, though. But let's see what Thiago has to say about either solution; it's his domain...

Triggering this with a special env var is probably too shy. We probably want this to work out of the box without users having to set an env var, provided that it doesn't create any regressions (OK, the performance issue would be a regression, which is another reason for doing this in Qt directly -- only for filenames, of course).

This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32055 --- this is kinda getting old ... just two links to discussions i was involved in: http://www.mail-archive.com/development@qt-project.org/msg04255.html http://comments.gmane.org/gmane.comp.kde.devel.core/54679 - Oswald Buddenhagen On May 5, 2013, 8:13 a.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 5, 2013, 8:13 a.m.) Review request for kdelibs and Thiago Macieira. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review32093 ---

I don't like it. There's a reason why I removed the equivalent code from Qt and there's a reason I refuse to consider adding it back.

At least you've made this about filenames only, so there's hope. If you're going to use any range, I recommend using the same old range from Qt 3 and early Qt 4.

kdecore/localization/klocale_kde.cpp http://git.reviewboard.kde.org/r/110043/#comment23902
Use the same range that Qt used from 2003 to 2007: U+10FE00 to U+10FEFF.

kdecore/localization/klocale_kde.cpp http://git.reviewboard.kde.org/r/110043/#comment23903
Don't check the BOM. No file names have a BOM.

kdecore/localization/klocale_kde.cpp http://git.reviewboard.kde.org/r/110043/#comment23904
BOM handling again.

- Thiago Macieira

On May 5, 2013, 8:13 a.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated May 5, 2013, 8:13 a.m.) Review request for kdelibs and Thiago Macieira. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
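To make the range suggestion in this review concrete, here is a small sketch of an escape mapping into U+10FE00..U+10FEFF. The review only names the range; the simple "base plus byte value" assignment and all function names below are assumptions made for illustration, not a reconstruction of the old Qt code. The sketch also shows that this range lies inside a Unicode Private Use Area (plane 16), whereas U+18000 does not, and it deliberately contains no BOM handling.

```cpp
#include <cassert>

// Range named in the review: 256 code points, one per possible byte value.
constexpr char32_t kLegacyBase = 0x10FE00;
constexpr char32_t kLegacyLast = 0x10FEFF;

// Unicode Private Use Areas: BMP PUA plus the supplementary PUAs in planes 15 and 16.
constexpr bool isPrivateUse(char32_t cp)
{
    return (cp >= 0xE000 && cp <= 0xF8FF)
        || (cp >= 0xF0000 && cp <= 0xFFFFD)
        || (cp >= 0x100000 && cp <= 0x10FFFD);
}

constexpr bool isEscapeCodePoint(char32_t cp)
{
    return cp >= kLegacyBase && cp <= kLegacyLast;
}

// Assumed mapping: undecodable byte b becomes code point base + b.
constexpr char32_t escapeByte(unsigned char b)
{
    return static_cast<char32_t>(kLegacyBase + b);
}

// Inverse of escapeByte(); only meaningful for code points in the escape range.
inline unsigned char unescapeByte(char32_t cp)
{
    assert(isEscapeCodePoint(cp));
    return static_cast<unsigned char>(cp - kLegacyBase);
}

// The whole escape range sits inside the plane 16 Private Use Area.
static_assert(isPrivateUse(kLegacyBase) && isPrivateUse(kLegacyLast),
              "escape range must be private use");
```

Note that nothing here treats the bytes EF BB BF specially: a file name starting with them would be decoded like any other byte sequence, in line with the "no BOM" comments.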
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
On April 16, 2013, 5:15 p.m., Christoph Feck wrote:
Nice idea, would be better to use Unicode Private Use Areas instead of 0x18000 codes.

I've been considering that, too. I don't think it's very likely that someone will legitimately use either the PUA or any currently unassigned area for filenames (which would cause problems, of course), so it's really up for discussion. I don't have commit rights, so ultimately it's up to someone who has them and is willing to sponsor/commit this fix; I have no real preference. I picked this area because I had to pick something and currently no one can or should use it, whereas the PUA can be in private use.

- Róbert

--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review31172 --- On April 16, 2013, 2:59 p.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated April 16, 2013, 2:59 p.m.) Review request for kdelibs. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács
Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ ---

Review request for kdelibs.

Description ---

This patch works around the problem of filenames that are not valid UTF8 strings. In KLocalePrivate::initFileNameEncoding(), KDE sets QFile's encoding/decoding functions to to/fromUTF8() in QString, which in turn call QUtf8's converter function. (QUtf8 is not exported to developers, so I had to use an inefficient method; I think it would be better if we could use the state parameter for error detection.)

I replaced these with copy/pasted versions of the said functions and changed them so that, when it encounters an invalid UTF8 string, it will encode it byte by byte, mapping the lower 128 values to their normal Unicode positions and the upper 128 to U+18000-U+1807F, and of course the decoder reverses it.

To make this actually work you have to define the KDE_UTF8_FILENAMES environment variable to a specific value (broken_names).

To test it, create .kde/env/KDE_UTF8_FILENAMES.sh with this content:

    export KDE_UTF8_FILENAMES=broken_names

then log out, log in, and try Dolphin on faulty files (instead of the usual boxed ? you'll see just boxes).

This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044

Diffs - kdecore/localization/klocale_kde.cpp b010e74

Diff: http://git.reviewboard.kde.org/r/110043/diff/

Testing ---

Thanks, Róbert Szókovács
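The mapping described in this request can be illustrated with a small standalone sketch. It is not the patch itself: it uses only the C++ standard library instead of QString/QFile, all names are hypothetical, and it assumes the fallback applies to the whole name as soon as strict UTF-8 decoding fails. Bytes below 0x80 keep their code point; bytes 0x80..0xFF become U+18000..U+1807F, and the reverse function turns those code points back into the original raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Escape range from the description: one code point per byte 0x80..0xFF.
constexpr char32_t kEscapeBase = 0x18000;
constexpr char32_t kEscapeLast = 0x1807F;

// Strict UTF-8 decoder: returns false on malformed, truncated, overlong,
// surrogate or out-of-range sequences.
bool decodeUtf8(const std::string &in, std::u32string &out)
{
    out.clear();
    const std::size_t n = in.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        char32_t lowest = 0; // smallest code point allowed for this length
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; lowest = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; lowest = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; lowest = 0x10000; }
        else return false;             // stray continuation or invalid lead byte
        if (len > n - i) return false; // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        out.push_back(cp);
        i += len;
    }
    return true;
}

// Appends the UTF-8 encoding of one code point.
void appendUtf8(char32_t cp, std::string &out)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Raw file name bytes -> code points, with the byte-by-byte fallback.
std::u32string decodeFileName(const std::string &bytes)
{
    std::u32string out;
    if (decodeUtf8(bytes, out))
        return out; // valid UTF-8: nothing special to do
    out.clear();
    for (const char ch : bytes) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80)
            out.push_back(b); // lower 128: normal Unicode position
        else
            out.push_back(static_cast<char32_t>(kEscapeBase + (b - 0x80)));
    }
    return out;
}

// Code points -> raw bytes; escape code points turn back into single bytes.
std::string encodeFileName(const std::u32string &name)
{
    std::string out;
    for (const char32_t cp : name) {
        if (cp >= kEscapeBase && cp <= kEscapeLast)
            out.push_back(static_cast<char>(0x80 + (cp - kEscapeBase)));
        else
            appendUtf8(cp, out);
    }
    return out;
}
```

For example, the Latin-1 name bytes "caf\xE9" are not valid UTF-8, so the last byte would become U+18069 and encodeFileName() would restore the original byte. One limitation of any such scheme, in this sketch as well, is that a valid name which genuinely contains a code point from the escape range would not survive the round trip.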
Re: Review Request 110043: Proposed fix/workaround for legacy encoded filename handling
--- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/#review31172 --- Nice idea, would be better to use Unicode Private Use Areas instead of 0x18000 codes. - Christoph Feck On April 16, 2013, 2:59 p.m., Róbert Szókovács wrote: --- This is an automatically generated e-mail. To reply, visit: http://git.reviewboard.kde.org/r/110043/ --- (Updated April 16, 2013, 2:59 p.m.) Review request for kdelibs. This addresses bug 165044. http://bugs.kde.org/show_bug.cgi?id=165044 Diffs - kdecore/localization/klocale_kde.cpp b010e74 Diff: http://git.reviewboard.kde.org/r/110043/diff/ Testing --- Thanks, Róbert Szókovács