On Wednesday, 6 June 2012 16.51.14, João Abecasis wrote:
> > From there, we come to the conclusion that the QString representing such a
> > file name must contain special processing instructions (e.g., one or
> > more special characters). One form of special processing instruction is
> > escaping each character, like URLs do. The problem with the approach of
> > escaping is what to do when the escape character occurs in a file name.
> > If that is a possibility, the escape character needs to be escaped by
> > itself (like "\\" for backslashes in C or "%25" for percents in URLs). If
> > we use this approach, then we will not interoperate properly with non-Qt
> > applications when this character happens.
> >
> > The only sane solution, then, is to use a character that has a very small
> > chance of ever being used or, better yet, a zero chance (I don't think
> > there's any). If that happens, then this character will be close to
> > "untypeable" on the terminal. Not a big loss, I'd say.
>
> We could use some magic sequence. Windows, for instance, uses the "\\?\"
> prefix to support longer paths. We could use '<' and '>', which are rare
> but valid, we could give a specific meaning to sequences of 3 or more
> slashes.
>
> I don't have a concrete solution at the moment.
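
To illustrate the escape-the-escape problem from the quoted text, here is a small sketch in plain C++ (no Qt). The names escapeName and unescapeName are made up for this example, and '%' as the escape character simply follows the URL analogy; it is an illustration, not a proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helpers, not Qt API. Percent-escape every byte outside
// printable ASCII, plus the escape character '%' itself.
std::string escapeName(const std::string &raw)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : raw) {
        if (c == '%' || c < 0x20 || c >= 0x7F) {
            out += '%';
            out += digits[c >> 4];
            out += digits[c & 0x0F];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Inverse of escapeName(); a '%' not followed by two hex digits is kept as-is.
std::string unescapeName(const std::string &escaped)
{
    auto hex = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    };
    std::string out;
    for (std::size_t i = 0; i < escaped.size(); ++i) {
        if (escaped[i] == '%' && i + 2 < escaped.size()
            && hex(escaped[i + 1]) >= 0 && hex(escaped[i + 2]) >= 0) {
            out += static_cast<char>(hex(escaped[i + 1]) * 16 + hex(escaped[i + 2]));
            i += 2;
        } else {
            out += escaped[i];
        }
    }
    return out;
}
```

With these, the Latin-1 bytes of R\xE9sum\xE9.txt come out as R%E9sum%E9.txt, but an ordinary file called 100%.txt comes out as 100%25.txt. A non-Qt application would never spell the second name that way, which is the interoperability problem.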

I really think we should not use a character that is easily used in file names,
and that includes <, >, commas, percents, backslashes, spaces, etc. It needs to
be a Unicode character that has a close-to-zero chance of being intentionally
used. I recommend selecting one or two characters from the Unicode private use
area for this.

We could use a non-character (such as U+FDD0), but that will cause problems
elsewhere. For one, if you add such a path to QTextBrowser, it might do weird
things. For another, such characters are dropped by the UTF-8 encoder and
decoder, aren't allowed in D-Bus, etc.

This character will be all but "untypeable" on the command line. I don't think
we care, though, since Qt applications are seldom launched from the command
line and, besides, if the user sees the broken file name anyway (in either
form), the user is likely to fix the problem.

> > If it was named "βιογραφικό σημείωμα.txt" in ISO-8859-7, the QString
> > representation would be:
> > /home/foo/έγγραφα/<escape>âéïãñáöéêü óçìåßùìá.txt
> >
> > That has the drawback of being hard to use when it comes to path
> > manipulation. Appending, prepending, extracting or inserting text could
> > have unexpected consequences.
>
> I think any such scheme should support both absolute and relative paths and
> should allow a relative path to be combined with an absolute path with:
>
> absolute-path + '/' + relative-path

If you append a slash, it unshifts back to normal. But imagine someone
appending a suffix. Thankfully, non-ASCII suffixes / extensions are really
rare.

> > Limitations:
> > a) Qt-only, I don't expect anyone else to use such file names
> > b) if encodeName() isn't used properly, it leads to a bad encoding of the
> > file name onto 8-bit. Applications dealing with the filesystem need to
> > be extra careful so as to not show two representations of the same file.
> > c) for that matter, it's possible to produce an escaped form that matches
> > a regular file name
> > d) double representations are often a source of security issues if not
> > handled carefully (cf. overlong sequences in UTF-8)
>
> I don't see a) as such a big problem, since currently Qt can't even handle
> such file names. As for b) I think ideally we'd come up with something that
> makes the use of encode/decodeName invisible and doesn't require users to
> register their own encoding/decoding functions. c) is what we want to
> minimize.
>
> As for d), if we make it all transparent and handled in a seamless way in Qt
> the problem that remains is how those paths interoperate with other
> applications and user code. It really helps to minimize c).

I'm not sure I agree with your dismissal of d). I'd like to see more research
into this topic first.

> On the other hand we already have Qt-only paths in resource files and
> QDir::searchPaths(). We could easily use a well-known prefix for the
> special paths: url-encoded:/usr/joao/R%E9sum%E9.txt, which only supports
> absolute paths, but would already enable all items in my wish list.
>
> > As you can see, I didn't come up with this today. I've known these
> > alternatives for years.

I don't think they're worth our time. Search paths and the filesystem engines
are misfeatures. One is gone, the other not yet. They are potential security
issues too.

Anyway, what I recommend for now:
1) immediately, de-inline QFile::decodeName and QFile::encodeName
2) un-deprecate them and update the text in changes-5.0.0
3) make QProcess use QFile::encodeName for its arguments (no-op right now)
4) make QCoreApplication parse its arguments using QFile::decodeName (no-op
   right now)
5) do the same for Laszlo's command-line parser class

Later, we can decide whether to add escaping to those functions.

However, I cannot agree with bringing the setter functions back. I do agree
with removing them completely, though.
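
To make the shifting idea concrete, here is a rough sketch in plain C++ (no Qt) of what escaping inside those functions might look like. Everything in it is an assumption for illustration: U+F8FF is picked arbitrarily from the private use area, the shift applies per path component, and decodeName / encodeName are stand-alone stand-ins working on std::string and std::u32string, not the real QFile functions. A component that is not valid UTF-8, or that happens to start with the escape itself, is shown as the escape followed by its bytes mapped one-to-one to U+0000..U+00FF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: U+F8FF from the private use area; the thread does not pick one.
constexpr char32_t kEscape = 0xF8FF;

// Strict UTF-8 decoder for one path component. Returns false on malformed,
// overlong, surrogate or out-of-range input.
bool decodeUtf8(const std::string &in, std::u32string &out)
{
    out.clear();
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp, min;
        std::size_t extra;
        if (b < 0x80)                { cp = b;        extra = 0; min = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; min = 0x10000; }
        else return false;
        if (i + extra >= in.size()) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        out += cp;
        i += extra + 1;
    }
    return true;
}

void appendUtf8(std::string &out, char32_t cp)
{
    if (cp < 0x80) { out += char(cp); return; }
    if (cp < 0x800) { out += char(0xC0 | (cp >> 6)); }
    else {
        if (cp < 0x10000) { out += char(0xE0 | (cp >> 12)); }
        else { out += char(0xF0 | (cp >> 18)); out += char(0x80 | ((cp >> 12) & 0x3F)); }
        out += char(0x80 | ((cp >> 6) & 0x3F));
    }
    out += char(0x80 | (cp & 0x3F));
}

// 8-bit name -> displayable string; undecodable components get the escape.
std::u32string decodeName(const std::string &raw)
{
    std::u32string result;
    for (std::size_t start = 0;;) {
        const std::size_t end = raw.find('/', start);
        const std::string part =
            raw.substr(start, end == std::string::npos ? end : end - start);
        std::u32string decoded;
        if (decodeUtf8(part, decoded) && (decoded.empty() || decoded[0] != kEscape)) {
            result += decoded;
        } else {
            result += kEscape;  // shift: the rest of the component is raw bytes
            for (unsigned char b : part) result += static_cast<char32_t>(b);
        }
        if (end == std::string::npos) return result;
        result += U'/';         // a slash unshifts back to normal
        start = end + 1;
    }
}

// Inverse of decodeName(). A component that starts with the escape but holds
// code points above U+00FF cannot be raw bytes, so it is encoded as UTF-8.
std::string encodeName(const std::u32string &name)
{
    std::string out;
    for (std::size_t start = 0;;) {
        const std::size_t end = name.find(U'/', start);
        const std::u32string part =
            name.substr(start, end == std::u32string::npos ? end : end - start);
        bool shifted = !part.empty() && part[0] == kEscape;
        for (std::size_t i = 1; shifted && i < part.size(); ++i)
            shifted = part[i] <= 0xFF;
        if (shifted) {
            for (std::size_t i = 1; i < part.size(); ++i) out += static_cast<char>(part[i]);
        } else {
            for (char32_t cp : part) appendUtf8(out, cp);
        }
        if (end == std::u32string::npos) return out;
        out += '/';
        start = end + 1;
    }
}
```

This round-trips arbitrary bytes through the displayable form, but limitation d) is still visible: the escaped spelling of a name that is valid UTF-8 encodes to the same bytes as its normal spelling, so two different strings can name one file.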

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
_______________________________________________
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development