On Wed, Dec 7, 2016 at 12:39 PM, Chris Angelico <ros...@gmail.com> wrote:
> Note that two of the Beauty Stone tracks include quotes as well as
> question marks. How do you identify those? Let's say you want to play
> one of these in VLC, and then maybe you decide that the track in
> Pirates of Penzance/MusicOnly is slightly mis-cropped, so you rebuild
> it from the one in the parent directory. How does that work on
> Windows? If you say "it doesn't", then (a) you have taken away choice
> on a fundamental level, (b) you have your head in the sand, and (c)
> you still haven't solved the problem of percent signs, carets, and so
> on, which are perfectly legal in file names, but have meaning to the
> shell.

The five wildcard characters ("<>*?) aren't allowed in the names of
files and directories -- at least not by any Windows filesystem that
I've used. This makes it easy for a filesystem to support globbing in
its implementation of NtQueryDirectoryFile. Filenames also can't
contain control characters, slash, backslash, pipe, or colon (the
latter delimits a fully-qualified NTFS name, e.g.
filename:streamname:streamtype). NTFS stream names are less limited.
They only disallow NUL, slash, backslash, and colon.

The filesystem runtime library provides the macro
FsRtlIsAnsiCharacterLegal [1], among other related macros, which
allows filesystem drivers to be consistent with FAT or NTFS. To my
knowledge this is voluntary, but going against the grain is only
asking for headaches.

[1]: https://msdn.microsoft.com/en-us/library/ff546731

This macro depends on the array FsRtlLegalAnsiCharacterArray, which
indicates whether each ASCII character is valid for a fixed set of
filesystems. The flag values are as follows:

    0x01 - FAT
    0x02 - OS/2 HPFS
    0x04 - NTFS/Stream
    0x08 - Wildcard
    0x10 - Stream

Here's the array dumped from the kernel debugger. For convenience I've
added the printable ASCII characters above each line.

    lkd> db poi(nt!FsRtlLegalAnsiCharacterArray)
    fffff801`fc0e8550  00 10 10 10 10 10 10 10-10 10 10 10 10 10 10 10
    fffff801`fc0e8560  10 10 10 10 10 10 10 10-10 10 10 10 10 10 10 10

                          !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
    fffff801`fc0e8570  17 07 18 17 17 17 17 17-17 17 18 16 16 17 07 00

                       0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
    fffff801`fc0e8580  17 17 17 17 17 17 17 17-17 17 04 16 18 16 18 18

                       @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
    fffff801`fc0e8590  17 17 17 17 17 17 17 17-17 17 17 17 17 17 17 17

                       P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
    fffff801`fc0e85a0  17 17 17 17 17 17 17 17-17 17 17 16 00 16 17 17

                       `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
    fffff801`fc0e85b0  17 17 17 17 17 17 17 17-17 17 17 17 17 17 17 17

                       p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~
    fffff801`fc0e85c0  17 17 17 17 17 17 17 17-17 17 17 17 10 17 17 17

By this table, NTFS stream names are the least restricted, allowing
everything except NUL, slash, and backslash. For example:

    >>> import win32file
    >>> open('test:\x01|*?<>"', 'w').close()
    >>> win32file.FindStreams('test')
    [(0, '::$DATA'), (0, ':\x01|*?<>":$DATA')]

The first stream listed above is the anonymous data stream
"test::$DATA", which is the same as simply opening "test".

Technically, by the above table ":" is allowed in NTFS names, but its
use is reserved as the delimiter of the fully-qualified name. For
example, the fully-qualified name of a directory is
"dirname:$I30:$INDEX_ALLOCATION". The stream name is "$I30" and the
stream type is "$INDEX_ALLOCATION". A directory can also have multiple
named $DATA streams, because NTFS is weird like that.
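
To make the dump easier to read, here is a small Python sketch that decodes the bytes copied from it and lists which printable ASCII characters lack a given flag. The constant and function names (FAT, NTFS, having, lacking, and so on) are made up for this illustration and are not the kernel's names. It also assumes that a character counts as acceptable for a purpose when any of the relevant flag bits is set; that reading is my assumption, though it does reproduce the character lists given above.

```python
# Flag values as listed in the post; the names are labels for this sketch only.
FAT, HPFS, NTFS, WILDCARD, STREAM = 0x01, 0x02, 0x04, 0x08, 0x10

# The eight rows of the debugger dump, covering characters 0x00 through 0x7F.
DUMP = """
00 10 10 10 10 10 10 10-10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10-10 10 10 10 10 10 10 10
17 07 18 17 17 17 17 17-17 17 18 16 16 17 07 00
17 17 17 17 17 17 17 17-17 17 04 16 18 16 18 18
17 17 17 17 17 17 17 17-17 17 17 17 17 17 17 17
17 17 17 17 17 17 17 17-17 17 17 16 00 16 17 17
17 17 17 17 17 17 17 17-17 17 17 17 17 17 17 17
17 17 17 17 17 17 17 17-17 17 17 17 10 17 17 17
"""

# Drop the "-" column separator and all whitespace, then parse the hex digits.
TABLE = bytes.fromhex("".join(DUMP.replace("-", " ").split()))


def having(flags):
    """ASCII characters (0x00-0x7F) with at least one bit of `flags` set."""
    return "".join(chr(i) for i, b in enumerate(TABLE) if b & flags)


def lacking(flags):
    """Printable ASCII characters (0x20-0x7E) with no bit of `flags` set."""
    return "".join(chr(i) for i in range(0x20, 0x7F) if not TABLE[i] & flags)


if __name__ == "__main__":
    print("wildcards:           ", having(WILDCARD))
    print("not FAT-legal:       ", lacking(FAT))
    print("not NTFS-legal:      ", lacking(NTFS))
    print("not legal in streams:", lacking(NTFS | STREAM))
```

Read this way, the table marks exactly five characters as wildcards, and the only printable characters with neither the 0x04 nor the 0x10 bit are slash and backslash, which agrees with the stream-name example above. Colon shows up as NTFS-legal here because the table does not encode its role as the name delimiter.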