Doug Ewell scripsit:

> > Now suppose you have a UNIX filesystem, containing filenames in a
> > legacy encoding (possibly even more than one). If one wants to switch
> > to UTF-8 filenames, what is one supposed to do? Convert all filenames
> > to UTF-8?
>
> Well, yes. Doesn't the file system dictate what encoding it uses for
> file names? How would it interpret file names with "unknown" characters
> from a legacy encoding? How would they be handled in a directory
> search?
Windows filesystems do know what encoding they use. But a filename on a Unix(oid) file system is a mere sequence of octets, of which only 00 and 2F are interpreted. (Filenames containing 20, and especially 0A, are annoying to handle with standard tools, but not illegal.) How these octet sequences are translated to characters, if at all, is no concern of the file system's. Some higher-level tools, such as directory listers and shells, have hardwired assumptions, others have changeable assumptions, but all are assumptions.

--
John Cowan  [EMAIL PROTECTED]  www.reutershealth.com  www.ccil.org/~cowan
No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friends or of thine own were: any man's death diminishes me,
because I am involved in mankind, and therefore never send to know for
whom the bell tolls; it tolls for thee. --John Donne
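
To illustrate the point that a Unix filename is only a sequence of octets, here is a minimal Python sketch. The function names raw_names and interpretations, and the choice of candidate encodings, are assumptions made for illustration and come from nowhere in the thread. It assumes a Unix-like system where the file system stores names as uninterpreted octets.

```python
import os


def raw_names(directory="."):
    """Return directory entries as raw bytes, as the file system stores them."""
    # Passing a bytes path makes os.listdir return bytes names,
    # so no encoding assumption is applied to them.
    return os.listdir(os.fsencode(directory))


def interpretations(name, encodings=("utf-8", "latin-1", "koi8-r")):
    """Decode one octet sequence under several assumed encodings.

    The encodings tuple is a hypothetical list of guesses; the file
    system itself records nothing about which one (if any) is right.
    """
    result = {}
    for enc in encodings:
        try:
            result[enc] = name.decode(enc)
        except UnicodeDecodeError:
            result[enc] = None  # the octets are not valid under this assumption
    return result


if __name__ == "__main__":
    for name in raw_names("."):
        print(name, interpretations(name))
```

The same octets can be valid under one assumed encoding, invalid under another, and valid but different under a third, which is why any tool that displays such names is relying on an assumption rather than on information from the file system.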