Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

>> Which metadata is that? I was sure we were talking about editors for
>> plain-text files, which don't have any sort of metadata declaring the
>> character encoding or anything else.
>
> There's always some metadata: either it comes from the filesystem
> itself (filenaming conventions or explicit storage of this metadata,
> including HTTP that is a filesystem supporting them, or MIME for
> emails), or it comes from information provided by the user in that
> editor, to instruct it about how to decode it, or it is implicit in the
> editor itself which offers no choice for it in its GUI or command
> line.

Suppose I have a file called 'karenina.txt' on my flash drive. Let's
assume we can trust from the .txt extension that it really is a text
file of some sort (that is metadata). Now, what encoding is this file
in? See Stephan's comment again about the editor doing charset
detection.

> As soon as a user needs to specify the filetype or file encoding
> somewhere that the filesystem does not provide itself as separately
> stored metadata, the user provides additional metadata. This is true
> when he also chooses a specific editor that handles a specific syntax
> or encoding (the metadata provided by the user consists in this choice
> of tool, even if it was inappropriate from a wrong guess or
> assumption).

Right, but you talked about "saving them as ASCII (i.e. saving this
charset information in the metadata)". This is explicit metadata, not
the implicit type that you're talking about now.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell

