Blake Winton <[EMAIL PROTECTED]> wrote: [snip] > Um, what more data do we need for this use-case? I'm not going to > suggest an API, other than it would be nice if I didn't have to manually > figure out/hard code all the encodings. (It's my belief that I will > currently have to do that, or at least special-case XML, to read the > encoding attribute.) Oh, and it would be particularly horrible if I > output a shell script in UTF-8, and it included the BOM, since I believe > that would break the "magic number" of "#!".
Use the XML tag/attribute "<?xml ... encoding="..." ?> to discover the encoding and assume utf-8 otherwise as per spec: http://www.w3.org/TR/2000/REC-xml-20001006#NT-EncodingDecl Does bash natively support utf-8? Is there a bash equivalent to Python coding: directives? You may be attempting to fix a problem that doesn't exist. > Yeah, see, at a business level, I really need to process those all in > the same way, and it would be annoying to have to write code to handle > them all differently. So you, or anyone else, can write a module for discovering the encoding used for a particular file based on XML tags, Python coding: directives, etc. It could include an extensible registry, and if it is used enough, could be included in the Python standard library. - Josiah _______________________________________________ Python-3000 mailing list [email protected] http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
