This look okay?
--- perlpod.pod~core Mon Sep 1 22:15:26 2003
+++ perlpod.pod Fri Sep 5 02:39:04 2003
@@ -269,6 +269,24 @@
 normal formatting (e.g., may not be a normal-use paragraph, but might
 be for formatting as a footnote).
 
+
+=item C<=encoding I<encodingname>>
+
+This command is used for declaring the encoding of a document. Most
+users won't need this; but if your encoding isn't US-ASCII or Latin-1,
+then put a C<=encoding I<encodingname>> command early in the document so
+that pod formatters will know how to decode the document. For
+I<encodingname>, use a name recognized by the L<Encode::Supported>
+module. Examples:
+
+  =encoding utf8
+
+  =encoding koi8-r
+
+  =encoding ShiftJIS
+
+  =encoding big5
+
 =back
 
 And don't forget, when using any command, that the command lasts up
--- perlpodspec.pod~core Mon Sep 1 22:15:26 2003
+++ perlpodspec.pod Fri Sep 5 02:52:18 2003
@@ -332,6 +332,30 @@
 to use "=for formatname text..." to express "text..." as a verbatim
 paragraph.
 
+=item "=encoding encodingname"
+
+This command, which should occur early in the document (at least
+before any non-USASCII data!), declares that this document is
+encoded in the encoding I<encodingname>, which must be
+an encoding name that L<Encode> recognizes. (Encode's list
+of supported encodings, in L<Encode::Supported>, is useful here.)
+If the Pod parser cannot decode the declared encoding, it
+should emit a warning and may abort parsing the document
+altogether.
+
+A document having more than one "=encoding" line should be
+considered an error. Pod processors may silently tolerate this if
+the not-first "=encoding" lines are just duplicates of the
+first one (e.g., if there's a "=encoding utf8" line, and later on
+another "=encoding utf8" line). But Pod processors should complain if
+there are contradictory "=encoding" lines in the same document
+(e.g., if there is a "=encoding utf8" early in the document and
+"=encoding big5" later). Pod processors that recognize BOMs
+may also complain if they see an "=encoding" line
+that contradicts the BOM (e.g., if a document with a UTF16LE BOM
+has an "=encoding shiftjis" line).
+
+
 =back
 
 If a Pod processor sees any command other than the ones listed
@@ -569,12 +593,14 @@
 being UTF-8 if the first highbit byte sequence in the file seems
 valid as a UTF-8 sequence, or otherwise as Latin-1.
 
-Future versions of this specification may specify
-how Pod can accept other encodings. Presumably treatment of other
-encodings in Pod parsing would be as in XML parsing: whatever the
-encoding declared by a particular Pod file, content is to be
-stored in memory as Unicode characters.
+It is, however, good practice to explicitly declare the document
+as UTF8, with an "=encoding utf8" line, early in the document.
+At time of writing (2003), UTF8 is considered the normal encoding
+for Unicode; the UTF-16 encodings are not widely used,
+and support for them in various Perl tools is spotty.
 
+
+
 =item *
 
 The well known Unicode Byte Order Marks are as follows: if the
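
To make the proposed multiple-"=encoding" rule concrete, here is a rough sketch of the check a Pod processor might run. It is only an illustration under assumptions: it is written in Python rather than Perl, Python's codecs registry stands in for the Encode module's name lookup, and the names canonical() and check_encoding_lines() are made up for this example. It also skips the BOM comparison and does not verify that each command starts its own paragraph.

```python
import codecs


def canonical(name):
    """Return the canonical codec name, or None if the name is unknown.

    Assumption: Python's codec registry stands in for Perl's Encode here.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def check_encoding_lines(pod_lines):
    """Return (declared_encoding, warnings) for a list of Pod source lines."""
    declared = None
    warnings = []
    for lineno, line in enumerate(pod_lines, start=1):
        # Indented lines are verbatim text, so only column-0 commands count.
        if not line.startswith("=encoding"):
            continue
        parts = line.split()
        if parts[0] != "=encoding":
            continue  # some other command that merely shares the prefix
        if len(parts) < 2:
            warnings.append(f"line {lineno}: =encoding without a name")
            continue
        canon = canonical(parts[1])
        if canon is None:
            warnings.append(f"line {lineno}: cannot decode {parts[1]!r}")
        elif declared is None:
            declared = canon  # the first declaration wins
        elif canon != declared:
            warnings.append(
                f"line {lineno}: {parts[1]!r} contradicts earlier {declared!r}"
            )
        # An exact duplicate of the first declaration is silently tolerated.
    return declared, warnings
```

Names are compared after canonicalization, so "utf8" followed by "UTF-8" counts as a duplicate rather than a contradiction; whether a real Pod processor should be that lenient is a separate question.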
