On Sun, 12 Aug 2001, Philip Newton wrote:

> On 11 Aug 01, at 20:13, Jarkko Hietaniemi wrote:
> 
> > >                   E<0n>         ASCII character number n (octal)
> > >                   E<n>          ASCII character number n (decimal)
> > >                   E<0xn>        ASCII character number n (hex)
> > >                   E<html>       Some non-numeric HTML entity, such
> > >                                   as E<Agrave>
> > >                 (Older pod formatters might not recognize octal or
> > >                 hex numeric escapes.)
> > > -----------------
> > > 
> > > Is there any way you might be convinced that the acronym
> > > "ASCII" is not completely accurate here?  How about
> > > doing an edit and replacing s/ASCII/coded/g in that part?
> > 
> > Good point.  Let's stop kicking the EBCDIC people :-)
> > Even saying Latin-1 here would be wrong, it is in truth
> > whatever happens to be the native coding.  (On > 255 codes
> > I think we can assume Unicode.)
> 
> Is it now? This is supposed to be the specification; it can mandate 
> whatever it wants, including "E<234> is to be the Latin-1 character 
> with the code point 234 and *not* whatever happens to be at code point 
> 234 in your current code page". Which makes a little sense if you 
> consider all E<number> to be Unicode code points, since in 32..126 they 
> agree with ASCII and in 160..255 they agree with Latin-1.
> 
> Of course, this would mean that someone composing POD on a system whose 
> native character set is not Latin-1 needs to refer to a Latin-1 
> character set listing if he wants to use numbers in E<> references; but 
> IMO that's acceptable and less confusing than saying "E<234> gets 
> rendered as whatever your formatter feels like", effectively destroying 
> the usefulness of E<nnn> for nnn > 126 (since you can't count on what 
> will become of it on your reader's system).

The incompatibilities between coded character sets can hardly be blamed
upon an attempt to specify what pod is legal and how to expect it to be
transformed.  A spec that says E<number> will be rendered as the
character at that number in your coded character set is still useful, in
my opinion.  A further warning that not all computers employ the same
coded character set would be a useful reminder to folks who might not
otherwise be aware of it.  An example illustrating that E<234> would be
transformed into LATIN SMALL LETTER E WITH CIRCUMFLEX in an ISO 8859-1
coded character set would also be useful.  Again, a reminder that a
large number of computer systems do not employ the ISO 8859-1 coded
character set could be helpful to folks who might otherwise not know
that.
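
To make the E<234> case concrete, here is a rough sketch in Python.  It
is not taken from any pod formatter; render_native is a hypothetical
helper name, and I am assuming that "latin-1" and "cp037" (one common
EBCDIC code page) are fair stand-ins for two different native coded
character sets.  It only shows that the same number names different
characters depending on which coded character set is consulted.

```python
import unicodedata


def render_native(n, codec):
    """Interpret n as a code point in the given 8-bit coded character set.

    Hypothetical helper for illustration only: it mimics a formatter that
    maps E<n> to "whatever character sits at position n natively".
    """
    return bytes([n]).decode(codec)


# Assumption: these two codecs stand in for an ISO 8859-1 system and an
# EBCDIC system respectively.
for codec in ("latin-1", "cp037"):
    ch = render_native(234, codec)
    # unicodedata.name takes a default for characters without a name.
    print(codec, repr(ch), unicodedata.name(ch, "<unnamed>"))
```

On the ISO 8859-1 side this reports LATIN SMALL LETTER E WITH CIRCUMFLEX;
on the EBCDIC side the same number lands on a different character, which
is exactly the reminder the spec could carry.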

Peter Prymmer

