On 12 Aug 01, at 14:04, Jarkko Hietaniemi wrote:
> On Sun, Aug 12, 2001 at 09:08:28PM +0200, Philip Newton wrote:
> > On 12 Aug 01, at 9:34, Jarkko Hietaniemi wrote:
> >
> > > > I'm generally unhappy with the idea of literal byte values over 127 coming
> > > > up at all in POD files, but it's occasionally quite unavoidable, because
> > > > you can't use E<...>'s in verbatim paragraphs.
> > >
> > > Well, people may want to write pods in their native languages.
> >
> > People discussed this a bit when the initiative to translate Perl's
> > documentation into German came up. Most people weren't terribly keen on
> > writing stuff like "@farben = qw(grE<uuml>n blE<auml>ulich
> > weiE<szlig>)" (or "hyvE<auml>E<auml> pE<auml>ivE<auml>E<auml>", for
> > those that prefer that sort of thing).
>
> How keen people are to having to produce UTF-8 encoded Unicode instead?

I didn't ask, but to me putting "raw" UTF-8 in seems the lesser of two
evils as regards readability, compared with E<auml> all over the place.
(And people who don't use Latin-1 but iso-8859-x for x != 1 would have
problems anyway: their native letters may not even have nice readable
E<blabla> names, so they'd have to resort to E<1234>, which is even
less readable.)
> > So what should happen with literal byte values over 127?

What *should* happen with them, then? The way I read Sean M. Burke's
reply, the rule is "starts with BOM means UTF-16, otherwise some
undefined encoding, presumably the platform native", which leaves no way
to specify UTF-8. Unless, say, UTF-8 is mandated, or we get the
possibility of putting a faux UTF-8 BOM at the beginning of the file as
a charset signature.
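
To make the signature idea concrete, here is a rough sketch of what such
sniffing might look like. It is only an illustration under my own
assumptions, not anything a POD parser actually does: the function name
guess_pod_encoding is made up, and it is written in Python purely to keep
it short.

```python
import codecs

def guess_pod_encoding(head: bytes) -> str:
    """Guess a file's encoding from its first bytes (hypothetical helper)."""
    # EF BB BF: the "faux" UTF-8 BOM used as a charset signature
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # FE FF: UTF-16, big-endian
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # FF FE: UTF-16, little-endian
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    # No signature at all: undefined, presumably the platform native encoding
    return "unknown"
```

Note that a file containing raw Latin-1 bytes and no signature falls
through to "unknown", which is exactly the case that is left undefined.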
Cheers,
Philip
--
Philip Newton <[EMAIL PROTECTED]>