Hi POD people,

There's been a discussion on #metacpan about non-ASCII characters in POD being rendered incorrectly on the metacpan.org web site.

The short story is that some people use UTF-8 characters in their POD without including:

    =encoding utf8

Apparently the metacpan tool chain assumes Latin-1 encoding, but with the right encoding declaration the characters would be rendered correctly.

The latest perlpodspec seems to imply that ASCII is the default and that anything else should have an =encoding declaration. In the implementation notes section it also suggests a heuristic: check whether the first high-bit byte sequence is valid as UTF-8, and default to UTF-8 if so and to Latin-1 otherwise.

This raises two issues:

1) Pod::Simple (as used by metacpan) does not seem to implement this heuristic
2) We need to educate people who are not aware of the =encoding command

My thoughts on the second issue are that we could modify Pod::Simple to 'whine' if it sees non-ASCII bytes but no =encoding. This in turn would cause Test::Pod to pick up the error and help people fix it.

I'd be happy to look at implementing both these things if it's agreed they're a good idea.

Regards
Grant
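
P.S. To make the two ideas concrete, here is a rough sketch of the logic as I read it. It is written in Python purely for illustration and is not Pod::Simple code; the names guess_encoding and check_pod are hypothetical, and the real perlpodspec wording may call for details this ignores (a byte order mark, for instance).

```python
import re

# Matches an "=encoding NAME" command at the start of a line (assumed syntax).
ENCODING_RE = re.compile(rb"^=encoding[ \t]+(\S+)", re.MULTILINE)


def guess_encoding(pod: bytes) -> str:
    """Guess the encoding from the first high-bit byte sequence only."""
    for i, byte in enumerate(pod):
        if byte < 0x80:
            continue  # plain ASCII, keep scanning
        # Work out how long a UTF-8 sequence starting with this byte would be.
        if 0xC2 <= byte <= 0xDF:
            length = 2
        elif 0xE0 <= byte <= 0xEF:
            length = 3
        elif 0xF0 <= byte <= 0xF4:
            length = 4
        else:
            return "latin-1"  # not a legal UTF-8 lead byte
        try:
            pod[i:i + length].decode("utf-8")  # strict decode of that one sequence
        except UnicodeDecodeError:
            return "latin-1"
        return "utf-8"
    return "ascii"  # no high-bit bytes at all


def check_pod(pod: bytes):
    """Return (encoding, warnings); warn on non-ASCII bytes with no =encoding."""
    declared = ENCODING_RE.search(pod)
    if declared:
        return declared.group(1).decode("ascii", "replace"), []
    guessed = guess_encoding(pod)
    warnings = []
    if guessed != "ascii":
        warnings.append(
            f"non-ASCII bytes but no =encoding declaration (guessing {guessed})"
        )
    return guessed, warnings
```

Calling check_pod on a document that contains UTF-8 text but no =encoding line returns the guess "utf-8" plus one warning, which is roughly the 'whine' I have in mind for Test::Pod to report. An explicit declaration always wins and produces no warning.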
