On Fri, 2012-04-27 at 09:17 -0700, David E. Wheeler wrote:
> On Apr 27, 2012, at 12:10 AM, Grant McLean wrote:
>
> > OK, so I went ahead and implemented both the warning and the heuristic
> > to guess Latin-1 vs UTF-8 (only when no encoding was specified). The
> > resulting patch is here:
> >
> > https://github.com/theory/pod-simple/pull/26
>
> I like this, but wonder if maybe it shouldn't be consistent? That is,
> if you see more than one of these in a single document, and one can be
> output as UTF-8 and the other can’t, would the resulting output have
> mixed encodings? IOW, should it not perhaps use the encoding it
> determined for the first one of these it finds in a document?

I'm not sure I quite understand what you're saying. The first time a
non-ASCII byte is encountered, the code will 'fire' and apply the
heuristic to set an encoding. Once the encoding is set, the code won't
be called again.

The perlpodspec seems pretty clear that a POD document containing
different encodings should be considered an error.

Regards
Grant
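
To illustrate the "fire once" behaviour described above, here is a minimal sketch in Python. It is not the Pod::Simple code from the pull request: the class name `EncodingGuesser`, its methods, and the exact rule used (treat the line as UTF-8 if it decodes cleanly, otherwise fall back to Latin-1) are assumptions made for illustration only.

```python
from typing import Optional


class EncodingGuesser:
    """Hypothetical helper, not part of Pod::Simple."""

    def __init__(self, declared: Optional[str] = None):
        # None models a document with no declared encoding.
        self.encoding = declared

    @staticmethod
    def _guess(line: bytes) -> str:
        # Assumed heuristic: a line that is valid UTF-8 is taken as UTF-8,
        # anything else is taken as Latin-1.
        try:
            line.decode("utf-8")
            return "utf-8"
        except UnicodeDecodeError:
            return "latin-1"

    def feed(self, line: bytes) -> str:
        if self.encoding is None:
            if all(b < 0x80 for b in line):
                # Pure ASCII: nothing to decide yet.
                return line.decode("ascii")
            # First non-ASCII byte seen: guess once and remember the result.
            self.encoding = self._guess(line)
        # From here on the stored encoding is used; no further guessing.
        return line.decode(self.encoding)
```

In this sketch, a document whose first non-ASCII line is valid UTF-8 is decoded as UTF-8 from then on, so a later Latin-1 line raises `UnicodeDecodeError` instead of producing mixed output. That mirrors the point that a document mixing encodings is an error, although how the real code reports it may differ.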
