Re: Assume CP1252
On 01/12/2015 06:25 AM, Shawn H Corey wrote:

On Sun, 11 Jan 2015 20:57:26 -0700 Karl Williamson pub...@khwilliamson.com wrote:

To be clear, I think that assuming 1252 when there is no =encoding line is a good idea. But I'm leery of overriding an actual =encoding line.

Agreed.

I could possibly be persuaded, if someone wants to make it, by the argument that 'latin1' is kind of colloquial, and someone using it may very well not be familiar with the possibility that they really mean cp1252. But, if so, there needs to be a way for someone to say "I really mean it" and not be overridden by us. Perhaps that could be =encoding ISO-8859-1.

Q: What if there is more than one =encoding line? Does it switch encoding part way thru a POD?

Error while formatting with Pod::Perldoc::ToMan: Nested processed encoding. at /usr/share/perl/5.18/Pod/Simple/BlackBox.pm line 380.
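The two questions in this message (what to assume when there is no =encoding line, and what to do when there is more than one) can be illustrated with a rough sketch. It is written in Python only to keep it self-contained; it is not how Pod::Simple is implemented. The names `pick_encoding` and `decode_pod` are hypothetical, the unconditional CP1252 fallback is a simplifying assumption, and the warning wording follows the message David Wheeler describes in this thread.

```python
import re

# Hypothetical helper, not Pod::Simple's API: match every "=encoding NAME"
# directive that starts a line in the raw POD bytes.
ENCODING_RE = re.compile(rb"^=encoding[ \t]+(\S+)", re.MULTILINE)


def pick_encoding(raw, default="cp1252"):
    """Return (encoding, warnings) for raw POD bytes.

    Assumption for this sketch: with no =encoding line we fall back to
    CP1252; with several, the first one wins and a warning is recorded
    instead of raising a fatal error.
    """
    declared = [m.group(1).decode("ascii", "replace")
                for m in ENCODING_RE.finditer(raw)]
    warnings = []
    if not declared:
        return default, warnings
    if len(declared) > 1:
        warnings.append("Cannot have multiple =encoding directives")
    return declared[0], warnings


def decode_pod(raw):
    """Decode raw POD bytes with the encoding chosen by pick_encoding."""
    encoding, warnings = pick_encoding(raw)
    return raw.decode(encoding), warnings
```

With this sketch an explicit `=encoding ISO-8859-1` line is honored as written, which matches the "I really mean it" escape hatch proposed above.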
Re: Assume CP1252
On Jan 12, 2015, at 11:18 AM, Karl Williamson pub...@khwilliamson.com wrote:

To be clear, I think that assuming 1252 when there is no =encoding line is a good idea. But I'm leery of overriding an actual =encoding line.

Agreed.

I’m okay with this.

I could possibly be persuaded, if someone wants to make it, by the argument that 'latin1' is kind of colloquial, and someone using it may very well not be familiar with the possibility that they really mean cp1252. But, if so, there needs to be a way for someone to say "I really mean it" and not be overridden by us. Perhaps that could be =encoding ISO-8859-1.

If we *were* to assume CP1252 for Latin-1, I would want it to be consistent with the precedent set by the W3C. Sean supplied this link:

http://www.w3.org/TR/encoding/#names-and-labels

Here’s the list of labels that they translate to Windows-1252:

    ansi_x3.4-1968
    ascii
    cp1252
    cp819
    csisolatin1
    ibm819
    iso-8859-1
    iso-ir-100
    iso8859-1
    iso88591
    iso_8859-1
    iso_8859-1:1987
    l1
    latin1
    us-ascii
    windows-1252
    x-cp1252

In their interpretation, no label ever resolves to iso-8859-1. Pretty interesting.

Q: What if there is more than one =encoding line? Does it switch encoding part way thru a POD?

Error while formatting with Pod::Perldoc::ToMan: Nested processed encoding. at /usr/share/perl/5.18/Pod/Simple/BlackBox.pm line 380.

I recently changed this error, because that was a pretty useless message. The new message is "Cannot have multiple =encoding directives". Also, it is no longer fatal, but is passed to scream(), which means it would be a failure for Test::Pod, but won’t break tools that generate docs.

http://github.com/theory/pod-simple/commit/cb884b5

Best,

David
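For illustration, the W3C-style label resolution described above could look roughly like this. It is a sketch in Python rather than anything from Pod::Simple; `WINDOWS_1252_LABELS` and `resolve_label` are hypothetical names, and the table holds only the labels quoted in this message, not the full W3C list for other encodings.

```python
# Labels copied from the list quoted in this message. The table and the
# function name are hypothetical, chosen for this sketch only.
WINDOWS_1252_LABELS = frozenset("""
    ansi_x3.4-1968 ascii cp1252 cp819 csisolatin1 ibm819 iso-8859-1
    iso-ir-100 iso8859-1 iso88591 iso_8859-1 iso_8859-1:1987 l1 latin1
    us-ascii windows-1252 x-cp1252
""".split())


def resolve_label(label):
    """Map an encoding label to 'windows-1252', or None if not listed.

    Labels are trimmed and compared case-insensitively, so 'Latin1' and
    ' ISO-8859-1 ' both resolve the same way as their lowercase forms.
    """
    normalized = label.strip().lower()
    if normalized in WINDOWS_1252_LABELS:
        return "windows-1252"
    return None
```

Under this table `resolve_label("ISO-8859-1")` returns `"windows-1252"`, which is exactly the behavior that would need an opt-out if an author really wants ISO-8859-1.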
Re: Assume CP1252
On 01/12/2015 12:37 PM, David E. Wheeler wrote:

On Jan 12, 2015, at 11:18 AM, Karl Williamson pub...@khwilliamson.com wrote:

To be clear, I think that assuming 1252 when there is no =encoding line is a good idea. But I'm leery of overriding an actual =encoding line.

Agreed.

I’m okay with this.

I could possibly be persuaded, if someone wants to make it, by the argument that 'latin1' is kind of colloquial, and someone using it may very well not be familiar with the possibility that they really mean cp1252. But, if so, there needs to be a way for someone to say "I really mean it" and not be overridden by us. Perhaps that could be =encoding ISO-8859-1.

If we *were* to assume CP1252 for Latin-1, I would want it to be consistent with the precedent set by the W3C.

That sounds reasonable.

Sean supplied this link:

http://www.w3.org/TR/encoding/#names-and-labels

Here’s the list of labels that they translate to Windows-1252:

    ansi_x3.4-1968
    ascii
    cp1252
    cp819
    csisolatin1
    ibm819
    iso-8859-1
    iso-ir-100
    iso8859-1
    iso88591
    iso_8859-1
    iso_8859-1:1987
    l1
    latin1
    us-ascii
    windows-1252
    x-cp1252

In their interpretation, no label ever resolves to iso-8859-1. Pretty interesting.

I ran across this link, but didn't see what action was taken on it:

http://www.w3.org/TR/newline

Q: What if there is more than one =encoding line? Does it switch encoding part way thru a POD?

Error while formatting with Pod::Perldoc::ToMan: Nested processed encoding. at /usr/share/perl/5.18/Pod/Simple/BlackBox.pm line 380.

I recently changed this error, because that was a pretty useless message. The new message is "Cannot have multiple =encoding directives". Also, it is no longer fatal, but is passed to scream(), which means it would be a failure for Test::Pod, but won’t break tools that generate docs.

http://github.com/theory/pod-simple/commit/cb884b5

Best,

David
Re: Assume CP1252
On 01/12/2015 12:49 PM, David E. Wheeler wrote:

On Jan 12, 2015, at 11:46 AM, Karl Williamson pub...@khwilliamson.com wrote:

I ran across this link, but didn't see what action was taken on it:

http://www.w3.org/TR/newline

Pardon my ignorance. Does that mean that `s/Latin-1/CP1252/g` could be a mistake on EBCDIC?

David

Yes, that's essentially what I meant when I said in an earlier email that NEL is THE new-line character on os390, which generally runs using EBCDIC. The code point for NEL (0x85) is a horizontal ellipsis in cp1252, not a next line, but on some platforms, like os390, it means next line. This is a conflict.

However, now that I think about it, when I look at os390 runs, I rarely see NELs. Maybe there is a filter that translates them to \n before the pod sees it. But sometimes I do see NEL all over the place and no \n. I'll ask on the perl-mvs list about this.
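The conflict over byte 0x85 can be seen in a small Python sketch (Python is used here only for illustration; `describe_byte_0x85` and `line_count` are hypothetical helper names). It covers only the Latin-1 versus CP1252 reading of that byte and makes no claim about how os390 or any EBCDIC code page stores NEL.

```python
import unicodedata


def describe_byte_0x85():
    """Show how the single byte 0x85 decodes under each encoding."""
    raw = b"\x85"
    as_latin1 = raw.decode("latin-1")   # U+0085, the C1 control NEL
    as_cp1252 = raw.decode("cp1252")    # U+2026
    return {
        "latin-1": "U+%04X" % ord(as_latin1),
        "cp1252": "U+%04X %s" % (ord(as_cp1252), unicodedata.name(as_cp1252)),
    }


def line_count(raw, encoding):
    """Count lines after decoding; str.splitlines() treats U+0085 as a
    line boundary, so the same bytes can yield different line counts."""
    return len(raw.decode(encoding).splitlines())
```

Decoding `b"one\x85two"` as Latin-1 gives two lines, while decoding it as CP1252 gives one line containing an ellipsis, which is the kind of silent change a blanket Latin-1 to CP1252 substitution could cause.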