On 22 Jun 2018, at 14:29 (-0400), Kevin A. McGrail wrote:
Hi All,
3.4 is not passing tests for me with the idn_dots.t and it appears to point
to a problem in P:M:S::get_uri_list. I'm bleary from looking at this for
three days. Can someone take a look at this?
If you modify the t/idn_dots to print the uri list from the generated
message in the test, it fails in 3.4 but passes in Trunk and in the 3.4.1
release. See below for output but basically there is a missing URI which
is why the test correctly fails.
I have made the test work by adding "use utf8" to the test script. That
only works around the underlying subtle bug rather than fixing it.
The breakage is only seen (so far) on the RedHat perl 5.16.3 packaged
for EL7 and derivatives. I believe that 5.16.x was the last major
release that did NOT work in UTF-8 by default unless "use utf8" was
explicitly declared. I have replicated the incorrect parse with the
spamassassin script
and a message with all-ascii URLs, so the problem is somewhere in the
spectacularly complicated RE that extracts URIs from the cooked text
array inside PerMsgStatus->get_uri_detail_list. Making matters worse, if
I run either t/idn_dots.t or spamassassin with the Perl debugger (-d)
the parse works.
Anyone who is still using an even older Perl could assist simply by
confirming that the 3.4 branch from SVN fails subtest 4 of t/idn_dots.t
if you remove or comment out the "use utf8" line I added to that file
today.
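
For anyone willing to try that, here is a rough sketch of the edit step. The
function name disable_use_utf8 is made up for illustration, and it assumes
the declaration sits on a line of its own in the form "use utf8;". It keeps
a backup next to the file so the change is easy to undo.

```shell
# Hypothetical helper: comment out any uncommented "use utf8;" line in a
# Perl file. The original is preserved alongside it as FILE.bak.
disable_use_utf8() {
    file=$1
    sed -i.bak -e 's/^\([[:space:]]*\)\(use utf8[[:space:]]*;\)/\1# \2/' "$file"
}

# Possible usage from the top of a built checkout of the 3.4 branch
# (assumes "perl Makefile.PL && make" has already been run):
#   disable_use_utf8 t/idn_dots.t
#   make test TEST_FILES=t/idn_dots.t TEST_VERBOSE=1
#   mv t/idn_dots.t.bak t/idn_dots.t    # restore the original afterwards
```

TEST_FILES and TEST_VERBOSE are the standard MakeMaker variables for running
a single test verbosely. Whether subtest 4 fails will depend on the Perl
version in use.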
It would be interesting to see if the problem would be solved by adding
"use utf8" to every .pm that had a "use bytes" declaration before 2017.
This is a bit of a shotgun approach but simpler than hunting for the
specific issue. I'd try it myself, but I'm basically on my last
stealable minute for the weekend already.
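
As a starting point for that experiment, the candidate modules could be
listed with something like the sketch below. The function name
list_use_bytes_modules is invented here, and the default directory "lib" is
an assumption about the tree layout. Since the idea concerns modules that
had "use bytes" before 2017, it would presumably need to be run against a
checkout from before that date.

```shell
# Hypothetical helper: print, in sorted order, every .pm file under DIR
# (default: lib) that contains an uncommented "use bytes;" declaration.
list_use_bytes_modules() {
    dir=${1:-lib}
    find "$dir" -type f -name '*.pm' -exec \
        grep -lE '^[[:space:]]*use[[:space:]]+bytes[[:space:]]*;' {} + | sort
}

# Possible usage from the top of a checkout:
#   list_use_bytes_modules lib
```

This only produces the list. Adding "use utf8" to each file and rerunning
t/idn_dots.t is still a manual step.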
--
Bill Cole
[email protected] or [email protected]
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steadier Work: https://linkedin.com/in/billcole