Package: libhtml-parser-perl
Version: 3.31

While trying this version with SpamAssassin, I seem to be getting some odd attributes when IMG tags are parsed. Not all messages do this; only some.

Here's some input HTML:

    <TABLE cellSpacing=0 cellPadding=0 width=560 align=center border=0>
    <TBODY>
    <TR vAlign=top>
    <TD>
    <DIV align=center>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=184 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_01.gif" width=324 border=0></A>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=184 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_02.jpg" width=236 border=0></A><BR>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=232 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_03.gif" width=560 border=0></A><BR>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=214 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_04.gif" width=560 border=0></A><BR>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=193 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_05.gif" width=560 border=0></A><BR>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=129 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_06.gif" width=310 border=0></A> <BR>
    <A href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" target=_blank><IMG height=60 src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_07.gif" width=302 border=0></A>
    </DIV></TD></TR></TBODY></TABLE>

Here's the "print" code inside the tag parser function:

    if ($tag =~ m/img/i) {
        print STDERR "html_tests: found image '$tag' (";
        my ($key, $val);
        # dump every attribute name/value pair the parser handed us
        while (($key, $val) = each %$attr) {
            print STDERR " $key=$val;";
        }
        print STDERR " )\n";
    }

Here's the output:

    html_tests: found image 'img' ( border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_01.gif; height4=height4; width24=width24; )
    html_tests: found image 'img' ( border=0; width#6=width#6; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_02.jpg; height4=height4; )
    html_tests: found image 'img' ( border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_03.gif; widthv0=widthV0; height#2=height#2; )
    html_tests: found image 'img' ( border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_04.gif; height!4=height!4; widthv0=widthV0; )
    html_tests: found image 'img' ( border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_05.gif; height3=height3; widthv0=widthV0; )
    html_tests: found image 'img' ( width10=width10; border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_06.gif; height9=height9; )
    html_tests: found image 'img' ( width02=width02; border=0; src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_07.gif; height`=height`; )
    html_tests: found image 'img' ( src=http://64.119.218.137/cgi-bin/view?v=7&mQ62&[EMAIL PROTECTED]; )

As you can see, the "width" and "height" attributes don't seem to get parsed correctly: each one comes back as a mangled name with no separate value, while "border" and "src" are fine. I've gotten even weirder results on other messages:

    html_tests: found image 'img' ( dkf=dkf; pup=pup; ehxyky=ehxyky; ey=ey; toemo=toemo; border=0; src=http://[EMAIL PROTECTED]/img/img.php?a=1&i=lj.gif; youaqpxlvvmvgkl=youaqpxlvvmvgkl; aqikzxo=aqikzxo; tqt=tqt; j=j; zr=zr; wrcdmzww=wrcdmzww; )

Any thoughts?

Brian ( [EMAIL PROTECTED] )
-------------------------------------------------------------------------------
Many times the difference between failure and success is doing something nearly right... or doing it exactly right.