Package: libhtml-parser-perl
Version: 3.31

While trying this version with SpamAssassin, I seem to be getting some
odd attributes when IMG tags are parsed.  Not all messages do this; only
some.

Here's some input HTML:

<TABLE cellSpacing=0 cellPadding=0 width=560 align=center border=0>
  <TBODY>
  <TR vAlign=top>
    <TD>
      <DIV align=center><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=184 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_01.gif"; 
      width=324 border=0></A><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=184 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_02.jpg"; 
      width=236 border=0></A><BR><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=232 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_03.gif"; 
      width=560 border=0></A><BR><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=214 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_04.gif"; 
      width=560 border=0></A><BR><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=193 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_05.gif"; 
      width=560 border=0></A><BR><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=129 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_06.gif"; 
      width=310 border=0></A> <BR><A 
      href="http://64.119.218.137/cgi-bin/clickthru?c=1147&m=5162&[EMAIL PROTECTED]" 
      target=_blank><IMG height=60 
src="http://64.119.208.20/ads/responsebase/digibino/2/binoc_07.gif"; 
      width=302 border=0></A> </DIV></TD></TR></TBODY></TABLE>


Here's the "print" code inside the tag parser function:

  if ($tag =~ m/img/i) {
    print STDERR "html_tests: found image '$tag' (";
    my($key,$val);
    while (($key,$val) = each %$attr) {
        print STDERR " $key=$val;";
    }
    print STDERR " )\n";
  }


Here's the output:

html_tests: found image 'img' ( border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_01.gif; height4=height4; 
width24=width24; )
html_tests: found image 'img' ( border=0; width#6=width#6; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_02.jpg; height4=height4; )
html_tests: found image 'img' ( border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_03.gif; widthv0=widthV0; 
height#2=height#2; )
html_tests: found image 'img' ( border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_04.gif; height!4=height!4; 
widthv0=widthV0; )
html_tests: found image 'img' ( border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_05.gif; height3=height3; 
widthv0=widthV0; )
html_tests: found image 'img' ( width10=width10; border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_06.gif; height9=height9; )
html_tests: found image 'img' ( width02=width02; border=0; 
src=http://64.119.208.20/ads/responsebase/digibino/2/binoc_07.gif; height`=height`; )
html_tests: found image 'img' ( src=http://64.119.218.137/cgi-bin/view?v=7&mQ62&[EMAIL 
PROTECTED]; )


As you can see, the "width" and "height" attributes don't seem to get
parsed correctly, but the others are fine.
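
One pattern in the output may be worth checking, though it is only a guess
at the cause: the mangled names look like what a quoted-printable decoder
would produce if it were run over the raw HTML. In quoted-printable, "="
followed by two hex digits is replaced by a single byte, so "width=324"
would turn into "width24" (0x32 is "2") and "width=560" into "widthV0"
(0x56 is "V"), while "border=0" has only one digit after the "=" and would
be left alone. The sketch below just feeds a few attribute strings from the
input HTML through decode_qp from the core MIME::QuotedPrint module; the
qp_mangle helper name is made up for illustration, and nothing here shows
where in the mail pipeline such a decode would actually be happening.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);

# Attribute text copied from the IMG tags in the input HTML above.
my @samples = qw(width=324 width=236 width=560 height=232 height=214
                 height=60 border=0);

# Hypothetical helper: run one string through a quoted-printable decoder.
# "=XX" with two hex digits becomes one byte; "=0" at the end of a string
# is left as-is because only one character follows the "=".
sub qp_mangle {
    my ($text) = @_;
    return decode_qp($text);
}

# Print each original string next to its decoded form for comparison
# with the attribute names in the parser output.
for my $sample (@samples) {
    printf "%-12s -> %s\n", $sample, qp_mangle($sample);
}
```

If the decoded strings match the odd attribute names, the parser would then
be seeing things like "widthV0" with no "=" at all, which would explain why
they show up as valueless attributes with the name repeated as the value.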


I've gotten even weirder results on other messages:

html_tests: found image 'img' ( dkf=dkf; pup=pup; ehxyky=ehxyky; ey=ey; toemo=toemo; 
border=0; src=http://[EMAIL PROTECTED]/img/img.php?a=1&i=lj.gif; 
youaqpxlvvmvgkl=youaqpxlvvmvgkl; aqikzxo=aqikzxo; tqt=tqt; j=j; zr=zr; 
wrcdmzww=wrcdmzww; )


Any thoughts?

                                          Brian
                                 ( [EMAIL PROTECTED] )

-------------------------------------------------------------------------------
    Many times the difference between failure and success is doing something
                   nearly right... or doing it exactly right.