Keith C. Ivey wrote:

You've lost me there. What do "dangling attributes" have to do with this case? HTML_FONTCOLOR_UNKNOWN was triggered, so the COLOR attributes were seen. The problem is they weren't recognized as being nearly invisible, so the problem seems to be with the HTML_FONT_LOW_CONTRAST test, not with parsing.

Well, there is a problem somewhere; it's hard to tell whether it's in the parsing or in the test itself. Here's what I found after adding some debugging calls to HTML.pm:

  1. The html_font_invisible() method gets called.
  2. There's a problem with the arguments passed to it. The foreground
     color is seen by the Perl code as the string 'color', not as a
     string of the form '#feefea'. This explains why the
     HTML_FONTCOLOR_UNKNOWN test was triggered ('color' is neither a
     known color name nor an HTML hex code), and why the
     HTML_FONT_LOW_CONTRAST test did not fire.
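
To illustrate point 2, here is a minimal sketch of how a check of this kind might behave. It is written in Python, not the Perl of HTML.pm, and it is not SpamAssassin's actual code: the function names, the tiny color-name table, the white background and the threshold are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny table of named colors (assumption).
NAMED_COLORS = {"white": (255, 255, 255), "black": (0, 0, 0), "red": (255, 0, 0)}

# Six hex digits with an optional leading '#'.
HEX_RE = re.compile(r"#?([0-9a-fA-F]{6})$")


def parse_color(value):
    """Return an (r, g, b) tuple, or None if the value is not a known color."""
    value = value.strip().lower()
    if value in NAMED_COLORS:
        return NAMED_COLORS[value]
    match = HEX_RE.match(value)
    if match is None:
        return None
    digits = match.group(1)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def check_font(fg, bg="#ffffff", threshold=64):
    """Classify a font color; the threshold is an arbitrary illustrative value."""
    fg_rgb, bg_rgb = parse_color(fg), parse_color(bg)
    if fg_rgb is None or bg_rgb is None:
        return "unknown color"  # contrast cannot be computed at all
    # Sum of per-channel differences as a crude contrast measure (assumption).
    distance = sum(abs(a - b) for a, b in zip(fg_rgb, bg_rgb))
    return "low contrast" if distance < threshold else "ok"


print(check_font("color"))    # value the method actually received -> unknown color
print(check_font("#feefea"))  # value it should have received -> low contrast
```

Once the foreground value arrives as 'color', no contrast comparison is possible, so only the "unknown color" branch can be reached.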

More specifically, when the HTML source contains a dangling attribute value like this:

<font color=

#feefea>

then, in the html_font_invisible() method, the foreground color (the $fg variable) has the value 'color' instead of the hex code of the HTML color.

If I add a single space after the equals sign in the tag above, html_font_invisible() receives correct data and the $fg variable holds the hex code of the font color. So this (note the space after the equals sign, at the end of the first line):

<font color=

#feefea>

is processed correctly, and the nearly invisible font is detected.

Moreover, after adding debugging calls to the html_fgcolor() method (which extracts foreground color information from an HTML element), I can see that the "color" attribute of this font tag already has the value 'color' instead of the hex code (which should be #feefea in my test case), so the problem lies deeper than the html_font_invisible() method.

This suggests a parsing problem somewhere, as far as I understand the code. If I am correct in suspecting that the Perl expression "$attr->{color}" is an attribute of an HTML::Parser object, then the problem is indeed in the HTML parser code (correct me if I'm wrong).
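
A possibly relevant detail: by default, HTML::Parser reports an attribute that has no value with the attribute's own name as its value. A tokenizer that gives up on the value when a line break directly follows the '=' would therefore yield color => 'color'. The toy Python model below illustrates only that mechanism. It is not HTML::Parser's code, the regexes and the parse_attrs() name are assumptions, and it does not try to reproduce why a single space changes the outcome.

```python
import re

# Attribute value: double-quoted, or a run of non-space, non-'>' characters.
VALUE = r'"[^"]*"|[^\s>]+'
# STRICT: the value must follow '=' immediately (hypothetical faulty tokenizer).
STRICT = re.compile(r"(?<!\S)([a-zA-Z][\w-]*)(?:=(" + VALUE + r"))?")
# TOLERANT: any whitespace, including line breaks, may surround '='.
TOLERANT = re.compile(r"(?<!\S)([a-zA-Z][\w-]*)(?:\s*=\s*(" + VALUE + r"))?")


def parse_attrs(tag, pattern):
    """Return the attributes of one start tag as a dict."""
    body = tag.strip().lstrip("<").rstrip(">")
    parts = body.split(None, 1)          # drop the tag name
    rest = parts[1] if len(parts) > 1 else ""
    attrs = {}
    for match in pattern.finditer(rest):
        name, value = match.group(1).lower(), match.group(2)
        if value is None:
            value = name                 # valueless attribute: name doubles as value
        else:
            value = value.strip('"')
        attrs[name] = value
    return attrs


TAG = "<font color=\n\n#feefea>"
print(parse_attrs(TAG, STRICT))    # {'color': 'color'}
print(parse_attrs(TAG, TOLERANT))  # {'color': '#feefea'}
```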

--
Best Regards,
Aleksander Adamowski
GG#: 274614
ICQ UIN: 19780575 http://olo.ab.altkom.pl



