According to the HTML 4.01 Specification's notes on invalid documents:
http://www.w3.org/TR/html4/appendix/notes.html#h-B.1

we recommend the following behavior:
[snip...]
If it encounters an undeclared entity, the entity should be
treated as character data.

I offer the following patch against the current CVS version of
/plucker/parser/python/PyPlucker/TextParser.py:

130c130
<             content = "&" + content + ";"
---
>             content = "?"

(or as a context diff, from diff -c):
*** 127,133 ****
          if htmlentitydefs.entitydefs.has_key (content):
              content = htmlentitydefs.entitydefs[content]
          else:
!             content = "&" + content + ";"
      return cleanup_attribute (pre) + content + post


--- 127,133 ----
          if htmlentitydefs.entitydefs.has_key (content):
              content = htmlentitydefs.entitydefs[content]
          else:
!             content = "?"
      return cleanup_attribute (pre) + content + post


***************
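
For illustration, here is a minimal, self-contained sketch of what
"treated as character data" could look like. It is not the PyPlucker
code: the function name resolve_entities and the regular expression are
hypothetical, and it assumes Python 3, where htmlentitydefs is spelled
html.entities. An undeclared entity is simply left in the output as the
text that was written.

```python
import re
from html.entities import entitydefs  # Python 3 name for htmlentitydefs

# Hypothetical pattern (an assumption, not from TextParser.py):
# matches named references such as "&amp;" or "&foo;".
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def resolve_entities(text):
    """Replace declared entities; leave undeclared ones as character data."""

    def _sub(match):
        name = match.group(1)
        if name in entitydefs:
            # Declared entity: substitute the character it stands for.
            return entitydefs[name]
        # Undeclared entity: keep the original text, e.g. "&foo;".
        return match.group(0)

    return _ENTITY_RE.sub(_sub, text)


if __name__ == "__main__":
    print(resolve_entities("fish &amp; chips &foo; done"))
```

With this sketch, "&amp;" becomes "&" while "&foo;" passes through
unchanged.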

I haven't yet thought about how to implement the next part of the
recommendation:
"We also recommend that user agents provide support for notifying
the user of such errors."
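
One possible starting point, offered only as a rough sketch: collect
the names of undeclared entities while parsing and print a warning for
each. The function names, the regular expression and the warning format
below are all hypothetical, and the code assumes Python 3
(html.entities rather than htmlentitydefs); nothing here is taken from
PyPlucker.

```python
import re
import sys
from html.entities import entitydefs  # Python 3 name for htmlentitydefs

# Hypothetical pattern (an assumption): matches named references like "&foo;".
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def find_undeclared_entities(text):
    """Return undeclared entity names, in order of first appearance."""
    seen = []
    for name in _ENTITY_RE.findall(text):
        if name not in entitydefs and name not in seen:
            seen.append(name)
    return seen


def report_undeclared_entities(text, stream=sys.stderr):
    """Write one warning line per undeclared entity; return the names."""
    names = find_undeclared_entities(text)
    for name in names:
        stream.write("warning: undeclared entity &%s;\n" % name)
    return names


if __name__ == "__main__":
    report_undeclared_entities("&amp; &foo; &bar; &foo;")
```

Each name is reported once, however many times it occurs, so a page
full of the same mistake does not flood the user with warnings.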

This fix also relates to bug number 224 and, to a lesser
extent, bug number 58:
http://gnu-designs.com/bugs/view_bug_advanced_page.php?f_id=224
http://gnu-designs.com/bugs/view_bug_advanced_page.php?f_id=58

Later,
Blake.

_______________________________________________
plucker-dev mailing list
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev