As per the HTML 4.01 Specification's Notes on invalid documents:
http://www.w3.org/TR/html4/appendix/notes.html#h-B.1
we recommend the following behavior:
[snip...]
If it encounters an undeclared entity, the entity should be
treated as character data.
I offer the following patch against the current CVS version of
/plucker/parser/python/PyPlucker/TextParser.py:
130c130
< content = "&" + content + ";"
---
> content = "?"
(or, as a context diff from diff -c:)
*** 127,133 ****
  if htmlentitydefs.entitydefs.has_key (content):
      content = htmlentitydefs.entitydefs[content]
  else:
!     content = "&" + content + ";"
  return cleanup_attribute (pre) + content + post
--- 127,133 ----
  if htmlentitydefs.entitydefs.has_key (content):
      content = htmlentitydefs.entitydefs[content]
  else:
!     content = "?"
  return cleanup_attribute (pre) + content + post
***************
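For anyone who wants to try the two fallbacks outside of Plucker, here is a minimal standalone sketch of the lookup the patch touches. It is not the TextParser.py code: the function name resolve_entity and the keep_literal flag are made up for illustration, and it uses Python 3's html.entities module, which I am assuming as the stand-in for the htmlentitydefs module seen in the diff.

```python
# Standalone sketch, not the actual TextParser.py code.
# Assumption: Python 3, where htmlentitydefs lives on as html.entities.
from html.entities import entitydefs


def resolve_entity(name, keep_literal=True):
    """Return the text to emit for the entity reference &name;.

    resolve_entity and keep_literal are hypothetical names for this sketch.
    """
    if name in entitydefs:
        # Declared entity: use its replacement text.
        return entitydefs[name]
    if keep_literal:
        # Undeclared entity kept as character data, e.g. "&foo;".
        return "&" + name + ";"
    # Undeclared entity collapsed to a placeholder.
    return "?"
```

Calling resolve_entity("foo") gives back the literal text "&foo;", while resolve_entity("foo", keep_literal=False) gives "?". Those are the two right-hand sides that appear in the diff.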
I haven't thought about how to implement the next part of the
recommendation:
We also recommend that user agents provide support for notifying
the user of such errors.
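One possible direction, purely as a sketch: record each undeclared entity name while parsing and emit a warning, so a front end could show the list to the user afterwards. Everything here is an assumption for illustration (the function name, the logger name, the unknown list, and the use of Python 3's html.entities and logging modules); none of it is existing Plucker code.

```python
# Hypothetical sketch of error notification, not existing Plucker code.
import logging
from html.entities import entitydefs

log = logging.getLogger("entity-sketch")  # made-up logger name


def resolve_entity_reporting(name, unknown):
    """Resolve &name; and append the name to `unknown` if it is undeclared."""
    if name in entitydefs:
        return entitydefs[name]
    # Remember the offender so the caller can report it to the user later.
    unknown.append(name)
    log.warning("undeclared entity &%s; treated as character data", name)
    return "&" + name + ";"
```

The caller would pass in one list per document and, once parsing is done, display whatever names ended up in it.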
This fix also relates to bug number 224 and, to a lesser
extent, bug number 58:
http://gnu-designs.com/bugs/view_bug_advanced_page.php?f_id=224
http://gnu-designs.com/bugs/view_bug_advanced_page.php?f_id=58
Later,
Blake.