>Any chance of a partial/lossy import, ignore all unknown tags, dump all
>unmatched tags ...???

We already ignore unknown tags (such as frames or tables). I have *no* idea
how to get expat or libxml2 to not choke on unmatched tags, and I'm not sure
that we would even want to. To do this "correctly" we'd need a much more
complex parsing engine, something like Gecko.
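
To illustrate the expat behaviour, here is a minimal sketch using Python's
xml.parsers.expat binding rather than our importer code. The helper name
is_well_formed is made up for the example; the point is only that expat
reports an error on an unmatched tag instead of recovering from it.

```python
import xml.parsers.expat


def is_well_formed(markup: str) -> bool:
    """Return True if expat parses the markup without raising an error.

    Hypothetical helper for illustration; not part of any importer.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(markup, True)
    except xml.parsers.expat.ExpatError:
        # Unmatched or mismatched tags end up here.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<p><b>bold</b></p>"))  # matched tags
    print(is_well_formed("<p><b>bold</p>"))      # <b> is never closed
```

A check like this could, in principle, decide when to offer a plain-text
fallback, but that is an assumption about the design, not something we do now.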

>More simply, what I'm trying to suggest is, "Error: this document
>contains invalid HTML.  Would you like to import it as plain text with
>line breaks?"

This would be OK, but you can already import HTML as text (choose "open as
text"). Are you suggesting that we remove the markup tags on import?
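
If removing the tags is what you mean, a rough sketch of the idea might look
like the following. It uses Python's html.parser, which tolerates unmatched
tags, purely as an illustration; the names TextExtractor, BREAK_TAGS and
html_to_text are hypothetical, and the choice of which tags become line
breaks is an assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, turning a few block-level tags into line breaks."""

    # Assumed set of tags that should start a new line in the plain text.
    BREAK_TAGS = {"br", "p", "div", "li", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # All other tags are simply dropped.
        if tag in self.BREAK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup: str) -> str:
    """Strip markup from possibly invalid HTML and return plain text."""
    extractor = TextExtractor()
    extractor.feed(markup)
    extractor.close()
    return "".join(extractor.parts).strip()


if __name__ == "__main__":
    # The <b> tag is never closed, which a strict XML parser would reject.
    print(html_to_text("<p>one<b>two<br>three</p>"))
```

Note that this sketch keeps the contents of script and style elements as
text, so a real import filter would need to handle those separately.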

>Please :)
>
>or as a temporary measure we could recommend a HTML validator like:
>http://validator.w3.org/

This is a reasonable suggestion.

Dom
