https://bugs.freedesktop.org/show_bug.cgi?id=40218
--- Comment #5 from Thomas Arnhold ---

The HTML importer is confused only by this unclosed anchor tag. I tried other tags such as <div>, <span> and <font>, and the import works fine with them. <a name="foo"> also works. The problem occurs only with <a href="eu">.

One solution would be to end the open anchor manually when the next </td> is found, but that is rather spaghetti-like:

--- a/editeng/source/editeng/eehtml.cxx
+++ b/editeng/source/editeng/eehtml.cxx
@@ -319,6 +319,7 @@ void EditHTMLParser::NextToken( int nToken )
     case HTML_TABLEHEADER_OFF:
     case HTML_TABLEDATA_OFF:
     {
+        AnchorEnd();
         if ( nInCell )
             nInCell--;
     }

A far better solution for all malformed HTML documents would be to clean them up in a first step. This could be done as described at http://www.mostthingsweb.com/2013/02/parsing-html-with-c/

Do we want to include tidy in our project? In my opinion this could be a huge benefit.
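To illustrate the "clean up in a first step" idea without pulling in tidy, here is a minimal sketch of a pre-pass that closes a dangling <a href=...> before the cell ends. It is not LibreOffice code and not the tidy API: closeDanglingAnchors, toLower and isTag are hypothetical names invented for this example, and the tag scanning is deliberately naive (no comments, no scripts, no '>' inside attribute values, and "href" is detected by a plain substring search).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: lower-case copy of a tag for case-insensitive matching.
static std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true if `tag` (already lower-cased, e.g. "<a href=..>")
// starts with `name` (e.g. "<a") followed by '>', '/' or whitespace, so that
// "<a" does not match "<abbr>" and "</th" does not match "</thead>".
static bool isTag(const std::string& tag, const std::string& name)
{
    if (tag.size() <= name.size() || tag.compare(0, name.size(), name) != 0)
        return false;
    const unsigned char next = static_cast<unsigned char>(tag[name.size()]);
    return next == '>' || next == '/' || std::isspace(next) != 0;
}

// Hypothetical pre-pass (an assumption, not the importer's actual behaviour):
// returns a copy of `html` in which an <a href=...> that is still open when a
// </td> or </th> arrives is closed first by inserting "</a>".
std::string closeDanglingAnchors(const std::string& html)
{
    std::string out;
    bool anchorOpen = false;
    std::size_t pos = 0;
    while (pos < html.size())
    {
        if (html[pos] != '<')
        {
            out += html[pos++];            // plain text: copy through
            continue;
        }
        const std::size_t end = html.find('>', pos);
        if (end == std::string::npos)
        {
            out.append(html, pos, std::string::npos);  // truncated tag: copy the rest
            break;
        }
        const std::string tag = html.substr(pos, end - pos + 1);
        const std::string lower = toLower(tag);
        if (isTag(lower, "<a") && lower.find("href") != std::string::npos)
            anchorOpen = true;             // only href anchors caused the bug
        else if (isTag(lower, "</a"))
            anchorOpen = false;
        else if (anchorOpen && (isTag(lower, "</td") || isTag(lower, "</th")))
        {
            out += "</a>";                 // close the anchor before the cell ends
            anchorOpen = false;
        }
        out += tag;
        pos = end + 1;
    }
    return out;
}
```

Called on a cell such as `<td><a href="eu">link</td>`, the sketch yields `<td><a href="eu">link</a></td>`, while well-formed input and <a name="foo"> anchors pass through unchanged. A real cleanup pass such as tidy handles far more cases than this; the sketch only shows where such a step would sit relative to the importer.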
