Edit report at https://bugs.php.net/bug.php?id=60021&edit=1

 ID:                 60021
 Updated by:         ahar...@php.net
 Reported by:        drgroove at gmail dot com
 Summary:            DOMDocument errors on HTML5 tags
-Status:             Open
+Status:             Suspended
 Type:               Bug
 Package:            DOM XML related
 Operating System:   Mac OS X
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

It's a valid issue, but it's really an upstream one: libxml2's HTML
parser only supports HTML 4.01, so until that's extended to support
HTML5 or a new parser is added to libxml2, there's little to be done
in PHP proper.

There are userspace parsers available: html5lib will parse documents
according to the HTML5 algorithm and give you a DOMDocument to work
with.

Suspending for now. Given that the issue was first raised upstream in
2008, I wouldn't hold your breath (although I suspect they'd love a
patch).

Previous Comments:
------------------------------------------------------------------------
[2012-04-02 03:41:38] drgroove at gmail dot com

Any progress on resolving this? Working w/ DOMDocument and HTML5 is a
huge pain in the butt right now; you have to write custom error
handlers for things like <header/>, <nav/>, and other HTML5 tags.

Also, just entered a bug report for SimpleXML (where tags w/ both
attributes and text have their attributes dropped). Both DOMDocument
and SimpleXML need updates... it's very difficult to work w/ HTML and
XML when both of these APIs have so many issues.

Thanks for your help everyone :)

------------------------------------------------------------------------
[2011-10-09 05:24:13] drgroove at gmail dot com

Description:
------------
Loading HTML documents through DOMDocument->loadHTMLFile(), when the
HTML file contains certain new HTML5 tags, results in this error:

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Tag
footer invalid in {file path here}

<footer> is a new HTML5 tag. The error appears for other HTML5 tags as
well (eg, <header>).

Test script:
---------------
// TEST.html
<header>
Some text here
</header>

// TEST.php
<?php
$dom_document = new DOMDocument();
$dom_document->loadHTMLFile("TEST.html");
?>

Expected result:
----------------
DOMDocument should not fail on HTML5 tags.

Actual result:
--------------
Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Tag
footer invalid in {file path here}

------------------------------------------------------------------------
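
For anyone landing on this report: the "custom error handlers" workaround mentioned in the comments can probably be avoided with libxml's internal error buffer. The following is only a rough sketch, not a fix for the underlying parser limitation. load_html_quietly() is a made-up helper name, and the markup is inlined via loadHTML() rather than read from TEST.html so the snippet is self-contained.

```php
<?php
// Sketch only: load markup containing HTML5 tags without PHP warnings.
// load_html_quietly() is a hypothetical helper, not part of PHP or libxml2.
function load_html_quietly($html)
{
    // Buffer libxml errors internally instead of raising PHP warnings;
    // the call returns the previous setting so it can be restored later.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Keep whatever libxml reported (possibly nothing) for inspection,
    // then clear the buffer and restore the caller's error mode.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array($doc, $errors);
}

list($doc, $errors) = load_html_quietly('<header>Some text here</header>');

// The unknown tag is still turned into an element in the resulting tree.
$headers = $doc->getElementsByTagName('header');
echo $headers->length, "\n";

// Each buffered entry is a LibXMLError; print the message if there are any.
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

This only hides the diagnostics; the document is still parsed by libxml2's HTML 4.01 parser, so anything that depends on real HTML5 parsing rules would still need a userspace parser such as the html5lib mentioned above.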