Hi Robert

On 30/12/2023 10:25, Robert Landers wrote:
> Hi Niels,
>
>> They are indeed going to be very similar, but at least having better return
>> types would be good to give one particular example.
>> e.g. we currently have a lot of methods that can return an object or false.
>> The current living DOM spec always throws exceptions instead of returning
>> false on error which is a much cleaner API.
>> Furthermore, we have the DOMNameSpaceNode that can be returned by some
>> methods and has been a point of confusion for static analysis tools (I did a
>> PR on psalm to fix one of those issues).
>> That node type won't be special cased in the new classes API so the
>> (inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.
>
> Actually, I'm not sure it is supposed to be throwing exceptions (if we
> look at https://html.spec.whatwg.org/multipage/parsing.html#parse-errors);
> in fact, I'd argue there are three different ways to handle errors
> (from some experience in writing a parser from scratch):

I'm not talking about handling parser errors. Parser errors indeed should
not be handled via exceptions; they emit a warning and continue with error
recovery as described in the spec. This was part of my HTML 5 RFC:
https://wiki.php.net/rfc/domdocument_html5_parser

I'm talking about methods like createElement, setAttributeNode, ... that
can fail due to errors. In DOM 3 (and therefore PHP too), there was a
"strictErrorChecking" boolean option. When enabled, exceptions were thrown
when the constraints of such methods were not met. When disabled, no
exception was thrown; instead a warning was emitted and false was returned.
The DOM living spec no longer has that option and always uses exceptions.
In the new classes I would also only use exceptions and not include the
strictErrorChecking option, as the spec demands. This cleans up return
types.
For example: $doc->createElement("") should throw. Or
$element->setAttributeNode($attr) should throw when $attr is already used
by another element. Etc.

>
> 1. Acting as a user-agent: in this case, errors should be handled as
> described in the spec for a user-agent, e.g., switching to Text-Mode
> in some cases and gobbling up the rest of the document.

The HTML 5 RFC follows the spec error recovery rules for user agents.

>
> 2. Acting as a conformance checker: in this case, a list of errors
> should be available to the programmer instead of bailing when parsing
> (e.g., not switching to Text-Mode, but trying to continue parsing the
> document, as described in the parser spec for conformance checking).
>
> 3. Acting as a document builder: Putting the document into an invalid
> state should emit at least a warning. However, it's likely better to
> let the user-agent handle the invalid DOM (as this is probably more
> forward-thinking for new HTML that currently doesn't exist). This is
> actually one of the biggest draw-backs to the current implementation
> as it requires a number of "hacks" to build valid HTML.

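To make the strictErrorChecking point from earlier in this mail concrete, here is a minimal sketch against the existing DOMDocument class (not the proposed new classes). It assumes the legacy behaviour described above: with the option on, an invalid element name throws a DOMException, and with it off a warning is emitted and false is returned. The variable names are only for illustration.

```php
<?php
// Sketch against the legacy DOMDocument class; the proposed new classes
// are deliberately not used here.
$doc = new DOMDocument();

// strictErrorChecking defaults to true, so an invalid name throws.
$threw = false;
try {
    $doc->createElement("");
} catch (DOMException $e) {
    $threw = true;
}

// With the option disabled, the same call emits a warning and
// returns false instead of throwing.
$doc->strictErrorChecking = false;
$result = @$doc->createElement(""); // @ only silences the warning

var_dump($threw);
var_dump($result);
```

Because of the second code path, callers today have to handle both an exception and a false return value, which is the return-type union that an exceptions-only API would remove.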
Kind regards
Niels