Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Niels Dossche
Hi Robert

On 30/12/2023 10:25, Robert Landers wrote:
> Hi Niels,
> 
>> They are indeed going to be very similar, but at least having better return 
>> types would be good to give one particular example.
>> e.g. we currently have a lot of methods that can return an object or false. 
>> The current living DOM spec always throws exceptions instead of returning 
>> false on error which is a much cleaner API.
>> Furthermore, we have the DOMNameSpaceNode that can be returned by some 
>> methods and has been a point of confusion for static analysis tools (I did a 
>> PR on psalm to fix one of those issues).
>> That node type won't be special cased in the new classes API so the 
>> (inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.
> 
> Actually, I'm not sure it is supposed to be throwing exceptions (if we
> look at https://html.spec.whatwg.org/multipage/parsing.html#parse-errors);
> in fact, I'd argue there are three different ways to handle errors
> (from some experience in writing a parser from scratch):

I'm not talking about handling parser errors.
Parser errors indeed should not be handled via exceptions: they emit a warning 
and continue with error recovery as described in the spec.
This was part of my HTML 5 RFC: 
https://wiki.php.net/rfc/domdocument_html5_parser

I'm talking about methods like createElement, setAttributeNode, ... that can 
fail due to errors.
In DOM 3 (and therefore PHP too), there was a "strictErrorChecking" boolean 
option.
When enabled, an exception is thrown when the constraints of such a method 
are not met.
When disabled, no exception is thrown; a warning is emitted and false is 
returned instead.
The DOM living spec no longer has that option and always uses exceptions.

In the new classes I would also only use exceptions and not include the 
strictErrorChecking option, as the spec demands.
This cleans up the return types.

For example: $doc->createElement("") should throw.
Or $element->setAttributeNode($attr) should throw when $attr is already used by 
another element.
Etc.
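
To illustrate the two error modes in the existing classes, here is a minimal sketch using the legacy DOMDocument. It assumes that an empty element name is rejected as an invalid name, and that with strictErrorChecking disabled the same call emits a warning and returns false, as described above. The variable names are only illustrative.

```php
<?php
// Legacy behaviour (DOMDocument): the error mode depends on strictErrorChecking.
$doc = new DOMDocument();

// Default: strictErrorChecking is true, so an invalid name throws.
$threw = false;
try {
    $doc->createElement("");
} catch (DOMException $e) {
    $threw = true;
    echo "DOMException: ", $e->getMessage(), "\n";
}

// With strict checking off, a warning is emitted and false is returned.
$doc->strictErrorChecking = false;
$result = @$doc->createElement(""); // @ silences the warning for this demo
var_dump($result);
```

With exceptions only, callers of the new classes would no longer need to check for a false return value.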

> 
> 1. Acting as a user-agent: in this case, errors should be handled as
> described in the spec for a user-agent, e.g., switching to Text-Mode
> in some cases and gobbling up the rest of the document.

The HTML 5 RFC follows the spec error recovery rules for user agents.

> 
> 2. Acting as a conformance checker: in this case, a list of errors
> should be available to the programmer instead of bailing when parsing
> (e.g., not switching to Text-Mode, but trying to continue parsing the
> document, as described in the parser spec for conformance checking).
> 
> 3. Acting as a document builder: Putting the document into an invalid
> state should emit at least a warning. However, it's likely better to
> let the user-agent handle the invalid DOM (as this is probably more
> forward-thinking for new HTML that currently doesn't exist). This is
> actually one of the biggest draw-backs to the current implementation
> as it requires a number of "hacks" to build valid HTML.

Kind regards
Niels

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php



Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Sebastian Bergmann

On 29.12.2023 at 17:58, Larry Garfield wrote:

> I am also on team "yes, let's just do it right."  If that means the new classes 
> are only 99% drop ins for the old ones, I'm OK with that.  People can switch over when 
> they're ready and do all the clean up at once.


+1




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-30 Thread Robert Landers
Hi Niels,

> They are indeed going to be very similar, but at least having better return 
> types would be good to give one particular example.
> e.g. we currently have a lot of methods that can return an object or false. 
> The current living DOM spec always throws exceptions instead of returning 
> false on error which is a much cleaner API.
> Furthermore, we have the DOMNameSpaceNode that can be returned by some 
> methods and has been a point of confusion for static analysis tools (I did a 
> PR on psalm to fix one of those issues).
> That node type won't be special cased in the new classes API so the 
> (inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.

Actually, I'm not sure it is supposed to be throwing exceptions (if we
look at https://html.spec.whatwg.org/multipage/parsing.html#parse-errors);
in fact, I'd argue there are three different ways to handle errors
(from some experience in writing a parser from scratch):

1. Acting as a user-agent: in this case, errors should be handled as
described in the spec for a user-agent, e.g., switching to Text-Mode
in some cases and gobbling up the rest of the document.

2. Acting as a conformance checker: in this case, a list of errors
should be available to the programmer instead of bailing when parsing
(e.g., not switching to Text-Mode, but trying to continue parsing the
document, as described in the parser spec for conformance checking).

3. Acting as a document builder: Putting the document into an invalid
state should emit at least a warning. However, it's likely better to
let the user-agent handle the invalid DOM (as this is probably more
forward-thinking for new HTML that currently doesn't exist). This is
actually one of the biggest draw-backs to the current implementation
as it requires a number of "hacks" to build valid HTML.




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-29 Thread Niels Dossche
Hi Larry

On 29/12/2023 17:58, Larry Garfield wrote:
> 
> I am also on team "yes, let's just do it right."  If that means the new 
> classes are only 99% drop ins for the old ones, I'm OK with that.  People can 
> switch over when they're ready and do all the clean up at once.  
> 

They are indeed going to be very similar, but, to give one particular example, 
at least having better return types would be good.
E.g., we currently have a lot of methods that can return an object or false. The 
current living DOM spec always throws exceptions instead of returning false on 
error, which is a much cleaner API.
Furthermore, we have the DOMNameSpaceNode that can be returned by some methods 
and has been a point of confusion for static analysis tools (I did a PR on 
psalm to fix one of those issues).
That node type won't be special-cased in the new classes' API, so the 
(inconsistent use of the) union of DOMAttr|DOMNameSpaceNode will go away.
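
A small sketch of the union in question, using the legacy classes. It assumes the documented behaviour of DOMElement::getAttributeNode(), which returns a DOMNameSpaceNode for namespace declarations (xmlns:* attributes) and false when the attribute does not exist. The sample document is made up for illustration.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root xmlns:a="urn:a" id="x"/>');
$root = $doc->documentElement;

$regular = $root->getAttributeNode('id');      // a regular attribute
$nsDecl  = $root->getAttributeNode('xmlns:a'); // a namespace declaration
$missing = $root->getAttributeNode('nope');    // an attribute that does not exist

echo get_class($regular), "\n";
echo get_class($nsDecl), "\n";
var_dump($missing);
```

Callers therefore have to handle three possible result types for a single method, which is what trips up static analysis.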

> I'm not sure about making things final.  I don't know the domain space well 
> enough to have a strong opinion at the moment, but my main concern would be 
> ensuring that it's still extensible in reasonable ways.  Eg, if I wanted to 
> add a Web Component element to a page, I want to do that without fugly 
> workarounds.  I don't have a strong opinion at this point on what the right 
> way to do that is.
> 

Yeah indeed.

> --Larry Garfield
> 

Kind regards
Niels




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-29 Thread Niels Dossche
Hi Gina

On 29/12/2023 15:40, G. P. B. wrote:
> Thank you for the work!
> 
> I agree that making them proper classes instead of aliases is the better 
> proposition here.
> I'm not fully informed about the DOM spec, and I don't know if the current 
> class/interface hierarchy is in the best shape, but maybe we should also 
> consider having a look at this?
> 

Yeah, our current class hierarchy is wrong, but not "overly wrong". The 
incorrectness comes from the design of the pre-HTML5 era.

This is how it's supposed to be:
CharacterData extends Node (Actually an interface, but PHP does not have 
interfaces with properties)
Text extends CharacterData
CDATASection extends Text
ProcessingInstruction extends CharacterData
Comment extends CharacterData

However in the current implementation, the ProcessingInstruction class extends 
Node instead of CharacterData.
Also CharacterData is a class instead of an interface in the current 
implementation.

So nothing too bad, but not correct either.
There's also some functionality that should be on the Element class instead of 
the Node class.
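
The current state of the hierarchy can be checked from userland. The following is a minimal sketch against the legacy DOM* classes; the pairs listed are the "supposed to be" relationships from above, and the array layout is just an illustrative choice.

```php
<?php
// Each pair is [child class, parent class expected by the DOM living spec].
$pairs = [
    ['DOMText', 'DOMCharacterData'],
    ['DOMCdataSection', 'DOMText'],
    ['DOMComment', 'DOMCharacterData'],
    ['DOMProcessingInstruction', 'DOMCharacterData'],
];

$results = [];
foreach ($pairs as [$child, $parent]) {
    // is_subclass_of() accepts class names as strings.
    $results[$child] = is_subclass_of($child, $parent);
    printf("%s extends %s: %s\n", $child, $parent, $results[$child] ? 'yes' : 'no');
}
```

Only the DOMProcessingInstruction pair is expected to report "no" with the current implementation.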

> About making those new classes finals, this would require reconsidering the 
> class hierarchy anyway, as nearly everything inherits from DOMNode, and other 
> classes (namely Comment/Text/CData nodes) extend other classes.
> However, I would not necessarily be against it, especially if we add the 
> required interfaces, as the current mechanism of registering a custom class 
> is not very powerful and rather cumbersome to use as the constructor is never 
> called.

I'm already reconsidering the class hierarchy :-).
As for the constructor problem: I can fix that for the new classes, I can make 
sure the constructor is called which would already solve a pain point.
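
The constructor pain point can be reproduced with the current registerNodeClass() mechanism. This is a sketch only: CountingElement is a hypothetical subclass written for this demonstration, and the static counter exists purely to observe whether the constructor runs.

```php
<?php
// Hypothetical subclass used only for this demonstration.
class CountingElement extends DOMElement
{
    public static int $constructed = 0;

    public function __construct(string $name)
    {
        parent::__construct($name);
        self::$constructed++;
    }
}

$doc = new DOMDocument();
$doc->registerNodeClass('DOMElement', CountingElement::class);
$doc->loadXML('<root><child/></root>');

$root = $doc->documentElement;
var_dump($root instanceof CountingElement); // is the registered class used?
var_dump(CountingElement::$constructed);    // how often did the constructor run?
```

The element is an instance of the registered class, yet the counter stays at zero, so any state the constructor is supposed to set up is missing.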

> As such, I'm not sure if I would support adding the current mechanism to 
> customize the node classes returned by the extension. Indeed, the current 
> mechanism doesn't play nicely at all with static analysis and this is 
> something I stopped trying to integrate when writing my DocBook renderer 
> project. [1]

I'm also not entirely sure, but in the JS world we do have custom elements that 
you can register and get an instance back from, so it has been done before at 
least.

> 
> Best regards,
> 
> Gina P. Banyard
> 
> [1] https://gitlab.com/Girgias/docbook-renderer 
> 

Kind regards
Niels




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-29 Thread Larry Garfield
On Tue, Dec 26, 2023, at 3:45 PM, Niels Dossche wrote:
> Hi internals
>
> The DOM extension in PHP is used to parse, query and manipulate 
> XML/HTML documents. The DOM extension is based on the DOM specification.
> Originally this was the DOM Core Level 3 specification, but nowadays, 
> that specification has evolved into the current "Living Specification" 
> maintained by WHATWG.
>
> Unfortunately, there are many bugs in PHP's DOM extension. Most of 
> those bugs are related to namespace and attribute handling. This leads 
> to people trying to work around those bugs by relying on more bugs, or 
> on undocumented side-effects of incorrect behaviour, leading to even 
> more issues in the end. Furthermore, some of these bugs may have 
> security implications [1].
>
> Some of these bugs are caused because the method or property was 
> implemented incorrectly back in the day, or because the original 
> specification used to be unclear. A smaller part of this is because the 
> specification has made breaking changes when HTML 5 first came along 
> and the specification creators had to unify what browsers implemented 
> into a single specification that everyone agreed on.
>
> It's not possible to "just fix" these bugs because people actually 
> _rely_ on these bugs. They are also often unaware that what they're 
> doing is actually incorrect or causes the internal document state to be 
> inconsistent. We therefore have to fix this in a backwards-compatible 
> way: i.e. a hard requirement is that all code written for the current 
> DOM extension keeps working without requiring changes.
> In short: the main problem is that 20 years of buggy behaviour means 
> that the bugs have become ingrained into the system.
>
> Some people have implemented userland DOM libraries on top of the 
> existing DOM extension. However, even userland solutions can't fully 
> work around issues caused by PHP's DOM extension. The real solution is 
> to provide a BC-preserving fix at PHP's side.
>
> Roughly 1.5 months ago I merged my HTML 5 RFC [2] into the PHP 8.4 
> development branch. This RFC introduced new document classes: 
> DOM\HTMLDocument and DOM\XMLDocument. The idea here was to preserve 
> backwards compatibility: if the user wants to keep using HTML 4, they 
> can keep using the DOMDocument class. Also, when the user wants to work 
> with HTML 5 and are currently using workarounds, they can migrate on 
> their own pace (without deprecations or anything) to the new classes. 
> New code can use DOM\{HTML,XML}Document from the start without touching 
> the old classes.
>
> The HTML 5 RFC has left us with an interesting opportunity to also 
> introduce the spec bugfixes in a BC-preserving way. The idea is that 
> when the new DOM\{HTML,XML}Document classes are used, then the DOM 
> extension will follow the DOM specification and therefore get rid of 
> bugs. When you are using the DOMDocument class, the old implementations 
> will be used. This means that backwards compatibility is kept.
>
> For the past 2.5 weeks I've been working on getting all spec bugs that 
> I know of fixed. The full list of bugs that this proposal fixes can be 
> found here: 
> https://github.com/nielsdos/php-src/blob/dom-spec-compliance-pub/bugs.md. 
> I also found some discussion [3] from some years ago where C. Scott 
> shared a list of problems they encountered at Wikimedia [4]. All 
> behavioural issues are fixed in my PR [5], although my PR could always 
> use more testing. Currently I have tested that existing DOM code does 
> not break (I have tested veewee's XML library, Mensbeam library, some 
> SimpleSAML libraries). I have added tests to test the new 
> spec-compliant behaviour. I also ported some of the WHATWG's WPT DOM 
> tests (DOM spec-compliance testsuite) to PHP and those that I've ported 
> all pass [6].
>
> Implementation PR can be found here: https://github.com/php/php-src/pull/13031
>
> Note that this is not a new extension, but an improvement to the 
> existing DOM extension. As for "why not an entirely new extension?", 
> please see the reasoning in my HTML 5 RFC. All interactions with 
> SimpleXML, XSL, XPath etc will remain possible like you are used to. 
> Implementation-wise, a lot of code internally is shared between the 
> spec-compliant and old implementations.
>
> I intend to put this up for RFC. There is however one last detail that 
> needs to be cleared up: what about "type issues"?
> To give an example of a "type issue": there is a `string 
> DOMNode::$prefix` property. DOM spec tells us that this should be 
> nullable: when there is no prefix for a node, the prefix should return 
> NULL. However, because the property is a string, this currently returns 
> an empty string instead in PHP. Not a big deal maybe, but there's many 
> of these subtle inconsistencies: null vs false return value, arguments 
> that should accept `?string` instead of `string`, etc.
> Sadly, it's not possible to fix the typing issues for 

Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-29 Thread G. P. B.
On Tue, 26 Dec 2023 at 21:45, Niels Dossche wrote:

> Hi internals
>
> The DOM extension in PHP is used to parse, query and manipulate XML/HTML
> documents. The DOM extension is based on the DOM specification.
> Originally this was the DOM Core Level 3 specification, but nowadays, that
> specification has evolved into the current "Living Specification"
> maintained by WHATWG.
>
> Unfortunately, there are many bugs in PHP's DOM extension. Most of those
> bugs are related to namespace and attribute handling. This leads to people
> trying to work around those bugs by relying on more bugs, or on
> undocumented side-effects of incorrect behaviour, leading to even more
> issues in the end. Furthermore, some of these bugs may have security
> implications [1].
>
> Some of these bugs are caused because the method or property was
> implemented incorrectly back in the day, or because the original
> specification used to be unclear. A smaller part of this is because the
> specification has made breaking changes when HTML 5 first came along and
> the specification creators had to unify what browsers implemented into a
> single specification that everyone agreed on.
>
> It's not possible to "just fix" these bugs because people actually _rely_
> on these bugs. They are also often unaware that what they're doing is
> actually incorrect or causes the internal document state to be
> inconsistent. We therefore have to fix this in a backwards-compatible way:
> i.e. a hard requirement is that all code written for the current DOM
> extension keeps working without requiring changes.
> In short: the main problem is that 20 years of buggy behaviour means that
> the bugs have become ingrained into the system.
>
> Some people have implemented userland DOM libraries on top of the existing
> DOM extension. However, even userland solutions can't fully work around
> issues caused by PHP's DOM extension. The real solution is to provide a
> BC-preserving fix at PHP's side.
>
> Roughly 1.5 months ago I merged my HTML 5 RFC [2] into the PHP 8.4
> development branch. This RFC introduced new document classes:
> DOM\HTMLDocument and DOM\XMLDocument. The idea here was to preserve
> backwards compatibility: if the user wants to keep using HTML 4, they can
> keep using the DOMDocument class. Also, when the user wants to work with
> HTML 5 and are currently using workarounds, they can migrate on their own
> pace (without deprecations or anything) to the new classes. New code can
> use DOM\{HTML,XML}Document from the start without touching the old classes.
>
> The HTML 5 RFC has left us with an interesting opportunity to also
> introduce the spec bugfixes in a BC-preserving way. The idea is that when
> the new DOM\{HTML,XML}Document classes are used, then the DOM extension
> will follow the DOM specification and therefore get rid of bugs. When you
> are using the DOMDocument class, the old implementations will be used. This
> means that backwards compatibility is kept.
>
> For the past 2.5 weeks I've been working on getting all spec bugs that I
> know of fixed. The full list of bugs that this proposal fixes can be found
> here:
> https://github.com/nielsdos/php-src/blob/dom-spec-compliance-pub/bugs.md.
> I also found some discussion [3] from some years ago where C. Scott shared
> a list of problems they encountered at Wikimedia [4]. All behavioural
> issues are fixed in my PR [5], although my PR could always use more
> testing. Currently I have tested that existing DOM code does not break (I
> have tested veewee's XML library, Mensbeam library, some SimpleSAML
> libraries). I have added tests to test the new spec-compliant behaviour. I
> also ported some of the WHATWG's WPT DOM tests (DOM spec-compliance
> testsuite) to PHP and those that I've ported all pass [6].
>
> Implementation PR can be found here:
> https://github.com/php/php-src/pull/13031
>
> Note that this is not a new extension, but an improvement to the existing
> DOM extension. As for "why not an entirely new extension?", please see the
> reasoning in my HTML 5 RFC. All interactions with SimpleXML, XSL, XPath etc
> will remain possible like you are used to. Implementation-wise, a lot of
> code internally is shared between the spec-compliant and old
> implementations.
>
> I intend to put this up for RFC. There is however one last detail that
> needs to be cleared up: what about "type issues"?
> To give an example of a "type issue": there is a `string DOMNode::$prefix`
> property. DOM spec tells us that this should be nullable: when there is no
> prefix for a node, the prefix should return NULL. However, because the
> property is a string, this currently returns an empty string instead in
> PHP. Not a big deal maybe, but there's many of these subtle
> inconsistencies: null vs false return value, arguments that should accept
> `?string` instead of `string`, etc.
> Sadly, it's not possible to fix the typing issues for properties and
> methods for DOMNode, DOMElement, ... because of BC: 

Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-26 Thread Robert Landers
On Tue, Dec 26, 2023 at 11:14 PM Niels Dossche wrote:
>
> Hi Tim
>
> On 26/12/2023 22:58, Tim Düsterhus wrote:
> > Hi
> >
> > On 12/26/23 22:45, Niels Dossche wrote:
> >> In my opinion, having them become proper classes instead of aliases has my 
> >> preference: either we fix everything in one go now while we have the 
> >> opportunity, or never.
> >
> > As I've already told you in private, I'm in favor of using this opportunity.
> >
> >> Let me know what you think, especially regarding the type issues.
> >>
> >
> > Will the classes be made `final` if they are no longer aliases? That should 
> > (hopefully) make similar changes somewhat easier in the future.
>
> I've been thinking about that as well, but I'm not sure.
> We still have the registerNodeClass() feature, and I've seen people ask to 
> bring this even further to allow custom Element classes (e.g. 
> MyHTMLScriptElement etc).
> I'd like to hear from more people on this matter.
>
> >
> > Best regards
> > Tim Düsterhus
>
> Kind regards
> Niels
>

Hi Niels,

> We still have the registerNodeClass() feature, and I've seen people ask to 
> bring this even further to allow custom Element classes (e.g. 
> MyHTMLScriptElement etc).
> I'd like to hear from more people on this matter.

Custom element classes would be really nice! I ended up having to
write a custom HTML5 parser in pure PHP due to the shortcomings of
PHP's DOM extension. Having the ability to create custom elements can
make the semantics much clearer (a HeaderElement class, for example).

Robert Landers
Software Engineer
Utrecht NL




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-26 Thread Niels Dossche
Hi Tim

On 26/12/2023 22:58, Tim Düsterhus wrote:
> Hi
> 
> On 12/26/23 22:45, Niels Dossche wrote:
>> In my opinion, having them become proper classes instead of aliases has my 
>> preference: either we fix everything in one go now while we have the 
>> opportunity, or never.
> 
> As I've already told you in private, I'm in favor of using this opportunity.
> 
>> Let me know what you think, especially regarding the type issues.
>>
> 
> Will the classes be made `final` if they are no longer aliases? That should 
> (hopefully) make similar changes somewhat easier in the future.

I've been thinking about that as well, but I'm not sure.
We still have the registerNodeClass() feature, and I've seen people ask to 
bring this even further to allow custom Element classes (e.g. 
MyHTMLScriptElement etc).
I'd like to hear from more people on this matter.

> 
> Best regards
> Tim Düsterhus

Kind regards
Niels




Re: [PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-26 Thread Tim Düsterhus

Hi

On 12/26/23 22:45, Niels Dossche wrote:

In my opinion, having them become proper classes instead of aliases has my 
preference: either we fix everything in one go now while we have the 
opportunity, or never.


As I've already told you in private, I'm in favor of using this opportunity.


Let me know what you think, especially regarding the type issues.



Will the classes be made `final` if they are no longer aliases? That 
should (hopefully) make similar changes somewhat easier in the future.


Best regards
Tim Düsterhus




[PHP-DEV] Pre-RFC: Fixing spec bugs in the DOM extension

2023-12-26 Thread Niels Dossche
Hi internals

The DOM extension in PHP is used to parse, query and manipulate XML/HTML 
documents. The DOM extension is based on the DOM specification.
Originally this was the DOM Core Level 3 specification, but nowadays, that 
specification has evolved into the current "Living Specification" maintained by 
WHATWG.

Unfortunately, there are many bugs in PHP's DOM extension. Most of those bugs 
are related to namespace and attribute handling. This leads to people trying to 
work around those bugs by relying on more bugs, or on undocumented side-effects 
of incorrect behaviour, leading to even more issues in the end. Furthermore, 
some of these bugs may have security implications [1].

Some of these bugs exist because a method or property was implemented 
incorrectly back in the day, or because the original specification was 
unclear. A smaller part exists because the specification made breaking 
changes when HTML 5 came along and the specification authors had to unify 
what browsers implemented into a single specification that everyone agreed 
on.

It's not possible to "just fix" these bugs because people actually _rely_ on 
these bugs. They are also often unaware that what they're doing is actually 
incorrect or causes the internal document state to be inconsistent. We 
therefore have to fix this in a backwards-compatible way: i.e. a hard 
requirement is that all code written for the current DOM extension keeps 
working without requiring changes.
In short: the main problem is that 20 years of buggy behaviour means that the 
bugs have become ingrained into the system.

Some people have implemented userland DOM libraries on top of the existing DOM 
extension. However, even userland solutions can't fully work around issues 
caused by PHP's DOM extension. The real solution is to provide a BC-preserving 
fix at PHP's side.

Roughly 1.5 months ago I merged my HTML 5 RFC [2] into the PHP 8.4 development 
branch. This RFC introduced new document classes: DOM\HTMLDocument and 
DOM\XMLDocument. The idea here was to preserve backwards compatibility: if the 
user wants to keep using HTML 4, they can keep using the DOMDocument class. 
Also, users who want to work with HTML 5 and are currently using workarounds 
can migrate at their own pace (without deprecations or anything) to the new 
classes. New code can use DOM\{HTML,XML}Document from the start without 
touching the old classes.

The HTML 5 RFC has left us with an interesting opportunity to also introduce 
the spec bugfixes in a BC-preserving way. The idea is that when the new 
DOM\{HTML,XML}Document classes are used, then the DOM extension will follow the 
DOM specification and therefore get rid of bugs. When you are using the 
DOMDocument class, the old implementations will be used. This means that 
backwards compatibility is kept.

For the past 2.5 weeks I've been working on getting all spec bugs that I know 
of fixed. The full list of bugs that this proposal fixes can be found here: 
https://github.com/nielsdos/php-src/blob/dom-spec-compliance-pub/bugs.md. I 
also found some discussion [3] from some years ago where C. Scott shared a list 
of problems they encountered at Wikimedia [4]. All behavioural issues are fixed 
in my PR [5], although my PR could always use more testing. Currently I have 
tested that existing DOM code does not break (I have tested veewee's XML 
library, Mensbeam library, some SimpleSAML libraries). I have added tests to 
test the new spec-compliant behaviour. I also ported some of the WHATWG's WPT 
DOM tests (DOM spec-compliance testsuite) to PHP and those that I've ported all 
pass [6].

Implementation PR can be found here: https://github.com/php/php-src/pull/13031

Note that this is not a new extension, but an improvement to the existing DOM 
extension. As for "why not an entirely new extension?", please see the 
reasoning in my HTML 5 RFC. All interactions with SimpleXML, XSL, XPath etc 
will remain possible like you are used to. Implementation-wise, a lot of code 
internally is shared between the spec-compliant and old implementations.

I intend to put this up for RFC. There is however one last detail that needs to 
be cleared up: what about "type issues"?
To give an example of a "type issue": there is a `string DOMNode::$prefix` 
property. DOM spec tells us that this should be nullable: when there is no 
prefix for a node, the prefix should return NULL. However, because the property 
is a string, PHP currently returns an empty string instead. Not a big deal 
maybe, but there are many of these subtle inconsistencies: null vs false 
return values, arguments that should accept `?string` instead of `string`, etc.
Sadly, it's not possible to fix the typing issues for properties and methods 
for DOMNode, DOMElement, ... because of BC: properties and methods can be 
overridden.
Or is it?
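
As a concrete illustration of the `$prefix` example, here is a minimal sketch using the legacy DOMDocument class. The sample XML is made up for the demonstration.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root xmlns:a="urn:a"><a:child/><plain/></root>');

$prefixed = $doc->documentElement->firstChild; // <a:child/>
$plain    = $doc->documentElement->lastChild;  // <plain/>

var_dump($prefixed->prefix); // the namespace prefix of <a:child/>
var_dump($plain->prefix);    // legacy classes: a string even when there is no prefix
```

Under the DOM spec the second value would be NULL rather than an empty string, which is the kind of change that cannot be made on the existing `string` property.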

Currently, as a result of the HTML 5 RFC, the new DOM\{HTML,XML}Document 
classes keep using the DOMNode,