On 14 August 2023 13:40:40 BST, Niels Dossche <dossche.ni...@gmail.com>
wrote:
>And you load it into simpleXML, the result of calling
>json_encode($the_simplexml_object)

My usual reaction to this is "why would you take an object designed for
accessing parts of an XML document, and serialise it to JSON?" Often,
the answer turns out to be "because I don't understand SimpleXML
objects, and have copied and pasted a weird hack to get a less useful
array representation by round-tripping to JSON".
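
To make the contrast concrete, here is a minimal sketch of direct access next to the round-trip hack. The sample XML is invented for illustration, and the comments describe behaviour as understood on current PHP versions rather than any documented guarantee.

```php
<?php
// Sample document invented for this illustration.
$xml = simplexml_load_string(
    '<order id="42"><item sku="A1">Widget</item></order>'
);

// Direct access: -> for child elements, [] for attributes, (string) for text.
$id   = (string) $xml['id'];
$sku  = (string) $xml->item['sku'];
$name = (string) $xml->item;
var_dump($id, $sku, $name);

// The copy-and-paste hack: round-trip through JSON to get a plain array.
$array = json_decode(json_encode($xml), true);

// "item" collapses to the bare string "Widget", so sku="A1" is gone.
var_dump($array);
```

The object API reaches every value in three short expressions, while the array has already lost the sku attribute before any application code runs.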
On the other hand, the fact that the *debug* representation of SimpleXML
objects misses out some parts causes a lot of confusion, and I've
actually considered the *opposite* of what you suggest - leave the JSON
alone, because people will have written production code based on it, but
make the debug array more descriptive of how to use the object.
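
As a rough illustration of what the debug output leaves out, the sketch below dumps a small invented feed and then reads the "missing" values through the object API. The element names are assumptions for the example; the namespace URI is the Dublin Core one.

```php
<?php
// Sample feed invented for this illustration.
$xml = simplexml_load_string(
    '<feed xmlns:dc="http://purl.org/dc/elements/1.1/">'
    . '<title lang="en">Hello</title>'
    . '<dc:creator>Rowan</dc:creator>'
    . '</feed>'
);

// Debug view: only [title] => Hello shows up. The lang attribute and the
// namespaced dc:creator element are not listed.
$dump = print_r($xml, true);
echo $dump;

// Both values are still reachable through the object API.
$lang    = (string) $xml->title['lang'];
$creator = (string) $xml->children('http://purl.org/dc/elements/1.1/')->creator;
echo $lang, ' ', $creator, "\n";
```

Nothing in the dump hints that `['lang']` or `->children($namespace)` would return anything, which is the confusion a more descriptive debug array could address.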
Either way, the challenge is coming up with something that's concise for
simple structures, but comprehensive for more complex ones, particularly
if you want it to be consistent. For instance:
- Do you assume tag names are unique within a parent, so use key=>value
directly; or assume they're not, so use key=>[list,of,values]; or
dynamically switch between the two?
- Do you care about the order of elements with different names, or
prefer to group by name?
- Do you have any elements with both child tags and text, or attributes
and text, or all three?
- Do you need to retain the order of text in relation to child elements
(important for markup languages like HTML or DocBook)? Or is it enough
to have a representation of "all text content" (the behaviour of
SimpleXML's string cast)?
- Do you have any elements with namespaces? If so, do you want to use
local prefixes (and include the xmlns attributes somewhere), or repeat
the full namespace URI?
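
The first question in that list can be seen in the existing JSON output, which switches shape depending on how many same-named children happen to be present. A minimal sketch with two invented documents; `item_texts()` is a hypothetical helper written for this example, not part of SimpleXML:

```php
<?php
// Two invented documents that differ only in the number of <item> children.
$one = simplexml_load_string('<list><item>a</item></list>');
$two = simplexml_load_string('<list><item>a</item><item>b</item></list>');

// Flattened form: a scalar for one child, a list for two.
$oneJson = json_encode($one);
$twoJson = json_encode($two);
echo $oneJson, "\n"; // {"item":"a"}
echo $twoJson, "\n"; // {"item":["a","b"]}

// Hypothetical helper: collect the text of every <item> via the object API.
function item_texts(SimpleXMLElement $list): array
{
    $texts = [];
    foreach ($list->item as $item) {
        $texts[] = (string) $item;
    }
    return $texts;
}

// Object access: the same loop works for one child or many.
print_r(item_texts($one));
print_r(item_texts($two));
```

Code consuming the flattened form has to check whether "item" is a string or a list every time, whereas the loop over `$list->item` does not care.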
There's a reason why both the DOM and SimpleXML provide object-oriented
APIs for accessing the document, not a representation flattened to
native types, and why both APIs are useful for different jobs - XML just
isn't designed for flattening, and different patterns make sense for
different documents / use cases.

Ultimately, I'm not that interested in trying to come up with a JSON or
array representation that covers every possibility, because I think the
only consistent answer would be horribly verbose - basically, describe
every property that DOM would expose on each node.

For debug output, the main concern is showing what you'll get with
various styles of access in SimpleXML, so a single "@text" =>
"foobarbaz" would make sense; or maybe even "(string)" => "foobarbaz"
and rename "@attributes" to "->attributes()".
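
A rough sketch of what such access-style keys could look like, built in userland. `describe_for_debug()` is a hypothetical helper for illustration only, not an existing or proposed API, and the sample XML is invented.

```php
<?php
// Invented document: text interleaved with empty child elements, plus an attribute.
$xml = simplexml_load_string('<a id="1">foo<b/>bar<b/>baz</a>');

// Hypothetical helper, not part of SimpleXML: builds a debug array whose keys
// name the access style that yields each value.
function describe_for_debug(SimpleXMLElement $element): array
{
    $attributes = [];
    foreach ($element->attributes() as $name => $value) {
        $attributes[$name] = (string) $value;
    }

    return [
        '(string)'       => (string) $element,
        '->attributes()' => $attributes,
    ];
}

// Shows "(string)" => "foobarbaz" and "->attributes()" => ["id" => "1"].
print_r(describe_for_debug($xml));
```

Each key doubles as a hint for how to fetch the value in real code, which is the point of renaming them.
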
Regards,
--
Rowan Tommins
[IMSoP]