On Tue, 2008-07-01 at 21:48 -0700, Tom Raney wrote: > This is an update to my EXPLAIN XML patch submitted a few days ago.
This is an important patch and you've done well to get it this far. I have much detailed feedback, but these are just emergent requests, now I can see and understand what you've done. * If the patch was implemented as an ExplainOneQuery_hook we would be able to use this with 8.3 also, which might be interesting. So there's no real need for this to be a patch on core. * If I understand, the DTD option inlines the DTD into the resulting XML document, rather than adding a schema definition? * The DTD should be in a separate file also, so it can be referred to. We should give that DTD a permanent URL so we can identify it across the internet. This is the first time the project has published an XML DTD/Schema, so we need to think through the right way to publish it. The DTD should have a version number in it, so when this gets updated in the future we can validate against the correct one. * The DTD is regrettably a long, long way from where I need it to be. The PLAN elements are all unrelated to one another, apart from their sequence within the XML document and their "indent". That definition can lead to XML documents that match the DTD yet would never be valid plans, amongst other problems. For sensible analysis of the SQL we need the DTD to match the actual structure of nodes in the executor. This requires major DTD changes, regrettably. The "indent" comes from the nesting of the nodes, and is not merely a visual feature of the complete plan. IMHO it is critically important that the XML correctly conveys the structure of the plan and not just the formatting of the current text output. Otherwise many doors will be closed to us and this whole project wasted. I want to be able to write xml queries to retrieve plans that contain a merge join where both the inner and outer are merge joins, or to easily compare the structure of two complex queries looking for differences. * The meaning of the two time attributes is somewhat confused. It is startuptime and totaltime, not time_start and time_end. * It looks to me like QUALIFIER alone is not enough, since in some cases nodes have both an index condition and a filter, for example. * I've made a number of suggested changes below, but the DTD really needs to follow the structures in src/include/nodes/plannodes.h In particular you'll see the recursive structure introduced by the DTD changes below * EXPLAIN has been renamed PLAN * PLAN has been renamed NODE * COST has been renamed to ESTIMATE * ANALYZE has been renamed to ACTUAL * OUTPUT, SORT and QUALIFIER have been removed for clarity only <!ELEMENT plan (estimate, runtime?)> <!ELEMENT node (table?, index?, estimate, actual?, outerpath?, innerpath?, initpath)> <!ATTLIST plan\n" nodetype CDATA #REQUIRED> <!ELEMENT outerpath (node)> <!ELEMENT innerpath (node)> <!ELEMENT initpath (node)> <!ELEMENT estimate EMPTY > <!ATTLIST estimate startupcost CDATA #REQUIRED totalcost CDATA #REQUIRED rows CDATA #REQUIRED width CDATA #REQUIRED> <!ELEMENT actual EMPTY > <!ATTLIST actual\n" startuptime CDATA #REQUIRED totaltime CDATA #REQUIRED rows CDATA #REQUIRED loops CDATA #REQUIRED> * I'd much prefer to define this as a Schema, since that's a bit more flexible in what we can do, plus we can specify datatypes. * Based upon some of those issues, I'd suggest that further work on the DTD/Schema should only happen when a reasonable range of plans have been accumulated to allow full understanding of the possibilities I'm sorry I've found so much to say about this, but don't be dissuaded. The basic structure of the patch is there, we just need to get the details right also, so we can take full. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-patches mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches