Re: [HACKERS] machine-readable explain output
Andres Freund and...@anarazel.de writes: On 06/16/2009 04:32 PM, Robert Haas wrote: Note that even in this case we DON'T rely on the ordering of the nodes. The innerplan nodes have child nodes which contain their relationship to the parent. Not in the case of Append nodes, but I fail to see a problem there, so... The order of Append child nodes is in fact significant. If this representation loses that information then it needs to be fixed. However, is it really so bad to be relying on node order for this? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Tom Lane wrote: Andres Freund and...@anarazel.de writes: On 06/16/2009 04:32 PM, Robert Haas wrote: Note that even in this case we DON'T rely on the ordering of the nodes. The innerplan nodes have child nodes which contain their relationship to the parent. Not in the case of Append nodes, but I fail to see a problem there, so... The order of Append child nodes is in fact significant. If this representation loses that information then it needs to be fixed. However, is it really so bad to be relying on node order for this? No, if there is a genuine sequence of items then relying on node order is just fine. My earlier (mistaken) reference was to possibly relying on node order for a non-sequence relationship. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Andres Freund wrote: Anyway, I think what this discussion points out is that we actually need a formal XML Schema for this output. Agreed. If helpful I can create a schema for the current format. That will give us a useful starting point. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Tue, Jun 16, 2009 at 1:53 PM, Andres Freundand...@anarazel.de wrote: While that also looks sensible the more structured variant makes it easier to integrate additional stats which may not easily be pressed in the 'attribute' format. As a fastly contrived example you could have io statistics over time like: iostat stat time=10 name=pagefault.../stat stat time=20 name=pagefault.../stat stat time=30 name=pagefault.../stat /iostat Something like that would be harder with your variant. Actually that's exactly the kind of example I had in mind to make easier. I'm picturing adding a new tag, such as iostats, or actually I was thinking of dtrace. If we have separate tags for all the estimates and actual timings then any tags which come with the iostat or dtrace option would just get mixed up with the estimates and timing info. Each new module would provide a single tag which would have some attributes and some child tags depending on how much structure it needs. In cases where there's no structure, just a fixed list of scalars like the existing expected and actual stats I don't see any advantage to making each scalar a tag. (There's not much disadvantage except I would have said it was completely unreadable for a human given that you would have pages and pages of output for a significant size plan.) So your plan might look like scan type=... expected cost=.../ actual time=.../ iostats samples sample time=nnn value=nnn/ /samples /iostats dtrace script=foo.d probes probe name=foo result=nnn/ probe name=bar result=nnn/ /probes /dtrace That would make it easy for a tool like pgadmin which doesn't know what to do with the iostats to ignore the whole chunk, rather than have to dig through a list of stats some of which come from iostats and some from dtrace and some from the instrumentation and have to figure out which tags are things it can use and which are things it can't. -- Gregory Stark http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Greg Stark gsst...@mit.edu writes: I'm picturing adding a new tag, such as iostats, or actually I was thinking of dtrace. If we have separate tags for all the estimates and actual timings then any tags which come with the iostat or dtrace option would just get mixed up with the estimates and timing info. FWIW, I like Greg's idea of subdividing the available data this way. I'm no XML guru, so maybe there is a better way to do it --- but a very large part of the reason for doing this at all is to have an extensible format, and part of that IMHO is that client programs should be able to have some rough idea of what things are even when they don't know it exactly. But I'd be just as happy with a naming convention, like planner:rowcount versus actual:rowcount, etc. I don't know enough about XML usage to understand the benefits and costs of different ways of providing that kind of structure. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Tom Lane wrote: But I'd be just as happy with a naming convention, like planner:rowcount versus actual:rowcount, etc. I don't know enough about XML usage to understand the benefits and costs of different ways of providing that kind of structure. FYI, you probably don't want this. the ':' is not just another character, it separates the namespace designator from the local name. We probably only want one namespace. You can use '-' or '_' or '.' inside names to give them some structure beyond XML semantics. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Tue, Jun 16, 2009 at 12:04 PM, Tom Lanet...@sss.pgh.pa.us wrote: Greg Stark gsst...@mit.edu writes: I'm picturing adding a new tag, such as iostats, or actually I was thinking of dtrace. If we have separate tags for all the estimates and actual timings then any tags which come with the iostat or dtrace option would just get mixed up with the estimates and timing info. FWIW, I like Greg's idea of subdividing the available data this way. I'm no XML guru, so maybe there is a better way to do it --- but a very large part of the reason for doing this at all is to have an extensible format, and part of that IMHO is that client programs should be able to have some rough idea of what things are even when they don't know it exactly. I like it too, but I'd like to see us come up with a design that allows it to be used for all of the output formats (text, XML, and JSON). I think it we should be looking for a way to allow modules to publish abstract objects like property-value mappings, or lists of strings, rather than thinking strictly in terms of XML. If we have a module called foo that emits property bar with value baz and property bletch with value quux, then in text format we can print: Module Foo: Bar: Bletch Baz: Quux In XML we can print: Modules Module Module-NameFoo/Module-Name BarBletch/Bar BazQuux/Baz /Module /Modules (or any of about 10 reasonable alternatives that are functionally identical) In JSON we can print Modules : [ { Module Name : Foo, Bar: Bletch, Baz: Quux } ] (or any of about 2 reasonable alternatives that are functionally identical) If we start thinking in terms of provide an API to insert XML into the XML-format output, we get back to my original complaint: if the only way of getting additional data is to piece through the XML output, then we'll quickly reach the point where users need XSLT and stylesheets to extract the data they care about. I think that's an annoyance that is easily avoidable. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: On Tue, Jun 16, 2009 at 12:04 PM, Tom Lanet...@sss.pgh.pa.us wrote: FWIW, I like Greg's idea of subdividing the available data this way. I like it too, but I'd like to see us come up with a design that allows it to be used for all of the output formats (text, XML, and JSON). I think it we should be looking for a way to allow modules to publish abstract objects like property-value mappings, or lists of strings, rather than thinking strictly in terms of XML. If we have a module called foo that emits property bar with value baz and property bletch with value quux, then ... This seems to be missing the point I was trying to make, which is that a design like that actually offers no leverage at all: if you don't know all about foo to start with, you have no idea what to do with either bar or bletch. You can *parse* the data, since it's in XML or JSON or whatever, but you don't know what it is. The EXPLAIN problem is a fairly constrained universe: there is going to be a tree of plan nodes, there are going to be some static properties of each plan node, and there may or may not be various sorts of estimates and/or measurements attached to each one. What I'm after is that code examining the output can know oh, this is a measurement even if it hasn't heard of the particular kind of measurement. As a concrete example of what I'm thinking about, I'd hope that PgAdmin would be able to display a graphical summary of a plan tree, and then pop up measurements associated with one of the nodes when you right-click on that node. To do this, it doesn't necessarily have to know all about each specific measurement that a particular backend version might emit; but it needs to be able to tell which things are measurements. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Tue, Jun 16, 2009 at 1:21 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jun 16, 2009 at 12:04 PM, Tom Lanet...@sss.pgh.pa.us wrote: FWIW, I like Greg's idea of subdividing the available data this way. I like it too, but I'd like to see us come up with a design that allows it to be used for all of the output formats (text, XML, and JSON). I think it we should be looking for a way to allow modules to publish abstract objects like property-value mappings, or lists of strings, rather than thinking strictly in terms of XML. If we have a module called foo that emits property bar with value baz and property bletch with value quux, then ... This seems to be missing the point I was trying to make, which is that a design like that actually offers no leverage at all: if you don't know all about foo to start with, you have no idea what to do with either bar or bletch. You can *parse* the data, since it's in XML or JSON or whatever, but you don't know what it is. The EXPLAIN problem is a fairly constrained universe: there is going to be a tree of plan nodes, there are going to be some static properties of each plan node, and there may or may not be various sorts of estimates and/or measurements attached to each one. What I'm after is that code examining the output can know oh, this is a measurement even if it hasn't heard of the particular kind of measurement. As a concrete example of what I'm thinking about, I'd hope that PgAdmin would be able to display a graphical summary of a plan tree, and then pop up measurements associated with one of the nodes when you right-click on that node. To do this, it doesn't necessarily have to know all about each specific measurement that a particular backend version might emit; but it needs to be able to tell which things are measurements. *scratches head* So you're looking for a way to categorize the data that appear in the output by type, like any given piece of data is either a measurement, an estimate, or a part of the plan structure? It seems to me that with a sufficiently powerful API, add-on modules could emit arbitrary stuff that might not fall into the categories that you've mentioned. For example, there was a previous EXPLAIN XML patch which contained a bunch of code that spit out plans that were considered but not chosen. And there could easily be other kinds of less invasive add-ons that would still want to emit properties that are formatted as text or lists rather than measurements per se. I think it's kind of hopeless to think that a third-party module is going to be able to do much better than to display any unexpected properties whose value is just text and punt any unexpected properties whose value is a complex object (nested tags in XML-parlance, hash in JSON). I have a feeling I'm still missing the point here... ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: On Tue, Jun 16, 2009 at 1:21 PM, Tom Lanet...@sss.pgh.pa.us wrote: As a concrete example of what I'm thinking about, I'd hope that PgAdmin would be able to display a graphical summary of a plan tree, and then pop up measurements associated with one of the nodes when you right-click on that node. It seems to me that with a sufficiently powerful API, add-on modules could emit arbitrary stuff that might not fall into the categories that you've mentioned. I don't have a problem with inventing new categories when we need to. What I'm objecting to is using the above to justify flattening the design completely, so that the only way to know anything about a particular datum is to know that type of datum specifically. There is way more structure in EXPLAIN than that, and we should design it accordingly. (Note that any information about rejected plans could not usefully be attached to the plan tree anyway; it'd have to be put in some other child of the topmost node.) And there could easily be other kinds of less invasive add-ons that would still want to emit properties that are formatted as text or lists rather than measurements per se. By measurement I did not mean to imply single number. Text strings or lists could be handled very easily, I think, especially since there are explicit ways to represent those in XML. The main point here is that we have a pretty good idea of what general-purpose client code is likely to want to do with the data, and in a lot of cases that does not translate to having to know each node type explicitly, so long as it can be categorized. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Tue, Jun 16, 2009 at 2:12 PM, Tom Lanet...@sss.pgh.pa.us wrote: The main point here is that we have a pretty good idea of what general-purpose client code is likely to want to do with the data, and in a lot of cases that does not translate to having to know each node type explicitly, so long as it can be categorized. I agree. I'm just not seeing the need for an *explicit* categorization contained within the data itself. For one thing, AIUI, that's the job of things like an XML Schema, which Andres Freund has already agreed to write, and I would expect that would be of some value to tool-writers, else why are we creating it? I also think scalars and lists are recognizable without any particular additional markup at all, just by introspection of the contents. Even if we do need some kind of additional markup, I'm reluctant to try to design it without some feedback from people writing actual tools about what they find inadequate in the current output. The good news is that if this patch gets committed fairly quickly after the release of 8.4, tool authors should have enough time to discover where any bodies are buried in time to fix them before 8.5. But I'm really unconvinced that any of this minor formatting stuff is going to rise to the level of a real problem. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On 06/16/2009 09:51 PM, Robert Haas wrote: On Tue, Jun 16, 2009 at 2:12 PM, Tom Lanet...@sss.pgh.pa.us wrote: The main point here is that we have a pretty good idea of what general-purpose client code is likely to want to do with the data, and in a lot of cases that does not translate to having to know each node type explicitly, so long as it can be categorized. I agree. I'm just not seeing the need for an *explicit* categorization contained within the data itself. For one thing, AIUI, that's the job of things like an XML Schema, which Andres Freund has already agreed to write, and I would expect that would be of some value to tool-writers, else why are we creating it? It defines how exactly the output has to look - thats not easily readable out of explain.c - so anything that could be created and validated with that schema should be acceptable by $tool - even if explain may not create it. Just like EBNF or similar for other languages. It does not help categorizing values in planner/execution/whatever categories automatedly by some tool though. I attached a simple relaxng schema - if somebody likes another format that should be generatable out of that (using trang). It surely could use some more work, but I think its detailed enough for now. I also think scalars and lists are recognizable without any particular additional markup at all, just by introspection of the contents. That somewhat defies the usage of a strictly structured format? Perhaps I am misunderstanding you though. On another note it may be interesting to emit the current options to explain in xml/json format - although that depends whether the option syntax will be accepted. Writing the schema I noticed something else I did not like about the current format: Triggers Trigger TriggerName/Trigger or: ConstraintConstraintName/Constraint /Trigger /Triggers The double usage of Trigger/ seems to be somewhat ugly. Renaming it to TriggerName/ConstraintName seems to be a good idea - at least when staying at the current tag oriented style. Andres grammar xmlns=http://relaxng.org/ns/structure/1.0; ns=http://www.postgresql.org/2009/explain; start element name=explain choice ref name=QueryType/ ref name=NotifyType/ ref name=UtilityType/ /choice /element /start define name=QueryType element name=Query ref name=PlanType/ optional ref name=TriggersType/ /optional optional element name=Total-Runtimetext//element /optional /element /define define name=PlanType element name=Plan element name=Node-Type text//element interleave optional element name=Join-Type text//element /optional optional element name=Parent-Relationshiptext//element /optional optional element name=Parent-Labeltext//element /optional optional element name=Relation-Nametext//element /optional optional element name=Aliastext//element /optional optional element name=Startup-Costtext//element /optional optional element name=Strategytext//element /optional optional element name=Total-Costtext//element /optional optional element name=Plan-Rowstext//element /optional optional element name=Plan-Widthtext//element /optional optional element name=Actual-Startup-Timetext//element /optional optional
Re: [HACKERS] machine-readable explain output
I wrote: On Sun, Jun 14, 2009 at 1:04 PM, Robert Haasrobertmh...@gmail.com wrote: On Sun, Jun 14, 2009 at 1:02 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jun 14, 2009 at 11:28 AM, Tom Lanet...@sss.pgh.pa.us wrote: However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. Makes sense. However, if we just make that change in do_tup_output(), then we'll break the ability to use that function for non-text datatypes. I'd envision it taking Datums, so it doesn't really matter. However, as you say, specializing it to text only wouldn't be much of a loss. I like the Datum option, so I'll work up a patch for that, unless you want to just do it and spare me the trouble. :-) Here's an attempt. Is this anything like what you had in mind? Hmm... on further review, I'm thinking this is still a bit wastful, because we don't really need (I think) to call TupleDescGetAttInMetadata from begin_tup_output_tupdesc. But I'm not sure what the best way is to avoid that. Any thoughts? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: Hmm... on further review, I'm thinking this is still a bit wastful, because we don't really need (I think) to call TupleDescGetAttInMetadata from begin_tup_output_tupdesc. But I'm not sure what the best way is to avoid that. Any thoughts? Er, just don't do it? We shouldn't need it if the function is doing heap_form_tuple directly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Mon, Jun 15, 2009 at 9:51 AM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Hmm... on further review, I'm thinking this is still a bit wastful, because we don't really need (I think) to call TupleDescGetAttInMetadata from begin_tup_output_tupdesc. But I'm not sure what the best way is to avoid that. Any thoughts? Er, just don't do it? We shouldn't need it if the function is doing heap_form_tuple directly. Oh, I guess that works. I had thought there might be people calling begin_tup_output_tupdesc() who wanted to go on to call BuildTupleFromCStrings(), but it seems that's not the case. In fact, it looks like I can probably rip that member out of TupOutputState altogether. Will update patch. Does this look like what you were thinking otherwise? Thanks, ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: it looks like I can probably rip that member out of TupOutputState altogether. Will update patch. Does this look like what you were thinking otherwise? Yeah, that's exactly what I was thinking. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On 6/13/09, Bernd Helmle maili...@oopsware.de wrote: --On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? I hope not, otherwise the usefulness of the format is significantly reduced (to practically zero) if tools cannot rely on it being available and have to fall back to something else if it's not available. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
2009/6/14 Dave Page dp...@pgadmin.org: On 6/13/09, Bernd Helmle maili...@oopsware.de wrote: --On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? I hope not, otherwise the usefulness of the format is significantly reduced (to practically zero) if tools cannot rely on it being available and have to fall back to something else if it's not available. I thing so using --with-libxml is good idea. Is nonsense repeat some necessary xml code like xml escaping and similar. And almost all distributed PostgreSQL binaries are compiled with xml support, so this cannot do some problems. When somebody compile pg without xml support, then he knows what he do. regards Pavel Stehule -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On 6/14/09, Pavel Stehule pavel.steh...@gmail.com wrote: 2009/6/14 Dave Page dp...@pgadmin.org: On 6/13/09, Bernd Helmle maili...@oopsware.de wrote: --On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? I hope not, otherwise the usefulness of the format is significantly reduced (to practically zero) if tools cannot rely on it being available and have to fall back to something else if it's not available. I thing so using --with-libxml is good idea. Is nonsense repeat some necessary xml code like xml escaping and similar. And almost all distributed PostgreSQL binaries are compiled with xml support, so this cannot do some problems. When somebody compile pg without xml support, then he knows what he do. That will mean we never get to use XML explain in pgAdmin. We're not in the business of writing basic features that might work, if the postgres packager enabled an option. We need to be able to rely on such features always being available. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Dave Page wrote: I thing so using --with-libxml is good idea. Is nonsense repeat some necessary xml code like xml escaping and similar. And almost all distributed PostgreSQL binaries are compiled with xml support, so this cannot do some problems. When somebody compile pg without xml support, then he knows what he do. That will mean we never get to use XML explain in pgAdmin. We're not in the business of writing basic features that might work, if the postgres packager enabled an option. We need to be able to rely on such features always being available. As a matter of curiosity, do we have any idea what platforms don't support libxml2? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Dave Page dp...@pgadmin.org writes: On 6/14/09, Pavel Stehule pavel.steh...@gmail.com wrote: I thing so using --with-libxml is good idea. That will mean we never get to use XML explain in pgAdmin. Exactly. We are *not* going to make libxml a required piece of infrastructure, and that means that XML-format explain output will be useless to most client tools if it doesn't work without libxml. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Andrew Dunstan and...@dunslane.net writes: As a matter of curiosity, do we have any idea what platforms don't support libxml2? It's only partially about whether libxml2 is portable enough. A person building Postgres might also have legitimate concerns about how bug-free and/or secure it is. We've already spent nontrivial amounts of time working around libxml bugs; and as for security, google shows at least four CVEs against libxml2 in the past two years, so it's not a negligible risk. I can entirely see people choosing to build without it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Peter Eisentraut pete...@gmx.net writes: A regression test for EXPLAIN, however, should primarily check whether the output format is stable. We are planning to offer this as a public interface, after all. You could use faked up statistics and all but one or two plan types turned off, and then the results should be pretty stable. You'd be surprised :-(. We've found in the past that queries in the regression tests get different plans across different platforms just because of alignment-rule differences (leading to different numbers of rows per page, etc etc). I think that test cases could be chosen to be relatively stable points in the plan space, but it's hopeless to imagine that the low-order digits of cost estimates will be the same across all platforms. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: On Sat, Jun 13, 2009 at 6:40 PM, Tom Lanet...@sss.pgh.pa.us wrote: I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here. Hmm, I just tried to do this by modifying ExplainResultDesc to use XMLOID rather than TEXTOID when stmt-format == EXPLAIN_FORMAT_XML, and sure enough, explain (format xml) ... fails when --with-libxml is not specified. That's because the code goes through BuildTupleFromCStrings, which invokes xml_in in this scenario, and xml_in (as opposed to xml_out) does depend on libxml. However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sun, Jun 14, 2009 at 11:28 AM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jun 13, 2009 at 6:40 PM, Tom Lanet...@sss.pgh.pa.us wrote: I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here. Hmm, I just tried to do this by modifying ExplainResultDesc to use XMLOID rather than TEXTOID when stmt-format == EXPLAIN_FORMAT_XML, and sure enough, explain (format xml) ... fails when --with-libxml is not specified. That's because the code goes through BuildTupleFromCStrings, which invokes xml_in in this scenario, and xml_in (as opposed to xml_out) does depend on libxml. However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. Makes sense. However, if we just make that change in do_tup_output(), then we'll break the ability to use that function for non-text datatypes. Currently that doesn't look like a problem, because the only clients are ShowGUCConfigOption(), do_text_output_oneline(), and do_text_output_multiline(), -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: On Sun, Jun 14, 2009 at 11:28 AM, Tom Lanet...@sss.pgh.pa.us wrote: However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. Makes sense. However, if we just make that change in do_tup_output(), then we'll break the ability to use that function for non-text datatypes. I'd envision it taking Datums, so it doesn't really matter. However, as you say, specializing it to text only wouldn't be much of a loss. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sun, Jun 14, 2009 at 1:02 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jun 14, 2009 at 11:28 AM, Tom Lanet...@sss.pgh.pa.us wrote: However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. Makes sense. However, if we just make that change in do_tup_output(), then we'll break the ability to use that function for non-text datatypes. I'd envision it taking Datums, so it doesn't really matter. However, as you say, specializing it to text only wouldn't be much of a loss. I like the Datum option, so I'll work up a patch for that, unless you want to just do it and spare me the trouble. :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sun, Jun 14, 2009 at 1:04 PM, Robert Haasrobertmh...@gmail.com wrote: On Sun, Jun 14, 2009 at 1:02 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jun 14, 2009 at 11:28 AM, Tom Lanet...@sss.pgh.pa.us wrote: However, using BuildTupleFromCStrings is wasteful/stupid for *both* text and xml output, so it seems like getting rid of it is the thing to do here. Makes sense. However, if we just make that change in do_tup_output(), then we'll break the ability to use that function for non-text datatypes. I'd envision it taking Datums, so it doesn't really matter. However, as you say, specializing it to text only wouldn't be much of a loss. I like the Datum option, so I'll work up a patch for that, unless you want to just do it and spare me the trouble. :-) Here's an attempt. Is this anything like what you had in mind? ...Robert *** a/src/backend/executor/execTuples.c --- b/src/backend/executor/execTuples.c *** *** 95,100 --- 95,101 #include catalog/pg_type.h #include nodes/nodeFuncs.h #include storage/bufmgr.h + #include utils/builtins.h #include utils/lsyscache.h #include utils/typcache.h *** *** 1216,1232 begin_tup_output_tupdesc(DestReceiver *dest, TupleDesc tupdesc) /* * write a single tuple * - * values is a list of the external C string representations of the values - * to be projected. - * * XXX This could be made more efficient, since in reality we probably only * need a virtual tuple. */ void ! do_tup_output(TupOutputState *tstate, char **values) { ! /* build a tuple from the input strings using the tupdesc */ ! HeapTuple tuple = BuildTupleFromCStrings(tstate-metadata, values); /* put it in a slot */ ExecStoreTuple(tuple, tstate-slot, InvalidBuffer, true); --- 1217,1233 /* * write a single tuple * * XXX This could be made more efficient, since in reality we probably only * need a virtual tuple. */ void ! do_tup_output(TupOutputState *tstate, Datum *values, bool *isnull) { ! TupleDesc tupdesc = tstate-metadata-tupdesc; ! HeapTuple tuple; ! ! /* Form a tuple. */ ! tuple = heap_form_tuple(tupdesc, values, isnull); /* put it in a slot */ ExecStoreTuple(tuple, tstate-slot, InvalidBuffer, true); *** *** 1241,1264 do_tup_output(TupOutputState *tstate, char **values) /* * write a chunk of text, breaking at newline characters * - * NB: scribbles on its input! - * * Should only be used with a single-TEXT-attribute tupdesc. */ void do_text_output_multiline(TupOutputState *tstate, char *text) { while (*text) { char *eol; eol = strchr(text, '\n'); if (eol) ! *eol++ = '\0'; else ! eol = text +strlen(text); ! do_tup_output(tstate, text); text = eol; } } --- 1242,1275 /* * write a chunk of text, breaking at newline characters * * Should only be used with a single-TEXT-attribute tupdesc. */ void do_text_output_multiline(TupOutputState *tstate, char *text) { + Datum values[1]; + bool isnull[1] = { false }; + while (*text) { char *eol; + int len; eol = strchr(text, '\n'); if (eol) ! { ! len = eol - text; ! ++eol; ! } else ! { ! len = strlen(text); ! eol += len; ! } ! values[0] = PointerGetDatum(cstring_to_text_with_len(text, len)); ! do_tup_output(tstate, values, isnull); ! pfree(DatumGetPointer(values[0])); text = eol; } } *** a/src/backend/utils/misc/guc.c --- b/src/backend/utils/misc/guc.c *** *** 5978,5984 ShowAllGUCConfig(DestReceiver *dest) int i; TupOutputState *tstate; TupleDesc tupdesc; ! char *values[3]; /* need a tuple descriptor representing three TEXT columns */ tupdesc = CreateTemplateTupleDesc(3, false); --- 5978,5985 int i; TupOutputState *tstate; TupleDesc tupdesc; ! Datum values[3]; ! bool isnull[3] = { false, false, false }; /* need a tuple descriptor representing three TEXT columns */ tupdesc = CreateTemplateTupleDesc(3, false); *** *** 5996,6017 ShowAllGUCConfig(DestReceiver *dest) for (i = 0; i num_guc_variables; i++) { struct config_generic *conf = guc_variables[i]; if ((conf-flags GUC_NO_SHOW_ALL) || ((conf-flags GUC_SUPERUSER_ONLY) !am_superuser)) continue; /* assign to the values array */ ! values[0] = (char *) conf-name; ! values[1] = _ShowOption(conf, true); ! values[2] = (char *) conf-short_desc; /* send it to dest */ ! do_tup_output(tstate, values); /* clean up */ ! if (values[1] != NULL) ! pfree(values[1]); } end_tup_output(tstate); --- 5997,6033 for (i = 0; i num_guc_variables; i++) { struct config_generic *conf = guc_variables[i]; + char *setting; if ((conf-flags GUC_NO_SHOW_ALL) || ((conf-flags GUC_SUPERUSER_ONLY)
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 01:10:06 Robert Haas wrote: pgexplain, as it happens... I could post some samples of the output, but it seems like it might be just as well to let those who are curious try it for themselves. I'd rather get opinions from people who care enough to download test than from those who are just bikeshedding. :-) I recommend, however, that you think about writing a regression test for this, so the interfaces are explicit, and those tweaking them in the future know what they are dealing with. A couple of comments on the specifics of the output: For the JSON format: * Numbers should not be quoted. For the XML format: * Instead of pgexplain, use explain with an XML namespace declaration. The schema name is missing in either output format. I think that was supposed to be one of the features of this that the objects are unambiguously qualified. I'm not sure I like element names such as Node-Type, instead of say nodetype, which is more like HTML and DocBook. (Your way might be more like SOAP, I guess.) Also, the result type of an EXPLAIN (format xml) should be type xml, not text. In general, I like this direction very much. There will probably be more tweaks on the output format over time. It's not like the plain EXPLAIN hasn't been tweaked countless times. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
--On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Bernd Helmle maili...@oopsware.de writes: --On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sat, Jun 13, 2009 at 6:40 PM, Tom Lanet...@sss.pgh.pa.us wrote: Bernd Helmle maili...@oopsware.de writes: --On 13. Juni 2009 15:01:43 -0400 Robert Haas robertmh...@gmail.com wrote: Also, the result type of an EXPLAIN (format xml) should be type xml, not text. Seems reasonable. I'll see if I can figure out how to do that. I suppose it's okay then, that the format is not available when the server isn't build with --with-libxml ? I believe we have things set up so that you can still print xml data without libxml configured in. We'd need to be sure casting to text works too, but other than that I don't see an issue here. Hmm, I just tried to do this by modifying ExplainResultDesc to use XMLOID rather than TEXTOID when stmt-format == EXPLAIN_FORMAT_XML, and sure enough, explain (format xml) ... fails when --with-libxml is not specified. But maybe that's not the right way to do it - now that I think about it, using that in combination with do_text_output_multiline() seems totally wrong even if we end up deciding not to worry about the output type, since while there are multiple rows when the output is considered as text, there is surely only one row when you look at the whole thing as an XML document. I'm not too sure how to do this though. Help? In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here. http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Robert Haas robertmh...@gmail.com writes: In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here. http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sat, Jun 13, 2009 at 7:42 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here. http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces. Well, if you get the EXPLAIN output into the system by defining a wrapper function, said wrapper function will return the type that it's defined to return, regardless of what EXPLAIN itself returns, no? I don't have a problem making it return XML; I'm just not exactly sure how to do it. Is it possible to get that working without depending on libxml? How? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 22:01:43 Robert Haas wrote: * Instead of pgexplain, use explain with an XML namespace declaration. Could you specify this a bit further, like write out exactly what you want it to look like? My XML-fu is not very strong. Just replace your pgexplain by explain xmlns=http://www.postgresql.org/2009/explain; The actual URI doesn't matter, as long as it is distinguishing. The value I chose here follows conventions used by W3C. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Saturday 13 June 2009 22:01:43 Robert Haas wrote: I recommend, however, that you think about writing a regression test for this, so the interfaces are explicit, and those tweaking them in the future know what they are dealing with. I would like to have something in this area, but Tom didn't think it was workable. http://archives.postgresql.org/message-id/603c8f070904151623ne07d744k615edd 4aa669a...@mail.gmail.com Currently, we don't even have something trivial like EXPLAIN SELECT 1 in the regression tests, so even if you completely break EXPLAIN so that it core dumps (voice of experience speaking here) make check still passes with flying colors. That post described a scenario where you check whether given a data set and ANALYZE, the optimizer produces a certain plan. I agree that that might be tricky. A regression test for EXPLAIN, however, should primarily check whether the output format is stable. We are planning to offer this as a public interface, after all. You could use faked up statistics and all but one or two plan types turned off, and then the results should be pretty stable. Unless the fundamental cost model changes, but it doesn't do that very often for the simpler plan types anyway. Things to check for would be checking whether all the fields are there, quoted and escaped correctly, and what happens if statistics are missing or corrupted, etc. Or whether you get any output at all, as you say. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Sunday 14 June 2009 07:27:19 Robert Haas wrote: On Sat, Jun 13, 2009 at 7:42 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: In any event, considering that EXPLAIN is a utility statement and can't be embedded within a query, I'm not sure what benefit we get out of returning the data as XML rather than text. This doesn't seem likely to change either, based on Tom's comments here. http://archives.postgresql.org/pgsql-hackers/2009-05/msg00969.php I think you misinterpreted the point of that example, which is that there already is a way to get the output of EXPLAIN into the system for further processing. Were this not so, we wouldn't be worrying at all what data type it claims to have. But since there is a way, it's important what data type it produces. Well, if you get the EXPLAIN output into the system by defining a wrapper function, said wrapper function will return the type that it's defined to return, regardless of what EXPLAIN itself returns, no? I don't have a problem making it return XML; I'm just not exactly sure how to do it. Is it possible to get that working without depending on libxml? How? Even if this doesn't end up being feasible, I feel it's important that the XML and JSON formats return one datum, not one per line. Otherwise a client that wants to do some processing on the result will have to do about three extra steps to get the result usable. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On 6/11/09 10:15 PM, Robert Haas wrote: Here we go, XML and JSON output. You will need to apply explain_refactor-v4.patch and explain_options-v2.patch first, then apply the two patches attached to this message. Wow, cool. Can this work with auto_explain? That's where I see machine-readable being most useful. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Fri, Jun 12, 2009 at 11:47 AM, Josh Berkusj...@agliodbs.com wrote: On 6/11/09 10:15 PM, Robert Haas wrote: Here we go, XML and JSON output. You will need to apply explain_refactor-v4.patch and explain_options-v2.patch first, then apply the two patches attached to this message. Wow, cool. Can this work with auto_explain? That's where I see machine-readable being most useful. The patch does touch contrib/auto_explain, but just enough to make it keep working the same way it does now. I don't think it would be too hard to improve on that, though; I might work on it if I get bored, but I'm hoping someone else will be motivated enough to do that part. :-) How would you go about extracting the XML/JSON bits from the rest of what is in the log file? (apologies if this is a question I should already know the answer to) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
How would you go about extracting the XML/JSON bits from the rest of what is in the log file? (apologies if this is a question I should already know the answer to) If you do CSV output, it's in a field. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
Josh Berkus wrote: How would you go about extracting the XML/JSON bits from the rest of what is in the log file? (apologies if this is a question I should already know the answer to) If you do CSV output, it's in a field. And even if it's not, a well formed XML document would be fairly easy to extract, especially if the root element is well known (e.g. pg:explain or some such). And if the patch doesn't produce a well-formed XML doc then it needs work ;-). It might be nice if Robert were to post some samples of the output. Quick question: does the patch use formal methods using, say, the DOM API to build up the XML, or informal methods (like foo.append('element'); foo.append(content); foo.append('/element'); ) As you can tell, I haven't looked over it yet. But I intend to ;-) cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Fri, Jun 12, 2009 at 5:13 PM, Andrew Dunstanand...@dunslane.net wrote: Josh Berkus wrote: How would you go about extracting the XML/JSON bits from the rest of what is in the log file? (apologies if this is a question I should already know the answer to) If you do CSV output, it's in a field. And even if it's not, a well formed XML document would be fairly easy to extract, especially if the root element is well known (e.g. pg:explain or some such). And if the patch doesn't produce a well-formed XML doc then it needs work ;-). It might be nice if Robert were to post some samples of the output. pgexplain, as it happens... I could post some samples of the output, but it seems like it might be just as well to let those who are curious try it for themselves. I'd rather get opinions from people who care enough to download test than from those who are just bikeshedding. :-) Quick question: does the patch use formal methods using, say, the DOM API to build up the XML, or informal methods (like foo.append('element'); foo.append(content); foo.append('/element'); ) As you can tell, I haven't looked over it yet. But I intend to ;-) Use the Source, Luke. :-) But, it's informal methods. I don't see a lot of value in doing it the other way, though perhaps I could be convinced otherwise. One thing that's nice about the way it works now is that the only support function it requires is a basic XML-escaping function, which it turns out we already have in the PG sources anyway, though not in a quite usable form (the infrastructure patch deals with the necessary adjustments). So you can explain (format xml) even if you compile without --with-libxml. If you want to see how the actual XML/JSON stuff works, you might want to start with the last patch in the series (explain_format). If you want to commit it, a course of action to which I can give my unbiased endorsement, then you'll want to start with explain_refactor. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] machine-readable explain output
On Friday 12 June 2009 08:15:17 Robert Haas wrote: Here we go, XML and JSON output. Could you post some examples of how some plans would look in either format? That would help us judge the particulars. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers