On Fri, Mar 27, 2020 at 2:01 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote: > > On 2020/03/27 19:00, Julien Rouhaud wrote: > > On Thu, Mar 26, 2020 at 02:22:47PM +0100, Julien Rouhaud wrote: > >> On Thu, Mar 26, 2020 at 08:08:35PM +0900, Fujii Masao wrote: > >>> > >>> Here are other comments. > >>> > >>> - if (jstate) > >>> + if (kind == PGSS_JUMBLE) > >>> > >>> Why is PGSS_JUMBLE necessary? ISTM that we can still use jstate here, > >>> instead. > >>> > >>> If it's ok to remove PGSS_JUMBLE, we can define PGSS_NUM_KIND(=2) instead > >>> and replace 2 in, e.g., total_time[2] with PGSS_NUM_KIND. Thought? > >> > >> Yes, we could be using jstate here. I originally used that to avoid > >> passing > >> PGSS_EXEC (or the other one) as a way to say "ignore this information as > >> there's the jstate which says it's yet another meaning". If that's not an > >> issue, I can change that as PGSS_NUM_KIND will clearly improve the > >> explicit "2" > >> all over the place. > > > > Done, passing PGSS_PLAN when jumble is intended, with a comment saying that > > the > > pgss_kind is ignored in that case. > > > >>> + <entry><structfield>total_time</structfield></entry> > >>> + <entry><type>double precision</type></entry> > >>> + <entry></entry> > >>> + <entry> > >>> + Total time spend planning and executing the statement, in > >>> milliseconds > >>> + </entry> > >>> + </row> > >>> > >>> pg_stat_statements view has this column but the function not. > >>> We should make both have the column or not at all, for consistency? > >>> I'm not sure if it's good thing to expose the sum of total_plan_time > >>> and total_exec_time as total_time. If some users want that, they can > >>> easily calculate it from total_plan_time and total_exec_time by using > >>> their own logic. > >> > >> I think we originally added it as a way to avoid too much compatibility > >> break, > >> and also because it seems like a field most users will be interested in > >> anyway. > >> Now that I'm thinking about it again, I indeed think it was a mistake to > >> have > >> that in view part only. Not mainly for consistency, but for users who > >> would be > >> interested in the total_time field while not wanting to pay the overhead of > >> retrieving the query text if they don't need it. So I'll change that! > > > > Done > > > > Thanks for updating the patch! But I'm still wondering if it's really > good thing to expose total_time. For example, when the query fails > with an error many times and "calls" becomes very different from > "plans", "total_plan_time" + "total_exec_time" is really what the users > are interested in?
That's also the case when running explain without analyze, or prepared statements that fallback to generic plans. As a user, knowing how long postgres actually spent processing a query is interesting as a way to find likely low hanging fruits, even if there's no planning/execution strict correlation. The planning/execution detail is also useful but that's probably not what I'd be starting from (at least in OLTP workload). The error scenario is unfortunate, but that's yet another topic. > Some users may be interested in the sum of mean > times, but it's not exposed... Yes, we had a discussion about summing the other fields, but it seems to me that doing a sum of computed fields doesn't really make sense. Mean without variance is already not that useful. > So what I'd like to say is that the information that users are interested > in would vary on each situation and case. At least for me it seems > enough for pgss to report only the basic information. Then users > can calculate to get the numbers (like total_time) they're interested in, > from those basic information. > > But of course, I'd like to hear more opinions about this... +1 Unless someone chime in by tomorrow, I'll just drop the sum as it seems less controversial and not a blocker in userland if users are interested. > > + if (api_version >= PGSS_V1_8) > + values[i++] = Int64GetDatumFast(tmp.total_time[0] + > + > tmp.total_time[1]); > > BTW, Int64GetDatumFast() should be Float8GetDatumFast()? Oh indeed, embarrassing copy/pasto. > > >>> + nested_level++; > >>> + PG_TRY(); > >>> > >>> In old thread [1], Tom Lane commented the usage of nested_level > >>> in the planner hook. There seems no reply to that so far. What's > >>> your opinion about that comment? > >>> > >>> [1] https://www.postgresql.org/message-id/28980.1515803...@sss.pgh.pa.us > >> > >> Oh thanks, I didn't noticed this part of the discussion. I agree with > >> Tom's > >> concern, and I think that having a specific nesting level variable for the > >> planner is the best workaround, so I'll implement that. > > > > Done. > > > > I also exported BufferUsageAccumDiff as mentioned previously, as it seems > > clearner and will avoid future useless code churn, and run pgindent. > > Many thanks!! I'm thinking to commit this part separately. > So I made that patch based on your patch. Attached. Thanks! It looks good to me.