Re: [HACKERS] Prepared statements and generic plans

'br...@momjian.us' Fri, 10 Jun 2016 16:47:20 -0700

On Tue, Jun  7, 2016 at 06:52:15AM +0000, Albe Laurenz wrote:
> Bruce Momjian wrote:
> >> !    distinct column values, a generic plan assumes a column equality
> >> !    comparison will match 33% of processed rows.  Column statistics
> >>
> >> ... assumes *that* a column equality comparison will match 33% of *the* 
> >> processed rows.
> > 
> > Uh, that seems overly wordy.  I think the rule is that if the sentence
> > makes sense without the words, you should not use them, but it is
> > clearly a judgement call in this case.  Do you agree?
> 
> My gut feeling is that at least the "the" should be retained, but mine
> are the guts of a German speaker.
> It is clearly a judgement call, so follow your instincts.


I think "that/the" would make sense if this sentence was referencing a
specific result.  The sentence is referencing a hypothetical, so I don't
think "that/the" is needed.

> > One more thing --- there was talk of moving some of this into chapter
> > 66, but as someone already mentioned, there are no subsections there
> > because it is a dedicated topic:
> > 
> >     66. How the Planner Uses Statistics.
> > 
> > I am not inclined to add a prepare-only section to that chapter.  On the
> > other hand, the issues described apply to PREPARE and to protocol-level
> > prepare, so having it in PREPARE also seems illogical.  However, I am
> > inclined to leave it in PREPARE until we are ready to move all of this
> > to chapter 66.
> 
> I think it would be ok to leave it where it is in your patch; while the
> paragraph goes into technical detail, it is still alright in the general
> documentation (but only just).

I researched moving some of this text into chapter 66, but found that
only some of it related to the optimizer.  I also realized that the text
applies to the libpq/wire protocol prepare cases too, so rather than
bounce readers to the PREPARE manual page, and then to chapter 66, I
just kept it all in PREPARE, with a reference from the wire protocol
prepare section.

Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
protocol?  I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE. 
However, I don't see any way to inject EXPLAIN into the libpq/wire
prepare case.  Can you specify prepare(EXPLAIN SELECT)?  (PREPARE
EXPLAIN SELECT throws a syntax error.)

Looking at how the code behaves, it seems custom plans that are _more_
expensive (plus planning cost) than the generic plan switch to the
generic plan after five executions, as now documented.  Custom plans
that are significantly _cheaper_ than the generic plan _never_ use the
generic plan.

Here is an example --- first load this SQL:

        DROP TABLE IF EXISTS test;
        CREATE TABLE test (c1 INT, c2 INT);
        INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(1, 10000) AS a(c1);
        INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(10001, 15000) AS a(c1);
        INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM 
generate_series(15001, 20000) AS a(c1);
        -- add non-uniformly-distributed values to 'c2'
        INSERT INTO test SELECT 20001, 3;
        INSERT INTO test SELECT 20002, 4;
        CREATE INDEX i_test_c1 ON test (c1);
        CREATE INDEX i_test_c2 ON test (c2);
        ANALYZE test;
        PREPARE prep_c1 AS SELECT * FROM test WHERE c1 = $1;
        PREPARE prep_c2 AS SELECT * FROM test WHERE c2 = $1;

prep_c1 references 'c1', which is a unique column.  Any value used in
the EXECUTE, e.g. EXPLAIN EXECUTE prep_c1(1), existent or non-existent,
generates an index scan, and after five executions a generic index
scan is used.

For prep_c2, if you use the 50% common value '1', the first five
executions use a sequential scan, then the sixth is a generic Bitmap
Heap Scan. For the 25% value of '0' or '2',  the first five runs
generate a Bitmap Heap Scan, and a generic Bitmap Heap Scan on the sixth
and after.

For a prep_c2 value of 3 or any non-existent value, an Index Scan is
used, and a generic plan is never chosen, because the Index Scan is
significantly cheaper than the generic plan.

Updated patch attached.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
new file mode 100644
index 3829a14..6285dd0
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*************** PGresult *PQprepare(PGconn *conn,
*** 2303,2310 ****
         <para>
          <function>PQprepare</> creates a prepared statement for later
          execution with <function>PQexecPrepared</>.  This feature allows
!         commands that will be used repeatedly to be parsed and planned just
!         once, rather than each time they are executed.
          <function>PQprepare</> is supported only in protocol 3.0 and later
          connections; it will fail when using protocol 2.0.
         </para>
--- 2303,2310 ----
         <para>
          <function>PQprepare</> creates a prepared statement for later
          execution with <function>PQexecPrepared</>.  This feature allows
!         commands to be executed repeatedly without being parsed and
!         planned each time;  see <xref linkend="SQL-PREPARE"> for details.
          <function>PQprepare</> is supported only in protocol 3.0 and later
          connections; it will fail when using protocol 2.0.
         </para>
diff --git a/doc/src/sgml/ref/prepare.sgml b/doc/src/sgml/ref/prepare.sgml
new file mode 100644
index dbce8f2..08916d3
*** a/doc/src/sgml/ref/prepare.sgml
--- b/doc/src/sgml/ref/prepare.sgml
*************** PREPARE <replaceable class="PARAMETER">n
*** 70,80 ****
    </para>
  
    <para>
!    Prepared statements have the largest performance advantage when a
!    single session is being used to execute a large number of similar
     statements. The performance difference will be particularly
!    significant if the statements are complex to plan or rewrite, for
!    example, if the query involves a join of many tables or requires
     the application of several rules. If the statement is relatively simple
     to plan and rewrite but relatively expensive to execute, the
     performance advantage of prepared statements will be less noticeable.
--- 70,80 ----
    </para>
  
    <para>
!    Prepared statements potentially have the largest performance advantage
!    when a single session is being used to execute a large number of similar
     statements. The performance difference will be particularly
!    significant if the statements are complex to plan or rewrite, e.g. 
!    if the query involves a join of many tables or requires
     the application of several rules. If the statement is relatively simple
     to plan and rewrite but relatively expensive to execute, the
     performance advantage of prepared statements will be less noticeable.
*************** PREPARE <replaceable class="PARAMETER">n
*** 123,148 ****
    </variablelist>
   </refsect1>
  
!  <refsect1>
    <title>Notes</title>
  
    <para>
!    If a prepared statement is executed enough times, the server may eventually
!    decide to save and re-use a generic plan rather than re-planning each time.
!    This will occur immediately if the prepared statement has no parameters;
!    otherwise it occurs only if the generic plan appears to be not much more
!    expensive than a plan that depends on specific parameter values.
!    Typically, a generic plan will be selected only if the query's performance
!    is estimated to be fairly insensitive to the specific parameter values
!    supplied.
    </para>
  
    <para>
     To examine the query plan <productname>PostgreSQL</productname> is using
!    for a prepared statement, use <xref linkend="sql-explain">.
     If a generic plan is in use, it will contain parameter symbols
     <literal>$<replaceable>n</></literal>, while a custom plan will have the
!    current actual parameter values substituted into it.
    </para>
  
    <para>
--- 123,165 ----
    </variablelist>
   </refsect1>
  
!  <refsect1 id="SQL-PREPARE-notes">
    <title>Notes</title>
  
    <para>
!    Prepared statements can optionally use generic plans rather than
!    re-planning with each set of supplied <command>EXECUTE</command> values.
!    This occurs immediately for prepared statements with no parameters;
!    otherwise it occurs only after five or more executions produce plans
!    whose estimated cost average (including planning overhead) is more
!    expensive than the generic plan cost estimate.  Once a generic plan is
!    chosen, it is used for the remaining lifetime of the prepared statement.
!    Using <command>EXECUTE</command> values which are rare can generate
!    custom plans that are so much cheaper than the generic plan, even after
!    adding planning overhead, that the generic plan might never be used.
!   </para>
! 
!   <para>
!    A generic plan assumes that each value supplied to
!    <command>EXECUTE</command> is one of the column's distinct values
!    and that column values are uniformly distributed.  For example,
!    if statistics record three distinct column values, a generic plan
!    assumes a column equality comparison will match 33% of processed rows.
!    Column statistics also allow generic plans to accurately compute the
!    selectivity of unique columns.  Comparisons on non-uniformly-distributed
!    columns and specification of non-existent values affects the average
!    plan cost, and hence if and when a generic plan is chosen.
    </para>
  
    <para>
     To examine the query plan <productname>PostgreSQL</productname> is using
!    for a prepared statement, use <xref linkend="sql-explain">, e.g.
!    <command>EXPLAIN EXECUTE</>.
     If a generic plan is in use, it will contain parameter symbols
     <literal>$<replaceable>n</></literal>, while a custom plan will have the
!    supplied parameter values substituted into it.
!    The row estimates in the generic plan reflect the selectivity
!    computed for the parameters.
    </para>
  
    <para>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Prepared statements and generic plans

Reply via email to