On Tue, Jun 7, 2016 at 06:52:15AM +0000, Albe Laurenz wrote: > Bruce Momjian wrote: > >> ! distinct column values, a generic plan assumes a column equality > >> ! comparison will match 33% of processed rows. Column statistics > >> > >> ... assumes *that* a column equality comparison will match 33% of *the* > >> processed rows. > > > > Uh, that seems overly wordy. I think the rule is that if the sentence > > makes sense without the words, you should not use them, but it is > > clearly a judgement call in this case. Do you agree? > > My gut feeling is that at least the "the" should be retained, but mine > are the guts of a German speaker. > It is clearly a judgement call, so follow your instincts.
I think "that/the" would make sense if this sentence was referencing a specific result. The sentence is referencing a hypothetical, so I don't think "that/the" is needed. > > One more thing --- there was talk of moving some of this into chapter > > 66, but as someone already mentioned, there are no subsections there > > because it is a dedicated topic: > > > > 66. How the Planner Uses Statistics. > > > > I am not inclined to add a prepare-only section to that chapter. On the > > other hand, the issues described apply to PREPARE and to protocol-level > > prepare, so having it in PREPARE also seems illogical. However, I am > > inclined to leave it in PREPARE until we are ready to move all of this > > to chapter 66. > > I think it would be ok to leave it where it is in your patch; while the > paragraph goes into technical detail, it is still alright in the general > documentation (but only just). I researched moving some of this text into chapter 66, but found that only some of it related to the optimizer. I also realized that the text applies to the libpq/wire protocol prepare cases too, so rather than bounce readers to the PREPARE manual page, and then to chapter 66, I just kept it all in PREPARE, with a reference from the wire protocol prepare section. Also, is it possible to do an EXPLAIN prepare() with the libpq/wire protocol? I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE. However, I don't see any way to inject EXPLAIN into the libpq/wire prepare case. Can you specify prepare(EXPLAIN SELECT)? (PREPARE EXPLAIN SELECT throws a syntax error.) Looking at how the code behaves, it seems custom plans that are _more_ expensive (plus planning cost) than the generic plan switch to the generic plan after five executions, as now documented. Custom plans that are significantly _cheaper_ than the generic plan _never_ use the generic plan. Here is an example --- first load this SQL: DROP TABLE IF EXISTS test; CREATE TABLE test (c1 INT, c2 INT); INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(1, 10000) AS a(c1); INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(10001, 15000) AS a(c1); INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(15001, 20000) AS a(c1); -- add non-uniformly-distributed values to 'c2' INSERT INTO test SELECT 20001, 3; INSERT INTO test SELECT 20002, 4; CREATE INDEX i_test_c1 ON test (c1); CREATE INDEX i_test_c2 ON test (c2); ANALYZE test; PREPARE prep_c1 AS SELECT * FROM test WHERE c1 = $1; PREPARE prep_c2 AS SELECT * FROM test WHERE c2 = $1; prep_c1 references 'c1', which is a unique column. Any value used in the EXECUTE, e.g. EXPLAIN EXECUTE prep_c1(1), existent or non-existent, generates an index scan, and after five executions a generic index scan is used. For prep_c2, if you use the 50% common value '1', the first five executions use a sequential scan, then the sixth is a generic Bitmap Heap Scan. For the 25% value of '0' or '2', the first five runs generate a Bitmap Heap Scan, and a generic Bitmap Heap Scan on the sixth and after. For a prep_c2 value of 3 or any non-existent value, an Index Scan is used, and a generic plan is never chosen, because the Index Scan is significantly cheaper than the generic plan. Updated patch attached. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml new file mode 100644 index 3829a14..6285dd0 *** a/doc/src/sgml/libpq.sgml --- b/doc/src/sgml/libpq.sgml *************** PGresult *PQprepare(PGconn *conn, *** 2303,2310 **** <para> <function>PQprepare</> creates a prepared statement for later execution with <function>PQexecPrepared</>. This feature allows ! commands that will be used repeatedly to be parsed and planned just ! once, rather than each time they are executed. <function>PQprepare</> is supported only in protocol 3.0 and later connections; it will fail when using protocol 2.0. </para> --- 2303,2310 ---- <para> <function>PQprepare</> creates a prepared statement for later execution with <function>PQexecPrepared</>. This feature allows ! commands to be executed repeatedly without being parsed and ! planned each time; see <xref linkend="SQL-PREPARE"> for details. <function>PQprepare</> is supported only in protocol 3.0 and later connections; it will fail when using protocol 2.0. </para> diff --git a/doc/src/sgml/ref/prepare.sgml b/doc/src/sgml/ref/prepare.sgml new file mode 100644 index dbce8f2..08916d3 *** a/doc/src/sgml/ref/prepare.sgml --- b/doc/src/sgml/ref/prepare.sgml *************** PREPARE <replaceable class="PARAMETER">n *** 70,80 **** </para> <para> ! Prepared statements have the largest performance advantage when a ! single session is being used to execute a large number of similar statements. The performance difference will be particularly ! significant if the statements are complex to plan or rewrite, for ! example, if the query involves a join of many tables or requires the application of several rules. If the statement is relatively simple to plan and rewrite but relatively expensive to execute, the performance advantage of prepared statements will be less noticeable. --- 70,80 ---- </para> <para> ! Prepared statements potentially have the largest performance advantage ! when a single session is being used to execute a large number of similar statements. The performance difference will be particularly ! significant if the statements are complex to plan or rewrite, e.g. ! if the query involves a join of many tables or requires the application of several rules. If the statement is relatively simple to plan and rewrite but relatively expensive to execute, the performance advantage of prepared statements will be less noticeable. *************** PREPARE <replaceable class="PARAMETER">n *** 123,148 **** </variablelist> </refsect1> ! <refsect1> <title>Notes</title> <para> ! If a prepared statement is executed enough times, the server may eventually ! decide to save and re-use a generic plan rather than re-planning each time. ! This will occur immediately if the prepared statement has no parameters; ! otherwise it occurs only if the generic plan appears to be not much more ! expensive than a plan that depends on specific parameter values. ! Typically, a generic plan will be selected only if the query's performance ! is estimated to be fairly insensitive to the specific parameter values ! supplied. </para> <para> To examine the query plan <productname>PostgreSQL</productname> is using ! for a prepared statement, use <xref linkend="sql-explain">. If a generic plan is in use, it will contain parameter symbols <literal>$<replaceable>n</></literal>, while a custom plan will have the ! current actual parameter values substituted into it. </para> <para> --- 123,165 ---- </variablelist> </refsect1> ! <refsect1 id="SQL-PREPARE-notes"> <title>Notes</title> <para> ! Prepared statements can optionally use generic plans rather than ! re-planning with each set of supplied <command>EXECUTE</command> values. ! This occurs immediately for prepared statements with no parameters; ! otherwise it occurs only after five or more executions produce plans ! whose estimated cost average (including planning overhead) is more ! expensive than the generic plan cost estimate. Once a generic plan is ! chosen, it is used for the remaining lifetime of the prepared statement. ! Using <command>EXECUTE</command> values which are rare can generate ! custom plans that are so much cheaper than the generic plan, even after ! adding planning overhead, that the generic plan might never be used. ! </para> ! ! <para> ! A generic plan assumes that each value supplied to ! <command>EXECUTE</command> is one of the column's distinct values ! and that column values are uniformly distributed. For example, ! if statistics record three distinct column values, a generic plan ! assumes a column equality comparison will match 33% of processed rows. ! Column statistics also allow generic plans to accurately compute the ! selectivity of unique columns. Comparisons on non-uniformly-distributed ! columns and specification of non-existent values affects the average ! plan cost, and hence if and when a generic plan is chosen. </para> <para> To examine the query plan <productname>PostgreSQL</productname> is using ! for a prepared statement, use <xref linkend="sql-explain">, e.g. ! <command>EXPLAIN EXECUTE</>. If a generic plan is in use, it will contain parameter symbols <literal>$<replaceable>n</></literal>, while a custom plan will have the ! supplied parameter values substituted into it. ! The row estimates in the generic plan reflect the selectivity ! computed for the parameters. </para> <para>
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers