date:20200317

Re: COPY performance vs insert

2020-03-17 Thread Bruce Momjian

On Wed, Feb 12, 2020 at 09:25:05AM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/12/populate.html
> Description:
> 
> I think this could be clearer. In particular, performance of COPY vs INSERT
> from a query (INSERT INTO ... SELECT FROM ...) or INSERT with a list of rows
> (INSERT INTO ... VALUES (), (), (), ...)
> Is there any performance benefit with COPY in those cases?

COPY input/outputs rows in bulk, so it is still faster to do COPY than
INSERT INTO ... SELECT FROM or INSERT INTO ... VALUES.  You can run
tests to prove it.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: pg_buffercache query example results misleading, grouping by just relname, needs schema_name

2020-03-17 Thread Bruce Momjian

On Wed, Feb 12, 2020 at 11:55:51PM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/12/pgbuffercache.html
> Description:
> 
> The pg_buffercache query example results are misleading. The "group by" uses
> just by relname. It needs to include pg_namespace.nspname, without it, if
> the same object exists in multiple schemas, the buffer count is summed for
> those multiple distinct objects.  
> In: https://www.postgresql.org/docs/12/pgbuffercache.html
> Alternative SQL (the count is now correct for tables in multiple schemas):
> SELECT ts.nspname AS schema_name,c.relname, count(*) AS buffers
>  FROM pg_buffercache b INNER JOIN pg_class c
>  ON b.relfilenode = pg_relation_filenode(c.oid) AND
> b.reldatabase IN (0, (SELECT oid FROM pg_database
>   WHERE datname = current_database()))
>  JOIN pg_namespace ts ON ts.oid = c.relnamespace
>  GROUP BY ts.nspname,c.relname
>  ORDER BY buffers DESC
>  LIMIT 10;
> 
> Example Results:
> Current Query returns 1 row with buffer count summed for 3 tables:
> relname   buffers
> tab1  72401
> 
> Modified Query:
> schema_name   relname buffers
> schema1   tab11883
> schema2   tab169961
> schema3   tab1557

Very good point!  Patch attached.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index b5233697c3..2479181c5f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -148,27 +148,28 @@
   Sample Output
 
 
-regression=# SELECT c.relname, count(*) AS buffers
+regression=# SELECT n.nspname, c.relname, count(*) AS buffers
  FROM pg_buffercache b INNER JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid) AND
 b.reldatabase IN (0, (SELECT oid FROM pg_database
   WHERE datname = current_database()))
- GROUP BY c.relname
- ORDER BY 2 DESC
+ JOIN pg_namespace n ON n.oid = c.relnamespace
+ GROUP BY n.nspname, c.relname
+ ORDER BY 3 DESC
  LIMIT 10;
 
- relname | buffers
--+-
- tenk2   | 345
- tenk1   | 141
- pg_proc |  46
- pg_class|  45
- pg_attribute|  43
- pg_class_relname_nsp_index  |  30
- pg_proc_proname_args_nsp_index  |  28
- pg_attribute_relid_attnam_index |  26
- pg_depend   |  22
- pg_depend_reference_index   |  20
+  nspname   |relname | buffers
+++-
+ public | delete_test_table  | 593
+ public | delete_test_table_pkey | 494
+ pg_catalog | pg_attribute   | 472
+ public | quad_poly_tbl  | 353
+ public | tenk2  | 349
+ public | tenk1  | 349
+ public | gin_test_idx   | 306
+ pg_catalog | pg_largeobject | 206
+ public | gin_test_tbl   | 188
+ public | spgist_text_tbl| 182
 (10 rows)

Re: Clarification on interactions between query parameters and partial indexes

2020-03-17 Thread Bruce Momjian

On Fri, Feb 14, 2020 at 11:42:34AM -0500, Tom Lane wrote:
> PG Doc comments form  writes:
> > In section "11.8 Partial Indexes" it states the following:
> 
> > "Matching takes place at query planning time, not at run time. As a result,
> > parameterized query clauses do not work with a partial index. For example a
> > prepared query with a parameter might specify “x < ?” which will never imply
> > “x < 2” for all possible values of the parameter."
> 
> > We decided to run some tests to verify this statement, as we use both
> > partial indexes and parameterized queries on some very large tables (100mil+
> > rows). However, we are not able to replicate the stated behavior. It seems
> > like the query planner is able to make use of the partial index for both
> > parameterized and manually interpolated values.
> 
> > Have we misunderstood what the documentation is trying to say or has this
> > limitation been fixed?
> 
> The statement is true as far as it goes: "x < $1" will never be considered
> to imply "x < 2".  However, there's a lot of context that's going unstated
> there.  In some code paths, higher-level code such as the plan cache may
> try substituting the concrete value of a parameter as a constant, to see
> if it can get a better (but less general) plan that way.  I think that's
> probably what happened in your experiment, but you didn't provide enough
> details to be sure.

Also. the PREPARE docs might explain some of your test results:

https://www.postgresql.org/docs/12/sql-prepare.html

A prepared statement can be executed with either a generic plan or
a custom plan. A generic plan is the same across all executions,
while a custom plan is generated for a specific execution using
the parameter values given in that call. Use of a generic plan
avoids planning overhead, but in some situations a custom plan
will be much more efficient to execute because the planner can
make use of knowledge of the parameter values. (Of course, if
the prepared statement has no parameters, then this is moot and
a generic plan is always used.)

By default (that is, when plan_cache_mode is set to auto), the
server will automatically choose whether to use a generic or
custom plan for a prepared statement that has parameters. The
current rule for this is that the first five executions are done
with custom plans and the average estimated cost of those plans
is calculated. Then a generic plan is created and its estimated
cost is compared to the average custom-plan cost. Subsequent
executions use the generic plan if its cost is not so much higher
than the average custom-plan cost as to make repeated replanning
seem preferable.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: REGEXP_REPLACE : How to use a column value in the regex

2020-03-17 Thread Bruce Momjian

On Mon, Feb 17, 2020 at 06:37:11AM +, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/12/functions-matching.html
> Description:
> 
> The documentation doesn't talk about any regex syntax for using a column
> from the db in the pattern. There is no mention if this is NOT supported.
> All the examples are using static string and none using a DB column
> values.
> e.g. I have a field *portfolio* which had data like CloudPlatform (Public),
> CloudPlatform (Private), MobilePlatfom (Public),.. etc
> These values are coming from another table *platformtypes" which has these
> values Public, Private, Hybrid...etc.
> I want to replace these types from the portfolio names, i was able to do it
> using string 
> *SELECT
> Portfolios.PortfolioName,REGEXP_REPLACE(Portfolios.PortfolioName,'(\(Public\))',''),
> but I would like to use something  like 
> *REGEXP_REPLACE(Portfolios.PortfolioName,'(\(platformtypes.Name\)),'')*

Sure, you can do it ('a' -> 'X'):

CREATE TABLE TEST (x text);
INSERT INTO test VALUES ('a');
SELECT REGEXP_REPLACE(relname, x, 'X') FROM pg_class, test;
regexp_replace
---
 pg_toXst_36526
 pg_toXst_36526_index
 test
 pg_stXtistic
 pg_type
 pg_toXst_2600

You can also combine column names with literal text ('ta' -> 'YY'):

SELECT REGEXP_REPLACE(relname, 't' || x, 'YY') FROM pg_class, test;
regexp_replace
---
 pg_toast_36526
 pg_toast_36526_index
 test
 pg_sYYtistic

I didn't think it was worth documenting this.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: COPY performance vs insert

Re: pg_buffercache query example results misleading, grouping by just relname, needs schema_name

Re: Clarification on interactions between query parameters and partial indexes

Re: REGEXP_REPLACE : How to use a column value in the regex

4 matches

Site Navigation

Mail list logo

Footer information