Re: SV: Log files polluted with permission denied error messages after every 10 seconds

2021-03-20 Thread Andrus

Hi!

Today I got 2 errors in the patched version:

2021-03-20 20:31:27 EET stats collector LOG:  could not rename 
temporary statistics file "pg_stat_tmp/global.tmp" to 
"pg_stat_tmp/global.stat": Permission denied


2021-03-20 20:51:25 EET stats collector LOG:  could not rename 
temporary statistics file "pg_stat_tmp/global.tmp" to 
"pg_stat_tmp/global.stat": Permission denied


Andrus.



Re: More than one UNIQUE key when matching items..

2021-03-20 Thread Laurenz Albe
On Sat, 2021-03-20 at 15:51 +, Ron Clarke wrote:
> In SQL Server this is easy, we insert the records into a temporary table with 
> separate Unique
>  indexes on the id for set a and the ids for set b and put the 
> 'ignore_dup_key' on which tells
>  SQL Server to ignore duplicate rows and carry on.
> 
> The nearest to this at first appears to be the ON CONFLICT ON CONSTRAINT 
> IGNORE in Postgres.
>  But this only works with a single constraint, at a time i.e. we can't set 
> the ON CONFLICT ON
>  CONSTRAINT IGNORE to work with multiple UNIQUE indexes.

You can try this:

INSERT ... ON CONFLICT DO NOTHING;
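
With no conflict target given, DO NOTHING applies to whichever unique
constraint the row happens to violate. A minimal, untested sketch for the
two-index case (column names borrowed from your example) could be:

CREATE TEMPORARY TABLE links (
    mid serial,
    ia  int UNIQUE,   -- one id from set A
    ib  int UNIQUE    -- one id from set B
);

INSERT INTO links (ia, ib)
SELECT ia, ib FROM x    -- "x" being the candidate pairs from your CTE
ON CONFLICT DO NOTHING;

Whether the surviving rows are exactly the pairing you want can depend on
the order in which the candidate rows arrive, so test it against real data.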

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Unkillable processes creating millions of tiny temp files

2021-03-20 Thread Jan Wieck

On 3/5/21 6:57 PM, Tom Lane wrote:


Not sure how fast that is either.  If you need to do it again, you could
try manually rm'ing everything under the pgsql_tmp directory before
letting the postmaster start.
That is actually a strategy that works rather well: mv(1) the tmp
directory to something date(1)-based, then kick off a recursive rm(1) -rf
on everything named pgsql_tmp_*. That way nothing is missed even if the
whole server is restarted while the procedure is under way. It can result
in multiple rm(1) processes trampling over each other, but that has no
real ill side effect; one is merely trying to unlink a file another has
already removed.

Under normal circumstances the rm(1) will do its cleanup while the
postmaster is already up again and has possibly created a fresh pgsql_tmp.



Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services




Re: More than one UNIQUE key when matching items..

2021-03-20 Thread Allan Kamau
On Sat, Mar 20, 2021 at 6:52 PM Ron Clarke  wrote:

> /*
> I'm trying to port a system from SQL Server, and at the same time better
> learn PostgreSQL.
>
> I've come across a problem that is easily solved in that world, but I am
> struggling to find an approach in postgres that works.
>
> We have 2 sets of events A and B (sets), they have a shared number
> (ncode), both have unique Id's
>
> We want to link items of set A to those of set B, but each item of each
> set can only be linked once. That is, we do not want to link all set 'A'
> items to all set 'B' items with the same code.
>
> In SQL Server this is easy, we insert the records into a temporary table
> with separate Unique indexes on the id for set a and the ids for set b and
> put the 'ignore_dup_key' on which tells SQL Server to ignore duplicate rows
> and carry on.
>
> The nearest to this at first appears to be the ON CONFLICT ON CONSTRAINT
> IGNORE in Postgres. But this only works with a single constraint, at a time
> i.e. we can't set the ON CONFLICT ON CONSTRAINT IGNORE to work with
> multiple UNIQUE indexes.
>
> To show the problem:
>
> I'm using PostgreSQL version 11.
>
> */
> -- source data
> WITH sd AS (
> SELECT iid, s, ncode FROM (
> VALUES (1, 'A', 10),
>(2, 'A', 30),
>(3, 'A', 10),
>(4, 'B', 10),
>(5, 'B', 20),
>(6, 'B', 10)
> )
> AS tx (iid, s, ncode))
> SELECT iid, s, ncode FROM sd
>
> /* The target result would be :
>
>   id:1, A, 10 this matches id:4, B, 10
>   id:3, A, 10 this matches id:6, B, 10
> */
>
> -- Example to get the *wrong* answer, i.e. both sets of links
>
> WITH
> sd (i, s, n ) AS (
> SELECT iid, s, ncode FROM (
> VALUES (1, 'A', 10),
>   (2, 'A', 30),
>   (3, 'A', 10),
>   (4, 'B', 10),
>   (5, 'B', 20),
>   (6, 'B', 10)
> )
> AS tx (iid, s, ncode))
> ,
> x AS ( SELECT
> ax.i as ia,
> ax.s as sa,
> ax.n as na,
> bx.i as ib,
> bx.s as sb,
> bx.n as nb,
> ROW_NUMBER () OVER (PARTITION BY bx.i ORDER BY ax.i) as rx
> FROM sd AS ax
> INNER JOIN sd AS bx ON ax.n = bx.n and ax.i != bx.i and bx.s = 'B'
> WHERE ax.s = 'A'
> )
> SELECT ia,ib, na, rx FROM x
> ;
>
>
> /*   I've tried using a recursive CTE where I'm trying to exclude results
> from the result set that have already been identified, but I can't get an
> allowed syntax.
>   Doesn't seem to allow joins to the recursive term to exclude results.
> */
>
>
> /*   I've tried Unique and Exclusion constraints on temporary table, e.g */
> -- similar Example to get the wrong answer, i.e. both sets of links
>
> DROP TABLE IF EXISTS links ;
>
> CREATE TEMPORARY TABLE links
> (mid serial ,
> ia int ,
> -- ia int UNIQUE,
> ib int ,
> -- ib int UNIQUE,
> EXCLUDE USING gist (ia WITH =, ib WITH =)
>
>   ) ;
>
> WITH
> sd (i, s, n ) AS (
> SELECT iid, side, ncode FROM (
> VALUES (1, 'A', 10),
>   (2, 'A', 30),
>   (3, 'A', 10),
>   (4, 'B', 10),
>   (5, 'B', 20),
>   (6, 'B', 10)
> )
> AS tx (iid, side, ncode))
> ,
> x AS (
> SELECT
> ax.i as ia,
> ax.s as sa,
> ax.n as na,
> bx.i as ib,
> bx.s as sb,
> bx.n as nb,
> ROW_NUMBER () OVER (
> PARTITION BY bx.i
> ORDER BY
> ax.i
> ) as rx
> FROM sd AS ax
> INNER JOIN sd AS bx ON ax.n = bx.n and ax.i != bx.i and bx.s = 'B'
> WHERE ax.s = 'A'
> )
> -- SELECT * FROM x
> INSERT INTO links(ia,ib)
> SELECT ia, ib FROM x
> ON CONFLICT ON CONSTRAINT links_ia_ib_excl DO NOTHING;
>
>   --
> SELECT * from links;
>
> /*   I've also tried and failed to use array(ia,ib) within or as computed
> column of an Exclusion constraint of && s on temporary table, e.g
>   but can't find any syntax that doesn't result in an error
>   */
>
>
> DROP TABLE IF EXISTS links ;
>
> CREATE TEMPORARY TABLE links
> (mid serial ,
> ia int ,
> -- ia int UNIQUE,
> ib int ,
> -- ib int UNIQUE,
> ix int[],
>   EXCLUDE USING gist (ix WITH &&)
>   ) ;
>
> -- This gives me:
> -- ERROR:  data type integer[] has no default operator class for access
> method "gist"
>
> -- I have the btree_gist extension installed
>
>
> /*
>
> I appreciate I could create a cursor from a list of proposed links and
> step through each one, checking if the id value has been "used up"
> but I am trying to keep this as a set based operation to give me the
> results in one statement.
>
> There are some similar questions w.r.t. duplicate detection, but these
> again seem to be solved by evaluating each proposed record individually.
> If that's just what I have to do then so be it. There is probably a
> simple 'PostgreSQL'-friendly approach I'm yet to discover, having spent
> too long in the Sybase and SQL Server worlds.
>
> Thanks for looking at this
>
> */
>
>
>



Hi Ron,

How about the code below? It may require testing with more data.






WITH _sd AS (
SELECT iid, s, ncode FROM (
VALUES (1, 'A', 10),
   (2, 'A', 30),
   (3, 'A', 10),
   (4, 'B', 10),
   (5, 'B', 20),
   (6, 'B', 10)
)
AS tx (iid, s, 

Re: Programmatic Trigger Create

2021-03-20 Thread Tom Lane
Adrian Klaver  writes:
> In what program is:
> EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd
> being done?

Presumably ECPG, which if memory serves defaults to not-auto-commit.
Maybe Niko is failing to commit the transaction?

If all else fails, turning on log_statement on the server and
examining the log might help debug what the program is doing wrong.
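
If the missing commit is indeed the problem, a minimal sketch (an assumption,
not something verified here) is simply to commit after the dynamic statement:

EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd;  /* the CREATE FUNCTION text */
EXEC SQL COMMIT;                         /* otherwise ECPG's default
                                            non-autocommit mode can leave
                                            the CREATE uncommitted */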

regards, tom lane




Re: Programmatic Trigger Create

2021-03-20 Thread Adrian Klaver

On 3/20/21 10:03 AM, Niko Ware wrote:
I would like to programmatically create audit trail functions which are 
called by triggers for custom user tables. This will be used for audit 
trail generation in our application. The user is able to define a custom 
table. Therefore, I need a custom audit trail function. The audit trail 
function outputs the row changes in human-readable form (e.g., "process 
name changed from "my process" to "your process"  by user  on host xxx").


The basic steps are as follows:

1. User defines the table
2. Create table via  EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd
3. Programmatically construct the audit trail function for 
insert/update/delete.

4. Create the function via EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd
5. Create the trigger via EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd

Here is an example of the function source which is passed to "EXEC SQL 
EXECUTE IMMEDIATE" via char*:


In what program is:

EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd

being done?







CREATE OR REPLACE FUNCTION name_changes_log()
  RETURNS TRIGGER AS
$BODY$
BEGIN
    IF NEW.name <> OLD.name THEN
        INSERT INTO kids_audit(kids_id, kids_name, modified_on)
        VALUES(OLD.id, OLD.name, now());
    END IF;

    RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;


The "execute immediate" succeeds, but the function is not created. The 
application is connected to the database as a user which has permission 
to create functions.


I could output the function text to a file and then use "psql" to 
process the "create trigger", but this seems a bit hacky.


Thanks in advance,
Thomas






--
Adrian Klaver
adrian.kla...@aklaver.com




Programmatic Trigger Create

2021-03-20 Thread Niko Ware
I would like to programmatically create audit trail functions which are
called by triggers for custom user tables. This will be used for audit
trail generation in our application. The user is able to define a custom
table. Therefore, I need a custom audit trail function. The audit trail
function outputs the row changes in human-readable form (e.g., "process
name changed from "my process" to "your process"  by user  on host xxx").

The basic steps are as follows:

1. User defines the table
2. Create table via  EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd
3. Programmatically construct the audit trail function for
insert/update/delete.
4. Create the function via EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd
5. Create the trigger via EXEC SQL EXECUTE IMMEDIATE :my_sql_cmd

Here is an example of the function source which is passed to "EXEC SQL EXECUTE
IMMEDIATE" via char*:


CREATE OR REPLACE FUNCTION name_changes_log()
  RETURNS TRIGGER AS
$BODY$
BEGIN
   IF NEW.name <> OLD.name THEN
   INSERT INTO kids_audit(kids_id,kids_name,modified_on)
   VALUES(OLD.id,OLD.name,now());
   END IF;

   RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;

The "execute immediate" succeeds, but the function is not created. The
application is connected to the database as a user which has permission to
create functions.

I could output the function text to a file and then use "psql" to process
the "create trigger", but this seems a bit hacky.

Thanks in advance,
Thomas


More than one UNIQUE key when matching items..

2021-03-20 Thread Ron Clarke
/*
I'm trying to port a system from SQL Server, and at the same time better
learn PostgreSQL.

I've come across a problem that is easily solved in that world, but I am
struggling to find an approach in postgres that works.

We have 2 sets of events A and B (sets), they have a shared number (ncode),
both have unique Id's

We want to link items of set A to those of set B, but each item of each set
can only be linked once. That is, we do not want to link all set 'A'
items to all set 'B' items with the same code.

In SQL Server this is easy, we insert the records into a temporary table
with separate Unique indexes on the id for set a and the ids for set b and
put the 'ignore_dup_key' on which tells SQL Server to ignore duplicate rows
and carry on.

The nearest to this at first appears to be the ON CONFLICT ON CONSTRAINT
IGNORE in Postgres. But this only works with a single constraint, at a time
i.e. we can't set the ON CONFLICT ON CONSTRAINT IGNORE to work with
multiple UNIQUE indexes.

To show the problem:

I'm using PostgreSQL version 11.

*/
-- source data
WITH sd AS (
SELECT iid, s, ncode FROM (
VALUES (1, 'A', 10),
   (2, 'A', 30),
   (3, 'A', 10),
   (4, 'B', 10),
   (5, 'B', 20),
   (6, 'B', 10)
)
AS tx (iid, s, ncode))
SELECT iid, s, ncode FROM sd

/* The target result would be :

  id:1, A, 10 this matches id:4, B, 10
  id:3, A, 10 this matches id:6, B, 10
*/

-- Example to get the *wrong* answer, i.e. both sets of links

WITH
sd (i, s, n ) AS (
SELECT iid, s, ncode FROM (
VALUES (1, 'A', 10),
  (2, 'A', 30),
  (3, 'A', 10),
  (4, 'B', 10),
  (5, 'B', 20),
  (6, 'B', 10)
)
AS tx (iid, s, ncode))
,
x AS ( SELECT
ax.i as ia,
ax.s as sa,
ax.n as na,
bx.i as ib,
bx.s as sb,
bx.n as nb,
ROW_NUMBER () OVER (PARTITION BY bx.i ORDER BY ax.i) as rx
FROM sd AS ax
INNER JOIN sd AS bx ON ax.n = bx.n and ax.i != bx.i and bx.s = 'B'
WHERE ax.s = 'A'
)
SELECT ia,ib, na, rx FROM x
;


/*   I've tried using a recursive CTE where I'm trying to exclude results
from the result set that have already been identified, but I can't get an
allowed syntax.
  Doesn't seem to allow joins to the recursive term to exclude results.
*/


/*   I've tried Unique and Exclusion constraints on temporary table, e.g */
-- similar Example to get the wrong answer, i.e. both sets of links

DROP TABLE IF EXISTS links ;

CREATE TEMPORARY TABLE links
(mid serial ,
ia int ,
-- ia int UNIQUE,
ib int ,
-- ib int UNIQUE,
EXCLUDE USING gist (ia WITH =, ib WITH =)

  ) ;

WITH
sd (i, s, n ) AS (
SELECT iid, side, ncode FROM (
VALUES (1, 'A', 10),
  (2, 'A', 30),
  (3, 'A', 10),
  (4, 'B', 10),
  (5, 'B', 20),
  (6, 'B', 10)
)
AS tx (iid, side, ncode))
,
x AS (
SELECT
ax.i as ia,
ax.s as sa,
ax.n as na,
bx.i as ib,
bx.s as sb,
bx.n as nb,
ROW_NUMBER () OVER (
PARTITION BY bx.i
ORDER BY
ax.i
) as rx
FROM sd AS ax
INNER JOIN sd AS bx ON ax.n = bx.n and ax.i != bx.i and bx.s = 'B'
WHERE ax.s = 'A'
)
-- SELECT * FROM x
INSERT INTO links(ia,ib)
SELECT ia, ib FROM x
ON CONFLICT ON CONSTRAINT links_ia_ib_excl DO NOTHING;

  --
SELECT * from links;

/*   I've also tried and failed to use array(ia,ib) within or as computed
column of an Exclusion constraint of && s on temporary table, e.g
  but can't find any syntax that doesn't result in an error
  */


DROP TABLE IF EXISTS links ;

CREATE TEMPORARY TABLE links
(mid serial ,
ia int ,
-- ia int UNIQUE,
ib int ,
-- ib int UNIQUE,
ix int[],
  EXCLUDE USING gist (ix WITH &&)
  ) ;

-- This gives me:
-- ERROR:  data type integer[] has no default operator class for access
method "gist"

-- I have the btree_gist extension installed


/*

I appreciate I could create a cursor from a list of proposed links and step
through each one, checking if the id value has been "used up"
but I am trying to keep this as a set based operation to give me the
results in one statement.

There are some similar questions w.r.t. duplicate detection, but these
again seem to be solved by evaluating each proposed record individually.
If that's just what I have to do then so be it. There is probably a
simple 'PostgreSQL'-friendly approach I'm yet to discover, having spent
too long in the Sybase and SQL Server worlds.

Thanks for looking at this

*/


Re: questions about wraparound

2021-03-20 Thread Luca Ferrari
On Thu, Mar 18, 2021 at 12:14 PM Luca Ferrari  wrote:
> testdb=> select datname, datfrozenxid, age(datfrozenxid) from pg_database;
>   datname  | datfrozenxid |   age
> ---+--+--
>  postgres  |   3318163526 | 5002
>  backupdb  |   3318163526 | 5002
>  template1 |   3368163526 |2
>  template0 |   3368163526 |2
>  testdb|   3318163526 | 5002
>  pgbench   |   3318163526 | 5002
>

I did it again: I provoked another wraparound and entered the single
user mode to vacuum. This is the situation before:

backend> select age(datfrozenxid), datname from pg_database
 1: age (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname (typeid = 19, len = 64, typmod = -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "postgres"(typeid = 19, len = 64, typmod
= -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "backupdb"(typeid = 19, len = 64, typmod
= -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "template1"   (typeid = 19, len = 64, typmod
= -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "template0"   (typeid = 19, len = 64, typmod
= -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "testdb"  (typeid = 19, len = 64, typmod = -1, byval = f)

 1: age = "2146483647"  (typeid = 23, len = 4, typmod = -1, byval = t)
 2: datname = "pgbench" (typeid = 19, len = 64, typmod = -1, byval = f)

backend> vacuum
2021-03-20 11:54:44.878 CET [87179] WARNING:  database "backupdb" must
be vacuumed within 100 transactions
2021-03-20 11:54:44.878 CET [87179] HINT:  To avoid a database
shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared
transactions, or drop stale replication slots.


and after the vacuum I was able to start again in normal mode, without
having to vacuum any other database as suggested by the HINT.
However, the template databases have a zero age, unlike the other databases:

% psql -U luca -c "SELECT age(datfrozenxid), datname from pg_database;" testdb
   age|  datname
--+---
 5000 | postgres
 5000 | backupdb
0 | template1
0 | template0
 5000 | testdb
 5000 | pgbench

I suspect freezing is done "totally" for a datistemplate
database, even if I don't understand why.
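
A quick check (not something run in this session) would be to compare the
zero-age entries against the template flags directly:

SELECT datname, datistemplate, datallowconn, age(datfrozenxid)
FROM pg_database
ORDER BY datname;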




Re: SELECT is faster on SQL Server

2021-03-20 Thread Frank Millman



On 2021-03-19 7:11 PM, Thomas Kellerer wrote:

Frank Millman schrieb am 19.03.2021 um 10:16:


cl_bal selects WHERE tran_date <= '2018-03-31'.

op_bal selects WHERE tran_date < '2018-03-01'.

The second one could be written as WHERE tran_date <= '2018-02-28', 
but I don't think that would make any difference.


I knew I overlooked something ;)

But as one is a true subset of the other, I think you can merge that 
into a single SELECT statement:


    select '2018-03-01' AS op_date,
           '2018-03-31' AS cl_date,
           a.source_code_id,
           sum(a.tran_tot) AS cl_tot,
           sum(a.tran_tot) filter (where tran_date < '2018-03-01') AS op_tot
    FROM (
        SELECT distinct on (location_row_id, function_row_id, source_code_id)
               source_code_id, tran_tot, tran_date
        FROM prop.ar_totals
        WHERE deleted_id = 0
          AND tran_date <= '2018-03-31'
          AND ledger_row_id = 1
        ORDER BY location_row_id, function_row_id, source_code_id, tran_date DESC
    ) AS a
    GROUP BY a.source_code_id


Thanks very much Thomas - I did not know about FILTER.

But it does not quite work. If the SELECT finds a row where the max
tran_date is <= '2018-03-31', it correctly includes it in 'cl_tot'. But
the filter returns nothing for 'op_tot', because there is no
corresponding row where tran_date < '2018-03-01'.


But I have learned something new, so thanks for that.

Frank






Re: SELECT is faster on SQL Server

2021-03-20 Thread Frank Millman



On 2021-03-19 5:32 PM, Jehan-Guillaume de Rorthais wrote:

On Fri, 19 Mar 2021 14:28:27 +0200
Frank Millman  wrote:


[...]
Execution of my main query has improved from 50ms to 33ms. SQL Server
takes 25ms, but this is much better than it was.

[...]

Here is the new EXPLAIN ANALYSE -

QUERY PLAN

   Merge Left Join  (...) (actual time=1.566..1.581 rows=5 loops=1)

1.581ms to output the very last row of this plan. This is in contradiction with
the 33ms you are referencing above.

What am I missing here? Maybe your 33ms comes from yet another set of data? Could
you share an EXPLAIN ANALYZE actually showing this 33ms total execution time?


Sorry, I should have explained.

The query I showed selects data for a single month. The 'real' query 
repeats this 12 times, each with different dates, and combines the 
results using UNION ALL. This was the timing mentioned above.


BTW, I know that I can improve this by setting up the dates in a CTE and 
using JOIN LATERAL. I am avoiding this as it is not supported by SQL 
Server or sqlite3, and I am trying to stick to one code base for all 
databases. But I will look into it further.
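
For reference, a rough sketch of the dates-CTE-plus-LATERAL idea (the month
list and the simplified aggregate are illustrative only, not my real query,
which also does the DISTINCT ON step):

WITH months (op_date, cl_date) AS (
    VALUES (DATE '2018-01-01', DATE '2018-01-31'),
           (DATE '2018-02-01', DATE '2018-02-28'),
           (DATE '2018-03-01', DATE '2018-03-31')   -- ... one row per month
)
SELECT m.op_date, m.cl_date, t.source_code_id, t.cl_tot, t.op_tot
FROM months AS m
CROSS JOIN LATERAL (
    SELECT a.source_code_id,
           sum(a.tran_tot) AS cl_tot,
           sum(a.tran_tot) FILTER (WHERE a.tran_date < m.op_date) AS op_tot
    FROM prop.ar_totals AS a
    WHERE a.deleted_id = 0
      AND a.ledger_row_id = 1
      AND a.tran_date <= m.cl_date
    GROUP BY a.source_code_id
) AS t;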


Frank






Re: SELECT is faster on SQL Server

2021-03-20 Thread Frank Millman



On 2021-03-19 4:38 PM, Tom Lane wrote:

Frank Millman  writes:

However, the bizarre thing is that I have simply restored the index to
what it was in the first place. If you look at the table definition in
my original message you can see that all the columns were included in
the index. But the query did not use it as a covering index. Now the
EXPLAIN ANALYSE clearly shows 'Index Only Scan using ar_tots_cover'. I
have no idea what changed.

VACUUM, maybe?  Even if there's a covering index, the planner is not
likely to prefer an index-only scan unless it thinks that most of the
table's pages are known all-visible.  If they're not, most of the
rows will require heap probes anyway to check row visibility, meaning
that the "index-only" scan's performance degrades to about that of a
regular indexscan.

In this example, since you're fetching such a large fraction of the
table (which the planner is accurately estimating), there's not a lot
of daylight between the estimated costs of seqscan and index-only
scan to begin with.  I'm not surprised that it'd prefer the former
if the table isn't recently vacuumed.


It is possible. I know that I *did* vacuum. But I also ran a program to 
generate a few hundred additional rows, and I cannot remember if I ran 
the vacuum before or after that.
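
One way to check (my assumption of a useful probe, not something from this
thread) is to compare relallvisible with relpages for the table, which is
roughly what the planner consults when costing an index-only scan:

SELECT relpages, relallvisible
FROM pg_class
WHERE oid = 'prop.ar_totals'::regclass;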


Frank