Re: Is postgres able to share sorts required by common partition window functions?

2020-07-07 Thread Michael Lewis
On Monday, July 6, 2020, Michael Lewis  wrote:

> Did you say you have an index on c1?
> [...]
> I don't know the data, but I assume there may be many rows with the same
> c1 value, so then you would likely benefit from getting that distinct set
> first like below as your FROM table.
>

> Re-reading the original email I see both the answer to your question and
> the data being queried.
> David J.

Thanks David. I meant it as a rhetorical question, since yes, of course
there was an index. I also didn't trust the example to be true to real data
in terms of the distribution of c1 values.


On Tue, Jul 7, 2020 at 9:01 AM Sebastien Arod 
wrote:

> Michael, David thanks for your quick replies.
>
> *@Michael*
> I initially dismissed writing this query using joins or subselects because
> the real query has about 80 columns, and I was afraid that having 80
> joins/subselects would cause issues with PostgreSQL, including the planner
> falling back to GEQO.
> I'll test it anyway with real data and see how it behaves.
>

Contrived and overly simplified examples often lead to uninformed, bad
advice. I would not attempt joins unless the number of distinct c1 values
is relatively small. It might go fine, though, and depending on your
query and the statistics on the table, setting join_collapse_limit = 1
might be prudent to constrain the planner to your desired plan and remove
any chance of the genetic optimizer getting involved.
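Not tested against your schema, but the "distinct set first" rewrite suggested
earlier, combined with pinning the join order, might look something like this
sketch (everything beyond c1/c2 and table t is illustrative):

```sql
-- Constrain join order for this session so GEQO never kicks in.
SET join_collapse_limit = 1;

-- Build the distinct set of partition keys first, then fetch the
-- per-partition value with a lateral subquery.
SELECT d.c1,
       agg.first_c2
FROM (SELECT DISTINCT c1 FROM t) AS d
JOIN LATERAL (
    SELECT c2 AS first_c2
    FROM t
    WHERE t.c1 = d.c1
    ORDER BY c2
    LIMIT 1
) AS agg ON true;
```

With 80 output columns you would need one lateral subquery per differing
sort order, which is exactly the explosion you were worried about, so this
is only worth pursuing if the distinct c1 set is small.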

Sort Method: external merge  Disk: 52912kB
Sort Method: external merge  Disk: 46168kB

What is your work_mem set to? Would it be possible to set it higher (for
this process) to avoid spilling to disk?
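For reference, raising it just for the session or transaction might look like
this (64MB is an illustrative value, sized above the ~52MB on-disk sorts shown
in your plans):

```sql
-- Session-scoped:
SET work_mem = '64MB';

-- Or scoped to a single transaction:
BEGIN;
SET LOCAL work_mem = '64MB';
-- ... run the query here ...
COMMIT;
```

Note that each sort node in the plan can use up to work_mem on its own, so
size it with the number of concurrent sorts in mind.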


Re: Basic question about structuring SQL

2020-07-07 Thread David G. Johnston
On Tue, Jul 7, 2020 at 4:41 AM Robert Inder 
wrote:

> So how should I structure my chunks of SQL so that I can have "safe"
> (all-or-nothing) blocks,
> AND use them from within one another?
>

While there are more advanced constructs that may aid here, I would suggest
just following two rules:

1. Top-level scripts handle transactions: usually just a single BEGIN/COMMIT
at the top/bottom of the script, respectively. These are the scripts you are
allowed to name on the psql command line.
2. Support scripts don't handle transactions: these are the scripts you
include into the main script with \i or \ir.
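A minimal sketch of that layout (file names from the thread, statements
purely illustrative):

```sql
-- bigger-task.sql (top-level: owns the transaction; run with
--   psql my_database -f bigger-task.sql)
BEGIN;
\ir do-something-useful.sql
\ir tidy-up.sql
COMMIT;

-- do-something-useful.sql (support script: no BEGIN/COMMIT of its own)
UPDATE widgets SET status = 'done' WHERE status = 'pending';
DELETE FROM widgets_log WHERE logged_at < now() - interval '30 days';
```

Setting ON_ERROR_STOP (psql -v ON_ERROR_STOP=1) is also worth considering,
so a failure in any included script aborts the whole run instead of
continuing to the COMMIT.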

David J.


Re: Is postgres able to share sorts required by common partition window functions?

2020-07-07 Thread Sebastien Arod
Michael, David, thanks for your quick replies.

*@Michael*
I initially dismissed writing this query using joins or subselects because
the real query has about 80 columns, and I was afraid that having 80
joins/subselects would cause issues with PostgreSQL, including the planner
falling back to GEQO.
I'll test it anyway with real data and see how it behaves.


*@David About Subsequent sorts*
From my experiments, the time of subsequent sorts is not reduced by a
previous sort but is in fact higher.
In the following example, in Query 1 the Sort node on key c1, c2 has a time
of ~1893 ms excluding child nodes.
In Query 2:

   - the Sort node on key c1 has a time of ~1558 ms excluding child nodes
   - the Sort node on key c1, c2 has a time of ~2964 ms excluding child
   nodes, which is higher than the time for the same sort key in Query 1,
   even though in Query 2 PostgreSQL could have benefited from the
   preliminary sort on c1.

I can't figure out why it would be slower.

*Query1*
explain analyze select
  first_value(c2) OVER (PARTITION BY c1 order by c2)
from t;


*Explain Plan  Query1*
WindowAgg  (cost=135042.46..155042.48 rows=101 width=47) (actual
time=1555.005..2639.659 rows=101 loops=1)
  ->  Sort  (cost=135042.46..137542.46 rows=101 width=15) (actual
time=1554.989..2070.462 rows=101 loops=1)
Sort Key: c1, c2
Sort Method: external merge  Disk: 25960kB
->  Seq Scan on t  (cost=0.00..18294.01 rows=101 width=15)
(actual time=0.013..177.037 rows=101 loops=1)
Planning Time: 0.115 ms
Execution Time: 2693.935 ms

*Query2*
explain analyze select
  first_value(c2) OVER (PARTITION BY c1),
  first_value(c2) OVER (PARTITION BY c1 order by c2)
from t;
*Explain Plan Query2*
WindowAgg  (cost=313730.43..333730.45 rows=101 width=79) (actual
time=4692.257..5702.226 rows=101 loops=1)
  ->  Sort  (cost=313730.43..316230.43 rows=101 width=47) (actual
time=4692.152..5174.928 rows=101 loops=1)
Sort Key: c1, c2
Sort Method: external merge  Disk: 32688kB
->  WindowAgg  (cost=135042.46..152542.48 rows=101 width=47)
(actual time=1188.109..2210.845 rows=101 loops=1)
  ->  Sort  (cost=135042.46..137542.46 rows=101 width=15)
(actual time=1188.099..1709.253 rows=101 loops=1)
Sort Key: c1
Sort Method: external merge  Disk: 25960kB
->  Seq Scan on t  (cost=0.00..18294.01 rows=101
width=15) (actual time=0.014..150.435 rows=101 loops=1)
Planning Time: 0.059 ms
Execution Time: 5756.591 ms



*@David About using an index*
You are correct that a simple index cannot be leveraged efficiently.
However, a covering index including all columns and starting with the
partition column c1 could be used to avoid sorting on c1, right?

But in my tests it is not used. In the following example, covering_index
could be used to avoid sorting on c1, but it isn't:
*Covering Index*
create index covering_index on t (c1, c2, c3, c4);

*Query*
explain analyze select
  c1,
  first_value(c2) OVER (PARTITION BY c1 order by c2, c4) AS c2,
  first_value(c3) OVER (PARTITION BY c1 order by coalesce(c4, '000'), c3)
AS c3
from
  t;

*Plan*
WindowAgg  (cost=417853.93..440353.95 rows=101 width=123) (actual
time=3071.980..4184.030 rows=101 loops=1)
  ->  Sort  (cost=417853.93..420353.93 rows=101 width=91) (actual
time=3071.962..3575.641 rows=101 loops=1)
Sort Key: c1, (COALESCE(c4, '000'::character varying)), c3
Sort Method: external merge  Disk: 52912kB
->  WindowAgg  (cost=193152.96..215652.98 rows=101 width=91)
(actual time=1268.373..2271.463 rows=101 loops=1)
  ->  Sort  (cost=193152.96..195652.96 rows=101 width=59)
(actual time=1268.357..1711.526 rows=101 loops=1)
Sort Key: c1, c2, c4
Sort Method: external merge  Disk: 46168kB
->  Seq Scan on t  (cost=0.00..18294.01 rows=101
width=59) (actual time=0.014..163.347 rows=101 loops=1)
Planning Time: 0.081 ms
Execution Time: 4250.367 ms

However, the index is used if the partition by + order by of one of the
window functions entirely matches the beginning of the index. For instance,
the query:
select
  c1,
  first_value(c2) OVER (PARTITION BY c1 order by c2) AS c2,
  first_value(c3) OVER (PARTITION BY c1 order by coalesce(c4, '000'), c3)
AS c3
from
  t;

uses covering_index to avoid the sort on c1, c2.

On Tue, Jul 7, 2020 at 12:48 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Monday, July 6, 2020, Sebastien Arod  wrote:
>
>> I would have expected postgresql to "share" a preliminary sort on c1 that
>> would then be useful to reduce the work on all window functions but it
>> doesn't.
>>
>
> The plan shown does share - the output of one sort goes into another.
> Subsequent sorts still have to happen but they should be faster as the
> first field is already grouped.  Doesn’t change the plan though.
>
>
>> I even created an index on c1 hoping 

Re: Basic question about structuring SQL

2020-07-07 Thread Laurenz Albe
On Tue, 2020-07-07 at 12:40 +0100, Robert Inder wrote:
> NOW, I want to do the same for a BIGGER task.
> BUT I realise that if I create bigger-task.sql like this...
>BEGIN;
>   <>
>\i do-something-useful.sql
>   <>
>COMMIT;
> ...the COMMIT inside "do-something-useful.sql" closes the transaction started 
> in "bigger-task.sql"
> So I can get some things committed even if other things (in tidy-up) fail.
> 
> So how SHOULD I tackle this? 
> PostgreSQL does not do nested transactions (right?)

You can use subtransactions.

In the inner scripts, don't use BEGIN; use SAVEPOINT name instead.
Instead of ROLLBACK in the inner script, use ROLLBACK TO SAVEPOINT name.
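Applied to the scripts from the original question, that might look roughly
like this (savepoint and statement names are illustrative):

```sql
-- bigger-task.sql: the outer transaction
BEGIN;

-- start of the inner script's work
SAVEPOINT do_something_useful;
UPDATE widgets SET status = 'done' WHERE status = 'pending';

-- if the inner script hits an error, undo only its work:
ROLLBACK TO SAVEPOINT do_something_useful;

-- the outer script carries on and commits whatever else succeeded
COMMIT;
```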

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Basic question about structuring SQL

2020-07-07 Thread Jason Wang
I don't think a nested commit is supported. However, you might want to put
the logic in do-something-useful.sql into a stored procedure without a
commit, and have your bigger task call this procedure and commit at the end;
you can also run the procedure by itself to get transaction protection.
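Roughly like this, assuming PostgreSQL 11+ for CREATE PROCEDURE (the table
and statements are illustrative stand-ins for the real inner logic):

```sql
-- Wrap the inner logic in a procedure that contains no COMMIT;
-- the caller decides when the transaction ends.
CREATE PROCEDURE do_something_useful()
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE widgets SET status = 'done' WHERE status = 'pending';
    DELETE FROM widgets_log WHERE logged_at < now() - interval '30 days';
END;
$$;

-- The bigger task owns the transaction:
BEGIN;
CALL do_something_useful();
-- ... other steps, tidy-up, etc. ...
COMMIT;
```

Called outside an explicit transaction block, CALL runs in its own
transaction, so the procedure is still all-or-nothing on its own.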

/Jason

On Tue, 7 Jul 2020 at 21:41, Robert Inder  wrote:

> I'm an experienced programmer but really new to SQL,
> and I'm wanting a pointer to "the SQL way" to structure/organise chunks of
> code.
>
> A while back, I wrote a chunk of SQL to Do Something Useful.
> I put it in a file (do-something-useful.sql).
> And, to protect against getting into a weird state, I wrapped the code in
> my file with
>BEGIN;
>UPDATE
>DELETE...
>COMMIT;
> With the idea that I can do
>psql my_database
>\i do-something-useful.sql
> And be sure that either my task will have been completed, or nothing
> will have changed.
>
> NOW, I want to do the same for a BIGGER task.
> BUT I realise that if I create bigger-task.sql like this...
>BEGIN;
>   <>
>\i do-something-useful.sql
>   <>
>COMMIT;
> ...the COMMIT inside "do-something-useful.sql" closes the transaction
> started in "bigger-task.sql"
> So I can get some things committed even if other things (in tidy-up) fail.
>
> So how SHOULD I tackle this?
> PostgreSQL does not do nested transactions (right?)
>
> So how should I structure my chunks of SQL so that I can have "safe"
> (all-or-nothing) blocks,
> AND use them from within one another?
>
> Robert
>
>


Basic question about structuring SQL

2020-07-07 Thread Robert Inder
I'm an experienced programmer but really new to SQL,
and I'm wanting a pointer to "the SQL way" to structure/organise chunks of
code.

A while back, I wrote a chunk of SQL to Do Something Useful.
I put it in a file (do-something-useful.sql).
And, to protect against getting into a weird state, I wrapped the code in
my file with
   BEGIN;
   UPDATE
   DELETE...
   COMMIT;
With the idea that I can do
   psql my_database
   \i do-something-useful.sql
And be sure that either my task will have been completed, or nothing
will have changed.

NOW, I want to do the same for a BIGGER task.
BUT I realise that if I create bigger-task.sql like this...
   BEGIN;
  <>
   \i do-something-useful.sql
  <>
   COMMIT;
...the COMMIT inside "do-something-useful.sql" closes the transaction
started in "bigger-task.sql"
So I can get some things committed even if other things (in tidy-up) fail.

So how SHOULD I tackle this?
PostgreSQL does not do nested transactions (right?)

So how should I structure my chunks of SQL so that I can have "safe"
(all-or-nothing) blocks,
AND use them from within one another?

Robert


Transaction control in SECURITY DEFINER procedures

2020-07-07 Thread Chris Sterritt
The documentation for CREATE PROCEDURE informs us "A SECURITY DEFINER
procedure cannot execute transaction control statements (for example,
COMMIT and ROLLBACK, depending on the language)."


Can anyone let me know why this is so, and whether there are any plans to
remove this restriction in future releases?


Regards, Chris Sterritt



Auto vacuum not reclaiming the space and disk utilization spike to 100%

2020-07-07 Thread Ishan Joshi
Hi Team,

I am running Postgres v12.2, and in our testing we run a 72-hour performance
test under load. The run works fine for the first 16 hours; after that, disk
utilization starts to spike, and over the next 2-4 hours it reaches 100%.

During the run we delete 2 million records from one of the partitioned
table's older partitions every 20 seconds. As we delete these records
gradually, autovacuum runs accordingly. But I can see a few partitions from
which all records have been cleaned out, yet whose size is still in the MBs.

nftdb=# select count(*) from XXX_sequence_pp18;
 count
-------
     0

       Table       |  Size  | External Size
-------------------+--------+---------------
 xxx_sequence_pp18 | 340 MB | 74 MB

I suspect this is why disk utilization is getting higher in my environment.

Can you please help me tune this issue?

Thanks & Regards,
Ishan Joshi




SV: SV: Using Postgres jdbc driver with Oracle SQL Developer

2020-07-07 Thread Niels Jespersen
-Oprindelig meddelelse-
>Fra: legrand legrand  
>Sendt: 6. juli 2020 17:36
>Til: pgsql-gene...@postgresql.org
>Emne: Re: SV: Using Postgres jdbc driver with Oracle SQL Developer
>
>Try Replacing hostname by hostname/Database?
>Don’t Forget ?
>
>Regards
>Pascal

Ha! That's it! Almost.

It's actually the port number that must be replaced with portnumber/database
(i.e. jdbc:postgresql://hostname:portnumber/database).

Thank you very much for your help. 

Regards Niels