Re: Enforce primary key on every table during dev?

2018-02-28 Thread Melvin Davidson
On Wed, Feb 28, 2018 at 4:07 PM, Tim Cross  wrote:

>
> Jeremy Finzel  writes:
>
> > We want to enforce a policy, partly just to protect those who might
> forget,
> > for every table in a particular schema to have a primary key.  This can't
> > be done with event triggers as far as I can see, because it is quite
> > legitimate to do:
> >
> > BEGIN;
> > CREATE TABLE foo (id int);
> > ALTER TABLE foo ADD PRIMARY KEY (id);
> > COMMIT;
> >
> > It would be nice to have some kind of "deferrable event trigger" or some
> > way to enforce that no transaction commits which added a table without a
> > primary key.
> >
>
> I think you would be better off having an automated report which alerts
> you to tables lacking a primary key and deal with that policy through
> other means. Using triggers in this way often leads to unexpected
> behaviour and difficult to identify bugs. The policy is a management
> policy and probably should be dealt with via management channels rather
> than technical ones. Besides, the likely outcome will be your developers
> will just adopt the practice of adding a serial column to every table,
> which in itself doesn't really add any value.
>
> Tim
>
>
> --
> Tim Cross
>
>










> I think you would be better off having an automated report which alerts
> you to tables lacking a primary key and deal with that policy through
> other means.

Perhaps a better solution is to have a meeting with the developers and
explain to them WHY the policy of enforcing a primary key is important.
Also, explain the purpose of primary keys and why it is not always suitable
to just use an integer or serial as the key, but rather why natural unique
(even multi-column) keys are better. But this begs the question: why are
"developers" allowed to design database tables? That should be the job of
the DBA! At the very minimum, the DBA should be reviewing and have the
authority to approve or disapprove of table/schema designs/changes.
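
For instance (an illustrative table, not taken from this thread), a natural,
multi-column primary key instead of a surrogate serial column:

CREATE TABLE order_line (
    order_id   int NOT NULL,
    line_no    int NOT NULL,
    product_id int NOT NULL,
    qty        int NOT NULL,
    PRIMARY KEY (order_id, line_no)  -- the natural key of an order line
);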

-- 
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!


Re: selecting timestamp

2018-02-28 Thread chris

Thank you everyone for the help.
Sorry there wasn't much detail given originally.

CURRENT_TIMESTAMP at time zone 'UTC' as ct4

Seems to be working well.

Thanks,
Chris

On 02/27/2018 01:26 PM, David G. Johnston wrote:
On Tue, Feb 27, 2018 at 1:16 PM, chris wrote:



What is the best way of selecting current timestamp in UTC?


You haven't defined criteria upon which to judge - and the list below is not
exhaustive (but sufficiently so).


SELECT
CURRENT_TIMESTAMP   as ct1


standard conforming, assumes server configured for UTC

,timezone('UTC',CURRENT_TIMESTAMP) as ct2
,timezone('utc',now()) as ct3


non-standard, personally non-obvious (the function looks like an
implementation detail that should be avoided)


,CURRENT_TIMESTAMP at time zone 'UTC' as ct4


This -

standard conforming and doesn't require assumptions about the calling 
environment


,NOW() at time zone 'utc' as ct5


non-standard but frequently used; no semantic difference compared to
the previous entry
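
As a small illustration of the "assumes server configured for UTC" caveat
above (the session timezone chosen below is arbitrary, just to show the contrast):

SET timezone = 'America/Phoenix';
SELECT CURRENT_TIMESTAMP                    AS ct1,  -- timestamptz, rendered in the session's zone (-07)
       CURRENT_TIMESTAMP AT TIME ZONE 'UTC' AS ct4;  -- timestamp, UTC wall clock regardless of session zone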

​
David J.






Re: Enforce primary key on every table during dev?

2018-02-28 Thread Tim Cross

Jeremy Finzel  writes:

> We want to enforce a policy, partly just to protect those who might forget,
> for every table in a particular schema to have a primary key.  This can't
> be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>
> BEGIN;
> CREATE TABLE foo (id int);
> ALTER TABLE foo ADD PRIMARY KEY (id);
> COMMIT;
>
> It would be nice to have some kind of "deferrable event trigger" or some
> way to enforce that no transaction commits which added a table without a
> primary key.
>

I think you would be better off having an automated report which alerts
you to tables lacking a primary key and deal with that policy through
other means. Using triggers in this way often leads to unexpected
behaviour and difficult to identify bugs. The policy is a management
policy and probably should be dealt with via management channels rather
than technical ones. Besides, the likely outcome will be your developers
will just adopt the practice of adding a serial column to every table,
which in itself doesn't really add any value.

Tim


-- 
Tim Cross



Re: How to avoid trailing zero (after decimal point) for numeric type column

2018-02-28 Thread Charles Clavadetscher
Hello

> On 28.02.2018, at 13:33, pkashimalla  wrote:
> 
> Hello Team,
> 
> We have recently migrated our database from Oracle
> And there are few of my tables with numeric column type.
> 
> In below example
> I did insertion from java program with below code snippet
> 
> Double object = 10.0;
> String inserRecordQuery_NEWWAY11 = "INSERT INTO BLOB_TEST_TABLE(id)VALUES
> (?)";
> selectPrepareStmt.setObject(1, object,Types.NUMERIC);
> int count = selectPrepareStmt.executeUpdate();
> 
> it inserted like this.
> /
> select id from blob_test_table;
> 
> id
> numeric
> -
> 10.0/
> 
> 
> In this case, when a decimal point is equal to 0 then,  I don't want to see
> the precision and the value in the column should just 10
> 
> And If I execute code,
> 
> Double object = 10.5801
> String inserRecordQuery_NEWWAY11 = "INSERT INTO BLOB_TEST_TABLE(id)VALUES
> (?)";
> selectPrepareStmt.setObject(1, object,Types.NUMERIC);
> int count = selectPrepareStmt.executeUpdate();
> 
> Now ,the value in the column should be 10.5801 as the precision is greater
> than ZERO
> 
> Because of this, the migrated data (from Oracle) is without PRECISION ZERO
> and the new data which is being inserted is with PRECISION ZERO. 
> 
> /
> select id from blob_test_table;
> 
> id
> numeric
> -
> 10.0
> 10
> 11
> 11.0
> /
> 
> Is there a possible setting in PostgreSQL server to achieve this?
> 

I think that you should set the column type to real or double precision:

select 10.0::double precision;
10

select 10.5801::double precision;
10.5801

Regards
Charles

> FYI - 
> 
> Oracle's NUMBER column type is handling it as I expected.
> I migrate Oracle's NUMBER column as just NUMERIC column in PostgreSQL
> 
> 
> 
> Thanks,
> Praveen
> 
> 
> 
> 
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
> 




How to avoid trailing zero (after decimal point) for numeric type column

2018-02-28 Thread pkashimalla
Hello Team,

We have recently migrated our database from Oracle, and a few of my tables
have a numeric column type.

In the below example I did the insertion from a Java program with the below
code snippet:

Double object = 10.0;
String inserRecordQuery_NEWWAY11 = "INSERT INTO BLOB_TEST_TABLE(id)VALUES
(?)";
selectPrepareStmt.setObject(1, object,Types.NUMERIC);
int count = selectPrepareStmt.executeUpdate();

It inserted like this:

select id from blob_test_table;

id
numeric
-
10.0


In this case, when the decimal part is equal to 0, I don't want to see the
precision, and the value in the column should be just 10.

And if I execute this code,

Double object = 10.5801;
String inserRecordQuery_NEWWAY11 = "INSERT INTO BLOB_TEST_TABLE(id)VALUES
(?)";
selectPrepareStmt.setObject(1, object,Types.NUMERIC);
int count = selectPrepareStmt.executeUpdate();

Now, the value in the column should be 10.5801 as the precision is greater
than ZERO.

Because of this, the migrated data (from Oracle) is without PRECISION ZERO
and the new data which is being inserted is with PRECISION ZERO. 

select id from blob_test_table;

id
numeric
-
10.0
10
11
11.0

Is there a possible setting in PostgreSQL server to achieve this?

FYI - 

Oracle's NUMBER column type handles this as I expected.
I migrated Oracle's NUMBER columns as plain NUMERIC columns in PostgreSQL.



Thanks,
Praveen




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html



Re: Parallel index creation & pg_stat_activity

2018-02-28 Thread Andres Freund
Hi Peter,

On 2018-02-28 16:50:44 +, Phil Florent wrote:
> With an index creation (create index t1_i1 on t1(c1, c2);) I have this kind 
> of output :
> 
> ./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
> busy_pc | distinct_exe | pid  |  backend_type  |   query  
>  | wait_event_type |  wait_event
> -+--+--++---+-+--
>   68 | 1 / 136  | 8262 | client backend | create index t1_i1 on 
> t1(c1, c2); | IO  | DataFileRead
>   26 | 1 / 53   | 8262 | client backend | create index t1_i1 on 
> t1(c1, c2); | |
>6 | 1 / 11   | 8262 | client backend | create index t1_i1 on 
> t1(c1, c2); | IO  | BufFileWrite
> (3 rows)

> No parallel worker. At least one parallel worker was active though, I could 
> see its work with a direct query on pg_stat_activity or a ps -ef :
> 
> ...
> postgres  8262  8230  7 08:54 ?00:22:46 postgres: 11/main: postgres 
> postgres [local] CREATE INDEX
> ...
> postgres  9833  8230 23 14:17 ?00:00:33 postgres: 11/main: parallel 
> worker for PID 8262
> ...

Looks like we're not doing a pgstat_report_activity() in the workers?
Any argument for not doing so?

Greetings,

Andres Freund



Parallel index creation & pg_stat_activity

2018-02-28 Thread Phil Florent


Hi,
I created an index on an 11devel database while sampling pg_stat_activity with a
little tool. The tool catches a line if state = 'active'. Collected rows are
aggregated and sorted by activity percentage.

Test environment :

select version();
  version

PostgreSQL 11devel (Debian 11~~devel~20180227.2330-1~420.git51057fe.pgdg+1) on 
x86_64-pc-linux-gnu, compiled by gcc (Debian 7.3.0-5) 7.3.0, 64-bit
(1 row)

Time: 0.762 ms

create table t1(c1 bigint, c2 double precision, c3 text);
CREATE TABLE

insert into t1 select generate_series(1,1,1), random(), 
md5(random()::text) ;
INSERT 0 1

With a select (select max(c1) from t1 group by c2;) I have this kind of output:

./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database 
...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 11
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
busy_pc | distinct_exe |  pid  |  backend_type   |query 
   | wait_event_type |  wait_event
-+--+---+-+-+-+--
  76 | 1 / 151  | 10065 | parallel worker | select max(c1) from t1 
group by c2; | IO  | DataFileRead
  73 | 1 / 146  |  8262 | client backend  | select max(c1) from t1 
group by c2; | IO  | DataFileRead
  72 | 1 / 144  | 10066 | parallel worker | select max(c1) from t1 
group by c2; | IO  | DataFileRead
  26 | 1 / 53   | 10066 | parallel worker | select max(c1) from t1 
group by c2; | |
  26 | 1 / 51   |  8262 | client backend  | select max(c1) from t1 
group by c2; | |
  24 | 1 / 47   | 10065 | parallel worker | select max(c1) from t1 
group by c2; | |
   2 | 1 / 3| 10066 | parallel worker | select max(c1) from t1 
group by c2; | IO  | BufFileWrite
   2 | 1 / 3|  8262 | client backend  | select max(c1) from t1 
group by c2; | IO  | BufFileWrite
   1 | 1 / 2| 10065 | parallel worker | select max(c1) from t1 
group by c2; | IO  | BufFileWrite



With an index creation (create index t1_i1 on t1(c1, c2);) I have this kind of
output:

./t -d 20 -o "pid, backend_type, query, wait_event_type, wait_event"
traqueur 2.04.00 - performance tool for PostgreSQL 9.3 => 11
INFORMATION, no connection parameters provided, connecting to traqueur database 
...
INFORMATION, connected to traqueur database
INFORMATION, PostgreSQL version : 11
INFORMATION, sql preparation ...
INFORMATION, sql execution ...
busy_pc | distinct_exe | pid  |  backend_type  |   query
   | wait_event_type |  wait_event
-+--+--++---+-+--
  68 | 1 / 136  | 8262 | client backend | create index t1_i1 on t1(c1, 
c2); | IO  | DataFileRead
  26 | 1 / 53   | 8262 | client backend | create index t1_i1 on t1(c1, 
c2); | |
   6 | 1 / 11   | 8262 | client backend | create index t1_i1 on t1(c1, 
c2); | IO  | BufFileWrite
(3 rows)


No parallel worker. At least one parallel worker was active though, I could see
its work with a direct query on pg_stat_activity or a ps -ef:

...
postgres  8262  8230  7 08:54 ?00:22:46 postgres: 11/main: postgres 
postgres [local] CREATE INDEX
...
postgres  9833  8230 23 14:17 ?00:00:33 postgres: 11/main: parallel 
worker for PID 8262
...

The tool only catches the activity of the client backend because the state column
of pg_stat_activity is null for the parallel workers in this case. I added an
option to use "(state = 'active' or wait_event is not null)". It's not 100%
accurate though: I miss the activity of any parallel worker that is not
waiting, and it's harder to know who helps whom since query is also null.
I can imagine various workarounds, but 11 is in devel, so maybe the state and
query columns of pg_stat_activity will be filled in for the parallel workers even
for an index creation?
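
A minimal sketch of the widened sampling predicate (columns as exposed by
pg_stat_activity on PostgreSQL 10 and later; the aggregation the tool performs
on the samples is omitted here):

SELECT pid, backend_type, query, wait_event_type, wait_event
  FROM pg_stat_activity
 WHERE state = 'active'
    OR wait_event IS NOT NULL;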

Best regards
Phil












Re: Enforce primary key on every table during dev?

2018-02-28 Thread David G. Johnston
On Wed, Feb 28, 2018 at 6:34 AM, Jeremy Finzel  wrote:

> We want to enforce a policy, partly just to protect those who might
> forget, for every table in a particular schema to have a primary key.  This
> can't be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>

Add a query to your test suite that queries the catalogs and fails if this
policy is violated.  There is nothing in a running PostgreSQL server
instance that is going to enforce this for you.
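
For example, a minimal sketch of such a check (the schema name 'app' is just a
placeholder for whichever schema the policy covers):

DO $$
BEGIN
    IF EXISTS (
        SELECT 1
          FROM pg_class c
          JOIN pg_namespace n ON n.oid = c.relnamespace
         WHERE c.relkind = 'r'
           AND n.nspname = 'app'
           AND NOT EXISTS (SELECT 1
                             FROM pg_index i
                            WHERE i.indrelid = c.oid
                              AND i.indisprimary)
    ) THEN
        RAISE EXCEPTION 'table(s) without a primary key found in schema app';
    END IF;
END $$;

Run from CI or a migration test, it fails the build whenever a table slips in
without a primary key.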

David J.


Re: Enforce primary key on every table during dev?

2018-02-28 Thread John McKown
On Wed, Feb 28, 2018 at 7:57 AM, Adrian Klaver 
wrote:

> On 02/28/2018 05:52 AM, John McKown wrote:
>
>> On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel wrote:
>>
>> We want to enforce a policy, partly just to protect those who might
>> forget, for every table in a particular schema to have a primary
>> key.  This can't be done with event triggers as far as I can see,
>> because it is quite legitimate to do:
>>
>> BEGIN;
>> CREATE TABLE foo (id int);
>> ALTER TABLE foo ADD PRIMARY KEY (id);
>> COMMIT;
>>
>> It would be nice to have some kind of "deferrable event trigger" or
>> some way to enforce that no transaction commits which added a table
>> without a primary key.
>>
>> Any ideas?
>>
>> Thanks,
>> Jeremy
>>
>>
>>
>> ​What stops somebody from doing:
>>
>> CREATE TABLE foo (filler text primary key default null, realcol1 int,
>> realcol2 text);
>>
>> And then just never bother to ever insert anything into the column
>> FILLER? It fulfills your stated requirement​ of every table having a
>>
>
> Then you would get this:
>
> test=# CREATE TABLE foo (filler text primary key default null, realcol1
> int, realcol2 text);
> CREATE TABLE
> test=# insert into  foo (realcol1, realcol2) values (1, 'test');
> ERROR:  null value in column "filler" violates not-null constraint
> DETAIL:  Failing row contains (null, 1, test).
>

Hum, it's been so long, I totally forgot. Which makes me wonder why the
parser doesn't "know" that a default of NULL for a primary key is going to
fail anyway and flag it at CREATE time. Oh, well. Thanks.



>
> primary key. Of course, you could amend the policy to say a "non-NULL
>> primary key".
>>
>>
>>
>> --
>> I have a theory that it's impossible to prove anything, but I can't prove
>> it.
>>
>> Maranatha! <><
>> John McKown
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
I have a theory that it's impossible to prove anything, but I can't prove
it.

Maranatha! <><
John McKown


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Melvin Davidson
On Wed, Feb 28, 2018 at 8:57 AM, Adrian Klaver 
wrote:

> On 02/28/2018 05:52 AM, John McKown wrote:
>
>> On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel wrote:
>>
>> We want to enforce a policy, partly just to protect those who might
>> forget, for every table in a particular schema to have a primary
>> key.  This can't be done with event triggers as far as I can see,
>> because it is quite legitimate to do:
>>
>> BEGIN;
>> CREATE TABLE foo (id int);
>> ALTER TABLE foo ADD PRIMARY KEY (id);
>> COMMIT;
>>
>> It would be nice to have some kind of "deferrable event trigger" or
>> some way to enforce that no transaction commits which added a table
>> without a primary key.
>>
>> Any ideas?
>>
>> Thanks,
>> Jeremy
>>
>>
>>
>> ​What stops somebody from doing:
>>
>> CREATE TABLE foo (filler text primary key default null, realcol1 int,
>> realcol2 text);
>>
>> And then just never bother to ever insert anything into the column
>> FILLER? It fulfills your stated requirement​ of every table having a
>>
>
> Then you would get this:
>
> test=# CREATE TABLE foo (filler text primary key default null, realcol1
> int, realcol2 text);
> CREATE TABLE
> test=# insert into  foo (realcol1, realcol2) values (1, 'test');
> ERROR:  null value in column "filler" violates not-null constraint
> DETAIL:  Failing row contains (null, 1, test).
>
>
> primary key. Of course, you could amend the policy to say a "non-NULL
>> primary key".
>>
>>
>>
>> --
>> I have a theory that it's impossible to prove anything, but I can't prove
>> it.
>>
>> Maranatha! <><
>> John McKown
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
As Adrian pointed out, by definition a PK creates a constraint which is
NOT NULLABLE.

Here is the SQL to check for tables with no primary key.









SELECT n.nspname,
       c.relname AS table,
       c.reltuples::bigint
  FROM pg_class c
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
 WHERE relkind = 'r'
   AND relhaspkey = FALSE
 ORDER BY n.nspname, c.relname;

-- 
Melvin Davidson
Maj. Database & Exploration Specialist
Universe Exploration Command – UXC
Employment by invitation only!


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Adrian Klaver

On 02/28/2018 05:52 AM, John McKown wrote:
On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel wrote:


We want to enforce a policy, partly just to protect those who might
forget, for every table in a particular schema to have a primary
key.  This can't be done with event triggers as far as I can see,
because it is quite legitimate to do:

BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;

It would be nice to have some kind of "deferrable event trigger" or
some way to enforce that no transaction commits which added a table
without a primary key.

Any ideas?

Thanks,
Jeremy



​What stops somebody from doing:

CREATE TABLE foo (filler text primary key default null, realcol1 int, 
realcol2 text);


And then just never bother to ever insert anything into the column 
FILLER? It fulfills your stated requirement​ of every table having a 


Then you would get this:

test=# CREATE TABLE foo (filler text primary key default null, realcol1 
int, realcol2 text);

CREATE TABLE
test=# insert into  foo (realcol1, realcol2) values (1, 'test');
ERROR:  null value in column "filler" violates not-null constraint
DETAIL:  Failing row contains (null, 1, test).

primary key. Of course, you could amend the policy to say a "non-NULL 
primary key".




--
I have a theory that it's impossible to prove anything, but I can't 
prove it.


Maranatha! <><
John McKown



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Enforce primary key on every table during dev?

2018-02-28 Thread John McKown
On Wed, Feb 28, 2018 at 7:34 AM, Jeremy Finzel  wrote:

> We want to enforce a policy, partly just to protect those who might
> forget, for every table in a particular schema to have a primary key.  This
> can't be done with event triggers as far as I can see, because it is quite
> legitimate to do:
>
> BEGIN;
> CREATE TABLE foo (id int);
> ALTER TABLE foo ADD PRIMARY KEY (id);
> COMMIT;
>
> It would be nice to have some kind of "deferrable event trigger" or some
> way to enforce that no transaction commits which added a table without a
> primary key.
>
> Any ideas?
>
> Thanks,
> Jeremy
>


What stops somebody from doing:

CREATE TABLE foo (filler text primary key default null, realcol1 int,
realcol2 text);

And then just never bother to ever insert anything into the column FILLER?
It fulfills your stated requirement of every table having a primary key.
Of course, you could amend the policy to say a "non-NULL primary key".



-- 
I have a theory that it's impossible to prove anything, but I can't prove
it.

Maranatha! <><
John McKown


Re: Enforce primary key on every table during dev?

2018-02-28 Thread Achilleas Mantzios

On 28/02/2018 15:34, Jeremy Finzel wrote:
We want to enforce a policy, partly just to protect those who might forget, for every table in a particular schema to have a primary key.  This can't be done with event triggers as far as I can see, 
because it is quite legitimate to do:


BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;

It would be nice to have some kind of "deferrable event trigger" or some way to 
enforce that no transaction commits which added a table without a primary key.

Any ideas?

A cron job to check for tables without a PK? Although for a short period the
offending table would be live without a PK.
But IMO every table, in addition to a PK, should also have natural unique keys
as much as possible.
Better safe than sorry.


Thanks,
Jeremy



--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt




Enforce primary key on every table during dev?

2018-02-28 Thread Jeremy Finzel
We want to enforce a policy, partly just to protect those who might forget,
for every table in a particular schema to have a primary key.  This can't
be done with event triggers as far as I can see, because it is quite
legitimate to do:

BEGIN;
CREATE TABLE foo (id int);
ALTER TABLE foo ADD PRIMARY KEY (id);
COMMIT;

It would be nice to have some kind of "deferrable event trigger" or some
way to enforce that no transaction commits which added a table without a
primary key.

Any ideas?

Thanks,
Jeremy


Re: Given a set of daterange, finding the continuous range that includes a particular date (aggregates)

2018-02-28 Thread mariusz

hi,

On Tue, 2018-02-27 at 16:15 -0800, Ken Tanzer wrote:
> Hi.  Thanks so much for your assistance.  This is definitely getting
> the results I was looking for.  It is still syntacticallly more
> cumbersome than I might have hoped, but I can work with it.  So I've
> got two follow questions/issues:
> 
> 
> 1)  I can see there are many, more complex, options for aggregates,
> which I am trying to wrap my mind around.  I'm wondering if any of
> these (esp. partial aggregates/combine functions, final functions or
> moving aggregates) could be used to streamline this into a single
> function call, or if that is barking up a dead tree.
> 
I'm not an expert and I don't use fancy aggregates too often; I just came up
with a simple solution, so I may be wrong, but I don't see a simple solution as
one function. So unless someone comes up with a better idea, you have to
stay with this, which isn't complex, as the idea and solution are really
very simple (the simplicity is somewhat visually hidden in this case by
converting dates to ranges and back to boundary dates again, and again to
ranges for visualizing the result).
But it does require an additional sort for each window, as you can see in
EXPLAIN, which is the cost to pay.
> 
> 2)  I'm sure at this point I must be being dense, but after re-reading
> docs multiple times, I am still confused about one piece of this:
> 
> 
> > first window (within lower() func) extends its initial
> (earliest) range
> > to right for every next row continuous with current range
> (and jumps to
> > new start on discontinuity), thus over(order by d ASC)
> > second window (within upper() func) extends its initial
> (latest) range
> > to left for every next row continuous with current range
> (and jumps to
> > new end on discontinuity), thus over(order by d DESC)
> > partition by client_id within each over() is to not mix
> client_ids'
> > dateranges
> >
> 
> 
> which is what is it exactly that is triggering Postgresql to know
> there is a discontinuity and to start with a new range?  And is it
> based on the input or the output values?  Based on PARTITION BY
> client_id ORDER BY d, I might have thought it was d.  But that doesn't
> seem to be right.  So is it something about what agg_daterange is
> returning, and if so what?  Again, sorry for being dense about this
> one.
> 
PostgreSQL itself doesn't know or care about those discontinuities; all
it cares about is the PARTITION BY client_id, so client_ids are not mixed, and
applying our aggregate function to the rows in the order of our choice.

Here again I remind you: you don't really want to sort by d::daterange
for the second, descending window, but by end_date, because the descending order
of the range will not be the same as the descending order of end_date in the
general case; and by start_date for the first window (though, as I already said,
for that first ascending window it is cosmetic, since the order of d::daterange
and start_date::date will be the same).
That doesn't really matter in your case of exclusive ranges, but it matters
in the more general case.

And back to your question: we find the discontinuity in our function. See
the query

SELECT CASE WHEN d1 && d2 OR d1 -|- d2 THEN d1 + d2 ELSE d2 END

where d1 is the internal state value of the aggregate and d2 is the current
row's daterange value.
We check for continuity with d1 && d2 OR d1 -|- d2; if it is continuous then we
just extend the result (the return value for the current row and the new state
value of the aggregate) to d1 + d2.
Otherwise, when we find a discontinuity, we forget d1 (the aggregate state
value until now) and simply return d2, the current row's range, thus starting
a new range.
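
For reference, a minimal sketch of how such an aggregate might be declared (the
function and aggregate names here are assumptions; the thread only shows the
state-transition expression):

CREATE FUNCTION daterange_extend(d1 daterange, d2 daterange)
RETURNS daterange
LANGUAGE sql IMMUTABLE AS
$$ SELECT CASE WHEN d1 && d2 OR d1 -|- d2 THEN d1 + d2 ELSE d2 END $$;

CREATE AGGREGATE agg_daterange(daterange) (
    SFUNC = daterange_extend,
    STYPE = daterange
);

It is then used as a window aggregate, e.g.
agg_daterange(d) OVER (PARTITION BY client_id ORDER BY d).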

Hope that explains enough. As I already said, I'm not an expert; I just happen
to be working on a semi-toy project which uses dateranges quite heavily.
Anyway, feel free to ask if you have any further questions. For now I'm
glad I could help somehow.

regards,
mariusz jadczak

> 
> Thanks!
> 
> 
> Ken
> 
> 
> 
> -- 
> 
> AGENCY Software  
> A Free Software data system
> By and for non-profits
> http://agency-software.org/
> https://demo.agency-software.org/client
> 
> ken.tan...@agency-software.org
> (253) 245-3801
> 
> 
> Subscribe to the mailing list to
> learn more about AGENCY or
> follow the discussion.





Re: Parallel Query - Can it be used within functions?

2018-02-28 Thread Michael Krüger
OK, to close this thread: the problem is that plpgsql functions seem to
return data using a cursor, and that fact disables parallel execution. If
we instead hand the SQL over to a function with e.g. a Python body, then
parallel execution does happen, because the data is first assembled in
memory before it is returned, without using a cursor:

mkrueger=# explain analyze select * from reports.generic_query_python($$
select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where
mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1 $$) as foo (mediatrunkid bigint, count numeric);
LOG:  0: duration: 35.158 ms  plan:
Query Text:  select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Partial HashAggregate  (cost=56693.72..56696.72 rows=300 width=16) (actual
time=35.144..35.149 rows=17 loops=1)
  Group Key: mediatrunkid
  Buffers: shared hit=1641
  ->  Parallel Bitmap Heap Scan on mediatrunkkpi  (cost=4525.01..56279.28
rows=82889 width=8) (actual time=2.350..24.584 rows=63794 loops=1)
Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Heap Blocks: exact=1641
Buffers: shared hit=1641
->  Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (never executed)
  Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
LOCATION:  explain_ExecutorEnd, auto_explain.c:359
LOG:  0: duration: 35.165 ms  plan:
Query Text:  select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Partial HashAggregate  (cost=56693.72..56696.72 rows=300 width=16) (actual
time=35.152..35.157 rows=17 loops=1)
  Group Key: mediatrunkid
  Buffers: shared hit=1630
  ->  Parallel Bitmap Heap Scan on mediatrunkkpi  (cost=4525.01..56279.28
rows=82889 width=8) (actual time=2.364..24.702 rows=63272 loops=1)
Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
Heap Blocks: exact=1630
Buffers: shared hit=1630
->  Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (never executed)
  Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
LOCATION:  explain_ExecutorEnd, auto_explain.c:359
LOG:  0: duration: 47.855 ms  plan:
Query Text:  select mediatrunkid,count(*)::numeric from
reports.mediatrunkkpi where mediatrunkid in
(2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
group by 1
Finalize GroupAggregate  (cost=57784.41..57792.66 rows=300 width=40)
(actual time=45.331..45.344 rows=17 loops=1)
  Group Key: mediatrunkid
  Buffers: shared hit=2735
  ->  Sort  (cost=57784.41..57785.91 rows=600 width=16) (actual
time=45.322..45.325 rows=51 loops=1)
Sort Key: mediatrunkid
Sort Method: quicksort  Memory: 27kB
Buffers: shared hit=2735
->  Gather  (cost=57693.72..57756.72 rows=600 width=16) (actual
time=45.270..45.295 rows=51 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=2735
  ->  Partial HashAggregate  (cost=56693.72..56696.72 rows=300
width=16) (actual time=38.387..38.391 rows=17 loops=3)
Group Key: mediatrunkid
Buffers: shared hit=6006
->  Parallel Bitmap Heap Scan on mediatrunkkpi
(cost=4525.01..56279.28 rows=82889 width=8) (actual time=5.564..27.399
rows=67080 loops=3)
  Recheck Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
  Heap Blocks: exact=1912
  Buffers: shared hit=6006
  ->  Bitmap Index Scan on idx_mediatrunkkpi_trunk
(cost=0.00..4475.27 rows=198933 width=0) (actual time=11.229..11.229
rows=201241 loops=1)
Index Cond: (mediatrunkid = ANY
('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2

Re: Parallel Query - Can it be used within functions?

2018-02-28 Thread Michael Krüger
Ah, and for completeness, the simple Python function I use for the test:

create or replace function reports.generic_query_python(_sql text)
  RETURNS SETOF record
  LANGUAGE 'plpythonu'
  PARALLEL SAFE
  COST 100
  VOLATILE
  ROWS 5000
AS $BODY$
return plpy.execute( _sql )
$BODY$;
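
For contrast, a minimal sketch (the function name is assumed, not from the
thread) of a plpgsql wrapper of the same shape; per the observation above, its
RETURN QUERY EXECUTE path hands rows back through a cursor-like mechanism and
did not produce a parallel plan in this scenario:

create or replace function reports.generic_query_plpgsql(_sql text)
  RETURNS SETOF record
  LANGUAGE plpgsql
  PARALLEL SAFE
AS $BODY$
BEGIN
  -- Rows are streamed back through the function's result portal,
  -- which appears to be what prevents a parallel plan here.
  RETURN QUERY EXECUTE _sql;
END
$BODY$;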


Michael Krüger  schrieb am Mi., 28. Feb. 2018 um
09:05 Uhr:

> Ok, to close this thread. The problem is, that plpgsql function do seem to
> return data using a cursor. That fact is disabling parallel execution. So
> if we instead hand over the SQL to a function with e.g. a python body, then
> parallel execution is happening, because the data is first assembled in
> memory before it is returned, without using a cursor:
>
> mkrueger=# explain analyze select * from reports.generic_query_python($$
> select mediatrunkid,count(*)::numeric from reports.mediatrunkkpi where
> mediatrunkid in
> (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
> group by 1 $$) as foo (mediatrunkid bigint, count numeric);
> LOG:  0: duration: 35.158 ms  plan:
> Query Text:  select mediatrunkid,count(*)::numeric from
> reports.mediatrunkkpi where mediatrunkid in
> (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
> group by 1
> Partial HashAggregate  (cost=56693.72..56696.72 rows=300 width=16) (actual
> time=35.144..35.149 rows=17 loops=1)
>   Group Key: mediatrunkid
>   Buffers: shared hit=1641
>   ->  Parallel Bitmap Heap Scan on mediatrunkkpi  (cost=4525.01..56279.28
> rows=82889 width=8) (actual time=2.350..24.584 rows=63794 loops=1)
> Recheck Cond: (mediatrunkid = ANY
> ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
> Heap Blocks: exact=1641
> Buffers: shared hit=1641
> ->  Bitmap Index Scan on idx_mediatrunkkpi_trunk
> (cost=0.00..4475.27 rows=198933 width=0) (never executed)
>   Index Cond: (mediatrunkid = ANY
> ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
> LOCATION:  explain_ExecutorEnd, auto_explain.c:359
> LOG:  0: duration: 35.165 ms  plan:
> Query Text:  select mediatrunkid,count(*)::numeric from
> reports.mediatrunkkpi where mediatrunkid in
> (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
> group by 1
> Partial HashAggregate  (cost=56693.72..56696.72 rows=300 width=16) (actual
> time=35.152..35.157 rows=17 loops=1)
>   Group Key: mediatrunkid
>   Buffers: shared hit=1630
>   ->  Parallel Bitmap Heap Scan on mediatrunkkpi  (cost=4525.01..56279.28
> rows=82889 width=8) (actual time=2.364..24.702 rows=63272 loops=1)
> Recheck Cond: (mediatrunkid = ANY
> ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
> Heap Blocks: exact=1630
> Buffers: shared hit=1630
> ->  Bitmap Index Scan on idx_mediatrunkkpi_trunk
> (cost=0.00..4475.27 rows=198933 width=0) (never executed)
>   Index Cond: (mediatrunkid = ANY
> ('{2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028}'::bigint[]))
> LOCATION:  explain_ExecutorEnd, auto_explain.c:359
> LOG:  0: duration: 47.855 ms  plan:
> Query Text:  select mediatrunkid,count(*)::numeric from
> reports.mediatrunkkpi where mediatrunkid in
> (2129932,2101261,2125841,409613,409612,2142220,2101260,2129933,2125839,2138124,2117644,2125836,2187276,2154508,2101263,2109452,2134028)
> group by 1
> Finalize GroupAggregate  (cost=57784.41..57792.66 rows=300 width=40)
> (actual time=45.331..45.344 rows=17 loops=1)
>   Group Key: mediatrunkid
>   Buffers: shared hit=2735
>   ->  Sort  (cost=57784.41..57785.91 rows=600 width=16) (actual
> time=45.322..45.325 rows=51 loops=1)
> Sort Key: mediatrunkid
> Sort Method: quicksort  Memory: 27kB
> Buffers: shared hit=2735
> ->  Gather  (cost=57693.72..57756.72 rows=600 width=16) (actual
> time=45.270..45.295 rows=51 loops=1)
>   Workers Planned: 2
>   Workers Launched: 2
>   Buffers: shared hit=2735
>   ->  Partial HashAggregate  (cost=56693.72..56696.72 rows=300
> width=16) (actual time=38.387..38.391 rows=17 loops=3)
> Group Key: mediatrunkid
> Buffers: shared hit=6006
> ->  Parallel Bitmap Heap Scan on mediatrunkkpi
> (cost=4525.01..56279.28 rows=82889 width=8) (actual time=5.564..27.399
> rows=67080 loops=3)
>   Recheck Cond: (mediatrunkid = ANY
> ('{2129932,2101261,2125841,409613,409612,