Re: strange slow query performance

2019-01-17 Thread Ben Snaidero
On Thu, Jan 17, 2019 at 4:13 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Jan 17, 2019 at 9:19 AM Ben Snaidero 
> wrote:
> > Any ideas as to why this is happening?
>
> Not really, I would expect roughly double execution time, not an
> exponential increase.  Still not experienced enough to diagnose with
> what has been provided but I will suggest you provide the version that
> you are running these queries against and confirm that the results are
> consistent across multiple runs to discount the possibility that
> locking or other transient work is involved.
>
> Curious if increasing work_mem helps but it's mostly a trial-and-error
> thing for me (I wouldn't expect it to)...
>
> David J.
>

The query was tested multiple times with similar results.  I also tried
increasing work_mem, with the same results.  Memory usage did not increase
or even come close to the 10MB setting.  The query does seem to occupy one
full CPU for the duration of the run, though.  The query is running on
Windows with Postgres 9.6.7.


Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, David G. Johnston wrote:


I would advise changing Contacts to "Activities" as the former can readily
be interpreted (and is in the wild) as both "an instance of contacting a
person" and "the person at the organization who is being contacted" (i.e.,
your People class).


David,

  That's why I originally named the Contacts table Activities and the People
table Contacts. :-) I'll change it back again since there's now a People
table.

Many thanks!

Rich




Re: Refining query statement

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 3:44 PM Rich Shepard  wrote:
> FROM People AS p
>   JOIN Organizations AS o ON p.org_id = o.org_id
>   JOIN Contacts AS c ON c.person_id = p.person_id

I would advise changing Contacts to "Activities" as the former can
readily be interpreted (and is in the wild) as both "an instance of
contacting a person" and "the person at the organization who is being
contacted" (i.e., your People class).

David J.



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


I would think the active flag would be on People or Organizations. Then you
can eliminate them from the query results before you ever get to the contact
history.


Adrian,

Excellent point. I don't know why I put active in the contacts table as it
does make more sense to put it in the people table (as some clients have
multiple sites, each with its own contact person).


Also not sure how you got the above result when the query showed:
WHERE c.active = TRUE AND c.next_contact <= 'today'::date
Or did the history results come from an undisclosed query?


I don't recall which query I used for the MWE. I'll move the active column
to the new table, as well as those values, and keep working on learning how
to formulate complex, multi-table queries.

Also, moving the active column to People, where there is only one row per
person (rather than multiple contact rows per person), removes the need to
find only the most recent row on the 'many' side of the one-to-many
relationship.

Thanks for the advice,

Rich




Re: Refining query statement

2019-01-17 Thread Adrian Klaver

On 1/17/19 2:44 PM, Rich Shepard wrote:

On Thu, 17 Jan 2019, Adrian Klaver wrote:


Got to thinking more and realized the answer depends on what you want the
query to produce. Can you let us know what it is you are trying to pull out
with the query?


Adrian, et al.,

Took your advice and re-thought what I need the query to return. This
allowed me to realize that I don't need a separate contact history query as
I can incorporate it in a single query. The goal and pseudocode are:

Purpose: List all contact information and contact history for active people
where next contact date is less than today.

For each person select person_id, lname, fname, and direct_phone from
People.

For each person get the org_name from Organizations.

For each person get contact history in date order and next contact date from
Contacts where active = True.

Order by next contact dates in ascending order.

Query code:

SELECT p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, c.cont_date,
    c.cont_type, c.notes, c.next_contact, c.comment
FROM People AS p
  JOIN Organizations AS o ON p.org_id = o.org_id
  JOIN Contacts AS c ON c.person_id = p.person_id
WHERE c.active = TRUE AND c.next_contact <= 'today'::date
GROUP BY o.org_name, p.person_id, c.person_id, c.cont_date, c.cont_type,
   c.next_contact
ORDER BY p.person_id, c.next_contact

The results are not correctly grouped or ordered; I'll work on fixing these
issues.

The other issue that needs fixing is identifying the most recent 'active'
value in the Contacts table for each person_id and including that person in
the results only when active = 't'. Here's a MWE of a redacted set of rows
that should not have been returned by the query:

 person_id | cont_date  | cont_type  | notes                              | active | next_contact | comment
-----------+------------+------------+------------------------------------+--------+--------------+---------
        36 | 2018-12-12 | Conference | Notes on this conversation.        | t      | 2018-12-17   |
        36 | 2018-12-17 | Phone      | Left message asking for a meeting. | t      | 2019-01-03   |
        36 | 2019-01-03 | Phone      | Left another message.              | t      | 2019-01-07   |
        36 | 2019-01-07 | Phone      | Going into a meeting.              | t      | 2019-01-14   |
        36 | 2019-01-15 | Phone      | Will call when/if.                 | f      | infinity     |
(5 rows)

Because the most recent row's active value is 'f', these rows should not be
included as there is no next contact date. I'll keep reading to look for the
answer, and pointers will be helpful.


I would think the active flag would be on People or Organizations. Then you 
can eliminate them from the query results before you ever get to the 
contact history.


Also not sure how you got the above result when the query showed:

WHERE c.active = TRUE AND c.next_contact <= 'today'::date

Or did the history results come from an undisclosed query?



Regards,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


Got to thinking more and realized the answer depends on what you want the
query to produce. Can you let us know what it is you are trying to pull out
with the query?


Adrian, et al.,

Took your advice and re-thought what I need the query to return. This
allowed me to realize that I don't need a separate contact history query as
I can incorporate it in a single query. The goal and pseudocode are:

Purpose: List all contact information and contact history for active people
where next contact date is less than today.

For each person select person_id, lname, fname, and direct_phone from People.

For each person get the org_name from Organizations.

For each person get contact history in date order and next contact date from
Contacts where active = True.

Order by next contact dates in ascending order.

Query code:

SELECT p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, c.cont_date,
   c.cont_type, c.notes, c.next_contact, c.comment
FROM People AS p
 JOIN Organizations AS o ON p.org_id = o.org_id
 JOIN Contacts AS c ON c.person_id = p.person_id
WHERE c.active = TRUE AND c.next_contact <= 'today'::date
GROUP BY o.org_name, p.person_id, c.person_id, c.cont_date, c.cont_type,
  c.next_contact
ORDER BY p.person_id, c.next_contact

The results are not correctly grouped or ordered; I'll work on fixing these
issues.

The other issue that needs fixing is identifying the most recent 'active'
value in the Contacts table for each person_id and including that person in
the results only when active = 't'. Here's a MWE of a redacted set of rows
that should not have been returned by the query:

 person_id | cont_date  | cont_type  | notes                              | active | next_contact | comment
-----------+------------+------------+------------------------------------+--------+--------------+---------
        36 | 2018-12-12 | Conference | Notes on this conversation.        | t      | 2018-12-17   |
        36 | 2018-12-17 | Phone      | Left message asking for a meeting. | t      | 2019-01-03   |
        36 | 2019-01-03 | Phone      | Left another message.              | t      | 2019-01-07   |
        36 | 2019-01-07 | Phone      | Going into a meeting.              | t      | 2019-01-14   |
        36 | 2019-01-15 | Phone      | Will call when/if.                 | f      | infinity     |
(5 rows)


Because the most recent row's active value is 'f', these rows should not be
included as there is no next contact date. I'll keep reading to look for the
answer, and pointers will be helpful.
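
One possible shape for this, as a sketch: it assumes the most recent Contacts
row for a person is the one with the greatest cont_date, and it reuses the
table and column names from the query above (everything else is illustrative):

SELECT p.person_id, p.lname, p.fname, p.direct_phone, o.org_name,
       c.cont_date, c.cont_type, c.notes, c.next_contact, c.comment
FROM People AS p
  JOIN Organizations AS o ON p.org_id = o.org_id
  JOIN Contacts AS c ON c.person_id = p.person_id
  JOIN LATERAL (                      -- latest Contacts row for this person
        SELECT c2.active, c2.next_contact
        FROM Contacts AS c2
        WHERE c2.person_id = p.person_id
        ORDER BY c2.cont_date DESC
        LIMIT 1
       ) AS latest ON true
WHERE latest.active
  AND latest.next_contact <= current_date
ORDER BY p.person_id, c.cont_date;

This keeps a person's full contact history only when that person's latest row
is still active and has a next contact date of today or earlier.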

Regards,

Rich



Re: strange slow query performance

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 9:19 AM Ben Snaidero  wrote:
> Any ideas as to why this is happening?

Not really, I would expect roughly double execution time, not an
exponential increase.  Still not experienced enough to diagnose with
what has been provided but I will suggest you provide the version that
you are running these queries against and confirm that the results are
consistent across multiple runs to discount the possibility that
locking or other transient work is involved.

Curious if increasing work_mem helps but it's mostly a trial-and-error
thing for me (I wouldn't expect it to)...

David J.



Re: Varlena with recursive data structures?

2019-01-17 Thread Michel Pelletier
Hi Karl,

I'm going down this road myself.  In addition to the files Tom Lane pointed
out there is also some helpful documentation here:

https://www.postgresql.org/docs/current/storage-toast.html#STORAGE-TOAST-INMEMORY

On Wed, Jan 16, 2019 at 2:09 PM Sam Patterson  wrote:

> Hi all,
>
> I've recently started developing an extension for Postgres for which I'll
> need to create a new variable-length base type. The type will require a
> tree-like structure in order to parse sufficiently, which of course
> probably means having some sort of recursive data structure, like a struct
> that has members which are pointers to itself for child nodes. After doing
> some research, specifically looking at how other variable-length data types
> store their data, it seems almost all of them store the data in a binary
> representation, using bit masks and offsets etc in order to store/access
> the data whilst having an in-memory representation that's used to
> manipulate the data.
>
> I presume the reason for using this approach is that all the data in a
> varlena type has to be contiguous, and the moment you start using pointers
> this is no longer possible. So my question is, given a structure that looks
> something like this,
>
> typedef struct Node
> {
> char *data;
> struct Node *left;
> struct Node *right;
> } Node;
>
> am I right in saying that I wouldn't be able to store that representation
> on-disk, but instead I'd have to transform it into some binary
> representation and back again when writing/reading respectively, are there
> any alternatives?
>
> Regards,
>
> Karl
>
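
To answer the question directly: yes, the pointer-based form cannot be stored
as-is; it has to be flattened into one contiguous chunk. Below is a minimal,
standalone C sketch of that idea (not PostgreSQL-specific: the varlena header,
alignment and TOAST details are left out, and all names are illustrative).
Child pointers are replaced by byte offsets into the same buffer:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* In-memory, pointer-based tree (as in the question). */
typedef struct Node
{
    char        *data;
    struct Node *left;
    struct Node *right;
} Node;

/* Flat, contiguous record: children are byte offsets into the same buffer.
   0 means "no child", which is safe because offset 0 always holds the root
   and the root is never anyone's child.  Padding/alignment is ignored. */
typedef struct FlatNode
{
    int32_t left_off;
    int32_t right_off;
    int32_t data_len;
    char    data[];     /* data_len bytes follow, not NUL-terminated */
} FlatNode;

/* Bytes needed to flatten the whole subtree rooted at n. */
static size_t
flat_size(const Node *n)
{
    if (n == NULL)
        return 0;
    return sizeof(FlatNode) + strlen(n->data)
           + flat_size(n->left) + flat_size(n->right);
}

/* Serialize the subtree rooted at n into base, advancing *cursor;
   returns the byte offset of the record written for n. */
static int32_t
flatten(const Node *n, char *base, size_t *cursor)
{
    if (n == NULL)
        return 0;

    size_t    off = *cursor;
    size_t    len = strlen(n->data);
    FlatNode *fn = (FlatNode *) (base + off);

    *cursor += sizeof(FlatNode) + len;
    fn->data_len = (int32_t) len;
    memcpy(fn->data, n->data, len);
    fn->left_off = flatten(n->left, base, cursor);
    fn->right_off = flatten(n->right, base, cursor);
    return (int32_t) off;
}

int
main(void)
{
    Node   leaf = {"leaf", NULL, NULL};
    Node   root = {"root", &leaf, NULL};
    size_t size = flat_size(&root);
    char  *buf = malloc(size);
    size_t cursor = 0;

    flatten(&root, buf, &cursor);

    /* Reading the flat form back means following offsets, not pointers. */
    FlatNode *r = (FlatNode *) buf;
    FlatNode *l = (FlatNode *) (buf + r->left_off);

    printf("root: %.*s, left child: %.*s (%zu bytes total)\n",
           (int) r->data_len, r->data, (int) l->data_len, l->data, size);
    free(buf);
    return 0;
}

In an actual extension the same buffer would sit behind the varlena header
(palloc'd and sized with SET_VARSIZE), and the type's input/output or
send/receive functions would perform this transform in both directions.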


Re: strange slow query performance

2019-01-17 Thread Laurenz Albe
Ben Snaidero wrote:
> The following query runs as expected.
> 
> explain analyze SELECT MainTable.gid AS MainTable_gid,MainTable.datetime AS 
> MainTable_datetime,MainTable.objectid AS MainTable_objectid,MainTable.type AS 
> MainTable_type FROM MainTable  
>WHERE objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM ( 
> nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON 
> ((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON NE.inodeid 
> = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814]))) 
>  AND objectid=3161;
>   
>  QUERY PLAN   
>  
> -
>  Index Scan using ix_MainTable_objectid_datetime on MainTable  
> (cost=3254.91..3264.39 rows=1 width=32) (actual time=33.094..33.094 rows=0 
> loops=1)
>Index Cond: ((objectid = ANY ($3)) AND (objectid = 3161))
>InitPlan 1 (returns $3)
>  ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual 
> time=0.403..26.147 rows=19042 loops=1)
> [...]
>  Planning time: 5.693 ms
>  Execution time: 33.383 ms
> (15 rows)
> 
> But when we add a second condition to the where clause it causes the query 
> performance to drop dramatically.  Values in "any(array" do not make a 
> difference.  In this example they are the same, but even for different values 
> performance is still poor.  Any ideas as to why this is happening?
> 
> explain analyze SELECT MainTable.gid AS MainTable_gid,MainTable.datetime AS 
> MainTable_datetime,MainTable.objectid AS MainTable_objectid,MainTable.type AS 
> MainTable_type FROM MainTable  
>WHERE objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM ( 
> nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON 
> ((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON NE.inodeid 
> = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814]))) 
>  AND objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM ( 
> nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON 
> ((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON NE.inodeid 
> = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814]))) AND objectid=3161;
> 
>   
>   QUERY PLAN  
>   
>  
> --
>  Index Scan using ix_MainTable_objectid_datetime on MainTable  
> (cost=6509.66..6534.02 rows=1 width=32) (actual time=16442.004..16442.004 
> rows=0 loops=1)
>Index Cond: ((objectid = ANY ($3)) AND (objectid = ANY ($7)) AND (objectid 
> = 3161))
>InitPlan 1 (returns $3)
>  ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual 
> time=0.438..28.484 rows=19042 loops=1)
> [...]
>InitPlan 2 (returns $7)
>  ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual 
> time=0.056..11.786 rows=19042 loops=1)
> [...]
>  Planning time: 4.860 ms
>  Execution time: 16442.462 ms
> (26 rows)

Not sure what is going on, but why don't you put the condition
"WHERE ne.objectid=3161" into the subselects?  Then there should be at most
one result row, and I can imagine that things will become faster.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com
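
For illustration, a sketch of that rewrite applied to the first query (same
tables and values as above, with the nested join syntax unnested for
readability; untested, so only the shape of the idea):

SELECT MainTable.gid AS MainTable_gid, MainTable.datetime AS MainTable_datetime,
       MainTable.objectid AS MainTable_objectid, MainTable.type AS MainTable_type
FROM MainTable
WHERE objectid = ANY(ARRAY(
          SELECT NE.objectid
          FROM nodeobject NE
               INNER JOIN Node N1NE ON NE.inodeid = N1NE.iId
               INNER JOIN Node N2NE ON N1NE.iLeft >= N2NE.iLeft
                                   AND N1NE.iLeft <= N2NE.iRight
          WHERE N2NE.iId = ANY(ARRAY[14438,10814])
            AND NE.objectid = 3161))      -- condition pushed into the subselect
  AND objectid = 3161;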




Need aws_oracle_ext.systimestamp function definition for postgres

2019-01-17 Thread Ahtesham Karajgi
Hi all,

I am working on a migration from Oracle to PostgreSQL using the AWS SCT.

I have converted 823 tables out of 866 to Postgres. However, for a few
tables I am facing challenges in conversion due to the
"aws_oracle_ext.systimestamp" function.

Below is the error for your reference.

ERROR: function aws_oracle_ext.systimestamp() does not exist
Hint: No function matches the given name and argument types. You might need
to add explicit type casts.

I am searching for the function definition for
aws_oracle_ext.systimestamp, similar to the definition below for
"aws_oracle_ext.sysdate".

CREATE OR REPLACE FUNCTION aws_oracle_ext.sysdate()
RETURNS timestamp without time zone AS
$BODY$
DECLARE
    l_var1 interval;
BEGIN
    l_var1 := '0 hour';  /* Please type your value instead of 0 */
    RETURN (clock_timestamp()::TIMESTAMP(0) WITHOUT TIME ZONE) + l_var1;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;


Thank you in advance.
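
(For reference, the actual definition shipped by the SCT extension pack may
differ, but an emulation following the same pattern as the sysdate() function
above could look roughly like this; Oracle's SYSTIMESTAMP returns a timestamp
with time zone, for which clock_timestamp() is the closest match:)

CREATE OR REPLACE FUNCTION aws_oracle_ext.systimestamp()
RETURNS timestamp with time zone AS
$BODY$
BEGIN
    /* Adjust by an interval offset here if needed, as in sysdate() above. */
    RETURN clock_timestamp();
END;
$BODY$ LANGUAGE plpgsql VOLATILE;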


Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


It would produce results, so yes it would work. The question is: are they the
results you want?


Adrian,

  To which the answer is no as I just discovered.


The above would return anything with a next_contact less than today. That
could extend backwards to some undetermined point in the past, depending
on how diligent you were about updating next_contact. Before you go much
further I would look over the suggestions posted and then sit down with
pen and paper and figure out what it is you want to track and how you are
going to maintain that tracking. Then draw up an outline of how you will
achieve that in code.


  I have that tracking defined (who is active and needs a contact today or
earlier) and am now working on how to code it so only the last contact for a
person is examined to determine if the active column = true. I'll ponder
that.

Thanks,

Rich




Re: Refining query statement

2019-01-17 Thread Adrian Klaver

On 1/17/19 10:01 AM, Rich Shepard wrote:

On Thu, 17 Jan 2019, Adrian Klaver wrote:


Seems to me a boolean field of name active to denote contacts you need to
keep up with is in order. Then make the next_contact field NOT NULL and
replace the current NULL values with 'infinity':



WHERE COALESCE(next_contact, 'infinity') BETWEEN '01/01/2019'::date AND
'today'::date


Adrian,

   Having added a new column, 'active' with boolean values (and a default of
true), would this WHERE clause work:

WHERE active = TRUE and next_contact <= 'today'::date


It would produce results, so yes it would work.

The question is: are they the results you want?

The above would return anything with a next_contact less than today. 
That could extend backwards to some undetermined point in the past, 
depending on how diligent you were about updating next_contact. Before you 
go much further I would look over the suggestions posted and then sit 
down with pen and paper and figure out what it is you want to track and how 
you are going to maintain that tracking. Then draw up an outline of how 
you will achieve that in code.




?

Regards,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, David G. Johnston wrote:


Off the top of my head (and this is a model I am quite familiar with
even if I'm doing this email at speed):

I'd suggest an actual activity table:


David,

  Adrian's suggestion of a Contacts table column called 'active' having a
boolean data type seems to me to be more compact. Testing active = true as a
condition in the WHERE clause will display only those People with active
contact needs.

Thanks very much,

Rich



Re: Postgres Automated Failover

2019-01-17 Thread AI Rumman
Thanks, I'll check it out.

On Thu, Jan 17, 2019 at 9:06 AM Jehan-Guillaume (ioguix) de Rorthais <
iog...@free.fr> wrote:

> On Thu, 17 Jan 2019 08:32:48 -0500
> AI Rumman  wrote:
>
> > Hi,
> >
> > I am planning to use Postgresql with TimescaleDb extension. I have to
> > design a system similar to AWS RDS which supports automated failover,
> > transparent minor version upgrades etc.
> > In early days, I worked with PgPool II to enable heartbeat between the
> > Postgres servers. Is there any new tool to achieve the same by which I
> can
> > design a transparent failover system for Postgres?
>
> There are multiple HA solutions, but none offer "transparent failover".
> They are just automated failover. A rolled-back transaction caused by a
> failover will never be transparent from the application's point of view.
>
> Look at Patroni or PAF, depending on what you want to achieve.
>


Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


Seems to me a boolean field of name active to denote contacts you need to
keep up with is in order. Then make the next_contact field NOT NULL and
replace the current NULL values with 'infinity':



WHERE COALESCE(next_contact, 'infinity') BETWEEN '01/01/2019'::date AND
'today'::date


Adrian,

  Having added a new column, 'active' with boolean values (and a default of
true), would this WHERE clause work:

WHERE active = TRUE and next_contact <= 'today'::date

?

Regards,

Rich



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


To be clear the next-activity date = next_contact in the database, correct?


Adrian,

  Yes. I've renamed the Activities table to Contacts and the Contacts table
to People.


NULL basically means unknown, so having it stand for something is a bit of
a stretch. Seems to me a boolean field of name active to denote contacts
you need to keep up with is in order. Then make the next_contact field NOT
NULL and replace the current NULL values with 'infinity':


  This makes good sense.


As to current situation:

1) Use my previous suggestion.


  I did so, but will modify it as you suggest:
2) WHERE COALESCE(next_contact, 'infinity') BETWEEN '01/01/2019'::date AND 
'today'::date


Thanks again,

Rich



Re: Refining query statement

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 10:07 AM Rich Shepard  wrote:
>The direct answer is that a completed activity has a row with either a
> future next-activity date or a null (which is the case when the status of
> that organization or contact is 'no further contact'.)

Off the top of my head (and this is a model I am quite familiar with
even if I'm doing this email at speed):

I'd suggest an actual activity table:

activity (
created_on date not null,
due_on date not null,
performed_on date null,
contact_id bigint not null references contact (contact_id)
);

contact (
contact_id bigserial primary key,
told_me_to_go_pound_sand boolean default false
);

Now, an exception report can be made for every contact where
"told_me_to_go_pound_sand" is false and there is no activity record
where performed_on is null (and, conversely, where pound sand is true
and an open record with performed_on null still exists).

The report you want is basically everything on activity where
performed_on is null and due_on is today or earlier.

A partial unique index can be made on activity (contact_id) where
performed_on is null, to ensure that at most one open activity
exists for each contact.

David J.
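
As a runnable sketch of that model (the surrogate key on activity, the
defaults, and the index name are assumptions beyond the outline above):

CREATE TABLE contact (
    contact_id               bigserial PRIMARY KEY,
    told_me_to_go_pound_sand boolean NOT NULL DEFAULT false
);

CREATE TABLE activity (
    activity_id  bigserial PRIMARY KEY,
    contact_id   bigint NOT NULL REFERENCES contact (contact_id),
    created_on   date NOT NULL DEFAULT current_date,
    due_on       date NOT NULL,
    performed_on date
);

-- At most one open (not yet performed) activity per contact.
CREATE UNIQUE INDEX activity_one_open_per_contact
    ON activity (contact_id)
    WHERE performed_on IS NULL;

-- "Contacts to make today": open activities due today or earlier.
SELECT a.contact_id, a.due_on
  FROM activity AS a
 WHERE a.performed_on IS NULL
   AND a.due_on <= current_date
 ORDER BY a.due_on;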



Re: Refining query statement

2019-01-17 Thread Adrian Klaver

On 1/17/19 9:07 AM, Rich Shepard wrote:

On Thu, 17 Jan 2019, David G. Johnston wrote:


Yes...though now it just sounds like a flawed data model.


David,

   This is what I thought.

How stuck are you in that regard? Those "future" contacts should have their
own records and not be derived via an optional field on an existing
record.


   My goal is to make a functioning business tracking application for my
consulting services. Almost all my prior postgres databases hold
environmental data for statistical and spatio-temporal analyses so writing a
business application is a new experience for me and I want to get it
correct.


Put differently, how do you know which activities are completed and
which are not?


   The direct answer is that a completed activity has a row with either a
future next-activity date or a null (which is the case when the status of
that organization or contact is 'no further contact'.)


To be clear the next-activity date = next_contact in the database, correct?

NULL basically means unknown, so having it stand for something is a bit 
of a stretch. Seems to me a boolean field of name active to denote 
contacts you need to keep up with is in order. Then make the 
next_contact field NOT NULL and replace the current NULL values with 
'infinity':


https://www.postgresql.org/docs/10/datatype-datetime.html

8.5.1.4. Special Values


As to current situation:

1) Use my previous suggestion.

or

2) WHERE COALESCE(next_contact, 'infinity') BETWEEN '01/01/2019'::date 
AND 'today'::date


https://www.postgresql.org/docs/10/functions-conditional.html#FUNCTIONS-COALESCE-NVL-IFNULL



   I should rename the Contacts table as People and the Activities table as
Contacts. The original names came from a sales management system I used as a
design guide, but they're probably confusing to others as well as to me. :-)


   I can provide my current schema (eight tables) to the list (perhaps as an
attachment), an individual, or put it on a cloud site and pass the URL.

Thanks,

Rich




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, David G. Johnston wrote:


Yes...though now it just sounds like a flawed data model.


David,

  This is what I thought.


How stuck are you in that regard? Those "future" contacts should have their
own records and not be derived via an optional field on an existing
record.


  My goal is to make a functioning business tracking application for my
consulting services. Almost all my prior postgres databases hold
environmental data for statistical and spatio-temporal analyses so writing a
business application is a new experience for me and I want to get it
correct.


Put differently, how do you know which activities are completed and
which are not?


  The direct answer is that a completed activity has a row with either a
future next-activity date or a null (which is the case when the status of
that organization or contact is 'no further contact'.)

  I should rename the Contacts table as People and the Activities table as
Contacts. The original names came from a sales management system I used as a
design guide, but they're probably confusing to others as well as to me. :-)

  I can provide my current schema (eight tables) to the list (perhaps as an
attachment), an individual, or put it on a cloud site and pass the URL.

Thanks,

Rich



Re: Refining query statement

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 9:47 AM Rich Shepard  wrote:
> What I want is a list of contacts to make today. This includes ones that
> should have been made earlier but weren't and excludes earlier contacts that
> have no scheduled next contact (therefore, the nulls.).
>
> Does this clarify what I'm asking of the data?

Yes...though now it just sounds like a flawed data model.  How stuck
are you in that regard?  Those "future" contacts should have their own
records and not be derived via an optional field on an existing
record.

Put differently, how do you know which activities are completed and
which are not?

Dave



Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Adrian Klaver wrote:


Got to thinking more and realized the answer depends on what you want the
query to produce. Can you let us know what it is you are trying to pull out
with the query?


Adrian,

Certainly. Over breakfast I realized the same thing: the existing SELECT
query is not asking for the correct data.

As I replied in response to David's message, what I want is a list of
contacts to make today. This includes ones that should have been made
earlier but weren't and excludes earlier contacts that have no scheduled
next contact (therefore, the nulls.).

Thanks,

Rich




Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, David G. Johnston wrote:


Why is next_contact allowed to be null?


David,

There are a number of reasons. The prospect might have retired, told me to
pound sand, or put off a decision.


Your concept of "most current row" is strictly based upon next_contact so
if next_contact is null it is impossible to know whether it is the most
current row (consider that two activities could be missing the
next_contact date - which of those is the most current?)


That's quite true. I realize I've been asking the question incorrectly.

What I want is a list of contacts to make today. This includes ones that
should have been made earlier but weren't and excludes earlier contacts that
have no scheduled next contact (therefore, the nulls.).

Does this clarify what I'm asking of the data?

Thanks,

Rich




Re: Refining query statement

2019-01-17 Thread Adrian Klaver

On 1/17/19 8:14 AM, Rich Shepard wrote:

On Tue, 15 Jan 2019, Thomas Kellerer wrote:

   select distinct on (C.contact_id) C.contact_id, C.lname, C.fname,
  C.direct_phone, O.org_name, A.next_contact
   from Contacts as C
 join Organizations as O on C.org_id = O.org_id
 join Activities as A on C.contact_id = A.contact_id
   where A.next_contact <= 'today'
 and A.next_contact > '2018-12-31'
   order by c.contact_id, a.next_contact DESC;


Combining this with Adrian's advice to use BETWEEN I have this statement
that almost works:

SELECT DISTINCT ON (c.contact_id) c.contact_id, c.lname, c.fname,
       c.direct_phone, o.org_name, a.next_contact
FROM Contacts AS c
   JOIN Organizations AS o ON c.org_id = o.org_id
   JOIN Activities AS a ON c.contact_id = a.contact_id
WHERE next_contact BETWEEN '01/01/2019'::date AND 'today'::date
ORDER BY c.contact_id, a.next_contact DESC;


It fails when the most recent next_contact column in Activities is NULL and
an earlier row has a non-NULL date in the specified range.

I tried specifying max(a.next_contact) and added GROUP BY, but the result
set then returned the same o.org_name for every row.

The WHERE clause needs to exclude a contact_id where the most current row in
Activities has NULL for the next_contact column. I've tried a few ideas but
none work so I need to learn the proper syntax, and I don't find that in
Rick van der Lans' or Joe Celko's books I have.


Got to thinking more and realized the answer depends on what you want 
the query to produce. Can you let us know what it is you are trying to pull
out with the query?




Looking forward to learning,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Refining query statement

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 9:14 AM Rich Shepard  wrote:
> ORDER BY c.contact_id, a.next_contact DESC;

> The WHERE clause needs to exclude a contact_id where the most current row in
> Activities has NULL for the next_contact column.

Why is next_contact allowed to be null?

Your concept of "most current row" is strictly based upon next_contact
so if next_contact is null it is impossible to know whether it is the
most current row (consider that two activities could be missing the
next_contact date - which of those is the most current?)

David J.



strange slow query performance

2019-01-17 Thread Ben Snaidero
The following query runs as expected.

explain analyze SELECT MainTable.gid AS MainTable_gid,MainTable.datetime AS
MainTable_datetime,MainTable.objectid AS MainTable_objectid,MainTable.type
AS MainTable_type FROM MainTable
   WHERE objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM (
nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON
((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON
NE.inodeid = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814])))
 AND objectid=3161;

   QUERY PLAN

-
 Index Scan using ix_MainTable_objectid_datetime on MainTable
(cost=3254.91..3264.39 rows=1 width=32) (actual time=33.094..33.094 rows=0
loops=1)
   Index Cond: ((objectid = ANY ($3)) AND (objectid = 3161))
   InitPlan 1 (returns $3)
 ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual
time=0.403..26.147 rows=19042 loops=1)
   ->  Nested Loop  (cost=0.81..1784.15 rows=618 width=4) (actual
time=0.249..10.485 rows=2082 loops=1)
 ->  Index Scan using pk_node on node n2ne
(cost=0.41..16.80 rows=2 width=8) (actual time=0.103..0.233 rows=2 loops=1)
   Index Cond: (iid = ANY ('{14438,10814}'::integer[]))
 ->  Index Only Scan using ix_node on node n1ne
(cost=0.41..880.59 rows=309 width=8) (actual time=0.074..5.022 rows=1041
loops=2)
   Index Cond: ((ileft >= n2ne.ileft) AND (ileft <=
n2ne.iright))
   Heap Fetches: 30
   ->  Index Only Scan using ix_nodeobject_inodeid_objectid on
nodeobject ne  (cost=0.29..1.81 rows=57 width=8) (actual time=0.002..0.007
rows=9 loops=2082)
 Index Cond: (inodeid = n1ne.iid)
 Heap Fetches: 13973
 Planning time: 5.693 ms
 Execution time: 33.383 ms
(15 rows)

But when we add a second condition to the where clause it causes the query
performance to drop dramatically.  Values in "any(array" do not make a
difference.  In this example they are the same, but even for different
values performance is still poor.  Any ideas as to why this is
happening?

explain analyze SELECT MainTable.gid AS MainTable_gid,MainTable.datetime AS
MainTable_datetime,MainTable.objectid AS MainTable_objectid,MainTable.type
AS MainTable_type FROM MainTable
   WHERE objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM (
nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON
((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON
NE.inodeid = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814])))
 AND objectid = ANY(ARRAY(SELECT NE.objectid AS entityId FROM (
nodeobject NE INNER JOIN (   Node N1NE INNER JOIN Node N2NE   ON
((N1NE.iLeft >= N2NE.iLeft) AND (N1NE.iLeft <= N2NE.iRight)))   ON
NE.inodeid = N1NE.iId ) WHERE N2NE.iId = ANY(ARRAY[14438,10814]))) AND
objectid=3161;


QUERY PLAN


--
-
 Index Scan using ix_MainTable_objectid_datetime on MainTable
(cost=6509.66..6534.02 rows=1 width=32) (actual time=16442.004..16442.004
rows=0 loops=1)
   Index Cond: ((objectid = ANY ($3)) AND (objectid = ANY ($7)) AND
(objectid = 3161))
   InitPlan 1 (returns $3)
 ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual
time=0.438..28.484 rows=19042 loops=1)
   ->  Nested Loop  (cost=0.81..1784.15 rows=618 width=4) (actual
time=0.172..7.141 rows=2082 loops=1)
 ->  Index Scan using pk_node on node n2ne
(cost=0.41..16.80 rows=2 width=8) (actual time=0.046..0.111 rows=2 loops=1)
   Index Cond: (iid = ANY ('{14438,10814}'::integer[]))
 ->  Index Only Scan using ix_node on node n1ne
(cost=0.41..880.59 rows=309 width=8) (actual time=0.073..3.438 rows=1041
loops=2)
   Index Cond: ((ileft >= n2ne.ileft) AND (ileft <=
n2ne.iright))
   Heap Fetches: 30
   ->  Index Only Scan using ix_nodeobject_inodeid_objectid on
nodeobject ne  (cost=0.29..1.81 rows=57 width=8) (actual time=0.003..0.010
rows=9 loops=2082)
 Index Cond: (inodeid = n1ne.iid)
 Heap Fetches: 13973
   InitPlan 2 (returns $7)
 ->  Nested Loop  (cost=1.10..3254.75 rows=16296 width=4) (actual
time=0.056..11.786 rows=19042 loops=1)
   ->  Nested Loop  (cost=0.81..1784.15 rows=618 width=4) (actual
time=0.034..1.343 rows=2082 loops=1)
 ->  Index Scan using pk_node on node n2ne_1
(cost=0.41..16.80 rows=2 width=8) (actual time=0.007..0.016 rows=2 loops=1)
   Index Cond: (iid = ANY ('{14438,10814}'::integer[]))
 ->  Index Only Scan using ix_node on node n1ne_1
(cost=0.41..880.59 rows=309 width=8) (actual time=0.012..0.581 

Re: pgbouncer

2019-01-17 Thread Nicola Contu
> If I set 3, and I tried to connect from 4 shells, I am still able to
connect.

That would be the point - pgbouncer is sharing the 3 connections it
keeps with PostgreSQL between the 4 client connections made to it.

Mmh, my pool_mode is per session. The 4 sessions were active, not running any
query, but connected to the shell.
So that's the source of my doubt.

On Thu, Jan 17, 2019 at 4:32 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, Jan 17, 2019 at 8:18 AM Nicola Contu 
> wrote:
> >
> > Hello,
> > I am a bit confused about the settings in pgbouncer
> >
> > What's exactly the pool_size?
>
> Roughly, the number of open connections pgbouncer will keep to PostgreSQL.
>
> > If I set 3, and I tried to connect from 4 shells, I am still able to
> connect.
>
> That would be the point - pgbouncer is sharing the 3 connections it
> keeps with PostgreSQL between the 4 client connections made to it.
>
> > Same thing for max_db_connections. I set this to 1 and I am able to
> connect from 2 shells.
>
> Same as pool_size but basically a fail-safe since pools are
> per-user/per-database while this limit is per-database only.
>
> > This is kind of confusing and I'm not really clear after reading the
> > documentation.
>
> For each setting you need to understand whether it limits
> pgbouncer->PostgreSQL or client->pgbouncer
>
> Configurations in [databases] limit the former; [pgbouncer] options
> either provide defaults for the [databases] or limit clients.
>
> > [databases]
> > cmdwela = host=127.0.0.1 port=5432 dbname=cmdwela auth_user=pgbouncer
> pool_size=120
> > cicciopasticcio = host=127.0.0.1 port=5432 dbname=cicciopasticcio
> auth_user=pgbouncer pool_size=2
> >
> > [pgbouncer]
> > listen_port = 6543
> > listen_addr = *
> > auth_type = md5
> > auth_file = /etc/pgbouncer/users.txt
> > auth_query = select uname,phash from user_lookup($1)
> > logfile = /var/log/pgbouncer.log
> > pidfile = /home/postgres/pgbouncer.pid
> > admin_users = admin
> > user=postgres
>
> > max_db_connections=1
> So one open connection max per database/user pool but it is shared -
> i.e., actively executing queries running in parallel are limited to
> this number.
>
> > max_client_conn=5
> > I just want to limit connections from the app etc.
>
> That limit is 5
>
> David J.
>


Re: Refining query statement

2019-01-17 Thread Rich Shepard

On Tue, 15 Jan 2019, Thomas Kellerer wrote:


   select distinct on (C.contact_id) C.contact_id, C.lname, C.fname, 
C.direct_phone, O.org_name, A.next_contact
   from Contacts as C
 join Organizations as O on C.org_id = O.org_id
 join Activities as A on C.contact_id = A.contact_id
   where A.next_contact <= 'today'
 and A.next_contact > '2018-12-31'
   order by c.contact_id, a.next_contact DESC;


Combining this with Adrian's advice to use BETWEEN I have this statement
that almost works:

SELECT DISTINCT ON (c.contact_id) c.contact_id, c.lname, c.fname, 
c.direct_phone, o.org_name, a.next_contact
FROM Contacts AS c
  JOIN Organizations AS o ON c.org_id = o.org_id
  JOIN Activities AS a ON c.contact_id = a.contact_id 
WHERE next_contact BETWEEN '01/01/2019'::date AND 'today'::date 
ORDER BY c.contact_id, a.next_contact DESC;


It fails when the most recent next_contact column in Activities is NULL and
an earlier row has a non-NULL date in the specified range.

I tried specifying max(a.next_contact) and added GROUP BY, but the result
set then returned the same o.org_name for every row.

The WHERE clause needs to exclude a contact_id where the most current row in
Activities has NULL for the next_contact column. I've tried a few ideas but
none work so I need to learn the proper syntax, and I don't find that in
Rick van der Lans' or Joe Celko's books I have.

Looking forward to learning,

Rich



Re: pgbouncer

2019-01-17 Thread Fabio Pardi
David,

Are you sure? The behaviour I experienced is different from what you described.

On 17/01/2019 16:32, David G. Johnston wrote:
> On Thu, Jan 17, 2019 at 8:18 AM Nicola Contu  wrote:
>>
>> Hello,
>> I am a bit confused about the settings in pgbouncer
>>
>> What's exactly the pool_size?
> 
> Roughly, the number of open connections pgbouncer will keep to PostgreSQL.
> 
>> If I set 3, and I tried to connect from 4 shells, I am still able to connect.
> 
> That would be the point - pgbouncer is sharing the 3 connections it
> keeps with PostgreSQL between the 4 client connections made to it.


pgbouncer will spawn connections until it reaches the maximum allowed.
Connections go 1:1 between pgbouncer and the client, and are not 'shared'.

pool_size is the size of the pool that pgbouncer keeps open for you. If not 
set, the default_pool_size is used.


If you check the open connections to your host, you can verify it 

netstat --tcp -n | grep ... 

will show ESTABLISHED connections = pool_size


Client connections will be accepted by pgbouncer until pool_size is reached.

After that, pgbouncer can spawn more connections only if reserve_pool_size
allows it, and only after a wait of reserve_pool_timeout.

When the ceiling is reached (meaning you have opened pool_size +
reserve_pool_size connections), client connections will enter a queue.

Entering the queue does not mean being serviced by the database; it means
waiting your turn. If you connect manually, you will notice an open
connection (the socket is open) but no database shell.
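
For illustration, a minimal pgbouncer.ini fragment showing those knobs
together: a base pool of 3 server connections per database/user pair, 2 extra
reserve connections, and a 5 second wait before the reserve kicks in (the
values are only examples):

[pgbouncer]
default_pool_size = 3
reserve_pool_size = 2
reserve_pool_timeout = 5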


regards,

fabio pardi

> 
>> Same thing for max_db_connections. I set this to 1 and I am able to connect 
>> from 2 shells.
> 
> Same as pool_size but basically a fail-safe since pools are
> per-user/per-database while this limit is per-database only.
> 
>> This is kind of confusing and I'm not really clear after reading the 
>> documentation.
> 
> For each setting you need to understand whether it limits
> pgbouncer->PostgreSQL or client->pgbouncer
> 
> Configurations in [databases] limit the former; [pgbouncer] options
> either provide defaults for the [databases] or limit clients.
> 
>> [databases]
>> cmdwela = host=127.0.0.1 port=5432 dbname=cmdwela auth_user=pgbouncer 
>> pool_size=120
>> cicciopasticcio = host=127.0.0.1 port=5432 dbname=cicciopasticcio 
>> auth_user=pgbouncer pool_size=2
>>
>> [pgbouncer]
>> listen_port = 6543
>> listen_addr = *
>> auth_type = md5
>> auth_file = /etc/pgbouncer/users.txt
>> auth_query = select uname,phash from user_lookup($1)
>> logfile = /var/log/pgbouncer.log
>> pidfile = /home/postgres/pgbouncer.pid
>> admin_users = admin
>> user=postgres
> 
>> max_db_connections=1
> So one open connection max per database/user pool but it is shared -
> i.e., actively executing queries running in parallel are limited to
> this number.
> 
>> max_client_conn=5
>> I just want to limit connections from the app etc.
> 
> That limit is 5
> 
> David J.
> 



Re: Identifying comments [ANSWERED]

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 8:39 AM Rich Shepard  wrote:
>
> On Thu, 17 Jan 2019, Rich Shepard wrote:
>
> > I want only to confirm that I can use /* ... */ for multiline comments in my
> > DDL and DML scripts.
>
> Oops! Section 4.1.5 tells me that I can.
>

You could also just try it...

select 1 /* comment */ AS col_one

executes just fine...
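
A multi-line version works the same way, and per section 4.1.5 these
bracketed comments may even nest:

select 1 /* a comment
            spanning several lines,
            /* with a nested comment inside */
            still inside the outer comment */ AS col_one;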

David J.



Re: Identifying comments [ANSWERED]

2019-01-17 Thread Rich Shepard

On Thu, 17 Jan 2019, Rich Shepard wrote:


I want only to confirm that I can use /* ... */ for multiline comments in my
DDL and DML scripts.


Oops! Section 4.1.5 tells me that I can.

Apologies to all,

Rich



Identifying comments

2019-01-17 Thread Rich Shepard

I know that "--" has been the single-line comment identifier since SQL was
implemented by IBM on mainframes using Hollerith cards. Today, some RDBMSs
also allow the C comment indentifier pair, /* ... */.

The postgres10 manual when searched for 'comments' presents the double
hyphens and, in Section 4.1.3, on the top line of page 39, I read that "--
and /* cannot appear anywhere in an operator name, since they will be taken
as the start of a comment."

I want only to confirm that I can use /* ... */ for multiline comments in my
DDL and DML scripts.

TIA,

Rich




Re: pgbouncer

2019-01-17 Thread David G. Johnston
On Thu, Jan 17, 2019 at 8:18 AM Nicola Contu  wrote:
>
> Hello,
> I am a bit confused about the settings in pgbouncer
>
> What's exactly the pool_size?

Roughly, the number of open connections pgbouncer will keep to PostgreSQL.

> If I set 3, and I tried to connect from 4 shells, I am still able to connect.

That would be the point - pgbouncer is sharing the 3 connections it
keeps with PostgreSQL between the 4 client connections made to it.

> Same thing for max_db_connections. I set this to 1 and I am able to connect 
> from 2 shells.

Same as pool_size but basically a fail-safe since pools are
per-user/per-database while this limit is per-database only.

> This is kind of confusing and I'm not really clear after reading the 
> documentation.

For each setting you need to understand whether it limits
pgbouncer->PostgreSQL or client->pgbouncer

Configurations in [databases] limit the former; [pgbouncer] options
either provide defaults for the [databases] or limit clients.

> [databases]
> cmdwela = host=127.0.0.1 port=5432 dbname=cmdwela auth_user=pgbouncer 
> pool_size=120
> cicciopasticcio = host=127.0.0.1 port=5432 dbname=cicciopasticcio 
> auth_user=pgbouncer pool_size=2
>
> [pgbouncer]
> listen_port = 6543
> listen_addr = *
> auth_type = md5
> auth_file = /etc/pgbouncer/users.txt
> auth_query = select uname,phash from user_lookup($1)
> logfile = /var/log/pgbouncer.log
> pidfile = /home/postgres/pgbouncer.pid
> admin_users = admin
> user=postgres

> max_db_connections=1
So one open connection max per database/user pool but it is shared -
i.e., actively executing queries running in parallel are limited to
this number.

> max_client_conn=5
> I just want to limit connections from the app etc.

That limit is 5

David J.



pgbouncer

2019-01-17 Thread Nicola Contu
Hello,
I am a bit confused about the settings in pgbouncer

What exactly is the pool_size?
If I set it to 3 and try to connect from 4 shells, I am still able to
connect.
Same thing for max_db_connections. I set this to 1 and I am able to connect
from 2 shells.

This is kind of confusing and I'm not really clear after reading the
documentation.

This is my config.

[databases]
cmdwela = host=127.0.0.1 port=5432 dbname=cmdwela auth_user=pgbouncer
pool_size=120
cicciopasticcio = host=127.0.0.1 port=5432
dbname=cicciopasticcio auth_user=pgbouncer pool_size=2

[pgbouncer]
listen_port = 6543
listen_addr = *
auth_type = md5
auth_file = /etc/pgbouncer/users.txt
auth_query = select uname,phash from user_lookup($1)
logfile = /var/log/pgbouncer.log
pidfile = /home/postgres/pgbouncer.pid
admin_users = admin
user=postgres
max_db_connections=1
log_connections=0
log_disconnections=0
max_client_conn=5


I just want to limit connections from the app etc.

Thanks


Re: Barman versus pgBackRest

2019-01-17 Thread Achilleas Mantzios

On 17/1/19 4:12 μ.μ., Jehan-Guillaume (ioguix) de Rorthais wrote:

On Thu, 17 Jan 2019 13:09:18 +0200
Achilleas Mantzios  wrote:


Hello,

One strong point of barman IMHO is transparently converting an incremental
backup to a full backup for retention purposes, so retention specification is
far more liberal than with pgbackrest, and configuring for incremental backup
does not pose any limitations to the schedule of backups. In our environment
our net connection to the remote site (repo) is extremely lousy, (although
within Switzerland if that makes any difference), so with pgbackrest a full
backup of our 1.3TB db, would take about 2 days ,lets say set in cron weekly
on sunday, (Sunday 1:00->Tuesday),  then I would have to take incr backups
from Wednesday->Saturday. And we would have to also take a full backup next
Sunday. With pgbarman we had to set : reuse_backup = link retention_policy =
RECOVERY WINDOW OF 14 DAYS and just perform regular backups. So for barman
every backup can be used as a base for the next backup, which achieves fast
backups and reduced disk space. In pgbackrest one has to be explicit about
the retention of both full backups and diff backups. it would be nice in
pgbackrest to only have incremental backups and let the system do the
necessary management (converting incremental to full) transparently and
asynchronously, e.g.via cron. I have read about the --repo-hardlink option. "
This gives the appearance that each backup is a full backup at the
file-system level " So could we just take a first full backup and then switch
permanently to incr backups?

Funnily enough, I opened an issue a month ago about this feature. If you want
to join the brainstorming, discuss, add some opinions and thoughts, see
https://github.com/pgbackrest/pgbackrest/issues/644

Surely I enjoyed your convo there!





--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt




RE: Postgres Automated Failover

2019-01-17 Thread ROS Didier
Hi
For PostgreSQL Automatic Failover, we are using repmgr too.

Best Regards

Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD

-Message d'origine-
De : f.pa...@portavita.eu [mailto:f.pa...@portavita.eu] 
Envoyé : jeudi 17 janvier 2019 15:29
À : pgsql-general@lists.postgresql.org
Objet : Re: Postgres Automated Failover

Hi,

In my opinion repmgr is worth a look.

https://repmgr.org/

regards,

fabio pardi


On 17/01/2019 14:32, AI Rumman wrote:
> Hi,
> 
> I am planning to use Postgresql with TimescaleDb extension. I have to design 
> a system similar to AWS RDS which supports automated failover, transparent 
> minor version upgrades etc.
> In early days, I worked with PgPool II to enable heartbeat between the 
> Postgres servers. Is there any new tool to achieve the same by which I can 
> design a transparent failover system for Postgres?
> 
> Also is there any multi-master Postgresql solution? Is Postgres-XC production 
> ready?
> 
> Any suggestions is appreciated. 
> 
> Thanks.






Re: Postgres Automated Failover

2019-01-17 Thread Fabio Pardi
Hi,

In my opinion repmgr is worth a look.

https://repmgr.org/

regards,

fabio pardi


On 17/01/2019 14:32, AI Rumman wrote:
> Hi,
> 
> I am planning to use Postgresql with TimescaleDb extension. I have to design 
> a system similar to AWS RDS which supports automated failover, transparent 
> minor version upgrades etc.
> In early days, I worked with PgPool II to enable heartbeat between the 
> Postgres servers. Is there any new tool to achieve the same by which I can 
> design a transparent failover system for Postgres?
> 
> Also is there any multi-master Postgresql solution? Is Postgres-XC production 
> ready?
> 
> Any suggestions is appreciated. 
> 
> Thanks.



Re: Barman versus pgBackRest

2019-01-17 Thread Jehan-Guillaume (ioguix) de Rorthais
On Thu, 17 Jan 2019 13:09:18 +0200
Achilleas Mantzios  wrote:

> Hello,
> 
> One strong point of barman IMHO is transparently converting an incremental
> backup to a full backup for retention purposes, so retention specification is
> far more liberal than with pgbackrest, and configuring for incremental backup
> does not pose any limitations to the schedule of backups. In our environment
> our net connection to the remote site (repo) is extremely lousy, (although
> within Switzerland if that makes any difference), so with pgbackrest a full
> backup of our 1.3TB db, would take about 2 days ,lets say set in cron weekly
> on sunday, (Sunday 1:00->Tuesday),  then I would have to take incr backups
> from Wednesday->Saturday. And we would have to also take a full backup next
> Sunday. With pgbarman we had to set : reuse_backup = link retention_policy =
> RECOVERY WINDOW OF 14 DAYS and just perform regular backups. So for barman
> every backup can be used as a base for the next backup, which achieves fast
> backups and reduced disk space. In pgbackrest one has to be explicit about
> the retention of both full backups and diff backups. it would be nice in
> pgbackrest to only have incremental backups and let the system do the
> necessary management (converting incremental to full) transparently and
> asynchronously, e.g.via cron. I have read about the --repo-hardlink option. "
> This gives the appearance that each backup is a full backup at the
> file-system level " So could we just take a first full backup and then switch
> permanently to incr backups?

Funnily enough, I opened an issue a month ago about this feature. If you want
to join the brainstorming, discuss, add some opinions and thoughts, see
https://github.com/pgbackrest/pgbackrest/issues/644



Re: Postgres Automated Failover

2019-01-17 Thread Jehan-Guillaume (ioguix) de Rorthais
On Thu, 17 Jan 2019 08:32:48 -0500
AI Rumman  wrote:

> Hi,
> 
> I am planning to use Postgresql with TimescaleDb extension. I have to
> design a system similar to AWS RDS which supports automated failover,
> transparent minor version upgrades etc.
> In early days, I worked with PgPool II to enable heartbeat between the
> Postgres servers. Is there any new tool to achieve the same by which I can
> design a transparent failover system for Postgres?

There are multiple HA solutions, but none offer "transparent failover". They
are just automated failover. A rolled-back transaction caused by a failover
will never be transparent from the application's point of view.

Look at Patroni or PAF, depending on what you want to achieve.



Re: Postgres Automated Failover

2019-01-17 Thread Deepika S Gowda
There is Postgres multi-master replication. Please explore BDR
multi-master.

On Thu, Jan 17, 2019 at 7:03 PM AI Rumman  wrote:

> Hi,
>
> I am planning to use Postgresql with TimescaleDb extension. I have to
> design a system similar to AWS RDS which supports automated failover,
> transparent minor version upgrades etc.
> In early days, I worked with PgPool II to enable heartbeat between the
> Postgres servers. Is there any new tool to achieve the same by which I can
> design a transparent failover system for Postgres?
>
> Also is there any multi-master Postgresql solution? Is Postgres-XC
> production ready?
>
> Any suggestions is appreciated.
>
> Thanks.
>


Postgres Automated Failover

2019-01-17 Thread AI Rumman
Hi,

I am planning to use Postgresql with TimescaleDb extension. I have to
design a system similar to AWS RDS which supports automated failover,
transparent minor version upgrades etc.
In early days, I worked with PgPool II to enable heartbeat between the
Postgres servers. Is there any new tool to achieve the same by which I can
design a transparent failover system for Postgres?

Also is there any multi-master Postgresql solution? Is Postgres-XC
production ready?

Any suggestions is appreciated.

Thanks.


Re: Barman versus pgBackRest

2019-01-17 Thread Achilleas Mantzios

Hello,

One strong point of barman IMHO is transparently converting an incremental backup to a full backup for retention purposes, so retention specification is far more liberal than with pgbackrest, and 
configuring for incremental backup does not pose any limitations to the schedule of backups.
In our environment our net connection to the remote site (repo) is extremely lousy (although within Switzerland, if that makes any difference), so with pgbackrest a full backup of our 1.3TB db would 
take about 2 days. Let's say it is set in cron weekly on Sunday (Sunday 1:00 -> Tuesday); then I would have to take incr backups from Wednesday -> Saturday, and we would also have to take a full backup the next 
Sunday.

With pgbarman we had to set :
reuse_backup = link
retention_policy = RECOVERY WINDOW OF 14 DAYS
and just perform regular backups. So for barman every backup can be used as a 
base for the next backup, which achieves fast backups and reduced disk space.
In pgbackrest one has to be explicit about the retention of both full backups 
and diff backups.
It would be nice in pgbackrest to only have incremental backups and let the 
system do the necessary management (converting incremental to full) 
transparently and asynchronously, e.g. via cron.
I have read about the --repo-hardlink option.
"
This gives the appearance that each backup is a full backup at the file-system 
level
"
So could we just take a first full backup and then switch permanently to incr 
backups?
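
For comparison, the explicit pgBackRest retention settings referred to above
look roughly like this in pgbackrest.conf, keeping the two most recent full
backups and up to six differentials (the numbers are only examples):

[global]
repo1-retention-full=2
repo1-retention-diff=6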

--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: Weird behaviour of ROLLUP/GROUPING

2019-01-17 Thread Guillaume Lelarge
On Thu, Jan 17, 2019 at 8:27 AM Andrew Gierth
wrote:

> > "Guillaume" == Guillaume Lelarge  writes:
>
>  >> I will see about fixing this, somehow.
>
>  Guillaume> Thanks a lot.
>
> I've committed a fix (to all supported branches, since this bug actually
> precedes the addition of GROUPING SETS and can be triggered with a
> simple GROUP BY if you try hard enough). The regression test says it
> works now, but it'd be good if you could try it again on REL_11_STABLE
> (at commit e74d8c5085 or later) to check that it fixes your case.
>
>
I checked on REL9_6_STABLE and REL_11_STABLE, and it works great. Thanks a
lot for the quick fix!


-- 
Guillaume.