Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Tom Lane
Nicklas Avén  writes:
> I also, in the first query, changed the where clause to filter on 
> machine_key in the table contractor_access. Just to illustrate the problem 
> better.

> Both queries filter on the same table which is joined the same way. But 
> in the second example the where clause is not pushed to the subquery

The filters are totally different though.  In one case you provide

where ci.machine_key = '887655635442600'

and there is also a join condition

l.machine_key=ci.machine_key

From these two things the planner can deduce

l.machine_key='887655635442600'

which is a restriction condition that it knows how to push down into the
"l" subquery.  Furthermore, it can also deduce that it can restrict
all of the left-joined tables to consider only that value of their
join keys.

In query #2 you have no constant value for machine_key so none of that
happens.

IIRC, the propagated value doesn't have to be a constant, exactly,
just a fixed expression.  So you might consider something like


where ci.machine_key = (select machine_key from contractor_access
where t4e_contractor_id = 'nicklas.a...@jordogskog.no');

when you need to drive the lookup from something other than raw
machine_key.  This'll fail, as-is, if there's more than one
contractor_access row with t4e_contractor_id =
'nicklas.a...@jordogskog.no', but you can probably adapt the idea
to make it work.
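
For illustration only, one sketch of such an adaptation is to force the
sub-select down to a single row with LIMIT 1.  This merely avoids the
multiple-rows error; if a contractor can legitimately own several machines
you'd need a different rewrite, since only one machine_key would be used:

where ci.machine_key = (select machine_key
                          from contractor_access
                         where t4e_contractor_id = 'nicklas.a...@jordogskog.no'
                         order by machine_key   -- make the pick deterministic
                         limit 1);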

regards, tom lane



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Ron

On 2/15/19 4:04 PM, Bruce Klein wrote:
[snip]

I'm glad Microsoft is trying though


If Steve "Linux is a cancer" Ballmer were dead, he's be spinning in his grave...

--
Angular momentum makes the world go 'round.



Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Adrian Klaver

On 2/15/19 12:43 PM, Nicklas Avén wrote:


 > I have not had a chance to fully go through all of the below. Some 
questions/suggestions:

 >
 > 1) Thanks for the formatted queries. If I could make a suggestion, 
when aliasing could you include AS. It would make finding what l.* 
refers to easier for those of us with old eyes:)

 >
Yes, of course, sorry :-)


 > 2) t4e_contractor_id is in the shiny_adm.contractor_access table?
 > If not where?

Yes, sorry again, it is there


 >
 > 3) What is the schema for shiny_adm.contractor_access?
 > In particular what indexes are on it?
 >
shiny_adm.contractor_access looks like this:


CREATE TABLE shiny_adm.contractor_access
(
   machine_key text,
   t4e_contractor_id text,
   active integer DEFAULT 1,
   id serial NOT NULL,
   CONSTRAINT contractor_access_pkey PRIMARY KEY (id),
   CONSTRAINT contractor_unique UNIQUE (machine_key, t4e_contractor_id),
   CONSTRAINT co_check_t4e_co_email CHECK 
(utils.verify_email(t4e_contractor_id))

)


CREATE INDEX idx_contractor
   ON shiny_adm.contractor_access
   USING btree
   (t4e_contractor_id COLLATE pg_catalog."default");

CREATE INDEX idx_contractor_mk
   ON shiny_adm.contractor_access
   USING btree
   (machine_key COLLATE pg_catalog."default");


I tried to format the below a little better with AS and some more 
consistent indents.



I also, in the first query, changed the where clause to filter on 
machine_key in the table contractor_access. Just to illustrate the problem 
better.


Both queries filter on the same table which is joined the same way. But 
in the second example the where clause is not pushed to the subquery l



Thanks a lot for looking into it


Nicklas





Next query, the slow one that calculates the whole dataset:



EXPLAIN ANALYZE
SELECT
     l.machine_key,
     o.object_name,
     o.sub_object_name,
     o.object_user_id,
     o.sub_object_user_id,
     o.start_date AS object_start_date,
     s.species_group_name,
     p.product_group_name,
     l.m3_sub AS volume_m3sub,
     l.number_of_logs,
     mi.basemachine_manufacturer,
     mi.basemachine_model
FROM  shiny_adm.contractor_access ci join
     (
     SELECT
     hl.contractor_id,
     hl.machine_key,
     hl.operator_key,
     hl.object_key,
     hl.sub_object_key,
     date(hl.harvest_date) AS harvest_date,
     hl.species_group_key,
     hl.product_key,
     sum(hl.m3_sub) AS m3_sub,
     count(*) AS number_of_logs
     FROM
     version_union_tables_r02.harvester_logs AS hl
     GROUP BY
     hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key

     )  AS l on l.machine_key=ci.machine_key
     LEFT JOIN version_union_tables_r02.machine_info  AS mi ON 
l.machine_key::text = mi.machine_key::text
     LEFT JOIN version_union_tables_r02.objects AS o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
     LEFT JOIN version_union_tables_r02.products  AS p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
     LEFT JOIN version_union_tables_r02.species  AS s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key

WHERE t4e_contractor_id = 'nicklas.a...@jordogskog.no';


To make it apples to apples, try changing the above to be more like the first query:

...

 AS l
LEFT JOIN version_union_tables_r02.machine_info  AS mi ON 
l.machine_key::text = mi.machine_key::text
LEFT JOIN version_union_tables_r02.objects AS  o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
LEFT JOIN version_union_tables_r02.products AS  p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
LEFT JOIN version_union_tables_r02.species  AS s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key

JOIN shiny_adm.contractor_access AS ci ON l.machine_key=ci.machine_key
 WHERE t4e_contractor_id = 'nicklas.a...@jordogskog.no'

;


results in this query plan:




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Bruce Klein
>  I guess the OP is reporting about a .deb that was built on a real Linux
system

Yes, I (OP) installed via:
  % wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc |
sudo apt-key add -
  % sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/
$(lsb_release -sc)-pgdg main" > /etc/apt/sources.list.d/PostgreSQL.list'
  % sudo apt update
  % sudo apt-get install postgresql-11

> no one bothered to complain about PostgreSQL spewing scary looking
warnings on WSL for years

At least you weren't spamming a once-per-second(!) log entry about a
missing function call like one of my other packages did (can't remember,
maybe it was nginx?)

WSL still feels early and if you're going to try it, you get used to
annoyances like that. I'm glad Microsoft is trying though and I hope with
time and support they get all the way there because developers who have
enterprise or other reasons to be on Windows instead of Mac desktops
deserve to have decent unix tools too. Warts and all I still find it
overall more convenient and fluid than my previous VirtualBox / vagrant
solution.

On Fri, Feb 15, 2019 at 11:20 AM Thomas Munro 
wrote:

> On Sat, Feb 16, 2019 at 6:50 AM Andres Freund  wrote:
> > On February 15, 2019 9:44:50 AM PST, Tom Lane  wrote:
> > >Andres Freund  writes:
> > >> On February 15, 2019 9:13:10 AM PST, Tom Lane 
> > >wrote:
> > >>> I'm of the opinion that we shouldn't be panicking for
> > >sync_file_range
> > >>> failure, period.
> > >
> > >> With some flags it's strictly required, it does "eat" errors depending
> > >on the flags. So I'm not sure I understand?
> > >
> > >Really?  The specification says that it starts I/O, not that it waits
> > >around for any to finish.
> >
> > That depends on the flags you pass in. By memory I don't think it eats
> an error with our flags in recent kernels, but I'm not sure.
>
> Right, there was some discussion of that, and I didn't (and still
> don't) think it'd be wise to rely on undocumented knowledge about
> which flags can eat errors based on a drive-by reading of a particular
> snapshot of the Linux tree.  The man page says it can return EIO; I
> think we should assume that it might actually do that.
>
> BTW I had a report from someone on IRC that PostgreSQL breaks in other
> ways (not yet understood) if you build it directly on WSL/Ubuntu.  I
> guess the OP is reporting about a .deb that was built on a real Linux
> system.  I'm vaguely familiar with these types of problems from other
> platforms (Linux syscall emulation on FreeBSD and Sun-ish systems, and
> also I'm old enough to remember people doing SCO SysV syscall
> emulation on Linux systems back before certain valuable software was
> available natively); it's possible that you get ENOSYS on other
> emulators too, considering that other kernels don't seem to have a
> sync_file_range()-like facility, but probably no one cares, since
> there is no reason to run PostgreSQL on a syscall emulator when you
> can run it natively.  This is a bit different though: I guess people
> want to be able to develop Linux-stack stuff on company-issued Windows
> computers for later deployment on Linux servers; someone interested in
> this would ideally make it work and set up a build farm animal to tell
> us when we break it.  It would probably require only minimal changes,
> but considering that no one bothered to complain about PostgreSQL
> spewing scary looking warnings on WSL for years, it's not too
> surprising that we didn't consider this case before.  A bit like the
> nightjar case, the PANIC patch revealed a pre-existing problem that
> had gone unreported and needs some work, but it doesn't seem like a
> very good reason to roll back that part of the change completely IMHO.
>
> --
> Thomas Munro
> http://www.enterprisedb.com
>


Re: Subquery to select max(date) value

2019-02-15 Thread Rich Shepard

On Fri, 15 Feb 2019, Andrew Gierth wrote:


Rich> I've not before run 'explain' on a query. Would that be
Rich> appropriate here?

Yes.


Andrew,

I'll learn how to use it.


The problem here is that you have no join conditions at all, so the
result set of this query is massive. And you've duplicated many tables
inside the subquery which is not necessary or appropriate.


Got it now.


select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, sq.*
 from people as p
  join organizations as o on p.organization_id=o.id   -- OR WHATEVER
  cross join
lateral (select a.next_contact
   from activities as a
  where a.person_id=p.person_id --VERY IMPORTANT
and a.next_contact > '2018-12-31'
and a.next_contact <= 'today'
and a.next_contact is not null
  order by a.next_contact DESC
  limit 1) sq;

Ordering by DESC with a limit 1 is used to get the max next_contact
value rather than the smallest; this is similar to max(), but makes it
trivial to also access the other columns of the _same_ activities row
which is being selected.


This puts everything in perspective and is a very valuable lesson for me as
this application has many queries of this type. You've put together all that
I've read in the manual, on this mail list thread, and on web pages. I
really appreciate your patient guidance.

Best regards,

Rich



Re: Subquery to select max(date) value

2019-02-15 Thread Andrew Gierth
> "Rich" == Rich Shepard  writes:

 Rich> Using LIMIT 1 produces only the first returned row. This
 Rich> statement (using max() for next_contact) produces no error
 Rich> message, but also no results so I killed the process after 30
 Rich> seconds. Without a syntax error for guidance I don't know how to
 Rich> proceed. I've not before run 'explain' on a query. Would that be
 Rich> appropriate here?

Yes.

 Rich> select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name
 Rich> from people as p, organizations as o,
 Rich> lateral
 Rich> (select p.person_id, p.lname, p.fname, p.direct_phone, 
o.org_name,
 Rich> max(a.next_contact)
 Rich> from people as p, organizations as o, activities as a
 Rich> where a.next_contact > '2018-12-31' and
 Rich>   a.next_contact <= 'today' and
 Rich>   a.next_contact is not null
 Rich> group by p.person_id, o.org_name, a.next_contact
 Rich> order by p.person_id, o.org_name, a.next_contact) sq;

The problem here is that you have no join conditions at all, so the
result set of this query is massive. And you've duplicated many tables
inside the subquery which is not necessary or appropriate.

select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, sq.*
  from people as p
   join organizations as o on p.organization_id=o.id   -- OR WHATEVER
   cross join
 lateral (select a.next_contact
from activities as a
   where a.person_id=p.person_id --VERY IMPORTANT
 and a.next_contact > '2018-12-31'
 and a.next_contact <= 'today'
 and a.next_contact is not null
   order by a.next_contact DESC
   limit 1) sq;

Ordering by DESC with a limit 1 is used to get the max next_contact
value rather than the smallest; this is similar to max(), but makes it
trivial to also access the other columns of the _same_ activities row
which is being selected.
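
A hedged alternative sketch (not from the query above, and assuming the same
guessed join condition): DISTINCT ON can express the same "latest activities
row per person" pick in a single query level:

select distinct on (p.person_id)
       p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, a.next_contact
  from people as p
  join organizations as o on p.organization_id=o.id   -- OR WHATEVER
  join activities as a on a.person_id=p.person_id
 where a.next_contact > '2018-12-31'
   and a.next_contact <= 'today'
 order by p.person_id, a.next_contact desc;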

-- 
Andrew (irc:RhodiumToad)



Re: Subquery to select max(date) value

2019-02-15 Thread Rich Shepard

On Fri, 15 Feb 2019, Andrew Gierth wrote:


LATERAL (SELECT ...)   is syntactically like (SELECT ...) in that it
comes _after_ a "," in the from-clause or after a [LEFT] JOIN keyword.


Andrew,

Yes, the missing ',' made a big difference.


You'd want a condition here that references the "people" table;


Got it.


and I'm guessing you want that ordered by next_contact alone, possibly
with LIMIT 1 to get just the nearest following next_contact time.


Using LIMIT 1 produces only the first returned row. This statement (using
max() for next_contact) produces no error message, but also no results so I
killed the process after 30 seconds. Without a syntax error for guidance I
don't know how to proceed. I've not before run 'explain' on a query. Would
that be appropriate here?

select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name
from people as p, organizations as o,
lateral
(select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name,
max(a.next_contact)
from people as p, organizations as o, activities as a
where a.next_contact > '2018-12-31' and
  a.next_contact <= 'today' and
  a.next_contact is not null
group by p.person_id, o.org_name, a.next_contact
order by p.person_id, o.org_name, a.next_contact) sq;

Regards,

Rich



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Thomas Munro
On Sat, Feb 16, 2019 at 6:50 AM Andres Freund  wrote:
> On February 15, 2019 9:44:50 AM PST, Tom Lane  wrote:
> >Andres Freund  writes:
> >> On February 15, 2019 9:13:10 AM PST, Tom Lane 
> >wrote:
> >>> I'm of the opinion that we shouldn't be panicking for
> >sync_file_range
> >>> failure, period.
> >
> >> With some flags it's strictly required, it does "eat" errors depending
> >on the flags. So I'm not sure I understand?
> >
> >Really?  The specification says that it starts I/O, not that it waits
> >around for any to finish.
>
> That depends on the flags you pass in. By memory I don't think it eats an 
> error with our flags in recent kernels, but I'm not sure.

Right, there was some discussion of that, and I didn't (and still
don't) think it'd be wise to rely on undocumented knowledge about
which flags can eat errors based on a drive-by reading of a particular
snapshot of the Linux tree.  The man page says it can return EIO; I
think we should assume that it might actually do that.

BTW I had a report from someone on IRC that PostgreSQL breaks in other
ways (not yet understood) if you build it directly on WSL/Ubuntu.  I
guess the OP is reporting about a .deb that was built on a real Linux
system.  I'm vaguely familiar with these types of problems from other
platforms (Linux syscall emulation on FreeBSD and Sun-ish systems, and
also I'm old enough to remember people doing SCO SysV syscall
emulation on Linux systems back before certain valuable software was
available natively); it's possible that you get ENOSYS on other
emulators too, considering that other kernels don't seem to have a
sync_file_range()-like facility, but probably no one cares, since
there is no reason to run PostgreSQL on a syscall emulator when you
can run it natively.  This is a bit different though: I guess people
want to be able to develop Linux-stack stuff on company-issued Windows
computers for later deployment on Linux servers; someone interested in
this would ideally make it work and set up a build farm animal to tell
us when we break it.  It would probably require only minimal changes,
but considering that no one bothered to complain about PostgreSQL
spewing scary looking warnings on WSL for years, it's not too
surprising that we didn't consider this case before.  A bit like the
nightjar case, the PANIC patch revealed a pre-existing problem that
had gone unreported and needs some work, but it doesn't seem like a
very good reason to roll back that part of the change completely IMHO.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Channel binding not supported using scram-sha-256 passwords

2019-02-15 Thread Hugh Ranalli
On Fri, 15 Feb 2019 at 16:14, Bruce Momjian  wrote:

> The PG 11 release notes are clear that channel binding is not supported
> in a usable way yet:
>

I did see that. However, I'm not *trying* to use it. I set up accounts with
scram-sha-256 passwords, and when trying to connect I get this message.
Hence why I tried to disable it.

Hugh


Re: Channel binding not supported using scram-sha-256 passwords

2019-02-15 Thread Bruce Momjian
On Fri, Feb 15, 2019 at 03:41:37PM -0500, Hugh Ranalli wrote:
> 
> I've been trying to implement scram-sha-256 passwords on PostgreSQL 11.1.
> However, connection attempts whether through Python (psycopg2) or psql fail
> with the message: "channel binding not supported by this build." I've tried
> clearing scram_channel_binding in my global psqlrc ("\set
> scram_channel_binding"), with no success.
> 
> We are not using SSL for the connections, and the documentation labels this as
> an SASL authentication mechanism. Is SSL required for using scram-sha-256
> passwords? What am I missing?

The PG 11 release notes are clear that channel binding is not supported
in a usable way yet:

https://www.postgresql.org/docs/11/release-11.html

Add ability to use channel binding when using SCRAM authentication
(Michael Paquier)

Channel binding is intended to prevent man-in-the-middle attacks, but
SCRAM cannot prevent them unless it can be forced to be active.
Unfortunately, there is no way to do that in libpq. Support for it is
expected in future versions of libpq and in interfaces not built using
libpq, e.g. JDBC.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Subquery to select max(date) value

2019-02-15 Thread Rich Shepard

On Fri, 15 Feb 2019, Andrew Gierth wrote:


LATERAL (SELECT ...)   is syntactically like (SELECT ...) in that it comes
_after_ a "," in the from-clause or after a [LEFT] JOIN keyword. Don't
think of LATERAL as being a type of join, think of it as qualifying the
(SELECT ...) that follows.


Andrew,

Thank you. Now I understand the difference.


Rich> from activities as a
Rich> where a.next_contact is not null and a.next_contact <= 'today' and
Rich>   a.next_contact > '2018-12-31'

You'd want a condition here that references the "people" table; the whole
point of LATERAL is that it opens up the scope of column references in the
subquery to include those tables which are to its left in the from-clause.


And here I got it backwards, thinking the subquery could reference the
columns in the people table from the initial select.


Rich> order by person_id,next_contact);

and I'm guessing you want that ordered by next_contact alone, possibly
with LIMIT 1 to get just the nearest following next_contact time.


That's true. With DISTINCT ON only the most recent next_contact date is
returned.

More work on this over the weekend, now that I have a better understanding of
lateral.

Thanks again,

Rich



Re: Subquery to select max(date) value

2019-02-15 Thread Andrew Gierth
> "Rich" == Rich Shepard  writes:

 Rich> I found a couple of web pages describing the lateral join yet
 Rich> have not correctly applied them. The manual's page did not help
 Rich> me get the correct syntax, either. Think I'm close, however:

 Rich> select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, 
a.next_contact
 Rich> from people as p, organizations as o
 Rich> lateral
 Rich> (select a.next_contact

LATERAL (SELECT ...)   is syntactically like (SELECT ...) in that it
comes _after_ a "," in the from-clause or after a [LEFT] JOIN keyword.
Don't think of LATERAL as being a type of join, think of it as
qualifying the (SELECT ...) that follows.
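
To make the placement concrete, here is a minimal hedged sketch using
made-up tables t and u (they are not from this thread):

-- comma form: the LATERAL subquery follows a "," in the FROM list
select *
  from t,
       lateral (select u.x from u where u.t_id = t.id) as sq;

-- join form: after a [LEFT] JOIN keyword (LEFT JOIN LATERAL needs ON true)
select *
  from t
  left join lateral (select u.x from u where u.t_id = t.id) as sq on true;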

 Rich> from activities as a
 Rich> where a.next_contact is not null and a.next_contact <= 'today' 
and
 Rich>   a.next_contact > '2018-12-31'

You'd want a condition here that references the "people" table; the
whole point of LATERAL is that it opens up the scope of column
references in the subquery to include those tables which are to its left
in the from-clause.

 Rich> order by person_id,next_contact);

and I'm guessing you want that ordered by next_contact alone, possibly
with LIMIT 1 to get just the nearest following next_contact time.

-- 
Andrew (irc:RhodiumToad)



Re: Trigger function always logs postgres as user name

2019-02-15 Thread Alexander Reichstadt
Hi,

The answer to the question is that you need to use session_user instead of user 
or current_user.

Cheers,
Alex
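
As a minimal sketch of the change, using the logging table from the function
quoted below: session_user reports the user who logged in, while current_user
can switch to the function owner inside SECURITY DEFINER functions or after
SET ROLE, which is presumably why it showed "postgres" here.

-- same INSERT as in the quoted trigger, but logging session_user instead
INSERT INTO logging (tabname, schemaname, who, operation, new_val)
VALUES (TG_RELNAME, TG_TABLE_SCHEMA, session_user, TG_OP, row_to_json(NEW));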


> On 9 Feb 2019, at 10:08, Alexander Reichstadt  wrote:
> 
> Hi,
> 
> I set up trigger functions for logging, and while they do work and get 
> triggered, the current_user always inserts “postgres” even when 
> updates/deletes/inserts are caused by users of another name.
> 
> How do I get it to use the name that caused the update? It seems current_user 
> is the trigger’s user, so the server itself in some way. This is on PG10
> 
> Here the function:
>BEGIN
> 
>IF  TG_OP = 'INSERT'
> 
>THEN
> 
>INSERT INTO logging (tabname, schemaname, who, 
> operation, new_val)
> 
>VALUES (TG_RELNAME, TG_TABLE_SCHEMA, 
> current_user, TG_OP, row_to_json(NEW));
> 
>RETURN NEW;
> 
>ELSIF   TG_OP = 'UPDATE'
> 
>THEN
> 
>INSERT INTO logging (tabname, schemaname, who, 
> operation, new_val, old_val)
> 
>VALUES (TG_RELNAME, TG_TABLE_SCHEMA, 
> current_user, TG_OP,
> 
>row_to_json(NEW), row_to_json(OLD));
> 
>RETURN NEW;
> 
>ELSIF   TG_OP = 'DELETE'
> 
>THEN
> 
>INSERT INTO logging (tabname, schemaname, operation, 
> who, old_val)
> 
>VALUES (TG_RELNAME, TG_TABLE_SCHEMA, TG_OP, 
> current_user, row_to_json(OLD));
> 
>RETURN OLD;
> 
>END IF;
> 
>END;
> 
> 
> 
> Cheers,
> Alex




Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Nicklas Avén



> I have not had a chance to fully go through all of the below. Some 
questions/suggestions:

>
> 1) Thanks for the formatted queries. If I could make a suggestion, 
when aliasing could you include AS. It would make finding what l.* 
refers to easier for those of us with old eyes:)

>
Yes, of course, sorry :-)


> 2) t4e_contractor_id is in the shiny_adm.contractor_access table?
> If not where?

Yes, sorry again, it is there


>
> 3) What is the schema for shiny_adm.contractor_access?
> In particular what indexes are on it?
>
shiny_adm.contractor_access looks like this:


CREATE TABLE shiny_adm.contractor_access
(
  machine_key text,
  t4e_contractor_id text,
  active integer DEFAULT 1,
  id serial NOT NULL,
  CONSTRAINT contractor_access_pkey PRIMARY KEY (id),
  CONSTRAINT contractor_unique UNIQUE (machine_key, t4e_contractor_id),
  CONSTRAINT co_check_t4e_co_email CHECK 
(utils.verify_email(t4e_contractor_id))

)


CREATE INDEX idx_contractor
  ON shiny_adm.contractor_access
  USING btree
  (t4e_contractor_id COLLATE pg_catalog."default");

CREATE INDEX idx_contractor_mk
  ON shiny_adm.contractor_access
  USING btree
  (machine_key COLLATE pg_catalog."default");


I tried to format the below a little better with AS and some more 
consistent indents.



I also, in the first query, changed the where clause to filter on 
machine_key in the table contractor_access. Just to illustrate the problem 
better.


Both queries filter on the same table which is joined the same way. But 
in the second example the where clause is not pushed to the subquery l



Thanks a lot for looking into it


Nicklas




Query 1:
EXPLAIN ANALYZE
SELECT
    l.machine_key,
    o.object_name,
    o.sub_object_name,
    o.object_user_id,
    o.sub_object_user_id,
    o.start_date AS object_start_date,
    s.species_group_name,
    p.product_group_name,
    l.m3_sub AS volume_m3sub,
    l.number_of_logs,
    mi.basemachine_manufacturer,
    mi.basemachine_model
FROM
    (
    SELECT
    hl.contractor_id,
    hl.machine_key,
    hl.operator_key,
    hl.object_key,
    hl.sub_object_key,
    date(hl.harvest_date) AS harvest_date,
    hl.species_group_key,
    hl.product_key,
    sum(hl.m3_sub) AS m3_sub,
    count(*) AS number_of_logs
    FROM
    version_union_tables_r02.harvester_logs hl
    GROUP BY
    hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key

    )  AS l
    LEFT JOIN version_union_tables_r02.machine_info  AS mi ON 
l.machine_key::text = mi.machine_key::text
    LEFT JOIN version_union_tables_r02.objects AS  o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
    LEFT JOIN version_union_tables_r02.products AS  p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
    LEFT JOIN version_union_tables_r02.species  AS s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key

    join shiny_adm.contractor_access AS ci on l.machine_key=ci.machine_key
where ci.machine_key = '887655635442600';


Resulting in this query plan:
Nested Loop  (cost=61865.25..65302.20 rows=22624 width=122) (actual 
time=27.801..29.225 rows=250 loops=1)
  ->  Seq Scan on contractor_access ci  (cost=0.00..1.01 rows=1 
width=19) (actual time=0.005..0.006 rows=1 loops=1)

    Filter: (machine_key = '887655635442600'::text)
  ->  Hash Left Join  (cost=61865.25..65074.95 rows=22624 width=122) 
(actual time=27.794..29.070 rows=250 loops=1)
    Hash Cond: (((hl.machine_key)::text = (s.machine_key)::text) 
AND (hl.species_group_key = s.species_group_key))
    ->  Hash Left Join  (cost=61854.55..64263.92 rows=22624 
width=120) (actual time=27.771..28.851 rows=250 loops=1)
  Hash Cond: (((hl.machine_key)::text = 
(p.machine_key)::text) AND (hl.product_key = p.product_key))
  ->  Hash Left Join  (cost=61815.97..63145.14 rows=22624 
width=118) (actual time=27.736..28.628 rows=250 loops=1)
    Hash Cond: (((hl.machine_key)::text = 
(o.machine_key)::text) AND (hl.object_key = o.object_key) AND 
(hl.sub_object_key = o.sub_object_key))
    ->  Hash Left Join  (cost=61799.78..62619.90 
rows=22624 width=65) (actual time=27.709..28.416 rows=250 loops=1)
  Hash Cond: ((hl.machine_key)::text = 
(mi.machine_key)::text)
  ->  HashAggregate (cost=61796.99..62079.79 
rows=22624 width=69) (actual time=27.677..28.217 rows=250 loops=1)
    Group Key: hl.machine_key, 
hl.contractor_id, hl.operator_key, hl.object_key, hl.sub_object_key, 
date(hl.harvest_date), hl.species_group_key, hl.product_key
    ->  Bitmap Heap Scan on harvester_logs 
hl  (cost=570.14..61224.14 rows=22914 width=61) 

Channel binding not supported using scram-sha-256 passwords

2019-02-15 Thread Hugh Ranalli
I've been trying to implement scram-sha-256 passwords on PostgreSQL 11.1.
However, connection attempts whether through Python (psycopg2) or psql fail
with the message: "channel binding not supported by this build." I've tried
clearing scram_channel_binding in my global psqlrc ("\set
scram_channel_binding"), with no success.

We are not using SSL for the connections, and the documentation labels this
as an SASL authentication mechanism. Is SSL required for using
scram-sha-256 passwords? What am I missing?
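
For reference, a minimal hedged sketch of setting up a scram-sha-256 account
(the role name is a placeholder, not from the original message):

-- store new passwords as SCRAM verifiers rather than md5 hashes
SET password_encryption = 'scram-sha-256';
CREATE ROLE app_user LOGIN PASSWORD 'secret';
-- the matching pg_hba.conf entries then use the scram-sha-256 auth method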

Thanks,
Hugh

--
Hugh Ranalli
Principal Consultant
White Horse Technology Consulting
e: h...@whtc.ca


Re: Subquery to select max(date) value

2019-02-15 Thread Rich Shepard

On Wed, 13 Feb 2019, Andrew Gierth wrote:


You want LATERAL.


Andrew, et al,:

I found a couple of web pages describing the lateral join yet have not
correctly applied them. The manual's page did not help me get the correct
syntax, either. Think I'm close, however:

select p.person_id, p.lname, p.fname, p.direct_phone, o.org_name, a.next_contact
from people as p, organizations as o
lateral
(select a.next_contact
from activities as a
where a.next_contact is not null and a.next_contact <= 'today' and
  a.next_contact > '2018-12-31'
order by person_id,next_contact);

When run I get this syntax error:

$ psql -f get_next_contact_dates.sql -d bustrac 
psql:get_next_contact_dates.sql:10: ERROR:  syntax error at or near "lateral"

LINE 3: lateral

What am I still missing?

Regards,

Rich



Re: Promoted slave tries to archive previously archived WAL file

2019-02-15 Thread Andre Piwoni
Ok. I think I uncovered a bug.

My slave nodes were created using pg_basebackup with --wal-method=stream.
If I understand right, this option streams the WAL files generated during the
backup, and this WAL file was 00010002, but its contents were
different from what was on the primary and in the WAL archive. When I changed
to --wal-method=fetch, which collects the WAL files generated during the backup
at the end, diff did not detect any changes. When I fail over, I don't seem
to have the archiving issue on the new primary.

It seems that streaming the WAL segments created during the backup produced a
corrupt file, based on the diff.





On Fri, Feb 15, 2019 at 10:23 AM Andre Piwoni  wrote:

> I have master and slave running with the following contents of their
> pg_wal directories and archivedir:
>
> ls -l /mnt/pgsql/archive/
> -rw-rw-rw-. 1 root root 16777216 Feb 15 09:39 00010001
> -rw-rw-rw-. 1 root root 16777216 Feb 15 09:39 00010002
> -rw-rw-rw-. 1 root root  302 Feb 15 09:39
> 00010002.0028.backup
>
> pg-hdp-node1.kitchen.local
> /var/lib/pgsql/10/data/pg_wal/:
> -rw---. 1 postgres postgres 16777216 Feb 15 09:39
> 00010002
> -rw---. 1 postgres postgres  302 Feb 15 09:39
> 00010002.0028.backup
> -rw---. 1 postgres postgres 16777216 Feb 15 09:44
> 00010003
> -rw---. 1 postgres postgres 16777216 Feb 15 09:39
> 00010004
> drwx--. 2 postgres postgres   96 Feb 15 09:44 archive_status
> /var/lib/pgsql/10/data/pg_wal/archive_status:
> -rw---. 1 postgres postgres 0 Feb 15 09:39
> 00010002.0028.backup.done
> -rw---. 1 postgres postgres 0 Feb 15 09:39
> 00010002.done
>
> pg-hdp-node2.kitchen.local
> /var/lib/pgsql/10/data/pg_wal/:
> -rw---. 1 postgres root 16777216 Feb 15 09:39
> 00010002
> -rw---. 1 postgres postgres 16777216 Feb 15 09:44
> 00010003
> drwx--. 2 postgres root6 Feb 15 09:39 archive_status
> /var/lib/pgsql/10/data/pg_wal/archive_status:
>
> diff from secondary pg-hdp-node2.kitchen.local on
> /var/lib/pgsql/10/data/pg_wal/00010002 and
> /mnt/pgsql/archive/00010002 shows binary differences but as
> expected no differences for diff on primary pg-hdp-node1.kitchen.local
>
> Failover is performed and pg-hdp-node2.kitchen.local tries and fails to
> archive WAL segment 00010002 because it has been previously
> archived
> 2019-02-15 09:54:50.518 PST [780] DETAIL:  The failed archive command was:
> test ! -f /mnt/pgsql/archive/00010002 && cp
> pg_wal/00010002 /mnt/pgsql/archive/00010002
>
> Based on this thread
> https://www.postgresql.org/message-id/11b405a6-2176-9510-bf5b-ea9c0e860635%40pgmasters.net
> it is suggested to handle this case by reporting success but in my case
> contents are different. I would think that previously archived
> 00010002 is the right WAL segment.
>
> So my questions are as follows:
>
> Why do the WAL segments differ?
> How should this be resolved on the new primary?
> --
>
> *Andre Piwoni*
>


-- 

*Andre Piwoni*

Sr. Software Developer, BI/Database

*Web*MD Health Services

Mobile: 801.541.4722

www.webmdhealthservices.com


Re: Size estimation of postgres core files

2019-02-15 Thread Jeremy Finzel
>
> It doesn't write out all of RAM, only the amount in use by the
> particular backend that crashed (plus all the shared segments attached
> by that backend, including the main shared_buffers, unless you disable
> that as previously mentioned).
>
> And yes, it can take a long time to generate a large core file.
>
> --
> Andrew (irc:RhodiumToad)
>

Based on Alvaro's response, I thought it is reasonably possible that
that *could* include nearly all of RAM, because that was my original
question.  If shared buffers is say 50G and my OS has 1T, shared buffers is
a small portion of that.  But really my question is what should we
reasonably assume is possible - meaning what kind of space should I
provision for a volume to be able to contain the core dump in case of
crash?  The time of writing the core file would definitely be a concern if
it could indeed be that large.

Could someone provide more information on exactly how to do that
coredump_filter?

We are looking to enable core dumps to aid in case of unexpected crashes
and wondering if there are any recommendations in general in terms of
balancing costs/benefits of enabling core dumps.

Thank you!
Jeremy


Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Adrian Klaver

On 2/15/19 9:27 AM, Nicklas Avén wrote:


On 2/15/19 5:06 PM, Adrian Klaver wrote:
 > On 2/15/19 7:28 AM, Nicklas Avén wrote:
 >> Hi
 >>
 >> The problem is that it always calculates all those 22000 rows even 
if the user id I use only gives 250 rows.

 >>
 >> So, the query uses 4 seconds instead of under 100 ms.
 >
 > https://www.postgresql.org/docs/10/sql-createview.html
 >
 > "CREATE VIEW defines a view of a query. The view is not physically 
materialized. Instead, the query is run every time the view is 
referenced in a query."


 >



Sorry, I must have expressed what I mean badly. Of course a view is not 
materialized.

I will explain without views what I mean here below



 > Might want to look at materialized view:
 > https://www.postgresql.org/docs/10/sql-creatematerializedview.html
 >
 > "CREATE MATERIALIZED VIEW is similar to CREATE TABLE AS, except that 
it also remembers the query used to initialize the view, so that it can 
be refreshed later upon demand. A materialized view has many of the same 
properties as a table, but there is no support for temporary 
materialized views or automatic generation of OIDs."


 >


No, materialized views are not an option. We get some data into those 
tables daily, and recalculating the full dataset on a lot of views like this 
doesn't make sense.
Instead we have tables maintained with the processed new data. But I want 
this last part of the logic on top as views, for flexibility, so we don't 
need cached tables for each possible type of grouping.
We started out with materialized views and it didn't work out well.




 >
 > I would also suggest running the EXPLAIN below with ANALYZE so actual 
timings are returned. Also try:

 >
 > SELECT
 > *
 > FROM
 > underlying_view AS a
 > JOIN
 > contractor_access AS b
 > ON
 > a.machine_key = b.machine_key
 > WHERE
 > user_id = 'name@email.address'
 >>

Sorry again, I didn't mention it. I have tried this, since this is what 
the top-level view does.
So the first step when trying to understand this was (of course) to apply 
the where-clause directly to the query.




So, let's do that also on the underlying query (view) .


Here I have 2 queries, where I apply the where clause directly to the 
query in the underlying view


(joining the contractor_access table directly on that query).

The first takes 30-40 ms and returns the same 250 rows as the second.
In the first I use the machine_key in the where clause.
In the second query, which takes about 16 seconds to return the same 250 
rows, I use the user_id in the contractor_access table.


I have also cleaned up the contractor_access table. So there is only 1 
row now, with my email as user_id and the same machine_key as used in 
the first query.




I have not had a chance to fully go through all of the below. Some 
questions/suggestions:


1) Thanks for the formatted queries. If I could make a suggestion, when 
aliasing could you include AS. It would make finding what l.* refers to 
easier for those of us with old eyes:)


2) t4e_contractor_id is in the shiny_adm.contractor_access table?
If not where?

3) What is the schema for shiny_adm.contractor_access?
In particular what indexes are on it?




Query 1:


EXPLAIN ANALYZE
SELECT
     l.machine_key,
     o.object_name,
     o.sub_object_name,
     o.object_user_id,
     o.sub_object_user_id,
     o.start_date AS object_start_date,
     s.species_group_name,
     p.product_group_name,
     l.m3_sub AS volume_m3sub,
     l.number_of_logs,
     mi.basemachine_manufacturer,
     mi.basemachine_model
    FROM ( SELECT hl.contractor_id,
     hl.machine_key,
     hl.operator_key,
     hl.object_key,
     hl.sub_object_key,
     date(hl.harvest_date) AS harvest_date,
     hl.species_group_key,
     hl.product_key,
     sum(hl.m3_sub) AS m3_sub,
     count(*) AS number_of_logs
    FROM version_union_tables_r02.harvester_logs hl
   GROUP BY hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key) l
  LEFT JOIN version_union_tables_r02.machine_info mi ON 
l.machine_key::text = mi.machine_key::text
  LEFT JOIN version_union_tables_r02.objects o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
  LEFT JOIN version_union_tables_r02.products p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
  LEFT JOIN version_union_tables_r02.species s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key

join shiny_adm.contractor_access ci on l.machine_key=ci.machine_key
where l.machine_key = '887655635442600'
;

which results in this query plan

Nested Loop  (cost=61865.25..65302.20 rows=22624 width=122) (actual 
time=25.804..27.134 rows=250 loops=1)
   ->  Seq Scan on contractor_access ci  (cost=0.00..1.01 rows=1 

Re: Size estimation of postgres core files

2019-02-15 Thread Andrew Gierth
> "Jeremy" == Jeremy Finzel  writes:

 Jeremy> Yes Linux. This is very helpful, thanks. A follow-up question -
 Jeremy> will it take postgres a really long time to crash (and
 Jeremy> hopefully recover) if I have say 1T of RAM because it has to
 Jeremy> write that all out to a core file first?

It doesn't write out all of RAM, only the amount in use by the
particular backend that crashed (plus all the shared segments attached
by that backend, including the main shared_buffers, unless you disable
that as previously mentioned).

And yes, it can take a long time to generate a large core file.

-- 
Andrew (irc:RhodiumToad)



Promoted slave tries to archive previously archived WAL file

2019-02-15 Thread Andre Piwoni
I have master and slave running with the following contents of their pg_wal
directories and archivedir:

ls -l /mnt/pgsql/archive/
-rw-rw-rw-. 1 root root 16777216 Feb 15 09:39 00010001
-rw-rw-rw-. 1 root root 16777216 Feb 15 09:39 00010002
-rw-rw-rw-. 1 root root  302 Feb 15 09:39
00010002.0028.backup

pg-hdp-node1.kitchen.local
/var/lib/pgsql/10/data/pg_wal/:
-rw---. 1 postgres postgres 16777216 Feb 15 09:39
00010002
-rw---. 1 postgres postgres  302 Feb 15 09:39
00010002.0028.backup
-rw---. 1 postgres postgres 16777216 Feb 15 09:44
00010003
-rw---. 1 postgres postgres 16777216 Feb 15 09:39
00010004
drwx--. 2 postgres postgres   96 Feb 15 09:44 archive_status
/var/lib/pgsql/10/data/pg_wal/archive_status:
-rw---. 1 postgres postgres 0 Feb 15 09:39
00010002.0028.backup.done
-rw---. 1 postgres postgres 0 Feb 15 09:39 00010002.done

pg-hdp-node2.kitchen.local
/var/lib/pgsql/10/data/pg_wal/:
-rw---. 1 postgres root 16777216 Feb 15 09:39
00010002
-rw---. 1 postgres postgres 16777216 Feb 15 09:44
00010003
drwx--. 2 postgres root6 Feb 15 09:39 archive_status
/var/lib/pgsql/10/data/pg_wal/archive_status:

diff from secondary pg-hdp-node2.kitchen.local on
/var/lib/pgsql/10/data/pg_wal/00010002 and
/mnt/pgsql/archive/00010002 shows binary differences but as
expected no differences for diff on primary pg-hdp-node1.kitchen.local

Failover is performed and pg-hdp-node2.kitchen.local tries and fails to
archive WAL segment 00010002 because it has been previously
archived
2019-02-15 09:54:50.518 PST [780] DETAIL:  The failed archive command was:
test ! -f /mnt/pgsql/archive/00010002 && cp
pg_wal/00010002 /mnt/pgsql/archive/00010002

Based on this thread
https://www.postgresql.org/message-id/11b405a6-2176-9510-bf5b-ea9c0e860635%40pgmasters.net
it is suggested to handle this case by reporting success but in my case
contents are different. I would think that previously archived
00010002 is the right WAL segment.

So my questions are as follows:

Why do the WAL segments differ?
How should this be resolved on the new primary?
-- 

*Andre Piwoni*


Re: Size estimation of postgres core files

2019-02-15 Thread Jeremy Finzel
>
> In Linux, yes.  Not sure about other OSes.
>
> You can turn off the dumping of shared memory with some unusably
> unfriendly bitwise arithmetic using the "coredump_filter" file in /proc
> for the process.  (It's inherited by children, so you can just set it
> once for postmaster at server start time).
>

Yes Linux.  This is very helpful, thanks.  A follow-up question - will it
take postgres a really long time to crash (and hopefully recover) if I have
say 1T of RAM because it has to write that all out to a core file first?

Thanks,
Jeremy


Re: PostgreSql Version Compatibility With Apache ActiveMQ

2019-02-15 Thread Adrian Klaver

On 2/15/19 8:33 AM, Rameshbabu Paulsamy (UST, IND) wrote:

Thanks Adrian for the response. I had posted it there already. As I didn't get 
any response, I tried checking here.


The only other suggestion I have is to try it: spin up a 10.x or 11.x 
instance of Postgres, run ActiveMQ against it, and see what happens.




Please help !

Thanks
Ramesh

-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Sent: Friday, 15 February, 2019 9:15 PM
To: Rameshbabu Paulsamy (UST, IND) ; 
pgsql-gene...@postgresql.org
Subject: Re: PostgreSql Version Compatibility With Apache ActiveMQ

On 2/15/19 5:55 AM, Rameshbabu Paulsamy (UST, IND) wrote:

Hi,

I am using Apache ActiveMQ version 5.15.2 and we are connected to
Postgresql version 9.6

My DB team is planning to upgrade to 10.5 or 11.1 version.

Could you please let me know if these PostgreSQL versions are compatible
with ActiveMQ 5.15.2?


That is more of a question for the ActiveMQ community:

http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html



Thanks in Advance.

Regards

Ramesh




--
Adrian Klaver
adrian.kla...@aklaver.com




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Andres Freund



On February 15, 2019 9:44:50 AM PST, Tom Lane  wrote:
>Andres Freund  writes:
>> On February 15, 2019 9:13:10 AM PST, Tom Lane 
>wrote:
>>> I'm of the opinion that we shouldn't be panicking for
>sync_file_range
>>> failure, period.
>
>> With some flags it's strictly required, it does "eat" errors depending
>on the flags. So I'm not sure I understand? 
>
>Really?  The specification says that it starts I/O, not that it waits
>around for any to finish.

That depends on the flags you pass in. By memory I don't think it eats an error 
with our flags in recent kernels, but I'm not sure.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Andres Freund



On February 15, 2019 9:13:10 AM PST, Tom Lane  wrote:
>Andres Freund  writes:
>> I suspect that's because WSL has an empty implementation of
>> sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
>> configure detects it, we still emit calls for it.  I guess we ought
>to
>> except ENOSYS for the cases where we do panic-on-fsync-failure?
>
>I'm of the opinion that we shouldn't be panicking for sync_file_range
>failure, period.

With some flags it's strictly required, it does "eat" errors depending on the 
flags. So I'm not sure I understand? 

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Nicklas Avén



On 2/15/19 5:06 PM, Adrian Klaver wrote:
> On 2/15/19 7:28 AM, Nicklas Avén wrote:
>> Hi
>>
>> The problem is that it always calculates all those 22000 rows even 
if the user id I use only gives 250 rows.

>>
>> So, the query uses 4 seconds instead of under 100 ms.
>
> https://www.postgresql.org/docs/10/sql-createview.html
>
> "CREATE VIEW defines a view of a query. The view is not physically 
materialized. Instead, the query is run every time the view is 
referenced in a query."


>



Sorry, I must have expressed what I mean badly. Of course a view is not 
materialized.

I will explain without views what I mean here below



> Might want to look at materialized view:
> https://www.postgresql.org/docs/10/sql-creatematerializedview.html
>
> "CREATE MATERIALIZED VIEW is similar to CREATE TABLE AS, except that 
it also remembers the query used to initialize the view, so that it can 
be refreshed later upon demand. A materialized view has many of the same 
properties as a table, but there is no support for temporary 
materialized views or automatic generation of OIDs."


>


No, materialized views are not an option. We get some data into those 
tables daily, and recalculating the full dataset on a lot of views like this 
doesn't make sense.
Instead we have tables maintained with the processed new data. But I want 
this last part of the logic on top as views, for flexibility, so we don't 
need cached tables for each possible type of grouping.
We started out with materialized views and it didn't work out well.




>
> I would also suggest running the EXPLAIN below with ANALYZE so actual 
timings are returned. Also try:

>
> SELECT
> *
> FROM
> underlying_view AS a
> JOIN
> contractor_access AS b
> ON
> a.machine_key = b.machine_key
> WHERE
> user_id = 'name@email.address'
>>

Sorry again, I didn't mention it. I have tried this, since this is what 
the top-level view does.
So the first step when trying to understand this was (of course) to apply 
the where-clause directly to the query.




So, let's do that also on the underlying query (view) .


Here I have 2 queries, where I apply the where clause directly to the 
query in the underlying view


(joining the contractor_access table directly on that query).

The first takes 30-40 ms and returns the same 250 rows as the second.
In the first I use the machine_key in the where clause.
In the second query, which takes about 16 seconds to return the same 250 
rows, I use the user_id in the contractor_access table.


I have also cleaned up the contractor_access table. So there is only 1 
row now, with my email as user_id and the same machine_key as used in 
the first query.




Query 1:


EXPLAIN ANALYZE
SELECT
    l.machine_key,
    o.object_name,
    o.sub_object_name,
    o.object_user_id,
    o.sub_object_user_id,
    o.start_date AS object_start_date,
    s.species_group_name,
    p.product_group_name,
    l.m3_sub AS volume_m3sub,
    l.number_of_logs,
    mi.basemachine_manufacturer,
    mi.basemachine_model
   FROM ( SELECT hl.contractor_id,
    hl.machine_key,
    hl.operator_key,
    hl.object_key,
    hl.sub_object_key,
    date(hl.harvest_date) AS harvest_date,
    hl.species_group_key,
    hl.product_key,
    sum(hl.m3_sub) AS m3_sub,
    count(*) AS number_of_logs
   FROM version_union_tables_r02.harvester_logs hl
  GROUP BY hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key) l
 LEFT JOIN version_union_tables_r02.machine_info mi ON 
l.machine_key::text = mi.machine_key::text
 LEFT JOIN version_union_tables_r02.objects o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
 LEFT JOIN version_union_tables_r02.products p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
 LEFT JOIN version_union_tables_r02.species s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key

join shiny_adm.contractor_access ci on l.machine_key=ci.machine_key
where l.machine_key = '887655635442600'
;

which results in this query plan

Nested Loop  (cost=61865.25..65302.20 rows=22624 width=122) (actual 
time=25.804..27.134 rows=250 loops=1)
  ->  Seq Scan on contractor_access ci  (cost=0.00..1.01 rows=1 
width=19) (actual time=0.009..0.010 rows=1 loops=1)

    Filter: (machine_key = '887655635442600'::text)
  ->  Hash Left Join  (cost=61865.25..65074.95 rows=22624 width=122) 
(actual time=25.793..26.959 rows=250 loops=1)
    Hash Cond: (((hl.machine_key)::text = (s.machine_key)::text) 
AND (hl.species_group_key = s.species_group_key))
    ->  Hash Left Join  (cost=61854.55..64263.92 rows=22624 
width=120) (actual time=25.755..26.763 rows=250 loops=1)
  Hash Cond: (((hl.machine_key)::text = 
(p.machine_key)::text) AND 

Re: loading plpython error

2019-02-15 Thread Alan Nilsson
Indeed, that was it.  Thank you Tom!

alan


> On Feb 14, 2019, at 4:42 PM, Tom Lane  wrote:
> 
> Alan Nilsson  writes:
>> Platform:  Linux x86-64, CentOS 6, Postgres 11.1.
>> We have installed from the YUM repo.  The server runs fine but we are trying 
>> to add python support.
> 
>> yum install postgresql11-contrib postgresql11-plpython
> 
>> I can see the installed files in the correct locations but when we invoke 
>> create extension, we are greeted with the following error:
> 
>> ERROR:  could not load library "/usr/pgsql-11/lib/plpython2.so": 
>> /usr/pgsql-11/lib/plpython2.so: undefined symbol: getmissingattr
> 
> getmissingattr is a new core-PG function in 11.2.  I believe this means
> you're trying to load an 11.2 build of plpython2.so into an 11.1 server.
> You need to either update the server, or find an 11.1 build of plpython.
> 
>   regards, tom lane
> 




Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Tom Lane
Andres Freund  writes:
> I suspect that's because WSL has an empty implementation of
> sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
> configure detects it, we still emit calls for it.  I guess we ought to
> except ENOSYS for the cases where we do panic-on-fsync-failure?

I'm of the opinion that we shouldn't be panicking for sync_file_range
failure, period.

regards, tom lane



Re: WSL (windows subsystem on linux) users will need to turn fsync off as of 11.2

2019-02-15 Thread Andres Freund
Hi,

On 2019-02-14 19:48:05 -0500, Tom Lane wrote:
> Bruce Klein  writes:
> > If you are running Postgres inside Microsoft WSL (at least on Ubuntu, maybe
> > on others too), and just picked up a software update to version 11.2, you
> > will need to go into your /etc/postgresql.conf file and set fsync=off.
> 
> Hm.  Probably this is some unexpected problem with the
> panic-on-fsync-failure change; although that still leaves some things
> unexplained, because if fsync is failing for you now, why didn't it fail
> before?  Anyway, you might try experimenting with data_sync_retry,
> instead of running with scissors by turning off fsync altogether.
> See first item in the release notes:
> 
> https://www.postgresql.org/docs/11/release-11-2.html
> 
> Also, we'd quite like to hear more details; can you find any PANIC
> messages in the server log?

I suspect that's because WSL has an empty implementation of
sync_file_range(), i.e. it unconditionally returns ENOSYS. But as
configure detects it, we still emit calls for it.  I guess we ought to
except ENOSYS for the cases where we do panic-on-fsync-failure?

You temporarily can work around it, mostly, by setting
checkpoint_flush_after = 0 and bgwriter_flush_after = 0.
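
For example, a minimal sketch of applying that workaround from a superuser
session (both settings are reloadable, so no restart should be needed):

ALTER SYSTEM SET checkpoint_flush_after = 0;
ALTER SYSTEM SET bgwriter_flush_after = 0;
SELECT pg_reload_conf();   -- or edit postgresql.conf and reload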

Greetings,

Andres Freund



RE: PostgreSql Version Compatibility With Apache ActiveMQ

2019-02-15 Thread Rameshbabu Paulsamy (UST, IND)
Thanks Adrian for the response. I had posted it there already. As I didn't get 
any response, I tried checking here.

Please help !

Thanks
Ramesh

-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com] 
Sent: Friday, 15 February, 2019 9:15 PM
To: Rameshbabu Paulsamy (UST, IND) ; 
pgsql-gene...@postgresql.org
Subject: Re: PostgreSql Version Compatibility With Apache ActiveMQ

On 2/15/19 5:55 AM, Rameshbabu Paulsamy (UST, IND) wrote:
> Hi,
> 
> I am using Apache ActiveMQ version 5.15.2 and we are connected to 
> Postgresql version 9.6
> 
> My DB team is planning to upgrade to 10.5 or 11.1 version.
> 
> Could you please let me know if these PostgreSQL versions are compatible 
> with ActiveMQ 5.15.2?

That is more of a question for the ActiveMQ community:

http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

> 
> Thanks in Advance.
> 
> Regards
> 
> Ramesh
> 


--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Size estimation of postgres core files

2019-02-15 Thread Alvaro Herrera
On 2019-Feb-15, Jeremy Finzel wrote:

> I am trying to determine the upper size limit of a core file generated for
> any given cluster.  Is it feasible that it could actually be the entire
> size of the system memory + shared buffers (i.e. really huge)?

In Linux, yes.  Not sure about other OSes.

You can turn off the dumping of shared memory with some unusably
unfriendly bitwise arithmetic using the "coredump_filter" file in /proc
for the process.  (It's inherited by children, so you can just set it
once for postmaster at server start time).

-- 
Álvaro Herrera                 https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Size estimation of postgres core files

2019-02-15 Thread Jeremy Finzel
I am trying to determine the upper size limit of a core file generated for
any given cluster.  Is it feasible that it could actually be the entire
size of the system memory + shared buffers (i.e. really huge)?

I've done a little bit of testing of this myself, but want to be sure I am
clear on this for planning to have enough free space for postgres core
files in case of potential crashes.

Thanks!
Jeremy


Re: How to setup only one connection for the whole event loop?

2019-02-15 Thread Adrian Klaver

On 2/14/19 8:06 AM, Dominic Gua�a wrote:

Dear all,

I am new to PostgreSQL and I am creating a C program that can receive 
requests from different users. I want to maximize the performance of 
PostgreSQL, so I intend to just create one connection that would service 
all queries from different users. How do I do this?


Use one of the existing Postgres poolers?:

http://www.pgpool.net/mediawiki/index.php/Main_Page

http://pgbouncer.github.io/



Do I create a new connection each time there is a new request like how 
it goes in MySQL or can I just have one connection that is established 
in my `main` function?


Sincerely



--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Adrian Klaver

On 2/15/19 7:28 AM, Nicklas Avén wrote:

Hi


We have a system with 2 layers of views. It is about forestry.

The first layer contains the logic, like grouping log volumes together 
into stems or harvesting areas, joining species names to codes, 
and things like that.


The second layer just joins these underlying views to a table with user 
ids and machine ids. So, when used by the application, there is a where 
clause containing the user id, which gives access to the correct data


in the underlying view based on machine id.


The underlying view in this case can return approx 22000 rows, grouped 
from approx 8 million logs by harvest date, harvest object and so on.



The problem is that it always calculates all those 22000 rows even if 
the user id I use only gives 250 rows.


So, the query uses 4 seconds instead of under 100 ms.


https://www.postgresql.org/docs/10/sql-createview.html

"CREATE VIEW defines a view of a query. The view is not physically 
materialized. Instead, the query is run every time the view is 
referenced in a query."



Might want to look at materialized view:
https://www.postgresql.org/docs/10/sql-creatematerializedview.html

"CREATE MATERIALIZED VIEW is similar to CREATE TABLE AS, except that it 
also remembers the query used to initialize the view, so that it can be 
refreshed later upon demand. A materialized view has many of the same 
properties as a table, but there is no support for temporary 
materialized views or automatic generation of OIDs."
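
For illustration, a minimal sketch of that approach, reusing the view name 
from this thread; the index and the refresh timing are assumptions you would 
adapt to your own setup:

CREATE MATERIALIZED VIEW underlying_view_mat AS
  SELECT * FROM underlying_view;

CREATE INDEX ON underlying_view_mat (machine_key);

-- re-run the stored query whenever harvester_logs has changed enough to matter
REFRESH MATERIALIZED VIEW underlying_view_mat;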



I would also suggest running the EXPLAIN below with ANALYZE so actual 
timings are returned. Also try:


SELECT
*
FROM
underlying_view AS a
JOIN
contractor_access AS b
ON
a.machine_key = b.machine_key
WHERE
user_id = 'name@email.address'
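
For the timings, something roughly like this against the top-level view 
(BUFFERS is optional but often informative):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM top_level_view WHERE user_id = 'name@email.address';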



I have tried this on 2 servers and my laptop, PostgreSQL 9.6 and 10 and 
get the same issue.



I have tried to pick the query apart to understand what is happening.


First, the underlying view looks like this, except that I removed some fields 
that don't affect the case, to save some space:


CREATE OR REPLACE VIEW underlying_view AS
  SELECT
     l.machine_key,
     o.object_name,
     o.sub_object_name,
     s.species_group_name,
     p.product_group_name ,
     l.m3_sub AS volume_m3sub,
     l.number_of_logs,
     mi.basemachine_manufacturer,
     mi.basemachine_model
    FROM
     (
    SELECT
     hl.contractor_id,
     hl.machine_key,
     hl.operator_key,
     hl.object_key,
     hl.sub_object_key,
     date(hl.harvest_date) AS harvest_date, --this is timestamptz 
since we use the time in other places

     hl.species_group_key,
     hl.product_key,
     sum(hl.m3_sub) AS m3_sub,
     count(*) AS number_of_logs
     FROM version_union_tables_r02.harvester_logs hl
     GROUP BY hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key

     ) l
  LEFT JOIN version_union_tables_r02.machine_info mi ON 
l.machine_key::text = mi.machine_key::text
  LEFT JOIN version_union_tables_r02.objects o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
  LEFT JOIN version_union_tables_r02.products p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
  LEFT JOIN version_union_tables_r02.species s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key;



In the next level is a view that looks like this:


CREATE OR REPLACE VIEW top_level_view AS
  SELECT *
    FROM underlying_view a
  JOIN contractor_access b ON a.machine_key = b.machine_key WHERE 
b.active <> 0;



If I query this top_level_view like:

SELECT * FROM top_level_view WHERE user_id = 'name@email.address';

I haven't succeeded in preventing the underlying view from computing the full 
dataset.



The user_id 'name@email.address' matches 4 machine_keys in the 
contractor_access table, but only 1 of them has any hits in the underlying 
view (the other 3 machines are not represented there).


We call the machine key that we get from the contractor_access table 'machine1'.

Here is what I have tried to find out when the plan changes:


1) SELECT * FROM underlying_view WHERE machine_key = 'machine1'; -- 
returns 250 rows in approx 100 ms, so, works as expected



2) select * from (select 'machine1' machine_key) a, underlying_view b 
where a.machine_key = b.machine_key; --same as above, works as expected



3) select * from (select * from contractor_access where user_id = 
'name@email.address') a,
underlying_view b where a.machine_key = b.machine_key;      -- Here I get 
hit: this returns the same 250 rows, but in over 4 seconds



/* So I thought I should try to force the machine_key down to the 
underlying view with LATERAL, like this: */



4) select * from (select * from contractor_access where user_id = 
'name@email.address') a,
lateral 

Re: PostgreSql Version Compatibility With Apache ActiveMQ

2019-02-15 Thread Adrian Klaver

On 2/15/19 5:55 AM, Rameshbabu Paulsamy (UST, IND) wrote:

Hi,

I am using Apache ActiveMQ version 5.15.2 and we are connected to 
Postgresql version 9.6


My DB team is planning to upgrade to 10.5 or 11.1 version.

Could you please let me know if these Postgres versions are compatible 
with ActiveMQ 5.15.2?


That is more of a question for the ActiveMQ community:

http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html



Thanks in Advance.

Regards

Ramesh




--
Adrian Klaver
adrian.kla...@aklaver.com



Problems pushing down WHERE-clause to underlying view

2019-02-15 Thread Nicklas Avén

Hi


We have a system with 2 layers of views. It is about forestry.

The first layer contains the logic like grouping volumes in logs 
together to stems or harvesting areas and joining species names to codes 
and things like that.


The second layer just joins these underlying views to a table with user 
ids and machine ids. So, when used by the application, there is a where 
clause containing the user id, which gives access to the correct data 
in the underlying view based on machine id.


The underlying view in this case can return approx 22000 rows, grouped 
from approx 8 million logs by harvest date, harvest object and so on.



The problem is that it always calculates all those 22000 rows even if 
the user id I use only gives 250 rows.


So, the query uses 4 seconds instead of under 100 ms.


I have tried this on 2 servers and my laptop, PostgreSQL 9.6 and 10 and 
get the same issue.



I have tried to pick the query apart to understand what is happening.


First, the underlying view looks like this, except that I removed some fields 
that don't affect the case, to save some space:


CREATE OR REPLACE VIEW underlying_view AS
 SELECT
    l.machine_key,
    o.object_name,
    o.sub_object_name,
    s.species_group_name,
    p.product_group_name ,
    l.m3_sub AS volume_m3sub,
    l.number_of_logs,
    mi.basemachine_manufacturer,
    mi.basemachine_model
   FROM
    (
   SELECT
    hl.contractor_id,
    hl.machine_key,
    hl.operator_key,
    hl.object_key,
    hl.sub_object_key,
    date(hl.harvest_date) AS harvest_date, --this is timestamptz 
since we use the time in other places

    hl.species_group_key,
    hl.product_key,
    sum(hl.m3_sub) AS m3_sub,
    count(*) AS number_of_logs
    FROM version_union_tables_r02.harvester_logs hl
    GROUP BY hl.machine_key, hl.contractor_id, hl.operator_key, 
hl.object_key, hl.sub_object_key, (date(hl.harvest_date)), 
hl.species_group_key, hl.product_key

    ) l
 LEFT JOIN version_union_tables_r02.machine_info mi ON 
l.machine_key::text = mi.machine_key::text
 LEFT JOIN version_union_tables_r02.objects o ON 
l.machine_key::text = o.machine_key::text AND l.object_key = 
o.object_key AND l.sub_object_key = o.sub_object_key
 LEFT JOIN version_union_tables_r02.products p ON 
l.machine_key::text = p.machine_key::text AND l.product_key = p.product_key
 LEFT JOIN version_union_tables_r02.species s ON 
l.machine_key::text = s.machine_key::text AND l.species_group_key = 
s.species_group_key;



In the next level is a view that looks like this:


CREATE OR REPLACE VIEW top_level_view AS
 SELECT *
   FROM underlying_view a
 JOIN contractor_access b ON a.machine_key = b.machine_key WHERE 
b.active <> 0;



If I query this top_level_view like:

SELECT * FROM top_level_view WHERE user_id = 'name@email.address';

I haven't succeeded in preventing the underlying view from computing the full 
dataset.



The user_id 'name@email.address' matches 4 machine_keys in the 
contractor_access table, but only 1 of them has any hits in the underlying 
view (the other 3 machines are not represented there).


We call the machine key that we get from the contractor_access table 'machine1'.

Here is what I have tried to find out when the plan changes:


1) SELECT * FROM underlying_view WHERE machine_key = 'machine1'; -- 
returns 250 rows in approx 100 ms, so, works as expected



2) select * from (select 'machine1' machine_key) a, underlying_view b 
where a.machine_key = b.machine_key; --same as above, works as expected



3) select * from (select * from contractor_access where user_id = 
'name@email.address') a,
underlying_view b where a.machine_key = b.machine_key;      -- Here I get 
hit: this returns the same 250 rows, but in over 4 seconds



/* So I thought I should try to force the machine_key down to the 
underlying view with LATERAL, like this: */



4) select * from (select * from contractor_access where user_id = 
'name@email.address') a,

lateral (select * from underlying_view where machine_key = a.machine_key) b;

But this doesn't work either. It returns the same 250 rows in approx 4 
seconds.



My question is: is there some trick to force the planner to push down 
the machine_key?


I cannot understand what is fooling the planner.

The table is analyzed and I have tried on several machines, so I do not 
think it is misleading statistics.
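
(For reference, this was double-checked with something like the following; 
the columns are the standard pg_stat_user_tables ones:)

ANALYZE version_union_tables_r02.harvester_logs;

SELECT schemaname, relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'harvester_logs';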



I haven't done any configuration beyond setting 
max_parallel_workers_per_gather to 0, since the parallel workers make it 
harder to understand what is happening.
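
(That was only a session-level change, roughly:)

SET max_parallel_workers_per_gather = 0;  -- RESET max_parallel_workers_per_gather restores the default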



Here is the query plan for query number 3 above:

EXPLAIN select * from (select * from contractor_access where user_id = 
'name@email.address') a,

underlying_view b where a.machine_key = b.machine_key;

Aggregate  (cost=543839.03..543839.04 rows=1 width=8)
  ->  Hash Join  (cost=395402.74..543798.72 rows=16123 width=0)
    Hash Cond: ((hl.machine_key)::text = contractor_access.machine_key)
    ->  Hash Left Join  

Re: Shared hosting with FDW on AWS RDS

2019-02-15 Thread Bruno Lavoie



On 2019-02-14 10:21 p.m., Bruce Momjian wrote:

On Sun, Feb 10, 2019 at 03:19:48PM -0800, Paul Jungwirth wrote:

On 2/10/19 2:57 PM, auxsvr wrote:

We'd like to configure an RDS server for shared hosting. The idea is that every 
customer will be using a different database and FDW will be configured, so that 
the remote tables have access to the full data

I've set up something like this before (but on EC2), and the only problem I
couldn't solve was that any user can see your full customer list by typing
`\l` or `\du`. They can't see other customers' stuff, but they can see how
many customers you have and their database/login names. The only way around
it I know is that run separate "clusters" aka RDS instances.

You can try to lock this down somewhat by revoking access to various system
tables, but it starts breaking a lot of tools (e.g. some GUI tools don't
know what to do if they get an error just listing the databases). Also it is
so piecemeal I wouldn't trust that I'd blocked off all avenues of getting
the information.

I'd love to be corrected on this btw if anyone has better information! :-)

Heroku had that issue and used hash values for the user and database
names.


Yes, we have the same problem here...

We want to consolidate users and databases and we can do it easily with 
PostgreSQL, while that's not easily the case with some other RDBMS.


Even if we can mask real client names by using hashes, it is still a form 
of information leakage that our security team is concerned about, and it is 
a burden for our clients to manage.


It would be nice to have PG not list things you don't have rights to. 
I think that MySQL's "show databases" lists only the databases you have 
access to. Would modifying pg_catalog views make it possible?
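
For what it's worth, the piecemeal revocation mentioned above looks roughly 
like this. It is only a sketch: it needs a real superuser (which a stock RDS 
instance may not give you), and it is exactly the kind of thing that breaks 
tools expecting to list databases or roles:

REVOKE SELECT ON pg_catalog.pg_database FROM PUBLIC;  -- \l then fails for ordinary roles
REVOKE SELECT ON pg_catalog.pg_roles FROM PUBLIC;     -- \du likewise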



Thanks

Bruno




PostgreSql Version Compatibility With Apache ActiveMQ

2019-02-15 Thread Rameshbabu Paulsamy (UST, IND)
Hi,

I am using Apache ActiveMQ version 5.15.2 and we are connected to Postgresql 
version 9.6

My DB team is planning to upgrade to 10.5 or 11.1 version.

Could you please let me know if these Postgres versions are compatible with 
ActiveMQ 5.15.2?

Thanks in Advance.

Regards
Ramesh


How to setup only one connection for the whole event loop?

2019-02-15 Thread Dominic Gua�a
Dear all,

I am new to PostgreSQL and I am creating a C program that can receive requests 
from different users. I want to maximize the performance of PostgreSQL, so I 
intend to just create 1 connection that would service all queries of different 
users. How do I do this?

Do I create a new connection each time there is a new request, as is typically 
done in MySQL, or can I just have one connection that is established in my 
`main` function?
Sincerely