Re: sum of numeric column

2020-12-09 Thread avi Singh
Thanks, Adrian, for pointing me in the right direction. I got it working.


On Wed, Dec 9, 2020 at 5:32 PM Adrian Klaver 
wrote:

> On 12/9/20 5:04 PM, avi Singh wrote:
> > Thanks for your reply Adrian
> >
> >
> > What do you want to do with the array?
> > I want to sum the values of the numeric array-type column, e.g. below:
> > data_numeric
> > --
> >   {2.0}
> >   {1.0}
>
> If you are only going to have single-element arrays, then why not just
> use a numeric field?
>
> To answer the question:
>
> select sum(data_numeric[1]) from some_table;
>
> If you are going to have multi-element arrays then there are more
> questions:
>
> 1) Do you want the sum horizontally, within each array?
>
> 2) If 1), then also vertically, down the column?
>
> 3) Do you want sums across the 'columns' of the arrays?
>
> 4) If 3), then what about missing data?
>
> >
> >
> > (4 rows)
> >
> >
> >
> > Regards
> >
> >
>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: sum of numeric column

2020-12-09 Thread Adrian Klaver

On 12/9/20 5:04 PM, avi Singh wrote:

Thanks for your reply Adrian


What do you want to do with the array?
I want to sum the values of the numeric array-type column, e.g. below:
data_numeric
--
  {2.0}
  {1.0}


If you are only going to have single-element arrays, then why not just
use a numeric field?


To answer the question:

select sum(data_numeric[1]) from some_table;

If you are going to have multi-element arrays then there are more questions:

1) Do you want the sum horizontally, within each array?

2) If 1), then also vertically, down the column?

3) Do you want sums across the 'columns' of the arrays?

4) If 3), then what about missing data?
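For the multi-element cases, a sketch using unnest() (table and column names follow the thread; these queries are illustrative, not part of the original reply):

```sql
-- 1) horizontal: sum within each row's array
SELECT (SELECT sum(x) FROM unnest(data_numeric) AS x) AS row_sum
FROM some_table;

-- 2) horizontal and vertical: one grand total over all elements
SELECT sum(x) AS grand_total
FROM some_table, unnest(data_numeric) AS x;

-- 3) per-position sums across the "columns" of the arrays;
--    arrays missing a position simply contribute no rows there
SELECT pos, sum(x) AS col_sum
FROM some_table, unnest(data_numeric) WITH ORDINALITY AS u(x, pos)
GROUP BY pos
ORDER BY pos;
```

WITH ORDINALITY requires PostgreSQL 9.4 or later.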




(4 rows)



Regards






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: sum of numeric column

2020-12-09 Thread avi Singh
Thanks for your reply Adrian


What do you want to do with the array?
I want to sum the values of the numeric array-type column, e.g. below:

data_numeric
--
 {2.0}
 {1.0}


(4 rows)



Regards


On Wed, Dec 9, 2020 at 4:49 PM Adrian Klaver 
wrote:

> On 12/9/20 4:46 PM, avi Singh wrote:
> > I have a table and want to sum a column whose type is numeric. How can
> > I do it? When I try the sum function I get this error:
>
> You don't have a numeric type; you have a numeric array type.
>
> >
> > ERROR:  function sum(numeric[]) does not exist
>
> Hence the error above.
>
> >
> > Can anyone please help me with this ?
>
> What do you want to do with the array?
>
> >
> >             Column            |           Type           | Collation | Nullable | Default
> > ----------------------------+--------------------------+-----------+----------+---------
> >  grid_id                    | bigint                   |           | not null |
> >  as_of_date                 | date                     |           | not null |
> >  cell_id                    | bigint                   |           | not null |
> >  last_event_timestamp_local | timestamp with time zone |           |          |
> >  last_event_id              | bigint                   |           |          |
> >  column_id                  | bigint                   |           |          |
> >  column_name                | character varying(50)    |           |          |
> >  row_id                     | bigint                   |           |          |
> >  data_type_id               | smallint                 |           |          |
> >  data_numeric               | numeric[]                |           |          |
> >  data_string                | character varying[]      |           |          |
> >
> > e.g. of values in numeric type column
> >
> >   data_numeric
> > --
> >   {2.0}
> >   {1.0}
> >
> > Regards
> > Prabhjot
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>


Re: sum of numeric column

2020-12-09 Thread Adrian Klaver

On 12/9/20 4:46 PM, avi Singh wrote:
I have a table and want to sum a column whose type is numeric. How can I
do it? When I try the sum function I get this error:


You don't have a numeric type; you have a numeric array type.



ERROR:  function sum(numeric[]) does not exist


Hence the error above.



Can anyone please help me with this ?


What do you want to do with the array?



            Column            |           Type           | Collation | Nullable | Default
----------------------------+--------------------------+-----------+----------+---------
 grid_id                    | bigint                   |           | not null |
 as_of_date                 | date                     |           | not null |
 cell_id                    | bigint                   |           | not null |
 last_event_timestamp_local | timestamp with time zone |           |          |
 last_event_id              | bigint                   |           |          |
 column_id                  | bigint                   |           |          |
 column_name                | character varying(50)    |           |          |
 row_id                     | bigint                   |           |          |
 data_type_id               | smallint                 |           |          |
 data_numeric               | numeric[]                |           |          |
 data_string                | character varying[]      |           |          |


e.g. of values in numeric type column

  data_numeric
--
  {2.0}
  {1.0}

Regards
Prabhjot



--
Adrian Klaver
adrian.kla...@aklaver.com




sum of numeric column

2020-12-09 Thread avi Singh
I have a table and want to sum a column whose type is numeric. How can I do
it? When I try the sum function I get this error:

ERROR:  function sum(numeric[]) does not exist

Can anyone please help me with this ?

            Column            |           Type           | Collation | Nullable | Default
----------------------------+--------------------------+-----------+----------+---------
 grid_id                    | bigint                   |           | not null |
 as_of_date                 | date                     |           | not null |
 cell_id                    | bigint                   |           | not null |
 last_event_timestamp_local | timestamp with time zone |           |          |
 last_event_id              | bigint                   |           |          |
 column_id                  | bigint                   |           |          |
 column_name                | character varying(50)    |           |          |
 row_id                     | bigint                   |           |          |
 data_type_id               | smallint                 |           |          |
 data_numeric               | numeric[]                |           |          |
 data_string                | character varying[]      |           |          |

e.g. of values in numeric type column

 data_numeric
--
 {2.0}
 {1.0}

Regards
Prabhjot


Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread electrotype

Agreed.


However, this isn't really the purview of JDBC - I'm doubting it does anything that would cause 
the order to be different than what is received, and the batch items are sent and results 
processed sequentially.


The main question is whether any batch items are inserting multiple records themselves - i.e., 
RETURNING * is producing multiple results.  Whatever order RETURNING * produces is what the driver 
will capture - but it isn't responsible for guaranteeing that the order of multiple inserted 
records going in matches what comes out.  PostgreSQL needs to make that claim.  I don't see where 
it does (I've sent an email to see if adding such a claim to the documentation is proper).  Done 
manually one can always do "WITH insert returning SELECT ORDER BY", but it doesn't seem workable 
for the driver to try and do that when adding the returning clause, which I presume is what is in 
scope here.


David J.


Thank you, it's appreciated! I'm sure this clarification would help other 
developers too.


Re: User input to queries

2020-12-09 Thread Paul Förster
Hi Rich,

> On 09. Dec, 2020, at 19:22, Rich Shepard  wrote:
> 
> Okay. I use mupdf to view the document and my search strings were 'prompt'
> and 'prompt command'. I didn't use '\prompt'.

\prompt is a psql special command, hence the backslash. Only psql knows that, 
the database does not, as with all backslash commands.

Cheers,
Paul



Re: User input to queries

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Paul Förster wrote:


but 12 has it:

postgres=# \prompt 'input: ' input
input: this is test input
postgres=# select version(), :'input';


Paul,

Okay. I use mupdf to view the document and my search strings were 'prompt'
and 'prompt command'. I didn't use '\prompt'.

Thanks again,

Rich




Re: User input to queries

2020-12-09 Thread Paul Förster
Hi Rich,

> On 09. Dec, 2020, at 19:10, Rich Shepard  wrote:
> 
> Looking at the postgres-12 doc I cannot find a command 'PROMPT' anywhere in
> it.

but 12 has it:

postgres=# \prompt 'input: ' input
input: this is test input
postgres=# select version(), :'input';
                                      version                                       |      ?column?
------------------------------------------------------------------------------------+--------------------
 PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit | this is test input
(1 row)

postgres=# \?
...
Variables
  \prompt [TEXT] NAME    prompt user to set internal variable
  \set [NAME [VALUE]]    set internal variable, or list all if no parameters
  \unset NAME            unset (delete) internal variable
...

Cheers,
Paul



Re: User input to queries

2020-12-09 Thread Rob Sargent



On 12/9/20 11:10 AM, Rich Shepard wrote:

On Wed, 9 Dec 2020, Paul Förster wrote:


maybe you're looking for this?
https://stackoverflow.com/a/7389606


Paul,

That looks very useful and I'll try the provided answers.

Looking at the postgres-12 doc I cannot find a command 'PROMPT' anywhere in
it.

Thanks,

Rich



https://www.postgresql.org/docs/12/app-psql.html




Re: User input to queries

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Paul Förster wrote:


maybe you're looking for this?
https://stackoverflow.com/a/7389606


Paul,

That looks very useful and I'll try the provided answers.

Looking at the postgres-12 doc I cannot find a command 'PROMPT' anywhere in
it.

Thanks,

Rich




Re: User input to queries

2020-12-09 Thread Paul Förster
Hi Rich,

> On 09. Dec, 2020, at 18:53, Rich Shepard  wrote:
> 
> My business tracking tool. Yes, the GUI will have text entry widgets for
> user input but I want to apply these queries using psql on the command line
> until I build the GUI.

maybe you're looking for this?

https://stackoverflow.com/a/7389606

Cheers,
Paul




Re: User input to queries

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Rob Sargent wrote:


Put the query in a file, set the desired name, then from psql
  \i filename
Edit pfname, repeat


Thanks, Rob.

Stay well,

Rich




Re: User input to queries [RESOLVED]

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Laurenz Albe wrote:


You probably need the \prompt psql command:
\prompt 'What is "p.lname"' p_lname
\prompt 'What is "p.fname"' p_fname
SELECT ... WHERE p.lname = :p_lname AND p.fname = :p_fname;


Laurenz,

Ah! I have not before encountered that command. Yes, this will do the job
and I'll make it a point to look at _all_ available psql commands.

Thanks very much.

Stay well,

Rich




Re: User input to queries

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Michael Lewis wrote:


What application is taking the user input and needs to include the
parameters in the query string?


Michael,

My business tracking tool. Yes, the GUI will have text entry widgets for
user input but I want to apply these queries using psql on the command line
until I build the GUI.

Rich




Re: Performance penalty during logical postgres replication

2020-12-09 Thread Victor Yegorov
On Wed, Dec 9, 2020 at 10:21, Lars Vonk wrote:

> We are doing a logical postgres replication from Postgres 11 to 12. Our
> database is around 700GB (8 CPUs, 32 GB).
> During the replication process, at some point, we see a huge performance
> penalty on a particular table. This table acts as a queue with lots of
> inserts and deletes happening throughout the day. For most of the time this
> table is empty, but during this performance penalty the number of rows in
> this table grows to 10,000 rows, and processing is not fast enough to empty
> this table. Main reason for this (as far as we see) is that the performance
> of the query for selecting the next row to process drops from < 10MS to
> 400MS. This eventually causes too much cpu load on the Primary and we have
> to cancel the replication process.
>
We already tried the initial load three times, and it consistently fails
> with the same "error". Last try was a per table approach and excluding this
> "queue" table.
> After cancelling the replication the query is fast again and the load on
> the Primary goes back to normal. We see that this happens when replicating
> large tables (> millions of rows). During this performance penalty the
> explain of the query selecting the next row from this table tells us it is
> doing a sequential scan (there is an index but it is not used).
>
> - What could cause this performance penalty?
> - Is this something other people experienced as well during the initial
> load of a logical replication with large tables?
> - We are now thinking of temporarily increasing the number of CPU's and
> RAM for the migration. Would this help in this case?
>

I've seen similar symptoms in cases with (a) home-made queues in the tables
and (b) long transactions.
Unfortunately, a queue requires frequent vacuuming to preserve a more or
less constant size of the queue and its indexes.
And long transactions prevent vacuum from cleaning up the queue.
The initial synchronization phase of logical replication is in fact such a
transaction.

I would recommend doing the following:
- avoid adding ALL tables to the publication
- instead, split the tables into batches in such a way that each batch's
initial processing takes limited time (say, 15-30 minutes at most)
- of course, this leaves the biggest tables alone; add those one by one to
the publication, preferably in a time slot with minimal load on the queue
- make sure to catch up on the queue processing and vacuum it between
batches
- on the receiving side, avoid creating indexes on the tables: create just
the necessary PK or UK, wait for the initial load to complete, and then add
all the rest

As for the queue, PGQ from Skytools uses a different approach to maintain
queue tables:
- once in a while (every 2 hours by default) processing is switched to a new
table; tab_1, tab_2, tab_3 are used in a round-robin
- after the switch, any remaining entries can be moved from the previous to
the live table (this shouldn't be necessary if the switch is done properly,
although it might be tricky in the presence of long transactions)
- the previous table is TRUNCATEd

In your case, you can do `VACUUM FULL` between replicating each batch of
tables.
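The batched approach could look roughly like this (publication, subscription, and table names are hypothetical):

```sql
-- publisher: start with a batch of small tables, not FOR ALL TABLES
CREATE PUBLICATION mig_pub FOR TABLE t_small_1, t_small_2;

-- after the subscriber catches up: add the next batch; big tables
-- one by one, preferably in a low-load window
ALTER PUBLICATION mig_pub ADD TABLE t_big_1;

-- subscriber: pick up the newly added tables
ALTER SUBSCRIPTION mig_sub REFRESH PUBLICATION;

-- between batches, compact the queue table on the publisher
VACUUM (FULL, ANALYZE) queue_table;
```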

-- 
Victor Yegorov


Re: User input to queries

2020-12-09 Thread Laurenz Albe
On Wed, 2020-12-09 at 09:03 -0800, Rich Shepard wrote:
> While I develop the application's GUI I use the database from the command
> line (psql). While some queries are self-contained others need user input.
> I've not found a search term that locates this information in the
> postgres-12 user manual and my web searches suggest that using '&' should
> work.
> 
> I tried the latter with this query but postgres didn't like it. (There may
> well be other errors in it so don't hesitate to point out my mistakes.)
> 
> --
> /* This query selects all activity information for a named person */
> 
> /* Specify columns */
> SELECT p.lname, p.fname, p.loc_nbr, p.job_title, p.direct_phone, p.active,
> o.org_name,
> l.loc_nbr, l.loc_name,
> a.act_date, a.act_type, a.notes, a.next_contact
> /* Specify tables. */ 
> FROM People AS p
>   JOIN Organizations AS o ON o.org_nbr = p.org_nbr
>   JOIN Locations AS l ON l.org_nbr = o.org_nbr and l.loc_nbr = p.loc_nbr
>   JOIN Activities AS a ON a.person_nbr = p.person_nbr
> /* Specify rows */ 
> WHERE p.lname =  AND p.fname = 
> -
> 
> Suggestions on what I should read to learn more about this subject are
> appreciated.

You probably need the \prompt psql command:

\prompt 'What is "p.lname"' p_lname
\prompt 'What is "p.fname"' p_fname
SELECT ... WHERE p.lname = :p_lname AND p.fname = :p_fname;

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: User input to queries

2020-12-09 Thread Rob Sargent


On 12/9/20 10:03 AM, Rich Shepard wrote:

While I develop the application's GUI I use the database from the command
line (psql). While some queries are self-contained others need user 
input.

I've not found a search term that locates this information in the
postgres-12 user manual and my web searches suggest that using '&' should
work.

I tried the latter with this query but postgres didn't like it. (There 
may

well be other errors in it so don't hesitate to point out my mistakes.)

--
/* This query selects all activity information for a named person */

/* Specify columns */
SELECT p.lname, p.fname, p.loc_nbr, p.job_title, p.direct_phone, 
p.active,

   o.org_name,
   l.loc_nbr, l.loc_name,
   a.act_date, a.act_type, a.notes, a.next_contact
/* Specify tables. */ FROM People AS p
 JOIN Organizations AS o ON o.org_nbr = p.org_nbr
 JOIN Locations AS l ON l.org_nbr = o.org_nbr and l.loc_nbr = 
p.loc_nbr

 JOIN Activities AS a ON a.person_nbr = p.person_nbr
/* Specify rows */ WHERE p.lname =  AND p.fname = 
-

Suggestions on what I should read to learn more about this subject are
appreciated.

Regards,

Rich




Put the query in a file, set the desired name, then from psql

   \i filename

Edit pfname, repeat
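One way to avoid editing the file each time is to parametrize it with psql variables; a sketch (file name and values hypothetical):

```sql
-- in psql: set the variables, then run the saved query file
\set p_lname 'Smith'
\set p_fname 'Jane'
\i activity_query.sql

-- inside activity_query.sql, reference them as quoted literals:
--   WHERE p.lname = :'p_lname' AND p.fname = :'p_fname'
```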





Re: User input to queries

2020-12-09 Thread Michael Lewis
What application is taking the user input and needs to include the
parameters in the query string?


Re: Performance penalty during logical postgres replication

2020-12-09 Thread Michael Lewis
On Wed, Dec 9, 2020 at 2:21 AM Lars Vonk  wrote:

> Hi,
>
> We are doing a logical postgres replication from Postgres 11 to 12. Our
> database is around 700GB (8 CPUs, 32 GB).
> During the replication process, at some point, we see a huge performance
> penalty on a particular table. This table acts as a queue with lots of
> inserts and deletes happening throughout the day. For most of the time this
> table is empty, but during this performance penalty the number of rows in
> this table grows to 10,000 rows, and processing is not fast enough to empty
> this table. Main reason for this (as far as we see) is that the performance
> of the query for selecting the next row to process drops from < 10MS to
> 400MS. This eventually causes too much cpu load on the Primary and we have
> to cancel the replication process.
> We already tried the initial load three times, and it consistently fails
> with the same "error". Last try was a per table approach and excluding this
> "queue" table.
> After cancelling the replication the query is fast again and the load on
> the Primary goes back to normal. We see that this happens when replicating
> large tables (> millions of rows). During this performance penalty the
> explain of the query selecting the next row from this table tells us it is
> doing a sequential scan (there is an index but it is not used).
>
> - What could cause this performance penalty?
> - Is this something other people experienced as well during the initial
> load of a logical replication with large tables?
> - We are now thinking of temporarily increasing the number of CPU's and
> RAM for the migration. Would this help in this case?
>
> Thanks in advance,
> Lars
>

I would guess that you are hitting bad plans (sequential scan instead of
index) because auto-vacuum is not picking up this table for vacuum/analyze
as often as needed. You could try a periodic 'vacuum analyze table' while
the logical replication is going, or just look at your autovacuum settings
and tune the system to be more aggressive with turning down scale factor
and cost delay. If the table is primarily "insert only" particularly with
this initial load of data, then the autovacuum may not pick it up at all.
That pain point is mitigated with PG13's new behavior to kick off
autovacuum/analyze based on inserts also.
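A sketch of per-table tuning along those lines (table name and values are illustrative only):

```sql
-- make autovacuum much more eager on the hot queue table
ALTER TABLE queue_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_analyze_scale_factor = 0.01
);

-- or run it manually while the initial sync is in flight
VACUUM (VERBOSE, ANALYZE) queue_table;
```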


Re: User input to queries

2020-12-09 Thread Rich Shepard

On Wed, 9 Dec 2020, Michael Lewis wrote:


Are you looking for this perhaps?
https://www.postgresql.org/docs/current/sql-prepare.html


Michael,

I don't think so. Reading the PREPARE doc page my understanding is that its
use is for statement execution optimization, not asking for user data input
for variables in the query statement. It's useful information, though not
for my immediate need.

Thanks and stay well,

Rich







Re: User input to queries

2020-12-09 Thread Michael Lewis
On Wed, Dec 9, 2020 at 10:04 AM Rich Shepard 
wrote:

> While I develop the application's GUI I use the database from the command
> line (psql). While some queries are self-contained others need user input.
> I've not found a search term that locates this information in the
> postgres-12 user manual and my web searches suggest that using '&' should
> work.
>
> I tried the latter with this query but postgres didn't like it. (There may
> well be other errors in it so don't hesitate to point out my mistakes.)
>
> --
> /* This query selects all activity information for a named person */
>
> /* Specify columns */
> SELECT p.lname, p.fname, p.loc_nbr, p.job_title, p.direct_phone, p.active,
> o.org_name,
> l.loc_nbr, l.loc_name,
> a.act_date, a.act_type, a.notes, a.next_contact
> /* Specify tables. */
> FROM People AS p
>   JOIN Organizations AS o ON o.org_nbr = p.org_nbr
>   JOIN Locations AS l ON l.org_nbr = o.org_nbr and l.loc_nbr =
> p.loc_nbr
>   JOIN Activities AS a ON a.person_nbr = p.person_nbr
> /* Specify rows */
> WHERE p.lname =  AND p.fname = 
> -
>
> Suggestions on what I should read to learn more about this subject are
> appreciated.
>
> Regards,
>
> Rich


Are you looking for this perhaps?

https://www.postgresql.org/docs/current/sql-prepare.html


User input to queries

2020-12-09 Thread Rich Shepard

While I develop the application's GUI I use the database from the command
line (psql). While some queries are self-contained others need user input.
I've not found a search term that locates this information in the
postgres-12 user manual and my web searches suggest that using '&' should
work.

I tried the latter with this query but postgres didn't like it. (There may
well be other errors in it so don't hesitate to point out my mistakes.)

--
/* This query selects all activity information for a named person */

/* Specify columns */
SELECT p.lname, p.fname, p.loc_nbr, p.job_title, p.direct_phone, p.active,
   o.org_name,
   l.loc_nbr, l.loc_name,
   a.act_date, a.act_type, a.notes, a.next_contact
/* Specify tables. */ 
FROM People AS p

 JOIN Organizations AS o ON o.org_nbr = p.org_nbr
 JOIN Locations AS l ON l.org_nbr = o.org_nbr and l.loc_nbr = p.loc_nbr
 JOIN Activities AS a ON a.person_nbr = p.person_nbr
/* Specify rows */ 
WHERE p.lname =  AND p.fname = 

-

Suggestions on what I should read to learn more about this subject are
appreciated.

Regards,

Rich




Re: Potential BRIN Index Corruption

2020-12-09 Thread Tomas Vondra
On 12/9/20 12:07 AM, Huan Ruan wrote:
> HI Alvaro
> 
> Unfortunately those records were written a while ago and we no longer keep
> their WAL logs. Thanks for your help anyway.
> 

Can you estimate when roughly the records were written? E.g. by using a
rough estimate of WAL or XIDs generated per day, or something like that.
Maybe there was some I/O issue in that time period and a couple writes
got lost, or something like that.

Are there any other corrupted indexes on the table?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Execution order of CTEs / set_config and current_setting in the same query

2020-12-09 Thread Wolfgang Walther

Tom Lane:

I think you're on fairly shaky ground here.  Generally speaking, a CTE
will be executed/read only when the parent query needs the next row from
it.  Your examples ensure that the CTE is read before the parent query's
results are computed; but in realistic usage you'd presumably be joining
the CTE with some other table(s), and then the execution order is going
to be a lot harder to predict.  This approach is also going to
fundamentally not work for settings that need to apply during planning
of the query (which, notably, probably includes "role").


Ok, thanks for confirming that.


You'd be far better off to investigate ways to send SET LOCAL first,
without incurring a separate network round trip for that.  If you're
using simple query mode that's easy, you can just do

res = PQexec("SET LOCAL ... ; ");

In extended query mode you can't get away with that, but you might be able
to issue the SET LOCAL without immediately waiting for the result.


Yes, that's what we did so far. We switched to set_config to parametrize 
the query.


Is there any way to not wait for a SELECT? I don't care about the 
resultset, so I need something like PERFORM but for SQL, not plpgsql, I 
think?





Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread David G. Johnston
On Wed, Dec 9, 2020 at 8:20 AM electrotype  wrote:

> So I'm curious. Why does order matter ?
>
> Dave Cramer
> www.postgres.rocks
>
>
> When you have to save multiple new entities with subentities.
>
> You first save all the parent entities in a single SQL batch insert, you
> get the generated ids, then insert all the subentities in another single
> SQL batch insert. To know which "parent id" to use for a given subentity of
> the second query, you need a way to associate a generated id with the
> correct parent entity. The order of the parents in their batch, and the
> order of the generated ids, is the only straightforward way.
>
> I know all this could be made into a single SQL query, without having to
> associate the generated ids to the parents manually. But sometimes you have
> to fight *really* hard against your framework or JDBC itself to write such
> a more complex query, where two batch inserts are very natural.
>
Agreed.

However, this isn't really the purview of JDBC - I'm doubting it does
anything that would cause the order to be different than what is received,
and the batch items are sent and results processed sequentially.

The main question is whether any batch items are inserting multiple records
themselves - i.e., RETURNING * is producing multiple results.  Whatever
order RETURNING * produces is what the driver will capture - but it isn't
responsible for guaranteeing that the order of multiple inserted records
going in matches what comes out.  PostgreSQL needs to make that claim.  I
don't see where it does (I've sent an email to see if adding such a claim
to the documentation is proper).  Done manually one can always do "WITH
insert returning SELECT ORDER BY", but it doesn't seem workable for the
driver to try and do that when adding the returning clause, which I presume
is what is in scope here.
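The manual workaround mentioned above might look like this (table and column names are hypothetical; the join-back assumes the inserted values are unique):

```sql
-- carry an explicit input ordinal through the insert, then sort on it
WITH data(ord, name) AS (
    VALUES (1, 'alice'), (2, 'bob'), (3, 'carol')
), ins AS (
    INSERT INTO parent (name)
    SELECT name FROM data
    RETURNING id, name
)
SELECT ins.id
FROM ins
JOIN data USING (name)
ORDER BY data.ord;
```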

David J.


Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread Dave Cramer
On Wed, 9 Dec 2020 at 10:21, electrotype  wrote:

> So I'm curious. Why does order matter ?
>
> Dave Cramer
> www.postgres.rocks
>
>
> When you have to save multiple new entities with subentities.
>
> You first save all the parent entities in a single SQL batch insert, you
> get the generated ids, then insert all the subentities in another single
> SQL batch insert. To know which "parent id" to use for a given subentity of
> the second query, you need a way to associate a generated id with the
> correct parent entity. The order of the parents in their batch, and the
> order of the generated ids, is the only straightforward way.
>
> I know all this could be made into a single SQL query, without having to
> associate the generated ids to the parents manually. But sometimes you have
> to fight *really* hard against your framework or JDBC itself to write such
> a more complex query, where two batch inserts are very natural.
>


Fair enough; however, the spec does not say anything about the order. In
fact it doesn't even say which keys will be returned.

I don't think we can make any guarantees here.

Dave


Re: Tools showing table partitions as tables in listings of tables

2020-12-09 Thread Niels Jespersen
>> A small irritation point is that some tools decide that partitions 
>> under a table are to be shown in a list of tables, sometimes drowning the 
>> main table in a sea of partitions.
>
>While this doesn't answer your question directly, when I had this problem, 
>I simply moved the partitions to a separate schema.
>So, let's say I have a table plans in schema public. All its partitions are in 
>the "plans" schema.
>Of course the naming can be done differently, but the thing is that you can do it, 
>and it doesn't cause any issues with queries - as long as you use the main table 
>and not the partitions directly.
>
>Best regards,
>
>depesz
>
I had not thought of that. Nice trick. I think we have a standard security 
setup that revolves around schemas, which may suffer more than this helps. I 
will think about that some more. 

Thank you for the tip. 

Regards Niels
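For reference, the schema move described above can be sketched as (schema and partition names are hypothetical):

```sql
CREATE SCHEMA plans;
-- relocate each partition; queries against public.plans are unaffected
ALTER TABLE public.plans_2020_12 SET SCHEMA plans;
ALTER TABLE public.plans_2021_01 SET SCHEMA plans;
```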


Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread electrotype

So I'm curious. Why does order matter ?

Dave Cramer
www.postgres.rocks



When you have to save multiple new entities with subentities.

You first save all the parent entities in a single SQL batch insert, you get the generated ids, then 
insert all the subentities in another single SQL batch insert. To know which "parent id" to use for 
a given subentity of the second query, you need a way to associate a generated id with the correct 
parent entity. The order of the parents in their batch, and the order of the generated ids, is the 
only straightforward way.


I know all this could be made into a single SQL query, without having to associate the generated ids 
to the parents manually. But sometimes you have to fight *really* hard against your framework or JDBC 
itself to write such a more complex query, where two batch inserts are very natural.




Re: Execution order of CTEs / set_config and current_setting in the same query

2020-12-09 Thread Tom Lane
Wolfgang Walther  writes:
> Now, we are wondering: To reduce overhead, can we move the set_config 
> calls to a CTE as part of the main query? The values would need to be 
> available with current_setting(...) in the remaining query.

I think you're on fairly shaky ground here.  Generally speaking, a CTE
will be executed/read only when the parent query needs the next row from
it.  Your examples ensure that the CTE is read before the parent query's
results are computed; but in realistic usage you'd presumably be joining
the CTE with some other table(s), and then the execution order is going
to be a lot harder to predict.  This approach is also going to
fundamentally not work for settings that need to apply during planning
of the query (which, notably, probably includes "role").

You'd be far better off to investigate ways to send SET LOCAL first,
without incurring a separate network round trip for that.  If you're
using simple query mode that's easy, you can just do

res = PQexec("SET LOCAL ... ; ");

In extended query mode you can't get away with that, but you might be able
to issue the SET LOCAL without immediately waiting for the result.
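In simple query mode the combined message might look like this (GUC name is illustrative; SET LOCAL needs an explicit transaction to have any effect):

```sql
-- sent as one PQexec() string, avoiding an extra round trip
BEGIN;
SET LOCAL request.headers.x = 'y';
SELECT current_setting('request.headers.x', true);
COMMIT;
```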

regards, tom lane




Re: postgres-10 with FIPS

2020-12-09 Thread Joe Conway
On 12/9/20 4:51 AM, Aravindhan Krishnan wrote:
> The paid version I had mentioned was the paid OS (Ubuntu) for FIPS
> compliance. I understand that postgres as such is completely available as
> open source.
> 
> Since we can't get the paid version of the OS to support FIPS compliance, the
> idea was to build postgres against a FIPS-compliant SSL/crypto 1.0.2g and get
> it to work on Ubuntu 20.04, for which I was interested in the configure option.

Actual FIPS compliance is held by the distributor of the SSL library you use.

While you can, for example, configure a CentOS 7 system to be in "FIPS mode", it
is still not "FIPS compliant" if you didn't get the bits (the SSL library
itself) from Red Hat (which you did not if you are running CentOS).

The situation is the same with Ubuntu, except as far as I am aware you cannot
even get your hands on the SSL library for "FIPS mode" from Ubuntu unless you
pay them, unlike CentOS.

So no matter what you do with Postgres itself, you will not be FIPS compliant
without paying RHEL/Ubuntu/SUSE or getting your stack certified yourself (which
is not likely something you will want to do and would cost you more anyway).

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Execution order of CTEs / set_config and current_setting in the same query

2020-12-09 Thread Wolfgang Walther

Hi,

with PostgREST [1] we are translating HTTP requests into SQL queries. 
For each request we are setting some metadata (headers, ...) as GUCs.


We used to do it like this:
SET LOCAL request.headers.x = 'y';
...

Since this is user-provided data, we want to use parametrized/prepared 
statements. This is not possible with SET, so we switched to:


SELECT set_config($1, $2, true), ...;

Both of these queries precede our main query. The SELECT set_config 
is a bit slower than the SET LOCAL, probably because of the extra overhead 
of the SELECT.


Now, we are wondering: To reduce overhead, can we move the set_config 
calls to a CTE as part of the main query? The values would need to be 
available with current_setting(...) in the remaining query.


Of course we would need to ensure execution order, so that this CTE will 
always be fully executed, before all the other parts of the query.


Is this possible to do?

We did some basic testing that seemed to be successful:

WITH set AS (
  SELECT set_config('pgrst.hello', 'world', true)
)
SELECT current_setting('pgrst.hello')
FROM set;

or

WITH set AS (
  SELECT set_config('pgrst.hello', 'world', true)
),
main AS (
  SELECT
    current_setting('pgrst.hello') AS hello,
    ...
  FROM set, ...
)
SELECT
  current_setting('pgrst.hello'),
  main.hello,
  ...
FROM set, main, ...;


Queries like this seem to have set the GUC correctly. But is this 
guaranteed? What would need to be done to guarantee it?


I have a feeling that even though this works for those simple cases, 
there is some risk involved...
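
For example, the case we are less sure about is when the CTE is joined 
with a real table. A sketch (other_table is just a stand-in name):

WITH set AS (
  SELECT set_config('pgrst.hello', 'world', true)
)
SELECT current_setting('pgrst.hello', true)
FROM other_table
LEFT JOIN set ON true;

Here it seems possible that rows of other_table are scanned, and 
current_setting() evaluated, before the CTE has run at all (the second 
argument makes current_setting return NULL instead of erroring in that 
case).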


Additional question: If this can be guaranteed - what about using 
set_config('role', 'xxx', true) in the same way? Putting this into those 
examples above and checking with CURRENT_USER seems to work as well. How 
likely is this to lead to problems with privileges / permissions?


Any input/insight would be helpful.

Thanks

Wolfgang

[1]: https://postgrest.org




Re: Tools showing table partitions as tables in listings of tables

2020-12-09 Thread hubert depesz lubaczewski
On Wed, Dec 09, 2020 at 12:29:43PM +, Niels Jespersen wrote:
> A small irritation point is that some tools decide that partitions under a 
> table are to be shown in a list of tables, sometimes
> drowning the main table in a sea of partitions.

While this doesn't answer your question directly: when I had this
problem, I simply moved the partitions to a separate schema.

So, let's say I have a table "plans" in schema "public". All its
partitions are in the "plans" schema.

Of course the naming can be done differently, but the point is that you
can do it, and it doesn't cause any issues with queries - as long as you
use the main table and not the partitions directly.
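
A minimal sketch of that layout (names are just examples):

CREATE SCHEMA plans;

CREATE TABLE public.plans (
    id     bigint NOT NULL,
    period date   NOT NULL
) PARTITION BY RANGE (period);

CREATE TABLE plans.plans_2020
    PARTITION OF public.plans
    FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');

Queries keep going to public.plans; only the table listing of "public"
stays clean.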

Best regards,

depesz





Tools showing table partitions as tables in listings of tables

2020-12-09 Thread Niels Jespersen
Hello all

We are very happy with the ongoing work on partitioning in Postgres 11+. We use 
it in a number of cases.

A small irritation point is that some tools decide that partitions under a 
table are to be shown in a list of tables, sometimes drowning the main table in 
a sea of partitions.

pgAdmin shows partitions below the main table, and so does DataGrip. Other tools 
do not. The ones I have heard about are the SAS System and Enterprise 
Architect, both connecting through ODBC. I am aware that the issue is really 
that these tools and/or the ODBC driver issue queries against the catalog, and 
these queries are naive with respect to table partitions.

Any ideas for working around this? Naming partitions in a special manner, or 
something else? It's not the end of the world, but it is quite irritating for 
some users.

Regards

Niels Jespersen




Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread Dave Cramer
So I'm curious: why does order matter?

Dave Cramer
www.postgres.rocks


On Wed, 9 Dec 2020 at 03:15, electrotype  wrote:

> I can't see how they could possibly be out of order.
>
> Thanks, that's what I think too. But, to be honest, I'd really like to see
> this written in some documentation! In some cases, this small detail can be
> quite important.
>
>


Re: postgres-10 with FIPS

2020-12-09 Thread Aravindhan Krishnan
Hi Magnus,

The paid version I mentioned was the paid OS (Ubuntu) for FIPS
compliance. I understand that Postgres itself is completely available as
open source.

Since we can't get the paid version of the OS to support FIPS compliance,
the idea was to build Postgres against a FIPS-compliant SSL/crypto 1.0.2g
and get it to work on Ubuntu 20.04, which is why I was interested in the
configure option.

Thanks and Regards
Aravindhan K

On Wed, 9 Dec, 2020, 3:07 pm Magnus Hagander,  wrote:

>
>
> On Wed, Dec 9, 2020 at 5:30 AM Aravindhan Krishnan <
> aravindhan...@gmail.com> wrote:
>
>> Hi Folks,
>>
>> Thanks for the responses. Since the underlying knob flip is a paid
>> version and we are a SaaS based service provider, this might not align well
>> with our requirement and so wanted to build postgres-10 against FIPS
>> compliant ssl/crypto. The "pg_config" is of great help in order to
>> understand the build time configure options.
>>
>
> Huh? There is *nothing* in PostgreSQL that is a paid version (there is no
> such thing, simply), nor in any of our open source packaging.
>
> If you are specifically looking at the Debian or Ubuntu packages, you can
> find the full packaging information in the salsa repositories at
> https://salsa.debian.org/postgresql/postgresql. It will have all the
> details you need.
>
> --
>  Magnus Hagander
>  Me: https://www.hagander.net/ 
>  Work: https://www.redpill-linpro.com/ 
>


Re: postgres-10 with FIPS

2020-12-09 Thread Magnus Hagander
On Wed, Dec 9, 2020 at 5:30 AM Aravindhan Krishnan 
wrote:

> Hi Folks,
>
> Thanks for the responses. Since the underlying knob flip is a paid version
> and we are a SaaS based service provider, this might not align well with
> our requirement and so wanted to build postgres-10 against FIPS compliant
> ssl/crypto. The "pg_config" is of great help in order to understand the
> build time configure options.
>

Huh? There is *nothing* in PostgreSQL that is a paid version (there is no
such thing, simply), nor in any of our open source packaging.

If you are specifically looking at the Debian or Ubuntu packages, you can
find the full packaging information in the salsa repositories at
https://salsa.debian.org/postgresql/postgresql. It will have all the
details you need.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Performance penalty during logical postgres replication

2020-12-09 Thread Lars Vonk
Hi,

We are doing logical Postgres replication from Postgres 11 to 12. Our
database is around 700 GB (8 CPUs, 32 GB RAM).
During the replication process, at some point, we see a huge performance
penalty on a particular table. This table acts as a queue, with lots of
inserts and deletes happening throughout the day. Most of the time this
table is empty, but during this performance penalty the number of rows
grows to 10,000, and processing is not fast enough to empty the table.
The main reason for this (as far as we can see) is that the performance
of the query for selecting the next row to process drops from < 10 ms to
400 ms. This eventually causes too much CPU load on the primary and we
have to cancel the replication process.
We have already tried the initial load three times, and it consistently
fails with the same "error". The last try was a per-table approach,
excluding this "queue" table.
After cancelling the replication, the query is fast again and the load on
the primary goes back to normal. We see that this happens when replicating
large tables (millions of rows). During the performance penalty, the
explain of the query selecting the next row from this table tells us it is
doing a sequential scan (there is an index, but it is not used).
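
For reference, a simplified sketch of the kind of "select next row" query
involved (table and column names are made up for illustration):

SELECT id, payload
FROM job_queue
WHERE processed_at IS NULL
ORDER BY created_at
LIMIT 1;

During the penalty, the plan for a query like this switches to a
sequential scan even though a matching index exists.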

- What could cause this performance penalty?
- Is this something other people have experienced as well during the
initial load of a logical replication with large tables?
- We are now thinking of temporarily increasing the number of CPUs and the
amount of RAM for the migration. Would this help in this case?

Thanks in advance,
Lars


Re: JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-09 Thread electrotype

I can't see how they could possibly be out of order.

Thanks, that's what I think too. But, to be honest, I'd really like to see this written in some 
documentation! In some cases, this small detail can be quite important.