Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Tim Cross
On Tue, 18 Jun 2019 at 10:39, Tom Lane  wrote:

> Tim Cross  writes:
> > On Tue, 18 Jun 2019 at 09:34, Ken Tanzer  wrote:
> >> Thanks Adrian, though I wasn't really seeking tips for column names.  I
> >> was instead trying to understand whether this particular tab expansion
> was
> >> intentional and considered useful, and if so what that usefulness was,
> >> because it's rather escaping me!
>
> > Have to say, I find that behaviour unusual as well.
>
> I don't think it's intentional.  A look into tab-complete.c shows that it
> makes no attempt to offer completions beyond the "=" part of the syntax;
> so there's room for improvement there.  But then what is producing the
> "DEFAULT" completion?  After looking around a bit, I think it's
> accidentally matching the pattern for a GUC "set" command:
>
> else if (TailMatches("SET", MatchAny, "TO|="))
> {
> /* special cased code for individual GUCs */
> ...
> else
> COMPLETE_WITH("DEFAULT");
> }
>
> So perhaps that needs to look more like this other place where somebody
> already noticed the conflict against UPDATE:
>
> else if (TailMatches("SET|RESET") && !TailMatches("UPDATE", MatchAny,
> "SET"))
> COMPLETE_WITH_QUERY(Query_for_list_of_set_vars);
>
> More generally, though, I'm inclined to think that offering DEFAULT
> and nothing else, which is what this code does if it doesn't recognize
> the "GUC name", is just ridiculous.  If the word after SET is not a known
> GUC name then we probably have misconstrued the context, as indeed is
> happening in your example; and in any case DEFAULT is about the least
> likely thing for somebody to be trying to enter here.  (They'd probably
> have selected RESET not SET if they were trying to do that.)
>
> regards, tom lane
>


Given that identifying legitimate candidates after an '=' in an UPDATE
statement would require a full-blown SQL parser, my suggestion would be to
refine the rules so that no completion is attempted after the '='. I would
rather have tab do nothing than have it replace what I've already typed
with 'default'.

-- 
regards,

Tim

--
Tim Cross


Re: Inserts restricted to a trigger

2019-06-17 Thread raf
Adrian Klaver wrote:

> On 6/17/19 4:54 PM, Miles Elam wrote:
> > Is there a way to restrict direct access to a table for inserts but
> > allow a trigger on another table to perform an insert for that user?
> > 
> > I'm trying to implement an audit table without allowing user tampering
> > with the audit information.
> 
> Would the below not work?:
> CREATE the table as superuser or other privileged user
> Have trigger function run as above user(use SECURITY DEFINER)

and make sure not to give any other users insert/update/delete
permissions on the audit table.
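
For what it's worth, here is a minimal sketch of that setup (the table,
trigger, and column names are hypothetical, and the DDL is assumed to be
run by a privileged user):

-- Ordinary users get no privileges on the audit table, so only the
-- SECURITY DEFINER trigger function can insert into it.
CREATE TABLE audit_log (
    table_name text,
    operation  text,
    changed_by text,
    changed_at timestamptz DEFAULT now()
);
REVOKE ALL ON audit_log FROM PUBLIC;

-- Runs with the privileges of the function owner, not of the caller.
CREATE FUNCTION log_change() RETURNS trigger
LANGUAGE plpgsql SECURITY DEFINER AS $$
BEGIN
    INSERT INTO audit_log (table_name, operation, changed_by)
    VALUES (TG_TABLE_NAME, TG_OP, session_user);
    RETURN NULL;  -- AFTER trigger, so the return value is ignored
END;
$$;

CREATE TRIGGER audit_main
AFTER INSERT OR UPDATE OR DELETE ON main_table
FOR EACH ROW EXECUTE PROCEDURE log_change();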

> > Thanks in advance,
> > 
> > Miles Elam
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Tom Lane
Tim Cross  writes:
> On Tue, 18 Jun 2019 at 09:34, Ken Tanzer  wrote:
>> Thanks Adrian, though I wasn't really seeking tips for column names.  I
>> was instead trying to understand whether this particular tab expansion was
>> intentional and considered useful, and if so what that usefulness was,
>> because it's rather escaping me!

> Have to say, I find that behaviour unusual as well.

I don't think it's intentional.  A look into tab-complete.c shows that it
makes no attempt to offer completions beyond the "=" part of the syntax;
so there's room for improvement there.  But then what is producing the
"DEFAULT" completion?  After looking around a bit, I think it's
accidentally matching the pattern for a GUC "set" command:

else if (TailMatches("SET", MatchAny, "TO|="))
{
/* special cased code for individual GUCs */
...
else
COMPLETE_WITH("DEFAULT");
}

So perhaps that needs to look more like this other place where somebody
already noticed the conflict against UPDATE:

else if (TailMatches("SET|RESET") && !TailMatches("UPDATE", MatchAny, 
"SET"))
COMPLETE_WITH_QUERY(Query_for_list_of_set_vars);

More generally, though, I'm inclined to think that offering DEFAULT
and nothing else, which is what this code does if it doesn't recognize
the "GUC name", is just ridiculous.  If the word after SET is not a known
GUC name then we probably have misconstrued the context, as indeed is
happening in your example; and in any case DEFAULT is about the least
likely thing for somebody to be trying to enter here.  (They'd probably
have selected RESET not SET if they were trying to do that.)

regards, tom lane




Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Jeff Janes
On Mon, Jun 17, 2019 at 8:23 PM Adrian Klaver 
wrote:

> On 6/17/19 4:33 PM, Ken Tanzer wrote:
> >
> > Thanks Adrian, though I wasn't really seeking tips for column names.  I
> > was instead trying to understand whether this particular tab expansion
> > was intentional and considered useful, and if so what that usefulness
>
> If I am following the below correctly, it is intentional:
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/bin/psql/tab-complete.c;h=68a2ba27aec22302625c5481a8f74cf866f4dc23;hb=d22ca701a39dfd03cdfa1ca238370d34f4bc4ac4
>
> Line 2888
>

But that portion doesn't offer the DEFAULT completion.  It stops at
offering '=', and goes no further.

It is line 2859 that accidentally offers to complete DEFAULT, and that
is not part of the UPDATE-specific code.

Cheers,

Jeff


Re: Inserts restricted to a trigger

2019-06-17 Thread Adrian Klaver

On 6/17/19 4:54 PM, Miles Elam wrote:
Is there a way to restrict direct access to a table for inserts but
allow a trigger on another table to perform an insert for that user?


I'm trying to implement an audit table without allowing user tampering 
with the audit information.


Would the below not work?:
CREATE the table as superuser or other privileged user
Have trigger function run as above user(use SECURITY DEFINER)




Thanks in advance,

Miles Elam



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Jeff Janes
On Mon, Jun 17, 2019 at 6:03 PM Ken Tanzer  wrote:

> Hi.  If I'm using psql, and type for example:
>
> UPDATE my_table SET my_field
> (with a trailing space)
>
> and then hit Tab, it will expand that to an =, and then another tab will
> expand to DEFAULT, so that I then have:
>
> UPDATE my_table SET my_field = DEFAULT
>
> If I'm tabbing out in this situation, it's going to be after the =, and I
> will have typed "myreal"[tab] in the vain hope that psql will complete that
> to "myreallylongfieldname," but instead it gets replaced with DEFAULT.
>

Yeah, it is especially annoying to delete what I actually typed to replace
it with something else.  I've been irked by that before.  I think the
general behavior of replacing something already typed with (what it
believes to be) the only proper completion is part of the underlying
readline/libedit library, not something psql goes out of its way to do.


> So I'm curious if this is intended behavior, if it's considered useful,
> and/or if it's a placeholder for something in the future that will be
> useful.  Also, is this new, as I've never noticed it before?
>

The tab completion doesn't have a SQL parser/analyzer; it is just driven off
general rules that look at the preceding N words.  In this case, it is
hitting the rule for "SET anything TO", which is intended to catch the
setting of parameters; it is only accidentally hitting on the SET part of
UPDATE statements.

This goes back at least to 9.3.

We could improve it by making a higher priority rule which looks back a few
more words to:

UPDATE <table> SET <column> TO

But what would we complete with?  Any expression can go there, and we can't
make it tab complete any arbitrary expression, like function names or
literals.  If we tab complete, but only with a restricted set of choices,
that could be interpreted as misleadingly suggesting no other things are
possible.  (Of course the current accidental behavior is also misleading,
then)

If we are willing to offer an incomplete list of suggestions, what would
they be?  NULL, DEFAULT, '(' and all the column names present in <table>,
with appropriate quotes where necessary?  But what to do when <table>
doesn't actually exist as the name of a table?

Or, we could have it implement the more precise higher priority rule, and
have it just refuse to offer any suggestions, but at least not delete what
is already there.

Cheers,

Jeff



Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Adrian Klaver

On 6/17/19 4:33 PM, Ken Tanzer wrote:
On Mon, Jun 17, 2019 at 4:24 PM Adrian Klaver wrote:


On 6/17/19 3:03 PM, Ken Tanzer wrote:
 >
 > So I'm curious if this is intended behavior, if it's considered
useful,
 > and/or if it's a placeholder for something in the future that
will be
 > useful.  Also, is this new, as I've never noticed it before?

Not sure how long that has been around.

My cheat for dealing with many/long column names is:


Thanks Adrian, though I wasn't really seeking tips for column names.  I 
was instead trying to understand whether this particular tab expansion 
was intentional and considered useful, and if so what that usefulness 


If I am following the below correctly, it is intentional:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/bin/psql/tab-complete.c;h=68a2ba27aec22302625c5481a8f74cf866f4dc23;hb=d22ca701a39dfd03cdfa1ca238370d34f4bc4ac4

Line 2888

Useful? That is in the eye of the beholder :)


was, because it's rather escaping me!

Cheers,
Ken






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Tim Cross
On Tue, 18 Jun 2019 at 09:34, Ken Tanzer  wrote:

> On Mon, Jun 17, 2019 at 4:24 PM Adrian Klaver 
> wrote:
>
>> On 6/17/19 3:03 PM, Ken Tanzer wrote:
>> >
>> > So I'm curious if this is intended behavior, if it's considered useful,
>> > and/or if it's a placeholder for something in the future that will be
>> > useful.  Also, is this new, as I've never noticed it before?
>>
>> Not sure how long that has been around.
>>
>> My cheat for dealing with many/long column names is:
>>
>>
> Thanks Adrian, though I wasn't really seeking tips for column names.  I
> was instead trying to understand whether this particular tab expansion was
> intentional and considered useful, and if so what that usefulness was,
> because it's rather escaping me!
>
> Cheers,
> Ken
>
>
>
Have to say, I find that behaviour unusual as well. I would expect that once
I've typed some characters, the completion mechanism would attempt to
complete based on the characters I've typed and, if it cannot, do nothing.
Instead, what happens is that what I have typed is replaced by 'default'.
For example, if I type

update my_table set my_col = other_t

and hit tab, 'other_t' is replaced by 'default', which is of no use. What I
would expect is for tab either to complete (possibly only partially if there
are multiple candidates) from candidates which start with 'other_t', e.g.
'other_table', or to do nothing, i.e. no completion candidates found,
telling me there is no match based on the prefix I've typed.


-- 
regards,

Tim

--
Tim Cross


Inserts restricted to a trigger

2019-06-17 Thread Miles Elam
Is there a way to restrict direct access to a table for inserts but allow
a trigger on another table to perform an insert for that user?

I'm trying to implement an audit table without allowing user tampering with
the audit information.


Thanks in advance,

Miles Elam


Re: perf tuning for 28 cores and 252GB RAM

2019-06-17 Thread Jeff Janes
On Mon, Jun 17, 2019 at 4:51 PM Michael Curry  wrote:

> I am using a Postgres instance in an HPC cluster, where they have
> generously given me an entire node. This means I have 28 cores and 252GB
> RAM. I have to assume that the very conservative default settings for
> things like buffers and max working memory are too small here.
>
> We have about 20 billion rows in a single large table.
>

What is that in bytes?  Do you only have that one table?


> The database is not intended to run an application but rather to allow a
> few individuals to do data analysis, so we can guarantee the number of
> concurrent queries will be small, and that nothing else will need to use
> the server. Creating multiple different indices on a few subsets of the
> columns will be needed to support the kinds of queries we want.
>
> What settings should be changed to maximize performance?
>

With 28 cores for only a few users, parallelization will probably be
important.  That feature is fairly new to PostgreSQL and rapidly improving
from version to version, so you will want to use the latest version you can
(v11).  And then increase the values for max_worker_processes,
max_parallel_maintenance_workers, max_parallel_workers_per_gather, and
max_parallel_workers.  With the potential for so many parallel workers
running at once, you wouldn't want to go overboard on work_mem, maybe 2GB.
If you don't think all allowed users will be running large queries at the
same time (because they are mostly thinking about what query to run, or thinking
about the results of the last one they ran, rather than actually running
queries), then maybe higher than that.

If your entire database can comfortably fit in RAM, I would make
shared_buffers large enough to hold the entire database.  If not, I would
set the value small (say, 8GB) and let the OS do the heavy lifting of
deciding what to keep in cache.  If you go with the first option, you
probably want to use pg_prewarm after each restart to get the data into
cache as fast as you can, rather than letting it get loaded in naturally as
you run queries.  Also, you would probably want to set random_page_cost and
seq_page_cost quite low, like maybe 0.1 and 0.05.
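
Translated into settings, a possible starting point might look like the
following (the values are illustrative assumptions for this machine, not
measured recommendations, and the table name in pg_prewarm is hypothetical):

-- Parallelism and memory; shared_buffers and max_worker_processes only
-- take effect after a server restart.
ALTER SYSTEM SET max_worker_processes = 28;
ALTER SYSTEM SET max_parallel_workers = 28;
ALTER SYSTEM SET max_parallel_workers_per_gather = 8;
ALTER SYSTEM SET max_parallel_maintenance_workers = 8;
ALTER SYSTEM SET work_mem = '2GB';
ALTER SYSTEM SET shared_buffers = '8GB';  -- small; let the OS cache do the rest
ALTER SYSTEM SET random_page_cost = 0.1;
ALTER SYSTEM SET seq_page_cost = 0.05;
SELECT pg_reload_conf();

-- If instead the whole database fits in RAM and shared_buffers is made
-- large, warm the cache after each restart:
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('my_big_table');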

You haven't described what kind of IO capacity and setup you have; knowing
that could suggest other changes to make.  Also, seeing the results of
`explain (analyze, buffers)`, especially with track_io_timing turned on,
for some actual queries could provide good insight for what else might need
changing.

Cheers,

Jeff


Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Ken Tanzer
On Mon, Jun 17, 2019 at 4:24 PM Adrian Klaver 
wrote:

> On 6/17/19 3:03 PM, Ken Tanzer wrote:
> >
> > So I'm curious if this is intended behavior, if it's considered useful,
> > and/or if it's a placeholder for something in the future that will be
> > useful.  Also, is this new, as I've never noticed it before?
>
> Not sure how long that has been around.
>
> My cheat for dealing with many/long column names is:
>
>
Thanks Adrian, though I wasn't really seeking tips for column names.  I was
instead trying to understand whether this particular tab expansion was
intentional and considered useful, and if so what that usefulness was,
because it's rather escaping me!

Cheers,
Ken


-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the
discussion.


Re: psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Adrian Klaver

On 6/17/19 3:03 PM, Ken Tanzer wrote:

Hi.  If I'm using psql, and type for example:

UPDATE my_table SET my_field
(with a trailing space)

and then hit Tab, it will expand that to an =, and then another tab will 
expand to DEFAULT, so that I then have:


UPDATE my_table SET my_field = DEFAULT

If I'm tabbing out in this situation, it's going to be after the =, and 
I will have typed "myreal"[tab] in the vain hope that psql will complete 
that to "myreallylongfieldname," but instead it gets replaced with DEFAULT.


So I'm curious if this is intended behavior, if it's considered useful, 
and/or if it's a placeholder for something in the future that will be 
useful.  Also, is this new, as I've never noticed it before?


Not sure how long that has been around.

My cheat for dealing with many/long column names is:

test=# \d up_test
  Table "public.up_test"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 id | integer |   |  |
 col1   | boolean |   |  |
 col2   | integer |   |  |



test=# \pset format unaligned
Output format is unaligned.
test=# \pset fieldsep ','
Field separator is ",".

select * from up_test limit 0;
id,col1,col2

Cut and paste above.

test=# \pset fieldsep '|'
Field separator is "|".

test=# \pset format 'aligned'
Output format is aligned.




Thanks in advance,
Ken

p.s.,  Version 9.6.13

--
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client
ken.tan...@agency-software.org 
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the
discussion.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Adrian Klaver

On 6/17/19 12:01 PM, Leandro Guimarães wrote:
I've installed all dependencies, but when I try to "make install" in the
pg_bulkload folder I get some errors like this:


In file included from pgut/pgut.h:24:0,
                  from pgut/pgut-fe.h:13,
                  from pg_bulkload.c:17:
/usr/include/postgresql/internal/pqexpbuffer.h:149:13: error: old-style 
parameter declarations in prototyped function definition
  extern void printfPQExpBuffer(PQExpBuffer str, const char *fmt,...) 
pg_attribute_printf(2, 3);

              ^
pg_bulkload.c:743:1: error: expected ‘{’ at end of input


Where did you get the source and what version of source?

The most recent should be 3.1.15 from here:

https://github.com/ossc-db/pg_bulkload




My OS is Ubuntu Server 14.04 and PostgreSQL 9.4. I've searched for these
errors, but the advice is always to change the C code in the files, and I'm
not sure about doing that in a tool like pg_bulkload.


Leandro Guimarães



On Mon, Jun 17, 2019 at 3:56 PM Adrian Klaver wrote:


On 6/17/19 10:04 AM, Leandro Guimarães wrote:
 > Hi Adrian,
 >
 >     Yes, that's the problem!
 >
 >     I'm now testing pg_bulkload but I'm facing some issues installing
 > it on PostgreSQL 9.4.

The issues would be?

 > Leandro Guimarães
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com 




--
Adrian Klaver
adrian.kla...@aklaver.com




psql UPDATE field [tab] expands to DEFAULT?

2019-06-17 Thread Ken Tanzer
Hi.  If I'm using psql, and type for example:

UPDATE my_table SET my_field
(with a trailing space)

and then hit Tab, it will expand that to an =, and then another tab will
expand to DEFAULT, so that I then have:

UPDATE my_table SET my_field = DEFAULT

If I'm tabbing out in this situation, it's going to be after the =, and I
will have typed "myreal"[tab] in the vain hope that psql will complete that
to "myreallylongfieldname," but instead it gets replaced with DEFAULT.

So I'm curious if this is intended behavior, if it's considered useful,
and/or if it's a placeholder for something in the future that will be
useful.  Also, is this new, as I've never noticed it before?

Thanks in advance,
Ken

p.s.,  Version 9.6.13

-- 
AGENCY Software
A Free Software data system
By and for non-profits
http://agency-software.org/
https://demo.agency-software.org/client
ken.tan...@agency-software.org
(253) 245-3801

Subscribe to the mailing list to learn more about AGENCY or follow the
discussion.


perf tuning for 28 cores and 252GB RAM

2019-06-17 Thread Michael Curry
I am using a Postgres instance in an HPC cluster, where they have
generously given me an entire node. This means I have 28 cores and 252GB
RAM. I have to assume that the very conservative default settings for
things like buffers and max working memory are too small here.

We have about 20 billion rows in a single large table. The database is not
intended to run an application but rather to allow a few individuals to do
data analysis, so we can guarantee the number of concurrent queries will be
small, and that nothing else will need to use the server. Creating multiple
different indices on a few subsets of the columns will be needed to support
the kinds of queries we want.

What settings should be changed to maximize performance?

-- 
Michael J. Curry


Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Leandro Guimarães
I've installed all dependencies, but when I try to "make install" in the
pg_bulkload folder I get some errors like this:

In file included from pgut/pgut.h:24:0,
 from pgut/pgut-fe.h:13,
 from pg_bulkload.c:17:
/usr/include/postgresql/internal/pqexpbuffer.h:149:13: error: old-style
parameter declarations in prototyped function definition
 extern void printfPQExpBuffer(PQExpBuffer str, const char *fmt,...)
pg_attribute_printf(2, 3);
 ^
pg_bulkload.c:743:1: error: expected ‘{’ at end of input


My OS is Ubuntu Server 14.04 and PostgreSQL 9.4. I've searched for these
errors, but the advice is always to change the C code in the files, and I'm
not sure about doing that in a tool like pg_bulkload.

Leandro Guimarães



On Mon, Jun 17, 2019 at 3:56 PM Adrian Klaver 
wrote:

> On 6/17/19 10:04 AM, Leandro Guimarães wrote:
> > Hi Adrian,
> >
> > Yes, that's the problem!
> >
> > I'm testing now the pg_bulkload but I'm facing some issues to
> > install it on Postgresql 9.4.
>
> The issues would be?
>
> > Leandro Guimarães
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Adrian Klaver

On 6/17/19 10:04 AM, Leandro Guimarães wrote:

Hi Adrian,

    Yes, that's the problem!

    I'm now testing pg_bulkload but I'm facing some issues installing it
on PostgreSQL 9.4.


The issues would be?


Leandro Guimarães



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: checkpoints taking much longer than expected

2019-06-17 Thread Andres Freund
On 2019-06-16 12:25:58 -0400, Jeff Janes wrote:
> Right, but true only because they were "checkpoint starting: immediate".
> Otherwise the reported write time includes intentional sleeps added to
> honor the checkpoint_completion_target.  A bit confusing to report it that
> way, I think.

+1

It's even worse than that, actually. We also don't sleep if the *next*
requested checkpoint is an immediate one:

	/*
	 * Perform the usual duties and take a nap, unless we're behind schedule,
	 * in which case we just try to catch up as quickly as possible.
	 */
	if (!(flags & CHECKPOINT_IMMEDIATE) &&
		!shutdown_requested &&
		!ImmediateCheckpointRequested() &&
		IsCheckpointOnSchedule(progress))

/*
 * Returns true if an immediate checkpoint request is pending.  (Note that
 * this does not check the *current* checkpoint's IMMEDIATE flag, but whether
 * there is one pending behind it.)
 */
static bool
ImmediateCheckpointRequested(void)

We ought to do better.

- Andres




Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Leandro Guimarães
Hi Adrian,

   Yes, that's the problem!

   I'm now testing pg_bulkload but I'm facing some issues installing it
on PostgreSQL 9.4.
Leandro Guimarães



On Mon, Jun 17, 2019 at 1:22 PM Adrian Klaver 
wrote:

> On 6/17/19 9:06 AM, Leandro Guimarães wrote:
> Please reply to list also.
> Ccing list.
> > Ugh, my bad again.
> >
> > They are UNIQUE:
> > CONSTRAINT unique_const_value_20190501_45 UNIQUE (customer_id,
> > date_time, indicator_id, element_id),
> >
> > I've made a mistake typing "check constraint" before because these are
> > partitioned tables and I have CHECK constraints for partitioning.
> >
> > Is that clear?
>
> Yes.
>
> To get back to the original issue, the problem is that when you COPY in
> new data you may get rows that conflict on the above UNIQUE constraint,
> correct?
>
> >
> > Thanks for your patience!
> > Leandro Guimarães
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Adrian Klaver

On 6/17/19 9:22 AM, Adrian Klaver wrote:

On 6/17/19 9:06 AM, Leandro Guimarães wrote:
Please reply to list also.
Ccing list.

Ugh, my bad again.

They are UNIQUE:
CONSTRAINT unique_const_value_20190501_45 UNIQUE (customer_id, 
date_time, indicator_id, element_id),


I've made a mistake typing "check constraint" before because these are
partitioned tables and I have CHECK constraints for partitioning.


Is that clear?


Yes.

To get back to the original issue, the problem is that when you COPY in 
new data you may get rows that conflict on the above UNIQUE constraint, 
correct?


Assuming the above is correct, would not something like below work?:

create table orig_tbl(id integer, fld_1 varchar, fld_2 integer, fld_3 
varchar, CONSTRAINT u_idx UNIQUE(id, fld_1, fld_2));


\d orig_tbl
   Table "public.orig_tbl"
 Column |       Type        | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
 id     | integer           |           |          |
 fld_1  | character varying |           |          |
 fld_2  | integer           |           |          |
 fld_3  | character varying |           |          |
Indexes:
    "u_idx" UNIQUE CONSTRAINT, btree (id, fld_1, fld_2)

create table tmp_tbl(id integer, fld_1 varchar, fld_2 integer, fld_3 
varchar);


insert into orig_tbl values (1, 'test', 3, 'test'), (2, 'foo', 5, 
'bar'), (3, 'cat', 8, 'dog');


select * from orig_tbl ;

 id | fld_1 | fld_2 | fld_3
----+-------+-------+-------
  1 | test  |     3 | test
  2 | foo   |     5 | bar
  3 | cat   |     8 | dog
(3 rows)

insert into tmp_tbl values (1, 'test', 3, 'test'), (4, 'fish', 6, 
'bird'), (7, 'rabbit', 8, 'squirrel'), (10, 'plant', 2, 'animal');


select * from tmp_tbl ;

 id | fld_1  | fld_2 |  fld_3
----+--------+-------+----------
  1 | test   |     3 | test
  4 | fish   |     6 | bird
  7 | rabbit |     8 | squirrel
 10 | plant  |     2 | animal
(4 rows)

select * from tmp_tbl AS tt left join orig_tbl AS ot on (tt.id, 
tt.fld_1, tt.fld_2) = (ot.id, ot.fld_1, ot.fld_2) where ot.id is null;


 id | fld_1  | fld_2 |  fld_3   | id | fld_1 | fld_2 | fld_3
----+--------+-------+----------+----+-------+-------+-------
  4 | fish   |     6 | bird     |    |       |       |
  7 | rabbit |     8 | squirrel |    |       |       |
 10 | plant  |     2 | animal   |    |       |       |
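
Completing that flow for the COPY use case, a sketch (the CSV path is
hypothetical) would be to COPY into the temp table first and then insert
only the non-conflicting rows; this needs no ON CONFLICT and so works on
9.4:

COPY tmp_tbl FROM '/path/to/data.csv' WITH (FORMAT csv);

-- If tmp_tbl itself can contain duplicates, de-duplicate it first, e.g.
-- with SELECT DISTINCT ON (id, fld_1, fld_2), or the insert will still
-- violate the unique constraint.
INSERT INTO orig_tbl
SELECT tt.*
FROM tmp_tbl AS tt
LEFT JOIN orig_tbl AS ot
       ON (tt.id, tt.fld_1, tt.fld_2) = (ot.id, ot.fld_1, ot.fld_2)
WHERE ot.id IS NULL;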





Thanks for your patience!
Leandro Guimarães







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Adrian Klaver

On 6/17/19 9:06 AM, Leandro Guimarães wrote:
Please reply to list also.
Ccing list.

Ugh, my bad again.

They are UNIQUE:
CONSTRAINT unique_const_value_20190501_45 UNIQUE (customer_id, 
date_time, indicator_id, element_id),


I've made a mistake typing "check constraint" before because these are
partitioned tables and I have CHECK constraints for partitioning.


Is that clear?


Yes.

To get back to the original issue, the problem is that when you COPY in 
new data you may get rows that conflict on the above UNIQUE constraint, 
correct?




Thanks for your patience!
Leandro Guimarães




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: psql timeout option

2019-06-17 Thread Adrian Klaver

On 6/17/19 8:59 AM, Jesús Gómez wrote:

Hello!

Does the psql command have a timeout option for network connections?


Something like below? If not, you will need to be more specific.

https://www.postgresql.org/docs/11/app-psql.html

From within psql:


\c or \connect

 \c "host=localhost port=5432 dbname=mydb connect_timeout=10 
sslmode=disable"



Borrowing from above for initial connection

psql "host=localhost port=5432 dbname=mydb connect_timeout=10 
sslmode=disable"


Where connect_timeout is:

https://www.postgresql.org/docs/11/libpq-connect.html#LIBPQ-CONNSTRING

"connect_timeout

Maximum wait for connection, in seconds (write as a decimal 
integer, e.g. 10). Zero, negative, or not specified means wait 
indefinitely. The minimum allowed timeout is 2 seconds, therefore a 
value of 1 is interpreted as 2. This timeout applies separately to each 
host name or IP address. For example, if you specify two hosts and 
connect_timeout is 5, each host will time out if no connection is made 
within 5 seconds, so the total time spent waiting for a connection might 
be up to 10 seconds.

"



--
Adrian Klaver
adrian.kla...@aklaver.com




psql timeout option

2019-06-17 Thread Jesús Gómez
Hello!

Does the psql command have a timeout option for network connections?


Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Adrian Klaver

On 6/17/19 8:14 AM, Leandro Guimarães wrote:

Hi Adrian,

    You are right, these fields are in CHECK CONSTRAINTS and they are 
not formally defined as Primary Keys.


Alright. Two things:

1) If you are thinking of them as keys, why not make them a PK or a 
UNIQUE index?


2) Still not clear to me whether you are looking for duplicated 
information within a row or between rows?

To put it another way, what are the CHECK constraints doing?




Thanks!
Leandro Guimarães



On Sat, Jun 15, 2019 at 10:45 AM Adrian Klaver wrote:


On 6/14/19 7:24 PM, Leandro Guimarães wrote:
 > Hi Tim, thanks for your answer!
 >
 > The columns were just examples, but let me explain the database
 > structure, the fields in *bold are the keys*:
 >
 > *customer_id integer*
 > *date_time timestamp*
 > *indicator_id integer*
 > *element_id integer*
 > indicator_value double precision

Huh, earlier you said you had a check constraint that was causing
issues.

Does that also exist, or were you referring to the keys above?

Are the keys above formally defined as the PRIMARY KEY?
-
 >
 > The table is partitioned per day and customer_id (it works great); the
 > problem is just the duplicated key situation that I'm really worried
 > about.
 >
 > I populate the database via a Java Application with JDBC.
 >
 > Maybe this info could help to provide some light!
 >
 > Thanks Again!
 >
 > Leandro Guimarães
 >
 >
 >
 > On Fri, Jun 14, 2019 at 7:39 PM Tim Cross wrote:
 >
 >
 >     Leandro Guimarães writes:
 >
 >      > Hi,
 >      >
 >      >    I have a scenario with a large table and I'm trying to insert
 >      >    it via a COPY command with a csv file.
 >      >
 >      >    Everything works, but sometimes my source .csv file has
 >      >    duplicated data already in the previously filled table. If I
 >      >    add a check constraint and try to run the COPY command I get
 >      >    an error that stops the whole insertion.
 >      >
 >      >   I've tried to put the data in a tmp table and fill the main one
 >      >   using distinct this way (the fields and names are just examples):
 >      >
 >      > INSERT INTO final_table values (name, document)
 >      >    SELECT DISTINCT name, document
 >      >    FROM tmp_TABLE t1
 >      >    WHERE NOT EXISTS (
 >      >    SELECT 1 FROM final_table t2
 >      >    WHERE (t2.name, t2.document)
 >      >    IS NOT DISTINCT FROM (t1.name, t1.document))
 >      >
 >      > The problem is that my final_table is a large (and partitioned)
 >      > table and this query is taking a long time to execute.
 >      >
 >      > Does someone have any idea (really guys, anything would be great)
 >      > how to solve this situation? I need to ignore duplicates instead
 >      > of getting an error.
 >      >
 >      > I'm using PostgreSQL 9.4 so I can't use "ON CONFLICT", and
 >      > upgrading is not an option.
 >      >
 >
 >     Explain plan would probably shed some light, but I suspect your
 >     performance is being heavily hit by the sub query. Distinct is an
 >     expensive operation and you are performing it once for every
 >     distinct row in your temp table.
 >
 >     It isn't clear what the primary key is for your final table - name +
 >     document seems suspicious given these seem to be the only two columns
 >     you're inserting as well. You don't indicate what the data types are
 >     either - if document is something like 'text' then using it in a
 >     distinct clause is likely to have a huge performance impact.
 >
 >     The first thing I'd do is to eliminate duplicates from your temp
 >     table as a separate statement or by pre-filtering the CSV before
 >     import. I would then try something like an outer join to identify
 >     rows in your temp table which don't exist in your final table and
 >     select from there to insert into the final table. You don't really
 >     need the distinct in the sub query as all you really need to know is
 >     if (name, document) exists - it doesn't matter if more than one
 >     exists (for this test).
 >
 >     If you really don't have something more specific for a primary key,
 >     depending on what data type 'document' is and how large it is, you
 >     may find 

Re: Copy Bulk Ignore Duplicated

2019-06-17 Thread Leandro Guimarães
Hi Adrian,

   You are right, these fields are in CHECK CONSTRAINTS and they are not
formally defined as Primary Keys.

Thanks!
Leandro Guimarães



On Sat, Jun 15, 2019 at 10:45 AM Adrian Klaver 
wrote:

> On 6/14/19 7:24 PM, Leandro Guimarães wrote:
> > Hi Tim, thanks for your answer!
> >
> > The columns were just examples, but let me explain the database
> > structure, the fields in *bold are the keys*:
> >
> > *customer_id integer*
> > *date_time timestamp*
> > *indicator_id integer*
> > *element_id integer*
> > indicator_value double precision
>
> Huh, earlier you said you had a check constraint that was causing issues.
>
> Does that also exist, or were you referring to the keys above?
>
> Are the keys above formally defined as the PRIMARY KEY?
> -
> >
> > The table is partitioned per day and customer_id (it works great); the
> > problem is just the duplicated key situation that I'm really worried
> > about.
> >
> > I populate the database via a Java Application with JDBC.
> >
> > Maybe this info could help to provide some light!
> >
> > Thanks Again!
> >
> > Leandro Guimarães
> >
> >
> >
> > On Fri, Jun 14, 2019 at 7:39 PM Tim Cross wrote:
> >
> >
> > Leandro Guimarães writes:
> >
> >  > Hi,
> >  >
> >  >    I have a scenario with a large table and I'm trying to insert
> >  >    it via a COPY command with a csv file.
> >  >
> >  >    Everything works, but sometimes my source .csv file has
> >  >    duplicated data already in the previously filled table. If I add
> >  >    a check constraint and try to run the COPY command I get an error
> >  >    that stops the whole insertion.
> >  >
> >  >   I've tried to put the data in a tmp table and fill the main one
> >  >   using distinct this way (the fields and names are just examples):
> >  >
> >  > INSERT INTO final_table values (name, document)
> >  >    SELECT DISTINCT name, document
> >  >    FROM tmp_TABLE t1
> >  >    WHERE NOT EXISTS (
> >  >    SELECT 1 FROM final_table t2
> >  >    WHERE (t2.name, t2.document)
> >  >    IS NOT DISTINCT FROM (t1.name, t1.document))
> >  >
> >  > The problem is that my final_table is a large (and partitioned)
> >  > table and this query is taking a long time to execute.
> >  >
> >  > Does someone have any idea (really guys, anything would be great)
> >  > how to solve this situation? I need to ignore duplicates instead of
> >  > getting an error.
> >  >
> >  > I'm using PostgreSQL 9.4 so I can't use "ON CONFLICT", and
> >  > upgrading is not an option.
> >  >
> >
> > Explain plan would probably shed some light, but I suspect your
> > performance is being heavily hit by the sub query. Distinct is an
> > expensive operation and you are performing it once for every
> > distinct row
> > in your temp table.
> >
> > It isn't clear what the primary key is for your final table - name +
> > document seems suspicious given these seem to be the only two columns
> > you're inserting as well. You don't indicate what the data types are
> > either - if document is something like 'text' then using it in a
> > distinct clause is likely to have a huge performance impact.
> >
> > The first thing I'd do is to eliminate duplicates from your temp table
> > as a separate statement or by pre-filtering the CSV before import. I
> > would then try something like an outer join to identify rows in your
> > temp table which don't exist in your final table and select from there
> > to insert into the final table. You don't really need the distinct in
> > the sub query as all you really need to know is if (name, document)
> > exists - it doesn't matter if more than one exists (for this test).
> >
> > If you really don't have something more specific for a primary key,
> > depending on what data type 'document' is and how large it is, you may
> > find adding a column which is a checksum of your 'document' field a
> > useful addition. I have done this in the past where I had an application
> > where name was not unique and we only wanted distinct instances of
> > 'document' (document was a fairly large XML document in this case).
> >
> > --
> > Tim Cross
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
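
As an illustration of Tim's checksum-column idea quoted above (the digest
column and index names are hypothetical):

-- Store a digest of the large 'document' value and enforce uniqueness on
-- it, so the unique check never has to compare the full text/XML.
ALTER TABLE final_table ADD COLUMN document_md5 text;
UPDATE final_table SET document_md5 = md5(document);
CREATE UNIQUE INDEX final_table_name_doc_uq
    ON final_table (name, document_md5);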


Re: bug regclass::oid

2019-06-17 Thread Adrian Klaver

On 6/17/19 1:58 AM, John Mikel wrote:


hi again


here is my query
select A.table_name as "table_name", A.domain_name as "domain",
       format_type(c.atttypid, c.atttypmod) AS data_type,
       A.column_name as "column_name",
       A.is_nullable as "nullable", A.column_default as "default"
from information_schema.columns A
inner join pg_attribute c on a.table_name::regclass::oid = c.attrelid
where a.table_schema in (select current_schema())
  and a.column_name = c.attname;


If I run this query in any database containing at least one table with a
space in its name, an error will occur.

If I run this query in another database it works fine.
I tested this on PG 11.1, PG 10.3, and PG 9.6.

PS: Sorry if this message is duplicated; I cancelled the previous message
by mistake by clicking on a link.




The previous message came through, see the replies. The basic issue is 
that a table name with a space in it will need to be quoted. So use 
quote_ident() per Peter and Tom's suggestions.
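
A rewrite of the query along those lines might look like this (an untested
sketch; it schema-qualifies and quotes the identifier before the regclass
cast so names with spaces resolve):

SELECT a.table_name, a.domain_name AS "domain",
       format_type(c.atttypid, c.atttypmod) AS data_type,
       a.column_name, a.is_nullable, a.column_default AS "default"
FROM information_schema.columns a
JOIN pg_attribute c
  ON (quote_ident(a.table_schema) || '.' ||
      quote_ident(a.table_name))::regclass::oid = c.attrelid
WHERE a.table_schema = current_schema()
  AND a.column_name = c.attname;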



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Async client libraries - not worth it?

2019-06-17 Thread Dave Cramer
On Mon, 17 Jun 2019 at 09:43, Rob Nikander  wrote:

>
>
> > On Jun 17, 2019, at 3:57 PM, Dave Cramer  wrote:
> > […] Postgres can pipeline requests if the client is written correctly so
> it is conceivable that this would be much faster.
>
> Can the JDBC driver do this? I don’t see it documented anywhere.


No, as it's blocking. It's conceivable that we could do it in the batch
stuff.

Dave Cramer

da...@postgresintl.com
www.postgresintl.com


Re: Async client libraries - not worth it?

2019-06-17 Thread Rob Nikander



> On Jun 17, 2019, at 3:57 PM, Dave Cramer  wrote:
> […] Postgres can pipeline requests if the client is written correctly so it 
> is conceivable that this would be much faster.

Can the JDBC driver do this? I don’t see it documented anywhere.



Re: Async client libraries - not worth it?

2019-06-17 Thread Dave Cramer
On Mon, 17 Jun 2019 at 07:35, Rob Nikander  wrote:

>
>
> On Jun 17, 2019, at 1:12 PM, Dave Cramer  wrote:
>
> https://www.techempower.com/benchmarks/#section=data-r17=ph=db
>
> Seems to be worth it.
>
> Now it appears that ADBA is going to die on the vine, R2DBC and vertx seem
> to be pretty good
>
>
> The “async” frameworks are faster, but I think they might be getting the
> performance gain not from the async DB API, but from the fact that they
> don’t block OS threads that are handling frontend HTTP requests. They may
> be using an async DB API to achieve that, but they *could* (I think) also
> use traditional JDBC and other general purpose concurrency tools from
> Java’s standard library.  That way would be easier to reason about, in my
> opinion.
>
> I may just have to write something both ways and wait to get real world
> experience with it to see how it goes.
>
>
Yes, the async frameworks are faster, but ultimately they have to return
something from the database, which effectively makes them "block". Postgres
can pipeline requests if the client is written correctly, so it is
conceivable that this would be much faster.


Dave


Re: checkpoints taking much longer than expected

2019-06-17 Thread Tiemen Ruiten
On Sun, Jun 16, 2019 at 8:57 PM Alvaro Herrera 
wrote:

> On 2019-Jun-14, Peter J. Holzer wrote:
>
> > There was a discussion about ZFS' COW behaviour and PostgreSQL reusing
> > WAL files not being a good combination about a year ago:
> >
> https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com
> >
> > Maybe you have the same problem?
>
> Note that Joyent ended up proposing patches to fix their performance
> problem (and got them committed).  Maybe it would be useful for Tiemen
> to try that code?  (That commit cherry-picks cleanly on REL_11_STABLE.)
>

Unfortunately it doesn't compile:

access/transam/xlog.o: In function `XLogFileInit':
/home/ter/rpmbuild/BUILD/postgresql-11.3/src/backend/access/transam/xlog.c:3299:
undefined reference to `pg_pwrite'
collect2: error: ld returned 1 exit status


>
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: bug regclass::oid

2019-06-17 Thread John Mikel
hi again


here is my query
select A.table_name as "table_name", A.domain_name as "domain",
       format_type(c.atttypid, c.atttypmod) AS data_type,
       A.column_name as "column_name",
       A.is_nullable as "nullable", A.column_default as "default"
from information_schema.columns A
inner join pg_attribute c on a.table_name::regclass::oid = c.attrelid
where a.table_schema in (select current_schema())
  and a.column_name = c.attname;

If I run this query in any database containing at least one table with a
space in its name, an error will occur.
If I run this query in another database it works fine.
I tested this on PG 11.1, PG 10.3, and PG 9.6.

PS: Sorry if this message is duplicated; I cancelled the previous message
by mistake by clicking on a link.



Le jeu. 13 juin 2019 à 17:33, Adrian Klaver  a
écrit :

> On 6/13/19 8:14 AM, John Mikel wrote:
> > Hi,
> >   I am here to tell you that this test in a query is not working when
> >   the table name or schema name in the database contains a space:
> >   a.table_name::regclass=b.attrelid
> >
> >   a is information_schema.columns
> >   b is pg_attribute
> >
> >   Trying this in two different databases: the first database contains a
> >   table with a space in its name (problem when running the query);
> >   the second does not (works fine).
> >
> >   The same problem occurs if you get the Oid from a schema name.
> >   SELECT 'public'::regnamespace::oid;  work ;
> >   create schema " this is test" ;
> >   SELECT 'this is test'::regnamespace::oid;  not working ;
>
> Try:
>
> SELECT '"this is test"'::regnamespace::oid;  not working ;
>
> Same for table name. As example:
>
> select '"space table"'::regclass;
>   regclass
> ---
>   "space table"
> (1 row)
>
> >
> >   I have a question: how do I use a join between
> >   information_schema.columns and pg_attribute? Thanks.
> >
> > regards
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Async client libraries - not worth it?

2019-06-17 Thread Rob Nikander


> On Jun 17, 2019, at 1:12 PM, Dave Cramer  wrote:
> 
> https://www.techempower.com/benchmarks/#section=data-r17=ph=db 
> 
> 
> Seems to be worth it.
> 
> Now it appears that ADBA is going to die on the vine, R2DBC and vertx seem to 
> be pretty good

The “async” frameworks are faster, but I think they might be getting the 
performance gain not from the async DB API, but from the fact that they don’t 
block OS threads that are handling frontend HTTP requests. They may be using an 
async DB API to achieve that, but they *could* (I think) also use traditional 
JDBC and other general purpose concurrency tools from Java’s standard library.  
That way would be easier to reason about, in my opinion.

I may just have to write something both ways and wait to get real world 
experience with it to see how it goes.






Re: Async client libraries - not worth it?

2019-06-17 Thread Dave Cramer
On Mon, 17 Jun 2019 at 01:34, Rob Nikander  wrote:

> Hi,
>
> I’m writing a new web app, and I’ve been experimenting with some async DB
> access libraries [1]. I also see some discussion online about a future Java
> standard to replace or supplement JDBC with an async API.
>
> While I understand the benefits of async in some situations, it seems to
> me that these libraries are not going to give much performance benefit,
> given the architecture of a PostgreSQL server. (Nothing against PG;
> probably most RDBMSs are like this.)
>
> I wonder if anyone else has looked at this and agrees, or not. ?
>
> A client library with an async-style API may allow 100,000s of concurrent
> “operations”, but since the PG server itself doesn’t handle connections on
> that scale (and has no plans to, I assume?), the client library is really
> maintaining a queue of operations waiting for a connection pool. Maybe
> there is some performance benefit there, but the most important point - to
> free up the front end to handle many HTTP connections - can also happen by
> combining an operation queue with a synchronous API.
>
> Rob
>
>
https://www.techempower.com/benchmarks/#section=data-r17=ph=db

Seems to be worth it.

Now it appears that ADBA is going to die on the vine, R2DBC and vertx seem
to be pretty good

Dave Cramer

da...@postgresintl.com
www.postgresintl.com