Re: some random() clarification needed

2020-07-14 Thread David Rowley
On Wed, 15 Jul 2020 at 04:01, Marc Millas  wrote:
> your answer helps me understand my first problem.
> so, I rewrote a simple loop so as to avoid the "volatile" behaviour.

Not sure what you're trying to do with the plpgsql, but you can simply
avoid the multiple evaluations by putting the volatile function in a
sub-select with no FROM clause.

SELECT ... FROM ... WHERE id = (SELECT ceiling(random()*2582));

Or the more traditional way to get a random row is:

SELECT ... FROM ... WHERE id BETWEEN 0 AND 2585 ORDER BY random() LIMIT 1;

David




Re: Convert hot_standby 9.4 postgresql into standalone server

2020-07-14 Thread Tom Lane
"David G. Johnston"  writes:
> On Tuesday, July 14, 2020, Julie Nishimura  wrote:
>> Hello, we currently have a 9.4 hot_standby master-slave pair. Going forward,
>> we can keep only one server. How can I convert the system properly?

> If you are keeping the primary you shouldn’t have to do anything.  The
> absence of a secondary server shouldn’t impact the primary.  You might be
> recording more detail to WAL than strictly necessary but that isn’t wrong
> outright.

Yeah.  What you *do* need to think about is, if you no longer have
a standby server, what's your disaster recovery plan in case the
primary fails?  You need to have some good backup procedures in place.

regards, tom lane




Re: Convert hot_standby 9.4 postgresql into standalone server

2020-07-14 Thread David G. Johnston
On Tuesday, July 14, 2020, Julie Nishimura  wrote:

> Hello, we currently have a 9.4 hot_standby master-slave pair. Going forward,
> we can keep only one server. How can I convert the system properly?
>

If you are keeping the primary you shouldn’t have to do anything.  The
absence of a secondary server shouldn’t impact the primary.  You might be
recording more detail to WAL than strictly necessary but that isn’t wrong
outright.

David J.


Convert hot_standby 9.4 postgresql into standalone server

2020-07-14 Thread Julie Nishimura
Hello, we currently have a 9.4 hot_standby master-slave pair. Going forward, we
can keep only one server. How can I convert the system properly?

These are some snippets from master postgresql.conf file:

==
wal_level = hot_standby         # minimal, archive, hot_standby, or logical
                                # (change requires restart)
checkpoint_segments = 16
wal_keep_segments = 512
max_wal_senders = 16            # max number of walsender processes
                                # (change requires restart)
max_replication_slots = 12      # max number of replication slots
===

Is it enough if I do the following:
1) backup all dbs on master
2) shutdown both servers
3) update max_replication_slots=0 in postgresql.conf on master
4) update wal_level = minimal in postgresql.conf on master
5) start master

Do I need to do anything else? Thank you for your help
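
A minimal sketch of steps 3 and 4 done with ALTER SYSTEM (available since 9.4)
instead of editing postgresql.conf by hand; note that wal_level = minimal also
requires max_wal_senders = 0, and all three changes need a restart:

ALTER SYSTEM SET wal_level = minimal;
ALTER SYSTEM SET max_replication_slots = 0;
ALTER SYSTEM SET max_wal_senders = 0;   -- wal_level = minimal forbids WAL senders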


Re: Problem with FDW wrapper errors

2020-07-14 Thread Michael Nolan
Thanks. For the time being we're looking at using something other than the
FDW for this task.
--
Mike Nolan


Re: some random() clarification needed

2020-07-14 Thread Adrian Klaver

On 7/14/20 9:01 AM, Marc Millas wrote:

Hi,
your answer helps me understand my first problem.
so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
(at least I was thinking I did... looks like I was wrong !)
step by step loop:
DO $$
BEGIN
  FOR counter IN 1..1000 LOOP
    begin
      declare
        id1 integer = ceiling(random()*2582);
        id3 date = '2000-01-01';
        id2 date;
        pren varchar;
      begin
        id2 = id3 + (random()*7200)::integer;
        SELECT prenom FROM prenoms WHERE id = id1 INTO pren;
        INSERT INTO testparttransac (datenaissance, prenoms) VALUES (id2, pren);
      end;
    end;
  END LOOP;
END; $$;

I truncated the table, executed the loop with no errors, and expected
that a select count(*) would answer 1000!

No: it varies, from less than 1000 (much less, something like 900)
to more than 1000 (up to 1094).

so... what's "volatile" in the loop?


I think it has more to do with the structure of the function. Not sure 
how all those nested BEGINs interact, but when I simplify the above to:


DO $$
DECLARE
  id1 integer = ceiling(random()*2582);
  id3 date = '2000-01-01';
  id2 date;
  pren varchar = 'test';
BEGIN
  FOR counter IN 1..1000 LOOP
    id2 = id3 + (random()*7200)::integer;
    INSERT INTO testparttransac (datenaissance, prenoms) VALUES (id2, pren);
  END LOOP;
END; $$;

I get 1000 rows each time I TRUNCATE testparttransac, run the above,
and repeat.




BTW the testparttransac table is partitioned on datenaissance, with a 
default partition.


thanks,
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com 



On Tue, Jul 14, 2020 at 5:24 PM David G. Johnston
<david.g.johns...@gmail.com> wrote:


On Tue, Jul 14, 2020 at 8:15 AM Marc Millas <marc.mil...@mokadb.com> wrote:

select id, prenom from prenoms where id=ceiling(random()*2582);

expecting to get, always, one line.
But it's not the case.
Around 15% of the time I get 0 lines, which is already quite strange
to me,
but 10% of the time I get a random number of lines, so far up to 4.
Even weirder (to me!)

so, can someone please clarify ?


You are basically asking:

For each row in my table compare the id to some random number and if
they match return that row, otherwise skip it.  The random number
being compared to is different for each row because random() is
volatile and thus evaluated for each row.

David J.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: single table - fighting a seq scan

2020-07-14 Thread Tom Lane
Radoslav Nedyalkov  writes:
> Ah, I could have messed up the examples I gave. Row numbers are different.
> Once again the plans , sorry about that.

Given that it works at 100 entries and not 101, I can't escape the
suspicion that you're being burnt by predtest.c's MAX_SAOP_ARRAY_SIZE
limit.  However, that only affects the planner's willingness to make
constraint proofs involving the large IN clause, and nothing you've
mentioned here explains why such a proof would be needed.  Is there
something you're not telling us about this table's schema?  (I'm
wondering if the index is partial, for instance, though one would
think that the CTE form of the query wouldn't work either if so.)

regards, tom lane




Re: Issue executing query from container

2020-07-14 Thread Tom Lane
Eudald Valcàrcel Lacasa  writes:
> I'm running a docker container that executes a php script running a
> sequence of queries.
> One of the queries gets stuck (takes more than 2 hours in execution,
> active in pg_stat_activity).
> The query is executed with a JOIN between a FOREIGN TABLE and a local table.

> Executing this query from the psql console takes less than 5 seconds to 
> resolve.
> If I execute the query from the container created, manually, it
> resolves as well within 5 seconds.

> Can anyone guide me in a way I can troubleshoot what is causing the
> query to get stuck?

It's a very good bet that it's something about the query being
parameterized or not.  You could try installing auto_explain to
compare the plans that are generated.
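
A minimal sketch of a per-session auto_explain setup (these are the module's
standard parameters; to catch the container's connections you would instead
set them in postgresql.conf together with session_preload_libraries):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;  -- log the plan of every statement
SET auto_explain.log_analyze = true;    -- include actual row counts and timings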

regards, tom lane




Issue executing query from container

2020-07-14 Thread Eudald Valcàrcel Lacasa
Hello!
I'm running a docker container that executes a php script running a
sequence of queries.
One of the queries gets stuck (takes more than 2 hours in execution,
active in pg_stat_activity).
The query is executed with a JOIN between a FOREIGN TABLE and a local table.

Executing this query from the psql console takes less than 5 seconds to resolve.
If I execute the query from the container created, manually, it
resolves as well within 5 seconds.

Can anyone guide me in a way I can troubleshoot what is causing the
query to get stuck?

Thanks!
Eudald




Re: single table - fighting a seq scan

2020-07-14 Thread Radoslav Nedyalkov
Ah, I could have messed up the examples I gave. Row numbers are different.
Once again the plans, sorry about that.

-- 200 entries

 Gather  (cost=1000.00..106905910.97 rows=7893 width=8)
   Workers Planned: 2
   ->  Parallel Seq Scan on te  (cost=0.00..106904121.67 rows=3289 width=8)
 Filter: (current_pid = ANY
('{240900026,240900027,240900028,240900029,240900030,240900031,240900032,240900033,240900034,240900035,240900036,240900037,240900038,240900039,240900040,240900041,240900042,240900043,240900044,240900045,240900046,240900047,240900048,240900049,240900050,240900051,240900052,240900053,240900054,240900055,240900056,240900057,240900058,240900059,240900060,240900061,240900062,240900063,240900064,240900065,240900066,240900067,240900068,240900069,240900070,240900071,240900072,240900073,240900074,240900075,240900076,240900077,240900078,240900079,240900080,240900081,240900082,240900083,240900084,240900085,240900086,240900165,240900087,240900166,240900088,240900167,240900089,240900168,240900090,240900169,240900091,240900170,240900092,240900171,240900093,240900172,240900094,240900173,240900905,240900174,240900175,240900176,240900177,240900178,240900179,240900180,240900181,240900182,240900183,240900184,240900185,240900186,240900187,240900188,240900189,240900190,240900191,240900192,240900193,240900194,240900195,240900196,240900197,240900198,240900199,240900906,240900907,240900908,240900909,240900910,240900911,240900912,240900913,240900914,240900915,240900916,240900917,240900918,240900919,240900920,240900921,240900922,240900923,240900924,240900925,240900926,240900927,240900928,240901048,240901053,240901054,240901055,240901056,240901057,240901058,240901059,240901060,240901061,240901062,240901063,240901064,240901065,240901066,240901067,240901068,240901069,240901070,240901071,240901072,240901073,240901074,240901075,240901076,240901077,240901078,240901079,240901080,240901081,240901082,240901083,240901084,240901085,240901086,240901087,240901088,240901089,240901090,240901091,240901092,240901093,240901094,240901095,240901096,240901097,240901098,240901099,240901100,240901101,240901102,240901103,240901104,240901105,240901106,240901107,240901108,240901109,240901110,24090,240901112,240901113,240901114,240901115,240901116,240901117,240901118,240901119,240901120,240901121,240901122,240901123,240901124,240901125,240901126,240901127,240901128,240901129}'::bigint[]))
(4 rows)

Time: 5.261 ms




-- 100 entries
--

 Index Scan using te_current_pid_idx on te  (cost=0.57..731.26 rows=3832
width=8) (actual time=1.244..15.897 rows=600 loops=1)
   Index Cond: (current_pid = ANY
('{240900026,240900027,240900028,240900029,240900030,240900031,240900032,240900033,240900034,240900035,240900036,240900037,240900038,240900039,240900040,240900041,240900042,240900043,240900044,240900045,240900046,240900047,240900048,240900049,240900050,240900051,240900052,240900053,240900054,240900055,240900056,240900057,240900058,240900059,240900060,240900061,240900062,240900063,240900064,240900065,240900066,240900067,240900068,240900069,240900070,240900071,240900072,240900073,240900074,240900075,240900076,240900077,240900078,240900079,240900080,240900081,240900082,240900083,240900084,240900085,240900086,240900165,240900087,240900166,240900088,240900167,240900089,240900168,240900090,240900169,240900091,240900170,240900092,240900171,240900093,240900172,240900094,240900173,240900905,240900174,240900175,240900176,240900177,240900178,240900179,240900180,240900181,240900182,240900183,240900184,240900185,240900186,240900187,240900188,240900189,240900190,240900191,240900192,240900193,240900194}'::bigint[]))
 Planning Time: 3.430 ms
 Execution Time: 15.954 ms
(4 rows)

Time: 20.257 ms


On Tue, Jul 14, 2020 at 10:31 PM Radoslav Nedyalkov 
wrote:

> Hi Michael,
> full output from the query is attached.
> here is the truncated lists version.

Re: Problem with FDW wrapper errors

2020-07-14 Thread Adrian Klaver

On 7/14/20 10:03 AM, Michael Nolan wrote:
An application I wrote is being modified by our development team to use
an FDW to a remote MySQL table instead of the postgresql table for a query.


We're getting this error in the logs periodically.  Maybe about once 
every 50,000 queries:


2020-07-14 11:35:22.799 CDT uscf ::1 ERROR:  failed to execute the MySQL 
query:

         Prepared statement needs to be re-prepared
2020-07-14 11:35:22.799 CDT uscf ::1 STATEMENT:  select memtp, 
('2004-12-18'::date-membirthdt)/365.25  as mage from 
memmast_simulant_ratings where memid = '13036134'


If I restart it, it usually goes through, but it might fail on that query
again with different data.


Is this possibly due to transient communication issues between the
postgresql server and the remote MySQL server?



https://www.postgresql.org/docs/12/sql-prepare.html

"Prepared statements only last for the duration of the current database 
session. When the session ends, the prepared statement is forgotten, so 
it must be recreated before being used again. This also means that a 
single prepared statement cannot be used by multiple simultaneous 
database clients; however, each client can create their own prepared 
statement to use. Prepared statements can be manually cleaned up using 
the DEALLOCATE command."



Dropped connections could be an issue per above. You might also explore 
the other conditions in the above.




If I just trap the error, wait a few seconds and re-issue the query, is 
that likely to be sufficient?


I'm guessing not; you will need to issue the PREPARE again, per the error
message:


"Prepared statement needs to be re-prepared"


--
Mike Nolan
htf...@gmail.com 



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi Tom,
A few tests later: it looks like when you attach a partition and a default
partition exists, all tuples of the default partition are read, even if there
is an index on the column that is the partition key.
This does explain our attach time. We are going to clean the default
partition...
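
For the record, a minimal sketch of the documented workaround, with
hypothetical table and column names: a CHECK constraint matching the
partition bounds, added before the ATTACH, lets PostgreSQL skip the
validation scan of the new partition, and a similar constraint on the
default partition excluding that range avoids the full scan of the
default partition:

ALTER TABLE meas_20200714 ADD CONSTRAINT chk_day
    CHECK (day >= DATE '2020-07-14' AND day < DATE '2020-07-15');
ALTER TABLE meas ATTACH PARTITION meas_20200714
    FOR VALUES FROM ('2020-07-14') TO ('2020-07-15');
ALTER TABLE meas_20200714 DROP CONSTRAINT chk_day;  -- no longer needed afterwards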

regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane  wrote:

> Marc Millas  writes:
> > We would like to understand where an alter table attach partition spends
> > its time.
> > to my understanding, explain doesn't do this.
>
> Nope :-(.  As our DDL commands have gotten more complicated, there's
> been some discussion of adding that, but nothing's really been done
> yet.
>
> There is some progress-monitoring support for some DDL commands now,
> but that's not quite the same thing.
>
> > for a BI job we have a partitioned table with 1800+ partitions.
>
> TBH I'd recommend scaling that down by at least a factor of ten.
> We are not at a point where you can expect that all operations will
> be cheap even with thousands of partitions.  We may never be at that
> point, although people continue to chip away at the bottlenecks.
>
> regards, tom lane
>


Re: single table - fighting a seq scan

2020-07-14 Thread Michael Lewis
rows=3832
rows=3870

Your estimate changed very little when you included 100 values vs 200
values. That is interesting to me.

What does the below query give you? How many of those 200 values are found
in the MCVs list? If n_distinct is low, and most of the values are NOT in
the most common value list, and the fraction of the table covered by the
MCVs is also low, then the planner will expect that each of the 200 values
represents some deceivingly high portion of the table.

You said there are 80 million rows, yes? It seems likely that ndistinct
and the MCVs list are not giving info very correlated with reality. You may
want to increase the minimum table size for a sequential scan to kick in; I
cannot recall the name of that setting at the moment. You may also want to
increase the stats target on that column at least, analyze, and explain the
query again.
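
A minimal sketch of that last suggestion, using the table and column names
from the thread:

ALTER TABLE te ALTER COLUMN current_pid SET STATISTICS 1000;
ANALYZE te;
-- then re-run EXPLAIN on the original query and compare the row estimates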

SELECT
( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
tablename,
attname,
inherited,
null_frac,
n_distinct,
array_length(most_common_vals,1) n_mcv,
array_length(histogram_bounds,1) n_hist,
correlation,
*
FROM pg_stats
WHERE
schemaname = 'public'
AND tablename='te'
AND attname='current_pid';



Re: how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi,
thanks for the answer.
The problem is that the fact table does get modifications to "old" data,
so the current scheme implies truncating partitions and recreating them,
copying from the ODS to the DM, etc., which is better than (tens of)
millions of deletes and the vacuuming they entail.
And so the partitioning scheme is based on one day's data, hence the 1800+.

The other problem we have is the very long planning time for most requests:
it was 120 sec in release 11, down to 60 sec in 12, versus an execution time
of around 4 sec. It looks like the number of indexes has a paramount impact.
Can you point me to any doc about optimizing the index scheme for a fact
table with 40 dimensions?
thanks
regards,




Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 7:05 PM Tom Lane  wrote:

> Marc Millas  writes:
> > We would like to understand where an alter table attach partition spends
> > its time.
> > to my understanding, explain doesn't do this.
>
> Nope :-(.  As our DDL commands have gotten more complicated, there's
> been some discussion of adding that, but nothing's really been done
> yet.
>
> There is some progress-monitoring support for some DDL commands now,
> but that's not quite the same thing.
>
> > for a BI job we have a partitioned table with 1800+ partitions.
>
> TBH I'd recommend scaling that down by at least a factor of ten.
> We are not at a point where you can expect that all operations will
> be cheap even with thousands of partitions.  We may never be at that
> point, although people continue to chip away at the bottlenecks.
>
> regards, tom lane
>


single table - fighting a seq scan

2020-07-14 Thread Radoslav Nedyalkov
Hi Forum,
I'm scratching my head around the following case:

*te* is an 80M-row, 100GB table. This is a bare simple select over an indexed
attribute of it.

EXPLAIN SELECT te.id FROM te WHERE te.current_pid IN (240900026,
 240900027,
 240900028,
 -- 200 entries ...

 Gather  (cost=1000.00..61517367.85 rows=3870 width=8)
   Workers Planned: 2
   ->  Parallel Seq Scan on te  (cost=0.00..61515980.85 rows=1612 width=8)
 Filter: (current_pid = ANY
('{240900026,240900027,...240901129}'::bigint[]))
Execution time is about 5 minutes

Reducing the number of current_pids to 100 changes the plan to an index
scan (101 still does a seq scan).

 Index Scan using te_current_pid_idx on te  (cost=0.57..731.26 rows=3832
width=8) (actual time=0.566..1.667 rows=600 loops=1)
   Index Cond: (current_pid = ANY
('{240900026,240900027,...240900194}'::bigint[]))
 Planning Time: 3.152 ms
 Execution Time: 1.732 ms

Selecting 200 pids rewritten with a CTE goes for the index too.

EXPLAIN ANALYZE
WITH cte as (
select * from unnest(ARRAY[
240900026,
 240900027,
 240900028,
...
 240901129
]))
SELECT te.id FROM te join cte on te.current_pid = cte.unnest;


   QUERY PLAN
-
 Nested Loop  (cost=1.58..1097.83 rows=3847 width=8) (actual
time=0.882..14.927 rows=1468 loops=1)
   CTE cte
 ->  Function Scan on unnest  (cost=0.00..1.00 rows=100 width=4)
(actual time=0.025..0.043 rows=205 loops=1)
   ->  CTE Scan on cte  (cost=0.00..2.00 rows=100 width=4) (actual
time=0.027..0.083 rows=205 loops=1)
   ->  Index Scan using te_current_pid_idx on te  (cost=0.57..10.57 rows=38
width=16) (actual time=0.011..0.071 rows=7 loops=205)
 Index Cond: (current_pid = cte.unnest)
 Planning Time: 2.022 ms
 Execution Time: 15.044 ms


I tried random_page_cost=1, a couple of combinations with very low
cpu_index_tuple_cost and cpu_operator_cost. Only managed to get an index
scan for a few more IN entries.
Did analyze. Bumped stats target for current_pid to 5000. Did not help.

I'm out of ideas. What is the right approach to solve this ?
Thank You!

Rado


Re: how to "explain" some ddl

2020-07-14 Thread Tom Lane
Marc Millas  writes:
> We would like to understand where an alter table attach partition spends its
> time.
> to my understanding, explain doesn't do this.

Nope :-(.  As our DDL commands have gotten more complicated, there's
been some discussion of adding that, but nothing's really been done
yet.

There is some progress-monitoring support for some DDL commands now,
but that's not quite the same thing.

> for a BI job we have a partitioned table with 1800+ partitions.

TBH I'd recommend scaling that down by at least a factor of ten.
We are not at a point where you can expect that all operations will
be cheap even with thousands of partitions.  We may never be at that
point, although people continue to chip away at the bottlenecks.

regards, tom lane




Problem with FDW wrapper errors

2020-07-14 Thread Michael Nolan
An application I wrote is being modified by our development team to use an
FDW to a remote MySQL table instead of the postgresql table for a query.

We're getting this error in the logs periodically.  Maybe about once every
50,000 queries:

2020-07-14 11:35:22.799 CDT uscf ::1 ERROR:  failed to execute the MySQL
query:
Prepared statement needs to be re-prepared
2020-07-14 11:35:22.799 CDT uscf ::1 STATEMENT:  select memtp,
('2004-12-18'::date-membirthdt)/365.25  as mage from
memmast_simulant_ratings where memid = '13036134'

If I restart it, it usually goes through, but it might fail on that query
again with different data.

Is this possibly due to transient communication issues between the
postgresql server and the remote MySQL server?

If I just trap the error, wait a few seconds and re-issue the query, is
that likely to be sufficient?
--
Mike Nolan
htf...@gmail.com


Re: Same query taking less time in low configuration machine

2020-07-14 Thread Kenneth Marshall
On Tue, Jul 14, 2020 at 09:27:56PM +0530, Vishwa Kalyankar wrote:
> HI,
> 
> The OS cache is warm, and I ran the query a few times with almost the same
> result each time.
> 
> Regards,
> Vishwa

Hi Vishwa,

What are the CPU speeds, memory bandwidth, I/O bandwidth? Often the
lower core count CPUs have a faster clock speed. What is the CPU cache
size for both? Are you running the same OS and PostgreSQL build binaries
on both?

Regards,
Ken




how to "explain" some ddl

2020-07-14 Thread Marc Millas
Hi,
We would like to understand where an alter table attach partition spends its
time.
To my understanding, explain doesn't do this.

For a BI job we have a partitioned table with 1800+ partitions.
The feeding process of this table leads to detaching and attaching partitions.
Attaching does take time, something like 12 seconds per partition.
We need to understand where that time is spent (check constraints, or checks
within the default partition, or...)

So, how to ?

thanks,
regards,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: some random() clarification needed

2020-07-14 Thread David G. Johnston
Please don't top-post.  Inline (with trim) is better but at minimum
bottom-post.

On Tue, Jul 14, 2020 at 9:01 AM Marc Millas  wrote:

> Hi,
> your answer helps me understand my first problem.
> so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
> (at least I was thinking I did... looks like I was wrong !)
> step by step loop:
> DO $$
> BEGIN
>   FOR counter IN 1..1000 LOOP
>     begin
>       declare
>         id1 integer = ceiling(random()*2582);
>         id3 date = '2000-01-01';
>         id2 date;
>         pren varchar;
>       begin
>         id2 = id3 + (random()*7200)::integer;
>         SELECT prenom FROM prenoms WHERE id = id1 INTO pren;
>         INSERT INTO testparttransac (datenaissance, prenoms) VALUES (id2, pren);
>       end;
>     end;
>   END LOOP;
> END; $$;
>
> I truncated the table, executed the loop with no errors, and expected that
> a select count(*) would answer 1000!
> No: it varies, from less than 1000 (much less, something like 900)
> to more than 1000 (up to 1094)
>
> so... what's "volatile" in the loop?
>

Everything...

You are setting id1 to the result of an expression inside the loop.
Every time that statement gets executed within the loop a new random number
is produced.

I mean, even "id3 date = '2000-01-01'" is repeatedly casting (I think) the
string to a date and assigning it to the variable, even though that
statement overall is effectively immutable.
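
A minimal sketch that makes the per-iteration evaluation visible (the inner
DECLARE's default expression runs again on every entry into the block):

DO $$
BEGIN
  FOR i IN 1..3 LOOP
    DECLARE
      r double precision := random();  -- re-evaluated on each iteration
    BEGIN
      RAISE NOTICE 'iteration %: %', i, r;  -- prints three different values
    END;
  END LOOP;
END $$;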

David J.


Re: some random() clarification needed

2020-07-14 Thread Marc Millas
Hi,
your answer helps me understand my first problem.
so, I rewrote a simple loop so as to avoid the "volatile" behaviour.
(at least I was thinking I did... looks like I was wrong !)
step by step loop:
DO $$
BEGIN
  FOR counter IN 1..1000 LOOP
    begin
      declare
        id1 integer = ceiling(random()*2582);
        id3 date = '2000-01-01';
        id2 date;
        pren varchar;
      begin
        id2 = id3 + (random()*7200)::integer;
        SELECT prenom FROM prenoms WHERE id = id1 INTO pren;
        INSERT INTO testparttransac (datenaissance, prenoms) VALUES (id2, pren);
      end;
    end;
  END LOOP;
END; $$;

I truncated the table, executed the loop with no errors, and expected that
a select count(*) would answer 1000!

No: it varies, from less than 1000 (much less, something like 900)
to more than 1000 (up to 1094).

so... what's "volatile" in the loop?

BTW the testparttransac table is partitioned on datenaissance, with a
default partition.

thanks,
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 5:24 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 14, 2020 at 8:15 AM Marc Millas 
> wrote:
>
>> select id, prenom from prenoms where id=ceiling(random()*2582);
>>
>> expecting to get, always, one line.
>> But it's not the case.
>> Around 15% of the time I get 0 lines, which is already quite strange to me,
>> but 10% of the time I get a random number of lines, so far up to 4.
>> Even weirder (to me!)
>>
>> so, can someone please clarify ?
>>
>>
> You are basically asking:
>
> For each row in my table compare the id to some random number and if they
> match return that row, otherwise skip it.  The random number being compared
> to is different for each row because random() is volatile and thus
> evaluated for each row.
>
> David J.
>


Re: Same query taking less time in low configuration machine

2020-07-14 Thread Vishwa Kalyankar
HI,

The OS cache is warm, and I ran the query a few times with almost the same
result each time.

Regards,
Vishwa

On Tue, Jul 14, 2020 at 6:16 PM Philip Semanchuk <
phi...@americanefficient.com> wrote:

>
>
> > On Jul 14, 2020, at 5:27 AM, Vishwa Kalyankar <
> vishwakalyank...@gmail.com> wrote:
> >
> > Hi,
> >
> > I have two machines - one with 8GB RAM & 4core CPU and the other with
> 64GB Ram & 24 core CPU.  Both machines have the same DB (Postgres 12 +
> Postgis 2.5.3).  Same query is taking less time in low end machine whereas
> more time in high end machine.  Any thoughts on where to look?  I have
> tuned the db in both machines according to https://pgtune.leopard.in.ua/#/
> >
> >
> > Below I am pasting the output of query explain in both the machines.
> >
> >  -bash-4.2$ psql -p 5434
> > psql (12.3)
> > Type "help" for help.
> >
> > postgres=# \c IPDS_KSEB;
> > You are now connected to database "IPDS_KSEB" as user "postgres".
> > IPDS_KSEB=# explain analyze select * from
> kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
> ;
> >
> QUERY PLAN
> >
> -
> >  Function Scan on kseb_geometry_trace_with_barrier_partition
> (cost=0.25..10.25 rows=1000 width=169) (actual time=11626.548..11626.568
> rows=254 loops=1)
> >  Planning Time: 0.212 ms
> >  Execution Time: 11628.590 ms
> >
> >
> > -bash-4.2$ psql -p 5422
> > psql (12.3)
> > Type "help" for help.
> >
> > postgres=# \c IPDS_KSEB;
> > You are now connected to database "IPDS_KSEB" as user "postgres".
> > IPDS_KSEB=# explain analyze select * from
> kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
> ;
> >
> QUERY PLAN
> >
> -
> >  Function Scan on kseb_geometry_trace_with_barrier_partition
> (cost=0.25..10.25 rows=1000 width=169) (actual time=22304.425..22304.448
> rows=254 loops=1)
> >  Planning Time: 0.219 ms
> >  Execution Time: 22352.219 ms
> > (3 rows)
> >
>
> Hi Vishwa,
> Is it possible that your data is in the cache on the low end machine but
> not on the high end machine? There’s both the Postgres cache and the OS
> disk cache to consider. You can see what’s in the Postgres cache with an
> extension like pg_buffercache. I don’t know of a way to see what’s in the
> OS cache; maybe others do.
>
> Cheers
> Philip
>
>
>
>
>
>
>


Re: Surprising connection issue

2020-07-14 Thread David G. Johnston
On Tue, Jul 14, 2020 at 8:25 AM David Gasa i Castell 
wrote:

> And my surprise came when I saw the connection succeed while no user has
> been granted the right to connect to the database...
>

https://www.postgresql.org/docs/12/ddl-priv.html

"""
PostgreSQL grants privileges on some types of objects to PUBLIC by default
when the objects are created. No privileges are granted to PUBLIC by
default on tables, table columns, sequences, foreign data wrappers, foreign
servers, large objects, schemas, or tablespaces. For other types of
objects, the default privileges granted to PUBLIC are as follows:
***CONNECT and TEMPORARY (create temporary tables) privileges for
databases;*** (emphasis mine)
 EXECUTE privilege for functions and procedures; and USAGE privilege for
languages and data types (including domains).
"""

David J.


Re: some random() clarification needed

2020-07-14 Thread Marc Millas
Ok, thanks for the clarification.

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com



On Tue, Jul 14, 2020 at 5:24 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 14, 2020 at 8:15 AM Marc Millas 
> wrote:
>
>> select id, prenom from prenoms where id=ceiling(random()*2582);
>>
>> expecting to get, always, one line.
>> But it's not the case.
>> Around 15% of the time I get 0 lines, which is already quite strange to me,
>> but 10% of the time I get a random number of lines, so far up to 4.
>> Even weirder (to me!)
>>
>> so, can someone please clarify ?
>>
>>
> You are basically asking:
>
> For each row in my table compare the id to some random number and if they
> match return that row, otherwise skip it.  The random number being compared
> to is different for each row because random() is volatile and thus
> evaluated for each row.
>
> David J.
>


Surprising connection issue

2020-07-14 Thread David Gasa i Castell
Hi guys,

I don't know if what I'm going to explain could be regarded as regular
behaviour... but I'm quite surprised.

I'm working under the latest stable PostgreSQL version 12.3.

$ sudo /usr/lib/postgresql/12/bin/postgres --version
postgres (PostgreSQL) 12.3 (Debian 12.3-1.pgdg100+1)

And the contents of my pg_hba.conf is as follows,

$ sudo cat /etc/postgresql/12/main/pg_hba.conf | egrep -v
'^[[:space:]]*(#.*)?$' -
local   all             postgres                        peer
local   all             all                             peer
hostssl all             all             127.0.0.1/32    scram-sha-256
hostssl all             all             ::1/128         scram-sha-256
local   replication     all                             peer
host    replication     all             127.0.0.1/32    md5
host    replication     all             ::1/128         md5

(Obviously the 'password_encryption' variable is correctly set to
scram-sha-256)

Once connected, I created a new user (user1) with the superuser grant and
another one (user2), as described below.

$ psql
psql (12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.

postgres=# CREATE USER user1 WITH SUPERUSER;
CREATE ROLE
postgres=# \password user1
Enter new password:
Enter it again:
postgres=# CREATE USER user2;
CREATE ROLE
postgres=# \password user2
Enter new password:
Enter it again:

Once done, I created a new database db1, making user1 its owner.

postgres=# CREATE DATABASE db1 OWNER user1;
CREATE DATABASE



postgres=# \du
                                       List of roles
 Role name |                         Attributes                          | Member of
-----------+-------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 user1     | Superuser                                                   | {}
 user2     |                                                             | {}

postgres=# \l
                                   List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 db1       | user1    | UTF8     | ca_ES.UTF-8 | ca_ES.UTF-8 |
 postgres  | postgres | UTF8     | ca_ES.UTF-8 | ca_ES.UTF-8 |
 template0 | postgres | UTF8     | ca_ES.UTF-8 | ca_ES.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | ca_ES.UTF-8 | ca_ES.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres=# \q

After that, I tried to connect to database db1 as user user1.

$ psql -h localhost -d db1 -U user1
Password for user user1:
psql (12.3 (Debian 12.3-1.pgdg100+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits:
256, compression: off)
Type "help" for help.

db1=# \q

And my surprise came when I saw the connection succeed while no user has
been granted the right to connect to the database...

Ok -I thought- maybe because user1 is a superuser... or even maybe because
user1 is in fact the owner of the database db1.


So, I decided to try it again with another user (user2) !

$ psql -h localhost -d db1 -U user2
Password for user user2:
psql (12.3 (Debian 12.3-1.pgdg100+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits:
256, compression: off)
Type "help" for help.

db1=> \q
$

And honestly I don't know if I'm looking at a bug... or whether there is
some reason that explains all of this?

-- 
David Gasa i Castell

Linux User #488832


Re: some random() clarification needed

2020-07-14 Thread David G. Johnston
On Tue, Jul 14, 2020 at 8:15 AM Marc Millas  wrote:

> select id, prenom from prenoms where id=ceiling(random()*2582);
>
> expecting to get, always, one line.
> But it's not the case.
> Around 15% of the time I get 0 lines, which is already quite strange to me,
> but 10% of the time I get a random number of lines, so far up to 4.
> Even weirder (to me!)
>
> so, can someone please clarify ?
>
>
You are basically asking:

For each row in my table compare the id to some random number and if they
match return that row, otherwise skip it.  The random number being compared
to is different for each row because random() is volatile and thus
evaluated for each row.
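
A minimal sketch of the usual fix: wrapping the volatile call in an
uncorrelated sub-select forces a single evaluation for the whole query:

SELECT id, prenom FROM prenoms WHERE id = (SELECT ceiling(random()*2582));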

David J.


some random() clarification needed

2020-07-14 Thread Marc Millas
Hi,

when, in psql, on a postgres 12.3, I write:
select ceiling(random()*2582);
it does provide the expected answer, i.e. a number between 1 and 2582,
inclusive.
Always.
when I decide to use this to get a random row within a table prenoms having
2 columns,
an id serial and a prenom varchar, with exactly 2582 rows, no gaps,
I write:
select id, prenom from prenoms where id=ceiling(random()*2582);

expecting to get, always, one line.
But it's not the case.
Around 15% of the time I get 0 lines, which is already quite strange to me,
but 10% of the time I get a random number of lines, so far up to 4.
Even weirder (to me!)

so, can someone please clarify ?

thanks,
regards,


Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com


Re: Both side privileges

2020-07-14 Thread Toomas Kristin

And please be sure that default privileges are defined properly. Otherwise it
may happen that userA creates a new table but userB has no access to it. The
alternative is that the user has to grant proper access to the role group
every time a new db object is created.
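
A minimal sketch, assuming a shared group role named app_group (the role and
schema names are hypothetical):

-- run once per creating role and schema; applies to tables created afterwards
ALTER DEFAULT PRIVILEGES FOR ROLE usera IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_group;
ALTER DEFAULT PRIVILEGES FOR ROLE userb IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_group;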

Toomas

> On 14. Jul 2020, at 16:36, Jean-Philippe Chenel  wrote:
> 
> Thanks for your answer, David.
> This is what I'll do.
> From: David G. Johnston <david.g.johns...@gmail.com>
> Sent: July 13, 2020 19:46
> To: Jean-Philippe Chenel <jp.che...@live.ca>
> Cc: pgsql-general@lists.postgresql.org
> Subject: Re: Both side privileges
>
> On Mon, Jul 13, 2020 at 4:42 PM Jean-Philippe Chenel <jp.che...@live.ca> wrote:
> Hi,
> 
> I am trying to give userA privileges on userB's objects, and the same thing
> for userB, giving it privileges on userA's objects.
> 
> Grant userB to userA; —ok
> Grant userA to userB; —error: role userB is already member of role userA
> 
> Create a "group role" that retains ownership and then add both users to that 
> group.
> 
> David J.



Re: Both side privileges

2020-07-14 Thread Jean-Philippe Chenel
Thanks for your answer, David.
This is what I'll do.

From: David G. Johnston <david.g.johns...@gmail.com>
Sent: July 13, 2020 19:46
To: Jean-Philippe Chenel <jp.che...@live.ca>
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Both side privileges

On Mon, Jul 13, 2020 at 4:42 PM Jean-Philippe Chenel
<jp.che...@live.ca> wrote:
Hi,

I am trying to give userA privileges on userB's objects, and the same thing
for userB, giving it privileges on userA's objects.

Grant userB to userA; —ok
Grant userA to userB; —error: role userB is already member of role userA

Create a "group role" that retains ownership and then add both users to that 
group.
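
A minimal sketch of that suggestion (the role and table names are
hypothetical):

CREATE ROLE shared_owner NOLOGIN;              -- group role that owns the objects
GRANT shared_owner TO usera, userb;            -- both users inherit its rights
ALTER TABLE some_table OWNER TO shared_owner;  -- repeat for each shared object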

David J.


Re: JDBC driver version for a given Postgres version

2020-07-14 Thread Thomas Kellerer
Shantanu Shekhar schrieb am 14.07.2020 um 14:34:
> We are upgrading our Postgres instance from 9.6.11 to 10.11. Then as
> part of a second upgrade we will go from 10.11 to 11.6. Currently
> (with 9.6.11) we are using the 42.2.1 JDBC driver. I am trying to
> figure out the impact our database upgrade will have on the JDBC
> driver version but I cannot find any such matrix that maps
> recommended JDBC driver version to the Postgres RDBMS instance
> version.
>
> How should I go about researching this information? Any help will be
> greatly appreciated.


From https://jdbc.postgresql.org/download.html

> It supports PostgreSQL 8.2 or newer

You can keep using 42.2.1 or upgrade to the latest release 42.2.14

Thomas




Re: Same query taking less time in low configuration machine

2020-07-14 Thread Philip Semanchuk



> On Jul 14, 2020, at 5:27 AM, Vishwa Kalyankar  
> wrote:
> 
> Hi,  
> 
> I have two machines - one with 8GB RAM & 4core CPU and the other with 64GB 
> Ram & 24 core CPU.  Both machines have the same DB (Postgres 12 + Postgis 
> 2.5.3).  Same query is taking less time in low end machine whereas more time 
> in high end machine.  Any thoughts on where to look?  I have tuned the db in 
> both machines according to https://pgtune.leopard.in.ua/#/ 
> 
> 
> Below I am pasting the output of query explain in both the machines.
> 
>  -bash-4.2$ psql -p 5434
> psql (12.3)
> Type "help" for help.
> 
> postgres=# \c IPDS_KSEB;
> You are now connected to database "IPDS_KSEB" as user "postgres".
> IPDS_KSEB=# explain analyze select * from 
> kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
>  ;
>QUERY 
> PLAN
> -
>  Function Scan on kseb_geometry_trace_with_barrier_partition  
> (cost=0.25..10.25 rows=1000 width=169) (actual time=11626.548..11626.568 
> rows=254 loops=1)
>  Planning Time: 0.212 ms
>  Execution Time: 11628.590 ms
> 
> 
> -bash-4.2$ psql -p 5422
> psql (12.3)
> Type "help" for help.
> 
> postgres=# \c IPDS_KSEB;
> You are now connected to database "IPDS_KSEB" as user "postgres".
> IPDS_KSEB=# explain analyze select * from 
> kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
>  ;
>QUERY 
> PLAN
> -
>  Function Scan on kseb_geometry_trace_with_barrier_partition  
> (cost=0.25..10.25 rows=1000 width=169) (actual time=22304.425..22304.448 
> rows=254 loops=1)
>  Planning Time: 0.219 ms
>  Execution Time: 22352.219 ms
> (3 rows)
> 

Hi Vishwa,
Is it possible that your data is in the cache on the low end machine but not on 
the high end machine? There’s both the Postgres cache and the OS disk cache to 
consider. You can see what’s in the Postgres cache with an extension like 
pg_buffercache. I don’t know of a way to see what’s in the OS cache; maybe 
others do.
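
A minimal sketch of checking the Postgres side with pg_buffercache (the
extension ships with contrib; the join below is the usual pattern):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;
-- top 10 relations by number of shared buffers currently cached
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;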

Cheers
Philip










JDBC driver version for a given Postgres version

2020-07-14 Thread Shantanu Shekhar
We are upgrading our Postgres instance from 9.6.11 to 10.11. Then as part of a 
second upgrade we will go from 10.11 to 11.6. Currently (with 9.6.11) we are 
using the 42.2.1 JDBC driver. I am trying to figure out the impact our database 
upgrade will have on the JDBC driver version but I cannot find any such matrix 
that maps recommended JDBC driver version to the Postgres RDBMS instance 
version.
How should I go about researching this information? Any help will be greatly
appreciated.
Thanks,
Shantanu


psql: FATAL: database "postgres" does not exist or ERROR: 23505: duplicate key value violates unique constraint "pg_namespace_nspname_index"

2020-07-14 Thread TALLURI Nareshkumar
Hello Postgres Support Team,

Today we had an outage: our DB went down because the filesystem was 100% full.
We added space and were able to bring the cluster back up.

DB version: psql (PostgreSQL) 10.12
OS version : Red Hat Enterprise Linux Server release 7.8 (Maipo)

[0]postgres@axmdevhkdb008$ [PHKGAXMD008] pg_ctl start -D 
/AXMDEVHKDB008/postgres/PHKGAXMD008_bck
waiting for server to start...
2020-07-14 08:50:42.273 CEST db:[] user:[] [] [0] LOG:  listening on IPv4 address "176.5.88.68", port 5433
2020-07-14 08:50:42.278 CEST db:[] user:[] [] [0] LOG:  listening on Unix 
socket "/var/run/postgresql/.s.PGSQL.5433"
2020-07-14 08:50:42.280 CEST db:[] user:[] [] [0] LOG:  listening on Unix 
socket "/tmp/.s.PGSQL.5433"
2020-07-14 08:50:42.546 CEST db:[] user:[] [] [0] LOG:  redirecting log 
output to logging collector process
2020-07-14 08:50:42.546 CEST db:[] user:[] [] [0] HINT:  Future log output 
will appear in directory "log".
done
server started
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base
 [0]postgres@axmdevhkdb008$ [PHKGAXMD008] psql
psql: FATAL:  database "postgres" does not exist
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base
 [2]postgres@axmdevhkdb008$ [PHKGAXMD008] ps -ef|grep postgres
root 16567 16513  0 14:12 pts/100:00:00 sudo su - postgres
root 16569 16567  0 14:12 pts/100:00:00 su - postgres
postgres 16571 16569  0 14:12 pts/100:00:00 -ksh
postgres 23888 1  0 14:50 pts/100:00:00 
/opt/rh/rh-postgresql10/root/usr/bin/postgres -D 
/AXMDEVHKDB008/postgres/PHKGAXMD008_bck
postgres 23890 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: logger 
process
postgres 23892 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: 
checkpointer process
postgres 23893 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: writer 
process
postgres 23894 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: wal 
writer process
postgres 23895 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: 
autovacuum launcher process
postgres 23896 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: stats 
collector process
postgres 23897 23888  0 14:50 ?00:00:00 postgres: PHKGAXMD008: 
bgworker: logical replication launcher
postgres 24689 16571  0 14:55 pts/100:00:00 ps -ef
postgres 24690 16571  0 14:55 pts/100:00:00 grep --color=auto postgres
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base

We have the directories at Data location.

 [2]postgres@axmdevhkdb008$ [PHKGAXMD008] ls -lrt
total 180
drwx-- 2 postgres postgres   4096 Jan  2  2020 1
drwx-- 2 postgres postgres   4096 Jan  2  2020 13211
drwx-- 2 postgres postgres   4096 Jul 13 09:51 pgsql_tmp
drwx-- 2 postgres postgres 139264 Jul 13 21:02 16389
drwx-- 2 postgres postgres  12288 Jul 13 21:02 13212
drwx-- 2 postgres postgres   4096 Jul 14 11:08 13213
drwx-- 2 postgres postgres  12288 Jul 14 11:08 13214
__  [RRF]postgres@axmdevhkdb008:/AXMDEVHKDB008/postgres/PHKGAXMD008_bck/base


How do we overcome this?

Note: we don't have any WAL files; it is OK for us if we can just bring it up
and recover what we can.

Second issue:

Since we were not able to connect to the DB, we did a new initialization and
copied the directories from the old base directory to the new base directory
(in the new data location),

and updated the pg_catalog.pg_database table with the DB name:

(postgres@[local]:5433)# [PHKGAXMP003] INSERT INTO pg_catalog.pg_database(
    datname, datdba, encoding, datcollate, datctype, datistemplate, datallowconn,
    datconnlimit, datlastsysoid, datfrozenxid, datminmxid, dattablespace, datacl)
VALUES ('axiom', 10, 0, 'C', 'C', False, True, -1, 16389, '536', '1', 1663, Null);
INSERT 16384 1
Time: 70.239 ms
(postgres@[local]:5433)# [PHKGAXMP003]select oid from pg_database a where 
a.datname = 'axiom';
  oid
---
16384
(1 row)


After this we were able to see the DB axiom.

   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges   |  Size   | Tablespace |                 Description
-----------+----------+-----------+---------+-------+-----------------------+---------+------------+---------------------------------------------
 axiom     | postgres | SQL_ASCII | C       | C     |                       | 16 GB   | pg_default |
 postgres  | postgres | SQL_ASCII | C       | C     |                       | 7647 kB | pg_default | default administrative connection database
 template0 | postgres | SQL_ASCII | C       | C     | =c/postgres          +| 7513 kB | pg_default | unmodifiable empty database
           |          |           |         |       | postgres=CTc/postgres |         |            |
 template1 | postgres | SQL_ASCII | C       | C     | =c/postgres          +| 7513 kB | pg_default | default template for new databases
           |          |           |         |       |

Same query taking less time in low configuration machine

2020-07-14 Thread Vishwa Kalyankar
Hi,

I have two machines - one with 8GB RAM & 4core CPU and the other with 64GB
Ram & 24 core CPU.  Both machines have the same DB (Postgres 12 + Postgis
2.5.3).  Same query is taking less time in low end machine whereas more
time in high end machine.  Any thoughts on where to look?  I have tuned the
db in both machines according to https://pgtune.leopard.in.ua/#/


Below I am pasting the output of query explain in both the machines.

 -bash-4.2$ psql -p 5434
psql (12.3)
Type "help" for help.

postgres=# \c IPDS_KSEB;
You are now connected to database "IPDS_KSEB" as user "postgres".
IPDS_KSEB=# explain analyze select * from
kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
;

 QUERY PLAN
-
 Function Scan on kseb_geometry_trace_with_barrier_partition
 (cost=0.25..10.25 rows=1000 width=169) (actual time=11626.548..11626.568
rows=254 loops=1)
 Planning Time: 0.212 ms
 Execution Time: *11628.590 ms*


-bash-4.2$ psql -p 5422
psql (12.3)
Type "help" for help.

postgres=# \c IPDS_KSEB;
You are now connected to database "IPDS_KSEB" as user "postgres".
IPDS_KSEB=# explain analyze select * from
kseb_geometry_trace_with_barrier_partition(5,'kottarakara_version',437,'htline',2)
;

 QUERY PLAN
-
 Function Scan on kseb_geometry_trace_with_barrier_partition
 (cost=0.25..10.25 rows=1000 width=169) (actual time=22304.425..22304.448
rows=254 loops=1)
 Planning Time: 0.219 ms
 Execution Time: *22352.219 ms*
(3 rows)


Re: Doubt in mvcc

2020-07-14 Thread Laurenz Albe
On Mon, 2020-07-13 at 13:01 +0530, Rama Krishnan wrote:
> I am preparing for an interview. One of the recruiters asked me about MVCC
> drawbacks; I said that due to MVCC it uses more space and needs maintenance
> activity.

Yes.  Generally speaking, you have to pay a price for keeping old versions of
the data around, no matter how you implement it.

> Another one is that the same data causes an update conflict because two
> different transactions can update the same version of the row.
> He told me that's wrong; will you please tell me whether it's correct or
> wrong?

There is always a certain version (the latest) that can be updated, so this is
the same no matter if you have MVCC or not: if two sessions want to update the
same row, one has to wait until the other is done.
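
A minimal sketch of that behaviour with a hypothetical accounts table; the
comments mark which of two concurrent psql sessions runs each statement:

-- session 1:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- session 2 (blocks here until session 1 commits or rolls back):
UPDATE accounts SET balance = balance + 100 WHERE id = 1;

-- session 1:
COMMIT;  -- session 2 now sees the new row version and its update proceeds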

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com