Re: [GENERAL] BDR question on dboid conflicts

2017-10-26 Thread Craig Ringer
On 27 October 2017 at 01:15, Zhu, Joshua  wrote:
> Database oid is used in both bdr.bdr_nodes, as node_dboid, and
> bdr.bdr_connections, as conn_dboid, also used in construction of replication
> slot names.

Correct. However, it's used in conjunction with the sysid and node timeline ID.

> I noticed that when trying to join a bdr group, if the database oid on the
> new node happens to be the same as that of a node already in the bdr group,
> the join would fail, and the only way to resolve the conflict that I was
> able to come up with has been to retry with dropping/recreating the database
> until the dboid does not conflict with any node already in the group.

That is extremely surprising. In our regression tests the database
oids should be the same quite often, as we do various tests where we
create multiple instances. More importantly, every time you
bdr_init_copy, you get a clone with the same database oid, and that
works fine.

There's no detail here to work from, so I cannot guess what's actually
happening, but I can confidently say it's not a database oid conflict.
Nowhere in BDR should the database oid be considered without the rest
of the (sysid,timeline,dboid) tuple.
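
For anyone debugging this, the full tuple is easy to inspect; the queries below assume the BDR 1.x catalog layout:

```sql
-- A row's identity is the (sysid, timeline, dboid) triple, not dboid alone.
SELECT node_sysid, node_timeline, node_dboid, node_status, node_name
FROM bdr.bdr_nodes;

-- The local node's own identity tuple:
SELECT * FROM bdr.bdr_get_local_nodeid();
```

Since the sysid comes from pg_control and differs for every initdb'd instance, two nodes can share a node_dboid without colliding.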


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] query not scaling

2017-10-26 Thread Rob Sargent


On 10/26/2017 09:01 AM, Tom Lane wrote:

Laurenz Albe  writes:

Also, to have PostgreSQL inline the function, which would be good
for performance, it should be declared IMMUTABLE.

Actually, if you hope to have a SQL function be inlined, it's better
not to decorate it at all --- not with IMMUTABLE, and not with STRICT
either.  Both of those restrict the parser's ability to inline unless
it can prove the contained expression is equally immutable/strict.
With the default attributes of volatile/not strict, there's nothing
to prove.

(In any case, it's usually easy enough to tell from EXPLAIN output
whether inlining has happened.)

regards, tom lane


In another instance of the same schema, in the same database as the original
slow execution, I've loaded 823591 segments (though in this case all
of them are on one chromosome, one markerset) and 65K proband sets
using the same marker table as the slow(est) query.  In the fastest run,
there are only 46K segments for the given markerset.



                                QUERY PLAN
--------------------------------------------------------------------------
 HashAggregate  (cost=82122076.59..8215.35 rows=14876 width=48) (actual time=208203.091..208210.348 rows=14645 loops=1)
   Output: m.id, min(((1.0 * (((s.events_greater + s.events_equal) + 0))::numeric) / ((((s.events_less + s.events_equal) + s.events_greater) + 0))::numeric))
   Group Key: m.id
   Buffers: shared hit=43209090
   ->  Nested Loop  (cost=3799.40..44686205.23 rows=1361304413 width=40) (actual time=55.443..89684.451 rows=75577302 loops=1)
         Output: m.id, s.events_greater, s.events_equal, s.events_less
         Buffers: shared hit=43209090
         ->  Hash Join  (cost=3798.98..43611.56 rows=823591 width=32) (actual time=55.393..1397.509 rows=823591 loops=1)
               Output: s.events_greater, s.events_equal, s.events_less, s.startbase, s.endbase
               Inner Unique: true
               Hash Cond: (s.probandset_id = p.id)
               Buffers: shared hit=19222
               ->  Seq Scan on sui.segment s  (cost=0.00..29414.86 rows=823591 width=48) (actual time=0.017..669.915 rows=823591 loops=1)
                     Output: s.id, s.chrom, s.markerset_id, s.probandset_id, s.startbase, s.endbase, s.firstmarker, s.lastmarker, s.events_less, s.events_equal, s.events_greater
                     Filter: ((s.chrom = 22) AND (s.markerset_id = 'edf95066-24d2-4ca1-bad6-aa850cc82fef'::uuid))
                     Buffers: shared hit=17061
               ->  Hash  (cost=2979.99..2979.99 rows=65519 width=16) (actual time=55.272..55.272 rows=65519 loops=1)
                     Output: p.id
                     Buckets: 65536  Batches: 1  Memory Usage: 3584kB
                     Buffers: shared hit=2161
                     ->  Seq Scan on sui.probandset p  (cost=0.00..2979.99 rows=65519 width=16) (actual time=0.007..33.582 rows=65519 loops=1)
                           Output: p.id
                           Filter: (p.people_id = '9b2308b1-9659-4a2c-91ae-8f95cd0a90b3'::uuid)
                           Buffers: shared hit=2161
         ->  Index Scan using marker_chrom_basepos_idx on base.marker m  (cost=0.42..37.67 rows=1653 width=20) (actual time=0.010..0.075 rows=92 loops=823591)
               Output: m.id, m.name, m.chrom, m.basepos, m.alleles
               Index Cond: ((m.chrom = 22) AND (m.basepos >= s.startbase) AND (m.basepos <= s.endbase))
               Buffers: shared hit=43189868
 Planning time: 0.764 ms
 Execution time: 208214.816 ms
(30 rows)



[GENERAL] Combining row returning functions using "ROWS FROM"?

2017-10-26 Thread Guyren Howe
I just noticed the reference to combining set returning functions using “ROWS 
FROM” at 
https://www.postgresql.org/docs/10/static/queries-table-expressions.html#queries-tablefunctions
 


I have not previously noticed this feature, nor seen it used anywhere. The 
explanation is… terse, and the examples are… missing, AFAICT.

I would love to find a good write-up on how to use this feature. Failing that, 
I’ll happily write one and post it, if someone here can give me some examples 
and explanation of the intended use.

I’ll note in passing that this is not a syntax fragment that is very 
searchable, so please don’t RTFM me… :-)
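
For the archives, a minimal example of the syntax — if I read the docs right, the functions' result sets are zipped together positionally and NULL-padded to the length of the longest one, rather than cross-joined:

```sql
SELECT *
FROM ROWS FROM (
    generate_series(1, 4),       -- 4 rows
    unnest(ARRAY['a', 'b'])      -- 2 rows, NULL-padded to 4
) AS t(n, letter);
-- n | letter
-- 1 | a
-- 2 | b
-- 3 | (null)
-- 4 | (null)
```

Contrast with `SELECT * FROM generate_series(1, 4), unnest(ARRAY['a', 'b']);`, which produces the 8-row cross product instead.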


Re: [GENERAL] Bucardo syncrun

2017-10-26 Thread Greg Sabino Mullane



Emanuel Araújo asked:

> Can I purge data from the bucardo.syncrun table in Postgres?
>
> If I follow these steps:
> 
> bucardo stop
> psql bucardo
> truncate bucardo.syncrun ;
> \q
> bucardo start
>
> Will these steps purge all past data from that table?  Is this correct? Is
> this the simplest method?

Yes, you could also simply remove older entries without needing to 
stop bucardo first, like so:

delete from bucardo.syncrun where ended < now() - interval '1 week';

If there were a very large number of rows deleted, you should also:

vacuum full bucardo.syncrun;

The data in that table is not critical to Bucardo running - it is mostly 
used for the output of "bucardo status" and for auditing of past syncs.

P.S. Bucardo has its own mailing list: bucardo-gene...@bucardo.org, 
which in general is better than -general. :)

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201710261427
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8






[GENERAL] Announcing PostgreSQL SLES RPM Repository

2017-10-26 Thread Devrim Gündüz

Hi,

I am proud to announce the new and shiny PostgreSQL RPM repository for SLES 12:
https://zypp.postgresql.org/.

The new repo contains (almost) all of the packages that the PostgreSQL YUM
repository (https://yum.PostgreSQL.org) has. Currently, we support PostgreSQL
10, 9.6 and 9.5, along with more than a hundred packages for each version. There
are a few packages missing, and I will add them in the upcoming days.

I also wrote a basic howto for using PostgreSQL RPMs on SLES, so please read it
first if you are not that familiar with SLES:
https://zypp.postgresql.org/howtozypp.php .

For any questions/comments, please send an email to
pgsql-pkg-...@postgresql.org , or (preferably) create a ticket at
https://redmine.postgresql.org/projects/pgrpms/ using your community
account.

This is a part of EnterpriseDB's contribution to the community: EDB provided
the hardware and let me use my time for these packages, so I want to thank EDB.

Regards,
-- 
Devrim Gündüz
EnterpriseDB: https://www.enterprisedb.com
PostgreSQL Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR




Re: [GENERAL] Question regarding logical replication

2017-10-26 Thread Weiping Qu

Hi. Thank you very much for such a detailed explanation. :)
We are currently testing the overhead of the log-based Change Data Capture
method (i.e. logical decoding) over PostgreSQL.
The test setup consists of one process running TPC-C on node1, which issues
transactions against a database residing on node2, which is in turn accessed
by three CDC processes running on three different nodes, say node3-5.
The only difference between the three CDC processes is the set of tables each
is interested in; what they have in common is that each sleeps periodically
while capturing changes from node2.


We always measured the impact of CDC on the original TPC-C workload by
looking at the transaction throughput on node1.
We selected 0ms, 200ms and 400ms as three different sleeping periods for
the CDC processes to control their impact on node2.
We expected that the longer the sleeping period set for CDC, the less
impact would be incurred on PostgreSQL, since fewer I/Os are triggered to
fetch data from the xlog.
However, the plots showed a different trend (currently I don't have the
plots on my laptop): the more frequently the CDC processes read from the
logical slots, the less overhead is incurred on PostgreSQL, which leads to
higher throughput.


That's the reason I asked the previous question, whether a logical slot is
implemented as a queue.
Without continuous dequeuing, the "queue" gets larger and larger, thus
slowing down the OLTP workload.


Regards;

Weiping


On 26.10.2017 21:42, Alvaro Aguayo Garcia-Rada wrote:

Hi. I've had experience with both BDR & pglogical. For each replication slot, 
postgres saves an LSN which points to the last xlog entry read by the client. When a 
client does not read xlog, for example if it cannot connect to the server, then 
the distance between that LSN (pg_replication_slots.restart_lsn) and the current 
xlog location (pg_current_xlog_insert_location()) will grow over time. I'm not 
sure about the following, but postgres will not clear old xlog entries which are 
still pending to be read by any replication slot. The same situation may also happen, 
to a lesser degree, if the client cannot read WAL as fast as it's produced. Anyhow, 
what will happen is that the xlog will grow more and more. However, that will probably 
not impact performance, as xlog is written anyway. But if you don't have enough free 
space, you could end up with your partition full of xlog.

Regards,

Alvaro Aguayo
Operations Manager
Open Comb Systems E.I.R.L.

Office: (+51-1) 3377813 | Mobile: (+51) 995540103 | (+51) 954183248
Web: www.ocs.pe

- Original Message -
From: "Weiping Qu" 
To: "PostgreSql-general" 
Sent: Thursday, 26 October, 2017 14:07:54
Subject: [GENERAL] Question regarding logical replication

Dear postgresql community,

I have a question regarding understanding the implementation logic
behind logical replication.

Assuming a replication slot is created on the master node, will more and more
data pile up in the slot, and will the size of the replication slot
continuously increase, if no slave is reading/dequeuing data out of this
slot (or only very slowly), thus incurring high I/O and slowing down
transaction throughput?

Looking forward to your explanation.


Kindly review and please share your comments on this matter.










Re: [GENERAL] Question regarding logical replication

2017-10-26 Thread Alvaro Aguayo Garcia-Rada
Hi. I've had experience with both BDR & pglogical. For each replication slot, 
postgres saves an LSN which points to the last xlog entry read by the client. 
When a client does not read xlog, for example if it cannot connect to the 
server, then the distance between that LSN (pg_replication_slots.restart_lsn) 
and the current xlog location (pg_current_xlog_insert_location()) will grow 
over time. I'm not sure about the following, but postgres will not clear old 
xlog entries which are still pending to be read by any replication slot. The 
same situation may also happen, to a lesser degree, if the client cannot read 
WAL as fast as it's produced. Anyhow, what will happen is that the xlog will 
grow more and more. However, that will probably not impact performance, as 
xlog is written anyway. But if you don't have enough free space, you could 
end up with your partition full of xlog.
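
The growing distance described above can be watched from SQL (pre-10 function names, matching the era of this thread; in PostgreSQL 10 these were renamed to pg_current_wal_insert_lsn() and pg_wal_lsn_diff()):

```sql
-- Bytes of WAL the server must retain for each slot; a steadily growing
-- number means the client is not keeping up (or is not connected).
SELECT slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_insert_location(),
                             restart_lsn) AS retained_bytes
FROM pg_replication_slots;
```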

Regards,

Alvaro Aguayo
Operations Manager
Open Comb Systems E.I.R.L.

Office: (+51-1) 3377813 | Mobile: (+51) 995540103 | (+51) 954183248
Web: www.ocs.pe

- Original Message -
From: "Weiping Qu" 
To: "PostgreSql-general" 
Sent: Thursday, 26 October, 2017 14:07:54
Subject: [GENERAL] Question regarding logical replication

Dear postgresql community,

I have a question regarding understanding the implementation logic 
behind logical replication.

Assuming a replication slot is created on the master node, will more and more
data pile up in the slot, and will the size of the replication slot
continuously increase, if no slave is reading/dequeuing data out of this
slot (or only very slowly), thus incurring high I/O and slowing down
transaction throughput?

Looking forward to your explanation.


Kindly review and please share your comments on this matter.






[GENERAL] Question regarding logical replication

2017-10-26 Thread Weiping Qu

Dear postgresql community,

I have a question regarding understanding the implementation logic 
behind logical replication.


Assuming a replication slot is created on the master node, will more and more
data pile up in the slot, and will the size of the replication slot
continuously increase, if no slave is reading/dequeuing data out of this
slot (or only very slowly), thus incurring high I/O and slowing down
transaction throughput?


Looking forward to your explanation.


Kindly review and please share your comments on this matter.






[GENERAL] Bucardo syncrun

2017-10-26 Thread Emanuel Araújo
Hi,

Can I purge data from the bucardo.syncrun table in Postgres?

If I follow these steps:

bucardo stop
psql bucardo
truncate bucardo.syncrun ;
\q
bucardo start

Will these steps purge all past data from that table?  Is this correct? Is
this the simplest method?

-- 


*Atenciosamente,Emanuel Araújo*

*Linux Certified, DBA PostgreSQL*


[GENERAL] BDR question on dboid conflicts

2017-10-26 Thread Zhu, Joshua
The database oid is used in both bdr.bdr_nodes (as node_dboid) and 
bdr.bdr_connections (as conn_dboid); it is also used in the construction of 
replication slot names.

I noticed that when trying to join a bdr group, if the database oid on the new 
node happens to be the same as that of a node already in the bdr group, the 
join would fail, and the only way to resolve the conflict that I was able to 
come up with has been to retry with dropping/recreating the database until the 
dboid does not conflict with any node already in the group.

Is there a better way to handle this kind of conflicts, especially doing so in 
a script?

Thanks


Re: [GENERAL] query not scaling

2017-10-26 Thread Rob Sargent



On 10/26/2017 09:01 AM, Tom Lane wrote:

Laurenz Albe  writes:

Also, to have PostgreSQL inline the function, which would be good
for performance, it should be declared IMMUTABLE.

Actually, if you hope to have a SQL function be inlined, it's better
not to decorate it at all --- not with IMMUTABLE, and not with STRICT
either.  Both of those restrict the parser's ability to inline unless
it can prove the contained expression is equally immutable/strict.
With the default attributes of volatile/not strict, there's nothing
to prove.

(In any case, it's usually easy enough to tell from EXPLAIN output
whether inlining has happened.)

regards, tom lane

As to the explain analyze,

   could not receive data from server: Connection timed out
   Time: 7877340.565 ms

for the second time.  I had presumed at first that this had occurred 
during a network burp.  I'll try running it directly on the pg host.




Re: [GENERAL] query not scaling

2017-10-26 Thread Tom Lane
Laurenz Albe  writes:
> Also, to have PostgreSQL inline the function, which would be good
> for performance, it should be declared IMMUTABLE.

Actually, if you hope to have a SQL function be inlined, it's better
not to decorate it at all --- not with IMMUTABLE, and not with STRICT
either.  Both of those restrict the parser's ability to inline unless
it can prove the contained expression is equally immutable/strict.
With the default attributes of volatile/not strict, there's nothing
to prove.

(In any case, it's usually easy enough to tell from EXPLAIN output
whether inlining has happened.)

regards, tom lane
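
A quick sketch of what Tom describes (the function and names here are made up for illustration):

```sql
-- Undecorated SQL function: with the default volatile/not-strict
-- attributes there is nothing for the planner to prove, so the body
-- can be inlined into the calling query.
CREATE FUNCTION in_range(p integer, lo integer, hi integer)
RETURNS boolean AS
$$ SELECT p BETWEEN lo AND hi $$ LANGUAGE sql;

-- If inlining happened, EXPLAIN shows the expanded BETWEEN condition
-- as a plain filter; if not, you see a call to in_range() instead.
EXPLAIN SELECT * FROM generate_series(1, 100) g(x)
WHERE in_range(x, 3, 7);
```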




Re: [GENERAL] query not scaling

2017-10-26 Thread Rob Sargent


> On Oct 26, 2017, at 1:02 AM, Laurenz Albe  wrote:
> 
> Rob Sargent wrote:
>> I have a query I cannot tame and I'm wondering if there's an alternative
>> to  the "between"  clause I'm  using.  Perhaps  a custom  type could  do
>> better?  I've  tried the "<@" operator and that changes the  query plan
>> significantly but the execution cost/time is not improved.
>> 
>> Any suggestion or pointers much appreciated.
> 
> You really need EXPLAIN (ANALYZE, BUFFERS) output to be able to
> understand what is going on.
> 
> A couple of simple things to check:
> 
> - Have all tables been ANALYZED beforehand?
> - Are all optimizer database parameters identical?
> 
> Also, to have PostgreSQL inline the function, which would be good
> for performance, it should be declared IMMUTABLE.
> 
> Yours,
> Laurenz Albe
The explain analyze was (maybe still is) running, but without buffers. 
Thought of the immutable bit. Will be doing that test. 
All tables vacuumed and analyzed with each structural change. 
Thanks




Re: [GENERAL] Making subscribers read only in Postgres 10 logical replication

2017-10-26 Thread Peter Eisentraut
On 10/11/17 14:18, rverghese wrote:
> I'm testing out logical replication on PostgreSQL 10. Is there a setting to
> make subscribers read-only slaves, like with Slony? Currently I can insert
> into the Publisher and the Subscriber. If there is a conflict, i.e. same
> record exists in both, then all replication gets backed up (even to other
> tables) till that one record is resolved.

Right now there is no direct way to do that.  The other answers have
suggested some workarounds.  It might be a valuable feature to implement
something like that.  One would just have to think through exactly how
to present this in the user interface.

Another longer-term solution here is to implement conflict resolution
mechanisms.  So if you don't like local updates to break the incoming
replication stream, a remote-update-wins policy would help.
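
Until then, one workaround is simply revoking write privileges on the subscriber: the apply worker runs as the subscription owner (normally a superuser), so replication itself is unaffected. A sketch, assuming application clients connect as a hypothetical role app_user:

```sql
-- On the subscriber only: block local writes from the application role.
REVOKE INSERT, UPDATE, DELETE, TRUNCATE
    ON ALL TABLES IN SCHEMA public FROM app_user;

-- Cover tables created later as well:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    REVOKE INSERT, UPDATE, DELETE, TRUNCATE ON TABLES FROM app_user;
```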

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] query not scaling

2017-10-26 Thread Laurenz Albe
Rob Sargent wrote:
> I have a query I cannot tame and I'm wondering if there's an alternative
> to  the "between"  clause I'm  using.  Perhaps  a custom  type could  do
> better?  I've  tried the "<@" operator and that changes the  query plan
> significantly but the execution cost/time is not improved.
> 
> Any suggestion or pointers much appreciated.

You really need EXPLAIN (ANALYZE, BUFFERS) output to be able to
understand what is going on.

A couple of simple things to check:

- Have all tables been ANALYZED beforehand?
- Are all optimizer database parameters identical?

Also, to have PostgreSQL inline the function, which would be good
for performance, it should be declared IMMUTABLE.

Yours,
Laurenz Albe

