Re: [HACKERS] postgres-r

2009-08-15 Thread Markus Wanner
Hi,

[ CC'ing to the postgres-r mailing list ]

Mark Mielke wrote:
 On 08/12/2009 12:04 PM, Suno Ano wrote:
 can anybody tell me, is there a roadmap with regards to
 http://www.postgres-r.org ?

I'm glad you're asking.

 I would love to see it become production-ready asap.

Yes, me too. Do you have some spare cycles to spend? I'd be happy to
help you get started. However, I have a 16-day-old daughter at
home, so please don't expect response times under a few days ;-)

 Even a breakdown of what is left to do might be useful in case any of us
 want to pick at it. :-)

The TODO file from the patch is a good place to start. For the sake
of simplicity, I've attached it.

I wrote a series of posts covering various topics of Postgres-R
about a year ago. Here are the links; following the discussions
down-thread might be interesting as well.

Postgres-R: current state of development, Jul 15 2008:
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00735.php

Postgres-R: primary key patches, Jul 16 2008
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00777.php

Postgres-R: tuple serialization, Jul 22 2008
http://archives.postgresql.org/pgsql-hackers/2008-07/msg00969.php

Postgres-R: internal messaging, Jul 23 2008
http://archives.postgresql.org/pgsql-hackers/2008-07/msg01051.php


Some random updates to last year's current state of development that
come to mind:

 * I've adjusted the signaling to use the signal
   multiplexer code that recently landed on HEAD.
 * Work on the initialization and recovery stuff is
   progressing slowly, but steadily.
 * The tuple serialization code is being refactored ATM
   to get a lot smaller and easier to understand and
   debug.

That should give you an impression of the current state of development, I
think. Please feel free to ask more specific questions.

Regards

Markus Wanner

P.S.: Suno, did you note the addition of the link to the Postgres-R
mailing list, which you pointed out was hard to find?
URGENT
==

* Implement parsing of the replication_gcs GUC for spread and ensemble.
* Check more extensively for places where replication_enabled should be
  checked.


complaint about select() not interrupted by signals:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg00448.php

restartable signals 'n all that
http://archives.postgresql.org/pgsql-hackers/2007-07/msg3.php

3.2.1 Internal Message Passing
==

* Maybe send IMSGT_READY after some other commands, not only after
  IMSGT_CHANGESET. Remember that local transactions also have to send an
  IMSGT_READY, so that their proc-coid gets reset.

* Make sure the coordinator copes with killed backends (local as
  well as remote ones).

* Check if we can use pselect to avoid race conditions with IMessage stuff
  within the coordinator's main loop.
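
  For reference, a minimal, self-contained sketch of the pselect() pattern
  that would close that race -- a signal arriving between checking the queue
  and entering select() is otherwise lost until the next wakeup. All names
  here are invented for illustration, not taken from the patch:

    #include <signal.h>
    #include <sys/select.h>

    static volatile sig_atomic_t imessages_pending = 0;

    static void
    imessage_signal_handler(int signo)
    {
        imessages_pending = 1;
    }

    static void
    coordinator_main_loop(int sock)
    {
        struct sigaction sa;
        sigset_t    blocked,
                    unblocked;
        fd_set      rfds;

        sa.sa_handler = imessage_signal_handler;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGUSR2, &sa, NULL);

        /* keep SIGUSR2 blocked except while sleeping in pselect() */
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGUSR2);
        sigprocmask(SIG_BLOCK, &blocked, &unblocked);

        for (;;)
        {
            if (imessages_pending)
            {
                imessages_pending = 0;
                /* drain the imessage queue here */
            }

            FD_ZERO(&rfds);
            FD_SET(sock, &rfds);

            /*
             * The old mask is restored atomically for the duration of the
             * wait, so a SIGUSR2 sent after the check above interrupts
             * pselect() instead of being lost.
             */
            (void) pselect(sock + 1, &rfds, NULL, NULL, NULL, &unblocked);
        }
    }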

* Check error conditions such as out of memory and out of disk space. Those
  could prevent a single node from applying a remote transaction. What to
  do in such cases? A similar case is reaching the limit of queued remote
  transactions.


3.2.2 Communication with the Postmaster
===

* Get rid of the SIGHUP signal (was IMSGT_SYSTEM_READY) for the
  coordinator and instead only start the coordinator as soon as the
  postmaster is ready to fork helper backends. Should simplify things and
  make them more similar to the current Postgres code, e.g. the
  autovacuum launcher.

* Handle restarts of the coordinator due to a crashed backend. The
  postmaster already sends a signal to terminate an existing
  coordinator process and it tries to restart one. But the coordinator
  should then start recovery and only allow other backends after that.

  Keep in mind that this recovery process is costly and we should somehow
  prevent nodes which fail repeatedly from endlessly consuming resources
  of the complete cluster.

* The backends need to report errors from remote *and* local transactions
  to the coordinator. Worker backends erroring out while waiting for
  changesets are critical. Erroring out due to a serialization failure is fine;
  we can simply ignore the changeset once it arrives later on. But other
  errors are probably pretty bad at that stage. Upon crashes, the postmaster
  restarts all backends and the coordinator anyway, so the backend
  process itself can take care of informing the coordinator via
  imessages.

* Think about a newly requested helper backend crashing before it
  registers with the coordinator. That would prevent requesting any further
  helper backends.


3.2.3 Group Communication System Issues
===

* Drop the static receive buffers of the GCS interfaces in favor of a
  dynamic one. It's much easier to handle.

* Hot swapping of the underlying GCS of a replicated database is currently
  not supported. It would involve waiting for all nodes of the group to
  have joined the new group, then swap.

  If we enforce the GCS group name to 

[HACKERS] postgres-r

2009-08-12 Thread Suno Ano
Hello,

can anybody tell me, is there a roadmap with regards to
http://www.postgres-r.org ?

I would love to see it become production-ready asap.





Re: [HACKERS] postgres-r

2009-08-12 Thread Mark Mielke

On 08/12/2009 12:04 PM, Suno Ano wrote:

can anybody tell me, is there a roadmap with regards to
http://www.postgres-r.org ?

I would love to see it become production-ready asap.
   


Even a breakdown of what is left to do might be useful in case any of us 
want to pick at it. :-)


Cheers,
mark

--
Mark Mielke <m...@mielke.cc>




[HACKERS] Postgres-R (8) Architecture published

2009-01-13 Thread Markus Wanner
Dear hackers,

during the last three months, I've compiled my thoughts and ideas around
Postgres-R into a rounded concept covering lots of aspects. It
integrates findings from up-to-date research papers and is meant to
shed light on the future direction of development for Postgres-R. See
www.postgres-r.org/about/news/concept

I'm curious to hear what you think about it and what I can do to
ease review of Postgres-R in general.

Also note that I welcome the current log shipping approach and think of
Postgres-R as a future enhancement which can benefit from the experience
with other replication solutions.

A short PSA: I am currently looking for work. My background is primarily
in development (ANSI C, C++ w/ STL, Python, JavaScript), databases and
distributed systems. If you have need for custom development, preferably
related to Postgres, please contact me offlist for more information.

Regards

Markus Wanner



Re: [HACKERS] Postgres-R patch

2008-10-30 Thread Markus Wanner
Hi,

Imre Geczy wrote:
 What form or method must be used to apply the patch so that it works correctly?

I finally got around to writing some installation instructions:
http://www.postgres-r.org/documentation/installation

Regards

Markus Wanner




[HACKERS] Postgres-R patch

2008-10-24 Thread Imre Geczy
Hi All,


I would like to ask for help with Postgres-R... I have tried to use it with 
various versions from CVS and the normal Postgres source code, but could not 
get the patch to apply. What form or method must be used to apply the patch 
so that it works correctly?

Thanks.
Imre


  

Re: [HACKERS] postgres-R

2008-08-21 Thread Markus Wanner

Hi,

Marcelo Martins wrote:

Anyone know a link that has some docs about how to get that set up?


Besides the README and other documentation in the source, there's 
admittedly not much. Check the archive of this mailing list.



Also, is it stable enough for production?


No.

I thought getting PostgreSQL from CVS and compiling was not such a good 
idea since the CVSROOT is probably not stable, is that wrong?


Generally, that's a reasonable approach if you want to get stable 
software. However, Postgres-R is still in development, and there's no 
stable version. When taking snapshots, I'm only checking if Postgres-R 
still compiles.


since I could not find info out there this is what I have done to check 
it out and I downloaded the postgres-r snapshot patch

snapshot 2008-08-13   88 kb   postgres-r-20080813.diff.bz2

cvs -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot 
checkout -D 2008-08-13 -P pgsql


Any comment much appreciated


That sounds like the right thing to do if you want to peek at the source 
code or test-drive the prototype.


Regards

Markus Wanner



[HACKERS] postgres-R

2008-08-20 Thread Marcelo Martins


Anyone know a link that has some docs about how to get that set up?
Also, is it stable enough for production?
I thought getting PostgreSQL from CVS and compiling was not such a good  
idea since the CVSROOT is probably not stable, is that wrong?


since I could not find info out there this is what I have done to  
check it out and I downloaded the postgres-r snapshot patch

snapshot 2008-08-13   88 kb   postgres-r-20080813.diff.bz2

cvs -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot  
checkout -D 2008-08-13 -P pgsql


Any comment much appreciated

-
thank you
M



Re: [HACKERS] postgres-R

2008-08-20 Thread Joshua D. Drake

Marcelo Martins wrote:


Anyone know a link that has some docs about how to get that set up?
Also, is it stable enough for production?
I thought getting PostgreSQL from CVS and compiling was not such a good 
idea since the CVSROOT is probably not stable, is that wrong?


since I could not find info out there this is what I have done to check 
it out and I downloaded the postgres-r snapshot patch

snapshot 2008-08-13   88 kb   postgres-r-20080813.diff.bz2

cvs -d :pserver:[EMAIL PROTECTED]:/projects/cvsroot 
checkout -D 2008-08-13 -P pgsql


http://www.postgresql.org/docs/current/static/cvs.htm



Any comment much appreciated

-
thank you
M






Re: [HACKERS] Postgres-R

2008-08-19 Thread K, Niranjan (NSN - IN/Bangalore)
Unfortunately, I'm getting the error below when I start the gossip. I had 
followed the same steps as you mentioned.

REFLECT:I'm not in the list of gossip hosts, exiting
  (the hosts are [cluster_1|cluster_2])

cluster_1 & cluster_2 are node names and are in /etc/hosts.

Did you face this?

regards,
Niranjan

-Original Message-
From: ext leiyonghua [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, August 19, 2008 6:42 AM
To: K, Niranjan (NSN - IN/Bangalore)
Cc: Markus Wanner; pgsql-hackers@postgresql.org
Subject: Re: Postgres-R

hi,
Assume that we have two nodes:
node 0 , 192.168.0.2
node 1 , 192.168.0.3
1. add a host entry in /etc/hosts for hostname resolving.
2. add the host list in configuration 'ensemble.conf' for the gossip service:
ENS_GOSSIP_HOSTS=node0:node1
3. set the environment variable ENS_CONFIG_FILE:
export ENS_CONFIG_FILE=/xxx/xxx/ensemble.conf
4. start ensemble & gossip
5. try 'c_mtalk' and be happy.

this is the simplest case for me, hehe!

leiyonghua



K, Niranjan (NSN - IN/Bangalore) wrote:
 Thanks for the information.
 For Step 5 (starting the ensemble daemon): I set the multicast address on 
 both nodes (Node 1 & Node 2, eth0: 224.0.0.9/4) before starting ensemble. 
 I then started the server application mtalk on node 1 & node 2 and then the 
 client application on node 1 & node 2. But the count of members ('nmembers') 
 shows as 1. This is the output of the client program 'c_mtalk'. Seeing this, 
 I'm assuming that the applications are not merged.
 Could you please let me know how you proceeded with the setup of ensemble?

 regards,
 Niranjan

 -Original Message-
 From: ext leiyonghua [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 18, 2008 2:58 PM
 To: K, Niranjan (NSN - IN/Bangalore); Markus Wanner; 
 pgsql-hackers@postgresql.org
 Subject: Re: Postgres-R

 [EMAIL PROTECTED] wrote:
   
 I wish to set up the Postgres-R environment, could you please let me know 
 the steps for setting it up.
 Thanks.


   
 
 Yeah, actually, I have not been successful in setting this up, but let me give 
 you some information.
 1. download the postgresql snapshot source code from here:
 http://www.postgresql.org/ftp/snapshot/dev/
 (this is a daily tarball)

 2. Get the corresponding patch for postgres-r from:
 http://www.postgres-r.org/downloads/

 3. apply the patch to the snapshot source, and configure like this:

 ./configure --enable-replication
 make && make install

 4. install the GCS ensemble, according to the document: 
 http://www.cs.technion.ac.il/dsl/projects/Ensemble/doc.html

 5. start ensemble daemon and gossip if necessary (yes, make sure the 
 two nodes can 'GCS' each other)

 6. Assuming that you have two nodes, start up PostgreSQL and create a database 
 'db', and create a table 'tb' for testing, which should have a primary key 
 on all nodes.

 7. At the origin node, execute the command at the psql console:
 alter database db start replication in group gcs; (which means the 
 database 'db' is the origin and the group 'gcs' is the GCS group name)

 8. At the subscriber node, execute the command:
 alter database db accept replication from group gcs;


 Hope information above would be helpful, and keep in touch.

 leiyonghua


   




Re: [HACKERS] Postgres-R

2008-08-19 Thread Markus Wanner

Hi,

leiyonghua wrote:

./configure --enable-replication
make && make install


You certainly also want --enable-debug and --enable-cassert, and maybe also 
additional flags for the C compiler, like -DRMGR_DEBUG; please check the 
source code for these.


4. install the GCS ensemble, according to the document: 
http://www.cs.technion.ac.il/dsl/projects/Ensemble/doc.html


5. start ensemble daemon and gossip if necessary (yes, make sure the 
two nodes can 'GCS' each other)


Yeah, either use the gossip process, or make sure IP multicast works for 
your network configuration. I admit that ensemble is quite a beast WRT 
compilation and configuration.


6. Assuming that you have two nodes, start up PostgreSQL and create a 
database 'db', and create a table 'tb' for testing, which should have 
a primary key on all nodes.


7. At the origin node, execute the command at the psql console:
alter database db start replication in group gcs;
(which means the database 'db' is the origin and the group 'gcs' is the 
GCS group name)


8. At the subscriber node, execute the command:
alter database db accept replication from group gcs;


As recovery doesn't work automatically, you still need to sync the 
complete database from the node which initiated the replication group. 
Then accept replication.


I'm working on automatic recovery.

Regards

Markus Wanner




Re: [HACKERS] Postgres-R

2008-08-19 Thread Markus Wanner
Hi,

K, Niranjan (NSN - IN/Bangalore) wrote:
 Thanks for the information.
 For Step 5 (starting the ensemble daemon): 
 I set the multicast address on both nodes (Node 1 & Node 2, eth0: 224.0.0.9/4) 
 before starting ensemble. I then started the server application mtalk on 
 node 1 & node 2 and then the client application on node 1 & node 2. But the 
 count of members ('nmembers') shows as 1. This is the output of the client 
 program 'c_mtalk'. Seeing this, I'm assuming that the applications are not merged.

This sounds like IP multicast does not work properly for your network
(is IP multicast available and enabled for your OS? Maybe you are
running on virtual hosts with a virtual network, which doesn't support
multicasting?). You can either try to fix that or switch to using a
gossip process.

Regards

Markus Wanner




Re: [HACKERS] Postgres-R

2008-08-18 Thread leiyonghua

[EMAIL PROTECTED] wrote:

I wish to set up the Postgres-R environment, could you please let me know the 
steps for setting it up.
Thanks.


  
Yeah, actually, I have not been successful in setting this up, but let me 
give you some information.

1. download the postgresql snapshot source code from here:
http://www.postgresql.org/ftp/snapshot/dev/
(this is a daily tarball)

2. Get the corresponding patch for postgres-r from:
http://www.postgres-r.org/downloads/

3. apply the patch to the snapshot source, and configure like this:

./configure --enable-replication
make && make install

4. install the GCS ensemble, according to the document: 
http://www.cs.technion.ac.il/dsl/projects/Ensemble/doc.html


5. start ensemble daemon and gossip if necessary (yes, make sure the 
two nodes can 'GCS' each other)


6. Assuming that you have two nodes, start up PostgreSQL and create a 
database 'db', and create a table 'tb' for testing, which should have 
a primary key on all nodes.


7. At the origin node, execute the command at the psql console:
alter database db start replication in group gcs;
(which means the database 'db' is the origin and the group 'gcs' is the 
GCS group name)


8. At the subscriber node, execute the command:
alter database db accept replication from group gcs;


Hope information above would be helpful, and keep in touch.

leiyonghua



Re: [HACKERS] Postgres-R

2008-08-18 Thread K, Niranjan (NSN - IN/Bangalore)
Thanks for the information.
For Step 5 (starting the ensemble daemon): 
I set the multicast address on both nodes (Node 1 & Node 2, eth0: 224.0.0.9/4) 
before starting ensemble. I then started the server application mtalk on node 
1 & node 2 and then the client application on node 1 & node 2. But the count of 
members ('nmembers') shows as 1. This is the output of the client program 
'c_mtalk'. Seeing this, I'm assuming that the applications are not merged.
Could you please let me know how you proceeded with the setup of ensemble?

regards,
Niranjan

-Original Message-
From: ext leiyonghua [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 18, 2008 2:58 PM
To: K, Niranjan (NSN - IN/Bangalore); Markus Wanner; 
pgsql-hackers@postgresql.org
Subject: Re: Postgres-R

[EMAIL PROTECTED] wrote:
 I wish to set up the Postgres-R environment, could you please let me know the 
 steps for setting it up.
 Thanks.


   
Yeah, actually, I have not been successful in setting this up, but let me give 
you some information.
1. download the postgresql snapshot source code from here:
http://www.postgresql.org/ftp/snapshot/dev/
(this is a daily tarball)

2. Get the corresponding patch for postgres-r from:
http://www.postgres-r.org/downloads/

3. apply the patch to the snapshot source, and configure like this:

./configure --enable-replication
make && make install

4. install the GCS ensemble, according to the document: 
http://www.cs.technion.ac.il/dsl/projects/Ensemble/doc.html

5. start ensemble daemon and gossip if necessary (yes, make sure the two 
nodes can 'GCS' each other)

6. Assuming that you have two nodes, start up PostgreSQL and create a database 
'db', and create a table 'tb' for testing, which should have a primary key 
on all nodes.

7. At the origin node, execute the command at the psql console:
alter database db start replication in group gcs; (which means the database 
'db' is the origin and the group 'gcs' is the GCS group name)

8. At the subscriber node, execute the command:
alter database db accept replication from group gcs;


Hope information above would be helpful, and keep in touch.

leiyonghua



Re: [HACKERS] Postgres-R

2008-08-18 Thread leiyonghua
hi,
Assume that we have two nodes:
node 0 , 192.168.0.2
node 1 , 192.168.0.3
1. add a host entry in /etc/hosts for hostname resolving.
2. add the host list in configuration 'ensemble.conf' for the gossip service:
ENS_GOSSIP_HOSTS=node0:node1
3. set the environment variable ENS_CONFIG_FILE:
export ENS_CONFIG_FILE=/xxx/xxx/ensemble.conf
4. start ensemble & gossip
5. try 'c_mtalk' and be happy.

this is the simplest case for me, hehe!

leiyonghua



K, Niranjan (NSN - IN/Bangalore) wrote:
 Thanks for the information.
 For Step 5 (starting the ensemble daemon): 
 I set the multicast address on both nodes (Node 1 & Node 2, eth0: 224.0.0.9/4) 
 before starting ensemble. I then started the server application mtalk on 
 node 1 & node 2 and then the client application on node 1 & node 2. But the 
 count of members ('nmembers') shows as 1. This is the output of the client 
 program 'c_mtalk'. Seeing this, I'm assuming that the applications are not merged.
 Could you please let me know how you proceeded with the setup of ensemble?

 regards,
 Niranjan

 -Original Message-
 From: ext leiyonghua [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 18, 2008 2:58 PM
 To: K, Niranjan (NSN - IN/Bangalore); Markus Wanner; 
 pgsql-hackers@postgresql.org
 Subject: Re: Postgres-R

[EMAIL PROTECTED] wrote:
   
 I wish to set up the Postgres-R environment, could you please let me know 
 the steps for setting it up.
 Thanks.


   
 
 Yeah, actually, I have not been successful in setting this up, but let me give 
 you some information.
 1. download the postgresql snapshot source code from here:
 http://www.postgresql.org/ftp/snapshot/dev/
 (this is a daily tarball)

 2. Get the corresponding patch for postgres-r from:
 http://www.postgres-r.org/downloads/

 3. apply the patch to the snapshot source, and configure like this:

 ./configure --enable-replication
 make && make install

 4. install the GCS ensemble, according to the document: 
 http://www.cs.technion.ac.il/dsl/projects/Ensemble/doc.html

 5. start ensemble daemon and gossip if necessary (yes, make sure the two 
 nodes can 'GCS' each other)

 6. Assuming that you have two nodes, start up PostgreSQL and create a database 
 'db', and create a table 'tb' for testing, which should have a primary key 
 on all nodes.

 7. At the origin node, execute the command at the psql console:
 alter database db start replication in group gcs; (which means the database 
 'db' is the origin and the group 'gcs' is the GCS group name)

 8. At the subscriber node, execute the command:
 alter database db accept replication from group gcs;


 Hope information above would be helpful, and keep in touch.

 leiyonghua


   




Re: [HACKERS] Postgres-R: internal messaging

2008-07-30 Thread Markus Wanner

Hi,

That's now changed in today's snapshot of Postgres-R: the postmaster no 
longer uses imessages (and thus shared memory) to communicate with the 
replication manager. Instead the manager signals the postmaster using a 
newish PMSIGNAL for requesting new helper backends. It now only requests 
one helper at a time and keeps track of pending requests. The helper 
backends now read the name of the database to which they must connect 
from shared memory themselves. That should now adhere to the standard 
Postgres rules for shared memory safety.
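
For illustration, the request side of this boils down to something like the 
following sketch; PMSIGNAL_START_REPLICATION_HELPER and the rmgr_shmem struct 
are invented names, not necessarily what the snapshot actually uses:

  /* replication manager: publish the database name, then poke the postmaster */
  strlcpy(rmgr_shmem->helper_dbname, dbname, NAMEDATALEN);
  SendPostmasterSignal(PMSIGNAL_START_REPLICATION_HELPER);

  /* helper backend, right after being forked by the postmaster */
  char        dbname[NAMEDATALEN];

  strlcpy(dbname, rmgr_shmem->helper_dbname, NAMEDATALEN);
  /* ... connect to that database, then register with the manager ... */

This is similar in spirit to how autovacuum workers pick up their target 
database.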


Additionally, the replication manager is also restarted after a backend 
crash, to make sure it never tries to work on corrupted shared memory. 
However, that part isn't complete, as the replication manager cannot 
really handle that situation just yet. There are other outstanding 
issues having to do with that change. Those are documented in the TODO 
file in src/backend/replication/.


Regards

Markus Wanner




Re: [HACKERS] Postgres-R: internal messaging

2008-07-24 Thread Markus Wanner

Hi,

Tom Lane wrote:

I hope you're not expecting the contents of shared memory to still be
trustworthy after a backend crash.


Hm.. that's a good point.

So I either need to bullet-proof the imessages with checksums or some 
such. I'm not sure that's doable reliably, not to mention the performance cost.


Thus it might be better to just restart the replication manager as well. 
Note that this means leaving the replication group temporarily and going 
through node recovery to apply remote transactions it has missed in 
between. This sounds expensive, but it's certainly the safer way to do 
it. And as such backend crashes are Expected Not To Happen(tm) on 
production systems, that's probably good enough.



If the manager is working strictly
from its own local memory, then it would be reasonable to operate
as above.


That's not the case... :-(

Thanks for your excellent guidance.

Regards

Markus




[HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Markus Wanner

Hi,

As you certainly know by now, Postgres-R introduces an additional
manager process. That one is forked from the postmaster, as are all
backends, no matter whether they are processing local or remote transactions.
That led to a communication problem, which was originally (i.e. around
Postgres-R for 6.4) solved by using Unix pipes. I didn't like that
approach for various reasons: first, AFAIK there are portability issues;
second, it eats file descriptors; and third, it involves copying the
messages around several times. As the replication manager needs to talk to the
backends, but they both need to be forked from the postmaster, pipes
would also have to go through the postmaster process.

Trying to be as portable as Postgres itself and still wanting an
efficient messaging system, I came up with that imessages stuff, which
I've already posted to -patches before [1]. It uses shared memory to
store and 'transfer' the messages and signals to notify other processes
(the so far unused SIGUSR2, IIRC). Of course this implies having a hard
limit on the total size of messages waiting to be delivered, due to the
fixed size of the shared memory area.

Besides the communication between the replication manager and the
backends, which is currently done by using these imessages, the
replication manager also needs to communicate with the postmaster: it
needs to be able to request new helper backends and it wants to be
notified upon termination (or crash) of such a helper backend (and other 
backends as well...). I'm currently doing this with imessages as well, 
which violates the rule that the postmaster may not touch shared 
memory. I didn't look into ripping that out, yet. I'm not sure it can be 
done with the existing signaling of the postmaster.


Let's have a simple example: consider a local transaction which changes 
some tuples. Those are being collected into a change set, which gets 
written to the shared memory area as an imessage for the replication 
manager. The backend then also signals the manager, which then awakes 
from its select(), checks its imessages queue and processes the message, 
delivering it to the GCS. It then removes the imessage from the shared 
memory area again.
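
To make the mechanics a bit more concrete, here is a heavily simplified sketch 
of such a message header and the backend side of that hand-off; the message 
type names appear in the attached TODO, but the struct layout and helper names 
are invented here for illustration -- the real IMessage code is in the patch:

  typedef enum IMessageType
  {
      IMSGT_CHANGESET,
      IMSGT_READY
  } IMessageType;

  typedef struct IMessage
  {
      IMessageType type;
      pid_t       sender;        /* backend that produced the message     */
      pid_t       recipient;     /* here: the replication manager         */
      size_t      size;          /* payload bytes following the header    */
      /* payload follows immediately after the header */
  } IMessage;

  /* backend side of the example above */
  IMessage   *msg = imessage_create(IMSGT_CHANGESET, manager_pid, cs_len);

  memcpy(IMESSAGE_PAYLOAD(msg), changeset_buf, cs_len);  /* copy into shmem  */
  imessage_activate(msg);        /* link the message into the shared queue  */
  kill(manager_pid, SIGUSR2);    /* wake the manager from its select()      */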


My initial design features only a single doubly linked list as the 
message queue, holding all messages for all processes. An imessages lock 
blocks concurrent write access. That's still what's in there, but I 
realize that's not enough. Each process should rather have its own 
queue, and the single lock needs to vanish to avoid contention on that 
lock. However, that would require dynamically allocatable shared memory...


As another side node: I've had to write methods similar to those in 
libpq, which serialize and deserialize integers or strings. The libpq 
functions were not appropriate because they cannot write shared memory, 
instead they are designed to flush to a socket, if I understand 
correctly. Maybe, these could be extended or modified to be usable there 
as well? I've been hesitating and rather implemented separate methods in 
src/backend/storage/ipc/buffer.c.


Comments?

Regards

Markus Wanner

[1]: last time I published IMessage stuff on -patches, WIP:
http://archives.postgresql.org/pgsql-patches/2007-01/msg00578.php




Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Alexey Klyukin
Markus Wanner wrote:
 Besides the communication between the replication manager and the
 backends, which is currently done by using these imessages, the
 replication manager also needs to communicate with the postmaster: it
 needs to be able to request new helper backends and it wants to be
 notified upon termination (or crash) of such a helper backend (and other  
 backends as well...). I'm currently doing this with imessages as well,  
 which violates the rule that the postmaster may not touch shared  
 memory. I didn't look into ripping that out, yet. I'm not sure it can be  
 done with the existing signaling of the postmaster.

In Replicator we avoided the need for postmaster to read/write backend's
shmem data by using it as a signal forwarder. When a backend wants to
inform a special process (i.e. queue monitor) about replication-related
event (such as commit) it sends SIGUSR1 to Postmaster with a related
reason flag and the postmaster upon receiving this signal forwards
it to the destination process. Termination of backends and special
processes is handled by the postmaster itself.
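
In Postgres terms, such forwarding can be expressed with the existing 
postmaster-signal machinery, roughly like the following sketch 
(PMSIGNAL_REPLICATION_EVENT and QueueMonitorPID are made-up names, not 
Replicator's actual code):

  /* backend, e.g. at commit time: record the reason and signal the postmaster */
  SendPostmasterSignal(PMSIGNAL_REPLICATION_EVENT);

  /* postmaster, in its SIGUSR1 handler: forward the signal without reading
   * any message data from shared memory */
  if (CheckPostmasterSignal(PMSIGNAL_REPLICATION_EVENT) &&
      QueueMonitorPID != 0)
      signal_child(QueueMonitorPID, SIGUSR1);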


 Let's have a simple example: consider a local transaction which changes  
 some tuples. Those are being collected into a change set, which gets  
 written to the shared memory area as an imessage for the replication  
 manager. The backend then also signals the manager, which then awakes  
 from its select(), checks its imessages queue and processes the message,  
 delivering it to the GCS. It then removes the imessage from the shared  
 memory area again.

Hm... what would happen to new data under heavy load, when the queue
eventually fills up with messages: would the relevant transactions
be aborted, or would they wait for the manager to release the queue
space occupied by already processed messages? ISTM that having a
fixed-size buffer limits the maximum transaction rate.


 My initial design features only a single doubly linked list as the  
 message queue, holding all messages for all processes. An imessages lock  
 blocks concurrent writing acces. That's still what's in there, but I  
 realize that's not enough. Each process should better have its own  
 queue, and the single lock needs to vanish to avoid contention on that  
 lock. However, that would require dynamically allocatable shared 
 memory...

What about keeping the per-process message queue in the local memory of
the process, and exporting only the queue head to the shmem, thus having
only one message per process there? When the queue manager gets a
message from the process, it may signal that process to copy the next
message from the process's local memory into the shmem. To keep a
correct ordering of queue messages, an additional shared memory queue of
pid_t can be maintained, containing one pid per message.
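
A rough sketch of the data layout this proposal implies (names and sizes are 
invented here for illustration):

  /* one slot per backend: only the head of its local queue is exported */
  typedef struct ExportedMessage
  {
      bool        valid;                      /* slot currently holds a message */
      size_t      size;
      char        payload[IMSG_MAX_SIZE];     /* fixed size, one message only   */
  } ExportedMessage;

  /* global ring preserving the order in which messages were exported */
  typedef struct MessageOrderQueue
  {
      int         head;
      int         tail;
      pid_t       pids[MAX_QUEUED_MESSAGES];  /* one entry per exported message */
  } MessageOrderQueue;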

-- 
Alexey Klyukin http://www.commandprompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Markus Wanner

Hi Alexey,

thanks for your feedback, these are interesting points.

Alexey Klyukin wrote:

In Replicator we avoided the need for postmaster to read/write backend's
shmem data by using it as a signal forwarder. When a backend wants to
inform a special process (i.e. queue monitor) about replication-related
event (such as commit) it sends SIGUSR1 to Postmaster with a related
reason flag and the postmaster upon receiving this signal forwards
it to the destination process. Termination of backends and special
processes are handled by the postmaster itself.


Hm.. how about larger data chunks, like change sets? In Postgres-R, 
those need to travel between the backends and the replication manager, 
which then sends it to the GCS.


Hm... what would happen to new data under heavy load, when the queue
eventually fills up with messages: would the relevant transactions
be aborted, or would they wait for the manager to release the queue
space occupied by already processed messages? ISTM that having a
fixed-size buffer limits the maximum transaction rate.


That's why the replication manager is a very simple forwarder, which 
does not block messages, but consumes them immediately from shared 
memory. It already features a message cache, which holds messages it 
cannot currently forward to a backend, because all backends are busy.


And it takes care to only send change sets to helper backends which are 
not busy and can process the remote transaction immediately. 
That way, I don't think the limit on shared memory is the bottleneck. 
However, I didn't measure.


WRT waiting vs aborting: I think at the moment I don't handle this 
situation gracefully. I've never encountered it. ;-)  But I think the 
simpler option is letting the sender wait until there is enough room in 
the queue for its message. To avoid deadlocks, each process should 
consume its messages, before trying to send one. (Which is done 
correctly only for the replication manager ATM, not for the backends, IIRC).
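
Spelled out as a sketch (with invented helper names), that rule would look 
like:

  /* blocking send that never sleeps while holding unread messages, so two
   * waiting senders cannot deadlock each other */
  static void
  imessage_send_blocking(pid_t recipient, const void *payload, size_t len)
  {
      for (;;)
      {
          consume_own_imessages();     /* drain our own queue first          */

          if (imessage_try_send(recipient, payload, len))
              return;                  /* there was room in the shared area  */

          wait_for_imessage_space();   /* e.g. sleep on a semaphore, retry   */
      }
  }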



What about keeping the per-process message queue in the local memory of
the process, and exporting only the queue head to the shmem, thus having
only one message per-process there.


The replication manager already does that with its cache. No other 
process needs to send (large enough) messages which cannot be consumed 
immediately. So such a local cache does not make much sense for any 
other process.


Even for the replication manager, I find it dubious to require such a 
cache, because it introduces an unnecessary copying of data within memory.



When the queue manager gets a
message from the process it may signal that process to copy the next
message from the process local memory into the shmem. To keep a
correct ordering of queue messages an additional shared memory queue of
pid_t can be maintained, containing one pid per each message.


The replication manager takes care of the ordering for cached messages.

Regards

Markus Wanner




Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Tom Lane
Alexey Klyukin [EMAIL PROTECTED] writes:
 Markus Wanner wrote:
 I'm currently doing this with imessages as well,  
 which violates the rule that the postmaster may not touch shared  
 memory. I didn't look into ripping that out, yet. I'm not sure it can be  
 done with the existing signaling of the postmaster.

 In Replicator we avoided the need for postmaster to read/write backend's
 shmem data by using it as a signal forwarder.

You should also look at the current code for communication between
autovac launcher and autovac workers.  That seems to be largely a
similar problem, and it's been solved in a way that seems to be
safe enough with respect to the postmaster vs shared memory issue.

regards, tom lane



Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Markus Wanner

Hi,

Tom Lane wrote:

You should also look at the current code for communication between
autovac launcher and autovac workers.  That seems to be largely a
similar problem, and it's been solved in a way that seems to be
safe enough with respect to the postmaster vs shared memory issue.


Oh yeah, thanks for reminding me. Back when it was added I thought I 
might find some helpful insights in there. But I didn't ever take the 
time to read through it...


Regards

Markus Wanner




Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Markus Wanner

Hi,

what follows are some comments after trying to understand how the 
autovacuum launcher works and thoughts on how to apply this to the 
replication manager in Postgres-R.


The initial comments in autovacuum.c say:


If the fork() call fails in the postmaster, it sets a flag in the shared
memory area, and sends a signal to the launcher.  


I note that the shmem area that the postmaster is writing to is pretty 
static and not dependent on any other state stored in shmem. That 
certainly makes a difference compared to my imessages approach, where a 
corruption in the shmem for imessages could also confuse the postmaster.


Reading on, the 'can_launch' flag in the launcher's main loop makes sure 
that only one worker is requested at a time, so that the launcher 
doesn't miss a failure or success notice from either the postmaster or 
the newly started worker. The replication manager currently shamelessly 
requests as many helper backends as it wants. I think I can change that 
without much trouble. It would certainly make sense.
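
Adopting the same discipline in the replication manager would amount to 
something like this sketch (the PMSIGNAL value and the flags are invented 
names):

  /* request at most one helper backend at a time, like the avlauncher does */
  if (!helper_request_pending && helpers_wanted > 0)
  {
      helper_request_pending = true;
      SendPostmasterSignal(PMSIGNAL_START_REPLICATION_HELPER);
  }

  /*
   * helper_request_pending is cleared again when the new helper registers
   * with the manager, or when the postmaster reports that the fork failed.
   */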


What remains is notifying the replication manager after termination or crashes of 
a helper backend. Upon normal errors (i.e. elog(ERROR... ), the 
backend processes themselves should take care of notifying the 
replication manager. But crashes are more difficult. IMO the replication 
manager needs to stay alive during this reinitialization, to keep the 
GCS connection. However, it can easily detach from shared memory 
temporarily (the imessages stuff is the only shmem place it touches, 
IIRC). A more difficult aspect is that it must be able to tell whether a 
backend has applied its transaction *before* it died or not. Thus, after 
all backends have been killed, the postmaster needs to hold off 
reinitializing shared memory until the replication manager has consumed 
all its messages. (Otherwise we would risk losing local transactions, 
probably also remote ones.)


So, yes, after thinking about it, detaching the postmaster from shared 
memory seems doable for Postgres-R (in the sense of the postmaster does 
not rely on possibly corrupted data in shared memory). Reinitialization 
needs some more thoughts, but in general that seems like the way to go.


Regards

Markus Wanner




Re: [HACKERS] Postgres-R: internal messaging

2008-07-23 Thread Tom Lane
Markus Wanner [EMAIL PROTECTED] writes:
 ... crashes are more difficult. IMO the replication 
 manager needs to stay alive during this reinitialization, to keep the 
 GCS connection. However, it can easily detach from shared memory 
 temporarily (the imessages stuff is the only shmem place it touches, 
 IIRC). A more difficult aspect is that it must be able to tell whether a 
 backend has applied its transaction *before* it died or not. Thus, after 
 all backends have been killed, the postmaster needs to hold off 
 reinitializing shared memory until the replication manager has consumed 
 all its messages. (Otherwise we would risk losing local transactions, 
 probably also remote ones.)

I hope you're not expecting the contents of shared memory to still be
trustworthy after a backend crash.  If the manager is working strictly
from its own local memory, then it would be reasonable to operate
as above.

regards, tom lane



[HACKERS] Postgres-R: tuple serialization

2008-07-22 Thread Markus Wanner

Hi,

yesterday, I promised to outline the requirements of Postgres-R for 
tuple serialization, which we have been talking about before. There are 
basically three ways of serializing tuple changes, depending on 
whether they originate from an INSERT, UPDATE or DELETE. For updates and 
deletes, the serialization saves the old pkey as well as the origin (a global 
transaction id) of the tuple (required for consistent serialization on 
remote nodes). For inserts and updates, all added or changed attributes 
need to be serialized as well.


              pkey+origin    changes
  INSERT           -             x
  UPDATE           x             x
  DELETE           x             -

Note that the pkey attributes may never be null, so an isnull bit field 
can be skipped for those attributes. For the insert case, all attributes 
(including primary key attributes) are serialized. Updates require an 
additional bit field (well, I'm using chars ATM) to store which 
attributes have changed. Only those should be transferred.
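
As an illustration, a serialized change along these lines could be laid out 
roughly as follows; the field names are invented here and are not the patch's 
actual wire format:

  typedef enum ChangeType { CS_INSERT, CS_UPDATE, CS_DELETE } ChangeType;

  typedef struct SerializedChange
  {
      ChangeType  type;

      /* UPDATE and DELETE only: identify the tuple on remote nodes */
      uint32      origin_node_id;     /* together these form the global     */
      uint32      origin_local_xid;   /* transaction id of the tuple        */
      /* pkey attribute values follow; no isnull bits, pkey is NOT NULL     */

      /* INSERT and UPDATE only: */
      uint16      natts;
      /* char changed[natts]: 1 = attribute included in the payload         */
      /* isnull bits and values of the included attributes follow           */
  } SerializedChange;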


I'm tempted to unify that, so that inserts are serialized as the 
difference against the default values or NULL. That would make things 
easier for Postgres-R. However, how about other uses of such a fast 
tuple applicator? Does such a use case exist at all? I mean, for 
parallelizing COPY FROM STDIN, one certainly doesn't want to serialize 
all input tuples into that format before feeding multiple helper 
backends. Instead, I'd recommend letting the helper backends do the 
parsing and therefore parallelize that as well.


For other features, like parallel pg_dump or even parallel query 
execution, this tuple serialization code doesn't help much, IMO. So I'm 
thinking that optimizing it for Postgres-R's internal use is the best 
way to go.


Comments? Opinions?

Regards

Markus



Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread chris
[EMAIL PROTECTED] (Markus Wanner) writes:
 chris wrote:
 I agree with you that tables are *supposed* to have primary keys;
 that's proper design, and if tables are missing them, then something
 is definitely broken.

 Ah, I see, so you are not concerned about tables with a PRIMARY KEY
 for which one wants another REPLICATION KEY, but only about tables
 without a PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the
 first place.

"Doesn't want" is probably overstating the matter.

I'll describe a scenario to suggest where it might happen.

- A system is implemented, using the database, and, for some reason,
  no PRIMARY KEY is defined for a table.  Someone forgot; it got
  misconfigured; a mistake was probably made.

- The system then goes into production, and runs for a while.  The
  table has data added to it, and starts to grow rather large.

- At THIS point, we decide to introduce replication, only to discover
  that there isn't a PRIMARY KEY on the table.

Ideally, we'd take an outage and add the primary key.  But suppose we
can't afford to do so?

The "add indexes concurrently" feature added in 8.3 (if memory serves) *would*
allow us to create a *candidate* primary key without forcing an
outage.

In theory, we'd like to have a true primary key.  Sometimes
operational issues get in the way.

 However, that's a general limitation of replication at tuple level:
 you need to be able to uniquely identify tuples. (Unlike replication
 on storage level, which can use the storage location for that).

No disagreement; yes, we certainly do need a way to uniquely identify
tuples, otherwise we can't replicate UPDATE or DELETE.

 Sometimes, unfortunately, people make errors in design, and we wind up
 needing to accommodate situations that are less than perfect.

 The happy happenstance is that, in modern versions of PostgreSQL, a
 unique index may be added in the background so that this may be
 rectified without outage if you can live with a candidate primary
 key rather than a true PRIMARY KEY.

 I cannot see any reason for not wanting a PRIMARY KEY, but wanting
 replication, and therefore a REPLICATION KEY.

 Or are you saying we should add a hidden REPLICATION KEY for people
 who are afraid of schema changes and dislike a visible primary key? 
 Would you want to hide the underlying index as well?

The scenario I outline above hopefully answers this.  It's not a
matter that I expect people to specifically desire not to have a
primary key.  Instead, I expect cases where mistakes compound with
operational issues to make them say "Ow - I can't do that now!"
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code. http://www.eviloverlord.com/



Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Markus Wanner

Hi,

chris wrote:

I'll describe a scenario to suggest where it might happen.

- A system is implemented, using the database, and, for some reason,
  no PRIMARY KEY is defined for a table.  Someone forgot; it got
  misconfigured; a mistake was probably made.

- The system then goes into production, and runs for a while.  The
  table has data added to it, and starts to grow rather large.

- At THIS point, we decide to introduce replication, only to discover
  that there isn't a PRIMARY KEY on the table.


Yeah, that's the situation I had in mind as well.


Ideally, we'd take an outage and add the primary key.  But suppose we
can't afford to do so?


You are assuming that one doesn't need to take an outage to start 
replication in the first place. As Postgres-R comes with system catalog 
changes, that's not the case. You will at least need to restart the 
postmaster; without some sort of system catalog upgrading (which doesn't 
currently exist) you even need a full dump/restore cycle.



The "add indexes concurrently" feature added in 8.3 (if memory serves) *would*
allow us to create a *candidate* primary key without forcing an
outage.


Postgres-R is primarily being developed for *future* versions of 
Postgres; I don't see any point in back-porting something that is not 
yet complete to the current version.



However, that's a general limitation of replication at tuple level:
you need to be able to uniquely identify tuples. (Unlike replication
on storage level, which can use the storage location for that).


No disagreement; yes, we certainly do need a way to uniquely identify
tuples, otherwise we can't replicate UPDATE or DELETE.


Yup, that's the real issue here.


The scenario I outline above hopefully answers this.


I see the problem of wanting to replicate tables which didn't have a 
PRIMARY KEY before. But I still cannot see a use case for hiding indices 
or keys.



It's not a
matter that I expect people to specifically desire not to have a
primary key.  Instead, I expect cases where mistakes compound with
operational issues to make them say "Ow - I can't do that now!"


Yeah, these issues certainly need to be addressed. I think the ability 
to add a hidden column and a (visible!) primary key on that column 
should help in that case.


Thinking about index creation time doesn't make sense, as long as we 
still need a dump/restore cycle to setup replication. And even then, 
that operational issue has nothing to do with the question of hiding the 
newly generated index or not.


Regards

Markus




Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Christopher Browne
Markus Wanner [EMAIL PROTECTED] writes:
 Thinking about index creation time doesn't make sense, as long as we
 still need a dump/restore cycle to setup replication. And even then,
  
 that operational issue has nothing to do with the question of hiding
 the newly generated index or not.

Let me note that one of the design criteria for Slony-I was to
explicitly NOT have such a requirement.

Making the assumption that it *is* acceptable to disrupt operations
for the duration of a dump/restore cycle is certain to limit interest
in a replication system.

A most pointed case where that will cause heartburn of the "I refuse
to use this" sort is if that disruption needs to take place when
recovering from the failure of a node.  That sort of disruption is
certainly counterproductive to the usual goal of replication enhancing
system availability.

Maybe I am misreading you; I rather hope so.
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code. http://www.eviloverlord.com/



Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Markus Wanner

Hi,

Christopher Browne wrote:

Markus Wanner [EMAIL PROTECTED] writes:

Thinking about index creation time doesn't make sense, as long as we
still need a dump/restore cycle to setup replication. And even then,

  

that operational issue has nothing to do with the question of hiding
the newly generated index or not.


Let me note that one of the design criteria for Slony-I was to
explicitly NOT have such a requirement.


That's a pretty cool feature, but hasn't been one of the primary design 
goal of Postgres-R.



Making the assumption that it *is* acceptable to disrupt operations
for the duration of a dump/restore cycle is certain to limit interest
in a replication system.


I agree, that's certainly true.


A most pointed case where that will cause heartburn of the "I refuse
to use this" sort is if that disruption needs to take place when
recovering from the failure of a node.  That sort of disruption is
certainly counterproductive to the usual goal of replication enhancing
system availability.


Huh? What does migration between major Postgres versions have to do with 
node failures or recoveries?


System availability certainly *is* one of the primary design goals of 
Postgres-R. Thus, once installed and running, you certainly don't need 
any such procedure again. Certainly not due to node failures.


It would be nice if future upgrades (i.e. major version upgrades) of 
single nodes could be done while the rest of the cluster is running. 
That would mean having a pretty static binary change set communication 
protocol, which works between different major Postgres versions. That's 
certainly planned, but hey, we don't have a production ready version for 
*any* major version, yet.



Maybe I am misreading you; I rather hope so.


With an additional process and schema changes, Postgres-R takes quite a 
different approach than Slony. I don't think that would have been 
possible without forcing at least a Postmaster restart.


The schema changes are pretty minimal and can probably be done manually 
(well, script driven, perhaps) before restarting with a Postmaster which 
has replication compiled in. That would save the dump/restore cycle, but 
certainly not the Postmaster restart.


However, with regard to the catalog version, Postgres-R can be thought 
of as another major version of Postgres. (Maybe I should even extend the 
catalog version with an 'R' or something, so as to prevent normal 
Postgres version from running on a data directory of a Postgres-R 
installation).


Regards

Markus




Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Alvaro Herrera
Markus Wanner wrote:

 Ideally, we'd take an outage and add the primary key.  But suppose we
 can't afford to do so?

 You are assuming that one doesn't need to take an outage to start  
 replication in the first place. As Postgres-R comes with system catalog  
 changes, that's not the case. You will at least need to restart the  
 postmaster, without some sort of system catalog upgrading (which doesn't  
 currently exists) you even need a full dump/restore cycle.

Hey, for Replicator I wrote a bootstrapping system for catalog
upgrading -- it starts a special bootstrap mode and allows creating
new catalogs, their indexes, and a bunch of functions.  Afterwards
everything is considered internal.

It's quite hackish but it works ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Dimitri Fontaine
On Tuesday, July 22, 2008, Christopher Browne wrote:
 A most pointed case where that will cause heartburn of the "I refuse
 to use this" sort is if that disruption needs to take place when
 recovering from the failure of a node.  That sort of disruption is
 certainly counterproductive to the usual goal of replication enhancing
 system availability.

 Maybe I am misreading you; I rather hope so.

This part of Markus's mail makes me think the need may change if Postgres-R is 
ever integrated into -core:

On Tuesday, July 22, 2008, Markus Wanner wrote:
  As Postgres-R comes with system catalog changes, that's not the case.

So currently, to use Postgres-R you'd have to start with a patched code base on 
each and every node, because that's how Markus wanted to proceed (Postgres-R 
being a separate code base). In Postgres-R, adding a node to the cluster is 
what is done without a dump/restore cycle.

Now that he's Open-Sourcing the solution, I hope to see this mode of operation 
change, thanks to integration of some key part (catalog changes) of the 
project into -core, if possible.

Note that while Slony doesn't require a dump/restore to get activated, it 
seems to me (as a non-user of it) that it still plays with the catalog, 
preventing normal usage of pg_dump...

I'm not sure which disease I prefer: not being able to dump/restore normally, 
or having to restore on a specific product version rather than the -core 
one.

Just my 2 cents, hoping I'm understanding correctly the point at hand here,
-- 
dim




Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Markus Wanner

Hi,

Dimitri Fontaine wrote:
This part of Markus's mail makes me think the need may change if Postgres-R is 
ever integrated into -core:


Yes, in that case, you'd have replication already compiled in and 
distributed with standard Postgres. However, ATM that's pipe dreaming, 
and I'm pretty sure no developer (neither I nor the Postgres hackers) wants 
to mix code (and responsibility!) at this stage of development of 
Postgres-R.


The most I'd be willing to ask for at the moment would be to get a range 
of OIDs reserved for use in Postgres-R. It would not make sense at the 
moment to add the schema changes to standard Postgres, because I will 
pretty surely have to change these again.


So currently, to use Postgres-R you'd have to start with a patched code base on 
each and every node, because that's how Markus wanted to proceed (Postgres-R 
being a separate code base). In Postgres-R, adding a node to the cluster is 
what is done without a dump/restore cycle.


Yup.

Now that he's Open-Sourcing the solution, I hope to see this mode of operation 
change, thanks to integration of some key part (catalog changes) of the 
project into -core, if possible.


Sorry, but at the moment, I disagree, because I think this would 
complicate matters for both projects. This might (and hopefully will) 
change, sure. But we are not there, yet.


Note that while slony doesn't require a dump/restore to get activated, it 
seems to me (as a non user of it) that it still plays with catalog, 
preventing normal usage of pg_dump...


Oh, does it? Well, it obviously doesn't require a Postmaster restart, 
nor does it add a separate background process.


I'm not sure which disease I prefer: not being able to dump/restore normally 
or having to restore on a specific product version, not the -core 
one.


I think this process of moving between ordinary Postgres and Postgres-R 
schema variants for the same(!) major version can be automated. It would 
be a pretty small pg_upgrade sort of tool. I'm not that afraid of these 
schema changes. Heck, in the worst case, we could even let Postgres-R 
add them itself during startup.


Sorry if this sounds a little rude. I've just had the 'why isn't 
Postgres-R integrated?' discussion a little too often.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-22 Thread Tom Lane
Dimitri Fontaine [EMAIL PROTECTED] writes:
 Note that while slony doesn't require a dump/restore to get activated, it 
 seems to me (as a non user of it) that it still plays with catalog, 
 preventing normal usage of pg_dump...

As of 8.3 there are some new trigger features in core that were put
there for Slony.  I'm not sure to what extent that will let them get
rid of making nonstandard catalog changes ... perhaps Chris or Jan
can explain.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: tuple serialization

2008-07-22 Thread Decibel!

On Jul 22, 2008, at 3:04 AM, Markus Wanner wrote:
yesterday, I promised to outline the requirements of Postgres-R for  
tuple serialization, which we have been talking about before. There  
are basically three types of how to serialize tuple changes,  
depending on whether they originate from an INSERT, UPDATE or  
DELETE. For updates and deletes, it saves the old pkey as well as  
the origin (a global transaction id) of the tuple (required for  
consistent serialization on remote nodes). For inserts and updates,  
all added or changed attributes need to be serialized as well.


             pkey+origin   changes
  INSERT          -           x
  UPDATE          x           x
  DELETE          x           -

Note that the pkey attributes may never be null, so an isnull bit  
field can be skipped for those attributes. For the insert case, all  
attributes (including primary key attributes) are serialized.  
Updates require an additional bit field (well, I'm using chars ATM)  
to store which attributes have changed. Only those should be  
transferred.
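
To make that concrete, here is a small C sketch of what such a change
record could look like. It is purely illustrative: the struct and field
names (TupleChange, origin_xid, changed_mask and so on) are invented for
this mail and are not the actual Postgres-R serialization format.

#include <stdint.h>

typedef enum ChangeType { CS_INSERT, CS_UPDATE, CS_DELETE } ChangeType;

typedef struct TupleChange
{
    ChangeType  type;
    uint32_t    origin_xid;     /* global transaction id of the tuple's
                                 * origin; UPDATE and DELETE only */
    uint16_t    natts;          /* number of attributes in the table */

    /* old primary key values; UPDATE and DELETE only.  The pkey columns
     * can never be NULL, so no isnull bits are kept for them. */
    char      **pkey_values;

    /* bitmap of changed attributes; UPDATE only (a char per attribute,
     * as described above) */
    uint8_t    *changed_mask;

    /* serialized attribute values: all attributes for INSERT, only the
     * changed ones for UPDATE, none for DELETE; with an isnull bitmap */
    char      **new_values;
    uint8_t    *isnull_mask;
} TupleChange;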


I'm tempted to unify that, so that inserts are serialized as the  
difference against the default values or NULL. That would make  
things easier for Postgres-R. However, how about other uses of such  
a fast tuple applicator? Does such a use case exist at all? I mean,  
for parallelizing COPY FROM STDIN, one certainly doesn't want to  
serialize all input tuples into that format before feeding multiple  
helper backends. Instead, I'd recommend letting the helper backends  
do the parsing and therefore parallelize that as well.


For other features, like parallel pg_dump or even parallel query  
execution, this tuple serialization code doesn't help much, IMO. So  
I'm thinking that optimizing it for Postgres-R's internal use is  
the best way to go.


Comments? Opinions?


ISTM that both londiste and Slony would be able to make use of these  
improvements as well. A modular replication system should be able to  
use a variety of methods for logging data changes and then applying  
them on a subscriber, so long as some kind of common transport can be  
agreed upon (such as text). So having a change capture and apply  
mechanism that isn't dependent on a lot of extra stuff would be  
generally useful to any replication mechanism.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828






Re: [HACKERS] Postgres-R: tuple serialization

2008-07-22 Thread Markus Wanner

Hi,

Decibel! wrote:
ISTM that both londiste and Slony would be able to make use of these 
improvements as well. A modular replication system should be able to use 
a variety of methods for logging data changes and then applying them on 
a subscriber, so long as some kind of common transport can be agreed 
upon (such as text). So having a change capture and apply mechanism that 
isn't dependent on a lot of extra stuff would be generally useful to any 
replication mechanism.


Hm.. yeah, that's a good hint. However, I'm not sure how londiste and 
Slony would interface with these internal methods. That would require 
some sort of special replication triggers or something. But when to fire 
them? After every statement (sync)? Just before commit (eager)? After 
commit (lazy)? (These are the points in Postgres-R, where the internal 
methods are called).


I'm claiming that Postgres-R is modular (enough). But I'm unsure what 
interface it could provide to the outer world.


Regards

Markus Wanner


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: tuple serialization

2008-07-22 Thread Decibel!

On Jul 22, 2008, at 4:43 PM, Markus Wanner wrote:

Decibel! wrote:
ISTM that both londiste and Slony would be able to make use of  
these improvements as well. A modular replication system should be  
able to use a variety of methods for logging data changes and then  
applying them on a subscriber, so long as some kind of common  
transport can be agreed upon (such as text). So having a change  
capture and apply mechanism that isn't dependent on a lot of extra  
stuff would be generally useful to any replication mechanism.


Hm.. yeah, that's a good hint. However, I'm not sure how londiste  
and Slony would interface with these internal methods. That would  
require some sort of special replication triggers or something. But  
when to fire them? After every statement (sync)? Just before commit  
(eager)? After commit (lazy)? (These are the points in Postgres-R,  
where the internal methods are called).


Currently, londiste triggers are per-row, not deferred. IIRC,  
Slony is the same. ISTM it'd be much better if we had per- 
statement triggers that could see what data had changed; that'd  
likely be a lot more efficient than doing stuff per-row.


In any case, both replication systems should work with either sync or  
eager. I can't see them working with lazy.


What about just making all three available?

I'm claiming that Postgres-R is modular (enough). But I'm unsure  
what interface it could provide to the outer world.



Yeah. I suspect that Postgres-R could end up taking the place of the  
replica-hooks mailing list (and more, of course).

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828






Re: [HACKERS] Postgres-R: tuple serialization

2008-07-22 Thread Markus Wanner

Hi,

Decibel! wrote:
Currently, londiste triggers are per-row, not deferred. IIRC, Slony 
is the same. ISTM it'd be much better if we had per-statement triggers 
that could see what data had changed; that'd likely be a lot more 
efficient than doing stuff per-row.


Well, now that I think about it... there might be *lots* of changes. 
Certainly something you don't want to collect in memory. At the moment, 
Postgres-R cannot handle this, but I plan to add an upper limit on the 
change set size, and just send it out as soon as it exceeds that limit, 
then continue collecting. (Note for the GCS adept: this partial change 
set may be sent via reliable multicast, only the very last change set 
before the commit needs to be totally ordered.)


That would mean introducing another 'change set full' hook...
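
As a rough illustration of that idea (nothing here is actual Postgres-R
code; CHANGESET_LIMIT, gcs_send() and the other names are made up for
this mail), collecting and flushing could look roughly like this in C:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CHANGESET_LIMIT 64          /* tiny limit, just for illustration */

typedef struct ChangeSet
{
    char   buf[CHANGESET_LIMIT];
    size_t len;
} ChangeSet;

/* Stand-in for handing a change set to the GCS (ensemble, spread, ...). */
static void
gcs_send(const ChangeSet *cs, bool totally_ordered)
{
    printf("sending %zu bytes, %s\n", cs->len,
           totally_ordered ? "totally ordered" : "reliable multicast only");
}

/* Called for every captured tuple change of the local transaction. */
static void
collect_change(ChangeSet *cs, const char *change, size_t len)
{
    if (cs->len + len > CHANGESET_LIMIT)
    {
        /* the "change set full" hook: ship a partial set, no ordering needed */
        gcs_send(cs, false);
        cs->len = 0;
    }
    memcpy(cs->buf + cs->len, change, len);
    cs->len += len;
}

/* Just before commit: only this very last change set is totally ordered. */
static void
flush_at_commit(ChangeSet *cs)
{
    gcs_send(cs, true);
    cs->len = 0;
}

int
main(void)
{
    ChangeSet   cs = { .len = 0 };
    const char *change = "one serialized tuple change";

    for (int i = 0; i < 10; i++)
        collect_change(&cs, change, strlen(change));
    flush_at_commit(&cs);
    return 0;
}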

In any case, both replication systems should work with either sync or 
eager. I can't see them working with lazy.


Huh? AFAIK, londiste as well as Slony-I are both async. So what would 
hooks for sync replication be good for? Why not rather only lazy hooks?


(Well, lazy hooks will pose yet another problem: those theoretically 
need to run sometime *after* the commit, but at that time we don't have 
an open transaction, so where exactly shall we do this?)



What about just making all three available?


Doh. Ehm. That's a lot of work for something we are not even sure is 
good for anything. I'm certainly willing to help. And if other projects 
show enough interest, I might even add the appropriate triggers myself. 
But as long as this is all unclear, I certainly have more important 
things on my todo list for Postgres-R (see that TODO list ;-) ).


Yeah. I suspect that Postgres-R could end up taking the place of the 
replica-hooks mailing list (and more, of course).


Let's hope so, yeah!

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-21 Thread Markus Wanner

Hi,

Alvaro Herrera wrote:

Markus Wanner wrote:
(Although, I'm still less than thrilled about the internal storage  
format of these tuple collections. That can certainly be improved and  
simplified.)


Care to expand more on what it is?


Well, what I really dislike is the overhead in code to first transform 
tuples into a string based internal change set representation, which 
then gets serialized again. That looks like two conversion steps, which 
are both prone to error.


I'm about to merge those into a simpler tuple serializer, which shares 
code with the initializer (or recovery provider/subscriber) part. This 
is where I'd like to know what requirements Jan or others have. I will 
try to outline the current implementation and requirements of Postgres-R 
in a new thread soon.



On Replicator we're using the binary
send/recv routines to transmit tuples.  (Obviously this fails when the
master and slave have differing binary output, but currently we just
punt on this point).


Yeah, that's another point. I'm currently using the textual input/output 
functions, but would like to switch to the binary ones as well. However, 
that's an optimization, whereas the above is a simplification of the code, which 
is more important to me at the moment.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-19 Thread chris
[EMAIL PROTECTED] (Markus Wanner) writes:
 Hello Chris,

 chris wrote:
 Slony-I does the same, with the variation that it permits the option
 of using a candidate primary key, namely an index that is unique+NOT
 NULL.

 If it is possible to support that broader notion, that might make
 addition of these sorts of logic more widely useful.

 Well, yeah, that's technically not much different, so it would
 probably be very easy to extend Postgres-R to work on any arbitrary
 Index.

 But what do we have primary keys for, in the first place? Isn't it
 exactly the *primary* key into the table, which you want to use for
 replication? Or do we need an additional per-table configuration
 option for that? A REPLICATION KEY besides the PRIMARY KEY?

I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.

Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are less than perfect.

The happy happenstance is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a candidate primary
key rather than a true PRIMARY KEY.

It seems to me that this extension can cover over a number of design
sins, which looks like a very kind accommodation where it is surely
preferable to design it in earlier rather than later.

 I know Jan Wieck has in mind the idea of adding an interface to enable
 doing highly efficient IUD (Insert/Update/Delete) via generating a way
 to do direct heap updates, which would be *enormously* more efficient
 than the present need (in Slony-I, for instance) to parse, plan and
 execute thousands of IUD statements.  For UPDATE/DELETE to work
 requires utilizing (candidate) primary keys, so there is some
 seemingly relevant similarity there.

 Definitely. The remote backend does exactly that for Postgres-R: it
 takes a change set, which consists of one or more tuple collections,
 and then applies these collections. See ExecProcessCollection() in
 execMain.c.

 (Although, I'm still less than thrilled about the internal storage
 format of these tuple collections. That can certainly be improved and
 simplified.)

You may want to have a chat with Jan; he's got some thoughts on a more
general purpose mechanism that would be good for this as well as for
(we think) extremely efficient bulk data loading.
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code. http://www.eviloverlord.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-19 Thread Markus Wanner

Hi,

chris wrote:

I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.


Ah, I see, so you are not concerned about tables with a PRIMARY KEY for 
which one wants another REPLICATION KEY, but only about tables without a 
PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the first place.


However, that's a general limitation of replication at tuple level: you 
need to be able to uniquely identify tuples. (Unlike replication on 
storage level, which can use the storage location for that).



Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are less than perfect.

The happy happenstance is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a candidate primary
key rather than a true PRIMARY KEY.


I cannot see any reason for not wanting a PRIMARY KEY, but wanting 
replication, and therefore a REPLICATION KEY.


Or are you saying we should add a hidden REPLICATION KEY for people who 
are afraid of schema changes and dislike a visible primary key? Would 
you want to hide the underlying index as well?



It seems to me that this extension can cover over a number of design
sins, which looks like a very kind accommodation where it is surely
preferable to design it in earlier rather than later.


Sorry, but I fail to see any real advantage of that covering of sins. 
I would find it rather confusing to have keys and indices hidden from 
the admin. It's not like an additional index or a primary key would lead 
to functional changes.


That's certainly different for additional columns, where a SELECT * 
could all of a sudden return more columns than before. So that's the 
exception where I agree that hiding such an additional column like we 
already do for system columns would make sense. That's for example the 
situation where you add an 'id' column later on and make that the new 
primary (and thus replication) key. Maybe that's what you meant? 
However, even in that case, I wouldn't hide the index nor the primary 
key, but only the column.


Regards

Markus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-19 Thread Markus Wanner

Hi,

chris wrote:

You may want to have a chat with Jan; he's got some thoughts on a more
general purpose mechanism that would be good for this as well as for
(we think) extremely efficient bulk data loading.


Jan, mind to share your thoughts? What use cases for such a general 
purpose mechanism do you see?


What I can imagine doing on top of Postgres-R is: splitting up the data 
and feeding multiple backends with it. Unlike Postgres-R's internal use, 
you'd still have to check the data against constraints, I think.


It would involve the origin backend asking for help from the manager. 
That one checks for available helper backends and then serves as a 
message dispatcher between the origin and helper backends (as it does 
for replication purposes). Please note that it already uses shared 
memory extensively, so the manager doesn't need to copy around the data 
itself.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-19 Thread Alvaro Herrera
Markus Wanner wrote:

 (Although, I'm still less than thrilled about the internal storage  
 format of these tuple collections. That can certainly be improved and  
 simplified.)

Care to expand more on what it is?  On Replicator we're using the binary
send/recv routines to transmit tuples.  (Obviously this fails when the
master and slave have differing binary output, but currently we just
punt on this point).

-- 
Alvaro Herrera      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread chris
[EMAIL PROTECTED] (Markus Wanner) writes:
 as you might know, Postgres-R relies on primary keys to address tuples
 of a table. It cannot replicate tables without a primary key.

Slony-I does the same, with the variation that it permits the option
of using a candidate primary key, namely an index that is unique+NOT
NULL.

If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.

 Primary keys currently aren't really used within the executor, so I
 had to extend and modify Postgres here and there, to get the
 required information. To ease reviewing I have split out these
 modifications and present them here as two separate little patches.

I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements.  For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules  of the  Evil Overlord  #145. My  dungeon cell  decor  will not
feature exposed pipes.  While they add to the  gloomy atmosphere, they
are good  conductors of vibrations and  a lot of  prisoners know Morse
code. http://www.eviloverlord.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hello Chris,

chris wrote:

Slony-I does the same, with the variation that it permits the option
of using a candidate primary key, namely an index that is unique+NOT
NULL.

If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.


Well, yeah, that's technically not much different, so it would probably 
be very easy to extend Postgres-R to work on any arbitrary Index.


But what do we have primary keys for, in the first place? Isn't it 
exactly the *primary* key into the table, which you want to use for 
replication? Or do we need an additional per-table configuration option 
for that? A REPLICATION KEY besides the PRIMARY KEY?



I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements.  For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.


Definitely. The remote backend does exactly that for Postgres-R: it 
takes a change set, which consists of one or more tuple collections, and 
then applies these collections. See ExecProcessCollection() in execMain.c.


(Although, I'm still less than thrilled about the internal storage 
format of these tuple collections. That can certainly be improved and 
simplified.)


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 Hello Chris,

 chris wrote:
 Slony-I does the same, with the variation that it permits the
 option of using a candidate primary key, namely an index that is
 unique+NOT NULL.

 If it is possible to support that broader notion, that might make
 addition of these sorts of logic more widely useful.

 Well, yeah, that's technically not much different, so it would
 probably  be very easy to extend Postgres-R to work on any arbitrary
 Index.

 But what do we have primary keys for, in the first place? Isn't it
 exactly the *primary* key into the table, which you want to use for
 replication? Or do we need an additional per-table configuration
 option  for that? A REPLICATION KEY besides the PRIMARY KEY?

We have them because people are used to thinking in terms of a
PRIMARY KEY, not because that concept is actually distinguishable
from a non-partial UNIQUE NOT NULL constraint.

While I'm a chicken rather than a pig on this project
http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig, I believe that
covering the more general case right from the start would be a much
better plan.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 But what do we have primary keys for, in the first place?

 We have them because people are used to thinking in terms of a
 PRIMARY KEY, not because that concept is actually distinguishable
 from a non-partial UNIQUE NOT NULL constraint.

No, we have them because the SQL standard actually assigns a distinct
meaning to a primary key.  (It's the default foreign key reference
column(s) for the table --- and in that context it's clear that
There Can Be Only One.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Gregory Stark
David Fetter [EMAIL PROTECTED] writes:

 On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
 Hello Chris,

 chris wrote:
 Slony-I does the same, with the variation that it permits the
 option of using a candidate primary key, namely an index that is
 unique+NOT NULL.

 If it is possible to support that broader notion, that might make
 addition of these sorts of logic more widely useful.

 Well, yeah, that's technically not much different, so it would
 probably  be very easy to extend Postgres-R to work on any arbitrary
 Index.

 But what do we have primary keys for, in the first place? Isn't it
 exactly the *primary* key into the table, which you want to use for
 replication? Or do we need an additional per-table configuration
 option  for that? A REPLICATION KEY besides the PRIMARY KEY?

Hm, it occurs to me that really Slony should be saying 
 WHERE (col1,col2,...) = ('x','y','z',...)

and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).

It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

David Fetter wrote:

While I'm a chicken rather than a pig on this project
http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig, I believe that
covering the more general case right from the start would be a much
better plan.


I was trying to say that Postgres-R internally relies only on a unique 
index with not null constraint. It doesn't care if you name it PRIMARY 
KEY or REPLICATION KEY or whatever.


So, it's just a question of the syntax. We already have PRIMARY KEYs, 
and those are pretty much what I think is needed in 99% of all cases as 
the pointer to the replication


While I'm normally an absolute fan of generality,

I think you didn't quite get the point.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

sorry, some strange key-combination made my mail client send too early...

I myself wrote:
I was trying to say that Postgres-R internally relies only on a unique 
index with not null constraint. It doesn't care if you name it PRIMARY 
KEY or REPLICATION KEY or whatever.


So, it's just a question of the syntax. We already have PRIMARY KEYs, 
and those are pretty much what I think is needed in 99% of all cases as 
the pointer to the replication


.. as the pointer to the index to use for replication.

Offering the user a possibility to choose another (index + not null) 
would require something like ALTER TABLE ... ADD REPLICATION KEY ... or 
some such. Mostly syntactic sugar, which can be added as soon as we 
really need it.



While I'm normally an absolute fan of generality,

I think you didn't quite get the point.


Iiik.. that's what I didn't want to send and wanted to delete before 
sending... :-)  Sorry.


Regards

Markus



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

Tom Lane wrote:

It's the default foreign key reference
column(s) for the table


That's why I think it makes for a pretty good replication key as well.

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

I realize that you are talking about Slony, but let me answer for the 
Postgres-R case, anyway.


Gregory Stark wrote:
Hm, it occurs to me that really Slony should be saying 
 WHERE (col1,col2,...) = ('x','y','z',...)


Hm.. that would mean increasing the amount of work for the remote 
backend, which applies remote transactions. For scalability reasons, I'm 
trying to keep that minimal.



and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).


For Postgres-R, I think that would only be a gain in those cases, where 
all tuples of a collection (or even the entire change set) only affect 
tuples from a partial index. That doesn't look like it's worth the 
trouble, IMO. Or do you think that's a frequent case?


Thinking about it, I'd even say that requiring only one index frequently 
is favorable because of caching effects. Dunno.



It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.


I don't quite get your point here. For UPDATEs which change the PRIMARY 
KEY, the sender currently sends the *old* values plus the changes. In 
that case, you certainly don't want to send the entire old tuple, but 
only the fields for *one* KEY. That's what I'm calling the replication 
key. (And currently equals the PRIMARY KEY).


Maybe I'm thinking too much in terms of Postgres-R, instead of Slony, 
which is what you are talking about.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Alvaro Herrera
Markus Wanner wrote:

 Gregory Stark wrote:

 It would be nice if there was a way for Slony to express to the server that
 really, it only needs any UNIQUE NOT NULL combination of columns to match.
 Once the server has any such combination which matches it can skip checking
 the rest. I can't think of any way to write such a query in SQL.

 I don't quite get your point here. For UPDATEs which change the PRIMARY  
 KEY, the sender currently sends the *old* values plus the changes. In  
 that case, you certainly don't want to send the entire old tuple, but  
 only the fields for *one* KEY. That's what I'm calling the replication  
 key. (And currently equals the PRIMARY KEY).

I think the point here is that you need to distinguish which tuple you
need to update.  For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null).  It would
certainly be possible to use a different candidate key, but as far as I
know no customer has ever requested this.

(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the update mask in terms of
heap_modify_tuple.)
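
For readers following along, here is a toy C sketch of that message shape:
the original PK column, the values of the changed columns, and a per-column
replace mask (the same kind of arrays heap_modify_tuple works with on the
backend side). The types and names below are invented for the example and
are neither Replicator nor Postgres-R code.

#include <stdio.h>
#include <string.h>

#define NCOLS 3                     /* id, name, email */

typedef struct Row
{
    int  id;                        /* primary key */
    char name[32];
    char email[32];
} Row;

typedef struct UpdateMsg
{
    int         pk;                 /* original PK column value */
    int         replace[NCOLS];     /* update mask: 1 = column changed */
    const char *newvals[NCOLS];     /* values only for changed columns */
} UpdateMsg;

/* Apply an update message to the matching row, touching only the columns
 * flagged in the mask -- unchanged columns are never transmitted. */
static void
apply_update(Row *rows, int nrows, const UpdateMsg *msg)
{
    for (int i = 0; i < nrows; i++)
    {
        if (rows[i].id != msg->pk)
            continue;               /* locate the tuple via its PK */
        if (msg->replace[1])
            strncpy(rows[i].name, msg->newvals[1], sizeof(rows[i].name) - 1);
        if (msg->replace[2])
            strncpy(rows[i].email, msg->newvals[2], sizeof(rows[i].email) - 1);
        return;
    }
}

int
main(void)
{
    Row       table[] = { {1, "alice", "a@old"}, {2, "bob", "b@old"} };
    UpdateMsg msg = { .pk = 2, .replace = {0, 0, 1},
                      .newvals = {NULL, NULL, "b@new"} };

    apply_update(table, 2, &msg);
    printf("%d %s %s\n", table[1].id, table[1].name, table[1].email);
    return 0;
}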

-- 
Alvaro Herrera      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R: primary key patches

2008-07-18 Thread Markus Wanner

Hi,

Alvaro Herrera wrote:

I think the point here is that you need to distinguish which tuple you
need to update.  For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null).  It would
certainly be possible to use a different candidate key,


Yeah, and for this to work, the *sender* needs to decide on a key to use.


but as far as I
know no customer has ever requested this.


I can't see the use case for a separate REPLICATION KEY, different from 
the PRIMARY KEY, either..



(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the update mask in terms of
heap_modify_tuple.)


Yup, that's pretty much the same what I'm doing for Postgres-R.

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-16 Thread Markus Wanner

Hi,

David Fetter wrote:

Would you mind if I were to make a git branch for it on
http://git.postgresql.org/ ?


I've set up a git-daemon with the Postgres-R patch here:

git://postgres-r.org/repo

Since it's a distributed VCS, you should be able to mirror that to 
git.postgresql.org somehow (if you figure out how, please tell me!).


Please note that I'm still struggling with git and I cannot promise to 
keep using it.


Regards

Markus


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres-R: primary key patches

2008-07-16 Thread Markus Wanner

Hi,

as you might know, Postgres-R relies on primary keys to address tuples 
of a table. It cannot replicate tables without a primary key.


Primary keys currently aren't really used within the executor, so I had 
 to extend and modify Postgres here and there, to get the required  
information. To ease reviewing I have split out these modifications and 
present them here as two separate little patches.


The first one, get_pkey_index_oid.diff, changes the function 
relationHasPrimaryKey into GetPrimaryKeyIndexOid, which now returns an 
index oid instead of just a boolean. It works pretty much the same, 
except for returning an oid instead of just a boolean. (In the current 
Postgres-R code, I've duplicated that code to 
src/backend/replication/recovery.c)


And secondly, the add_pkey_info.diff patch adds a boolean field 
ii_Primary to the IndexInfo struct and ri_PrimaryKey to the 
ResultRelInfo struct, which is an index into the indexInfoArray.


I think these are relatively trivial modifications which could be 
helpful for other purposes as well. So I suggest to apply them to 
mainline whenever appropriate (read: choose the appropriate commit fest).


This also raises the more general question of how to start collaborating 
on Postgres-R. I realize that it's a pretty huge project. However, I'm 
unsure on how to ease reviewing for others, so if you have any ideas or 
questions, please don't hesitate to ask.


Regards

Markus


*** src/backend/commands/indexcmds.c	61a8b3774b682554e8670624583ab4cf4b9dbdb9
--- src/backend/commands/indexcmds.c	dc6fc2a3fbce90748bcf4cd7a60ea2ea887bc97f
*** static Oid GetIndexOpClass(List *opclass
*** 64,70 
    bool isconstraint);
  static Oid GetIndexOpClass(List *opclass, Oid attrType,
  char *accessMethodName, Oid accessMethodId);
- static bool relationHasPrimaryKey(Relation rel);
  
  
  /*
--- 64,69 
*** DefineIndex(RangeVar *heapRelation,
*** 324,330 
  		 * it's no problem either.
  		 */
  		if (is_alter_table &&
! 			relationHasPrimaryKey(rel))
  		{
  			ereport(ERROR,
  	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
--- 323,329 
  		 * it's no problem either.
  		 */
  		if (is_alter_table &&
! 			(GetPrimaryKeyIndexOid(rel) != InvalidOid))
  		{
  			ereport(ERROR,
  	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
*** ChooseRelationName(const char *name1, co
*** 1216,1229 
  }
  
  /*
!  * relationHasPrimaryKey -
   *
!  *	See whether an existing relation has a primary key.
   */
! static bool
! relationHasPrimaryKey(Relation rel)
  {
! 	bool		result = false;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
  
--- 1215,1229 
  }
  
  /*
!  * GetPrimaryKeyIndexOid
   *
!  * Returns the oid of the primary key index of the relation, if any,
!  * otherwise InvalidOid is returned.
   */
! Oid
! GetPrimaryKeyIndexOid(Relation rel)
  {
! 	Oid			result = InvalidOid;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
  
*** relationHasPrimaryKey(Relation rel)
*** 1244,1257 
  	0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
! 		result = ((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary;
  		ReleaseSysCache(indexTuple);
! 		if (result)
  			break;
  	}
  
  	list_free(indexoidlist);
- 
  	return result;
  }
  
--- 1244,1260 
  	0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
  			elog(ERROR, "cache lookup failed for index %u", indexoid);
! 
! 		if (((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary)
! 			result = indexoid;
! 
  		ReleaseSysCache(indexTuple);
! 
! 		if (result != InvalidOid)
  			break;
  	}
  
  	list_free(indexoidlist);
  	return result;
  }
  

*** src/include/commands/defrem.h	e2384af33d917bff68234bbe407ea16e3ec43123
--- src/include/commands/defrem.h	58bb763402c9bef8ead035a3524505ad8fe58de5
***
*** 15,22 
  #define DEFREM_H
  
  #include <nodes/parsenodes.h>
  
- 
  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  			char *indexRelationName,
--- 15,22 
  #define DEFREM_H
  
  #include <nodes/parsenodes.h>
+ #include <utils/relcache.h>
  
  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  			char *indexRelationName,
*** extern Oid	GetDefaultOpClass(Oid type_id
*** 43,48 
--- 43,49 
  extern char *ChooseRelationName(const char *name1, const char *name2,
     const char *label, Oid namespace);
  extern Oid	GetDefaultOpClass(Oid type_id, Oid am_id);
+ extern Oid	GetPrimaryKeyIndexOid(Relation rel);
  
  /* commands/functioncmds.c */
  extern void CreateFunction(CreateFunctionStmt *stmt);

*** src/backend/catalog/index.c	c360fcfd1002ffa557c1a376d3e74c9c2a0924db

Re: [HACKERS] Postgres-R: current state of development

2008-07-16 Thread Dimitri Fontaine


Hi,

First, thanks a lot for opening Postgres-R, I hope -core will find in  
your code as many good ideas and code as possible :)


On July 15, 2008, at 18:48, Markus Wanner wrote:
A pretty general framework for helper processes is provided. I think  
this framework could be used for parallel querying or data loading  
as well. The helper processes are ordinary backends which process a  
single transaction at a time. But they don't have a client  
connection, instead they communicate with a manager via a messaging  
module based on shared memory and signals. Within Postgres-R, those  
helper backends are mostly called 'remote backends', which is a  
somewhat misleading name. It's just a short name for a helper  
backend which processes a remote transaction.


Could this framework help the current TODO item to have a concurrent  
pg_restore?


The ideas I remember on this topic were to add the capability for  
pg_restore to create all indexes of any given table in parallel, so as to  
benefit from the concurrent seqscan improvements of 8.3.


There was also the idea to have pg_restore handle the ALTER TABLE  
statements in parallel to the other data copying taking place, this  
part maybe requiring more dependency information than currently  
available.


And there was some parallel pg_dump idea floating around too, in order  
to give PostgreSQL the capability to saturate high-end hardware at  
pg_dump time, as far as I understood this part of the mails.


Of course, reading that an Open Source framework for parallel queries  
in PostgreSQL is available, can we skip asking if having the executor  
benefit from it for general purpose queries would be doable?


Regards,
--
dim



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-16 Thread David Fetter
On Wed, Jul 16, 2008 at 09:35:28PM +0200, Markus Schiltknecht wrote:
 Hi,

 David Fetter wrote:
 Would you mind if I were to make a git branch for it on
 http://git.postgresql.org/ ?

 I've set up a git-daemon with the Postgres-R patch here:

 git://postgres-r.org/repo

 Since it's a distributed VCS, you should be able to mirror that to  
 git.postgresql.org somehow (if you figure out how, please tell me!).

I've merged the latest Postgres in.  Care to see whether it runs?
http://git.postgresql.org/?p=~davidfetter/pgr/.git;a=summary

 Please note that I'm still struggling with git and I cannot promise
 to  keep using it.

I'm struggling, too, but the cheapness of experimenting is making it
easier and easier :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres-R: current state of development

2008-07-15 Thread Markus Wanner

Hi,

After having published the source code, I'd like to add some words about 
the current state of the project.


Postgres-R is currently capable of replicating tuples via binary change 
sets, does proper conflict detection and prevention. It offers three 
different timing methods: sync, eager and lazy. Out of those, the eager 
method is the most advanced one, which I've been focusing on. However, 
for fully synchronous replication, we mainly need to be able to handle 
multiple changesets per transaction. This will be necessary anyway to 
support long running transactions. Because it simply doesn't make sense 
to keep a huge changeset back and send it just before the commit.


A pretty general framework for helper processes is provided. I think 
this framework could be used for parallel querying or data loading as 
well. The helper processes are ordinary backends which process a single 
transaction at a time. But they don't have a client connection, instead 
they communicate with a manager via a messaging module based on shared 
memory and signals. Within Postgres-R, those helper backends are mostly 
called 'remote backends', which is a somewhat misleading name. It's just 
a short name for a helper backend which processes a remote transaction.
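
To sketch that pattern (this is not the actual imsg API; the names and the
in-process queue below are made up for illustration, whereas the real
module uses shared memory and signals), the manager/helper interaction
looks conceptually like this:

#include <stdio.h>
#include <string.h>

typedef enum { MSG_APPLY_CHANGESET, MSG_READY } MsgType;

typedef struct Message
{
    MsgType type;
    int     sender;             /* backend id of the sender */
    char    payload[64];        /* serialized change set, etc. */
} Message;

#define QUEUE_LEN 8

static Message queue[QUEUE_LEN];
static int     queue_count = 0;

/* Manager side: dispatch a remote transaction to an idle helper backend.
 * In the real system this writes into shared memory and signals the
 * helper; here we only enqueue. */
static void
dispatch_to_helper(const char *changeset)
{
    Message msg = { .type = MSG_APPLY_CHANGESET, .sender = 0 };

    strncpy(msg.payload, changeset, sizeof(msg.payload) - 1);
    queue[queue_count++] = msg;
}

/* Helper side: consume one message, apply it as a single transaction,
 * then report back with a READY message so the manager can reuse it. */
static void
helper_main(int backend_id)
{
    while (queue_count > 0)
    {
        Message msg = queue[--queue_count];

        printf("helper %d applying: %s\n", backend_id, msg.payload);
        /* ... begin transaction, apply the change set, commit ... */
        printf("helper %d -> manager: READY\n", backend_id);
    }
}

int
main(void)
{
    dispatch_to_helper("changeset #1 from node A");
    dispatch_to_helper("changeset #2 from node B");
    helper_main(1);
    return 0;
}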


I've written interfaces to ensemble, spread and an emulated GCS for 
testing purposes. The spread interface is still lacking functionality, 
the other two should work fine. None of the interfaces is dependent on 
external libraries, because I have added asynchronous clients, which 
none of the given libraries for ensemble or spread offered, but is 
required for the replication manager.


Sequence increments are replicated just fine and sequences feature an 
additional per-node cache. The setval() functionality is still missing, 
though.


Recovery and initialization must still be done manually, although I've 
already done much of the work to synchronize table data. A daunting task 
will be the synchronization of the system catalogs. Postgres-R can 
currently not replicate any DDL command.


Compared with the WAL log shipping method mentioned in the core team 
statement about built-in replication, this is certainly the longer way 
to go. But on the other hand it isn't limited to single-master 
replication and certainly offers more options for future extensions.


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-15 Thread Markus Wanner

Hi,

Alvaro Herrera wrote:

I think the
way to go here is to have Markus open up his Monotone repo, or convince
him to migrate it to Git, but I really doubt that's ever going to
happen.


He he... good guess ;-)

However, as much as I personally like monotone and as much as I dislike
git for being a bad rip off of monotone, git certainly has its merits.
It has a much larger users base and is faster than monotone for some 
operations.


Having used subversion, CVS and monotone for Postgres-R, I think I'm now
giving git a try. I'm about to setup the necessary repositories, but I'm 
still having a hard time grokking this tool set. (And yes, I'm missing 
some features compared to monotone. In our case, the most obvious one is 
'mtn diff --context' ;-) )


Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Postgres-R source code release

2008-07-14 Thread Markus Wanner

Dear Hackers,

it has been two years, since I've presented my work on Postgres-R for 
the first time to a broader audience in Toronto. I continued maintaining 
that code in my spare time, but to be honest, I didn't have much time 
for it.


As much as I'd like to change that, I now think the best thing for the 
project itself is to open it up and release the source code. I'd very 
much like to get others aboard and turn Postgres-R into a real community 
project again. Of course, the recent core statement about wanting an 
integrated replication solution also influenced my decision, even though 
I realize that Postgres-R is a way more ambitious path than a log shipping 
solution.


As a first step towards a community project, I've cleaned up the code 
and tried to comment and document it as well as I can. You are welcome 
to download the latest patch from http://www.postgres-r.org/downloads/. 
It expects to be applied against today's CVS HEAD. You'll find a README 
and a pretty exhaustive TODO file under src/backend/replication/.


I plan to write a series of more technical mails following this one, to 
start the discussion about the internals of Postgres-R and decisions I 
made. That should give some starting points for those interested it the 
project.


In case you are wondering: I've got married and I'm glad to tell you 
that I changed my very German sounding name Schiltknecht to a more 
pronounceable one: my new surname is Wanner.


Looking forward to inspiring discussions with fellow hackers.

Regards

Markus Wanner

 configure.in  |9 
 src/Makefile.global.in|1 
 src/backend/Makefile  |4 
 src/backend/access/transam/varsup.c   |   23 
 src/backend/access/transam/xact.c |  115 +
 src/backend/bootstrap/bootstrap.c |3 
 src/backend/catalog/Makefile  |4 
 src/backend/catalog/catalog.c |8 
 src/backend/catalog/index.c   |   10 
 src/backend/commands/Makefile |4 
 src/backend/commands/aggregatecmds.c  |   25 
 src/backend/commands/comment.c|   25 
 src/backend/commands/conversioncmds.c |   25 
 src/backend/commands/dbcommands.c |  220 ++-
 src/backend/commands/functioncmds.c   |   64 
 src/backend/commands/indexcmds.c  |9 
 src/backend/commands/lockcmds.c   |   10 
 src/backend/commands/opclasscmds.c|   75 +
 src/backend/commands/operatorcmds.c   |   24 
 src/backend/commands/prepare.c|   20 
 src/backend/commands/proclang.c   |   32 
 src/backend/commands/replicacmds.c|  188 ++
 src/backend/commands/schemacmds.c |   29 
 src/backend/commands/sequence.c   |  193 ++
 src/backend/commands/tablecmds.c  |   44 
 src/backend/commands/trigger.c|   30 
 src/backend/commands/tsearchcmds.c|  136 +
 src/backend/commands/typecmds.c   |   69 
 src/backend/commands/variable.c   |   68 
 src/backend/executor/execMain.c   |  651 +
 src/backend/executor/execUtils.c  |9 
 src/backend/nodes/copyfuncs.c |   14 
 src/backend/nodes/equalfuncs.c|   12 
 src/backend/parser/gram.y |   92 +
 src/backend/parser/keywords.c |6 
 src/backend/postmaster/postmaster.c   |  349 
 src/backend/replication/HISTORY   |   33 
 src/backend/replication/Makefile  |   17 
 src/backend/replication/README|   93 +
 src/backend/replication/TODO  |  161 ++
 src/backend/replication/cset.c|  280 +++
 src/backend/replication/gc_egcs.c |  580 
 src/backend/replication/gc_ensemble.c |  823 +++
 src/backend/replication/gc_spread.c   |  432 ++
 src/backend/replication/gc_utils.c|   93 +
 src/backend/replication/local.c   |  878 
 src/backend/replication/manager.c | 1859 ++
 src/backend/replication/recovery.c| 1375 +++
 src/backend/replication/remote.c  |  463 ++
 src/backend/replication/utils.c   |  558 +++
 src/backend/storage/ipc/Makefile  |4 
 src/backend/storage/ipc/buffer.c  |  308 
 src/backend/storage/ipc/imsg.c|  369 +
 src/backend/storage/ipc/ipci.c|   12 
 src/backend/storage/ipc/procarray.c   |  106 +
 src/backend/storage/lmgr/lock.c   |2 
 src/backend/storage/lmgr/proc.c   |   70 
 src/backend/tcop/postgres.c   |  421 +
 src/backend/tcop/utility.c|6 
 src/backend/utils/cache/syscache.c|1 
 

Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Devrim GÜNDÜZ
On Mon, 2008-07-14 at 17:42 +0200, Markus Wanner wrote:
 As much as I'd like to change that, I now think the best thing for the
 project itself is to open it up and release the source code. I'd very 
 much like to get others aboard and turn Postgres-R into a real
 community  project again. Of course, the recent core statement about
 wanting an  integrated replication solution also influenced my
 decision, even though I realize that Postgres-R is way more ambitious
 path than a log shipping solution.
 
 As a first step towards a community project, I've cleaned up the code 
 and tried to comment and document it as good as I can. You are welcome
 to download the latest patch from
 http://www.postgres-r.org/downloads/.

That's good news, really.

 In case you are wondering: I've got married and I'm glad to tell you 
 that I changed my very German sounding name Schiltknecht to a more 
 pronounceable one: my new surname is Wanner.

Congrats :)
-- 
Devrim GÜNDÜZ
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
   http://www.gunduz.org





Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Jonah H. Harris
On Mon, Jul 14, 2008 at 11:42 AM, Markus Wanner
[EMAIL PROTECTED] wrote:
 As a first step towards a community project, I've cleaned up the code and
 tried to comment and document it as well as I can. You are welcome to
 download the latest patch from http://www.postgres-r.org/downloads/. It
 expects to be applied against today's CVS HEAD. You'll find a README and a
 pretty exhaustive TODO file under src/backend/replication/.

Awesome!

 In case you are wondering: I've got married

Congratulations!

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread David Fetter
On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:
 Dear Hackers,

 it has been two years, since I've presented my work on Postgres-R for  
 the first time to a broader audience in Toronto. I continued maintaining  
 that code in my spare time, but to be honest, I didn't have much time  
 for it.

 As much as I'd like to change that, I now think the best thing for the  
 project itself is to open it up and release the source code. I'd very  
 much like to get others aboard and turn Postgres-R into a real community  
 project again. Of course, the recent core statement about wanting an  
 integrated replication solution also influenced my decision, even though  
 I realize that Postgres-R is a way more ambitious path than a log shipping  
 solution.

 As a first step towards a community project, I've cleaned up the code  
 and tried to comment and document it as well as I can. You are welcome  
 to download the latest patch from http://www.postgres-r.org/downloads/.  
 It expects to be applied against today's CVS HEAD. You'll find a README  
 and a pretty exhaustive TODO file under src/backend/replication/.

Would you mind if I were to make a git branch for it on
http://git.postgresql.org/ ?

 In case you are wondering: I've got married

Congratulations!

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Alvaro Herrera
David Fetter wrote:
 On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:

 Would you mind if I were to make a git branch for it on
 http://git.postgresql.org/ ?

That's very likely wasted effort, since obviously Markus has got a
Monotone tree somewhere ...

-- 
Alvaro Herrera      http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread David Fetter
On Mon, Jul 14, 2008 at 05:35:28PM -0400, Alvaro Herrera wrote:
 David Fetter wrote:
  On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:
 
  Would you mind if I were to make a git branch for it on
  http://git.postgresql.org/ ?
 
 That's very likely wasted effort, since obviously Markus has got a
 Monotone tree somewhere ...

I'm curious as to your motive behind continual deriding any effort to
get git going on branches.  Do you have some other alternative in mind?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Alvaro Herrera
David Fetter wrote:
 On Mon, Jul 14, 2008 at 05:35:28PM -0400, Alvaro Herrera wrote:
  David Fetter wrote:
   On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:
  
   Would you mind if I were to make a git branch for it on
   http://git.postgresql.org/ ?
  
  That's very likely wasted effort, since obviously Markus has got a
  Monotone tree somewhere ...
 
 I'm curious as to your motive behind continual deriding any effort to
 get git going on branches.  Do you have some other alternative in mind?

That's not my intention.  It's just that the two cases that have popped
up are not useful.  The with-recursive one because upstream does not use
it.  If that can be solved, then I have no objection and I think it
would be a good thing.  This one because it would be messy.  I think the
way to go here is to have Markus open up his Monotone repo, or convince
him to migrate it to Git, but I really doubt that's ever going to
happen.

-- 
Alvaro Herrera      http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Joshua D. Drake
On Mon, 2008-07-14 at 14:41 -0700, David Fetter wrote:
 On Mon, Jul 14, 2008 at 05:35:28PM -0400, Alvaro Herrera wrote:
  David Fetter wrote:
   On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:
  
   Would you mind if I were to make a git branch for it on
   http://git.postgresql.org/ ?
  
  That's very likely wasted effort, since obviously Markus has got a
  Monotone tree somewhere ...
 
 I'm curious as to your motive for continually deriding any effort to
 get git going on branches.  Do you have some other alternative in mind?

Obviously Monotone, but I think the point is that Git gives them nothing
Monotone doesn't. Of course, I am personally happy with SVN but hey :P

Joshua D. Drake

-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread David Fetter
On Mon, Jul 14, 2008 at 04:22:47PM -0700, Joshua D. Drake wrote:
 On Mon, 2008-07-14 at 14:41 -0700, David Fetter wrote:
  On Mon, Jul 14, 2008 at 05:35:28PM -0400, Alvaro Herrera wrote:
   David Fetter wrote:
On Mon, Jul 14, 2008 at 05:42:21PM +0200, Markus Wanner wrote:
   
Would you mind if I were to make a git branch for it on
http://git.postgresql.org/ ?
   
   That's very likely wasted effort, since obviously Markus has got
   a Monotone tree somewhere ...
  
  I'm curious as to your motive for continually deriding any effort
  to get git going on branches.  Do you have some other alternative
  in mind?
 
 Obviously Monotone, but I think the point is that Git gives them nothing
 Monotone doesn't.

Apart from being able to interoperate with CVS, you mean? ;)

 Of course, I am personally happy with SVN but hey :P

You can't have tried a merge in SVN if that's so :P :P
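
As a rough illustration of the CVS-interoperability point above, here is one
way a git mirror could have carried Postgres-R as a branch on top of the CVS
tree (the server path and patch file name below are illustrative only, not
the project's actual addresses):

  # Mirror the CVS module into git (requires cvsps); incremental on re-runs.
  git cvsimport -v -k -d :pserver:anoncvs@cvs.example.org:/cvsroot -C postgresql pgsql
  cd postgresql
  git checkout -b postgres-r master            # branch to carry the replication patch
  git apply --index ../postgres-r-latest.diff  # hypothetical patch file name
  git commit -m "Apply Postgres-R patch on top of imported CVS HEAD"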

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread David E. Wheeler

On Jul 14, 2008, at 16:54, David Fetter wrote:

  Of course, I am personally happy with SVN but hey :P

 You can't have tried a merge in SVN if that's so :P :P

Those of us who have been doing it for years, in CVS and in SVN,
aren't too worried about it.


Best,

David



Re: [HACKERS] Postgres-R source code release

2008-07-14 Thread Andrej Ricnik-Bay
On 15/07/2008, David E. Wheeler [EMAIL PROTECTED] wrote:
   Of course, I am personally happy with SVN but hey :P
 
  You can't have tried a merge in SVN if that's so :P :P

  Those of us who have been doing it for years, in CVS and in SVN,
  aren't too worried about it.


Follow the sandal! :D



-- 
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm
