On Mon, Jun 18, 2012 at 11:50 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> Hi Simon,
>
> On Monday, June 18, 2012 05:35:40 PM Simon Riggs wrote:
>> On 13 June 2012 19:28, Andres Freund <and...@2ndquadrant.com> wrote:
>> > This adds a new configuration parameter multimaster_node_id which
>> > determines the id used for wal originating in one cluster.
>>
>> Looks good and it seems this aspect at least is commitable in this CF.
> I think we need to agree on the parameter name. It currently is
> 'multimaster_node_id'. In the discussion with Steve we got to
> "replication_node_id". I don't particularly like either.
>
> Other suggestions?

I wonder if it should be origin_node_id?  That is the term Slony uses.

>> Design decisions I think we need to review are
>>
>> * Naming of field. I think origin is the right term, borrowing from Slony.
> I think it fits as well.
>
>> * Can we add the origin_id dynamically to each WAL record? Probably no
>> need, but let's consider why and document that.
> Not sure what you mean? It's already set in XLogInsert to
> current_replication_origin_id, which defaults to the value of the GUC?
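
Right.  Just to be sure I'm reading that flow correctly: the GUC supplies
the default, XLogInsert stamps whatever the current origin id is onto each
record, and presumably replaying changes from another node sets the current
origin to that node's id for the duration.  A toy, standalone sketch of that
shape (every name and the struct here are made up for illustration; this is
not the patch's code):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint16_t NodeId;              /* made-up stand-in for the origin id type */

    /* stand-in for the GUC (multimaster_node_id / replication_node_id / ...) */
    static NodeId guc_node_id = 3;

    /* stand-in for current_replication_origin_id; defaults to the GUC value,
       but applying foreign changes would temporarily set it to the remote id */
    static NodeId current_origin_id;

    typedef struct
    {
        uint32_t len;
        NodeId   origin_id;               /* made-up field name */
    } FakeRecordHeader;

    static void
    fake_xlog_insert(FakeRecordHeader *rec, uint32_t len)
    {
        rec->len = len;
        rec->origin_id = current_origin_id;   /* stamped on every record */
    }

    int
    main(void)
    {
        FakeRecordHeader rec;

        current_origin_id = guc_node_id;      /* "defaults to the value of the GUC" */
        fake_xlog_insert(&rec, 42);
        printf("local record stamped with origin %u\n", (unsigned) rec.origin_id);

        current_origin_id = 7;                /* e.g. while applying changes from node 7 */
        fake_xlog_insert(&rec, 42);
        printf("replayed record stamped with origin %u\n", (unsigned) rec.origin_id);
        return 0;
    }
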
>
>> * Size of field. 16 bits is enough for 32,000 master nodes, which is
>> quite a lot. Do we need that many? I think we may have need for a few
>> flag bits, so I'd like to reserve at least 4 bits for flag bits, maybe
>> 8 bits. Even if we don't need them in this release, I'd like to have
>> them. If they remain unused after a few releases, we may choose to
>> redeploy some of them as additional nodeids in future. I don't foresee
>> complaints that 256 master nodes is too few anytime soon, so we can
>> defer that decision.
> I have wished we had some flag bits available before as well. I find 256
> nodes a pretty low value to start with, though; 4096 sounds better, so I
> would be happy with 4 flag bits. I think for cascading setups and such you
> want to assign node ids to every node, not only to masters...

Even though the number of nodes that can reasonably participate in
replication is unlikely to be terribly large, it would be good to allow
larger values, in case someone is keen on encoding something descriptive
in the node number.
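
For what it's worth, the 4-flag-bit split works out to a 12-bit origin id,
i.e. 4096 nodes; in bit-twiddling terms it is just something like this (the
layout and names are purely illustrative, not anything from the patch):

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative only: pack a 12-bit origin id and 4 flag bits into
       the 16 bits under discussion. */
    #define ORIGIN_BITS   12
    #define ORIGIN_MASK   ((1u << ORIGIN_BITS) - 1)   /* 0x0FFF -> 4096 ids */
    #define FLAGS_SHIFT   ORIGIN_BITS
    #define FLAGS_MASK    0xFu                        /* 4 flag bits        */

    static uint16_t
    pack_origin(uint16_t origin_id, uint16_t flags)
    {
        return (uint16_t) (((flags & FLAGS_MASK) << FLAGS_SHIFT)
                           | (origin_id & ORIGIN_MASK));
    }

    int
    main(void)
    {
        uint16_t packed = pack_origin(2501, 0x3);

        printf("origin = %u, flags = 0x%x\n",
               packed & ORIGIN_MASK,
               (packed >> FLAGS_SHIFT) & FLAGS_MASK);
        return 0;
    }
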

If you restrict the number to a tiny range, then you'll be left wanting
some other mapping.  At one point, I did some work trying to get a notion
of named nodes implemented in Slony; I gave up on it, as the coordination
process was far too bug-prone.

In our environment, at Afilias, we have used quasi-symbolic node
numbers that encoded something somewhat meaningful about the
environment.  That seems better to me than the risky "kludge" of
saying:
- The first node I created is node #1.
- The second one is node #2.
- The third and fourth are #3 and #4.
- I dropped node #2 due to a problem, and thus the "new node 2" is #5.

That numbering scheme gets pretty counterintuitive fairly quickly, which
is why we took the approach of having a couple of digits indicating the
data centre followed by a digit indicating which node in that data centre.
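
To make that concrete, the convention amounts to nothing more than the
following (the data centre numbers are, of course, made up):

    #include <stdio.h>

    /* Example only: two digits of data centre, one digit of node-within-DC,
       so node 3 in data centre 12 gets id 123. */
    static int
    make_node_id(int data_centre, int node_in_dc)
    {
        return data_centre * 10 + node_in_dc;
    }

    int
    main(void)
    {
        printf("DC 12, node 3 -> %d\n", make_node_id(12, 3));   /* 123 */
        printf("DC 47, node 1 -> %d\n", make_node_id(47, 1));   /* 471 */
        return 0;
    }
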

If that all sounds incoherent, well, the more nodes you have around,
the more difficult it becomes to make sure you *do* have a coherent
picture of your cluster.

I recall the Slony-II project having a notion of attaching a permanent
UUID-based node ID to each node.  As long as there is somewhere decent
to find a symbolically significant node "name," I like the idea of the
ID *not* being in a tiny range, and being UUID/OID-like...
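
What I have in mind is essentially a small catalog mapping a human-meaningful
name to a wide, never-recycled identifier, along these lines (the shape and
the values here are entirely hypothetical):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical node catalog: a wide, never-reused id plus a symbolic
       name that humans (and tools) actually use to refer to the node. */
    typedef struct
    {
        uint64_t    node_id;     /* wide, UUID/OID-ish, never recycled */
        const char *node_name;   /* "ams1-node1", "ord3-node1", ...    */
    } NodeEntry;

    static const NodeEntry nodes[] = {
        { 730531842100001ULL, "ams1-node1" },
        { 730531842100002ULL, "ams1-node2" },
        { 918273645500001ULL, "ord3-node1" },
    };

    static const NodeEntry *
    lookup_node(const char *name)
    {
        for (size_t i = 0; i < sizeof(nodes) / sizeof(nodes[0]); i++)
            if (strcmp(nodes[i].node_name, name) == 0)
                return &nodes[i];
        return NULL;
    }

    int
    main(void)
    {
        const NodeEntry *n = lookup_node("ams1-node2");

        if (n)
            printf("%s -> %llu\n", n->node_name,
                   (unsigned long long) n->node_id);
        return 0;
    }
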

> Any opinions from others on this?
>
>> * Do we want origin_id as a parameter or as a setting in pgcontrol?
>> IIRC we go to a lot of trouble elsewhere to avoid problems with
>> changing on/off parameter values. I think we need some discussion to
>> validate where that should live.
> Hm. I don't really foresee any need to have it in pg_control. What do you want
> to protect against with that?
> It would need to be changeable anyway, because otherwise it would need to
> become a parameter for initdb, which would suck for anybody migrating to use
> replication at some point.
>
> Do you want to protect against problems in replication setups after changing
> the value?

In Slony, changing the node ID is Not Something That Is Done.  The ID is
captured in *way* too many places to have any hope of updating it in a
coordinated way.  I'd be surprised if it weren't similarly troublesome
here.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

