Hello Ajay,
You wrote ...
> I completely uderstood the thigs that you have explain me.
It doesn't seem like I'm getting through because you didn't answer my
question:
If node A does NOT forward data to the other nodes, WHO will?
Let's address the basic purpose of replication first. As implied by the
term Consistent Distributed Data Sets (CDDS), the main use of
replication is to keep tables in one database completely in sync (in
"near real-time") with the same named tables in another database. If
database A has a customer table with 3,245 rows, then Replicator's
objective is to keep those same 3,245 rows in database B (and C, D and
E, in your case). If you add a new customer in B, it has to be added to
all four other databases; otherwise there's no point in talking about
CONSISTENT data sets.
If you have two full peer databases, the paths to achieve consistency
are straightforward: A to B and B to A. If you define the path from B
to A and do NOT define the path from A to B, then updates that are made
in A will be lost and the databases will be inconsistent with each
other. The only way you could avoid defining one of the paths with two
databases is if one of them were read-only. In other words, if you
only define the path from B to A, then A should never be updated
locally. Otherwise, if you allow updates in A they will not be
replicated to B and again the databases will be inconsistent with each
other.
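
The two-database argument above can be checked mechanically. What follows
is only a minimal sketch of the reasoning, not how Replicator itself works:
it assumes a path is an (orig, local, target) triple as in the path tables
quoted further down, and the function names are made up for illustration.

```python
def reached(paths, origin):
    """Return the set of databases that end up holding an update made at `origin`.

    `paths` holds (orig, local, target) triples: an update that originated
    at `orig` and is now present at `local` is sent on to `target`.
    """
    holders = {origin}
    changed = True
    while changed:  # keep forwarding until no path delivers anything new
        changed = False
        for orig, local, target in paths:
            if orig == origin and local in holders and target not in holders:
                holders.add(target)
                changed = True
    return holders


def missing_targets(paths, nodes):
    """Map each node to the databases that never see its local updates."""
    return {n: sorted(set(nodes) - reached(paths, n)) for n in nodes}


# Two full peers, A = 10 and B = 20.
both_ways = [(10, 10, 20), (20, 20, 10)]
only_b_to_a = [(20, 20, 10)]

print(missing_targets(both_ways, [10, 20]))    # {10: [], 20: []}
print(missing_targets(only_b_to_a, [10, 20]))  # {10: [20], 20: []}
```

With only the B-to-A path defined, updates made in A (10) never reach
B (20), which is the inconsistency described above.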
Another use for Replicator is to distribute a data set. Instead of
having five customer tables each with 3,245 rows, each of what you're
calling "remote" nodes could hold only a portion of the table,
e.g., node B only customers with numbers from 1000 to 1999, node C only
customer numbers from 2000 to 2999, and so forth (this is just an
example and is NOT a good production technique). The "master" node
could then have the full table, possibly with its own customers. This
scheme distributes the data set in what is called horizontal
partitioning. It's analogous to a five-layer cake in the "master" node,
with each "remote" node having a copy of its own "slice" of the cake.
From this perspective, each "remote" slice can be viewed as full peer to
read-only and the "master" slice (if it exists) isn't replicated
anywhere. If a customer is added at node B it's replicated to A so that
users at the master node can see the entire table, but users at A
shouldn't update "remote" slices in the full table at A.
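
The partitioning rule above can be sketched as follows. This is only an
illustration under stated assumptions: the ranges for B and C come from the
example above, the other remote nodes are left out because their ranges are
not given, and treating every other customer number as the master's own
slice is an assumption. The function names are hypothetical. As noted,
range-based slices are an example and not a good production technique.

```python
# Slices taken from the example above; the remaining remote nodes are
# omitted because the text does not give their ranges.
SLICES = {
    "B": range(1000, 2000),  # customer numbers 1000 to 1999
    "C": range(2000, 3000),  # customer numbers 2000 to 2999
}
MASTER = "A"


def owner_of(customer_no):
    """Return the remote node whose slice holds this customer, or None."""
    for node, numbers in SLICES.items():
        if customer_no in numbers:
            return node
    return None


def may_update(node, customer_no):
    """A remote slice should be updated only at the remote node that owns it."""
    owner = owner_of(customer_no)
    if owner is None:
        # Assumption: rows outside every remote slice belong to the master.
        return node == MASTER
    return node == owner


def replication_targets(node, customer_no):
    """Remote slices are copied to the master; the master's slice goes nowhere."""
    if node != MASTER and may_update(node, customer_no):
        return [MASTER]
    return []


print(may_update("B", 1500))           # True
print(may_update("A", 1500))           # False
print(replication_targets("B", 1500))  # ['A']
```

Each remote slice is read-write at its owner and read-only at the master,
so only the remote-to-master paths are needed in this scheme.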
Please review the above and advise whether the tables that you intend to
replicate are going to be full copies of each other at each remote node
and the master or whether they will be horizontally partitioned.
Joe Abbate
Senior Software Engineer
Computer Associates
[EMAIL PROTECTED]
________________________________
From: Ajay Dalvi [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 31, 2005 4:17 AM
To: [email protected]
Cc: Abbate, Joseph M
Subject: RE: [Users] Related to replication from multiple
database nodes to one database node
Hi Joseph,
Thanks for your guidance.
I would like to add a few points to yesterday's mails.
As per my requirement, the various peers (remote nodes) will just feed
a single repository of all data.
I will explain this with a diagram.
My current setup can be depicted as follows:
    B       C
     \     /
      \   /
        A
      /   \
     /     \
    D       E
Node A will be Master node and all other nodes are remote nodes.
The requirement is that from all remote nodes (B,C,D,E) data should be
replicated to the master node (A). In this case master node A is not
going to forward the data to other remote nodes.
So in the above case the data propagation paths will be from the remote
nodes to the master node.
With nodes A to E numbered 10, 20, 30, 40 and 50, the data propagation
paths will be as follows:
    Original  Local  Target
    20        20     10
    30        30     10
    40        40     10
    50        50     10
If I define the paths as above, what happens is that at each remote
node records are transferred into the input queue but are not processed
any further, as if all the nodes were deadlocked.
But if I add at least one path in the reverse direction, i.e. from the
master node to a remote node:
    Original  Local  Target
    20        20     10
    10        10     20      // extra path added from master to remote
    30        30     10
    40        40     10
    50        50     10
With the above paths the replication setup works fine and data from the
remote nodes gets transferred to the master node. But the caveat is
that if any record is changed on the master (a.k.a. Repository Server),
it will try to enforce replication of that record onto the remote. If
the record exists on the remote server (e.g. B, or 20 in the above
case), it gets modified. But if it does not exist on remote server B
(the reason could be that this record was replicated from C, or 30),
then there is an error: "archive append error". This is obviously not
desired.
I would like to know if there is a way to avoid this extra path from the
master to a remote node, which I added just to make it work.
OR
Whether there is a way to define one-way replication where a bunch of
nodes replicate data to one repository node.
I would appreciate it if you could confirm whether this is the way
Ingres replication works, i.e. do we need to define paths from every
node to every other node in the replication setup?
Thanks
-Ajay
On Mon, 2005-08-29 at 20:17, Abbate, Joseph M wrote:
Hi Ajay,
You wrote ...
> Node A will be Master node and all other nodes are remote nodes. The
> requirement is that from all remote nodes (B,C,D,E) data should be
> replicated to the master node (A). In this case master node A is not
> going to forward the data to other remote nodes.
If node A does NOT forward data to the other nodes, WHO will? Consider
a customer table. Unless the table is horizontally partitioned, so that
node B has one segment, C has another segment, etc., and the master node
has all four segments, any update to a customer in B *has* to be
propagated to all four other nodes, so either B replicates directly to
the other four, or it replicates to the master and the master forwards,
or some combination thereof. Otherwise the address of some customer
will be correct in B and A but not in C, D or E.
> So is there any way that will allow me to avoid this extra path from
> master to remote node.
As I stated earlier, you have to imagine yourself at *each* node and
ensure that an update reaches *all* desired targets. I only started to
give you the configuration, but as I said you have to have 20 paths when
you're done. This is the full configuration for a star, five-node
scheme:
    Orig  Local  Target
    20    20     10      B copies to master
    20    10     30      Master sends on to other three
    20    10     40
    20    10     50
    30    30     10      C copies to master
    30    10     20      Master sends on to other three
    30    10     40
    30    10     50
    40    40     10      D copies to master
    40    10     20      Master sends on to other three
    40    10     30
    40    10     50
    50    50     10      E copies to master
    50    10     20      Master sends on to other three
    50    10     30
    50    10     40
    10    10     20      Master propagates to all four
    10    10     30      (for completeness)
    10    10     40
    10    10     50
The last four may never be used but they should still be specified.
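
The 20-path star configuration above follows a regular pattern, so it can
be generated rather than typed by hand. The following is a minimal sketch
that only produces (orig, local, target) triples in the same order as the
table; the function name is hypothetical and nothing here talks to Ingres
or Replicator.

```python
def star_paths(master, remotes):
    """Build (orig, local, target) triples for a star with `master` at the hub."""
    paths = []
    for remote in remotes:
        # The remote copies its own updates to the master ...
        paths.append((remote, remote, master))
        # ... and the master sends them on to every other remote.
        for other in remotes:
            if other != remote:
                paths.append((remote, master, other))
    # The master propagates its own updates to all remotes (for completeness).
    for remote in remotes:
        paths.append((master, master, remote))
    return paths


paths = star_paths(10, [20, 30, 40, 50])
print(len(paths))  # 20
for orig, local, target in paths[:4]:
    print(orig, local, target)
```

For four remote nodes this gives 4 x (1 + 3) = 16 paths for the remotes
plus 4 for the master, i.e. the 20 paths listed above.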
Regards,
Joe Abbate
Senior Software Engineer
Computer Associates
[EMAIL PROTECTED]
_______________________________________________
Users mailing list
[email protected]
http://ingres.ca.com/mailman/listinfo/users