Hi,

I am running a 3-node system: one master and two slaves talking
directly to the master. There are in fact many separate clusters in my
organisation, some running with 1 slave, some with up to 10 slaves, and
I see the same problem on all of them.

The sl_event table is growing forever on my slave nodes. It is cleaned
out after a slon daemon restart for that node, on the first run of
cleanupEvent() roughly ten minutes after starting. Thereafter
cleanupEvent() runs every ten minutes and reports no errors but never
clears out old events. I traced the problem back to max(seq_no) not
advancing in sl_confirm, i.e. no new confirms are arriving for certain
(origin node, receiving node) pairs.
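
To illustrate why I think one stalled pair is enough to block the
purge, here is a simplified Python model. It is only a sketch of my
understanding, not the actual cleanupEvent() code: the function name
purgeable_up_to, the tuple layout and the sequence numbers are all made
up for the example, and I am assuming that an origin's events can only
be purged up to the lowest max(seq_no) confirmed by any receiver.

```python
# Simplified model only: this is NOT the real Slony-I cleanupEvent() logic.
# Assumption: events from an origin can be purged only up to the lowest
# "latest confirmed seq_no" across all receivers of that origin.
from collections import defaultdict


def purgeable_up_to(confirms):
    """confirms: iterable of (origin, received, seq_no) rows.

    Returns {origin: highest seq_no confirmed by every receiver seen}.
    """
    # Latest confirmed seq_no per (origin, received) pair.
    latest = defaultdict(int)
    for origin, received, seq_no in confirms:
        key = (origin, received)
        latest[key] = max(latest[key], seq_no)

    # Per origin, the slowest receiver sets the purge limit.
    limit = {}
    for (origin, _received), seq_no in latest.items():
        limit[origin] = min(limit.get(origin, seq_no), seq_no)
    return limit


# Hypothetical data: node 1 is the master, nodes 2 and 3 are slaves.
confirms = [
    (1, 2, 500), (1, 3, 500),  # master's events confirmed by both slaves
    (2, 1, 480), (2, 3, 120),  # node 3's confirms for node 2 stalled at 120
]
limits = purgeable_up_to(confirms)
print(limits)
```

In this made-up data the single stalled pair (2, 3) holds the limit for
origin 2 at 120, even though node 1 has confirmed up to 480, which
matches the behaviour I am seeing.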

I thought I had found the solution yesterday. When I created the
cluster I only set up paths from the master to each slave and from each
slave back to the master, i.e. no slave-to-slave paths. I created the
missing slave-to-slave paths yesterday, and during the first
cleanupEvent() after that most old events were purged. However, since
then the event table has kept growing.
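
For reference, this is how I worked out which paths were missing, as a
small Python sketch. The node ids (1 for the master, 2 and 3 for the
slaves) and the helper name missing_paths are only for illustration,
and it assumes a path is wanted in both directions between every pair
of nodes.

```python
# Illustrative sketch: list the node pairs that have no path defined.
# Assumption: a full mesh is wanted, i.e. a path for every ordered pair.
from itertools import permutations


def missing_paths(nodes, existing):
    """Return ordered (from_node, to_node) pairs absent from `existing`."""
    return sorted(set(permutations(nodes, 2)) - set(existing))


# Hypothetical 3-node cluster: paths between master and slaves only.
nodes = [1, 2, 3]
existing = [(1, 2), (2, 1), (1, 3), (3, 1)]
print(missing_paths(nodes, existing))
```

For the 3-node case this leaves just the two slave-to-slave pairs,
(2, 3) and (3, 2), which are the ones I added.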

If I look in pg_listener on each node, the nodes with the oldest running
slon daemons have the most entries, and nodes with newer slon daemons
have fewer. I know the pg_listener entries are created when a slon
daemon starts, so I guess the older running ones are missing some listen
entries and that is why I am missing confirm notifies. I am a bit stuck
now, though:

1. My theory about missing pg_listener entries must be wrong, as there
is no way you can start every node after every other.
2. Restarting a slon daemon updates the confirm table with newer
confirms, so the first cleanup works. What is special about what happens
on start-up to fill this table that doesn't happen during normal
running?

Maybe I just have something misconfigured somewhere. All events are
replicating fine to all nodes. It is only missing confirms and hence
growing event tables that are causing me problems.

Thanks in advance for any help,
Vicki



_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
