Hello list, I'm new on this mailing list and on PGCluster. I have been testing pgcluster 1.7.0rc8 all of last week with three clusterdb nodes and one replication server, and I have run into a few issues/observations.
- Load balancer (pglb) isn't using all cluster servers; it only uses the first two from the configuration. If I have a config with the order "clusterdb1, clusterdb2, clusterdb3", it starts using only 1 and 2. With the order 3, 2, 1, it uses 3 and 2. The pglb.sts file shows the following (with order 1, 2, 3):

  Fri Jan 25 10:25:49 2008 port(5432) host:clusterdb1 initialize
  Fri Jan 25 10:25:49 2008 port(5432) host:clusterdb2 initialize
  Fri Jan 25 10:25:49 2008 port(5432) host:clusterdb3 initialize
  Fri Jan 25 10:26:00 2008 port(5432) host:clusterdb1 start use
  Fri Jan 25 10:26:02 2008 port(5432) host:clusterdb2 start use

  This might be related to http://pgfoundry.org/pipermail/pgcluster-general/2008-January/001797.html, even though cluster_table.c in 1.7.0rc8 doesn't (afaik) have the two "cnt++" lines.

- The function inet_server_addr() isn't replicated correctly; are there more functions like this? I was testing the load balancer with that function, and I noticed that inserting its value into a table didn't work correctly. When inserting on clusterdb1, clusterdb1 gets an empty value, clusterdb2 gets clusterdb2's IP, and clusterdb3 gets clusterdb3's IP. The correct behavior would be clusterdb1's IP on all nodes. Functions like NOW() work that way; why not all of them?

- Restarting the replication server causes the clusterdbs to think the replication server is down. There was a patch for this in http://pgfoundry.org/pipermail/pgcluster-general/2007-December/001744.html; is that coming to the main tree as well?

- While running multiple updates through the load balancer, trying to recover the replication server went into a never-ending loop with the message "now, waiting clear every transaction for recovery". Recovery with no updates running works fine. Is this a feature?

- I have also experienced some hangups which at first seemed random. When running an update on one clusterdb, the update was replicated to the others, but not applied correctly on the node where it was originally issued.
  Regarding those hangups: the replication server log didn't show anything special, with just sem_unlock as the last message. After further testing, it seems to happen when running an update against a clusterdb that isn't quite in sync. It would be much nicer if this kind of situation ended with an error message instead of just hanging the query. Or should I try to get more debug output on this?

Tommi Berg

_______________________________________________
Pgcluster-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgcluster-general
