On Wednesday, August 04, 2010 07:43:40 pm Tatsuo Ishii wrote:
> > Ok I have some more information. The cluster fell apart again yesterday.
> > Again, there was nothing out of the ordinary in the postgresql logs
> > leading upto the nodes falling out but this was logged in the pgpool log
> >
> > First node fell out at 10:36am:
> >
> > 2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not
> > match between master(16777216) and 2 th backend(0)
> > 2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind
> > from 0 th backend ^@ NUM_BACKENDS: 3
> > 2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind
> > from 1 th backend ^@ NUM_BACKENDS: 3
> > 2010-08-02 10:36:47 DEBUG: pid 16699: read_kind_from_backend: read kind
> > from 2 th backend D NUM_BACKENDS: 3
> > 2010-08-02 10:36:47 ERROR: pid 16699: read_kind_from_backend: 2 th kind D
> > does not match with master or majority connection kind ^@
> > 2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends.
> > Possible last query was: "select * from carts where cart_id in
> > (12979,12984,12987,12986,12982,12981)" kind details are: 2[D]
> > 2010-08-02 10:36:47 LOG: pid 16699: notice_backend_error: 2 fail over
> > request from pid 16699
> > 2010-08-02 10:36:47 DEBUG: pid 24933: failover_handler called
> >
> > Second node fell out at 1:56pm:
> >
> > 2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
> > 2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind
> > from 0 th backend ^@ NUM_BACKENDS: 3
> > 2010-08-02 13:56:21 DEBUG: pid 19720: read_kind_from_backend: read kind
> > from 1 th backend D NUM_BACKENDS: 3
> > 2010-08-02 13:56:21 ERROR: pid 19720: read_kind_from_backend: 1 th kind D
> > does not match with master or majority connection kind ^@
> > 2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends.
> > Possible last query was: "select * from carts where cart_id in
> > (13019,13018)" kind details are: 1[D]
> > 2010-08-02 13:56:21 LOG: pid 19720: notice_backend_error: 1 fail over
> > request from pid 19720
> > 2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called
> >
> > I notice two things, first that we died both times on a query of the
> > carts table, but this could just be a coincidence. Second, we have the
> > ^@ showing up as a kind again, except while two nodes returned ^@ when
> > the first node fell out, only one returned ^@ when the second fell out,
> > even though it was one of the nodes that returned ^@ that morning.
> >
> > I have PgPool 2.3.2 and PostgreSQL 8.4.4 on CentOS 5.5. Is this really a
> > PgPool issue, or should I walk over to the postgresql mailing lists?
>
> Seems like pgpool issue. What I am wondering is this line:
>
> 2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
>
> This suggests that your DB application uses very old PostgreSQL
> protocol(call version 2 protocol), implemented in PostgreSQL 7.3 or
> before. But you said your PostgreSQL is 8.4.4. Is there anything
> special with your DB application?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp

> Is there anything special with your DB application?

Ya, to put it nicely, it's old as dirt, thanks to about a decade of bad
decisions by people who shouldn't be allowed to decide what to have for
lunch.

Sorry, I forgot to mention that the primary user of that table is a custom
ordering system that currently runs Python 1.5 using _pg, which was most
likely built against a PostgreSQL release older than 7.2. I can try to move
it to something built against 8.3 (or 8.4, if I can convince it to install
on RHEL 2.1), though probably not for a week or two.

Is it possible that this old client is what is confusing pgpool?
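
One way to check whether the old client lines up with the failures is to
scan the pgpool log for AsciiRow debug lines (the version 2 protocol row
message) and see whether the same pgpool child pid later reports a kind
mismatch. This is only a rough sketch: it assumes each log entry sits on one
line in the "timestamp LEVEL: pid N: message" layout quoted above, and the
names summarize, LINE_RE and SAMPLE are made up for illustration. A match
shows correlation only, not that the v2 client caused the failover.

```python
import re

# Assumed layout, taken from the quoted log lines:
#   2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+): pid (?P<pid>\d+): (?P<msg>.*)$"
)


def summarize(lines):
    """Return (pids that logged AsciiRow, list of kind-mismatch events).

    Each event is (timestamp, pid, saw_asciirow_earlier_on_that_pid).
    """
    v2_pids = set()
    mismatches = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not in the assumed layout
        pid = int(m.group("pid"))
        msg = m.group("msg")
        if msg.startswith("AsciiRow:"):
            v2_pids.add(pid)
        elif msg.startswith("kind mismatch among backends"):
            mismatches.append((m.group("ts"), pid, pid in v2_pids))
    return v2_pids, mismatches


# Sample built from the log excerpts in this thread, with wrapped lines rejoined.
SAMPLE = [
    '2010-08-02 10:36:47 DEBUG: pid 16699: AsciiRow: 24 th field size does not '
    'match between master(16777216) and 2 th backend(0)',
    '2010-08-02 10:36:47 ERROR: pid 16699: kind mismatch among backends. '
    'Possible last query was: "select * from carts where cart_id in '
    '(12979,12984,12987,12986,12982,12981)" kind details are: 2[D]',
    '2010-08-02 13:56:21 DEBUG: pid 19720: AsciiRow: len: 1 data: t',
    '2010-08-02 13:56:21 ERROR: pid 19720: kind mismatch among backends. '
    'Possible last query was: "select * from carts where cart_id in '
    '(13019,13018)" kind details are: 1[D]',
    '2010-08-02 13:56:21 DEBUG: pid 24933: failover_handler called',
]

if __name__ == "__main__":
    pids, events = summarize(SAMPLE)
    for ts, pid, saw_v2 in events:
        print(ts, pid, "AsciiRow seen earlier" if saw_v2 else "no AsciiRow seen")
```

To run it against a real log, pass the open file object to summarize()
instead of SAMPLE.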
