Thanks for your response, I really appreciate it.

> First, I don't really agree on just attaching a node back into the
> pool in the manner you are doing with the steps shown below. If a
> PostgreSQL backend node goes down, for some reason out of anyone's
> control, you should bring that node back into the pool by using
> online_recovery; that's why that mechanism is in place.
>
> Now there are times that we may need to purposely take one of the
> PostgreSQL backend nodes down (I agree on that), but when that is
> the case one should have some maintenance procedures in place. There
> are several scenarios though, depending on your setup. You may need to
> keep your environment in read/write mode at all times, which means you
> would use the pcp utilities to detach the PG node, do whatever you
> need to do and then use the pcp online recovery to bring that node
> back into the pool (not pcp attach).
> If you happen to be able to have your environment in read-only mode,
> then you could use pcp detach to take the backend node out of the
> pool and then use pcp attach to bring that node back into the pool.

I understand your point and that's what I think too. But my example only
shows unit testing. My real case is as follows:

I have a 2-server or a 4-server configuration.

2-server configuration:
Application and DB run on each server.

4-server configuration:
Application runs on two of the servers.
DB runs on the other two servers.

Pgpool-II would run only on the servers where the application is running
(a total of 2), but only one application would be active at a time. The
applications would always connect to localhost port 9999.

In any case, when we are installing the applications and DBs, it's always
done one server at a time (this is the procedure and cannot currently be
changed).

The worst-case scenario for pgpool is at installation time with the
2-server setup:
1. Install first server (App & pgpool and DB)
2. Install second server (App & pgpool and DB)

For changes to take effect, the installation reboots the server (don't ask
me... It's the way it has been and it would take a lot of time/money to
replace this procedure).

So imagine it:
At the end of step 1, the system reboots. When it comes up, only the first
of the two servers is up; the other one does not even have an IP address
set. Pgpool starts and sees that there's no secondary database. With
failover_command I trigger a script that looks for availability of the
secondary database.
At the end of step 2, after rebooting, the secondary server is up and
running. Its pgpool will successfully connect to both databases since the
first one is already up. Meanwhile, the script running on the primary
server detects that the secondary database is now running (I check for
specific tables in the database, so I know it's up and ready to serve
application requests).
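
To make the availability check above concrete, here is a rough sketch of the kind of polling loop I mean. It is only an illustration, not my actual script: the function names, the table names and the choice of calling psql from Python are all assumptions made for the example.

```python
import subprocess
import time
from typing import Callable, Sequence


def wait_until_ready(probe: Callable[[], bool],
                     interval_s: float = 5.0,
                     max_attempts: int = 120,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` until it returns True or the attempts run out.

    Hypothetical helper: a probe that raises (connection refused, timeout,
    missing table) is treated the same as "not ready yet".
    """
    for attempt in range(max_attempts):
        try:
            if probe():
                return True
        except Exception:
            pass  # secondary not reachable yet; try again later
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False


def psql_table_probe(host: str, port: int, dbname: str, user: str,
                     tables: Sequence[str]) -> Callable[[], bool]:
    """Build a probe that succeeds only if every table can be queried.

    Assumption: psql is on PATH and can authenticate without a prompt.
    Table names are trusted placeholders, not user input.
    """
    def probe() -> bool:
        for table in tables:
            result = subprocess.run(
                ["psql", "-h", host, "-p", str(port), "-U", user,
                 "-d", dbname, "-Atc", f"SELECT 1 FROM {table} LIMIT 1"],
                capture_output=True, timeout=10)
            if result.returncode != 0:
                return False
        return True
    return probe
```

The script started by failover_command would build a probe for the secondary (for example `psql_table_probe("secondary-host", 5432, "appdb", "postgres", ["some_table"])`, where every value is a placeholder) and pass it to `wait_until_ready`. Only when that returns True does it go on to the next step.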
If specific data in the tables is not the same between the primary and
secondary databases for any reason, I will do a *manual* pcp recovery;
otherwise (the most likely case at installation time, since both databases
have just been installed and should have the same data), I do pcp attach.

Why don't I do pcp recovery in all cases? Because pcp recovery requires that
there be no connections from the application during the second stage of the
recovery. With the release that is working for me (2.1) I cannot disconnect
clients at the second stage only (which is what
client_idle_limit_in_recovery does in the latest copy of pgpool-II), so I
need to close the application on purpose. Therefore, I need manual recovery.

In this regard, I'm going to re-image my development box and install a
fresh latest CVS version of pgpool-II, because something funny like this
happened when I went from 2.0.1 to 2.1, so... No clue. The thing is that
I'm running out of time.

In conclusion, it should not behave the way it does when I disconnect a
backend and do pcp attach after that.

> I have downloaded the latest CVS version and tried the following a few
> times and did not see any issues.

I'll push very hard to use it, starting with re-imaging my box.

> On your last step though, you mentioned that you "re-attached the
> primary" backend but I guess you meant the secondary backend since
> that was the one you stopped.

Yes, you are right: I meant 'secondary'.

> Marcelo
> PostgreSQL DBA
> Linux/Solaris System Administrator

Thanks, Marcelo

Daniel

>
> On Jan 20, 2009, at 5:46 PM, [email protected] wrote:
>
> > I think the patch is for debugging purposes, but I'm not sure.
> >
> > The weird thing that happens to me is the following (I just tested it
> > again):
> >
> > 1. The two backends start
> > 2. start pgpool. So both backend statuses are 2.
> > 3.a stop primary backend,
> > The connection is lost with the message "server closed the
> > connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > The connection to the server was lost. Attempting reset: Succeeded.",
> > every time I try to re-run the query.
> > If I re-attach the primary backend, the connection works just fine
> > again.
> > 3.b stop secondary backend.
> > The connection keeps going (good).
> > If I re-attach the primary backend, the connection blocks.
> >
> > It's weird
> >
> > Daniel
> >
> >
> >> -----Original Message-----
> >> From: Marcelo Martins [mailto:[email protected]]
> >> Sent: Tuesday, January 20, 2009 6:03 PM
> >> To: Crespo, Daniel @ SDS
> >> Cc: [email protected]
> >> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> >>
> >> yeah just saw your new one when sent mine :)
> >>
> >> weird that it just keeps throwing that error.
> >> I think I have done the PG shutdown and then re-attaching about 15
> >> times now and I only get the "server closed the connection
> >> unexpectedly" once.
> >>
> >> I haven't tried to apply the patch that Tatsuo mentioned on 18th
> >> though to see what difference it makes. might try that today
> >>
> >>
> >> Marcelo
> >> PostgreSQL DBA
> >> Linux/Solaris System Administrator
> >>
> >> On Jan 20, 2009, at 4:52 PM, [email protected] wrote:
> >>
> >>> Hi, Marcelo,
> >>>
> >>> I just wrote to the mail list something about exactly this.
> >>>
> >>> In your description, it doesn't happen to me... I don't know why...
> >>> After doing failover, when a query is executed it throws back that
> >>> "server closed the connection unexpectedly", and keeps throwing that
> >>> for every try I make. No idea about this.
> >>>
> >>> Thanks for the information!
> >>>
> >>> Daniel
> >>>
> >>>> -----Original Message-----
> >>>> From: Marcelo Martins [mailto:[email protected]]
> >>>> Sent: Tuesday, January 20, 2009 5:34 PM
> >>>> To: Crespo, Daniel @ SDS
> >>>> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> >>>>
> >>>> Hi Daniel,
> >>>>
> >>>> I have just tested that with pgpool 2.1 and I also have the
> >>>> same issue.
> >>>> When I re-attach node 1 (second node) back, the psql
> >>>> connection that I had opened hangs after executing a second query.
> >>>>
> >>>> ERROR: pid 31003: pool_read2: EOF encountered with backend
> >>>>
> >>>> On the latest CVS version though the hanging issue seems to be fixed.
> >>>> Now when the failover/failback happens though it seems like pgpool
> >>>> failover_handler process kills the childs that pgpool had open with
> >>>> node 1 (second node - at least that is what I can tell from what I
> >>>> see ) therefore when a query is executed it throws back that "server
> >>>> closed the connection unexpectedly" . When I execute the query a
> >>>> second time then pgpool uses a new child that has connection
> >>>> opened to node 0 "new_connection: skipping slot 1 because
> >>>> backend_status = 3"
> >>>>
> >>>>
> >>>> Marcelo
> >>>> PostgreSQL DBA
> >>>> Linux/Solaris System Administrator
> >>>>
> >>>> On Jan 13, 2009, at 8:18 AM, [email protected] wrote:
> >>>>
> >>>>> Sorry for the delay, I haven't had enough time.
> >>>>>
> >>>>>> 1. Show us the logs. Full logs, but only the relevant parts (got tons
> >>>>>> of things to read every day here). :)
> >>>>>
> >>>>> I'll try it again with full logs to give them to you guys
> >>>>>
> >>>>>> 2. Check whether PostgreSQL is having some problem of some sort
> >>>>>> before blaming it on pgpool-II. Can you run the same queries on
> >>>>>> both nodes and get the same results?
> >>>>>
> >>>>> PostgreSQL is not having any problems. It's not a query problem.
> >>>>> When I install the latest CVS head, what I showed to you is what
> >>>>> happens.
> >>>>> However, when I uninstall it and install the 2.1 released version, it
> >>>>> doesn't happen anymore. The problem with this 2.1 release is that it
> >>>>> doesn't keep the connection when a node is detached or attached (if I
> >>>>> have an already opened connection and do attach/detach node, it
> >>>>> locks. I must disconnect and reconnect in order to keep doing
> >>>>> queries). Another problem is that I need the insert lock newly
> >>>>> introduced to automatically apply on serial fields tables.
> >>>>>
> >>>>>> 3. Check permissions in both pg_hba.conf files.
> >>>>> No problem with this.
> >>>>>
> >>>>>> 4. Have you considered using version 8.3.5 of PostgreSQL and see how
> >>>>>> it goes? Or at least, the last revision of the 8.1 branch.
> >>>>> No. I can not update PostgreSQL. I'm using 8.2.1.
> >>>>>
> >>>>> When I have the logs, I'll post them for sure. Thanks!
> >>>>>
> >>>>> Daniel
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: [email protected]
> >>>>>> [mailto:[email protected]] On Behalf Of
> >>>>>> Jaume Sabater
> >>>>>> Sent: Friday, January 09, 2009 2:32 AM
> >>>>>> To: [email protected]
> >>>>>> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> >>>>>>
> >>>>>> On Thu, Jan 8, 2009 at 10:14 PM, <[email protected]> wrote:
> >>>>>>
> >>>>>>> And issue a SQL Select command on a table, like:
> >>>>>>> postgres=# select * from pg_stat_activity ;
> >>>>>>>
> >>>>>>> It returns:
> >>>>>>> postgres=# select 1;
> >>>>>>> server closed the connection unexpectedly
> >>>>>>> This probably means the server terminated abnormally
> >>>>>>> before or while processing the request.
> >>>>>>> The connection to the server was lost. Attempting reset: Succeeded.
> >>>>>>>
> >>>>>>> postgres=# select 1;
> >>>>>>
> >>>>>> Some ideas:
> >>>>>>
> >>>>>> 1. Show us the logs. Full logs, but only the relevant parts (got tons
> >>>>>> of things to read every day here). :)
> >>>>>> 2. Check whether PostgreSQL is having some problem of some sort
> >>>>>> before blaming it on pgpool-II. Can you run the same queries on
> >>>>>> both nodes and get the same results?
> >>>>>> 3. Check permissions in both pg_hba.conf files.
> >>>>>> 4. Have you considered using version 8.3.5 of PostgreSQL and see how
> >>>>>> it goes? Or at least, the last revision of the 8.1 branch.
> >>>>>>
> >>>>>> --
> >>>>>> Jaume Sabater
> >>>>>> http://linuxsilo.net/
> >>>>>>
> >>>>>> "Ubi sapientas ibi libertas"
_______________________________________________
Pgpool-general mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-general
