Hi,
On Tue, 20 May 2008 17:16:27 +0200 Emmanuel Cecchet <[EMAIL PROTECTED]> wrote: > Hi Stuart, > > > We experienced the below after an attempted automated backup > > operation, the pg_dump command did not actually begin till 3:30. > > Why is there a timeout on 0 pending transactions? > > > Actually, begin handling is very complex in Sequoia depending on the > nature of the transaction. There is an optimization that starts > transactions lazily to prevent starting read-only transactions on > multiple nodes. > So a couple of questions about your configuration since each case has > a different code path: > - are you using multiple controllers? Yes 2 controllers, primarily the app writes to the other controller, and reading from the one we are taking the backup on. We currently are using only 1 url for the writes, so not taking advantage of HA on the driver, as this is built into the code, and it happens our delimiter is a ",", which of course causes issues with the url string that Sequoia recommends. We will testing this with a new delimiter in the future. > - was the client starting the transaction connected to the controller > where the timeout occurred? We do not use (being,insert/update,commit) transaction When you refer to client, I assume you refer to our application? > - was the query following the begin a SELECT or an update (INSERT, > UPDATE, DELETE...)? again not using transactions It was an update, I looked up the checkpoint log_id in the recovery database. log ids are 2535406 insert 2535407 update (checkpoint) 2535408 insert > - are you using persistent connections? We are not specifying persistent connection in the driver url. I will have to dig deeper to know how the code is re-using what we define as pool, and if it is assigning a persistent connection in a pool. > > 2008-05-20 01:51:23,631 WARN sequoia.controller.scheduler Timeout > > in begin, still waiting for 0 pending transactions to complete > > > > 2008-05-20 01:51:23,631 WARN controller.virtualdatabase.XXXX > > Begin failed (Timeout in begin, still waiting for 0 pending > > transactions to complete) > > > > Would you have a complete log to see when the backup operation was > triggered and what were the subsequent errors if any? > 2008-05-20 01:45:09,624 INFO controller.RequestManager.XXXX Suspending activity for XXXX 2008-05-20 01:45:09,629 INFO controller.RequestManager.XXXX All activity is suspended for XXXX 2008-05-20 01:45:09,632 INFO controller.virtualdatabase.XXXX Storing checkpoint backup 20080520-172.30.20.250:25322-20080520014509624+0100 2008-05-20 01:45:09,672 INFO controller.virtualdatabase.XXXX Disabling backend ldb4, waiting for current transaction to complete on this backend 2008-05-20 01:45:09,682 INFO sequoia.controller.connection 5 connections freed on "jdbc:postgresql://X.X.X.X:5432/XXXX" 2008-05-20 01:45:10,062 INFO sequoia.controller.connection 20 connections freed on "jdbc:postgresql://X.X.X.X:5432/XXXX" 2008-05-20 01:45:10,434 INFO sequoia.controller.connection 20 connections freed on "jdbc:postgresql://X.X.X.X:5432/XXXX" 2008-05-20 01:45:17,351 sequoia.controller.scheduler Timeout in begin, still waiting for 0 pending transactions to complete 2008-05-20 01:51:23,631 WARN controller.virtualdatabase.XXXX Begin failed (Timeout in begin, still waiting for 0 pending transactions to complete) 2008-05-20 01:51:23,631 INFO virtualdatabase.VirtualDatabaseWorkerThread.XXXX Error during command execution (Timeout in begin, still waiting for 0 pending transactions to complete) The last 2 messages were repeated at random for the next 1 3/4hrs looks as though the checkpoint was not stored for till 3:30am? 2008-05-20 03:30:38,825 WARN sequoia.controller.recoverylog Checkpoint backup 20080520-172.30.20.250:25322-20080520014509624+0100 was stored 2008-05-20 03:30:38,836 INFO controller.virtualdatabase.XXXX Backend ldb4 is now disabled 2008-05-20 03:30:38,837 INFO controller.virtualdatabase.XXXX Backend ldb4 disabled on controller Member(address=/X.X.X.X:50850, uid=172.30.20.80:50850) 2008-05-20 03:30:38,840 INFO controller.RequestManager.XXXX Resuming activity for XXXX 2008-05-20 03:30:38,845 INFO controller.RequestManager.XXXX All activity is now resumed for XXXX 2008-05-20 03:30:38,846 INFO controller.RequestManager.XXXX Backend ldb4 is now disabled 2008-05-20 03:30:38,846 INFO controller.RequestManager.XXXX Starting backup of backend ldb4 2008-05-20 03:30:39,034 INFO backup.backupers.NativeCommandExec Starting execution of "pg_dump -h X.X.X.X -p 5432 -U XXX --format=c -f /home/database/20080520 -i XXXX" Thanks again, Stuart _______________________________________________ Sequoia mailing list [email protected] https://forge.continuent.org/mailman/listinfo/sequoia
