Hi Stuart,

Ok, thanks for the clarifications. Do you have time to handle this bug, or
should I ask the maintenance team to have a go at it?

Do we have an RT for the pgbouncer upgrade yet?

Cheers

On 12-05-31 04:29 PM, Stuart Bishop wrote:
> On Thu, May 31, 2012 at 10:04 PM, Francis J. Lacoste
> <francis.laco...@canonical.com> wrote:
>> Hi Stuart,
>>
>> We have a cluster of recent bugs that seems to hint that the retry
>> transaction code might need some tweaking since our upgrade to PG 9.1.
>>
>> https://bugs.launchpad.net/launchpad/+bug/1000805
>>
>> That first one is a
>>
>> psycopg2.OperationalError: could not send data to server: Connection
>> timed out
>>
>> when serving private attachments from the librarian. Usually, attempting
>> again will work. Is that a new error in PG 9.1 that we should add to the
>> retry list? It only re-attempts DisconnectionError, IntegrityError and
>> TransactionalRollbackError.
>
> It's not PG 9.1 - this is entirely client side. The trigger was likely
> psycopg2 2.4 or libpq5, both of which needed to be upgraded before the
> PG 9.1 upgrade. I've updated the bug report - Storm needs to catch
> these exceptions so connections get reopened, and it will reraise them
> as a DisconnectionError IIRC.
>
> It might also be new because our sockets were not failing like this
> before. We really shouldn't be losing sockets like this - perhaps a
> pg_bouncer upgrade is in order? I think the relevant connection limit
> in pg_bouncer was set to 20 connections and was recently bumped to 40.
>
>
>> https://bugs.launchpad.net/launchpad/+bug/1006530
>> https://bugs.launchpad.net/launchpad/+bug/1006531
>>
>> These two are OOPSes triggered during fastdowntime. I was under the
>> impression that we weren't logging those during fastdowntime and thus
>> our filters might need updating. Or maybe I'm mistaken and it's just
>> that Diogo is our normal filter here, and since he's on leave, this
>> explains why Laura reported bugs about those.
>>
>> Thanks for your insights.
>
> We log OOPSes during fastdowntime, because fastdowntime looks exactly
> like a database outage from the client side and we want to know about
> database outages. I'm not sure what filtering was being done to hide
> them from the reports. We should report these failures if they happen
> outside of the scheduled fastdowntime window.
>
>

--
Francis J. Lacoste
francis.laco...@canonical.com
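
For reference, a minimal sketch of the behaviour discussed in the thread: an OperationalError that looks like a lost socket is re-raised as a DisconnectionError, and only errors on the retry list are attempted again. This is not the actual Launchpad or Storm code. The two exception classes are stand-ins so the sketch runs without psycopg2 or Storm installed, and `DISCONNECT_MESSAGES`, `translate_disconnect`, `retry_transaction` and `max_attempts` are hypothetical names. Matching on the message text is an assumption of the sketch.

```python
import functools


class OperationalError(Exception):
    """Stand-in for psycopg2.OperationalError."""


class DisconnectionError(Exception):
    """Stand-in for Storm's DisconnectionError."""


# Hypothetical: message fragments treated as a lost connection.
DISCONNECT_MESSAGES = ("could not send data to server",)

# Hypothetical retry list; the thread also mentions IntegrityError and a
# transaction rollback error, omitted here to keep the sketch short.
RETRYABLE = (DisconnectionError,)


def translate_disconnect(exc):
    """Return a DisconnectionError if exc looks like a dead socket."""
    if isinstance(exc, OperationalError) and any(
            fragment in str(exc) for fragment in DISCONNECT_MESSAGES):
        return DisconnectionError(str(exc))
    return exc


def retry_transaction(max_attempts=3):
    """Re-run the wrapped callable when it fails with a retryable error."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    translated = translate_disconnect(exc)
                    if (isinstance(translated, RETRYABLE)
                            and attempt < max_attempts):
                        # Real code would reopen the connection here.
                        continue
                    if translated is exc:
                        raise
                    raise translated from exc
        return wrapper
    return decorator
```

With this shape, a "could not send data to server" failure is retried up to `max_attempts` times and surfaces as a DisconnectionError if it never succeeds, while any other OperationalError propagates unchanged on the first attempt.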
_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : launchpad-dev@lists.launchpad.net
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp