For Performance Tuesday today, I spent a little time checking for new OOPSes and trying to get a handle on existing ones.
One in particular stood out - https://launchpad.net/bugs/816235 - where we had a trivial insert take over 9 seconds. We've had a number of this sort of thing and have often handwaved it off as contention, or been unable to pin it down to a specific event that we could then design away.

tl;dr:
 - all mutating transactions, including manual queries, backend scripts, webapp POST and xmlrpc requests, need to be < 2.5 seconds uncontended for us to be able to do < 5 second pages reliably
 - having referenced data and updated-by-backend fields (like heat, last_scanned etc) in the same table will increase contention

The long version :)...

So, I've done a little digging, and it boils down to 'contention', but we can actually put a very specific behaviour on it. The schema in question depends on two tables:
 - Branch, with an id and a last_scanned field
 - BugBranch, with a branch foreign key reference which postgresql knows about

When a transaction F updates a Branch row, any change from another transaction to BugBranch which adds a reference to the Branch row that F is updating will block until transaction F completes (one way or another). The underlying thing here is that postgresql supports row level locks but not column level locks; so to prevent someone invalidating the foreign key reference being added to BugBranch, the adding transaction acquires a share level lock on the referenced row, which blocks on any existing update lock (and conversely blocks other writers). There's a minimal SQL reproduction sketched below.

Things we can do to prevent this being a problem:
 - stop using foreign keys
 - make sure the maximum transaction length with contention on that branch row is no longer than (max acceptable page load time divided by max concurrent transactions that will need that row)
 - enhance postgresql to support column locks (nontrivial!)
 - move status fields - fields that backend processes (in particular) work on - to leaf tables rather than referenced tables

There is another, similar situation with UNIQUE constraints (of any sort, AFAICT), where the update to the index enforcing the constraint is protected by a lock, and a second transaction inserting a potentially conflicting value will block until that lock is released so it can see whether to succeed, or to fail with 'unique constraint violated'. For that case:
 - stop using UNIQUE constraints
 - short transactions (same formula)

Foreign keys are pretty useful, so I'd hesitate to stop using them. Likewise UNIQUE constraints are much simpler to work with than a journal-and-resolve-later approach (which things like cassandra need :P). So short transactions -really- matter for avoiding these points of contention; but how short is short?

The formula I gave above is a rule of thumb - it's wrong, but it's also right enough for us. If we have (say) 3 things that can take a lock out on a branch [branch scanner, code importer, owner of the branch in the webapp], and all of them try at the same time, all three will serialise; so to get a guaranteed service time of 5 seconds, the sum of the times for those three things must be <= 5 seconds. In reality there are 64 concurrent frontend requests we could have, plus 64 internal xmlrpc requests, plus some number of backend scripts - so we need mutating transactions to be <= 5s/128, or 39ms, to guarantee that we'll never spike over 5 seconds. Readonly transactions are generally not a problem: because of postgresql's MVCC model, most updates will not block readers (in the examples above both transactions were mutating, even though the FK case was mutating a different table).
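To make the FK case concrete, here is a minimal sketch of the two-session interaction described above. The table definitions are cut down to just the columns that matter and the ids/values are made up; this illustrates the locking behaviour, it is not the real Launchpad schema.

    -- Two tables standing in for Branch and BugBranch (simplified):
    CREATE TABLE Branch (
        id           serial PRIMARY KEY,
        last_scanned timestamp
    );
    CREATE TABLE BugBranch (
        id     serial PRIMARY KEY,
        bug    integer NOT NULL,
        branch integer NOT NULL REFERENCES Branch(id)
    );
    INSERT INTO Branch (last_scanned) VALUES (NULL);  -- becomes branch id 1

    -- Session A: a backend process (e.g. the branch scanner) updating a
    -- status field on the branch.
    BEGIN;
    UPDATE Branch SET last_scanned = now() WHERE id = 1;
    -- ...session A keeps doing work inside the same transaction...

    -- Session B: the webapp linking a bug to that same branch.
    BEGIN;
    INSERT INTO BugBranch (bug, branch) VALUES (1234, 1);
    -- Blocks here: the foreign key check takes a share lock on the
    -- Branch row that session A holds an update lock on, so this
    -- 'trivial insert' waits until A commits or rolls back.

    -- The UNIQUE constraint case is analogous: a second transaction
    -- inserting a potentially conflicting value waits on the first to
    -- find out whether it succeeds or gets 'unique constraint violated'.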
That's pretty extreme, and we're probably safe with a less extreme threshold, but we can expect occasional contention where two or three things do end up taking the same lock at the same time, so we need to make sure /all/ transactions - appserver, LOSA, or backend script - are under 2.5 seconds long to fully squelch this.

We have a few things to do from here:
 - get some automation for LOSAs doing queries, so they are not doing them interactively and so there are timeouts on what they do
 - give backend scripts transaction timeouts as well
 - get our soft timeout figure low enough to detect long transactions, so we can tweak the schema to reduce contention and also fix things doing too much work in one transaction

I'm going to file some bugs as a followup; my initial implementation concept is that moving all backend scripts to be internal XMLRPC clients with no direct database access is the best approach overall - we get to use our existing rich reporting for performance and latency, and we get automated termination of bad requests.
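As one building block for the timeout items above, postgresql's per-session statement_timeout is the sort of guard I have in mind for LOSA sessions and backend scripts; a sketch, using the 2.5 second figure from above (the exact value, and how we cap whole transactions rather than single statements, is still to be worked out):

    -- Set at the start of a LOSA or backend-script session (or in the
    -- connection settings for those users): any single statement that
    -- runs longer than this is aborted rather than left holding locks.
    SET statement_timeout = 2500;  -- milliseconds, i.e. 2.5 seconds

-Rob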