On 13 January 2016 at 21:45, Sylvain MARECHAL <[email protected]> wrote:
> The problem is that the (1) DDL request will wait indefinitely, meaning
> all transactions will continue to fail until the DDL operation is manually
> aborted (for example, doing CTRL C in psql to abort the "CREATE TABLE").

Correct, and by design.

I'd like to do a pre-check where we sync up with the peer nodes and see if
they're all alive before we take the DDL lock. This would reduce the impact
a bit and allow an early ERROR like "ERROR: cannot perform DDL when one or
more nodes is unreachable".

However... we have something pretty close already. You can just set a
statement_timeout in the session doing the DDL. It'll cancel the operation
if it takes too long.

Note that a lock_timeout will NOT work because the BDR global DDL lock is
not recognised as a true lock by PostgreSQL.

> What is the best practice to make sure the DDL operation will fail,
> possibly after a timeout, if one of the node is down?

statement_timeout

> I could check the state of the node before issuing the DDL operation, but
> this solution is far from being perfect as the node may fail right after
> this.

Correct, but it's still useful to do. I'd check to see all nodes are
connected in pg_stat_replication then I'd issue the DDL with a
statement_timeout set.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
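
The advice above (check pg_stat_replication, then run the DDL under a
statement_timeout) could be scripted roughly as follows. This is only a
sketch under stated assumptions: the helper name run_ddl_with_timeout, the
expected_peers parameter, and the "state = 'streaming'" test for a connected
peer are illustrative and not part of BDR. It expects any Python DB-API
connection (psycopg2, for example) in its default non-autocommit mode, and it
uses SET LOCAL so the timeout reverts when the transaction ends.

```python
# Assumed check for "connected" peers: one streaming walsender per peer node.
PEER_COUNT_SQL = "SELECT count(*) FROM pg_stat_replication WHERE state = 'streaming'"


def run_ddl_with_timeout(conn, ddl, timeout_ms=10000, expected_peers=None):
    """Run one DDL statement with a statement_timeout (hypothetical helper).

    conn           -- a DB-API connection, not in autocommit mode
    ddl            -- the DDL statement to execute
    timeout_ms     -- statement_timeout in milliseconds
    expected_peers -- if given, fail early unless at least this many peers
                      appear as streaming in pg_stat_replication
    """
    cur = conn.cursor()
    try:
        if expected_peers is not None:
            # Pre-check only: a node can still fail right after this query.
            cur.execute(PEER_COUNT_SQL)
            (connected,) = cur.fetchone()
            if connected < expected_peers:
                raise RuntimeError(
                    "only %d of %d peers connected" % (connected, expected_peers)
                )
        # SET LOCAL lasts until the end of the current transaction.
        cur.execute("SET LOCAL statement_timeout = %d" % int(timeout_ms))
        # If acquiring the global DDL lock takes too long, the server cancels
        # the statement and the driver raises an error here.
        cur.execute(ddl)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

With a real connection, a call might look like
run_ddl_with_timeout(conn, "CREATE TABLE t (id int primary key)",
timeout_ms=5000, expected_peers=2). The timeout is what bounds the wait; the
peer check merely avoids starting DDL that is already known to block.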
