I made a small modification to pg_dump to prevent parallel backup failures
caused by exclusive lock requests made by other tasks.

The modification takes shared locks for each parallel backup worker at the
very beginning of the job. That way, any other job that attempts to acquire
exclusive locks will wait for the backup to finish.

In my case, each server was taking a day to complete the backup; now, with
parallel backup, one takes 3 hours and the others less than an hour.

The code below is not very elegant, but it works for me. My wishlist for
the backup is:

1) replace the plpgsql with C code that reads the backup TOC and assembles
the locks;
2) add a timeout to the locks;
3) broadcast the end of copy to every worker in order to release the locks
as early as possible;
4) create a monitor thread that prioritizes a copy job when an exclusive
lock is requested on its table;
5) grant the lock to another connection of the same distributed transaction
if it is already held by any connection of that transaction. Is there some
side effect I am not seeing?

1 to 4 are within my capabilities and I may do them in the future. 5 is too
advanced for me and I do not dare to mess with something so fundamental
right now.

Is anyone else working on this?

In Parallel.c, in void RunWorker(...), add:

PQExpBuffer query;
PGresult   *res;

query = createPQExpBuffer();
appendPQExpBufferStr(query,
    "do language 'plpgsql' $$"
    " declare"
    "    x record;"
    " begin"
    "    for x in select * from pg_tables where schemaname not in"
    " ('pg_catalog','information_schema') loop"
    "        raise info 'lock table %.%', x.schemaname, x.tablename;"
    "        execute 'LOCK TABLE '"
    " || quote_ident(x.schemaname) || '.' || quote_ident(x.tablename)"
    " || ' IN ACCESS SHARE MODE';"
    "    end loop;"
    " end;"
    "$$");

res = PQexec(AH->connection, query->data);

if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
    exit_horribly(modulename,
                  "Could not lock the tables to begin the backup\n");

PQclear(res);
destroyPQExpBuffer(query);
