Hi everyone,

I am just responding on the latest mail on this thread. But the question is 
about functionality. The proposal is to add a single flag 
--include-foreign-data which controls whether or not data is dumped for all the 
foreign tables in a database. That may not serve the purpose. A foreign table 
may point to a view, materialized view or inheritance tree, and so on. A 
database can have foreign tables pointing to all of those kinds. Restoring data 
to a view won't be possible and restoring it into an inheritance tree would 
insert it into the parent only and not the children. Further, a user may not 
want the data to be dumped for all the foreign tables since their usages are 
different esp. considering restore. I think a better option is to extract data 
in a foreign table using --table if that's the only usage. Otherwise, we need a 
foreign table level flag indicating whether pg_dump should dump the data for 
that foreign table or not.

The option enables the user to dump data of tables backed by a specific 
foreign_server. It is up to the user to guarantee that the foreign server is 
also writable, that is the reason to make the option opt-in. The option can be 
combined with --table to dump specific tables if needed. If the user has 
different foreign servers in the database has to make the conscious decision of 
dumping each one of them. Without this option the user is totally unable to do 
it.


> On 2020-01-21 10:36, Luis Carril wrote:
>>> Yes we can support --include-foreign-data without parallel option and
>>> later add support for parallel option as a different patch.
>>
>> Hi,
>>
>>      I've attached a new version of the patch in which an error is
>> emitted if the parallel backup is used with the --include-foreign-data
>> option.
>
> This seems like an overreaction.  The whole point of
> lockTableForWorker() is to avoid deadlocks, but foreign tables don't
> have locks, so it's not a problem.  I think you can just skip foreign
> tables in lockTableForWorker() using the same logic that getTables() uses.
>
> I think parallel data dump would be an especially interesting option
> when using foreign tables, so it's worth figuring this out.

What do you think of Peter's comment?
I took a look at it, we could skip foreign tables by checking the catalog in 
lockTableForWorker but this would imply an extra query per call to the function 
(as in getTables), which would be irrelevant for most of the cases. Or we could 
pass in the TocEntry that it is a foreign table (although that seems highly 
specific).
Also, would it not be possible to offer support of LOCK TABLE on foreign tables?

At this point I would like to leave the patch as is, and discuss further 
improvement in a future patch.

Luis M.

________________________________
From: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Sent: Wednesday, March 4, 2020 5:39 PM
To: David Steele <da...@pgmasters.net>
Cc: Luis Carril <luis.car...@swarm64.com>; vignesh C <vignes...@gmail.com>; 
Peter Eisentraut <peter.eisentr...@2ndquadrant.com>; Alvaro Herrera 
<alvhe...@2ndquadrant.com>; Daniel Gustafsson <dan...@yesql.se>; Laurenz Albe 
<laurenz.a...@cybertec.at>; PostgreSQL Hackers 
<pgsql-hackers@lists.postgresql.org>
Subject: Re: Option to dump foreign data in pg_dump

I am just responding on the latest mail on this thread. But the question is 
about functionality. The proposal is to add a single flag 
--include-foreign-data which controls whether or not data is dumped for all the 
foreign tables in a database. That may not serve the purpose. A foreign table 
may point to a view, materialized view or inheritance tree, and so on. A 
database can have foreign tables pointing to all of those kinds. Restoring data 
to a view won't be possible and restoring it into an inheritance tree would 
insert it into the parent only and not the children. Further, a user may not 
want the data to be dumped for all the foreign tables since their usages are 
different esp. considering restore. I think a better option is to extract data 
in a foreign table using --table if that's the only usage. Otherwise, we need a 
foreign table level flag indicating whether pg_dump should dump the data for 
that foreign table or not.

On Wed, Mar 4, 2020 at 12:41 AM David Steele 
<da...@pgmasters.net<mailto:da...@pgmasters.net>> wrote:
Hi Luis,

On 1/29/20 11:05 AM, Peter Eisentraut wrote:
> On 2020-01-21 10:36, Luis Carril wrote:
>>> Yes we can support --include-foreign-data without parallel option and
>>> later add support for parallel option as a different patch.
>>
>> Hi,
>>
>>      I've attached a new version of the patch in which an error is
>> emitted if the parallel backup is used with the --include-foreign-data
>> option.
>
> This seems like an overreaction.  The whole point of
> lockTableForWorker() is to avoid deadlocks, but foreign tables don't
> have locks, so it's not a problem.  I think you can just skip foreign
> tables in lockTableForWorker() using the same logic that getTables() uses.
>
> I think parallel data dump would be an especially interesting option
> when using foreign tables, so it's worth figuring this out.

What do you think of Peter's comment?

Regards,
--
-David
da...@pgmasters.net<mailto:da...@pgmasters.net>




--
Best Wishes,
Ashutosh Bapat

Reply via email to