On 08/17/2015 10:14 AM, Bear Giles wrote:
I'm starting to work on a tar FDW as a proxy for a much more specific
FDW. (It's the 'faster to build two and toss the first away' approach
- tar lets me get the FDW stuff nailed down before attacking the more
complex container.) It could also be useful in its own right, or as
the basis for a zip file FDW.
I have figured out the first mode: an FDW mapping that takes the name
of the tarball as an option and produces a relation holding all of
the metadata for the contained files - filename, size, owner,
timestamp, etc. I can use the same approach I used for the /etc/passwd
FDW for that.
(BTW the current version is at
https://github.com/beargiles/passwd-fdw. It's skimpy on automated
tests until I can figure out how to handle the user mapping but it works.)
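For illustration only, the metadata mode can be sketched outside
PostgreSQL with Python's standard tarfile module. This is not the FDW
itself: TarRow and scan_metadata are hypothetical names, and the column
set simply mirrors the "filename, size, owner, timestamp" list above.
The point is that the metadata relation only needs the member headers,
never the file bodies.

```python
import tarfile
from typing import Iterator, NamedTuple


class TarRow(NamedTuple):
    """One row of the hypothetical metadata relation."""
    path: str
    size: int
    owner: str
    mtime: float


def scan_metadata(tarball_path: str) -> Iterator[TarRow]:
    """Yield one row per archive member, reading headers only."""
    with tarfile.open(tarball_path) as tar:
        for member in tar:
            # uname is the owner name stored in the tar header; it may be
            # empty for archives written without owner names.
            yield TarRow(member.name, member.size, member.uname, member.mtime)
```

A real FDW would produce the same tuples from its scan callbacks, with
the tarball name coming from the foreign table options.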
The problem is the second mode where I pull a single file out of the
FDW. I've identified three approaches so far:
1. A FDW mapping specific to each file. It would take the name of the
tarfile and the embedded file. Cleanest in some ways but it would be a
real pain if you're reading a tarball dynamically.
2. A user-defined function that takes the name of the tarball and file
and returns a blob. This is the traditional approach but why bother
with a FDW then? It also brings up access control issues since it
requires disclosure of the tarball name to the user. A FDW could hide
that.
3. A user-defined function that takes a tar FDW and the name of a file
and returns a blob. I think this is the best approach but I don't know
if I can specify a FDW as a parameter or how to access it.
I've skimmed the existing list of FDWs but didn't find anything that
can serve as a model. The foreign DB wrappers are closest but, again,
they aren't designed for dynamic use where you want to do something
with every file in an archive / table in a foreign DB.
Is there an obvious approach? Or is it simply a bad match for FDW, and
should it be two standard UDFs? (One returns the metadata, the second
returns the specific file.)
I would probably do something like this:
In this mode, define a table that has <path, blob>. To get the blob for
a single file, just do "select blob from fdwtable where path =
'/path/to/foo'". Make sure you process the qual in the FDW.
e.g.
create foreign table tarblobs (path text, blob bytea)
server tarfiles options (filename '/path/to/tarball', mode 'contents');
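To show why processing the qual matters, here is a rough Python sketch
of the scan such a table might perform. It is only an analogy for the
FDW's scan callbacks: scan_contents and its path_eq argument are
hypothetical names, with path_eq standing in for a pushed-down
"path = '...'" restriction. Without the qual, every member's body gets
read and returned; with it, non-matching members are skipped after
reading only their headers.

```python
import tarfile
from typing import Iterator, Optional, Tuple


def scan_contents(
    tarball_path: str, path_eq: Optional[str] = None
) -> Iterator[Tuple[str, bytes]]:
    """Yield (path, blob) rows, honouring an optional path equality qual."""
    with tarfile.open(tarball_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # directories, links, and devices carry no blob
            if path_eq is not None and member.name != path_eq:
                continue  # qual rejects this row; the body is never read
            body = tar.extractfile(member)
            yield member.name, body.read()
```

The scan keeps going after a match because a tar archive may hold
several members with the same name. If the FDW ignored the qual,
PostgreSQL would still filter the rows correctly, but only after the
wrapper had materialized every blob in the archive.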
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers