Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Greg Smith
Moving on to the directory archive part of this patch, the feature seems to work as advertised; here's a quick test case:
    createdb pgbench
    pgbench -i -s 1 pgbench
    pg_dump -F d -f test
    pg_restore -k test
    pg_restore -l test
    createdb copy
    pg_restore -d copy test
The copy made that way looked good.

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 12:12, Greg Smith wrote: Moving onto the directory archive part of this patch, the feature seems to work as advertised; here's a quick test case: createdb pgbench pgbench -i -s 1 pgbench pg_dump -F d -f test pg_restore -k test pg_restore -l test createdb copy pg_restore -d copy

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 17:23, Heikki Linnakangas wrote: On 16.12.2010 12:12, Greg Smith wrote: There's a number of small things that I'd like to see improved in new rev of this code ... In addition to those: ... One more thing: the motivation behind this patch is to allow parallel pg_dump in the

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Robert Haas
On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: One more thing: the motivation behind this patch is to allow parallel pg_dump in the future, so we should make sure this patch caters well for that. As soon as we have parallel pg_dump, the next

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 19:58, Robert Haas wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: One more thing: the motivation behind this patch is to allow parallel pg_dump in the future, so we should make sure this patch caters well for that. As

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Joachim Wieland
On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the same table using multiple processes. Perhaps we should prepare for that in the directory archive format, by

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 20:33, Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the same table using multiple processes. Perhaps we should prepare

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 16.12.2010 20:33, Joachim Wieland wrote: How exactly would you just split the table in chunks of roughly the same size ? Check pg_class.relpages, and divide that evenly across the processes. That should be good enough. Not
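A minimal sketch of the "divide relpages evenly across the processes" idea quoted above. The function name split_pages and the half-open block ranges are illustrative assumptions, not anything from the patch; note also that pg_class.relpages is only an estimate refreshed by VACUUM and ANALYZE, so a real implementation would have to cope with pages beyond the last range.

```python
from typing import List, Tuple


def split_pages(relpages: int, workers: int) -> List[Tuple[int, int]]:
    """Split block numbers [0, relpages) into one half-open range per worker.

    Hypothetical helper: relpages would come from pg_class.relpages,
    workers is the number of dump processes.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if relpages < 0:
        raise ValueError("relpages must not be negative")

    # Every worker gets `base` pages; the first `extra` workers get one more.
    base, extra = divmod(relpages, workers)
    ranges = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for worker, (first, end) in enumerate(split_pages(10, 3)):
        print(f"worker {worker}: blocks {first}..{end - 1}")
```

The chunk sizes differ by at most one page, which is about as "roughly the same size" as a page count alone can give; it says nothing about how evenly the rows themselves are distributed.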

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Robert Haas
On Thu, Dec 16, 2010 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: On 16.12.2010 20:33, Joachim Wieland wrote: How exactly would you just split the table in chunks of roughly the same size ? Check pg_class.relpages, and divide

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 22:13, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fixed this later? Because this is starting to sound like a bigger

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andrew Dunstan
On 12/16/2010 03:13 PM, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fixed this later? Because this is starting to sound like a bigger

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: On 12/16/2010 03:13 PM, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fixed this later?

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andres Freund
On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the same table using multiple processes.

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 17.12.2010 00:29, Andres Freund wrote: On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andres Freund
On Thursday 16 December 2010 23:34:02 Heikki Linnakangas wrote: On 17.12.2010 00:29, Andres Freund wrote: On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: As soon as we have

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andrew Dunstan
On 12/16/2010 03:52 PM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 12/16/2010 03:13 PM, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never

Re: [HACKERS] directory archive format for pg_dump

2010-12-07 Thread Joachim Wieland
On Thu, Dec 2, 2010 at 2:52 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Ok, committed, with some small cleanup since the last patch I posted. Could you update the directory-format patch on top of the committed version, please? Thanks for committing the first part. Here

Re: [HACKERS] directory archive format for pg_dump

2010-12-03 Thread Heikki Linnakangas
On 02.12.2010 23:12, Alvaro Herrera wrote: Excerpts from Heikki Linnakangas's message of Thu Dec 02 16:52:27 -0300 2010: Ok, committed, with some small cleanup since the last patch I posted. I think the comments on _ReadBuf and friends need to be updated, since they are not just for headers

Re: [HACKERS] directory archive format for pg_dump

2010-12-02 Thread Heikki Linnakangas
Ok, committed, with some small cleanup since the last patch I posted. Could you update the directory-format patch on top of the committed version, please? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] directory archive format for pg_dump

2010-12-02 Thread Alvaro Herrera
Excerpts from Heikki Linnakangas's message of Thu Dec 02 16:52:27 -0300 2010: Ok, committed, with some small cleanup since the last patch I posted. I think the comments on _ReadBuf and friends need to be updated, since they are not just for headers and TOC stuff anymore. I'm not sure if they

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 29.11.2010 22:21, Heikki Linnakangas wrote: On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * wrap long lines * use extern in function prototypes in header files * inline some functions like

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 16:03, Heikki Linnakangas wrote: On 29.11.2010 22:21, Heikki Linnakangas wrote: I combined those, and the Free/Flush steps, and did a bunch of other editorializations and cleanups. Here's an updated patch, also available in my git repository at

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Joachim Wieland
On Wed, Dec 1, 2010 at 9:05 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Forgot attachment. This is also available in the above git repo. I have quickly checked your modifications. On the one hand I like the reduction of functions, I would have said that we have AH around

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 02.12.2010 04:35, Joachim Wieland wrote: There is one thing however that I am not in favor of, which is the removal of the sizeHint parameter for the read functions. The reason for this parameter is not very clear now without LZF but I have tried to put in a few comments to explain the

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Heikki Linnakangas
On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * wrap long lines * use extern in function prototypes in header files * inline some functions like _StartDataCompressor, _EndDataCompressor,

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Robert Haas
On Mon, Nov 29, 2010 at 10:49 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * wrap long lines * use extern in function prototypes

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Heikki Linnakangas
On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: * wrap long lines * use extern in function prototypes in header files * inline some functions like _StartDataCompressor, _EndDataCompressor,

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas
On 20.11.2010 06:10, Joachim Wieland wrote: 2010/11/19 José Arthur Benetasso Villanova jose.art...@gmail.com: The md5.c and kwlookup.c reuse using a link doesn't look nice either. This way you need to compile twice, among other things, but I think that it's temporary, right? No, it isn't.

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: But I'm not actually sure we should be preventing mix and match of files from different dumps. It might be very useful to do just that sometimes, like restoring a recent backup, with the contents of one table replaced with older

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas
On 22.11.2010 19:07, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: But I'm not actually sure we should be preventing mix and match of files from different dumps. It might be very useful to do just that sometimes, like restoring a recent backup, with the contents of

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Dimitri Fontaine
Hi, Sharing some thoughts after a first round of reviewing, where I only had time to read the patch itself. Joachim Wieland j...@mcknight.de writes: Since the compression is currently all down in the custom format backup code, the first thing I've done was refactoring the compression

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread José Arthur Benetasso Villanova
Hi Dimitri and Joachim. I've looked at the patch too, and I want to share some thoughts too. I've used http://wiki.postgresql.org/wiki/Reviewing_a_Patch to guide my review. Submission review: I've applied and compiled the patch successfully using the current master. Usability review: The dir

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Alvaro Herrera
Excerpts from José Arthur Benetasso Villanova's message of Fri Nov 19 18:28:03 -0300 2010: The md5.c and kwlookup.c reuse using a link doesn't look nice either. This way you need to compile twice, among other things, but I think that it's temporary, right? Not sure what you mean here, but

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
Hi Dimitri, thanks for reviewing my patch! On Fri, Nov 19, 2010 at 2:44 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: I think I'd like to see a separate patch for the new compression support. Sorry about that, I realize that's extra work… I guess it wouldn't be a very big deal but I also

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes: I think I'd like to see a separate patch for the new compression support. Sorry about that, I realize that's extra work… That part of the patch is likely to get rejected outright anyway, so I *strongly* recommend splitting it out. We have

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
On Fri, Nov 19, 2010 at 11:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Dimitri Fontaine dimi...@2ndquadrant.fr writes: I think I'd like to see a separate patch for the new compression support. Sorry about that, I realize that's extra work… That part of the patch is likely to get rejected

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
Hi Jose, 2010/11/19 José Arthur Benetasso Villanova jose.art...@gmail.com: The dir format generated in my database 60 files, with different sizes, and it looks very confusing. Is it possible to use the same trick as pigz and pbzip2, creating a concatenated file of streams? What pigz is
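For readers unfamiliar with the "concatenated file of streams" trick mentioned above: the gzip format allows several independently compressed members to be appended to one file, and a decompressor reads them back as a single stream. The sketch below only illustrates that property with Python's standard gzip module; the function name concat_streams and the sample data are made up for the example, and nothing here reflects how the directory format patch stores its files.

```python
import gzip
from typing import Iterable


def concat_streams(chunks: Iterable[bytes]) -> bytes:
    """Compress each chunk as its own gzip member and append them together.

    Each member could be produced by a separate worker, since no member
    depends on the compression state of another.
    """
    return b"".join(gzip.compress(chunk) for chunk in chunks)


if __name__ == "__main__":
    # Illustrative data standing in for per-table dump output.
    chunks = [b"rows of table one\n", b"rows of table two\n"]
    archive = concat_streams(chunks)

    # A standard gzip reader sees one continuous stream.
    restored = gzip.decompress(archive)
    assert restored == b"".join(chunks)
    print(restored.decode(), end="")
```

The trade-off is that a single concatenated file no longer exposes each table as a separately addressable file, which is the property the directory format offers.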