Re: [HACKERS] COPY enhancements

2009-10-20 Thread Emmanuel Cecchet
Tom, Emmanuel Cecchet m...@asterdata.com writes: Tom Lane wrote: There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. Don't even think about it. Well then why are the tests provided with the patch working? Because they carefully

Re: [HACKERS] COPY enhancements

2009-10-20 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes: Tom Lane wrote: The key word in my sentence above is arbitrary. You don't know what a datatype input function might try to do, let alone triggers or other functions that COPY might have to invoke. They might do things that need to be cleaned up

Re: [HACKERS] COPY enhancements

2009-10-19 Thread Alvaro Herrera
Gokulakannan Somasundaram wrote: Actually this problem is present even in today's transaction id scenario and the only way we avoid it is by using freezing. Can we use a similar approach? This freezing should mean that we are freezing the sub-transaction in order to avoid the sub-transaction

Re: [HACKERS] COPY enhancements

2009-10-19 Thread Robert Haas
On Mon, Oct 19, 2009 at 11:21 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Gokulakannan Somasundaram wrote: Actually this problem is present even in today's transaction id scenario and the only way we avoid it is by using freezing. Can we use a similar approach? This freezing should

Re: [HACKERS] COPY enhancements

2009-10-18 Thread Gokulakannan Somasundaram
Actually I thought of a solution for the wrap-around sometime back. Let me try to put my initial thoughts into it. Maybe it would get refined over conversation. Transaction wrap-around failure Actually this problem is present even in today's transaction id scenario and the only way we avoid it is

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Emmanuel Cecchet
Tom Lane wrote: Ultimately, there's always going to be a tradeoff between speed and flexibility. It may be that we should just say if you want to import dirty data, it's gonna cost ya and not worry about the speed penalty of subtransaction-per-row. But that still leaves us with the 2^32 limit.

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Tom Lane
Emmanuel Cecchet m...@frogthinker.org writes: - speed with error logging best effort: no use of sub-transactions but errors that can safely be trapped with pg_try/catch (no index violation, There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. Don't

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Emmanuel Cecchet
Tom Lane wrote: Emmanuel Cecchet m...@frogthinker.org writes: - speed with error logging best effort: no use of sub-transactions but errors that can safely be trapped with pg_try/catch (no index violation, There aren't any. You can *not* put a try/catch around arbitrary code without

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Dimitri Fontaine
Emmanuel Cecchet m...@frogthinker.org writes: Tom was also suggesting 'refactoring COPY into a series of steps that the user can control'. What would these steps be? Would that be per row, allowing a bad tuple to be discarded? The idea is to have COPY usable from a general SELECT query so that the

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes: Tom Lane wrote: There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. Don't even think about it. Well then why are the tests provided with the patch working? Because they carefully exercise only a tiny

Re: [HACKERS] COPY enhancements

2009-10-12 Thread Simon Riggs
On Thu, 2009-10-08 at 11:01 -0400, Tom Lane wrote: So as far as I can see, the only form of COPY error handling that wouldn't be a cruel joke is to run a separate subtransaction for each row, and roll back the subtransaction on error. Of course the problems with that are (a) speed, (b) the
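The subtransaction-per-row scheme Tom describes can be illustrated with present-day PL/pgSQL, where each EXCEPTION clause runs its protected statements in an implicit subtransaction. This is only a sketch of the idea, not the proposed COPY internals; the staging table `raw_lines(line text)` and target `t(a int, b text)` are hypothetical names:

```sql
-- Per-row error tolerance via one subtransaction per row: each EXCEPTION
-- block rolls back only the failing INSERT, and the loop continues.
DO $$
DECLARE
    r record;
    bad integer := 0;
BEGIN
    FOR r IN SELECT line FROM raw_lines LOOP
        BEGIN
            INSERT INTO t
            SELECT split_part(r.line, E'\t', 1)::int,
                   split_part(r.line, E'\t', 2);
        EXCEPTION WHEN others THEN
            bad := bad + 1;   -- in a real loader, log r.line and SQLERRM here
        END;
    END LOOP;
    RAISE NOTICE '% bad rows skipped', bad;
END $$;
```

The cost profile is exactly the one debated in the thread: one subtransaction per input row, consuming an XID for every row that writes anything.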

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Simon Riggs
On Fri, 2009-10-09 at 00:15 +0100, Simon Riggs wrote: On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: You'd eat a sub-sub-transaction per row, and start a new sub-transaction every 2^32 rows. However, on second thought this really doesn't get us anywhere, it just moves the 2^32

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Hannu Krosing
On Thu, 2009-10-08 at 11:32 -0400, Robert Haas wrote: Another possible approach, which isn't perfect either, is the idea of allowing COPY to generate a single column of output of type text[]. That greatly reduces the number of possible error cases, maybe make it bytea[] to further reduce
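The single-column staging idea can be approximated today with a two-step ELT load: COPY the raw lines into a one-column table, where very little can fail, then parse and cast with ordinary SQL where errors can be screened or trapped. A sketch under assumed names (`raw_load`, `target`):

```sql
-- Step 1: get the data into the database with almost no error cases.
CREATE TABLE raw_load (line text);
COPY raw_load FROM '/tmp/input.txt';

-- Step 2: parse with SQL; bad integers are filtered out before the cast.
INSERT INTO target (a, b)
SELECT split_part(line, E'\t', 1)::int,
       split_part(line, E'\t', 2)
  FROM raw_load
 WHERE split_part(line, E'\t', 1) ~ '^[0-9]+$';
```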

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: So really we have to find some way to only expend one XID per failure, not one per row. I discovered a few days back that ~550 subtransactions is sufficient to blow max_stack_depth. 1 subtransaction

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: Another thing that has occurred to me is that RI checks are currently resolved at end of statement and could end up rejecting any/all rows loaded. If we break down the load into subtransaction pieces we would really want the RI checks on the rows to be

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Hannu Krosing ha...@2ndquadrant.com writes: On Thu, 2009-10-08 at 11:32 -0400, Robert Haas wrote: Another possible approach, which isn't perfect either, is the idea of allowing COPY to generate a single column of output of type text[]. That greatly reduces the number of possible error cases,

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Greg Smith
On Fri, 9 Oct 2009, Tom Lane wrote: what do we do with rows that fail encoding conversion? For logging to a file we could/should just decree that we write out the original, allegedly-in-the-client-encoding data. I'm not sure what we do about logging to a table though. The idea of storing

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: What's really bad about this is that a flag called error_logging is actually changing the behavior of the command in a way that is far more dramatic than (and doesn't actually have much to do with) error logging. It's actually making a COPY command

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 4:42 AM, Dimitri Fontaine dfonta...@hi-media.com wrote: Robert Haas robertmh...@gmail.com writes: What's really bad about this is that a flag called error_logging is actually changing the behavior of the command in a way that is far more dramatic than (and doesn't

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Wed, 2009-10-07 at 22:30 -0400, Robert Haas wrote: On Fri, Sep 25, 2009 at 10:01 AM, Emmanuel Cecchet m...@asterdata.com wrote: Robert, Here is the new version of the patch that applies to CVS HEAD as of this morning. Emmanuel I took a look at this patch tonight and, having now

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes: I'm a little mystified by this response since I spent several paragraphs following the one that you have quoted here explaining how I think we should approach the problem of providing the features that are currently all encapsulated under the mantle of

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 8:34 AM, Dimitri Fontaine dfonta...@hi-media.com wrote: Robert Haas robertmh...@gmail.com writes: I'm a little mystified by this response since I spent several paragraphs following the one that you have quoted here explaining how I think we should approach the problem of

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: It seems quite odd to me that when COPY succeeds but there are errors, the transaction commits. The only indication that some of my data didn't end up in the table is that the output says COPY n where n is less than the total number of rows I

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Lest there be any unclarity, I am NOT trying to shoot down this feature with my laser-powered bazooka. Well, if you need somebody to do that --- I took a quick look through this patch, and it is NOT going to get committed. Not in anything approximately

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Alvaro Herrera
Robert Haas wrote: Some defective part of my brain enjoys seeing things run smoothly more than it enjoys being lazy. Strangely, that seems to say you'd make a bad Perl programmer, per Larry Wall's three virtues. -- Alvaro Herrera

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Lest there be any unclarity, I am NOT trying to shoot down this feature with my laser-powered bazooka. Well, if you need somebody to do that Well, I'm trying not to demoralize people

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:29 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: Robert Haas wrote: Some defective part of my brain enjoys seeing things run smoothly more than it enjoys being lazy. Strangely, that seems to say you'd make a bad Perl programmer, per Larry Wall's three

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Subcommitting every single row is going to be really painful, especially after Hot Standby goes in and we have to issue a WAL record after every 64 subtransactions (AIUI). Yikes ... I had not been following that discussion, but that sure sounds like a

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Another possible approach, which isn't perfect either, is the idea of allowing COPY to generate a single column of output of type text[]. That greatly reduces the number of possible error cases, and at least gets the data into

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Joshua D. Drake
On Thu, 2009-10-08 at 11:59 -0400, Robert Haas wrote: On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Another possible approach, which isn't perfect either, is the idea of allowing COPY to generate a single column of output of type text[]. That greatly reduces the number

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Rod Taylor
Yeah. I think it's going to be hard to make this work without having standalone transactions. One idea would be to start a subtransaction, insert tuples until one fails, then rollback the subtransaction and start a new one, and continue on until the error limit is reached. I've found

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could break down COPY into sub-sub transactions to work around that... How would that work? Don't you still need to increment the command counter? Actually,

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes: Couldn't you just commit each range of subtransactions based on some threshold? COPY foo from '/tmp/bar/' COMMIT_THRESHOLD 100; It counts to 1mil, commits, starts a new transaction. Yes, there would be 1 million subtransactions but once it

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 12:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Joshua D. Drake j...@commandprompt.com writes: Couldn't you just commit each range of subtransactions based on some threshold? COPY foo from '/tmp/bar/' COMMIT_THRESHOLD 100; It counts to 1mil, commits, starts a new

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Hmm, if we were willing to break COPY into multiple *top level* transactions, that would avoid my concern about XID wraparound. The issue here is that if the COPY does eventually fail (and there will always be failure conditions, eg out of disk space), then

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could break down COPY into sub-sub transactions to work around that... How would that work?

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Greg Smith
On Thu, 8 Oct 2009, Tom Lane wrote: It may be that we should just say if you want to import dirty data, it's gonna cost ya and not worry about the speed penalty of subtransaction-per-row. This goes along with the response I gave on objections to adding other bits of overhead into COPY. If

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Greg Smith
On Thu, 8 Oct 2009, Rod Taylor wrote: 1) Having copy remember which specific line caused the error. So it can replace lines 1 through 487 in a subtransaction since it knows those are successful. Run 488 in its own subtransaction. Run 489 through ... in a new subtransaction. This is the

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another approach that was discussed earlier was to divvy the rows into batches. Say every thousand rows you sub-commit and start a new subtransaction. Up to that point you save aside
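The batch-and-replay scheme quoted here can be sketched in PL/pgSQL, again leaning on the fact that an EXCEPTION block is a subtransaction: apply a thousand rows in one subtransaction, and only when a batch fails roll it back and replay it row by row to isolate the rejects. Table names (`staging(id, a, b)`, `target(a, b)`) are hypothetical:

```sql
-- Batches cost one subtransaction each on the happy path; a failing batch
-- is rolled back and replayed row by row, so only bad batches pay per-row cost.
DO $$
DECLARE
    lo bigint := 1;
    hi bigint;
    maxid bigint;
    r record;
BEGIN
    SELECT max(id) INTO maxid FROM staging;
    WHILE lo <= maxid LOOP
        hi := lo + 999;
        BEGIN
            -- whole batch in one subtransaction
            INSERT INTO target
            SELECT a, b FROM staging WHERE id BETWEEN lo AND hi;
        EXCEPTION WHEN others THEN
            -- batch rolled back: replay it one row at a time
            FOR r IN SELECT a, b FROM staging
                      WHERE id BETWEEN lo AND hi LOOP
                BEGIN
                    INSERT INTO target VALUES (r.a, r.b);
                EXCEPTION WHEN others THEN
                    NULL;   -- reject only this row; log it in a real loader
                END;
            END LOOP;
        END;
        lo := hi + 1;
    END LOOP;
END $$;
```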

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 1:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another approach that was discussed earlier was to divvy the rows into batches. Say every thousand rows you

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Dimitri Fontaine wrote: Simon Riggs si...@2ndquadrant.com writes: It will be best to have the ability to have a specific rejection reason for each row rejected. That way we will be able to tell the difference between uniqueness violation errors, invalid date format on col7, value fails

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Andrew Dunstan
Bruce Momjian wrote: What would be _cool_ would be to add the ability to have comments in the COPY files, like \#, and then the copy data lines and errors could be adjacent. (Because of the way we control COPY escaping, adding \# would not be a problem. We have \N for null, for example.)

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Robert Haas wrote: Each of those features deserves a separate discussion to decide whether we want it and how best to implement it. Personally, I think we should skip (C), at least as a starting point. Instead of logging to a table, I think we should consider making COPY return the tuples

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Robert Haas wrote: That was a compliment on your project management skills. Keeping the CF work moving forward steadily is both unglamorous and extremely valuable, and I don't think anyone else even understands why you've volunteered to handle so much of it. But I know I appreciate it.

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Thu, 2009-10-08 at 18:23 -0400, Bruce Momjian wrote: Dimitri Fontaine wrote: Simon Riggs si...@2ndquadrant.com writes: It will be best to have the ability to have a specific rejection reason for each row rejected. That way we will be able to tell the difference between uniqueness

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wonder whether we could break down COPY into sub-sub transactions to work around that... How would that work? Don't you

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Mon, 5 Oct 2009, Josh Berkus wrote: I think that this was the original idea but we should probably rollback the error logging if the command has been rolled back. It might be more consistent to use the same hi_options as the copy command. Any idea what would be best? Well, if we're logging

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Simon Riggs
On Wed, 2009-10-07 at 03:17 -0400, Greg Smith wrote: On Mon, 5 Oct 2009, Josh Berkus wrote: Also, presumbly, if you abort a COPY because of errors, you probably want to keep the errors around for later analysis. No? Absolutely, that's the whole point of logging to a file in the first

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith gsm...@gregsmith.com wrote: I know this patch is attracting more reviewers lately, is anyone tracking the general architecture of the code yet? Emmanuel's work is tough to review just because there's so many things mixed together, and there's other

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logging in a table for bad tuples in a COPY operation (see http://wiki.postgresql.org/wiki/Error_logging_in_COPY for an example; the error message, command and so on are

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Andrew Dunstan
Emmanuel Cecchet wrote: If you prefer to postpone the auto-partitioning to the next commit fest, I can strip it from the current patch and re-submit it for the next fest (but it's just 2 isolated methods really easy to review). I certainly think this should be separated out. In general

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes: It will be best to have the ability to have a specific rejection reason for each row rejected. That way we will be able to tell the difference between uniqueness violation errors, invalid date format on col7, value fails check constraint on col22 etc..

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Simon Riggs
On Wed, 2009-10-07 at 15:33 +0200, Dimitri Fontaine wrote: Simon Riggs si...@2ndquadrant.com writes: It will be best to have the ability to have a specific rejection reason for each row rejected. That way we will be able to tell the difference between uniqueness violation errors, invalid

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet m...@asterdata.com wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logging in a table for bad tuples in a COPY operation (see

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Emmanuel Cecchet wrote: If you prefer to postpone the auto-partitioning to the next commit fest, I can strip it from the current patch and re-submit it for the next fest (but it's just 2 isolated methods really easy to review). I certainly think

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Emmanuel Cecchet wrote: If you prefer to postpone the auto-partitioning to the next commit fest, I can strip it from the current patch and re-submit it for the next fest (but it's just 2 isolated methods really easy to

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Robert Haas wrote: On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet m...@asterdata.com wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logging in a table for bad tuples in a COPY operation (see

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 11:39 AM, Emmanuel Cecchet m...@asterdata.com wrote: Robert Haas wrote: On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet m...@asterdata.com wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: -

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
The roadmap I would propose for the current list of enhancements to COPY is as follows: 1. new syntax for COPY options (already committed) 2. error logging in a table 3. auto-partitioning (just relies on basic error logging, so can be scheduled anytime after 2) 4. error logging in a file manu

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Robert Haas wrote: On Wed, Oct 7, 2009 at 11:39 AM, Emmanuel Cecchet m...@asterdata.com wrote: Robert Haas wrote: On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet m...@asterdata.com wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Greg Smith wrote: Absolutely, that's the whole point of logging to a file in the first place. What needs to happen here is that when one is aborted, you need to make sure that fact is logged, and with enough information (the pid?) to tie it to the COPY that failed. Then someone can crawl

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 11:45 AM, Emmanuel Cecchet m...@asterdata.com wrote: You are suggesting then that it is the COPY command that aborts the transaction. That would only happen if you had set a limit on the number of errors that you want to accept in a COPY command (in which case you know

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Wed, 7 Oct 2009, Emmanuel Cecchet wrote: I think there is a misunderstanding about what the current patch is about...the patch does NOT include logging errors into a file (a feature we can add later on (next commit fest?)) I understand that (as one of the few people who has read the patch

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Wed, 7 Oct 2009, Robert Haas wrote: On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith gsm...@gregsmith.com wrote: I doubt taskmaster Robert is going to let this one linger around with scope creep for too long before being pushed out to the next CommitFest. I can't decide whether to feel good

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 7:52 PM, Greg Smith gsm...@gregsmith.com wrote: On Wed, 7 Oct 2009, Robert Haas wrote: On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith gsm...@gregsmith.com wrote: I doubt taskmaster Robert is going to let this one linger around with scope creep for too long before being

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Fri, Sep 25, 2009 at 10:01 AM, Emmanuel Cecchet m...@asterdata.com wrote: Robert, Here is the new version of the patch that applies to CVS HEAD as of this morning. Emmanuel I took a look at this patch tonight and, having now read through some of it, I have some more detailed comments.

Re: [HACKERS] COPY enhancements

2009-10-06 Thread Emmanuel Cecchet
I just realized that I forgot to CC the list when I answered to Josh... resending! Josh, I think that this was the original idea but we should probably rollback the error logging if the command has been rolled back. It might be more consistent to use the same hi_options as the copy

Re: [HACKERS] COPY enhancements

2009-10-05 Thread Emmanuel Cecchet
Hi Selena, This is my first pass at the error logging portion of this patch. I'm going to take a break and try to go through the partitioning logic as well later this afternoon. caveat: I'm not familiar with most of the code paths that are being touched by this patch. Overall: * I noticed

Re: [HACKERS] COPY enhancements

2009-10-05 Thread Josh Berkus
Emmanuel, I think that this was the original idea but we should probably rollback the error logging if the command has been rolled back. It might be more consistent to use the same hi_options as the copy command. Any idea what would be best? Well, if we're logging to a file, you wouldn't be

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Jeff Davis
On Fri, 2009-09-25 at 10:01 -0400, Emmanuel Cecchet wrote: Robert, Here is the new version of the patch that applies to CVS HEAD as of this morning. I just started looking at this now. It seems to fail make check, diffs attached. I haven't looked into the cause of the failure yet. Regards,

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Selena Deckelmann
Hi! On Fri, Sep 25, 2009 at 7:01 AM, Emmanuel Cecchet m...@asterdata.com wrote: Here is the new version of the patch that applies to CVS HEAD as of this morning. Cool features! This is my first pass at the error logging portion of this patch. I'm going to take a break and try to go through

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Emmanuel Cecchet
The problem comes from the foo_malformed_terminator.data file. It is supposed to have a malformed terminator that was not caught by the patch. The second line should look like: 2 two^M If it does not, you can edit it with emacs, go to the end of the second line and press Ctrl+q followed by

Re: [HACKERS] COPY enhancements

2009-09-25 Thread Emmanuel Cecchet
Robert, Here is the new version of the patch that applies to CVS HEAD as of this morning. Emmanuel On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet m...@asterdata.com wrote: Here is a new version of error logging and autopartitioning in COPY based on the latest COPY patch that

Re: [HACKERS] COPY enhancements

2009-09-24 Thread Robert Haas
On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet m...@asterdata.com wrote: Here is a new version of error logging and autopartitioning in COPY based on the latest COPY patch that provides the new syntax for copy options (this patch also includes the COPY option patch). New features compared

Re: [HACKERS] COPY enhancements

2009-09-24 Thread Emmanuel Cecchet
Yes, I have to update the patch following what Tom already integrated of the COPY patch. I will get a new version posted as soon as I can. Emmanuel Robert Haas wrote: On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet m...@asterdata.com wrote: Here is a new version of error logging and

Re: [HACKERS] COPY enhancements

2009-09-19 Thread Bruce Momjian
Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: It's not as if we don't have the ability to measure performance impact. It's reasonable to make a requirement that new options to COPY shouldn't slow it down noticeably if those options aren't used. And we can test that, and even

Re: [HACKERS] COPY enhancements

2009-09-14 Thread Emmanuel Cecchet
Greg Smith wrote: On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if this is fully automated.

Re: [HACKERS] COPY enhancements

2009-09-14 Thread Andrew Dunstan
Emmanuel Cecchet wrote: Greg Smith wrote: On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Josh Berkus
Tom, [ shrug... ] Everybody in the world is going to want their own little problem to be handled in the fast path. And soon it won't be so fast anymore. I think it is perfectly reasonable to insist that the fast path is only for clean data import. Why? No, really. It's not as if we

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes: It's not as if we don't have the ability to measure performance impact. It's reasonable to make a requirement that new options to COPY shouldn't slow it down noticeably if those options aren't used. And we can test that, and even make such testing part

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Andrew Dunstan
Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: It's not as if we don't have the ability to measure performance impact. It's reasonable to make a requirement that new options to COPY shouldn't slow it down noticeably if those options aren't used. And we can test that, and even make

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Heikki Linnakangas
Josh Berkus wrote: The performance of every path to get data into the database besides COPY is too miserable for us to use anything else, and the current inflexibility makes it useless for anything but the cleanest input data. One potential issue we're facing down this road is that current

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Heikki Linnakangas
Josh Berkus wrote: The user-defined table for rejects is obviously exclusive of the system one, either of those would be fine from my perspective. I've been thinking about it, and can't come up with a really strong case for wanting a user-defined table if we settle the issue of having a

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Fri, 11 Sep 2009, Josh Berkus wrote: I've been thinking about it, and can't come up with a really strong case for wanting a user-defined table if we settle the issue of having a strong key for pg_copy_errors. Do you have one? No, but I'd think that if the user table was only allowed to be

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if this is fully automated. Allowing the extra

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Andrew Dunstan
Greg Smith wrote: After some thought, I think that Andrew's feature *is* generally applicable, if done as IGNORE COLUMN COUNT (or, more likely, column_count=ignore). I can think of a lot of data sets where column count is jagged and you want to do ELT instead of ETL. Exactly, the ELT

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: Right. What I proposed would not have been terribly invasive or difficult, certainly less so than what seems to be our direction by an order of magnitude at least. I don't for a moment accept the assertion that we can get a general solution for the

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Right. What I proposed would not have been terribly invasive or difficult, certainly less so than what seems to be our direction by an order of magnitude at least. I don't for a moment accept the assertion that we can get a

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes: At the same time, I think it's probably not a good thing that users who deal with very large amounts of data would be forced off the COPY fast path by a need for something like input support for non-rectangular data. [ shrug... ] Everybody in the

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Sat, 12 Sep 2009, Tom Lane wrote: Everybody in the world is going to want their own little problem to be handled in the fast path. And soon it won't be so fast anymore. I think it is perfectly reasonable to insist that the fast path is only for clean data import. The extra overhead is

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet
Hi Robert, I like this idea, perhaps not surprisingly (for those not following at home: that was my patch). Unfortunately, it looks to me like there is no way to do this without overhauling the syntax. If the existing syntax required a comma between options (i.e. copy blah to stdout binary,

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane
Emmanuel Cecchet m...@asterdata.com writes: The new syntax could look like: COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } [ [, BINARY ] [, OIDS ] [, DELIMITER [ AS ] 'delimiter' ] [, NULL [ AS ] 'null string' ] [, CSV [ HEADER

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet
Tom, I looked at EXPLAIN (http://www.postgresql.org/docs/current/interactive/sql-explain.html) and there is not a single line of what you are talking about. And the current syntax is just EXPLAIN [ ANALYZE ] [ VERBOSE ] statement. If I try to decrypt what you said, you are looking at

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
On Fri, Sep 11, 2009 at 10:53 AM, Emmanuel Cecchet m...@asterdata.com wrote: Tom, I looked at EXPLAIN (http://www.postgresql.org/docs/current/interactive/sql-explain.html) and there is not a single line of what you are talking about. And the current syntax is just EXPLAIN [ ANALYZE ] [

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
On Fri, Sep 11, 2009 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote: Emmanuel Cecchet m...@asterdata.com writes: The new syntax could look like: COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } [ [, BINARY ] [, OIDS ] [, DELIMITER [ AS ]

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Or look at your CVS/git checkout. The important point is to look at the grammar, which doesn't have any idea what the specific options are in the list. (Well, okay, it had to have special cases for ANALYZE and VERBOSE because those are reserved words

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: I don't see any reasonable way to sandwich the FORCE NOT NULL syntax into a keyword/value notation. Any number of ways, for example force_not_null = true or multiple occurrences of force_not_null = column_name. Andrew was on the verge of admitting we

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Pierre Frédéric Caillaud
I was thinking something like: COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } [WITH] [option [, ...]] Where: option := ColId [Sconst] | FORCE NOT NULL (column [,...]) I don't see any reasonable way to sandwich the FORCE NOT NULL syntax into a keyword/value notation.
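For reference, the generic option-list syntax that was eventually committed (and ships with PostgreSQL 9.0's COPY) resolves exactly this point by letting an option's value be a parenthesized column list, so FORCE NOT NULL fits the keyword/value mold:

```sql
-- tablename and the column names are placeholders
COPY tablename (col1, col2, col3)
FROM '/path/to/file.csv'
WITH (FORMAT csv, HEADER true, NULL '', FORCE_NOT_NULL (col2, col3));
```

The point is that FORCE_NOT_NULL takes its columns as an option value rather than as trailing special-case syntax, so the grammar needs no knowledge of individual options.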

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
On Fri, Sep 11, 2009 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I don't see any reasonable way to sandwich the FORCE NOT NULL syntax into a keyword/value notation. Any number of ways, for example force_not_null = true or multiple occurrences of

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
2009/9/11 Pierre Frédéric Caillaud li...@peufeu.com: I was thinking something like: COPY tablename [ ( column [, ...] ) ] FROM { 'filename' | STDIN } [WITH] [option [, ...]] Where: option := ColId [Sconst] | FORCE NOT NULL (column [,...]) I don't see any reasonable way to sandwich the
