Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Peter Eisentraut writes:
> On 3/3/18 00:48, Tom Lane wrote:
>> I don't think that can possibly work.  It would only be safe if, between
>> the thrower and the catcher, there were no other levels of control
>> operating according to the normal error-handling rules.  But input
>> functions certainly cannot assume that they are only called by COPY,
>> so how could they safely throw a "soft error"?

> That assumes that throwing a soft error in a context that does not
> handle it specially is not safe.  I'd imagine in such situations the
> soft error just behaves like a normal exception.

My point is that neither the thrower of the error, nor the catcher of it,
can possibly guarantee that there were no levels of control in between
that were expecting their resource leakages to be cleaned up by normal
error recovery.  As a counterexample, consider the chain of control

    COPY -> domain_in -> SQL function in CHECK -> int4in

If int4in throws a "soft" error, and COPY catches it and skips error
cleanup, we've likely got problems, depending on what the SQL function
did.  This could even break domain_in, although probably any such effort
would've taken care to harden domain_in against the issue.  But the
possibility of SQL or PL functions having done nearly-arbitrary things
in between means that most of the backend is at hazard.

You could imagine making something like this work, but it's gonna take
very invasive and bug-inducing changes to our error cleanup rules,
affecting a ton of places that have no connection to the goal.

			regards, tom lane
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
2018-03-03 15:02 GMT+01:00 Peter Eisentraut <peter.eisentr...@2ndquadrant.com>:

> On 3/3/18 00:48, Tom Lane wrote:
> > I don't think that can possibly work.  It would only be safe if, between
> > the thrower and the catcher, there were no other levels of control
> > operating according to the normal error-handling rules.  But input
> > functions certainly cannot assume that they are only called by COPY,
> > so how could they safely throw a "soft error"?
>
> That assumes that throwing a soft error in a context that does not
> handle it specially is not safe.  I'd imagine in such situations the
> soft error just behaves like a normal exception.
>

What does this soft error solve? If somebody uses fault-tolerant COPY,
they will not be satisfied when only format errors are ignored.
Constraint violations, and maybe trigger errors, should be ignored too.
It is true, though, that some other databases raise a soft error on
format issues instead of raising an exception.

Wouldn't it be better to use some alternative algorithm: under a
subtransaction, try to insert a block of 1000 lines; when one fails,
roll back and retry the import in blocks of 100 lines, to narrow down
the wrong line? Or, another way: for this specific case, reduce the cost
of subtransactions? This is a special case, a subtransaction under one
command.

Regards

Pavel

> --
> Peter Eisentraut              http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
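[Editor's note: the block-retry scheme Pavel describes can be sketched as follows, with plain Python standing in for the subtransaction machinery. The function names, block sizes, and the `is_bad` predicate are illustrative assumptions, not PostgreSQL code; the point is only the shrinking-block strategy for isolating a bad line.]

```python
def try_insert_block(block, is_bad):
    """Stand-in for inserting a block under one subtransaction:
    it either succeeds entirely, or 'rolls back' by raising as soon
    as the block contains a bad row."""
    if any(is_bad(row) for row in block):
        raise ValueError("block contains a bad row; rolled back")


def copy_with_block_retry(rows, is_bad, block_size=1000, retry_size=100):
    """Try big blocks first (one subtransaction each); on failure,
    redo the block in smaller chunks, and finally row-by-row, so that
    only the genuinely bad rows are rejected."""
    inserted, rejected = [], []
    for i in range(0, len(rows), block_size):
        block = rows[i:i + block_size]
        try:
            try_insert_block(block, is_bad)      # cheap path: one subxact
            inserted.extend(block)
        except ValueError:
            # Block failed: retry it in smaller chunks.
            for j in range(0, len(block), retry_size):
                chunk = block[j:j + retry_size]
                try:
                    try_insert_block(chunk, is_bad)
                    inserted.extend(chunk)
                except ValueError:
                    # Final pass: one 'subtransaction' per row.
                    for row in chunk:
                        (rejected if is_bad(row) else inserted).append(row)
    return inserted, rejected
```

The appeal of this scheme is that the per-row cost stays low when errors are rare: most blocks commit in a single subtransaction, and the expensive row-by-row path runs only over the small chunks that actually contain a bad line.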
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On 3/3/18 00:48, Tom Lane wrote:
> I don't think that can possibly work.  It would only be safe if, between
> the thrower and the catcher, there were no other levels of control
> operating according to the normal error-handling rules.  But input
> functions certainly cannot assume that they are only called by COPY,
> so how could they safely throw a "soft error"?

That assumes that throwing a soft error in a context that does not
handle it specially is not safe.  I'd imagine in such situations the
soft error just behaves like a normal exception.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Craig Ringer writes:
> On 3 March 2018 at 13:08, Peter Eisentraut wrote:
>> I think one thing to try would be to define a special kind of exception
>> that can safely be caught and ignored.  Then, input functions can
>> communicate benign parse errors by doing their own cleanup first, then
>> throwing this exception, and then the COPY subsystem can deal with it.

> That makes sense. We'd only use the error code range in question when it
> was safe to catch without re-throw, and we'd have to enforce rules around
> using a specific memory context.

I don't think that can possibly work.  It would only be safe if, between
the thrower and the catcher, there were no other levels of control
operating according to the normal error-handling rules.  But input
functions certainly cannot assume that they are only called by COPY,
so how could they safely throw a "soft error"?

			regards, tom lane
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On 3 March 2018 at 13:08, Peter Eisentraut wrote:

> On 1/22/18 21:33, Craig Ringer wrote:
> > We don't have much in the way of rules about what input functions can or
> > cannot do, so you can't assume much about their behaviour and what must
> > / must not be cleaned up. Nor can you just reset the state in a
> > heavy-handed manner like (say) plpgsql does.
>
> I think one thing to try would be to define a special kind of exception
> that can safely be caught and ignored.  Then, input functions can
> communicate benign parse errors by doing their own cleanup first, then
> throwing this exception, and then the COPY subsystem can deal with it.
>

That makes sense. We'd only use the error code range in question when it
was safe to catch without re-throw, and we'd have to enforce rules around
using a specific memory context. Of course no LWLocks could be held, but
IIRC that's true when throwing anyway, unless you plan to proc_exit() in
your handler.

People will immediately ask for it to handle RI errors too, so something
similar would be needed there. But frankly, Pg's RI handling for bulk
loading desperately needs a major change in how it works to make it
efficient anyway; the current model of individual row triggers is
horrible for bulk load performance.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On 1/22/18 21:33, Craig Ringer wrote:
> We don't have much in the way of rules about what input functions can or
> cannot do, so you can't assume much about their behaviour and what must
> / must not be cleaned up. Nor can you just reset the state in a
> heavy-handed manner like (say) plpgsql does.

I think one thing to try would be to define a special kind of exception
that can safely be caught and ignored.  Then, input functions can
communicate benign parse errors by doing their own cleanup first, then
throwing this exception, and then the COPY subsystem can deal with it.

Anyway, this patch is clearly not doing this, so I'm setting it to
returned with feedback.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
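[Editor's note: the contract Peter proposes — the thrower cleans up before raising, and the catcher may then swallow only that dedicated error kind — can be illustrated in Python terms. The `SoftError` class, the toy input function, and the COPY loop below are hypothetical stand-ins, not PostgreSQL API.]

```python
class SoftError(Exception):
    """Benign parse error: the thrower guarantees it has already
    cleaned up its own state, so the catcher may safely swallow it."""


def int4_in(token):
    """Toy input function: on a benign parse failure, do any local
    cleanup first, then raise the dedicated 'soft' exception class."""
    try:
        return int(token)
    except ValueError:
        # (per-call scratch state would be released here, before raising)
        raise SoftError(f"invalid input syntax for integer: {token!r}")


def copy_from(lines):
    """COPY loop: skip rows that raise SoftError; anything else is a
    normal exception and propagates for full error recovery."""
    loaded, skipped = [], 0
    for line in lines:
        try:
            loaded.append(int4_in(line))
        except SoftError:
            skipped += 1  # benign by contract: safe to ignore and continue
    return loaded, skipped
```

Tom Lane's objection upthread is precisely that, in the real backend, intermediate frames between the thrower and the catcher (SQL functions, domain checks, and so on) may have acquired resources they expect normal error recovery to release, so the "already cleaned up" guarantee cannot be made by the input function alone.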
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On 23 January 2018 at 15:21, Stephen Frost wrote:

> Alexey,
>
> * Alexey Kondratov (kondratov.alek...@gmail.com) wrote:
> > On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera wrote:
> > > On a *very* quick look, please use an enum to return from NextCopyFrom
> > > rather than 'int'.  The chunks that change bool to int are very
> > > odd-looking.  This would move the comment that explains the value from
> > > copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII
> > > dashes in the descriptions of those values; please don't.
> >
> > I will fix it, thank you.
> >
> > > Or maybe I misunderstood the patch completely.
> >
> > I hope so. Here are my thoughts on how it all works; please correct me
> > where I am wrong:
> >
> > 1) First, I have simply changed the ereport level to WARNING for
> > specific validations (extra or missing columns, etc.) if the
> > IGNORE_ERRORS option is used. All these checks are inside NextCopyFrom.
> > Thus, this patch performs here pretty much the same as before, except
> > that it is possible to skip bad lines, and this part should be safe as
> > well.
> >
> > 2) About PG_TRY/CATCH. I use it to catch only one specific function
> > call inside NextCopyFrom--it is InputFunctionCall--which is used just
> > to parse a datatype from the input string. I have no idea how WAL write
> > or trigger errors could get here.
> >
> > All of this is done before actually forming a tuple, putting it into
> > the heap, firing insert-related triggers, etc. I am not trying to catch
> > all errors during the row processing, only input data errors. So why is
> > it unsafe?
>
> The issue is that InputFunctionCall is actually calling out to other
> functions and they could be allocating memory and doing other things.
> What you're missing by using PG_TRY() here without doing a PG_RE_THROW()
> is all the cleanup work that AbortTransaction() and friends do --
> ensuring there's a valid memory context (cleaning up the ones that might
> have been allocated by the input function), releasing locks, resetting
> user and security contexts, and a whole slew of other things.
>
> Is all of that always needed for an error thrown by an input function?
> Hard to say, really, since input functions can be provided by
> extensions.  You might get away with doing this for int4in(), but we
> also have to be thinking about extensions like PostGIS which introduce
> their own geometry and geography data types that are a great deal more
> complicated.

For example, an input function could conceivably take an LWLock, do some
work in the heapam, or whatever. We don't have much in the way of rules
about what input functions can or cannot do, so you can't assume much
about their behaviour and what must / must not be cleaned up. Nor can
you just reset the state in a heavy-handed manner like (say) plpgsql
does.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Alexey,

* Alexey Kondratov (kondratov.alek...@gmail.com) wrote:
> On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera wrote:
> > On a *very* quick look, please use an enum to return from NextCopyFrom
> > rather than 'int'.  The chunks that change bool to int are very
> > odd-looking.  This would move the comment that explains the value from
> > copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII
> > dashes in the descriptions of those values; please don't.
>
> I will fix it, thank you.
>
> > Or maybe I misunderstood the patch completely.
>
> I hope so. Here are my thoughts on how it all works; please correct me
> where I am wrong:
>
> 1) First, I have simply changed the ereport level to WARNING for
> specific validations (extra or missing columns, etc.) if the
> IGNORE_ERRORS option is used. All these checks are inside NextCopyFrom.
> Thus, this patch performs here pretty much the same as before, except
> that it is possible to skip bad lines, and this part should be safe as
> well.
>
> 2) About PG_TRY/CATCH. I use it to catch only one specific function
> call inside NextCopyFrom--it is InputFunctionCall--which is used just
> to parse a datatype from the input string. I have no idea how WAL write
> or trigger errors could get here.
>
> All of this is done before actually forming a tuple, putting it into
> the heap, firing insert-related triggers, etc. I am not trying to catch
> all errors during the row processing, only input data errors. So why is
> it unsafe?

The issue is that InputFunctionCall is actually calling out to other
functions and they could be allocating memory and doing other things.
What you're missing by using PG_TRY() here without doing a PG_RE_THROW()
is all the cleanup work that AbortTransaction() and friends do --
ensuring there's a valid memory context (cleaning up the ones that might
have been allocated by the input function), releasing locks, resetting
user and security contexts, and a whole slew of other things.

Is all of that always needed for an error thrown by an input function?
Hard to say, really, since input functions can be provided by
extensions.  You might get away with doing this for int4in(), but we
also have to be thinking about extensions like PostGIS which introduce
their own geometry and geography data types that are a great deal more
complicated.

In the end, this isn't an acceptable approach to this problem.  The
approach outlined previously using sub-transactions -- which attempted
to insert some number of records and then, on failure, restarted and
inserted up until the failed record, skipped it, and started over again
with the next batch -- seemed pretty reasonable to me.  If you have time
to work on that, that'd be great; otherwise we'll probably need to mark
this as returned with feedback.

Thanks!

Stephen
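[Editor's note: the sub-transaction approach Stephen outlines — insert a batch, and on failure re-insert the known-good prefix, skip the failed record, and continue — can be sketched as follows. Plain Python stands in for subtransactions here; the function names, the `is_bad` predicate, and the batch size are illustrative assumptions, not PostgreSQL code.]

```python
def insert_batch(rows, is_bad):
    """Stand-in for one subtransaction: inserts rows until a bad one
    is hit, then 'rolls back' (nothing is kept) and reports the index
    of the failed record. Returns None if the whole batch commits."""
    for k, row in enumerate(rows):
        if is_bad(row):
            return k  # failure at index k; the whole batch is rolled back
    return None


def copy_with_subxact_batches(rows, is_bad, batch_size=1000):
    """On a batch failure, re-insert the rows before the failure (they
    are known good), skip the failed record, and retry the remainder."""
    inserted, skipped = [], []
    pending = list(rows)
    while pending:
        batch = pending[:batch_size]
        pending = pending[len(batch):]
        fail = insert_batch(batch, is_bad)
        while fail is not None:
            inserted.extend(batch[:fail])   # prefix is known good
            skipped.append(batch[fail])     # skip the failed record
            batch = batch[fail + 1:]        # retry the rest of the batch
            fail = insert_batch(batch, is_bad)
        inserted.extend(batch)
    return inserted, skipped
```

Note that each bad record costs one extra subtransaction plus a re-insert of the batch prefix, which is why the cost of subtransactions (raised by Pavel upthread) matters so much to this design.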
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On Fri, Dec 1, 2017 at 1:58 AM, Alvaro Herrera wrote:
> On a *very* quick look, please use an enum to return from NextCopyFrom
> rather than 'int'.  The chunks that change bool to int are very
> odd-looking.  This would move the comment that explains the value from
> copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII
> dashes in the descriptions of those values; please don't.

I will fix it, thank you.

> Or maybe I misunderstood the patch completely.

I hope so. Here are my thoughts on how it all works; please correct me
where I am wrong:

1) First, I have simply changed the ereport level to WARNING for
specific validations (extra or missing columns, etc.) if the
IGNORE_ERRORS option is used. All these checks are inside NextCopyFrom.
Thus, this patch performs here pretty much the same as before, except
that it is possible to skip bad lines, and this part should be safe as
well.

2) About PG_TRY/CATCH. I use it to catch only one specific function call
inside NextCopyFrom--it is InputFunctionCall--which is used just to
parse a datatype from the input string. I have no idea how WAL write or
trigger errors could get here.

All of this is done before actually forming a tuple, putting it into the
heap, firing insert-related triggers, etc. I am not trying to catch all
errors during the row processing, only input data errors. So why is it
unsafe?

Best,

Alexey
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Alexey Kondratov wrote:

> I have rebased my branch and squashed all commits into one, since the
> patch is quite small. Everything seems to be working fine now; the
> patch passes all regression tests.

On a *very* quick look, please use an enum to return from NextCopyFrom
rather than 'int'.  The chunks that change bool to int are very
odd-looking.  This would move the comment that explains the value from
copy.c to copy.h, obviously.  Also, you seem to be using non-ASCII
dashes in the descriptions of those values; please don't.

But those are really just minor nitpicks while paging through.  After
looking at the way you're handling errors, it seems that you've chosen
to go precisely the way people were saying was not admissible: use a
PG_TRY block, and when things go south, simply log a WARNING and move
on.

This is unsafe.  For example: what happens if the error being raised is
out-of-memory?  Failing to re-throw means you don't release the current
transaction memory context.  What if the error is that WAL cannot be
written because the WAL disk ran out of space?  This would just be
reported as a warning over and over until the server log disk is also
full.  What if your insertion fired a trigger that caused an update
which got a serializability exception?  You cannot just move on, because
at that point session state (serializability session state) might be
busted until you abort the current transaction.

Or maybe I misunderstood the patch completely.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
On Wed, Jun 21, 2017 at 11:37 PM, Alex K wrote:
>> On 16 Jun 2017, at 21:30, Alexey Kondratov wrote:
>>
>>> On 13 Jun 2017, at 01:44, Peter Geoghegan wrote:
>>>
>>> Speculative insertion has the following special entry points to
>>> heapam.c and execIndexing.c, currently only called within
>>> nodeModifyTable.c
>>>
>>> Offhand, it doesn't seem like it would be that hard to teach another
>>> heap_insert() caller the same tricks.
>>
>> I went through the nodeModifyTable.c code and it seems not to be so
>> difficult to do the same inside COPY.
>
> After a more precise look, I have figured out at least one difficulty:
> COPY and INSERT follow different execution paths. INSERT goes through
> the planner, while COPY does not. This leads to the absence of some
> required attributes like arbiterIndexes, which are available during
> INSERT via PlanState/ModifyTableState. It is probably possible to get
> the same in COPY, but it is not clear to me how.
>
> Anyway, adding 'speculative insertion' to COPY is worth a separate
> patch, and I would be glad to try implementing it.
>
> In the meantime, I have prepared a complete working patch with:
>
> - ignoring of input data formatting errors
> - an IGNORE_ERRORS parameter in the COPY options
> - updated regression tests
>
> Please find the patch attached, or check the web UI diff on GitHub as
> always:
> https://github.com/ololobus/postgres/pull/1/files

"git diff master --check" complains heavily, and the patch does not
apply anymore.  The last patch is 5 months old as well, so I am marking
the patch as returned with feedback.
--
Michael