Hi,

> Another approach would be to distinguish between errors that require a
> subtransaction to recover to a consistent state, and less serious errors
> that don't have this requirement (e.g. invalid input to a data type
> input function). If all the errors that we want to tolerate during a
> bulk load fall into the latter category, we can do without
> subtransactions.
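For that "less serious" category, something along the following (untested)
lines might suffice: trap the input-function error with the usual
PG_TRY/PG_CATCH machinery and skip the row, with no subtransaction at all.
safe_input_call here is just an illustrative wrapper, and it leans entirely
on the assumption above that such errors leave nothing behind that needs
subtransaction-level cleanup:

    /*
     * Illustrative sketch only: trap a "less serious" error without a
     * subtransaction. Assumes the input function's error leaves no state
     * needing subtransaction-level cleanup -- which is exactly the premise
     * of the quoted paragraph.
     */
    static bool
    safe_input_call(FmgrInfo *flinfo, char *str, Oid typioparam,
                    int32 typmod, Datum *result)
    {
        MemoryContext oldcxt = CurrentMemoryContext;
        bool          ok = true;

        PG_TRY();
        {
            *result = InputFunctionCall(flinfo, str, typioparam, typmod);
        }
        PG_CATCH();
        {
            MemoryContextSwitchTo(oldcxt);
            FlushErrorState();      /* discard the error data */
            ok = false;             /* caller logs the row to the bad file */
        }
        PG_END_TRY();

        return ok;
    }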
I think the errors which occur after we have done a heap_insert of the tuple generated from the current input row are the ones which would require a subtransaction to recover. Examples would be unique/primary key violations or FKey/trigger-related errors. Any errors which occur before doing the heap_insert should not require any recovery, as far as I can see.

The overhead of having a subtransaction per row is a very valid concern. But instead of using a per-insert or per-batch subtransaction, I am thinking that we can start off a subtransaction and continue it until we encounter a failure. The moment an error is encountered, since we have the offending (already in heap) tuple around, we can call simple_heap_delete on it and commit (instead of aborting) this subtransaction after doing some minor cleanup. The current input row can also be logged to a bad file. Recall that we only need to handle those cases in which the simple_heap_insert succeeds, but the index insertion or the after-row-insert trigger raises an error. The rest of the load can then go ahead under a new subtransaction. A rough sketch of the loop I have in mind is below.
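In this untested sketch, NextCopyTuple, InsertIndexEntriesAndFireTriggers
and LogBadRow are hypothetical placeholders; BeginInternalSubTransaction,
ReleaseCurrentSubTransaction, simple_heap_insert, simple_heap_delete and
PG_TRY/PG_CATCH are the existing backend facilities. Whether it is actually
safe to commit the subtransaction after FlushErrorState(), rather than
rolling it back, is of course the open question:

    /* One long-lived subtransaction, restarted only when a row fails. */
    MemoryContext oldcxt = CurrentMemoryContext;
    HeapTuple     tuple;

    BeginInternalSubTransaction("bulkload");

    while ((tuple = NextCopyTuple(cstate)) != NULL)   /* hypothetical */
    {
        PG_TRY();
        {
            simple_heap_insert(rel, tuple);
            /* the steps that can fail after the tuple is in the heap */
            InsertIndexEntriesAndFireTriggers(rel, tuple);  /* hypothetical */
        }
        PG_CATCH();
        {
            MemoryContextSwitchTo(oldcxt);
            FlushErrorState();                        /* minor cleanup */
            simple_heap_delete(rel, &tuple->t_self);  /* drop offending tuple */
            LogBadRow(cstate);                        /* hypothetical bad file */
            ReleaseCurrentSubTransaction();           /* commit, not abort */
            BeginInternalSubTransaction("bulkload");
        }
        PG_END_TRY();
    }

    ReleaseCurrentSubTransaction();                   /* commit the last chunk */

Regards,
Nikhils
--
EnterpriseDB               http://www.enterprisedb.com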