> Alon Goldshuv on Bizgres has been working on this as well. Maybe you
> could collaborate? Alon?

I would love to collaborate. The proposal is neat; however, I am not too excited about handling errors at such fine granularity, as far as the user is concerned. I am more in line with Tom Lane's statement in Simon's thread (Practical error logging for very large COPY statements):

"The general problem that needs to be solved is "trap any error that occurs during attempted insertion of a COPY row, and instead of aborting the copy, record the data and the error message someplace else". Seen in that light, implementing a special path for uniqueness violations is pretty pointless."

But I definitely share your struggle to find a good way to handle those unique/FK constraints... Out of your current list of known possible solutions:

. Temporary table that filters out the evil tuples.
. BEFORE TRIGGER handling the tuple if the constraint of interest is violated.
. INSERT wrapped in a subtransaction.
. (Other variations)

I really don't like temporary tables (too much user intervention) or subtransactions (sloooow). I also don't like using pg_loader for that matter: although it's a nice tool, isolating errors with it for large data sets is impractical.

I guess the BEFORE TRIGGER is the closest solution to what I would like to achieve. I think something can be done even without a trigger. We could trap any of the following:

- bad data (any error before the tuple can be created).
- domain constraints
- check constraints
- NOT NULL constraints

As far as UNIQUE goes, maybe there is a good way to do a bt scan against the index table right before the simple_heap_insert call? Hopefully without too much code duplication. I am not too familiar with that code, so I don't have a very specific idea yet. I don't know how much slower things will become with this extra scan (I would think it will still be simpler and faster than a subtransaction), but I figure that there is a price to pay if you want single row error isolation.
Otherwise, if the user wants to run COPY the way it works currently (all data rows or nothing), they could still do it at the same speed using the current code path, bypassing the extra scan.

Not sure this is very helpful, but these are my thoughts at this moment.

Regards,
Alon.
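
P.S. To make the idea a bit more concrete, here is a rough toy sketch in Python of the "trap the error, record the row and the message somewhere else" behaviour discussed above. It is only an in-memory illustration and not PostgreSQL code: the function name copy_with_error_log, the two-column 'id,qty' format, the qty >= 0 check and the isolate_errors flag are all made up for the example, and a plain set stands in for the unique index that would be probed right before the insert.

```python
# Illustrative sketch only: a toy in-memory "COPY" loop, not PostgreSQL code.
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    inserted: list = field(default_factory=list)  # rows that reached the "table"
    rejected: list = field(default_factory=list)  # (line_no, raw_line, error message)


def copy_with_error_log(lines, isolate_errors=True):
    """Load 'id,qty' lines; with isolate_errors, bad rows are logged, not fatal."""
    result = LoadResult()
    seen_ids = set()  # hypothetical stand-in for the unique index on id

    for line_no, raw in enumerate(lines, start=1):
        try:
            # Bad data: anything that fails before a tuple can be formed.
            id_text, qty_text = raw.split(",")
            row_id = int(id_text)
            # NOT NULL on qty (an empty field plays the role of NULL here).
            if qty_text.strip() == "":
                raise ValueError("null value in column qty")
            qty = int(qty_text)
            # CHECK constraint: qty >= 0.
            if qty < 0:
                raise ValueError("check constraint qty >= 0 violated")
            # UNIQUE: probe the "index" right before the insert.
            if row_id in seen_ids:
                raise ValueError(f"duplicate key id={row_id}")
        except ValueError as exc:
            if not isolate_errors:
                raise  # all-or-nothing mode: the first error aborts the load
            result.rejected.append((line_no, raw, str(exc)))
            continue
        seen_ids.add(row_id)
        result.inserted.append((row_id, qty))
    return result
```

Calling copy_with_error_log(["1,5", "2,", "x,3", "1,9", "3,-4", "4,7"]) keeps rows 1 and 4 and records the other four lines with their error messages, while passing isolate_errors=False aborts on the first bad line. A real all-or-nothing path would also skip the uniqueness probe entirely; the toy version does not model that.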