Great idea. Since the COPY command can tolerate error lines, supporting COPY inside a transaction means treating each line as a sub-transaction; otherwise the whole transaction has to be rolled back as soon as one error line occurs. That is, we need to create a SAVEPOINT before processing each line and, if an error occurs, ROLLBACK TO SAVEPOINT.
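
To make the per-line savepoint idea concrete, here is a minimal sketch of the control flow. It uses Python's built-in sqlite3 module only as a stand-in so the example runs anywhere; it is not HAWQ code, and the real change would live inside the server's COPY implementation. The function name `copy_rows`, the `max_errors` parameter, the table `t`, and the savepoint name `copy_line` are all hypothetical.

```python
import sqlite3


def copy_rows(conn, rows, max_errors):
    """Load rows inside one transaction, one savepoint per row.

    Assumes `conn` was opened with isolation_level=None so that
    BEGIN/COMMIT/ROLLBACK are issued explicitly by this function.
    Returns (loaded_count, rejected), where rejected is a list of
    (line_number, error_message). Raises RuntimeError and rolls back
    the whole transaction once more than max_errors rows have failed.
    """
    loaded = 0
    rejected = []
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        for lineno, row in enumerate(rows, start=1):
            # Each input line becomes its own sub-transaction.
            cur.execute("SAVEPOINT copy_line")
            try:
                cur.execute("INSERT INTO t (id, name) VALUES (?, ?)", row)
            except sqlite3.Error as exc:
                # Undo only this line; earlier lines stay in the transaction.
                cur.execute("ROLLBACK TO SAVEPOINT copy_line")
                cur.execute("RELEASE SAVEPOINT copy_line")
                rejected.append((lineno, str(exc)))
                if len(rejected) > max_errors:
                    raise RuntimeError(
                        f"more than {max_errors} error lines, aborting COPY"
                    )
            else:
                cur.execute("RELEASE SAVEPOINT copy_line")
                loaded += 1
        cur.execute("COMMIT")
    except Exception:
        # Error limit exceeded (or any other failure): nothing is kept.
        cur.execute("ROLLBACK")
        raise
    return loaded, rejected


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Line 2 repeats a primary key and line 3 violates NOT NULL.
    data = [(1, "a"), (1, "dup"), (2, None), (3, "c")]
    print(copy_rows(conn, data, max_errors=2))
```

The sketch issues at least two extra statements per input line (SAVEPOINT plus RELEASE), which is where the overhead of this approach comes from.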
Supporting transactions this way will definitely slow down performance, so I suggest adding an option to switch between transactional and non-transactional mode.

Thanks.

On Thu, Jul 28, 2016 at 6:53 AM, Oleksandr Diachenko <[email protected]> wrote:
> Hi,
>
> Recently we discovered that the COPY command for external tables is not
> transactional.
> It means that if a user is running a COPY command and the underlying
> HDFS fails during the process, the end state of the table is inconsistent.
> As a result I filed a JIRA:
> https://issues.apache.org/jira/browse/HAWQ-956
>
> From the user's perspective, the COPY command might also need to support
> additional options to make the import/export process more flexible.
>
> - Number of error lines to tolerate.
> - Batch size to commit.
> - TBD
>
> Please feel free to join the discussion.
>
> Regards, Alex.
>
