Hi, I'm very interested in this patch and would like to give a review within a week. On the feature side, how about simply using the less verbose "ERRORS" instead of "ERROR LIMIT" ?
On Wed, Jul 3, 2019 at 1:42 PM Surafel Temesgen <surafel3...@gmail.com> wrote: > Hi Alexey, > Thank you for looking at it > > On Tue, Jul 2, 2019 at 7:57 PM Alexey Kondratov < > a.kondra...@postgrespro.ru> wrote: > >> On 28.06.2019 16:12, Alvaro Herrera wrote: >> >> On Wed, Feb 20, 2019 at 7:04 PM Andres Freund <and...@anarazel.de> >> wrote: >> >>> Or even just return it as a row. CopyBoth is relatively widely >> supported >> >>> these days. >> >> i think generating warning about it also sufficiently meet its propose >> of >> >> notifying user about skipped record with existing logging facility >> >> and we use it for similar propose in other place too. The different >> >> i see is the number of warning that can be generated >> > Warnings seem useless for this purpose. I'm with Andres: returning rows >> > would make this a fine feature. If the user wants the rows in a table >> > as Andrew suggests, she can use wrap the whole thing in an insert. >> >> I agree with previous commentators that returning rows will make this >> feature more versatile. > > > I agree. am looking at the options > > Also, I would prefer having an option to ignore all errors, e.g. with >> option ERROR_LIMIT set to -1. Because it is rather difficult to estimate >> a number of future errors if you are playing with some badly structured >> data, while always setting it to 100500k looks ugly. >> >> > Good idea > > I also +1 having an option to ignore all errors. Other RDBMS might use a large number, but "-1" seems cleaner so far. > > >> 1) Calculation of processed rows isn't correct (I've checked). You do it >> in two places, and >> >> - processed++; >> + if (!cstate->error_limit) >> + processed++; >> >> is never incremented if ERROR_LIMIT is specified and no errors >> occurred/no constraints exist, so the result will always be 0. However, >> if primary column with constraints exists, then processed is calculated >> correctly, since another code path is used: >> >> > Correct. i will fix > > + if (specConflict) >> + { >> + ... >> + } >> + else >> + processed++; >> >> I would prefer this calculation in a single place (as it was before >> patch) for simplicity and in order to avoid such problems. >> >> > ok > > >> 2) This ExecInsertIndexTuples call is only executed now if ERROR_LIMIT >> is specified and was exceeded, which doesn't seem to be correct, does it? >> >> - if (resultRelInfo->ri_NumIndices > 0) >> + if (resultRelInfo->ri_NumIndices > 0 && >> cstate->error_limit == 0) >> recheckIndexes = >> ExecInsertIndexTuples(myslot, >> >> > No it alwase executed . I did it this way to avoid > inserting index tuple twice but i see its unlikely > > >> 3) Trailing whitespaces added to error messages and tests for some reason: >> >> + ereport(WARNING, >> + (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), >> + errmsg("skipping \"%s\" --- missing data >> for column \"%s\" ", >> >> + ereport(ERROR, >> + (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), >> + errmsg("missing data for column \"%s\" ", >> >> -ERROR: missing data for column "e" >> +ERROR: missing data for column "e" >> CONTEXT: COPY x, line 1: "2000 230 23 23" >> >> -ERROR: missing data for column "e" >> +ERROR: missing data for column "e" >> CONTEXT: COPY x, line 1: "2001 231 \N \N" >> >> > > regards > Surafel > Thanks, Anthony