[PR] [spark] Support ON_ERROR = CONTINUE / SKIP_FILE in COPY INTO [paimon]

via GitHub Mon, 01 Jun 2026 02:32:42 -0700


JunRuiLee opened a new pull request, #8062:
URL: https://github.com/apache/paimon/pull/8062


   ## Motivation
   
   COPY INTO previously only supported `ON_ERROR = ABORT_STATEMENT`: any parse
   or cast error aborted the entire command. In production data-loading
   pipelines a single malformed row or file would then fail the whole batch,
   which is often too strict. This adds two error-tolerant modes:
   
   - `CONTINUE`: skip bad rows and load the rest (row-level tolerance).
   - `SKIP_FILE`: skip any file that contains an error (all-or-nothing per
     file).
   
   `ABORT_STATEMENT` remains the default, so existing behavior is unchanged.
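
The difference between the three modes can be sketched in plain Python. This is only an illustration of the semantics described above, not the Spark implementation in this PR: `RowError`, `parse_int_row`, and `load_files` are hypothetical names, and an in-memory dict of lines stands in for real input files.

```python
class RowError(Exception):
    """Hypothetical stand-in for a parse or cast error on a single row."""


def parse_int_row(raw: str) -> tuple[int, ...]:
    # Toy "cast": every comma-separated field must be an integer.
    try:
        return tuple(int(field) for field in raw.split(","))
    except ValueError as exc:
        raise RowError(f"cannot cast {raw!r}: {exc}") from exc


def load_files(files: dict[str, list[str]], on_error: str = "ABORT_STATEMENT"):
    """Return the rows that would be loaded under the given ON_ERROR mode."""
    if on_error not in ("ABORT_STATEMENT", "CONTINUE", "SKIP_FILE"):
        raise ValueError(f"unsupported ON_ERROR mode: {on_error}")

    loaded = []
    for name, lines in files.items():
        good, errors = [], []
        for raw in lines:
            try:
                good.append(parse_int_row(raw))
            except RowError as exc:
                errors.append(str(exc))

        if errors and on_error == "ABORT_STATEMENT":
            # Any error fails the whole command; nothing is returned.
            raise RowError(f"{name}: {errors[0]}")
        if errors and on_error == "SKIP_FILE":
            # All-or-nothing per file: drop even the good rows of this file.
            continue
        # CONTINUE (or a clean file): keep whatever parsed successfully.
        loaded.extend(good)
    return loaded
```

With one clean file and one file containing a single bad row, `CONTINUE` keeps every good row from both files, `SKIP_FILE` keeps only the clean file, and `ABORT_STATEMENT` raises.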
   
   ## Changes
   
   - Grammar: `ON_ERROR` now accepts `CONTINUE` and `SKIP_FILE` in addition to
     `ABORT_STATEMENT`.
   - Result schema gains two columns:
     - `errors_seen` (BIGINT): number of error rows per file.
     - `first_error` (STRING): first error message, NULL when the file is
       clean.
   - `status` now also reports `PARTIALLY_LOADED` and `LOAD_FAILED`.
   - Error detection runs once per batch; both modes write in a single commit.
     Load history is recorded so error-tolerant runs stay idempotent under
     `FORCE = FALSE`.
   - Refactor: `CopyIntoTableExec` is split into focused helpers
     (`CopyIntoHelper`, `CopyIntoCastValidator`, `CopyIntoDataFrameBuilder`,
     `CopyIntoErrorHandler`, `CopyIntoResultBuilder`), shared across
     CSV/JSON/Parquet.
   - Docs updated in `sql-write.md`, including the CSV column-count-mismatch
     caveat under `CONTINUE`.
   
   Supported for CSV, JSON, and Parquet.
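
As a rough illustration of how the new result columns could fit together, the sketch below derives one per-file result row. Only `errors_seen`, `first_error`, `status`, `PARTIALLY_LOADED`, and `LOAD_FAILED` come from the description above. The `LOADED` status, the `file` and `rows_loaded` fields, and the exact mapping from mode to status (including reporting `LOAD_FAILED` when `CONTINUE` leaves no good rows) are assumptions made for the example, not confirmed behavior of this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileResult:
    file: str                   # hypothetical field name
    status: str
    rows_loaded: int            # hypothetical field name
    errors_seen: int            # number of error rows in this file
    first_error: Optional[str]  # None (NULL) when the file is clean


def build_result(file: str, rows_total: int, errors: list[str],
                 on_error: str) -> FileResult:
    """Derive one result row for a file under CONTINUE or SKIP_FILE."""
    errors_seen = len(errors)
    first_error = errors[0] if errors else None
    good_rows = rows_total - errors_seen

    if errors_seen == 0:
        # Assumed status name for a clean file.
        status, rows_loaded = "LOADED", rows_total
    elif on_error == "SKIP_FILE" or good_rows == 0:
        # Assumed mapping: a skipped file, or one with no good rows, loads nothing.
        status, rows_loaded = "LOAD_FAILED", 0
    else:
        # Assumed mapping: CONTINUE with at least one good row.
        status, rows_loaded = "PARTIALLY_LOADED", good_rows

    return FileResult(file, status, rows_loaded, errors_seen, first_error)
```

For a three-row file with one bad row, this yields `PARTIALLY_LOADED` with two rows loaded under `CONTINUE`, and `LOAD_FAILED` with zero rows loaded under `SKIP_FILE`; in both cases `errors_seen` is 1 and `first_error` carries the message.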
   


