JunRuiLee opened a new pull request, #7993: URL: https://github.com/apache/paimon/pull/7993
## Summary

- Add JSON format support for `COPY INTO` import and export, alongside existing CSV support
- JSON uses column-name matching (not positional), with options for `MULTI_LINE`, `NULL_IF`, `EMPTY_FIELD_AS_NULL`, and `COMPRESSION`
- CSV-only options (e.g. `FIELD_DELIMITER`, `SKIP_HEADER`) are rejected for JSON format with clear error messages

## Motivation

JSON is a common format for semi-structured data in data lake scenarios. Some users have requested JSON support for `COPY INTO` to complement the existing CSV capability.

## Changes

- **Grammar**: Add `JSON` lexer token to `PaimonSqlExtensions.g4`
- **CopyOptions.scala**: Add `FileFormatType.JSON`, format-specific option validation and Spark reader/writer option mapping
- **CopyIntoTableExec.scala**: JSON reads with column-name schema (vs CSV positional `_c0/_c1`), dispatch `.json()` / `.csv()` by format type
- **CopyIntoLocationExec.scala**: Dispatch export by format type
- **Documentation**: Updated `sql-write.md` with JSON syntax, options, and column mapping semantics

## Tests

Added 16 JSON test cases covering: basic import, column-name matching, multi-line, explicit column list, NULL_IF, export, option validation, round-trip (export then import), extra/missing fields handling, malformed data abort, bad cast abort, GZIP compression, and date/timestamp column casting.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
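For readers skimming the description, here is a minimal, dependency-free sketch of two ideas it mentions: rejecting CSV-only options when the format is JSON, and matching JSON fields to target columns by name rather than by position. This is not the PR's code. The names `CopyIntoSketch`, `Format`, `validate`, `mapByPosition`, and `mapByName` are hypothetical, the CSV-only set contains only the two options the description lists as examples, and the handling shown for extra and missing fields (ignore extras, treat missing as absent) is an assumption, since the description only says those cases are tested.

```scala
// Hypothetical sketch; none of these names come from the PR's actual code.
object CopyIntoSketch {
  sealed trait Format
  case object Csv extends Format
  case object Json extends Format

  // Only the two CSV-only options named in the PR description ("e.g." there,
  // so the real list may be longer).
  private val csvOnlyOptions: Set[String] = Set("FIELD_DELIMITER", "SKIP_HEADER")

  // Reject CSV-only options for JSON. Upper-casing the keys is an assumption
  // about how option names are normalized.
  def validate(format: Format, options: Map[String, String]): Either[String, Unit] = {
    val keys = options.keySet.map(_.toUpperCase)
    format match {
      case Json =>
        val rejected = keys.intersect(csvOnlyOptions).toSeq.sorted
        if (rejected.isEmpty) Right(())
        else Left(s"Option(s) ${rejected.mkString(", ")} are only supported for CSV, not JSON")
      case Csv => Right(())
    }
  }

  // Positional mapping (CSV-style): the i-th value goes to the i-th target column.
  def mapByPosition(targetColumns: Seq[String], values: Seq[String]): Map[String, Option[String]] =
    targetColumns.zipWithIndex.map { case (col, i) => col -> values.lift(i) }.toMap

  // Name-based mapping (JSON-style): each target column is looked up by field name.
  // Assumption: extra fields are ignored and missing fields become None.
  def mapByName(targetColumns: Seq[String], record: Map[String, String]): Map[String, Option[String]] =
    targetColumns.map(col => col -> record.get(col)).toMap
}
```

With name-based mapping, the order of fields inside a JSON record does not matter, which is the main behavioral difference from the positional `_c0/_c1` columns used for CSV.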
