JunRuiLee commented on PR #7898:
URL: https://github.com/apache/paimon/pull/7898#issuecomment-4504477944
> Maybe it is better to support `COPY INTO`? It seems that many products are
> designed in this way.
@JingsongLi Thanks for the suggestion.
After checking existing systems, I found two common directions:
1. Databricks-style import
In Databricks, COPY INTO is mainly used for loading files into tables.
Following this model, we can use COPY INTO only for importing files into Paimon
tables, while keeping CSV export as a Spark procedure.
2. Snowflake-style bidirectional COPY
In Snowflake, COPY INTO supports both loading data into tables and
unloading data to files. Following this model, we would use COPY INTO for both
import and export.
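For reference, a rough sketch of what the two existing styles look like. The table names, paths, and stage names are placeholders made up for illustration, and none of this is proposed Paimon syntax:

```sql
-- Databricks style: COPY INTO only loads files into a table.
-- `orders` and the source path are hypothetical.
COPY INTO orders
FROM '/path/to/csv/files'
FILEFORMAT = CSV
FORMAT_OPTIONS ('header' = 'true');

-- Snowflake style: the same statement covers both directions.
-- `@my_stage` is a hypothetical named stage.
-- Load: files -> table
COPY INTO orders
FROM @my_stage/orders/
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Unload: table -> files
COPY INTO @my_stage/export/
FROM orders
FILE_FORMAT = (TYPE = CSV);
```

In the Snowflake form the direction depends on whether the target after `COPY INTO` is a table or a file location, which is the part Option 1 would leave out of the initial scope.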
I prefer starting with Option 1: support Databricks-style COPY INTO for
import first, with CSV as the first supported format, **and keep export as a
procedure**. This is closer to the Spark ecosystem and keeps the initial scope
smaller. Snowflake-style export can be discussed separately later if needed.