Re: [PR] [spark] Add load_csv and export_csv procedures [paimon]

via GitHub Wed, 20 May 2026 20:20:49 -0700


JunRuiLee commented on PR #7898:
URL: https://github.com/apache/paimon/pull/7898#issuecomment-4504477944


   > Maybe it is better to support `COPY INTO`? It seems that many products are designed in this way.
   
   
     @JingsongLi Thanks for the suggestion.
   
     After checking existing systems, I found two common directions:
   
     1. Databricks-style import
        In Databricks, `COPY INTO` is mainly used for loading files into tables.
        Following this model, we can use `COPY INTO` only for importing files
        into Paimon tables, while keeping CSV export as a Spark procedure.
   
     2. Snowflake-style bidirectional COPY
        In Snowflake, `COPY INTO` supports both loading data into tables and
        unloading data to files. Following this model, we would use `COPY INTO`
        for both import and export.
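   
     For reference, a rough sketch of what the two existing syntaxes look like in those products. The table names, paths, and stage names are hypothetical placeholders, and none of this is proposed Paimon syntax; the options shown are only a small subset of what each system accepts.
   
     ```sql
     -- Databricks: COPY INTO loads files into a table (import only).
     COPY INTO my_schema.target_table
     FROM '/path/to/csv/files'
     FILEFORMAT = CSV
     FORMAT_OPTIONS ('header' = 'true');

     -- Snowflake: COPY INTO <table> loads files from a stage.
     COPY INTO target_table
     FROM @my_stage/input/
     FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

     -- Snowflake: COPY INTO <location> unloads table data to files.
     COPY INTO @my_stage/output/
     FROM target_table
     FILE_FORMAT = (TYPE = CSV);
     ```
   
     The difference that matters here is direction: Databricks only has the table-as-target form, while Snowflake reuses the same statement with a location as the target for export.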
   
     I prefer starting with Option 1: support Databricks-style `COPY INTO` for
     import first, with CSV as the first supported format, **and keep export as
     a procedure**. This is closer to the Spark ecosystem and keeps the initial
     scope smaller. Snowflake-style export can be discussed separately later if
     needed.
   


