CTTY opened a new issue, #1540:
URL: https://github.com/apache/iceberg-rust/issues/1540

   ### Is your feature request related to a problem or challenge?
   
   As a part of #1382, we need to implement `insert_into` for 
`IcebergTableProvider` to support `INSERT INTO` queries in DataFusion:
   ```
   insert into t values (1, 'a');
   ```
   
   ### Physical Plans
   Within `insert_into`, we will need to add a few nodes (DataFusion physical 
plans) to complete the write process. The entire write process is 
described by the flowchart below:
   ```mermaid
   flowchart TD
       A(["Input Node"]) --> F["Project Node"]
       F --> B["Repartition Node"]
       B --> C["Sort Node"]
       C --> D["Writer Node"]
       D --> E["Commit Node"]
   ```
   - Input Node: Input physical plan that represents the input data
   - [ ] Project Node: Calculate the partition value for each input row
   - [ ] Repartition Node: Decide the partitioning mode that gives the best 
parallelism
   - [ ] Sort Node: Sort the input data
   - [ ] Writer Node: Spawn Iceberg writers and write the input data
   - [ ] Commit Node: Commit the data written using Iceberg Tx API
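
The stages above can be sketched with plain in-memory data to show how they hand off to each other. This is only an illustration under stated assumptions, not the DataFusion `ExecutionPlan` or Iceberg API: `Row`, `DataFile`, and every function name below are hypothetical, the partition value is a simple `id mod num_buckets` stand-in for a real partition transform, and the "commit" only totals the record counts.

```rust
use std::collections::BTreeMap;

/// Hypothetical input row; not an Iceberg or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: i64,
    name: String,
}

/// Hypothetical summary of one written file.
#[derive(Debug, PartialEq)]
struct DataFile {
    partition: i64,
    record_count: usize,
}

/// Project step: attach a partition value to every row.
/// Assumption: `id mod num_buckets` stands in for a real partition transform.
fn project(rows: Vec<Row>, num_buckets: i64) -> Vec<(i64, Row)> {
    rows.into_iter()
        .map(|r| (r.id.rem_euclid(num_buckets), r))
        .collect()
}

/// Repartition step: group rows by partition value so that each group
/// could be handled by its own writer.
fn repartition(rows: Vec<(i64, Row)>) -> BTreeMap<i64, Vec<Row>> {
    let mut groups: BTreeMap<i64, Vec<Row>> = BTreeMap::new();
    for (partition, row) in rows {
        groups.entry(partition).or_default().push(row);
    }
    groups
}

/// Sort step: order the rows inside each partition (here by `id`).
fn sort(groups: &mut BTreeMap<i64, Vec<Row>>) {
    for rows in groups.values_mut() {
        rows.sort_by_key(|r| r.id);
    }
}

/// Writer step: "write" each partition and report one file per partition.
fn write(groups: &BTreeMap<i64, Vec<Row>>) -> Vec<DataFile> {
    groups
        .iter()
        .map(|(partition, rows)| DataFile {
            partition: *partition,
            record_count: rows.len(),
        })
        .collect()
}

/// Commit step: a real implementation would commit the files through a
/// transaction; this sketch only returns the total number of rows.
fn commit(files: &[DataFile]) -> usize {
    files.iter().map(|f| f.record_count).sum()
}

/// Chains the steps in the same order as the flowchart.
fn insert_into(rows: Vec<Row>, num_buckets: i64) -> (Vec<DataFile>, usize) {
    let projected = project(rows, num_buckets);
    let mut groups = repartition(projected);
    sort(&mut groups);
    let files = write(&groups);
    let committed = commit(&files);
    (files, committed)
}
```

Calling `insert_into(rows, 2)` on four rows with ids 1 to 4 yields one `DataFile` per bucket with two records each. In the real design each step would be its own physical plan node operating on record batch streams rather than a function over a `Vec`.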
   
   ### Writer Extension
   Besides the writers mentioned in the writer path of #1382, there are other 
writers that can be useful:
   - [ ] Implement `RollingFileWriter`: Helps split incoming data into multiple 
files
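
As a rough illustration of the rolling idea, the sketch below splits a stream of records into multiple in-memory "files" once a target size would be exceeded. It assumes a byte-size threshold as the rolling criterion; the struct layout, the `target_file_size` parameter, and the method names are hypothetical and do not reflect the eventual iceberg-rust `RollingFileWriter` API.

```rust
/// Hypothetical in-memory stand-in for a rolling file writer.
struct RollingFileWriter {
    /// Assumed rolling criterion: maximum bytes per file.
    target_file_size: usize,
    /// Bytes of the file currently being written.
    current: Vec<u8>,
    /// Files that have already been rolled over.
    closed: Vec<Vec<u8>>,
}

impl RollingFileWriter {
    fn new(target_file_size: usize) -> Self {
        Self {
            target_file_size,
            current: Vec::new(),
            closed: Vec::new(),
        }
    }

    /// Append one record. If the record would push the current file past
    /// the target size, roll to a new file first. An empty file always
    /// accepts the record, so oversized records are still written.
    fn write(&mut self, record: &[u8]) {
        let would_exceed = self.current.len() + record.len() > self.target_file_size;
        if !self.current.is_empty() && would_exceed {
            self.roll();
        }
        self.current.extend_from_slice(record);
    }

    /// Close the current file and start a fresh one.
    fn roll(&mut self) {
        let finished = std::mem::take(&mut self.current);
        self.closed.push(finished);
    }

    /// Finish writing and return every file produced.
    fn close(mut self) -> Vec<Vec<u8>> {
        if !self.current.is_empty() {
            self.roll();
        }
        self.closed
    }
}
```

With a target size of 10 bytes, five 4-byte records end up in three files of 8, 8, and 4 bytes. A real writer would delegate to an underlying file writer and track the written size reported by it instead of buffering bytes in memory.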
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Willingness to contribute
   
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


