timsaucer opened a new issue, #1458:
URL: https://github.com/apache/datafusion-python/issues/1458

   ## Summary
   
   Several `SessionContext` methods for reading data sources, writing query results, and registering tables exist in upstream DataFusion v53 but are not yet exposed in datafusion-python.
   
   ## Missing Methods
   
   **Read methods:**
   - [ ] `read_arrow` — read an Arrow IPC file into a DataFrame
   - [ ] `read_batch` — read a single RecordBatch into a DataFrame
   - [ ] `read_batches` — read multiple RecordBatches into a DataFrame
   - [ ] `read_empty` — create an empty DataFrame with a given schema
   
   **Write methods:**
   - [ ] `write_csv` — write query results to CSV directly from context
   - [ ] `write_json` — write query results to JSON directly from context
   - [ ] `write_parquet` — write query results to Parquet directly from context
   
   **Registration:**
   - [ ] `register_arrow` — register an Arrow IPC file as a table
   - [ ] `register_batch` — register a single RecordBatch as a table
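
A checklist like the one above can be re-verified mechanically. The following is a minimal sketch, not the procedure used to produce this issue: `missing_methods` and `StubContext` are hypothetical names introduced here for illustration, and the stub only stands in for a real context class so the snippet runs without datafusion installed.

```python
# Names taken from the checklist in this issue.
EXPECTED = [
    "read_arrow", "read_batch", "read_batches", "read_empty",
    "write_csv", "write_json", "write_parquet",
    "register_arrow", "register_batch",
]


def missing_methods(cls: type, expected: list[str]) -> list[str]:
    """Return the names in `expected` that `cls` does not expose as callables."""
    return [name for name in expected if not callable(getattr(cls, name, None))]


class StubContext:
    """Hypothetical stand-in for a context class that exposes two of the methods."""

    def read_batch(self, batch):
        return batch

    def register_batch(self, name, batch):
        return None


# Lists every expected name the stub does not define, in checklist order.
print(missing_methods(StubContext, EXPECTED))
```

With datafusion-python installed, passing `datafusion.SessionContext` instead of the stub should report which of these names the installed version still lacks. The check only looks at attribute names, so it cannot tell whether a method with a matching name has the same behavior as its upstream counterpart.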
   
   ## Upstream Reference
   
   - https://docs.rs/datafusion/53.0.0/datafusion/execution/context/struct.SessionContext.html
   
   ## Implementation
   
   - Rust bindings: `crates/core/src/context.rs`
   - Python wrappers: `python/datafusion/context.py`
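
The two locations above suggest a two-layer layout: a Rust-backed object that does the work and a thin Python class that delegates to it. The sketch below illustrates that delegation pattern only. Every class in it (`FakeInternalContext`, and the toy `DataFrame` and `SessionContext`) is a hypothetical stand-in written for this example, not the actual datafusion-python code, and the real signatures may differ.

```python
from typing import Any, Iterable


class FakeInternalContext:
    """Hypothetical stand-in for the object exported by the Rust bindings."""

    def read_batches(self, batches: list[Any]) -> list[Any]:
        # A real binding would build a DataFrame from RecordBatches;
        # this stand-in just returns a copy of its input.
        return list(batches)


class DataFrame:
    """Toy wrapper holding whatever the internal layer returned."""

    def __init__(self, inner: Any) -> None:
        self.inner = inner


class SessionContext:
    """Toy Python-side wrapper that forwards calls to the internal context."""

    def __init__(self, internal: Any = None) -> None:
        self.ctx = internal if internal is not None else FakeInternalContext()

    def read_batches(self, batches: Iterable[Any]) -> DataFrame:
        # Normalize the argument on the Python side, then delegate.
        return DataFrame(self.ctx.read_batches(list(batches)))


df = SessionContext().read_batches(iter(["batch-1", "batch-2"]))
print(df.inner)
```

Under this assumed layout, exposing one of the missing methods would mean adding it in both layers: the binding first, then a wrapper that normalizes arguments and wraps the result.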
   
   > **Note:** This gap analysis was performed using an AI agent comparing 
upstream DataFusion v53 documentation against the current datafusion-python 
codebase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

