harrygav opened a new pull request, #665:
URL: https://github.com/apache/wayang/pull/665
## Summary
This PR introduces a new `TableSink` operator for writing `Record` data into
a database table via JDBC, with implementations for the Java and Spark
platforms.
Opening as **Draft** to start discussion on the operator design and expected
behavior.
## Changes
- **New operator:** `TableSink` (in `wayang-basic`)
- A `UnarySink<Record>` configured with a target table name and JDBC
connection `Properties`
- Supports a write `mode` (e.g. `overwrite`) and optional column names; see
the usage sketch after this list
- **Java platform:** `JavaTableSink` (in `wayang-java`)
- JDBC-based implementation that can create the target table (if missing)
and batch-insert records
- Supports `overwrite` by dropping the target table first
- **Spark platform:** `SparkTableSink` (in `wayang-spark`)
- Spark-side implementation of the same `TableSink` operator
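To make the discussion concrete, here is a minimal sketch of how the operator might be wired up. The package, the constructor signature (table name, JDBC `Properties`, write mode, optional column names), and the property keys are assumptions inferred from the description above, not the final API:

```java
import java.util.Properties;

import org.apache.wayang.basic.operators.TableSink; // hypothetical package/signature
import org.apache.wayang.core.api.WayangContext;
import org.apache.wayang.java.Java;

public class TableSinkSketch {
    public static void main(String[] args) {
        // JDBC connection details for the target database.
        Properties jdbcProps = new Properties();
        jdbcProps.setProperty("driver", "org.postgresql.Driver");
        jdbcProps.setProperty("url", "jdbc:postgresql://localhost:5432/mydb");
        jdbcProps.setProperty("user", "wayang");
        jdbcProps.setProperty("password", "secret");

        // Hypothetical constructor: target table, connection properties,
        // write mode, and optional column names.
        TableSink sink = new TableSink(
                "my_table", jdbcProps, "overwrite", new String[]{"id", "name"});

        // Register the Java platform; the sink would then be attached as the
        // terminal UnarySink<Record> of a plan producing Record elements.
        WayangContext context = new WayangContext().with(Java.basicPlugin());
        // ... build the rest of the plan and execute it here.
    }
}
```

As part of formalizing the write modes (see the open questions below), an enum for `mode` instead of a plain string might be worth considering.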
## Notes / open questions
- This started as a PostgreSQL sink, but it should probably become a
**generic JDBC sink** that works across multiple databases.
- DDL generation is currently basic (e.g., all columns are created as
`VARCHAR`s); a sketch of this approach follows the list below
- `mode` semantics (overwrite vs. append, etc.) should be agreed on and
formalized.
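To ground the DDL point above, the current behavior roughly corresponds to a naive generator like the one below. The class and method names are illustrative, not taken from the PR:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative only: mirrors the "basic DDL" behavior described above.
final class NaiveDdl {
    // Creates the target table, typing every column as VARCHAR; in
    // "overwrite" mode the existing table is dropped first.
    static void createTable(Connection connection, String table,
                            String[] columns, boolean overwrite) throws SQLException {
        StringBuilder ddl = new StringBuilder("CREATE TABLE ").append(table).append(" (");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) ddl.append(", ");
            ddl.append(columns[i]).append(" VARCHAR");
        }
        ddl.append(")");
        try (Statement stmt = connection.createStatement()) {
            if (overwrite) {
                stmt.execute("DROP TABLE IF EXISTS " + table);
            }
            stmt.execute(ddl.toString());
        }
    }
}
```

Note that bare `VARCHAR` is accepted by PostgreSQL but not by every database (MySQL, for instance, requires a length), and `DROP TABLE IF EXISTS` support also varies across vendors, which is part of why the generic-JDBC question above matters.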
## How to use / test
To run this end-to-end locally, you currently need an external PostgreSQL
instance and must provide the JDBC connection details (driver, URL, user,
password) in the test setup/environment.
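As a sketch, the connection details might be supplied through Wayang's `Configuration`. The exact property keys below are assumptions following the `wayang.<platform>.jdbc.*` naming pattern and should be matched to whatever the test setup actually reads:

```java
import org.apache.wayang.core.api.Configuration;
import org.apache.wayang.core.api.WayangContext;

public class TestSetupSketch {
    static WayangContext contextForTests() {
        // Assumed property keys, following the wayang.<platform>.jdbc.*
        // naming pattern; verify them against the actual test setup.
        Configuration configuration = new Configuration();
        configuration.setProperty("wayang.postgres.jdbc.driver", "org.postgresql.Driver");
        configuration.setProperty("wayang.postgres.jdbc.url", "jdbc:postgresql://localhost:5432/testdb");
        configuration.setProperty("wayang.postgres.jdbc.user", "postgres");
        configuration.setProperty("wayang.postgres.jdbc.password", "password");
        return new WayangContext(configuration);
    }
}
```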