Hello! DDL statements (CREATE TABLE / ALTER TABLE) usually live outside of the application. In my experience, this sort of task is done either manually or via automation tools like Liquibase / Flyway. This is not specific to Beam; it is a common pattern in backend / data engineering application development.
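To make the "outside of the application" idea concrete, here is a minimal sketch of a versioned migration step that runs before the pipeline starts. It only mimics what a tool like Flyway or Liquibase does and is not how either tool is implemented. Everything named here is an assumption for illustration: the `events` table, the `schema_version` bookkeeping table, the `apply_migrations` and `run_pipeline` helpers, and the use of an in-memory SQLite database as a stand-in for PostgreSQL so the example is self-contained.

```python
import sqlite3

# Hypothetical migrations as (version, DDL) pairs. With a real tool these
# would be versioned files kept in source control, not a Python list.
MIGRATIONS = [
    (1, "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"),
    (2, "ALTER TABLE events ADD COLUMN source TEXT"),  # nullable column
]


def apply_migrations(conn, migrations=MIGRATIONS):
    """Apply migrations not yet recorded; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, ddl in sorted(migrations):
        if version in applied:
            continue  # already applied on an earlier run
        conn.execute(ddl)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied


def run_pipeline(conn):
    """Placeholder for the ingestion job; a Beam pipeline would run here."""
    conn.execute(
        "INSERT INTO events (payload, source) VALUES (?, ?)", ("hello", "demo")
    )
    conn.commit()


if __name__ == "__main__":
    # SQLite stands in for PostgreSQL only to keep the sketch runnable.
    connection = sqlite3.connect(":memory:")
    print("applied migrations:", apply_migrations(connection))
    run_pipeline(connection)
```

The point of the design is that the schema step is idempotent and finishes before any data is written, so it can be a separate deployment or orchestration step rather than part of the pipeline itself.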
Some applications support the simplest, conflict-free DDL changes, such as adding a nullable column, without restarting the app itself. I remember seeing examples in Java and Python of Beam apps that supported automatic schema migrations. Example:
https://medium.com/inside-league/streaming-data-to-bigquery-with-dataflow-and-real-time-schema-updating-c7a3deba3bad

I am not aware of automatic solutions for arbitrary schema changes, though.

On Saturday, 6 May 2023, Michal Charemza <mic...@charemza.name> wrote:

> I'm looking into using Beam to ingest from various sources into a PostgreSQL database, but there is something that I don't quite know how to fit into the Beam model. How to deal with "non data" tasks that would need to happen before or after the pipeline proper?
>
> For example, creation of tables, renames of tables, migrations on existing tables. Where should all this sort of code/logic live if the fetch/ingestion of data is via Beam? Or - is this entirely outside of the Beam model? It should happen before the pipeline, or after the pipeline, but not as part of the pipeline?

--
Best Regards,
Pavel Solomin

Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin <https://www.linkedin.com/in/pavelsolomin>