Hello!

Usually DDL (CREATE TABLE / ALTER TABLE) lives outside of the application.
In my experience, these tasks are done either manually or via automation
tools like Liquibase / Flyway. This is not specific to Beam; it is a common
pattern in backend / data engineering application development.
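
To illustrate the "migrations run before the pipeline" pattern, here is a
minimal sketch of what tools like Liquibase / Flyway do in spirit. It is not
their API. The MIGRATIONS list, the schema_version table and the
apply_migrations function are hypothetical names, and sqlite3 is used only as
a stand-in for PostgreSQL so the example runs with the standard library alone.

```python
import sqlite3

# Hypothetical migrations as (version, DDL) pairs. In a real project these
# would usually be versioned files managed by a migration tool.
MIGRATIONS = [
    (1, "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)"),
    (2, "ALTER TABLE events ADD COLUMN source TEXT"),
]


def apply_migrations(conn, migrations):
    """Apply migrations not yet recorded in schema_version.

    Returns the list of versions applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    applied = []
    for version, ddl in sorted(migrations):
        if version in done:
            continue  # already applied on an earlier run
        conn.execute(ddl)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied


def main():
    # Assumption: an in-memory SQLite database stands in for the real target.
    conn = sqlite3.connect(":memory:")
    apply_migrations(conn, MIGRATIONS)
    # Only after the schema is in place would the Beam pipeline be built and run.
    conn.close()


if __name__ == "__main__":
    main()
```

Because applied versions are recorded, running the step again is a no-op, so
it can be placed in front of every pipeline launch.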

Some applications support the simplest, conflict-free DDL changes, such as
adding a nullable column, without restarting the app itself.
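
A rough sketch of such a conflict-free change: add a nullable column only when
it is missing, so the call is safe to repeat while the app keeps running. The
function name ensure_nullable_column is hypothetical, and sqlite3 is again only
a stand-in so the example runs without a server.

```python
import sqlite3


def ensure_nullable_column(conn, table, column, sql_type):
    """Add `column` to `table` if it is missing; return True if it was added."""
    # Identifiers are interpolated into SQL, so accept only plain names.
    for name in (table, column, sql_type):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    # PRAGMA table_info is SQLite-specific; the column name is field 1 of each row.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # No NOT NULL and no DEFAULT: existing rows simply read NULL for the new column.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
    return True
```

On PostgreSQL the existence check can be left to the database with
ALTER TABLE ... ADD COLUMN IF NOT EXISTS.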

I remember seeing some examples in Java and Python of Beam apps which
supported automatic schema migrations. Example:


https://medium.com/inside-league/streaming-data-to-bigquery-with-dataflow-and-real-time-schema-updating-c7a3deba3bad

I am not aware of any automatic solutions for arbitrary schema changes, though.

On Saturday, 6 May 2023, Michal Charemza <mic...@charemza.name> wrote:
> I'm looking into using Beam to ingest from various sources into a
> PostgreSQL database, but there is something that I don't quite know how to
> fit into the Beam model. How to deal with "non data" tasks that would need
> to happen before or after the pipeline proper?
>
> For example, creation of tables, renames of tables, migrations on
> existing tables. Where should all this sort of code/logic live if the
> fetch/ingestion of data is via Beam? Or - is this entirely outside of the
> Beam model? It should happen before the pipeline, or after the pipeline,
> but not as part of the pipeline?

-- 
Best Regards,
Pavel Solomin

Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin
<https://www.linkedin.com/in/pavelsolomin>
