mattcasters commented on issue #7077:
URL: https://github.com/apache/hop/issues/7077#issuecomment-4610034945
OK, we need some more architecture requirements. Some food for thought.
Example: a column gets added to a satellite at some point in time. What do
we do?
* Add another satellite with only that column in it, or modify the existing satellite?
* Re-load the full history of the satellite (drop/replace)?
* Add the column and start populating values from the point of modification
on?
I guess in general the answer would be: all of the above.
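To make "all of the above" a bit more concrete, here is a rough Python sketch of how the three options could be expressed as selectable strategies that each produce a deployment plan. Everything in it is an assumption for illustration: the names `ColumnStrategy` and `plan_column_addition`, the `load_date` column, the `CHAR(32)` hash key and the generic SQL dialect are not existing Hop features.

```python
from enum import Enum


class ColumnStrategy(Enum):
    NEW_SATELLITE = "new_satellite"    # put the new column in its own satellite
    RELOAD_HISTORY = "reload_history"  # change the table, then reload all history
    ADD_FROM_NOW = "add_from_now"      # change the table, older rows stay NULL


def plan_column_addition(satellite, hub_key, column, column_type, strategy):
    """Return the ordered steps for adding one column to a satellite.

    Steps are plain SQL strings; a step starting with '--' is a manual or
    pipeline action rather than a statement to execute.
    """
    add_column = f"ALTER TABLE {satellite} ADD COLUMN {column} {column_type}"

    if strategy is ColumnStrategy.NEW_SATELLITE:
        # Hypothetical layout: hash key + load date + the single new attribute.
        return [
            f"CREATE TABLE {satellite}_{column} ("
            f"{hub_key} CHAR(32) NOT NULL, "
            f"load_date TIMESTAMP NOT NULL, "
            f"{column} {column_type}, "
            f"PRIMARY KEY ({hub_key}, load_date))"
        ]
    if strategy is ColumnStrategy.RELOAD_HISTORY:
        return [
            add_column,
            f"TRUNCATE TABLE {satellite}",
            f"-- re-run the full-history load for {satellite}",
        ]
    if strategy is ColumnStrategy.ADD_FROM_NOW:
        return [add_column]
    raise ValueError(f"Unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for strategy in ColumnStrategy:
        print(strategy.name)
        for step in plan_column_addition(
            "sat_customer", "customer_hk", "loyalty_tier", "VARCHAR(20)", strategy
        ):
            print("  ", step)
```

The point of the sketch is only that the strategy could be a per-change choice in the model rather than a global setting.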
This shifts the question to: what's the architecture going to be like in
terms of
* Daily operations
* Deployment to DTAP
* The DDL generation/deployment
Ideally we'd be doing all of this model-driven, without the need to copy DDL
into external tools or scripts to hand off to the people doing the deployments.
So we'd have something like `sh hop model-deploy --environment Production`,
using a particular branch in Git.
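As a thought experiment for what such a model-driven deploy step might do internally, the Python sketch below compares the desired model (as it would be read from a Git branch) with what is deployed in an environment and emits the DDL needed to close the gap. The function name `diff_model` and the dictionary layout are hypothetical, and it only handles new tables and new columns; dropped or retyped columns are deliberately left out.

```python
def diff_model(desired, deployed):
    """Return DDL statements that bring `deployed` up to `desired`.

    Both arguments map table name -> {column name: column type}.
    Only additive changes (new tables, new columns) are handled.
    """
    statements = []
    for table, columns in desired.items():
        if table not in deployed:
            column_list = ", ".join(
                f"{name} {ctype}" for name, ctype in columns.items()
            )
            statements.append(f"CREATE TABLE {table} ({column_list})")
            continue
        for name, ctype in columns.items():
            if name not in deployed[table]:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {name} {ctype}"
                )
    return statements


if __name__ == "__main__":
    # Hypothetical model from the branch being deployed.
    desired = {
        "hub_customer": {"customer_hk": "CHAR(32)"},
        "sat_customer": {
            "customer_hk": "CHAR(32)",
            "name": "VARCHAR(100)",
            "loyalty_tier": "VARCHAR(20)",
        },
    }
    # Hypothetical current state of the target environment.
    deployed = {
        "sat_customer": {"customer_hk": "CHAR(32)", "name": "VARCHAR(100)"},
    }
    for statement in diff_model(desired, deployed):
        print(statement)
```

Running the same diff against each DTAP environment in turn would give every environment only the statements it still needs.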
This leads to the next phases beyond the Raw Vault: can we think of an
architecture where Business Vault and Data Marts are covered as well? Some of
the Business Vault artifacts will be simple views, others will need incremental
loading for performance reasons.
Features like accurate dependency management and automatic execution of
updates are essential. Some sort of dbt alternative.
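A minimal Python sketch of the dependency part, assuming each artifact declares the artifacts it reads from. It uses the standard library's `graphlib.TopologicalSorter`; the artifact names and the helpers `execution_order` and `affected_by` are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical artifacts: each key depends on the artifacts in its set.
DEPENDENCIES = {
    "hub_customer": set(),
    "sat_customer": {"hub_customer"},
    "bv_customer_current": {"sat_customer"},
    "dm_dim_customer": {"bv_customer_current"},
    "hub_order": set(),
    "sat_order": {"hub_order"},
}


def execution_order(deps):
    """Return all artifacts in an order that respects their dependencies.

    TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


def affected_by(changed, deps):
    """Return `changed` plus everything downstream of it, in execution order."""
    order = execution_order(deps)
    dirty = {changed}
    for node in order:
        # A node is dirty as soon as any of its inputs is dirty.
        if deps.get(node, set()) & dirty:
            dirty.add(node)
    return [node for node in order if node in dirty]


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
    print(affected_by("sat_customer", DEPENDENCIES))
```

With a graph like this, a change to one satellite only triggers the Business Vault and Data Mart objects downstream of it, and unrelated branches such as the order tables are left alone.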
Some decisions are also needed for the initial version around the nature of
the DV itself with respect to timelines: do we want an insert-only
strategy, or do we go a step further and support from/to date ranges in the
satellites for convenience (at the expense of performance)?
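One possible middle ground, sketched below in Python, is to keep the satellites insert-only and derive the from/to ranges at read time: the end of a version's validity is simply the load date of the next version for the same key. The field names `hub_key`, `load_date`, `valid_from` and `valid_to` are assumptions, and in a database this would more likely be a view using a window function than application code.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter


def with_validity_ranges(rows):
    """Add valid_from/valid_to to insert-only satellite rows.

    `rows` is an iterable of dicts with at least 'hub_key' and 'load_date'.
    The latest version of each key gets valid_to = None (still current).
    The input rows are not modified.
    """
    ordered = sorted(rows, key=itemgetter("hub_key", "load_date"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("hub_key")):
        versions = list(group)
        # Pair every version with its successor (None for the last one).
        for current, following in zip(versions, versions[1:] + [None]):
            result.append({
                **current,
                "valid_from": current["load_date"],
                "valid_to": following["load_date"] if following else None,
            })
    return result


if __name__ == "__main__":
    rows = [
        {"hub_key": "A", "load_date": datetime(2024, 3, 1), "name": "Acme Inc."},
        {"hub_key": "B", "load_date": datetime(2024, 2, 1), "name": "Bolt"},
        {"hub_key": "A", "load_date": datetime(2024, 1, 1), "name": "Acme"},
    ]
    for row in with_validity_ranges(rows):
        print(row["hub_key"], row["valid_from"], row["valid_to"], row["name"])
```

The trade-off is the one already mentioned: no updates on load, but the range has to be computed on every read unless it is materialized somewhere.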