Hi Deigo/Bart
Thanks for the replies
I am a big proponent of a 'single source of truth'. The business application in this case is common across the businesses/databases, and in most respects they *are* separate environments; there is almost no other scenario where I would need access across the different databases. My plan is to create a pipeline (get some business data and email it out, for example) that satisfies the same requirement for all the businesses, so I can just schedule 'hop-run', telling it which environment/database to run in, for one/some/all of the databases on a case-by-case basis. Then there is this case, where I need to launch a workflow that runs the specific pipeline for each database, but the first part of the workflow determines which databases it needs to happen in, and the information each database needs comes from the central repository.
Ideally, I'd like to set things up only once, in a way that satisfies both
sets of requirements.
I was hoping to do it all in one workflow, but it looks like I will have
to break it up: one controlling workflow that launches a workflow/pipeline
for each of the individual databases via a call-out to another 'hop-run'
with the database-specific environment. I can still have an environment
per database, but each of those environments will have a shared config
that points back to the central repository.
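Roughly what I have in mind for the controlling script, as a sketch. The project and workflow names here are made up, and the hop-run flags (-j project, -e environment, -r run configuration, -f file) should be double-checked against `hop-run.sh --help` for your Hop version:

```python
import shlex
import subprocess

def build_hop_run_cmd(environment: str,
                      project: str = "access-sync",       # assumed project name
                      workflow: str = "sync-access.hwf",  # assumed workflow file
                      runconfig: str = "local") -> list[str]:
    # One hop-run invocation per database, differing only in environment.
    return ["hop-run.sh", "-j", project, "-e", environment,
            "-r", runconfig, "-f", workflow]

def sync_all(environments: list[str], dry_run: bool = True) -> list[int]:
    # Loop the controlling workflow would perform; dry_run just prints
    # the commands instead of executing them.
    codes = []
    for env in environments:
        cmd = build_hop_run_cmd(env)
        if dry_run:
            print(shlex.join(cmd))
            codes.append(0)
        else:
            codes.append(subprocess.run(cmd).returncode)
    return codes

# The environment list would come from querying the central repository.
sync_all(["db-finance", "db-hr"])
```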
Cheers
Phil
On 26/10/23 3:36 pm, Diego Mainou <[email protected]> wrote:
Hi Phillip,
This is what I would do
Scenario 1
Assumptions:
* You have set a project folder with all your projects, environments,
database connections, etc.
* You have set up either a server or a container with Hop
Job:
* Create a job that has the list of projects/environments
* Trigger a container, or curl to the server, to perform the required
jobs for each project/environment combination.
i.e. you are running a loop over a multi-environment setup.
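A rough sketch of that loop. The server address, endpoint path, and query parameters below are all placeholders, not real Hop Server routes; check the Hop Server REST documentation for the actual ones:

```python
from urllib.parse import urlencode

SERVER = "http://hop-server:8080"  # placeholder address

def build_trigger_url(project: str, environment: str,
                      workflow: str = "sync-access.hwf") -> str:
    # Build one trigger URL per project/environment combination.
    query = urlencode({"project": project,
                       "environment": environment,
                       "workflow": workflow})
    return f"{SERVER}/hop/startWorkflow/?{query}"  # hypothetical endpoint

combos = [("access-sync", "db-finance"), ("access-sync", "db-hr")]
for project, env in combos:
    url = build_trigger_url(project, env)
    print(url)  # in practice: curl / HTTP-request this URL with auth
```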
Scenario 2
Assumptions:
* You have set up some form of keystore
* You have set up an account that can access the required keys
Job:
* The job has the list of projects/environments.
* Retrieve the list of keys for each project/environment combination
* Execute a loop that performs the required actions
i.e. this is just a simple loop that manages security so that nothing
is kept outside the key vault
Diego
Diego Mainou
Product Manager
M. +61 415 152 091
E. [email protected] <mailto:[email protected]>
www.bizcubed.com.au <https://www.bizcubed.com.au>
<https://www.bizcubed.com.au>
------------------------------------------------------------------------
*From: *"Phillip Brown" <[email protected]>
*To: *"users" <[email protected]>
*Sent: *Thursday, 26 October, 2023 2:53:41 PM
*Subject: *variables, environments, databases, OH MY!
Hi all
They say the best way to learn something new is to use it to solve a
specific problem. I want to do that with Apache Hop, but am stuck on a
particular part.
Requirement:
I am responsible for managing a number of application databases for our
various business units. We have people who support those applications
who periodically require the ability to update data in specific tables.
To facilitate that I have implemented a system where they can log a
request for update access into a central repository, and periodically
that repository will be synced with the various databases. The access
they are granted is (mostly) temporary, and expires after a number of
days. The syncing system currently uses a mix of shell and Python, and
works pretty well. Additionally, the passwords used in each of the
target databases are different.
The basic flow is:
1) get a list of databases which need to be synced (based on recent
requests and/or expiry) from the repository
2) for each database, get a list of what should be in that database from
the repository
3) then in the target database, remove anything that shouldn't be there,
and add anything that isn't there
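As a sketch, step 3 boils down to a set difference per target database. The data below is a toy stand-in for the repository and target-database queries, just to illustrate the shape of it:

```python
def plan_sync(desired: set[str], current: set[str]) -> tuple[set, set]:
    """Return (to_add, to_remove) for one target database:
    add what the repository wants but the database lacks,
    remove what the database has but the repository doesn't."""
    return desired - current, current - desired

desired = {"alice:orders", "bob:customers"}   # from the central repository
current = {"bob:customers", "carol:orders"}   # from the target database
to_add, to_remove = plan_sync(desired, current)
# to_add == {"alice:orders"}; to_remove == {"carol:orders"}
```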
The bit I am having trouble with is the 'for each database' step. I
expect I will have to use variables, and thought I might be able to use
an environment per database, but I can't see how (or whether) you can
tell Hop to use a different environment once you've started a workflow
or a pipeline. So somehow it will all need to be one environment, but I
don't really want to have a squillion different variables.
So, any guidance in what pattern to use in setting this up would be
appreciated.
Thanks and regards
Phil Brown.