Ok,
2c and just guessing
* add a X second wait between pipelines and see how you go. May be just a
timming issue with your DB.
* add some logging steps not only on the pipelines but on the error
handling of the workflow.
Going to bed now
Good luck
[ https://www.bizcubed.com.au/ |
] Diego Mainou
Product Manager
M. +61 415 152 091
E. [ mailto:[email protected] | [email protected] ]
[ https://www.bizcubed.com.au/ | www.bizcubed.com.au ] [
https://www.bizcubed.com.au/ |
]
From: "Davide Cisco" <[email protected]>
To: "users" <[email protected]>
Sent: Friday, 31 May, 2024 9:01:04 PM
Subject: Re: Performance issues when connecting to Oracle databases
Hi Diego,
the pipelines are indeed different files, they are all concatenated in
the workflow with a "success" (green) link sequentially one to
another.
There is no real exchange of data from one pipeline to another (except
for a set of workflow variables, that are set in the first pipeline
and eventually used by the others): each pipeline reads data from one
or two tables and writes the results in another table, all on its own.
I attach some screenshots of the workflow (main) and some of the
pipelines (the missing one are actually similar to those included... I
could probably group them using ETL Metadata Injection somehow, but I
couldn't make it work even this way...)
DC
--
Davide Cisco
Università degli Studi di Padova
Area Servizi Informatici e Telematici
Ufficio Applicativi
Settore Data Governance
via San Francesco, 11
35121 Padova
tel. (+39) 049 8273819
Il giorno ven 31 mag 2024 alle ore 12:36 Diego Mainou
<[email protected]> ha scritto:
>
> Hi Davide,
>
> It is not clear from your description that the various pipelines are
> different "files"
>
> If they are not then that is your problem. Pipelines do not run sequentially.
> Everything happens at the same time.
> Therefore things are tripping over themselves.
>
> The recommended path would be that the first pipeline finishes with a
> pipeline executor transform.
> This transform then executes the next pipeline and so on.
> This transform also passes down the parameters collated by the first
> transform.
>
> In turn if we are talking about a workflow can you please explain how are you
> passing the values from one pipeline to the next?
> e.g. as described above passing them down, or setting variables and passing
> them up, or passing the results up (not recommended due to the memory
> requirements)?
>
> Diego
>
>
>
>
>
> Diego Mainou
> Product Manager
> M. +61 415 152 091
> E. [email protected]
> www.bizcubed.com.au
>
> ________________________________
> From: "Davide Cisco" <[email protected]>
> To: "users" <[email protected]>
> Sent: Friday, 31 May, 2024 8:06:48 PM
> Subject: Performance issues when connecting to Oracle databases
>
> Hello,
>
> I set up a workflow with various pipelines in sequence, each one of them
> should:
> - read data from a table in an Oracle database (sometimes it has to
> lookup a key in another table from another Oracle database)
> - process the collected records (by adding a key and eventually a
> reference year)
> - write the resulting rows to a third table in another Oracle database
> (the same of the eventual lookup above)
>
> As long as I developed the various pipelines one at a time and tested
> the workflow at every edit, it ran without errors.
>
> Since I completed the workflow with all the necessary pipelines, I
> could never make it to the end: the workflow hangs at a random
> pipeline, shows no particular error but can't reach its conclusion.
>
> I suspected there are some memory allocation issues, but even
> increasing the Java dedicated memory to 8 GB (HOP_OPTIONS="-Xmx8192m")
> didn't do the job.
>
> I could probably solve the issues by adding here and there some
> "cleanup" statements/components (since each pipeline processes a
> completely different set of tables), but I can't figure a way to do
> that.
>
> Is there any possibilty to improve the performance (by the above
> mentioned cleanup components and/or some configuration in the Oracle
> JDBC driver), in order to make the workflow complete?
>
> Thanks for any suggestion
>
> DC