Hi Davide, 

It is not clear from your description whether the various pipelines are
separate files.

If they are not, then that is your problem. Transforms within a single
pipeline do not run sequentially: everything starts at the same time, so
things end up tripping over each other.

The recommended path would be for the first pipeline to finish with a Pipeline
Executor transform. That transform then executes the next pipeline, and so on.
It also passes down the parameters collated by the first pipeline.
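
As a rough illustration (the file and parameter names below are invented, and
I'm assuming the standard hop-run syntax), you would launch only the first
pipeline and let the parameters travel down the chain from there:

    # Start only the first pipeline; its final Pipeline Executor
    # transform launches the next pipeline and hands the parameters down.
    ./hop-run.sh --file=first-pipeline.hpl \
                 --runconfig=local \
                 --parameters=REFERENCE_YEAR=2024,SOURCE_SCHEMA=STAGING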

If, on the other hand, we are talking about a workflow, can you please explain
how you are passing the values from one pipeline to the next?
E.g. passing them down as described above, setting variables and passing them
up, or passing the result rows up (not recommended due to the memory
requirements)?

Diego 





Diego Mainou
Product Manager
M. +61 415 152 091
E. [email protected]
www.bizcubed.com.au


From: "Davide Cisco" <[email protected]> 
To: "users" <[email protected]> 
Sent: Friday, 31 May, 2024 8:06:48 PM 
Subject: Performance issues when connecting to Oracle databases 

Hello, 

I set up a workflow with various pipelines in sequence, each of which should:
- read data from a table in an Oracle database (sometimes it has to
look up a key in a table in another Oracle database)
- process the collected records (by adding a key and possibly a
reference year)
- write the resulting rows to a third table in another Oracle database
(the same database as the optional lookup above)

While I was developing the various pipelines one at a time and testing
the workflow after every edit, it ran without errors.

Since I completed the workflow with all the necessary pipelines, I have
never managed to get it to finish: the workflow hangs at a random
pipeline, shows no particular error, and never reaches its conclusion.

I suspected some memory allocation issues, but even increasing the
memory dedicated to Java to 8 GB (HOP_OPTIONS="-Xmx8192m")
didn't do the job.
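
For context, this is how I apply the setting before launching the run (the
hop-run line is just a sketch of my invocation, with an invented workflow
file name):

    # Heap setting picked up by the Hop launch scripts
    export HOP_OPTIONS="-Xmx8192m"
    ./hop-run.sh --file=main-workflow.hwf --runconfig=local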

I could probably solve the issue by adding some "cleanup"
statements/components here and there (since each pipeline processes a
completely different set of tables), but I can't figure out a way to do
that.

Is there any possibility of improving the performance (via the
above-mentioned cleanup components and/or some configuration of the
Oracle JDBC driver), in order to make the workflow complete?
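
For example, would something along these lines be a sensible direction? (This
is untested; I'm assuming from the Oracle JDBC documentation that the thin
driver honours the oracle.jdbc.defaultRowPrefetch system property, whose
default is 10.)

    # A larger row prefetch means fewer network round trips per query;
    # the property name is an assumption based on the Oracle JDBC docs.
    export HOP_OPTIONS="-Xmx8192m -Doracle.jdbc.defaultRowPrefetch=500"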

Thanks for any suggestions

DC 
