I'm not sure that is correct. This could be related to issue [1], where parts of the default configuration (including inheritance from the default project as the parent) are still present in the container.
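One way to confirm which parent the container-created project actually ended up with is to read the parentProjectName key from project-config.json and compare it with the expected value. The following is only a minimal Python sketch, assuming Python 3 is available wherever the file can be read (for example on a copy of the file). The function names are illustrative and not part of Hop; the path and the "_root" parent are the ones quoted further down in this thread.

```python
import json
from pathlib import Path


def parent_project(config_path):
    """Return the "parentProjectName" value from a project-config.json file.

    Returns None when the key is absent.
    """
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    return config.get("parentProjectName")


def check_parent(config_path, expected):
    """Describe whether the configured parent matches the expected one."""
    actual = parent_project(config_path)
    if actual == expected:
        return f"OK: parent project is {actual!r}"
    return f"MISMATCH: parent project is {actual!r}, expected {expected!r}"


if __name__ == "__main__":
    # Path and expected parent are taken from this thread; adjust as needed.
    path = Path("/home/hop/config/projects/geko/project-config.json")
    if path.exists():
        print(check_parent(path, "_root"))
    else:
        print(f"{path} not found")
```

This only reports what the file says. Whether the inheritance resolves presumably also depends on the parent project being registered in hop-config.json, which the sketch does not check.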
[1] https://github.com/apache/hop/issues/3893#issuecomment-2091161026

Regards,
Bart

On Wed, May 22, 2024 at 2:33 PM Hans Van Akelyen <hans.van.akel...@gmail.com> wrote:

> Hi Davide,
>
> That might be the root cause of this issue. The container creates a new
> project using the provided variables [1]. We currently do not have the
> option to create/specify a parent project.
>
> Kr,
> Hans
>
> [1] https://github.com/apache/hop/blob/main/docker/resources/load-and-execute.sh
>
> On 22 May 2024 at 10:39 +0200, Davide Cisco <davide.ci...@unipd.it>,
> wrote:
>
> Hello,
>
> @Gert: I tried to insert a fictional database name in the field (after
> removing the actual connection string, and then reinserting it).
> Although that name is saved in the configuration file, I got the same
> error. The connection string is in the form
> jdbc:oracle:thin:@${APEX_SID} since the connection parameters are
> actually stored in the file tnsnames.ora (where a record matching the
> SID is present, of course).
>
> @Diego: Adding the port forward (-p 1521:1521) in the command line
> didn't work. Drivers are already forwarded to Docker with the use of
> mount points and environment variables (--env
> HOP_SHARED_JDBC_FOLDERS=./lib/jdbc,/opt/jdbc --env
> TNS_ADMIN=/home/hop/tns -v $TNS_ADMIN:/home/hop/tns -v
> /opt/jdbc:/opt/jdbc).
>
> What I noticed during further analysis is that the project has been
> created in Docker in a different way than what has been developed: in
> the "local" configuration the project "geko" is inheriting the
> connection from another project ("_root", where I basically store the
> database connections, since they are used by different projects). In
> the Docker configuration "geko" inherits from the "default" project
> instead, and no entry for the "_root" project is defined in the
> hop-config.json file. This is what the project-config.json file looks
> like in Docker:
>
> === begin file ===
> 2cf530a949da:~/config/projects/geko$ cat project-config.json
> {
>   "metadataBaseFolder" : "${PROJECT_HOME}/metadata",
>   "unitTestsBasePath" : "${PROJECT_HOME}",
>   "dataSetsCsvFolder" : "${PROJECT_HOME}/datasets",
>   "enforcingExecutionInHome" : true,
>   "parentProjectName" : "default",
>   "config" : {
>     "variables" : [ ]
>   }
> }
> ==== end file ====
>
> The local configuration has the line "parentProjectName" : "_root"
> instead. So basically the question is: do I need to set other
> variables in the command line to get the inheritance to work, or move
> some files prior to loading the project in Docker? (or even file this
> behaviour as a bug in GitHub? :)
>
> Thanks for any other suggestions,
>
> Davide
>
> --
> Davide Cisco
> Università degli Studi di Padova
> Area Servizi Informatici e Telematici
> Ufficio Applicativi
> Settore Data Governance
> via San Francesco, 11
> 35121 Padova
> tel. (+39) 049 8273819
>
>
> On Wed, 22 May 2024 at 07:52, Diego Mainou
> <diego.mai...@bizcubed.com.au> wrote:
>
>
> Hey my 2c.
>
> Try on the non docker version.
> If you are successful you may need to tackle one of the following.
>
> Add the drivers to the Docker container
> Open the ports of the Docker container.
>
>
> Diego Mainou
> Product Manager
> M. +61 415 152 091
> E. diego.mai...@bizcubed.com.au
> www.bizcubed.com.au
>
> ________________________________
> From: "Davide Cisco" <davide.ci...@unipd.it>
> To: "users" <users@hop.apache.org>
> Sent: Wednesday, 22 May, 2024 12:46:42 AM
> Subject: Re: Connecting to an Oracle Database in a Docker container
>
> Hi Bart,
>
> I tried to write the variables regarding the database connection to
> the log, and they are correctly imported.
>
> In Hop GUI the database connection does indeed have the "Server host name"
> and "Database name" fields empty, but those fields have been disabled
> since I entered the TNS connection string (something like
> jdbc:oracle:thin:@${APEX_SID} with one of those variables) in the
> "Manual connection URL". As mentioned before, that string allows Hop
> to connect to the DB in a local environment, but not in the Docker one
> (even though the tnsnames.ora file is the same and it is correctly
> referenced with the environment variable TNS_ADMIN).
>
> Thanks for any other suggestions...
>
> Davide
>
>
>
> On Tue, 21 May 2024 at 15:10, Bart Maertens
> <bart.maert...@know.bi> wrote:
> >
> > Hi Davide,
> >
> > The error below points to a missing configuration parameter (database
> > name?).
> >
> > It's hard to say what could be wrong with your project and/or
> > environment configuration, I'm afraid you'll have to dig in.
> > You could try printing all of your configuration variables to the logs
> > or write them to a file to see if everything is in place...
> >
> > 2024/05/21 12:00:37 - Table input.0 - ERROR: java.lang.NullPointerException
> > 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.getObjectName(Database.java:4546)
> >
> > Good luck!
> >
> > Regards,
> > Bart
> >
> > On Tue, May 21, 2024 at 2:48 PM Davide Cisco <davide.ci...@unipd.it> wrote:
> >
> >>
> >> Hello,
> >>
> >> I'm trying to deploy in the Apache Hop Docker container a simple
> >> workflow with some pipelines that connect to an Oracle database via
> >> tnsnames.ora and output some table data in an Excel file.
> >>
> >> This workflow works well in the "local" environment, using both the
> >> Hop GUI and the command line interface hop-run.sh. When I try to
> >> deploy everything in a Docker container, despite having mounted as
> >> volumes all the needed stuff to run (configuration files, JDBC
> >> libraries, environment variables), the workflow fails at the first
> >> pipeline saying that the component Table input can't be initialized.
> >>
> >> I set up the docker "base" image using this Dockerfile (git and
> >> openssh are needed to get the configuration downloaded from our
> >> repository):
> >>
> >> === begin file ===
> >> FROM apache/hop:latest
> >> USER root
> >> COPY --chown=hop:hop ./setup/home/hop /home/hop
> >> RUN apk update
> >> RUN apk add --no-cache git
> >> RUN apk add --no-cache openssh
> >> RUN chmod 600 /home/hop/.ssh/*
> >> USER hop
> >> ==== end file ====
> >>
> >> Then I launched the Docker container with this (very long) command line:
> >>
> >> === begin file ===
> >> sudo docker run -it --rm --name hop-geko.dev \
> >>   --env HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH=/home/hop/dl-repo.sh \
> >>   --env HOP_LOG_LEVEL=Rowlevel \
> >>   --env HOP_LOG_PATH=/files/geko/log/hop.err.log \
> >>   --env HOP_PROJECT_NAME=geko \
> >>   --env HOP_PROJECT_FOLDER=/home/hop/config/projects/geko \
> >>   --env HOP_PROJECT_CONFIG_FILE_NAME=project-config.json \
> >>   --env HOP_ENVIRONMENT_NAME=geko.dev \
> >>   --env HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS="/home/hop/env/dev/Oracle APEX.json",'${PROJECT_HOME}/env/dev.json' \
> >>   --env HOP_SHARED_JDBC_FOLDERS=./lib/jdbc,/opt/jdbc \
> >>   --env HOP_FILE_PATH=main.hwf \
> >>   --env HOP_RUN_CONFIG=local \
> >>   --env HOP_SYSTEM_PROPERTIES=HOP_FILE_MOUNT=/files \
> >>   --env TNS_ADMIN=/home/hop/tns \
> >>   -v /work/hop/env:/home/hop/env \
> >>   -v /work/hop/mount:/files \
> >>   -v $TNS_ADMIN:/home/hop/tns \
> >>   -v /opt/jdbc:/opt/jdbc \
> >>   hop-docker-base
> >> ==== end file ====
> >>
> >> The output I got from the Docker runner is the following:
> >>
> >> === begin file ===
> >> 2024/05/21 11:59:40 - Running the entrypoint script with PID 7
> >> 2024/05/21 11:59:40 - Sourcing custom entry point extension: /home/hop/dl-repo.sh
> >> Cloning into 'hopconfig'...
> >> remote: Enumerating objects: 312, done.
> >> remote: Counting objects: 100% (299/299), done.
> >> remote: Compressing objects: 100% (290/290), done.
> >> remote: Total 312 (delta 166), reused 0 (delta 0), pack-reused 13
> >> Receiving objects: 100% (312/312), 84.00 KiB | 2.90 MiB/s, done.
> >> Resolving deltas: 100% (167/167), done.
> >> 2024/05/21 11:59:41 - Setting system properties at runtime: HOP_FILE_MOUNT=/files
> >> 2024/05/21 11:59:41 - The project folder for geko is set to: /home/hop/config/projects/geko
> >> 2024/05/21 11:59:41 - The specified project folder exists
> >> 2024/05/21 11:59:41 - Registering project geko in the Hop container configuration
> >> 2024/05/21 11:59:41 - /opt/hop/hop-conf.sh --project=geko --project-create --project-home='/home/hop/config/projects/geko' --project-config-file='project-config.json'
> >> Creating project 'geko'
> >> Project 'geko' was created for home folder : /home/hop/config/projects/geko
> >> Configuration file for project 'geko' was saved to : file:/home/hop/config/projects/geko/project-config.json
> >> 2024/05/21 12:00:04 - Registering environment geko.dev in the Hop container configuration
> >> 2024/05/21 12:00:04 - /opt/hop/hop-conf.sh --environment-create --environment=geko.dev --environment-project=geko --environment-config-files='/home/hop/env/dev/Oracle APEX.json,${PROJECT_HOME}/env/dev.json'
> >> Creating environment 'geko.dev'
> >> Environment 'geko.dev' was created in Hop configuration file /opt/hop/config/hop-config.json
> >> Found existing environment configuration file: /home/hop/env/dev/Oracle APEX.json
> >> Found existing environment configuration file: /home/hop/config/projects/geko/env/dev.json
> >> geko.dev
> >> Purpose: Apache Hop docker container
> >> Project name: geko
> >> Config file: /home/hop/env/dev/Oracle APEX.json
> >> Config file: ${PROJECT_HOME}/env/dev.json
> >> 2024/05/21 12:00:25 - Running a single hop workflow / pipeline (main.hwf)
> >> 2024/05/21 12:00:35 - HopRun - Start of Hop Run
> >> 2024/05/21 12:00:35 - HopRun - Referencing environment 'geko.dev' for project geko' in Apache Hop docker container
> >> 2024/05/21 12:00:35 - HopRun - Enabling project 'geko'
> >> 2024/05/21 12:00:35 - HopRun - Relative path filename specified: /home/hop/config/projects/geko/main.hwf
> >> 2024/05/21 12:00:35 - HopRun - Starting workflow: /home/hop/config/projects/geko/main.hwf
> >> 2024/05/21 12:00:36 - main - Start of workflow execution
> >> 2024/05/21 12:00:36 - main - exec(0, 0, Start)
> >> 2024/05/21 12:00:36 - Start - Starting action
> >> 2024/05/21 12:00:36 - main - Starting action [01_strutture.hpl]
> >> 2024/05/21 12:00:36 - main - exec(1, 0, 01_strutture.hpl)
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Starting action
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Opening pipeline: [/home/hop/config/projects/geko/01_strutture.hpl]
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Starting pipeline...(file=${PROJECT_HOME}/01_strutture.hpl, name=01_strutture.hpl, repinfo=null)
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Using run configuration [local]
> >> 2024/05/21 12:00:36 - 01_strutture - nr of transforms to run : 3 , nr of hops : 2
> >> 2024/05/21 12:00:36 - 01_strutture - Executing this pipeline using the Local Pipeline Engine with run configuration 'local'
> >> 2024/05/21 12:00:36 - 01_strutture - Not running a unit test...
> >> 2024/05/21 12:00:36 - 01_strutture - Execution started for pipeline [01_strutture]
> >> 2024/05/21 12:00:36 - 01_strutture - I found 3 different transforms to launch.
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets...
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform 0 --> Table input
> >> 2024/05/21 12:00:36 - 01_strutture - prevcopies = 1, nextcopies=1
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline allocated new rowset [Table input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 1 rowsets for transform 0 --> Table input
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform 1 --> Select values
> >> 2024/05/21 12:00:36 - 01_strutture - prevcopies = 1, nextcopies=1
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline allocated new rowset [Select values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 2 rowsets for transform 1 --> Select values
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform 2 --> Microsoft Excel writer
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 2 rowsets for transform 2 --> Microsoft Excel writer
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating Transforms & TransformData...
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate transform [Table input] of type [TableInput]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Table input.0 - Starting allocation of buffers & new threads...
> >> 2024/05/21 12:00:36 - Table input.0 - Transform info: nrinput=0 nroutput=1
> >> 2024/05/21 12:00:36 - Table input.0 - output rel. is 1:1
> >> 2024/05/21 12:00:36 - Table input.0 - Found output rowset [Table input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - Table input.0 - Finished dispatching
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline has allocated a new transform: [Table input].0
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate transform [Select values] of type [SelectValues]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Select values.0 - Starting allocation of buffers & new threads...
> >> 2024/05/21 12:00:36 - Select values.0 - Transform info: nrinput=1 nroutput=1
> >> 2024/05/21 12:00:36 - Select values.0 - Got previous transform from [Select values] #0 --> Table input
> >> 2024/05/21 12:00:36 - Select values.0 - input rel is 1:1
> >> 2024/05/21 12:00:36 - Select values.0 - Found input rowset [Table input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - Select values.0 - output rel. is 1:1
> >> 2024/05/21 12:00:36 - Select values.0 - Found output rowset [Select values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - Select values.0 - Finished dispatching
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline has allocated a new transform: [Select values].0
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate transform [Microsoft Excel writer] of type [TypeExitExcelWriterTransform]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Starting allocation of buffers & new threads...
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Transform info: nrinput=1 nroutput=0
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Got previous transform from [Microsoft Excel writer] #0 --> Select values
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - input rel is 1:1
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Found input rowset [Select values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Finished dispatching
> >> 2024/05/21 12:00:37 - 01_strutture - Pipeline has allocated a new transform: [Microsoft Excel writer].0
> >> 2024/05/21 12:00:37 - 01_strutture - Initialising 3 transforms...
> >> 2024/05/21 12:00:37 - Table input.0 - ERROR: Error initializing transform [Table input]
> >> 2024/05/21 12:00:37 - Table input.0 - ERROR: java.lang.NullPointerException
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.getObjectName(Database.java:4546)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingObject.grabLoggingObjectInformation(LoggingObject.java:145)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingObject.<init>(LoggingObject.java:45)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingRegistry.registerLoggingSource(LoggingRegistry.java:65)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LogChannel.<init>(LogChannel.java:83)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LogChannel.<init>(LogChannel.java:65)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.<init>(Database.java:182)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.pipeline.transforms.tableinput.TableInput.init(TableInput.java:337)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.pipeline.transform.TransformInitThread.run(TransformInitThread.java:66)
> >> 2024/05/21 12:00:37 - Table input.0 - at java.base/java.lang.Thread.run(Thread.java:829)
> >> 2024/05/21 12:00:37 - 01_strutture - ERROR: Transform [Table input.0] failed to initialize!
> >> 2024/05/21 12:00:37 - 01_strutture - Transform [Select values.0] initialized flawlessly.
> >> 2024/05/21 12:00:37 - 01_strutture - Transform [Microsoft Excel writer.0] initialized flawlessly.
> >> 2024/05/21 12:00:37 - Table input.0 - Finished reading query, closing connection.
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - ERROR: Unable to prepare for execution of the pipeline
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - ERROR: org.apache.hop.core.exception.HopException:
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - We failed to initialize at least one transform. Execution can not begin!
> >> 2024/05/21 12:00:37 - 01_strutture.hpl -
> >> 2024/05/21 12:00:37 - 01_strutture.hpl -
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.Pipeline.prepareExecution(Pipeline.java:1089)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.engines.local.LocalPipelineEngine.prepareExecution(LocalPipelineEngine.java:236)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.Pipeline.execute(Pipeline.java:529)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.actions.pipeline.ActionPipeline.execute(ActionPipeline.java:539)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:655)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:798)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:439)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.startExecution(Workflow.java:300)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.engines.local.LocalWorkflowEngine.startExecution(LocalWorkflowEngine.java:249)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.runWorkflow(HopRun.java:433)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.runWorkflow(HopRun.java:384)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.run(HopRun.java:201)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.main(HopRun.java:924)
> >> 2024/05/21 12:00:37 - main - Finished action [01_strutture.hpl] (result=[false])
> >> 2024/05/21 12:00:37 - main - Workflow execution finished
> >> 2024/05/21 12:00:37 - main - Workflow duration : 2.041 seconds [ 2.040" ]
> >> HopRun exit.
> >> ==== end file ====
> >>
> >> I also tried to disable the default entrypoint in the Hop Docker (by
> >> adding --entrypoint /bin/bash in the command line above) to check if
> >> the files have been correctly downloaded and mounted: all of them seem
> >> to be in the expected position.
> >>
> >> Is there any additional configuration I am missing to get the
> >> connection working? Feel free to ask for any information/file that you
> >> might need to diagnose the problem.
> >>
> >> Thanks in advance for your support
> >>
> >> (note: in the above files text lines might have been broken due to
> >> limitations of the simple text format)
> >>
> >> DC
> >