Not sure if that is correct. This could be related to this issue [1] where
we still have some parts of the default configuration (and inheritance of
the default project as the parent) in the container.

[1] https://github.com/apache/hop/issues/3893#issuecomment-2091161026
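
To confirm which parent the container actually ended up with, something like the following could be run inside the container (for example after starting it with --entrypoint /bin/bash). This is only a rough sketch: the function names parent_project and check_parent are made up for illustration, and it assumes project-config.json keeps the "parentProjectName" key and its value on a single line, as in the file quoted below.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Hop): inspect the parent project
# recorded in a project-config.json file.

# Print the value of "parentProjectName" found in the file given as $1.
# Assumes the key and its value sit on one line.
parent_project() {
  sed -n 's/.*"parentProjectName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Compare the recorded parent ($1 = config file) with the expected one ($2).
# Returns 0 when they match, 1 otherwise.
check_parent() {
  actual=$(parent_project "$1")
  if [ "$actual" = "$2" ]; then
    echo "OK: parent project is '$actual'"
  else
    echo "MISMATCH: expected '$2' but found '$actual'"
    return 1
  fi
}

# Example call, using the path and names from this thread:
#   check_parent /home/hop/config/projects/geko/project-config.json _root
```

Running the same check against the copy in the Git repository and the copy in the container should show whether the file is rewritten when the container registers the project.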

Regards,
Bart

On Wed, May 22, 2024 at 2:33 PM Hans Van Akelyen <hans.van.akel...@gmail.com>
wrote:

> Hi Davide,
>
> That might be the root cause of this issue. The container creates a new
> project using the provided variables [1]. We currently do not have an
> option to create or specify a parent project.
>
> Kr,
> Hans
>
> [1]
> https://github.com/apache/hop/blob/main/docker/resources/load-and-execute.sh
> On 22 May 2024 at 10:39 +0200, Davide Cisco <davide.ci...@unipd.it>,
> wrote:
>
> Hello,
>
> @Gert: I tried to insert a fictional database name in the field (after
> removing the actual connection string, and then reinserting it).
> Although that name is saved in the configuration file, I got the same
> error. The connection string is in the form
> jdbc:oracle:thin:@${APEX_SID} since the connection parameters are
> actually stored in the file tnsnames.ora (where a record matching the
> SID is present, of course).
>
> @Diego: Adding the port forward (-p 1521:1521) in the command line
> didn't work. Drivers are already forwarded to Docker with the use of
> mount points and environment variables (--env
> HOP_SHARED_JDBC_FOLDERS=./lib/jdbc,/opt/jdbc --env
> TNS_ADMIN=/home/hop/tns -v $TNS_ADMIN:/home/hop/tns -v
> /opt/jdbc:/opt/jdbc).
>
> What I noticed during further analysis is that the project is created
> in Docker differently from how it was set up locally: in
> the "local" configuration the project "geko" inherits its
> connections from another project ("_root", where I basically store the
> database connections, since they are used by different projects). In
> the Docker configuration "geko" inherits from the "default" project
> instead, and no entry for the "_root" project is defined in the
> hop-config.json file. This is what the project-config.json file looks
> like in Docker:
>
> === begin file ===
> 2cf530a949da:~/config/projects/geko$ cat project-config.json
> {
> "metadataBaseFolder" : "${PROJECT_HOME}/metadata",
> "unitTestsBasePath" : "${PROJECT_HOME}",
> "dataSetsCsvFolder" : "${PROJECT_HOME}/datasets",
> "enforcingExecutionInHome" : true,
> "parentProjectName" : "default",
> "config" : {
> "variables" : [ ]
> }
> }
> ==== end file ====
>
> The local configuration has the line "parentProjectName" : "_root"
> instead. So basically the question is: do I need to set other
> variables on the command line to get the inheritance to work, or move
> some files before loading the project in Docker? (or even file this
> behaviour as a bug on GitHub? :)
>
> Thanks for any other suggestions,
>
> Davide
>
> --
> Davide Cisco
> Università degli Studi di Padova
> Area Servizi Informatici e Telematici
> Ufficio Applicativi
> Settore Data Governance
> via San Francesco, 11
> 35121 Padova
> tel. (+39) 049 8273819
>
>
> On Wed, 22 May 2024 at 07:52, Diego Mainou
> <diego.mai...@bizcubed.com.au> wrote:
>
>
> Hey, my 2c.
>
> Try it on the non-Docker version first.
> If that succeeds, you may need to tackle one of the following:
>
> - Add the drivers to the Docker container.
> - Open the ports of the Docker container.
>
>
> Diego Mainou
> Product Manager
> M. +61 415 152 091
> E. diego.mai...@bizcubed.com.au
> www.bizcubed.com.au
>
> ________________________________
> From: "Davide Cisco" <davide.ci...@unipd.it>
> To: "users" <users@hop.apache.org>
> Sent: Wednesday, 22 May, 2024 12:46:42 AM
> Subject: Re: Connecting to an Oracle Database in a Docker container
>
> Hi Bart,
>
> I tried to write to log the variables regarding the database
> connection, and they are correctly imported.
>
> In Hop GUI the database connection does indeed have the "Server host name"
> and "Database name" fields empty, but those fields have been disabled
> since I entered the TNS connection string (something like
> jdbc:oracle:thin:@${APEX_SID} with one of those variables) in the
> "Manual connection URL" field. As mentioned before, that string allows Hop
> to connect to the DB in a local environment, but not in the Docker one
> (even though the tnsnames.ora file is the same and is correctly
> referenced through the environment variable TNS_ADMIN).
>
> Thanks for any other suggestions...
>
> Davide
>
>
>
> On Tue, 21 May 2024 at 15:10, Bart Maertens
> <bart.maert...@know.bi> wrote:
>
> >
> > Hi Davide,
> >
> > The error below points to a missing configuration parameter (database
> > name?).
> >
> > It's hard to say what could be wrong with your project and/or
> > environment configuration, I'm afraid you'll have to dig in.
> > You could try printing all of your configuration variables to the logs
> > or write them to a file to see if everything is in place...
> >
> > 2024/05/21 12:00:37 - Table input.0 - ERROR: java.lang.NullPointerException
> > 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.getObjectName(Database.java:4546)
> >
> > Good luck!
> >
> > Regards,
> > Bart
> >
> > On Tue, May 21, 2024 at 2:48 PM Davide Cisco <davide.ci...@unipd.it>
> > wrote:
> >
> >>
> >> Hello,
> >>
> >> I'm trying to deploy in the Apache Hop Docker container a simple
> >> workflow with some pipelines that connect to an Oracle database via
> >> tnsnames.ora and output some table data in an Excel file.
> >>
> >> This workflow works well in the "local" environment, using both the
> >> Hop GUI and the command line interface hop-run.sh. When I try to
> >> deploy everything in a Docker container, despite having mounted as
> >> volumes all the needed stuff to run (configuration files, JDBC
> >> libraries, environment variables), the workflow fails at the first
> >> pipeline saying that the component Table input can't be initialized.
> >>
> >> I set up the docker "base" image using this Dockerfile (git and
> >> openssh are needed to get the configuration downloaded from our
> >> repository):
> >>
> >> === begin file ===
> >> FROM apache/hop:latest
> >> USER root
> >> COPY --chown=hop:hop ./setup/home/hop /home/hop
> >> RUN apk update
> >> RUN apk add --no-cache git
> >> RUN apk add --no-cache openssh
> >> RUN chmod 600 /home/hop/.ssh/*
> >> USER hop
> >> ==== end file ====
> >>
> >> Then launched the Docker container with this (very long) command line:
> >>
> >> === begin file ===
> >> sudo docker run -it --rm --name hop-geko.dev --env
> >> HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH=/home/hop/dl-repo.sh
> >> --env HOP_LOG_LEVEL=Rowlevel --env
> >> HOP_LOG_PATH=/files/geko/log/hop.err.log --env HOP_PROJECT_NAME=geko
> >> --env HOP_PROJECT_FOLDER=/home/hop/config/projects/geko --env
> >> HOP_PROJECT_CONFIG_FILE_NAME=project-config.json --env
> >> HOP_ENVIRONMENT_NAME=geko.dev --env
> >> HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS="/home/hop/env/dev/Oracle
> >> APEX.json",'${PROJECT_HOME}/env/dev.json' --env
> >> HOP_SHARED_JDBC_FOLDERS=./lib/jdbc,/opt/jdbc --env
> >> HOP_FILE_PATH=main.hwf --env HOP_RUN_CONFIG=local --env
> >> HOP_SYSTEM_PROPERTIES=HOP_FILE_MOUNT=/files --env
> >> TNS_ADMIN=/home/hop/tns -v /work/hop/env:/home/hop/env -v
> >> /work/hop/mount:/files -v $TNS_ADMIN:/home/hop/tns -v
> >> /opt/jdbc:/opt/jdbc hop-docker-base
> >> ==== end file ====
> >>
> >> The output I got from the Docker runner is the following:
> >>
> >> === begin file ===
> >> 2024/05/21 11:59:40 - Running the entrypoint script with PID 7
> >> 2024/05/21 11:59:40 - Sourcing custom entry point extension:
> >> /home/hop/dl-repo.sh
> >> Cloning into 'hopconfig'...
> >> remote: Enumerating objects: 312, done.
> >> remote: Counting objects: 100% (299/299), done.
> >> remote: Compressing objects: 100% (290/290), done.
> >> remote: Total 312 (delta 166), reused 0 (delta 0), pack-reused 13
> >> Receiving objects: 100% (312/312), 84.00 KiB | 2.90 MiB/s, done.
> >> Resolving deltas: 100% (167/167), done.
> >> 2024/05/21 11:59:41 - Setting system properties at runtime:
> >> HOP_FILE_MOUNT=/files
> >> 2024/05/21 11:59:41 - The project folder for geko is set to:
> >> /home/hop/config/projects/geko
> >> 2024/05/21 11:59:41 - The specified project folder exists
> >> 2024/05/21 11:59:41 - Registering project geko in the Hop container
> >> configuration
> >> 2024/05/21 11:59:41 - /opt/hop/hop-conf.sh --project=geko
> >> --project-create --project-home='/home/hop/config/projects/geko'
> >> --project-config-file='project-config.json'
> >> Creating project 'geko'
> >> Project 'geko' was created for home folder :
> >> /home/hop/config/projects/geko
> >> Configuration file for project 'geko' was saved to :
> >> file:/home/hop/config/projects/geko/project-config.json
> >> 2024/05/21 12:00:04 - Registering environment geko.dev in the Hop
> >> container configuration
> >> 2024/05/21 12:00:04 - /opt/hop/hop-conf.sh --environment-create
> >> --environment=geko.dev --environment-project=geko
> >> --environment-config-files='/home/hop/env/dev/Oracle
> >> APEX.json,${PROJECT_HOME}/env/dev.json'
> >> Creating environment 'geko.dev'
> >> Environment 'geko.dev' was created in Hop configuration file
> >> /opt/hop/config/hop-config.json
> >> Found existing environment configuration file:
> >> /home/hop/env/dev/Oracle APEX.json
> >> Found existing environment configuration file:
> >> /home/hop/config/projects/geko/env/dev.json
> >> geko.dev
> >> Purpose: Apache Hop docker container
> >> Project name: geko
> >> Config file: /home/hop/env/dev/Oracle APEX.json
> >> Config file: ${PROJECT_HOME}/env/dev.json
> >> 2024/05/21 12:00:25 - Running a single hop workflow / pipeline
> >> (main.hwf)
> >> 2024/05/21 12:00:35 - HopRun - Start of Hop Run
> >> 2024/05/21 12:00:35 - HopRun - Referencing environment 'geko.dev' for
> >> project geko' in Apache Hop docker container
> >> 2024/05/21 12:00:35 - HopRun - Enabling project 'geko'
> >> 2024/05/21 12:00:35 - HopRun - Relative path filename specified:
> >> /home/hop/config/projects/geko/main.hwf
> >> 2024/05/21 12:00:35 - HopRun - Starting workflow:
> >> /home/hop/config/projects/geko/main.hwf
> >> 2024/05/21 12:00:36 - main - Start of workflow execution
> >> 2024/05/21 12:00:36 - main - exec(0, 0, Start)
> >> 2024/05/21 12:00:36 - Start - Starting action
> >> 2024/05/21 12:00:36 - main - Starting action [01_strutture.hpl]
> >> 2024/05/21 12:00:36 - main - exec(1, 0, 01_strutture.hpl)
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Starting action
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Opening pipeline:
> >> [/home/hop/config/projects/geko/01_strutture.hpl]
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Starting
> >> pipeline...(file=${PROJECT_HOME}/01_strutture.hpl,
> >> name=01_strutture.hpl, repinfo=null)
> >> 2024/05/21 12:00:36 - 01_strutture.hpl - Using run configuration [local]
> >> 2024/05/21 12:00:36 - 01_strutture - nr of transforms to run : 3 , nr
> >> of hops : 2
> >> 2024/05/21 12:00:36 - 01_strutture - Executing this pipeline using the
> >> Local Pipeline Engine with run configuration 'local'
> >> 2024/05/21 12:00:36 - 01_strutture - Not running a unit test...
> >> 2024/05/21 12:00:36 - 01_strutture - Execution started for pipeline
> >> [01_strutture]
> >> 2024/05/21 12:00:36 - 01_strutture - I found 3 different transforms to
> >> launch.
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets...
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform
> >> 0 --> Table input
> >> 2024/05/21 12:00:36 - 01_strutture - prevcopies = 1, nextcopies=1
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline allocated new rowset
> >> [Table input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 1 rowsets for
> >> transform 0 --> Table input
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform
> >> 1 --> Select values
> >> 2024/05/21 12:00:36 - 01_strutture - prevcopies = 1, nextcopies=1
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline allocated new rowset
> >> [Select values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 2 rowsets for
> >> transform 1 --> Select values
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating rowsets for transform
> >> 2 --> Microsoft Excel writer
> >> 2024/05/21 12:00:36 - 01_strutture - Allocated 2 rowsets for
> >> transform 2 --> Microsoft Excel writer
> >> 2024/05/21 12:00:36 - 01_strutture - Allocating Transforms &
> >> TransformData...
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate
> >> transform [Table input] of type [TableInput]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Table input.0 - Starting allocation of buffers &
> >> new threads...
> >> 2024/05/21 12:00:36 - Table input.0 - Transform info: nrinput=0
> >> nroutput=1
> >> 2024/05/21 12:00:36 - Table input.0 - output rel. is 1:1
> >> 2024/05/21 12:00:36 - Table input.0 - Found output rowset [Table
> >> input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - Table input.0 - Finished dispatching
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline has allocated a new
> >> transform: [Table input].0
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate
> >> transform [Select values] of type [SelectValues]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Select values.0 - Starting allocation of buffers
> >> & new threads...
> >> 2024/05/21 12:00:36 - Select values.0 - Transform info: nrinput=1
> >> nroutput=1
> >> 2024/05/21 12:00:36 - Select values.0 - Got previous transform from
> >> [Select values] #0 --> Table input
> >> 2024/05/21 12:00:36 - Select values.0 - input rel is 1:1
> >> 2024/05/21 12:00:36 - Select values.0 - Found input rowset [Table
> >> input.0 - Select values.0]
> >> 2024/05/21 12:00:36 - Select values.0 - output rel. is 1:1
> >> 2024/05/21 12:00:36 - Select values.0 - Found output rowset [Select
> >> values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - Select values.0 - Finished dispatching
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline has allocated a new
> >> transform: [Select values].0
> >> 2024/05/21 12:00:36 - 01_strutture - Pipeline is about to allocate
> >> transform [Microsoft Excel writer] of type
> >> [TypeExitExcelWriterTransform]
> >> 2024/05/21 12:00:36 - 01_strutture - Transform has nrcopies=1
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Starting allocation
> >> of buffers & new threads...
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Transform info:
> >> nrinput=1 nroutput=0
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Got previous
> >> transform from [Microsoft Excel writer] #0 --> Select values
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - input rel is 1:1
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Found input rowset
> >> [Select values.0 - Microsoft Excel writer.0]
> >> 2024/05/21 12:00:36 - Microsoft Excel writer.0 - Finished dispatching
> >> 2024/05/21 12:00:37 - 01_strutture - Pipeline has allocated a new
> >> transform: [Microsoft Excel writer].0
> >> 2024/05/21 12:00:37 - 01_strutture - Initialising 3 transforms...
> >> 2024/05/21 12:00:37 - Table input.0 - ERROR: Error initializing transform [Table input]
> >> 2024/05/21 12:00:37 - Table input.0 - ERROR: java.lang.NullPointerException
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.getObjectName(Database.java:4546)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingObject.grabLoggingObjectInformation(LoggingObject.java:145)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingObject.<init>(LoggingObject.java:45)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LoggingRegistry.registerLoggingSource(LoggingRegistry.java:65)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LogChannel.<init>(LogChannel.java:83)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.logging.LogChannel.<init>(LogChannel.java:65)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.core.database.Database.<init>(Database.java:182)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.pipeline.transforms.tableinput.TableInput.init(TableInput.java:337)
> >> 2024/05/21 12:00:37 - Table input.0 - at org.apache.hop.pipeline.transform.TransformInitThread.run(TransformInitThread.java:66)
> >> 2024/05/21 12:00:37 - Table input.0 - at java.base/java.lang.Thread.run(Thread.java:829)
> >> 2024/05/21 12:00:37 - 01_strutture - ERROR: Transform [Table input.0]
> >> failed to initialize!
> >> 2024/05/21 12:00:37 - 01_strutture - Transform [Select values.0]
> >> initialized flawlessly.
> >> 2024/05/21 12:00:37 - 01_strutture - Transform [Microsoft Excel
> >> writer.0] initialized flawlessly.
> >> 2024/05/21 12:00:37 - Table input.0 - Finished reading query, closing
> >> connection.
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - ERROR: Unable to prepare for execution of the pipeline
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - ERROR: org.apache.hop.core.exception.HopException:
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - We failed to initialize at least one transform. Execution can not begin!
> >> 2024/05/21 12:00:37 - 01_strutture.hpl -
> >> 2024/05/21 12:00:37 - 01_strutture.hpl -
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.Pipeline.prepareExecution(Pipeline.java:1089)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.engines.local.LocalPipelineEngine.prepareExecution(LocalPipelineEngine.java:236)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.pipeline.Pipeline.execute(Pipeline.java:529)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.actions.pipeline.ActionPipeline.execute(ActionPipeline.java:539)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:655)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:798)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.executeFromStart(Workflow.java:439)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.Workflow.startExecution(Workflow.java:300)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.workflow.engines.local.LocalWorkflowEngine.startExecution(LocalWorkflowEngine.java:249)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.runWorkflow(HopRun.java:433)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.runWorkflow(HopRun.java:384)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.run(HopRun.java:201)
> >> 2024/05/21 12:00:37 - 01_strutture.hpl - at org.apache.hop.run.HopRun.main(HopRun.java:924)
> >> 2024/05/21 12:00:37 - main - Finished action [01_strutture.hpl] (result=[false])
> >> 2024/05/21 12:00:37 - main - Workflow execution finished
> >> 2024/05/21 12:00:37 - main - Workflow duration : 2.041 seconds [ 2.040" ]
> >> HopRun exit.
> >> ==== end file ====
> >>
> >> I also tried to disable the default entrypoint in the Hop Docker (by
> >> adding --entrypoint /bin/bash in the command line above) to check if
> >> the files have been correctly downloaded and mounted: all of them seem
> >> to be in the expected position.
> >>
> >> Is there any additional configuration I am missing to get the
> >> connection working? Feel free to ask for any information/file that you
> >> might need to diagnose the problem.
> >>
> >> Thanks in advance for your support
> >>
> >> (note: in the above files text lines might have been broken due to
> >> limitations of the simple text format)
> >>
> >> DC
>
>
