Andy, I think it's easier to show what I mean instead of talking about it: https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
I took your Dockerfile and added ARG INIT_D=$BASE/init.d and entrypoint.sh.
If $DATA folder is empty, RDF files mounted to $INIT_D will get uploaded to a
temporary Fuseki instance on port 3333. Afterwards/otherwise, Fuseki instance
on 3030 is started.

Run like this:

    docker run \
        -p 3030:3030 \
        -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import1.ttl:/var/fuseki/init.d/import1.ttl \
        -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import2.ttl:/var/fuseki/init.d/import2.ttl \
        -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/config.ttl:/var/fuseki/config.ttl \
        -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/data:/var/fuseki/data \
        atomgraph/fuseki \
        --config /var/fuseki/config.ttl

This is not just a test use case - for tests we're using the in-memory
dataset which is initialized using ja:data in the config. For production, we
need a webapp to have certain data before it can start. But we naturally also
want persistence, so that leaves us with TDB and this is the way to
initialize it during first startup.

The mechanism is a simplified variant of what mysql is doing.

There are some shaky parts: e.g. "/ds" in the entrypoint might not match the
dataset specified by the user; $DATA might not match the actual volume
specified by the user.

What do you think? Is this too complex for the default Fuseki image?

On Mon, Dec 16, 2019 at 4:22 PM Martynas Jusevičius <[email protected]> wrote:
>
> We were looking at different mysql entrypoint scripts. Not sure why
> there are two, but this repo says it's the "official image":
> https://github.com/docker-library/mysql/blob/master/8.0/docker-entrypoint.sh#L87
>
> You can see it is doing a "temporary startup of the MySQL server, for
> init purposes", with --skip-networking option on. Switching ports was
> the alternative I thought could be applied to Fuseki.
>
> On Mon, Dec 16, 2019 at 3:30 PM Andy Seaborne <[email protected]> wrote:
> >
> > The Dockerfile is at:
> >
> > https://github.com/mysql/mysql-docker/tree/mysql-server/8.0
> >
> > all the work is done by
> >
> > docker-entrypoint.sh
> >
> > which is a 200 line shell script.
> >
> > and includes:
> >
> > "$@" --initialize-insecure
> >
> > an argument to mysqld, not a feature specific to the docker container image.
> >
> > Andy
> >
> >
> > On 16/12/2019 12:40, Martynas Jusevičius wrote:
> > > See "Initializing a fresh instance": https://hub.docker.com/_/mysql
> > >
> > > On Mon, Dec 16, 2019 at 12:20 PM Martynas Jusevičius
> > > <[email protected]> wrote:
> > >>
> > >> That's the thing: I don't want to execute docker run multiple times,
> > >> or remap ports manually. (In general we're using docker-compose much
> > >> more than plain docker run.)
> > >>
> > >> We want a super simple setup: docker-compose up and 4 or so services
> > >> start, waiting on each other to initialize. No more instructions, no
> > >> steps to follow, no room for error.
> > >>
> > >> And we have done that: by mounting the dataset I got Fuseki to import
> > >> data on startup -- but so far only using an in-memory dataset, not
> > >> TDB.
> > >>
> > >> GSP for loading data is fine, but Fuseki container should do it on
> > >> itself IMO. Waiting for completion by polling for specific data would
> > >> also work, but it does not sound like a generic solution to me -- and
> > >> increases coupling between containers.
> > >>
> > >> Bringing the Fuseki container online *after* data import is complete
> > >> would be a more general solution IMO, but I agree that complexity
> > >> increases.
> > >> I wasn't trying to reinvent the wheel here, just looked at how a
> > >> similar image (mysql) is addressing this. They seem to do an import on
> > >> a DB instance with networking disabled, and start a proper instance
> > >> after it's completed.
> > >>
> > >>
> > >> On Mon, Dec 16, 2019 at 11:45 AM Andy Seaborne <[email protected]> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 16/12/2019 09:23, Andy Seaborne wrote:
> > >>>> Remap it when starting the container.
> > >>>>
> > >>>> For loading test data, why not use GSP? Then the loading script
> > >>>> indicates when it is ready for the other to use the container.
> > >>>> Or poll the server for specific data available.
> > >>>
> > >>> And in the 909 Dockerfile, the data can be placed in the container via
> > >>> a mount point. Separation of concerns - build the database or place the
> > >>> RDF files on the mount, then start the container.
> > >>>
> > >>> You can also use the container image as a loader - the Java command line
> > >>> tools are available in the Fuseki jar file. "docker run" once to build
> > >>> on the volume for setup, "docker run" again to start and run Fuseki.
> > >>>
> > >>> Using java -cp fuseki-server.jar has always worked and is convenient
> > >>> when working on a remote server. (The commands take up very little
> > >>> space.)
> > >>>
> > >>> Andy
> > >>>
> > >>>>
> > >>>> Andy
> > >>>>
> > >>>> On 13/12/2019 23:37, Martynas Jusevičius wrote:
> > >>>>> Hi,
> > >>>>>
> > >>>>> is it possible to change the default port 3030 to something else by
> > >>>>> means of configuration?
> > >>>>>
> > >>>>> This is related to the Dockerfile which is a long time coming :)
> > >>>>> https://issues.apache.org/jira/browse/JENA-909
> > >>>>>
> > >>>>> I need some init scripts that execute on Fuseki's launch (e.g. import
> > >>>>> mounted data) - but before making the server available externally.
> > >>>>> Other services are waiting on Fuseki, so we want to make sure it comes
> > >>>>> online only when the data is fully loaded.
> > >>>>>
> > >>>>> My plan is to follow roughly what the mysql entrypoint is doing:
> > >>>>> 1. start temporary Fuseki server on a non-EXPOSEd port
> > >>>>> 2. execute the init script(s)
> > >>>>> 3. shut down the temporary server
> > >>>>> 4. start proper Fuseki server on the normal EXPOSEd port
> > >>>>>
> > >>>>> Would this work?
> > >>>>>
> > >>>>> Stian's image seems to do half of what I need: it can load the data,
> > >>>>> but does not look like it happens during startup - the load.sh needs
> > >>>>> to be run separately:
> > >>>>> https://github.com/stain/jena-docker/tree/master/jena-fuseki#data-loading
> > >>>>>
> > >>>>> I want to avoid any additional script executions after startup as
> > >>>>> we're aiming for a plain 'docker-compose up' launch of all services.
> > >>>>>
> > >>>>> Martynas
> > >>>>>
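
For reference, the four-step plan from the first message in this thread (temporary server, init step, shutdown, real server) could be sketched as a dry run along the following lines. This is only an illustration, not the entrypoint.sh from the linked repository: the function names dir_is_empty and plan_startup are hypothetical, the script prints the steps instead of starting Fuseki, and ports 3333/3030 and the "/ds" dataset name are simply the values mentioned in the thread.

```shell
#!/bin/sh
# Dry-run sketch of the init-then-serve flow discussed in this thread.
# It only prints the steps it would take; no Fuseki process is started.
# Assumptions: function names are hypothetical; ports 3333/3030 and "/ds"
# are the values mentioned in the thread and may not match a real config.

# Succeeds when the directory does not exist or contains no entries.
dir_is_empty() {
    [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Print the startup plan for a data directory ($1) and an init directory ($2).
plan_startup() {
    data=$1
    init_d=$2
    if dir_is_empty "$data"; then
        echo "start temporary Fuseki on port 3333"
        for f in "$init_d"/*; do
            # Skip the unexpanded glob when the init directory is empty.
            [ -e "$f" ] || continue
            echo "upload $f to http://localhost:3333/ds"
        done
        echo "stop temporary Fuseki"
    fi
    echo "start Fuseki on port 3030"
}
```

Calling plan_startup /var/fuseki/data /var/fuseki/init.d on a fresh volume would list the temporary-server steps first; once the data directory has content, only the final start step is printed, which is the "afterwards/otherwise" behaviour described at the top of the thread.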
