Andy, in hindsight you were right :) The entrypoint logic in the fuseki-tdb image that tried to load data into a temporary server got way too complex.
We're back to using atomgraph/fuseki with your Dockerfile, which is sufficient for our needs:
https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile

The webapp container does the waiting for Fuseki and the loading over GSP, and it's all much simpler now.

On Thu, Jan 2, 2020 at 5:16 PM Martynas Jusevičius <[email protected]> wrote:
>
> Now published as a separate fuseki-tdb branch/image:
> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
> https://hub.docker.com/r/atomgraph/fuseki-tdb
>
>
> On Tue, 17 Dec 2019 at 17.01, Martynas Jusevičius <[email protected]>
> wrote:
>>
>> Andy,
>>
>> I think it's easier to show what I mean instead of talking about it:
>> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
>>
>> I took your Dockerfile and added ARG INIT_D=$BASE/init.d and entrypoint.sh.
>>
>> If the $DATA folder is empty, RDF files mounted to $INIT_D will get
>> uploaded to a temporary Fuseki instance on port 3333.
>> Afterwards/otherwise, the Fuseki instance on 3030 is started.
>>
>> Run like this:
>>
>> docker run \
>>   -p 3030:3030 \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import1.ttl:/var/fuseki/init.d/import1.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import2.ttl:/var/fuseki/init.d/import2.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/config.ttl:/var/fuseki/config.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/data:/var/fuseki/data \
>>   atomgraph/fuseki \
>>   --config /var/fuseki/config.ttl
>>
>> This is not just a test use case - for tests we're using the in-memory
>> dataset which is initialized using ja:data in the config.
>> For production, we need a webapp to have certain data before it can
>> start. But we naturally also want persistence, so that leaves us with
>> TDB and this is the way to initialize it during first startup.
>> The mechanism is a simplified variant of what mysql is doing.
>>
>> There are some shaky parts: e.g. "/ds" in the entrypoint might not
>> match the dataset specified by the user; $DATA might not match the
>> actual volume specified by the user.
>>
>> What do you think? Is this too complex for the default Fuseki image?
>>
>> On Mon, Dec 16, 2019 at 4:22 PM Martynas Jusevičius
>> <[email protected]> wrote:
>> >
>> > We were looking at different mysql entrypoint scripts. Not sure why
>> > there are two, but this repo says it's the "official image":
>> > https://github.com/docker-library/mysql/blob/master/8.0/docker-entrypoint.sh#L87
>> >
>> > You can see it is doing a "temporary startup of the MySQL server, for
>> > init purposes", with the --skip-networking option on. Switching ports was
>> > the alternative I thought could be applied to Fuseki.
>> >
>> > On Mon, Dec 16, 2019 at 3:30 PM Andy Seaborne <[email protected]> wrote:
>> > >
>> > > The Dockerfile is at:
>> > >
>> > > https://github.com/mysql/mysql-docker/tree/mysql-server/8.0
>> > >
>> > > all the work is done by
>> > >
>> > > docker-entrypoint.sh
>> > >
>> > > which is a 200 line shell script.
>> > >
>> > > and includes:
>> > >
>> > > "$@" --initialize-insecure
>> > >
>> > > an argument to mysqld, not a feature specific to the docker container
>> > > image.
>> > >
>> > > Andy
>> > >
>> > >
>> > > On 16/12/2019 12:40, Martynas Jusevičius wrote:
>> > > > See "Initializing a fresh instance": https://hub.docker.com/_/mysql
>> > > >
>> > > > On Mon, Dec 16, 2019 at 12:20 PM Martynas Jusevičius
>> > > > <[email protected]> wrote:
>> > > >>
>> > > >> That's the thing: I don't want to execute docker run multiple times,
>> > > >> or remap ports manually. (In general we're using docker-compose much
>> > > >> more than plain docker run.)
>> > > >>
>> > > >> We want a super simple setup: docker-compose up and 4 or so services
>> > > >> start, waiting on each other to initialize. No more instructions, no
>> > > >> steps to follow, no room for error.
>> > > >>
>> > > >> And we have done that: by mounting the dataset I got Fuseki to import
>> > > >> data on startup -- but so far only using an in-memory dataset, not
>> > > >> TDB.
>> > > >>
>> > > >> GSP for loading data is fine, but the Fuseki container should do it by
>> > > >> itself IMO. Waiting for completion by polling for specific data would
>> > > >> also work, but it does not sound like a generic solution to me -- and
>> > > >> increases coupling between containers.
>> > > >>
>> > > >> Bringing the Fuseki container online *after* data import is complete
>> > > >> would be a more general solution IMO, but I agree that complexity
>> > > >> increases.
>> > > >> I wasn't trying to reinvent the wheel here, just looked at how a
>> > > >> similar image (mysql) is addressing this. They seem to do an import on
>> > > >> a DB instance with networking disabled, and start a proper instance
>> > > >> after it's completed.
>> > > >>
>> > > >>
>> > > >> On Mon, Dec 16, 2019 at 11:45 AM Andy Seaborne <[email protected]>
>> > > >> wrote:
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On 16/12/2019 09:23, Andy Seaborne wrote:
>> > > >>>> Remap it when starting the container.
>> > > >>>>
>> > > >>>> For loading test data, why not use GSP? Then the loading script
>> > > >>>> indicates when it is ready for the other to use the container.
>> > > >>>> Or poll the server for specific data available.
>> > > >>>
>> > > >>> And in the 909 Dockerfile, the data can be placed in the container
>> > > >>> via a mount point. Separation of concerns - build the database or
>> > > >>> place the RDF files on the mount, then start the container.
>> > > >>>
>> > > >>> You can also use the container image as a loader - the Java command
>> > > >>> line tools are available in the Fuseki jar file. "docker run" once
>> > > >>> to build on the volume for setup, "docker run" again to start and
>> > > >>> run Fuseki.
>> > > >>>
>> > > >>> Using java -cp fuseki-server.jar has always worked and is convenient
>> > > >>> when working on a remote server. (The commands take up very little
>> > > >>> space.)
>> > > >>>
>> > > >>> Andy
>> > > >>>
>> > > >>>>
>> > > >>>> Andy
>> > > >>>>
>> > > >>>> On 13/12/2019 23:37, Martynas Jusevičius wrote:
>> > > >>>>> Hi,
>> > > >>>>>
>> > > >>>>> is it possible to change the default port 3030 to something else by
>> > > >>>>> means of configuration?
>> > > >>>>>
>> > > >>>>> This is related to the Dockerfile which is a long time coming :)
>> > > >>>>> https://issues.apache.org/jira/browse/JENA-909
>> > > >>>>>
>> > > >>>>> I need some init scripts that execute on Fuseki's launch (e.g.
>> > > >>>>> import mounted data) - but before making the server available
>> > > >>>>> externally. Other services are waiting on Fuseki, so we want to
>> > > >>>>> make sure it comes online only when the data is fully loaded.
>> > > >>>>>
>> > > >>>>> My plan is to follow roughly what the mysql entrypoint is doing:
>> > > >>>>> 1. start temporary Fuseki server on a non-EXPOSEd port
>> > > >>>>> 2. execute the init script(s)
>> > > >>>>> 3. shut down the temporary server
>> > > >>>>> 4. start proper Fuseki server on the normal EXPOSEd port
>> > > >>>>>
>> > > >>>>> Would this work?
>> > > >>>>>
>> > > >>>>> Stian's image seems to do half of what I need: it can load the
>> > > >>>>> data, but it does not look like it happens during startup - the
>> > > >>>>> load.sh needs to be run separately:
>> > > >>>>> https://github.com/stain/jena-docker/tree/master/jena-fuseki#data-loading
>> > > >>>>>
>> > > >>>>> I want to avoid any additional script executions after startup as
>> > > >>>>> we're aiming for a plain 'docker-compose up' launch of all
>> > > >>>>> services.
>> > > >>>>>
>> > > >>>>> Martynas
>> > > >>>>>
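For reference, the approach the thread finally settled on (the webapp container waits for Fuseki and then loads the data over GSP) could look roughly like the sketch below. It is not the actual AtomGraph entrypoint. The function names are hypothetical, and the /data Graph Store endpoint path, the default-graph target, and the Turtle content type are assumptions that depend on how the dataset is configured.

```shell
#!/bin/sh
# Sketch only: wait for Fuseki, then load RDF over the Graph Store Protocol.
# Assumptions (not from the thread): function names, the "/data" GSP
# endpoint path, loading into the default graph, and Turtle input files.

# Build the GSP URL for the default graph of a dataset.
gsp_default_graph_url() {
    # $1 = dataset base URL; a trailing slash is stripped if present
    printf '%s/data?default' "${1%/}"
}

# Poll a URL until it answers successfully or the attempts run out.
wait_for_url() {
    # $1 = URL to poll, $2 = max attempts, $3 = seconds between attempts
    attempt=0
    while [ "$attempt" -lt "$2" ]; do
        if curl --silent --fail --output /dev/null "$1"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$3"
    done
    return 1
}

# POST one Turtle file into the default graph.
# POST adds to the graph; PUT would replace its contents.
load_turtle() {
    # $1 = dataset base URL, $2 = path to a .ttl file
    curl --silent --fail \
        --request POST \
        --header 'Content-Type: text/turtle' \
        --data-binary "@$2" \
        "$(gsp_default_graph_url "$1")"
}
```

In the webapp's entrypoint this might be used as `wait_for_url "$FUSEKI_URL" 30 2 && load_turtle "$FUSEKI_URL/ds" /data/import1.ttl`, where `FUSEKI_URL`, the `/ds` dataset name, and the file path are placeholders. Which URL is a reliable readiness signal depends on the Fuseki setup, so that choice is left to the caller.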
