Now published as a separate fuseki-tdb branch/image:
https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
https://hub.docker.com/r/atomgraph/fuseki-tdb

On Tue, 17 Dec 2019 at 17.01, Martynas Jusevičius <[email protected]> wrote:
> Andy,
>
> I think it's easier to show what I mean instead of talking about it:
> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
>
> I took your Dockerfile and added ARG INIT_D=$BASE/init.d and entrypoint.sh.
>
> If the $DATA folder is empty, RDF files mounted to $INIT_D will get
> uploaded to a temporary Fuseki instance on port 3333.
> Afterwards/otherwise, the Fuseki instance on 3030 is started.
>
> Run like this:
>
> docker run \
>   -p 3030:3030 \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import1.ttl:/var/fuseki/init.d/import1.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import2.ttl:/var/fuseki/init.d/import2.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/config.ttl:/var/fuseki/config.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/data:/var/fuseki/data \
>   atomgraph/fuseki \
>   --config /var/fuseki/config.ttl
>
> This is not just a test use case - for tests we're using the in-memory
> dataset which is initialized using ja:data in the config.
> For production, we need a webapp to have certain data before it can
> start. But we naturally also want persistence, so that leaves us with
> TDB and this is the way to initialize it during first startup.
> The mechanism is a simplified variant of what mysql is doing.
>
> There are some shaky parts: e.g. "/ds" in the entrypoint might not
> match the dataset specified by the user; $DATA might not match the
> actual volume specified by the user.
>
> What do you think? Is this too complex for the default Fuseki image?
>
> On Mon, Dec 16, 2019 at 4:22 PM Martynas Jusevičius
> <[email protected]> wrote:
> >
> > We were looking at different mysql entrypoint scripts.
> > Not sure why there are two, but this repo says it's the "official image":
> > https://github.com/docker-library/mysql/blob/master/8.0/docker-entrypoint.sh#L87
> >
> > You can see it is doing a "temporary startup of the MySQL server, for
> > init purposes", with the --skip-networking option on. Switching ports was
> > the alternative I thought could be applied to Fuseki.
> >
> > On Mon, Dec 16, 2019 at 3:30 PM Andy Seaborne <[email protected]> wrote:
> > >
> > > The Dockerfile is at:
> > >
> > > https://github.com/mysql/mysql-docker/tree/mysql-server/8.0
> > >
> > > all the work is done by
> > >
> > > docker-entrypoint.sh
> > >
> > > which is a 200 line shell script.
> > >
> > > and includes:
> > >
> > > "$@" --initialize-insecure
> > >
> > > an argument to mysqld, not a feature specific to the docker container image.
> > >
> > > Andy
> > >
> > >
> > > On 16/12/2019 12:40, Martynas Jusevičius wrote:
> > > > See "Initializing a fresh instance": https://hub.docker.com/_/mysql
> > > >
> > > > On Mon, Dec 16, 2019 at 12:20 PM Martynas Jusevičius
> > > > <[email protected]> wrote:
> > > >>
> > > >> That's the thing: I don't want to execute docker run multiple times,
> > > >> or remap ports manually. (In general we're using docker-compose much
> > > >> more than plain docker run.)
> > > >>
> > > >> We want a super simple setup: docker-compose up and 4 or so services
> > > >> start, waiting on each other to initialize. No more instructions, no
> > > >> steps to follow, no room for error.
> > > >>
> > > >> And we have done that: by mounting the dataset I got Fuseki to import
> > > >> data on startup -- but so far only using an in-memory dataset, not
> > > >> TDB.
> > > >>
> > > >> GSP for loading data is fine, but the Fuseki container should do it by
> > > >> itself IMO. Waiting for completion by polling for specific data would
> > > >> also work, but it does not sound like a generic solution to me -- and
> > > >> increases coupling between containers.
> > > >>
> > > >> Bringing the Fuseki container online *after* data import is complete
> > > >> would be a more general solution IMO, but I agree that complexity
> > > >> increases.
> > > >> I wasn't trying to reinvent the wheel here, just looked at how a
> > > >> similar image (mysql) is addressing this. They seem to do an import on
> > > >> a DB instance with networking disabled, and start a proper instance
> > > >> after it's completed.
> > > >>
> > > >>
> > > >> On Mon, Dec 16, 2019 at 11:45 AM Andy Seaborne <[email protected]> wrote:
> > > >>>
> > > >>>
> > > >>>
> > > >>> On 16/12/2019 09:23, Andy Seaborne wrote:
> > > >>>> Remap it when starting the container.
> > > >>>>
> > > >>>> For loading test data, why not use GSP? Then the loading script
> > > >>>> indicates when it is ready for the other to use the container.
> > > >>>> Or poll the server for specific data available.
> > > >>>
> > > >>> And in the 909 Dockerfile, the data can be placed in the container via
> > > >>> a mount point. Separation of concerns - build the database or place the
> > > >>> RDF files on the mount, then start the container.
> > > >>>
> > > >>> You can also use the container image as the loader - the Java command line
> > > >>> tools are available in the Fuseki jar file. "docker run" once to build
> > > >>> on the volume for setup, "docker run" again to start and run Fuseki.
> > > >>>
> > > >>> Using java -cp fuseki-server.jar has always worked and is convenient
> > > >>> when working on a remote server. (The commands take up very little space.)
> > > >>>
> > > >>> Andy
> > > >>>
> > > >>>>
> > > >>>> Andy
> > > >>>>
> > > >>>> On 13/12/2019 23:37, Martynas Jusevičius wrote:
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> is it possible to change the default port 3030 to something else by
> > > >>>>> means of configuration?
> > > >>>>>
> > > >>>>> This is related to the Dockerfile which is a long time coming :)
> > > >>>>> https://issues.apache.org/jira/browse/JENA-909
> > > >>>>>
> > > >>>>> I need some init scripts that execute on Fuseki's launch (e.g. import
> > > >>>>> mounted data) - but before making the server available externally.
> > > >>>>> Other services are waiting on Fuseki, so we want to make sure it comes
> > > >>>>> online only when the data is fully loaded.
> > > >>>>>
> > > >>>>> My plan is to follow roughly what the mysql entrypoint is doing:
> > > >>>>> 1. start temporary Fuseki server on a non-EXPOSEd port
> > > >>>>> 2. execute the init script(s)
> > > >>>>> 3. shut down the temporary server
> > > >>>>> 4. start proper Fuseki server on the normal EXPOSEd port
> > > >>>>>
> > > >>>>> Would this work?
> > > >>>>>
> > > >>>>> Stian's image seems to do half of what I need: it can load the data,
> > > >>>>> but it does not look like it happens during startup - the load.sh needs
> > > >>>>> to be run separately:
> > > >>>>> https://github.com/stain/jena-docker/tree/master/jena-fuseki#data-loading
> > > >>>>>
> > > >>>>> I want to avoid any additional script executions after startup as
> > > >>>>> we're aiming for a plain 'docker-compose up' launch of all services.
> > > >>>>>
> > > >>>>> Martynas
> > > >>>>>
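
For anyone reading the thread without opening the repo, the four-step plan quoted above can be sketched roughly as below. This is not the entrypoint.sh from the fuseki-tdb branch, only a minimal illustration. Assumptions: the launch command (FUSEKI_CMD), the default paths, the function names, the ".ttl" glob and the "/ds/data" upload URL are all placeholders, and, as noted in the thread, "/ds" has to match whatever dataset the config actually defines.

```shell
#!/bin/sh
# Sketch only: names, paths, ports and the launch command are assumptions.
DATA="${DATA:-/var/fuseki/data}"        # TDB directory (path from the docker run example)
INIT_D="${INIT_D:-/var/fuseki/init.d}"  # mounted RDF files to import
TMP_PORT="${TMP_PORT:-3333}"            # internal port, not EXPOSEd
DATASET="${DATASET:-/ds}"               # must match the dataset name in the config
FUSEKI_CMD="${FUSEKI_CMD:-java -jar fuseki-server.jar}"  # hypothetical launch command

# Succeeds when the directory is missing or has no entries (dotfiles included).
dir_is_empty() {
    [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Blocks until something answers HTTP on the given port (any status code counts).
wait_for_port() {
    until curl -s -o /dev/null "http://localhost:$1/"; do
        sleep 1
    done
}

# Steps 1-3: start a temporary server, upload the files via GSP, shut it down.
# Arguments are passed through to the server (e.g. --config /var/fuseki/config.ttl).
init_dataset() {
    $FUSEKI_CMD --port="$TMP_PORT" "$@" &
    pid=$!
    wait_for_port "$TMP_PORT"
    status=0
    for f in "$INIT_D"/*.ttl; do
        [ -e "$f" ] || continue   # glob matched nothing
        curl -sf -X POST -H "Content-Type: text/turtle" \
            --data-binary "@$f" \
            "http://localhost:$TMP_PORT$DATASET/data?default" || { status=1; break; }
    done
    kill "$pid"
    wait "$pid" 2>/dev/null
    return "$status"
}

# Step 4 in a real entrypoint (left commented so the sketch has no side effects):
#   if dir_is_empty "$DATA"; then init_dataset "$@" || exit 1; fi
#   exec $FUSEKI_CMD --port=3030 "$@"
```

Because only port 3030 is published by "docker run -p 3030:3030", other services cannot reach the temporary instance on 3333, which is what keeps them waiting until the import has finished.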
