Now published as a separate fuseki-tdb branch/image:
https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
https://hub.docker.com/r/atomgraph/fuseki-tdb
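
For readers of the archive, here is a rough sketch of the first-start decision described in the quoted message below: if the data folder is empty, files in init.d are loaded through a temporary instance on port 3333, and then the server starts on port 3030. This is not the actual entrypoint.sh from the repository. The function names data_dir_is_empty and plan_startup are made up for illustration, and the sketch only prints the plan instead of starting Fuseki.

```shell
#!/bin/sh
# Hypothetical sketch only: prints what an init-on-first-start entrypoint
# would do, without starting Fuseki. Function names are assumptions.

# Succeeds when the directory is missing or has no entries (hidden ones included).
data_dir_is_empty() {
    [ ! -d "$1" ] || [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Prints one "load" line per regular file in the init directory when the
# data directory is empty, then always prints the final "serve" line.
plan_startup() {
    data_dir=$1
    init_dir=$2
    if data_dir_is_empty "$data_dir" && [ -d "$init_dir" ]; then
        for f in "$init_dir"/*; do
            if [ -f "$f" ]; then
                echo "load $f via temporary server on port 3333"
            fi
        done
    fi
    echo "serve on port 3030"
}
```

Calling plan_startup /var/fuseki/data /var/fuseki/init.d on a fresh volume would list the mounted RDF files before the final serve line; on a volume that already holds a database it would print only the serve line.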


On Tue, 17 Dec 2019 at 17.01, Martynas Jusevičius <[email protected]>
wrote:

> Andy,
>
> I think it's easier to show what I mean instead of talking about it:
> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
>
> I took your Dockerfile and added ARG INIT_D=$BASE/init.d and entrypoint.sh.
>
> If $DATA folder is empty, RDF files mounted to $INIT_D will get
> uploaded to a temporary Fuseki instance on port 3333.
> Afterwards/otherwise, Fuseki instance on 3030 is started.
>
> Run like this:
>
> docker run \
>   -p 3030:3030 \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import1.ttl:/var/fuseki/init.d/import1.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import2.ttl:/var/fuseki/init.d/import2.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/config.ttl:/var/fuseki/config.ttl \
>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/data:/var/fuseki/data \
>   atomgraph/fuseki \
>   --config /var/fuseki/config.ttl
>
> This is not just a test use case - for tests we're using the in-memory
> dataset which is initialized using ja:data in the config.
> For production, we need a webapp to have certain data before it can
> start. But we naturally also want persistence, so that leaves us with
> TDB and this is the way to initialize it during first startup.
> The mechanism is a simplified variant of what mysql is doing.
>
> There are some shaky parts: e.g. "/ds" in the entrypoint might not
> match the dataset specified by the user; $DATA might not match the
> actual volume specified by the user.
>
> What do you think? Is this too complex for the default Fuseki image?
>
> On Mon, Dec 16, 2019 at 4:22 PM Martynas Jusevičius
> <[email protected]> wrote:
> >
> > We were looking at different mysql entrypoint scripts. Not sure why
> > there are two, but this repo says it's the "official image":
> >
> > https://github.com/docker-library/mysql/blob/master/8.0/docker-entrypoint.sh#L87
> >
> > You can see it is doing a "temporary startup of the MySQL server, for
> > init purposes", with --skip-networking option on. Switching ports was
> > the alternative I thought could be applied to Fuseki.
> >
> > On Mon, Dec 16, 2019 at 3:30 PM Andy Seaborne <[email protected]> wrote:
> > >
> > > The Dockerfile is at:
> > >
> > > https://github.com/mysql/mysql-docker/tree/mysql-server/8.0
> > >
> > > all the work is done by
> > >
> > > docker-entrypoint.sh
> > >
> > > > which is a 200-line shell script.
> > >
> > > and includes:
> > >
> > > "$@" --initialize-insecure
> > >
> > > > an argument to mysqld, not a feature specific to the docker
> > > > container image.
> > >
> > >      Andy
> > >
> > >
> > > On 16/12/2019 12:40, Martynas Jusevičius wrote:
> > > > See "Initializing a fresh instance": https://hub.docker.com/_/mysql
> > > >
> > > > On Mon, Dec 16, 2019 at 12:20 PM Martynas Jusevičius
> > > > <[email protected]> wrote:
> > > >>
> > > >> That's the thing: I don't want to execute docker run multiple times,
> > > >> or remap ports manually. (In general we're using docker-compose much
> > > >> more than plain docker run.)
> > > >>
> > > >> We want a super simple setup: docker-compose up and 4 or so services
> > > >> start, waiting on each other to initialize. No more instructions, no
> > > >> steps to follow, no room for error.
> > > >>
> > > >> And we have done that: by mounting the dataset I got Fuseki to
> > > >> import data on startup -- but so far only using an in-memory
> > > >> dataset, not TDB.
> > > >>
> > > >> GSP for loading data is fine, but the Fuseki container should do it
> > > >> by itself IMO. Waiting for completion by polling for specific data
> > > >> would also work, but it does not sound like a generic solution to
> > > >> me -- and it increases coupling between containers.
> > > >>
> > > >> Bringing the Fuseki container online *after* data import is complete
> > > >> would be a more general solution IMO, but I agree that complexity
> > > >> increases.
> > > >> I wasn't trying to reinvent the wheel here, just looked at how a
> > > >> similar image (mysql) is addressing this. They seem to do an import
> > > >> on a DB instance with networking disabled, and start a proper
> > > >> instance after it's completed.
> > > >>
> > > >>
> > > >> On Mon, Dec 16, 2019 at 11:45 AM Andy Seaborne <[email protected]> wrote:
> > > >>>
> > > >>>
> > > >>>
> > > >>> On 16/12/2019 09:23, Andy Seaborne wrote:
> > > >>>> Remap it when starting the container.
> > > >>>>
> > > >>>> For loading test data, why not use GSP? Then the loading script
> > > >>>> indicates when it is ready for the others to use the container.
> > > >>>> Or poll the server for specific data available.
> > > >>>
> > > >>> And in the JENA-909 Dockerfile, the data can be placed in the
> > > >>> container via a mount point. Separation of concerns - build the
> > > >>> database or place the RDF files on the mount, then start the
> > > >>> container.
> > > >>>
> > > >>> You can also use the container image as the loader - the Java
> > > >>> command line tools are available in the Fuseki jar file. "docker run"
> > > >>> once to build on the volume for setup, "docker run" again to start
> > > >>> and run Fuseki.
> > > >>>
> > > >>> Using java -cp fuseki-server.jar has always worked and is
> > > >>> convenient when working on a remote server. (The commands take up
> > > >>> very little space.)
> > > >>>
> > > >>>       Andy
> > > >>>
> > > >>>>
> > > >>>>       Andy
> > > >>>>
> > > >>>> On 13/12/2019 23:37, Martynas Jusevičius wrote:
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> is it possible to change the default port 3030 to something else
> > > >>>>> by means of configuration?
> > > >>>>>
> > > >>>>> This is related to the Dockerfile which is a long time coming :)
> > > >>>>> https://issues.apache.org/jira/browse/JENA-909
> > > >>>>>
> > > >>>>> I need some init scripts that execute on Fuseki's launch (e.g.
> > > >>>>> import mounted data) - but before making the server available
> > > >>>>> externally. Other services are waiting on Fuseki, so we want to
> > > >>>>> make sure it comes online only when the data is fully loaded.
> > > >>>>>
> > > >>>>> My plan is to follow roughly what the mysql entrypoint is doing:
> > > >>>>> 1. start temporary Fuseki server on a non-EXPOSEd port
> > > >>>>> 2. execute the init script(s)
> > > >>>>> 3. shut down the temporary server
> > > >>>>> 4. start proper Fuseki server on the normal EXPOSEd port
> > > >>>>>
> > > >>>>> Would this work?
> > > >>>>>
> > > >>>>> Stian's image seems to do half of what I need: it can load the
> > > >>>>> data, but it does not look like that happens during startup - the
> > > >>>>> load.sh needs to be run separately:
> > > >>>>> https://github.com/stain/jena-docker/tree/master/jena-fuseki#data-loading
> > > >>>>>
> > > >>>>> I want to avoid any additional script executions after startup as
> > > >>>>> we're aiming for a plain 'docker-compose up' launch of all
> > > >>>>> services.
> > > >>>>>
> > > >>>>> Martynas
> > > >>>>>
>
