Andy,

in hindsight you were right :) The entrypoint logic in the fuseki-tdb
image that tried to load data into a temporary server got way too
complex.

We're back to using atomgraph/fuseki with your Dockerfile which is
sufficient for our needs:
https://github.com/AtomGraph/fuseki-docker/blob/master/Dockerfile

The webapp container does the waiting for Fuseki and the loading over
GSP and it's all much simpler now.
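
A minimal sketch of that wait-then-load step, for illustration only. The
host name "fuseki", the dataset name "ds", the "/data" Graph Store endpoint
and the Turtle files passed as arguments are all assumptions, not our
actual setup:

```shell
#!/bin/sh
# Sketch only: FUSEKI_URL, DATASET and the endpoint name are assumed values.
FUSEKI_URL="${FUSEKI_URL:-http://fuseki:3030}"
DATASET="${DATASET:-ds}"

# Build the Graph Store Protocol URL for the default graph of a dataset.
gsp_url() {
    printf '%s/%s/data?default' "$1" "$2"
}

# Poll the GSP endpoint once per second until it answers,
# giving up after the given number of attempts (default 30).
wait_for_fuseki() {
    attempts="${1:-30}"
    while [ "$attempts" -gt 0 ]; do
        if curl -sf "$(gsp_url "$FUSEKI_URL" "$DATASET")" > /dev/null; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# POST one Turtle file into the default graph over GSP.
load_turtle() {
    curl -sf -X POST \
        -H 'Content-Type: text/turtle' \
        --data-binary "@$1" \
        "$(gsp_url "$FUSEKI_URL" "$DATASET")"
}

# Only touch the network when Turtle files are given as arguments.
if [ "$#" -gt 0 ]; then
    wait_for_fuseki 30 || { echo "Fuseki did not come up" >&2; exit 1; }
    for file in "$@"; do
        load_turtle "$file" || exit 1
    done
fi
```

Run from the webapp's entrypoint before the application itself starts, this
keeps Fuseki's own image free of any init logic.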

On Thu, Jan 2, 2020 at 5:16 PM Martynas Jusevičius
<[email protected]> wrote:
>
> Now published as a separate fuseki-tdb branch/image:
> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
> https://hub.docker.com/r/atomgraph/fuseki-tdb
>
>
> On Tue, 17 Dec 2019 at 17.01, Martynas Jusevičius <[email protected]> wrote:
>>
>> Andy,
>>
>> I think it's easier to show what I mean instead of talking about it:
>> https://github.com/AtomGraph/fuseki-docker/tree/fuseki-tdb
>>
>> I took your Dockerfile and added ARG INIT_D=$BASE/init.d and entrypoint.sh.
>>
>> If $DATA folder is empty, RDF files mounted to $INIT_D will get
>> uploaded to a temporary Fuseki instance on port 3333.
>> Afterwards/otherwise, Fuseki instance on 3030 is started.
>>
>> Run like this:
>>
>> docker run \
>>   -p 3030:3030 \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import1.ttl:/var/fuseki/init.d/import1.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/import2.ttl:/var/fuseki/init.d/import2.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/config.ttl:/var/fuseki/config.ttl \
>>   -v /c/Users/Martynas/WebRoot/AtomGraph/fuseki-docker/data:/var/fuseki/data \
>>   atomgraph/fuseki \
>>   --config /var/fuseki/config.ttl
>>
>> This is not just a test use case - for tests we're using the in-memory
>> dataset which is initialized using ja:data in the config.
>> For production, we need a webapp to have certain data before it can
>> start. But we naturally also want persistence, so that leaves us with
>> TDB and this is the way to initialize it during first startup.
>> The mechanism is a simplified variant of what mysql is doing.
>>
>> There are some shaky parts: e.g. "/ds" in the entrypoint might not
>> match the dataset specified by the user; $DATA might not match the
>> actual volume specified by the user.
>>
>> What do you think? Is this too complex for the default Fuseki image?
>>
>> On Mon, Dec 16, 2019 at 4:22 PM Martynas Jusevičius
>> <[email protected]> wrote:
>> >
>> > We were looking at different mysql entrypoint scripts. Not sure why
>> > there are two, but this repo says it's the "official image":
>> > https://github.com/docker-library/mysql/blob/master/8.0/docker-entrypoint.sh#L87
>> >
>> > You can see it is doing a "temporary startup of the MySQL server, for
>> > init purposes", with --skip-networking option on. Switching ports was
>> > the alternative I thought could be applied to Fuseki.
>> >
>> > On Mon, Dec 16, 2019 at 3:30 PM Andy Seaborne <[email protected]> wrote:
>> > >
>> > > The Dockerfile is at:
>> > >
>> > > https://github.com/mysql/mysql-docker/tree/mysql-server/8.0
>> > >
>> > > all the work is done by
>> > >
>> > > docker-entrypoint.sh
>> > >
>> > > which is a 200-line shell script.
>> > >
>> > > and includes:
>> > >
>> > > "$@" --initialize-insecure
>> > >
>> > > an argument to mysqld, not a feature specific to the docker
>> > > container image.
>> > >
>> > >      Andy
>> > >
>> > >
>> > > On 16/12/2019 12:40, Martynas Jusevičius wrote:
>> > > > See "Initializing a fresh instance": https://hub.docker.com/_/mysql
>> > > >
>> > > > On Mon, Dec 16, 2019 at 12:20 PM Martynas Jusevičius
>> > > > <[email protected]> wrote:
>> > > >>
>> > > >> That's the thing: I don't want to execute docker run multiple times,
>> > > >> or remap ports manually. (In general we're using docker-compose much
>> > > >> more than plain docker run.)
>> > > >>
>> > > >> We want a super simple setup: docker-compose up and 4 or so services
>> > > >> start, waiting on each other to initialize. No more instructions, no
>> > > >> steps to follow, no room for error.
>> > > >>
>> > > >> And we have done that: by mounting the dataset I got Fuseki to import
>> > > >> data on startup -- but so far only using an in-memory dataset, not
>> > > >> TDB.
>> > > >>
>> > > >> GSP for loading data is fine, but the Fuseki container should do it
>> > > >> by itself IMO. Waiting for completion by polling for specific data would
>> > > >> also work, but it does not sound like a generic solution to me -- and
>> > > >> increases coupling between containers.
>> > > >>
>> > > >> Bringing the Fuseki container online *after* data import is complete
>> > > >> would be a more general solution IMO, but I agree that complexity
>> > > >> increases.
>> > > >> I wasn't trying to reinvent the wheel here, just looked at how a
>> > > >> similar image (mysql) is addressing this. They seem to do an import on
>> > > >> a DB instance with networking disabled, and start a proper instance
>> > > >> after it's completed.
>> > > >>
>> > > >>
>> > > >> On Mon, Dec 16, 2019 at 11:45 AM Andy Seaborne <[email protected]> wrote:
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On 16/12/2019 09:23, Andy Seaborne wrote:
>> > > >>>> Remap it when starting the container.
>> > > >>>>
>> > > >>>> For loading test data, why not use GSP? Then the loading script
>> > > >>>> indicates when it is ready for the others to use the container.
>> > > >>>> Or poll the server for specific data available.
>> > > >>>
>> > > >>> And in the 909 Dockerfile, the data can be placed in the container via
>> > > >>> a mount point. Separation of concerns - build the database or place the
>> > > >>> RDF files on the mount, then start the container.
>> > > >>>
>> > > >>> You can also use the container image as a loader - the Java command line
>> > > >>> tools are available in the Fuseki jar file. "docker run" once to build
>> > > >>> on the volume for setup, "docker run" again to start and run Fuseki.
>> > > >>>
>> > > >>> Using java -cp fuseki-server.jar has always worked and is convenient
>> > > >>> when working on a remote server. (The commands take up very little space.)
>> > > >>>
>> > > >>>       Andy
>> > > >>>
>> > > >>>>
>> > > >>>>       Andy
>> > > >>>>
>> > > >>>> On 13/12/2019 23:37, Martynas Jusevičius wrote:
>> > > >>>>> Hi,
>> > > >>>>>
>> > > >>>>> is it possible to change the default port 3030 to something else by
>> > > >>>>> means of configuration?
>> > > >>>>>
>> > > >>>>> This is related to the Dockerfile which is a long time coming :)
>> > > >>>>> https://issues.apache.org/jira/browse/JENA-909
>> > > >>>>>
>> > > >>>>> I need some init scripts that execute on Fuseki's launch (e.g. import
>> > > >>>>> mounted data) - but before making the server available externally.
>> > > >>>>> Other services are waiting on Fuseki, so we want to make sure it comes
>> > > >>>>> online only when the data is fully loaded.
>> > > >>>>>
>> > > >>>>> My plan is to follow roughly what the mysql entrypoint is doing:
>> > > >>>>> 1. start temporary Fuseki server on a non-EXPOSEd port
>> > > >>>>> 2. execute the init script(s)
>> > > >>>>> 3. shut down the temporary server
>> > > >>>>> 4. start proper Fuseki server on the normal EXPOSEd port
>> > > >>>>>
>> > > >>>>> Would this work?
>> > > >>>>>
>> > > >>>>> Stian's image seems to do half of what I need: it can load the data,
>> > > >>>>> but it does not look like that happens during startup - the load.sh needs
>> > > >>>>> to be run separately:
>> > > >>>>> https://github.com/stain/jena-docker/tree/master/jena-fuseki#data-loading
>> > > >>>>>
>> > > >>>>> I want to avoid any additional script executions after startup as
>> > > >>>>> we're aiming for a plain 'docker-compose up' launch of all services.
>> > > >>>>>
>> > > >>>>> Martynas
>> > > >>>>>