Re: [DBpedia-developers] [DBpedia-discussion] Community coordination action: DBpedia reproducibility / Dockerization

2016-10-11 Thread Natanael Arndt
Hi everybody,

here is a summary of what we have done so far: at SEMANTiCS 2015 we 
presented dld: dockerizing linked data [1]. The idea was to provide a 
tool that generates docker-compose setups, so that an infrastructure 
for serving RDF datasets via a SPARQL endpoint or through other 
applications like OntoWiki is easy to create. We followed the principle 
of having one container per service, so we did not want to put the 
complete stack, including the data, into a single container 
(Microservices [2], Single Responsibility Principle [3]).

To achieve this we identified three (or four) tasks in a setup:
(1a/b): loading and backing up data (in the case of DBpedia, there would 
only be loading)
(2): storage of the data
(3): presentation, exploration and editing of the data
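
The split into tasks above could be sketched as follows. This is only an 
illustration, not the actual dld implementation: the service names, the 
image names, the volume name and the port mapping are all hypothetical 
placeholders, and the function merely builds a docker-compose-style 
dictionary with one service per task.

```python
# Sketch only: image names, the volume name and the port mapping are
# hypothetical placeholders, not the ones dld actually uses.
import json


def build_compose(load_image, store_image, present_image, volume="rdf-data"):
    """Return a docker-compose-style dict with one service per task."""
    return {
        "version": "2",
        "services": {
            # (1) load: imports the dumps into the store, then exits
            "load": {
                "image": load_image,
                "volumes": [f"{volume}:/import"],
                "depends_on": ["store"],
            },
            # (2) store: the triple store that holds the data
            "store": {
                "image": store_image,
                "volumes": [f"{volume}:/import"],
            },
            # (3) present: a frontend talking to the store
            "present": {
                "image": present_image,
                "depends_on": ["store"],
                "ports": ["8080:80"],
            },
        },
        "volumes": {volume: {}},
    }


if __name__ == "__main__":
    compose = build_compose("example/load", "example/store", "example/present")
    print(json.dumps(compose, indent=2))
```

Keeping each task in its own service means the store image can be swapped 
without touching the loading or presentation containers.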

This idea is meant to be very generic, to cover different types of setups 
dealing with RDF data or data in general. For the case of DBpedia, and 
to achieve the "scientific reproducibility" that Dimitris mentioned, 
we might have to rethink this setup. I also think that best practices in 
the docker community have evolved since then. We should also keep an eye 
on the performance of services running in containers vs. running them 
"natively" on a system. We have already experienced some impact here, 
but this might differ from setup to setup.

[1] https://dockerizing.github.io/; 
http://www.bibsonomy.org/bibtex/2b1e393a0bfd62e83b99704a52c20c877/aksw
[2] https://en.wikipedia.org/wiki/Microservices
[3] https://en.wikipedia.org/wiki/Single_responsibility_principle

All the best,
Natanael


(Sorry for sending it twice, but I first had to subscribe)

--
Check out the vibrant tech community on one of the world's most 
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
DBpedia-developers mailing list
DBpedia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers


Re: [DBpedia-developers] [DBpedia-discussion] Community coordination action: DBpedia reproducibility / Dockerization

2016-10-11 Thread Jörn Hees
Hi Olivier,

> On 10 Oct 2016, at 21:59, Olivier Rossel wrote:
> 
> The availability of DBpedia as a Virtuoso DB file was really helpful for me. 
> I hope this feature will still be maintained in the long run.
> (And the fact that linkedgeodata was just one namedgraph away was also pretty 
> cool!)

Yes, that was an (unofficial) option that I offered in the past (simply a 
compressed backup of the whole Virtuoso DB directory directly after import).
Unlike the docker approach, it doesn't bundle a fixed Virtuoso version with the 
DB, so you'd need to install Virtuoso on your own.

While the main focus here is to make reproducibility very easy (so the same data 
and executable as at release time), there are scenarios in which it's desirable 
to run an old DBpedia version with a newer Virtuoso version than at release time 
(think of SPARQL 1.1).
So maybe we should also have a look at how to ship a DB version.
However, as the DB snapshot is very easy to extract from a docker image 
(it is probably a volume anyhow), I think this could be an easy addendum on top 
of (or as part of) the dockerization efforts.
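
As a rough sketch of such an extraction, assuming the Virtuoso DB directory 
lives in a named docker volume: the volume name, the output directory and 
the archive name below are hypothetical. The function only builds the 
command line (a throwaway busybox container that tars the volume into a 
host directory) and does not run it.

```python
# Sketch only: the volume name, output directory and archive name are
# assumptions made for illustration.
import shlex


def snapshot_command(volume, out_dir, archive="virtuoso-db.tar.gz"):
    """Build a docker command that archives a named volume into out_dir."""
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:/db:ro",    # DB volume, mounted read-only
        "-v", f"{out_dir}:/backup",  # host directory receiving the archive
        "busybox",
        "tar", "czf", f"/backup/{archive}", "-C", "/db", ".",
    ]


if __name__ == "__main__":
    cmd = snapshot_command("dbpedia-virtuoso-db", "/tmp/backup")
    # Print the command for inspection instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

For a consistent snapshot the Virtuoso container should presumably be 
stopped before the volume is archived.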

Best,
Jörn

