Hi,

Thanks to Dimitris for the introduction. As mentioned, I'm happy to coordinate 
the efforts to improve reproducibility of the online DBpedia endpoint.

My background on this:
I've been running a local Virtuoso Linked Data endpoint for our research group 
for quite some time now. Among other things, we develop learning algorithms for 
Linked Data, and they issue tons of heavy SPARQL queries. Running these against 
online endpoints would disrupt their service and isn't really fair use. So a 
local mirror it was, and as DBpedia is pretty interesting for us as glue for 
many datasets, I've always tried to keep up with the latest releases and host 
them locally.
I started documenting this publicly in the form of line-by-line bash HowTos, I 
think back with DBpedia 3.5, and have updated the guide a couple of times. Over 
time the process became easier, but it's still not what I'd call "simple". More 
than 200 monthly readers of each of my guides indicate that we should make this 
easier.

In the latest revision of the guide, I also started dockerizing stuff, allowing 
me to quickly switch between different database snapshots for evaluations:
https://joernhees.de/blog/2015/11/23/setting-up-a-linked-data-mirror-from-rdf-dumps-dbpedia-2015-04-freebase-wikidata-linkedgeodata-with-virtuoso-7-2-1-and-docker-optional/


Coming from this background, I fully agree with the goals that Dimitris 
mentioned.

To kick things off, I already created a small overview document for the first 
"phase" reproducibility:
https://pad.okfn.org/p/DBpediaReproducibility

You're invited to edit, discuss and get involved.

Cheers,
Jörn




> On 10 Oct 2016, at 12:03, Dimitris Kontokostas <jimk...@gmail.com> wrote:
> 
> Hi Everyone,
> 
> During the last DBpedia meeting, we decided to create a 
> community-coordinated action for making the DBpedia SPARQL endpoint 
> reproducible.
> 
> After a little brainstorming we came up with the following goals:
>       • with each release create docker images
>       • spread the docker images over servers from the community
>       • keep a list of all endpoints on the DBpedia website, first manually, 
> then automatically updated
>       • Option 2: Community crowd-sourcing, i.e. uptime can be improved 
> when we take off the heaviest users. For example, packaging DBpedia in docker 
> and offering an easy way for configuration should help potential exploiters 
> to do it on their own infrastructure, thus freeing resources for incidental 
> (and less skilled) users. First steps are to create a list of public DBpedia 
> endpoints and official tutorials on setting up a DBpedia mirror.
>       • scientific reproducibility -> for http://dbpedia.org/sparql not 
> given, therefore Docker images
> 
> As decided at the meeting, Jörn Hees will lead this action but we also 
> identified some members that have done work in this field like Natanael 
> Arndt, Markus Ackermann, Ritesh Kumar Singh and Kay Muller (in cc)
> 
> *Next steps*
>  - Everyone (as well as other community members that we didn't include) will 
> present their work here
>  - We will create a task force led by Jörn and work on the above (or new) 
> goals
> 
> Looking forward to getting everyone's input here
> 
> 
> Cheers,
> Dimitris
> 
> -- 
> Kontokostas Dimitris


_______________________________________________
DBpedia-developers mailing list
DBpedia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
