Hi,
By the way, Wikidata will provide RDF dumps as well as XML dumps.
There might also be a third party SPARQL endpoint serving them if we find
hosting support.
Cheers,
Anja
On Apr 5, 2013, at 18:12, Andrea Di Menna <[email protected]> wrote:
> Hi,
>
> from what I understand, the problem that will arise for DBpedia with the
> introduction of Wikidata is that actual values will no longer be available
> in the Wikipedia dumps.
> Instead, we will end up either finding nothing (see interwiki links) or
> finding Wikidata parser functions, e.g. {{#property:p169}}, as shown here [1]
>
> Maybe one approach could be to gather the Wikidata dumps and build some sort
> of triple store, which can be used to resolve the actual data during
> extraction (adding handling of the specific parser functions to the
> extraction framework).
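The resolution step could be sketched roughly as below. This is only an illustration of the idea, not the DBpedia framework's actual API: the property store is a plain dict standing in for a real triple store built from the Wikidata dumps, and the function names and example values are hypothetical.

```python
import re

# Matches Wikidata inclusion syntax like {{#property:p169}} in wikitext.
PROPERTY_CALL = re.compile(r"\{\{#property:(p\d+)\}\}", re.IGNORECASE)

def resolve_properties(wikitext, entity_id, store):
    """Replace {{#property:pNNN}} calls with values looked up in a local
    store built from the Wikidata dumps. 'store' maps (entity, property)
    pairs to values."""
    def lookup(match):
        prop = match.group(1).lower()
        # Fall back to the raw parser function call if no value is stored.
        return store.get((entity_id, prop), match.group(0))
    return PROPERTY_CALL.sub(lookup, wikitext)

# Illustrative example: a tiny store prepared ahead of extraction.
store = {("Q95", "p169"): "Larry Page"}
text = "| ceo = {{#property:p169}}"
print(resolve_properties(text, "Q95", store))  # | ceo = Larry Page
```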
>
> WDYT?
>
> Cheers
> Andrea
>
> [1] https://blog.wikimedia.de/2013/03/27/you-can-have-all-the-data/
>
>
> 2013/4/5 Julien Plu <[email protected]>
> @Jona Christopher Sahnwaldt: Lua scripting was introduced to create new
> templates, and these templates can be used in infoboxes.
>
> By the way, I am happy to see that some solutions are at least being
> considered for the Wikidata problem. I hope that efficient solutions will be
> found :-)
>
> Best.
>
> Julien Plu.
>
>
> 2013/4/5 Anja Jentzsch <[email protected]>
> Hi all,
>
> this is a relevant topic that should be addressed soon.
> There is already work underway to replace Template:Infobox with a Lua script.
> Also see the blog post here:
> http://blog.wikimedia.org/2013/03/14/what-lua-scripting-means-wikimedia-open-source/
>
> Also, the Wikidata inclusion syntax is starting to replace values with calls
> to the repository, see:
> http://meta.m.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax_v0.3 and
> http://blog.wikimedia.de/2013/03/27/you-can-have-all-the-data/
>
> This will make it increasingly hard to retrieve properties along with their
> values from Wikipedia dumps without a) interpreting the Lua script results
> and b) accessing Wikidata.
>
> Cheers,
> Anja
>
> On Apr 5, 2013, at 15:40, Jona Christopher Sahnwaldt <[email protected]>
> wrote:
>
> > Hi Julien,
> >
> > thanks for the heads-up!
> >
> > On 5 April 2013 10:44, Julien Plu <[email protected]>
> > wrote:
> >> Hi,
> >>
> >> I saw a few days ago that MediaWiki has, for about a month now, allowed
> >> creating infoboxes (or parts of them) with the Lua scripting language.
> >> http://www.mediawiki.org/wiki/Lua_scripting
> >>
> >> So my question is: if all the data in the Wikipedia infoboxes ends up in
> >> Lua scripts, will DBpedia still be able to retrieve all the data as usual?
> >
> > I'm not 100% sure, and we should look into this, but I think that Lua
> > is only used in template definitions, not in template calls or other
> > places in content pages. DBpedia does not parse template definitions,
> > only content pages. The content pages will probably only change in
> > minor ways, if at all. For example, {{Foo}} might change to
> > {{#invoke:Foo}}. But that's just my preliminary understanding after
> > looking through a few tutorial pages.
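If that understanding is right, an extractor mostly needs to recognize both call forms and recover the name it already maps. A minimal sketch, assuming the simple call shapes above (real {{#invoke:...}} calls name a module and a function, e.g. {{#invoke:Foo|main}}, so this is illustrative only):

```python
import re

# Matches the name in a plain template call {{Foo|...}} or a Lua module
# invocation {{#invoke:Foo|main|...}}; both can appear on content pages.
CALL = re.compile(r"\{\{\s*(?:#invoke:)?([^|{}]+?)\s*[|}]")

def template_name(call):
    """Return the template or module name of a call, or None if the
    string does not look like a template call at all."""
    m = CALL.match(call)
    return m.group(1) if m else None

print(template_name("{{Foo|x=1}}"))           # Foo
print(template_name("{{#invoke:Foo|main}}"))  # Foo
```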
> >
> >>
> >> My other question mainly concerns the French Wikipedia, because I haven't
> >> found the same thing in English, sorry. For almost a year now, the infobox
> >> population property can be written like this:
> >>
> >> population = {{population number}}
> >>
> >> Where "population number" refer to a number which is on another page. Let
> >> me
> >> give you an example, the Wikipedia page about Toulouse city, contain this
> >> infobox property :
> >>
> >> | population = {{Dernière population commune de France}}
> >>
> >> And the value of "Dernière population commune de France" is contained in
> >> this wikipedia page :
> >> http://fr.wikipedia.org/wiki/Mod%C3%A8le:Donn%C3%A9es/Toulouse/%C3%A9volution_population
> >>
> >> So now the problem is that in the XML dump we don't have the real value
> >> of the population. Is there a way to get the value itself and not the
> >> "string" that represents it?
> >
> > I've seen similar structures on Wikipedia de [1] and I think also on
> > pl or cs: the actual data is not in the content pages, but in some
> > template, and is rendered on the content page by rather complex
> > mechanisms.
> >
> > To deal with this, DBpedia could try to expand templates, or maybe
> > just certain templates (we don't want all the HTML stuff). That would
> > be very general, but may cause performance and other problems. In the
> > worst case, mapping-based extraction could become as slow as abstract
> > extraction.
> >
> > Or we could let people add rules on the mappings wiki about which
> > templates contain data and how the data should be attached to certain
> > DBpedia resources. Of course, determining syntax and semantics for
> > such rules wouldn't be trivial...
> >
> > ...but if we get there, we could implement the data extraction as a
> > preprocessing step: in a first extraction phase, go through the
> > Wikipedia dump, collect and store stuff from these 'data templates',
> > and during the main extraction, pull the data from the store where
> > needed and generate triples. Informally, we already have such a
> > preprocessing phase for the redirects. It would make sense to
> > "formalize" it and also use it for other info, e.g. disambiguation
> > pages, inter-language links, resource types, etc.
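The two-phase idea above could be sketched like this. Everything here is hypothetical: the page representation, the function names, the `@key` reference convention, and the population figure are all made up for illustration; only the Gemeindeschlüssel 03241001 comes from the Hannover example below.

```python
# Phase 1 scans the dump and harvests values from known 'data templates';
# phase 2 extracts content pages, pulling referenced values from the store.
DATA_TEMPLATE_PREFIX = "Vorlage:Metadaten_Einwohnerzahl"  # example rule

def collect_data_templates(pages):
    """Phase 1: build a key -> value store from data-template pages.
    'pages' is an iterable of (title, fields) pairs."""
    store = {}
    for title, fields in pages:
        if title.startswith(DATA_TEMPLATE_PREFIX):
            store.update(fields)
    return store

def extract(pages, store):
    """Phase 2: emit (subject, predicate, value) triples; a field value
    of the form '@key' is resolved against the phase-1 store."""
    triples = []
    for title, fields in pages:
        if title.startswith("Vorlage:"):
            continue  # template pages were handled in phase 1
        for key, value in fields.items():
            if isinstance(value, str) and value.startswith("@"):
                value = store.get(value[1:], value)
            triples.append((title, key, value))
    return triples

pages = [
    ("Vorlage:Metadaten_Einwohnerzahl_DE-NI", {"03241001": "518000"}),
    ("Hannover", {"population": "@03241001"}),  # figure is illustrative
]
store = collect_data_templates(pages)
print(extract(pages, store))  # [('Hannover', 'population', '518000')]
```

The existing redirects preprocessing already works this way informally; formalizing it would let the same pass also feed disambiguation pages, inter-language links, and resource types.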
> >
> > Cheers,
> > JC
> >
> > [1] For example, http://de.wikipedia.org/wiki/Hannover contains
> >
> > {{Infobox Gemeinde in Deutschland
> > ...
> > | Gemeindeschlüssel = 03241001
> > ...
> > }}
> >
> > ("Gemeinde in Deutschland" means "community in Germany",
> > "Gemeindeschlüssel" means "community key".)
> >
> > The actual data is in pages like
> >
> > http://de.wikipedia.org/wiki/Vorlage:Metadaten_Einwohnerzahl_DE-NI
> >
> >>
> >> I hope I was clear enough; otherwise, don't hesitate to ask me for more
> >> information about these problems.
> >>
> >> Thanks for your insights.
> >>
> >> Best regards.
> >>
> >> Julien Plu.
> >>
> >> ------------------------------------------------------------------------------
> >> Minimize network downtime and maximize team effectiveness.
> >> Reduce network management and security costs. Learn how to hire
> >> the most talented Cisco Certified professionals. Visit the
> >> Employer Resources Portal
> >> http://www.cisco.com/web/learning/employer_resources/index.html
> >> _______________________________________________
> >> Dbpedia-discussion mailing list
> >> [email protected]
> >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >>
> >
>
>
>
>