Hi,

as I understand it, the problem that Wikidata will cause for DBpedia is that
actual values will no longer be available in the Wikipedia dumps.
Instead, we will end up either finding nothing (see interwiki links) or
finding Wikidata parser functions, e.g. {{#property:p169}}, as shown here [1].

Maybe one approach could be to gather Wikidata dumps and build some sort of
triple store that can be used to resolve the actual data during extraction
(by adding handling of specific parser functions to the extraction framework).
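
To make the idea a bit more concrete, here is a rough Python sketch (not
code from the extraction framework). It assumes the Wikidata dump has already
been loaded into a simple in-memory lookup and that we already know which
Wikidata item belongs to the page being extracted. The names TripleStore and
resolve_properties, the item id "Q_EXAMPLE" and the sample value are all made
up for illustration.

```python
import re
from collections import defaultdict

# Matches the parser function form mentioned above, e.g. {{#property:p169}}.
PROPERTY_CALL = re.compile(r"\{\{\s*#property\s*:\s*([^}|]+?)\s*\}\}", re.IGNORECASE)


class TripleStore:
    """Hypothetical stand-in for a store built from a Wikidata dump."""

    def __init__(self):
        # (item id, lower-cased property id) -> list of values
        self._values = defaultdict(list)

    def add(self, item, prop, value):
        self._values[(item, prop.lower())].append(value)

    def lookup(self, item, prop):
        return self._values.get((item, prop.lower()), [])


def resolve_properties(wikitext, item, store):
    """Replace each {{#property:...}} call with the values known for `item`."""

    def substitute(match):
        values = store.lookup(item, match.group(1))
        # Leave the call untouched when the store has no value, so the gap
        # stays visible to later extraction steps.
        return ", ".join(values) if values else match.group(0)

    return PROPERTY_CALL.sub(substitute, wikitext)


store = TripleStore()
store.add("Q_EXAMPLE", "P169", "Example Person")  # made-up sample data
resolved = resolve_properties("| ceo = {{#property:p169}}", "Q_EXAMPLE", store)
```

With the sample data above, the infobox line would come out with the stored
value in place of the parser function; calls for unknown properties are left
as they are. A real implementation would need a persistent store and a way to
map each Wikipedia page to its Wikidata item.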

WDYT?

Cheers
Andrea

[1]  https://blog.wikimedia.de/2013/03/27/you-can-have-all-the-data/


2013/4/5 Julien Plu <julien....@redaction-developpez.com>

> @Jona Christopher Sahnwaldt: Lua scripting was introduced to create new
> templates, and these templates can then be used in infoboxes.
>
> By the way, I'm happy to see that solutions to the Wikidata problem are at
> least being considered. I hope that efficient solutions will be
> found :-)
>
> Best.
>
> Julien Plu.
>
>
> 2013/4/5 Anja Jentzsch <a...@anjeve.de>
>
>> Hi all,
>>
>> this is a relevant topic that should be addressed soonish.
>> There is already work to replace the Template:Infobox with a Lua script.
>> Also see the blog post here:
>>
>> http://blog.wikimedia.org/2013/03/14/what-lua-scripting-means-wikimedia-open-source/
>>
>> Also the Wikidata inclusion syntax is starting to replace values by calls
>> to the repository, see:
>> http://meta.m.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax_v0.3 and
>> http://blog.wikimedia.de/2013/03/27/you-can-have-all-the-data/
>>
>> This will make it increasingly hard to retrieve properties along with
>> their values from Wikipedia dumps without a) interpreting the Lua script
>> results and b) accessing Wikidata.
>>
>> Cheers,
>> Anja
>>
>> On Apr 5, 2013, at 15:40, Jona Christopher Sahnwaldt <j...@sahnwaldt.de>
>> wrote:
>>
>> > Hi Julien,
>> >
>> > thanks for the heads-up!
>> >
>> > On 5 April 2013 10:44, Julien Plu <julien....@redaction-developpez.com>
>> wrote:
>> >> Hi,
>> >>
>> >> I saw a few days ago that, for about a month now, MediaWiki has allowed
>> >> infoboxes (or parts of them) to be created with the Lua scripting language.
>> >> http://www.mediawiki.org/wiki/Lua_scripting
>> >>
>> >> So my question is: if all the data in Wikipedia infoboxes ends up in Lua
>> >> scripts, will DBpedia still be able to retrieve all the data as usual?
>> >
>> > I'm not 100% sure, and we should look into this, but I think that Lua
>> > is only used in template definitions, not in template calls or other
>> > places in content pages. DBpedia does not parse template definitions,
>> > only content pages. The content pages probably will only change in
>> > minor ways, if at all. For example, {{Foo}} might change to
>> > {{#invoke:Foo}}. But that's just my preliminary understanding after
>> > looking through a few tutorial pages.
>> >
>> >>
>> >> My other question mainly concerns the French Wikipedia, because I didn't
>> >> find the same thing on the English one, sorry. For almost a year now, the
>> >> infobox population property can be written like this:
>> >>
>> >> population = {{population number}}
>> >>
>> >> where "population number" refers to a number that is stored on another
>> >> page. For example, the Wikipedia page about the city of Toulouse contains
>> >> this infobox property:
>> >>
>> >> | population         = {{Dernière population commune de France}}
>> >>
>> >> The value of "Dernière population commune de France" is stored on
>> >> this Wikipedia page:
>> >>
>> >> http://fr.wikipedia.org/wiki/Mod%C3%A8le:Donn%C3%A9es/Toulouse/%C3%A9volution_population
>> >>
>> >> So the problem now is that the XML dump does not contain the actual
>> >> population value. Is there a way to get the value itself rather than the
>> >> "string" that stands for it?
>> >
>> > I've seen similar structures on Wikipedia de [1] and I think also on
>> > pl or cs: the actual data is not in the content pages, but in some
>> > template, and is rendered on the content page by rather complex
>> > mechanisms.
>> >
>> > To deal with this, DBpedia could try to expand templates, or maybe
>> > just certain templates (we don't want all the HTML stuff). Great
>> > generality, but may cause performance and other problems. In the worst
>> > case, mapping-based extraction could become as slow as abstract
>> > extraction.
>> >
>> > Or we could let people add rules on the mappings wiki about which
>> > templates contain data and how the data should be attached to certain
>> > DBpedia resources. Of course, determining syntax and semantics for
>> > such rules wouldn't be trivial...
>> >
>> > ...but if we get there, we could implement the data extraction as a
>> > preprocessing step: in a first extraction phase, go through the
>> > Wikipedia dump, collect and store stuff from these 'data templates',
>> > and during the main extraction, pull the data from the store where
>> > needed and generate triples. Informally, we already have such a
>> > preprocessing phase for the redirects. It would make sense to
>> > "formalize" it and also use it for other info, e.g. disambiguation
>> > pages, inter-language links, resource types, etc.
>> >
>> > Cheers,
>> > JC
>> >
>> > [1] For example, http://de.wikipedia.org/wiki/Hannover contains
>> >
>> > {{Infobox Gemeinde in Deutschland
>> > ...
>> > | Gemeindeschlüssel = 03241001
>> > ...
>> > }}
>> >
>> > ("Gemeinde in Deutschland" means "community in Germany",
>> > "Gemeindeschlüssel" means "community key".)
>> >
>> > The actual data is in pages like
>> >
>> > http://de.wikipedia.org/wiki/Vorlage:Metadaten_Einwohnerzahl_DE-NI
>> >
>> >>
>> >> I hope I was clear enough; otherwise don't hesitate to ask me for
>> >> more information about these problems.
>> >>
>> >> Thanks for your insights.
>> >>
>> >> Best regards.
>> >>
>> >> Julien Plu.
>> >>
>> >>
>> ------------------------------------------------------------------------------
>> >> Minimize network downtime and maximize team effectiveness.
>> >> Reduce network management and security costs.Learn how to hire
>> >> the most talented Cisco Certified professionals. Visit the
>> >> Employer Resources Portal
>> >> http://www.cisco.com/web/learning/employer_resources/index.html
>> >> _______________________________________________
>> >> Dbpedia-discussion mailing list
>> >> Dbpedia-discussion@lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>> >>
>> >
>> >
>>
>
>
>
>
>
